Linear Algebra
1. What is stochastic gradient descent in contrast to gradient descent? Why might you choose one versus the other?
Briefly explain the difference between batch gradient descent and stochastic gradient descent. When would you prefer one over the other?
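The difference the two questions above ask about can be sketched in a few lines: batch gradient descent averages the gradient over the whole dataset per update, while stochastic gradient descent updates from a single example. The model, data, and learning rate below are illustrative, not from any specific course.

```python
import random

# Hypothetical 1-D linear model y ~ w * x with squared-error loss.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.1)]
lr = 0.01

def batch_gd_step(w):
    # Batch GD: gradient averaged over the ENTIRE dataset per update.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

def sgd_step(w):
    # SGD: gradient from ONE randomly chosen example per update --
    # cheaper per step, noisier, and better suited to large datasets.
    x, y = random.choice(data)
    return w - lr * 2 * (w * x - y) * x

w = 0.0
for _ in range(500):
    w = batch_gd_step(w)
print(round(w, 4))  # converges near the least-squares slope
```

Batch GD is preferred when the dataset fits in memory and exact gradients are cheap; SGD when the dataset is large or streaming, at the cost of noisier convergence.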
In the lectures, we introduced Gradient Descent, an optimization method to find the minimum value of a function. In this problem we try to solve a fairly simple optimization problem: min_{x in R} f(x) = x^2. That is, finding the minimum value of x^2 over the real line. Of course you know it is attained at x = 0, but this time we do it with gradient descent. Recall that to perform gradient descent, you start at an arbitrary initial point x_0, ...
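The iteration described above can be sketched directly: x_{k+1} = x_k - alpha * f'(x_k), with f(x) = x^2 so f'(x) = 2x. The step size alpha and starting point here are illustrative choices, since the problem leaves them to the reader.

```python
# Gradient descent on f(x) = x^2, whose gradient is f'(x) = 2x.
alpha = 0.1      # illustrative fixed step size
x = 5.0          # arbitrary initial point x_0
for k in range(50):
    x = x - alpha * 2 * x   # gradient step: x_{k+1} = x_k - alpha * f'(x_k)
print(x)  # approaches the minimizer x = 0
```

Each step multiplies x by (1 - 2*alpha), so for 0 < alpha < 1 the iterate shrinks geometrically toward 0.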
Linear Algebra
Another use of the transition matrix is in stochastic modeling. An example is the following: imagine a park with three locations: a lake, a picnic area, and a playground. Every hour, on the hour, the parkgoers move according to the following rules: half of those at the lake move to the picnic area, and one-quarter of those at the lake move to the playground. Half of those at the picnic area go to the lake, and the other...
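The park setup above can be encoded as a column-stochastic transition matrix and iterated. The rules for the picnic area and playground are cut off in the text, so the entries marked "assumed" below are illustrative fill-ins, not from the original problem.

```python
import numpy as np

# Columns: from [lake, picnic, playground]; rows: to [lake, picnic, playground].
# Each column sums to 1 (everyone goes somewhere, possibly staying put).
T = np.array([
    [0.25, 0.50, 0.25],   # to lake: 1/4 stay (implied); 1/2 from picnic (given); assumed
    [0.50, 0.25, 0.25],   # to picnic: 1/2 from lake (given); assumed; assumed
    [0.25, 0.25, 0.50],   # to playground: 1/4 from lake (given); assumed; assumed
])
state = np.array([120.0, 0.0, 0.0])   # e.g. everyone starts at the lake
for _ in range(3):                     # three hours of movement
    state = T @ state
print(state.round(1))                  # head counts; total is preserved
```

Because each column sums to 1, the total number of parkgoers is conserved at every step, which is the defining property of a stochastic (Markov) transition matrix.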
Problem 3. (30 pts.) Let f(x) = 3x^2 - 1. (a) Calculate the derivative (the gradient) f'(x) and the second derivative (the Hessian) f''(x). (4 pts) (b) Using x_0 = 10, iterate the gradient descent method (you choose your alpha_k) until |f'(x_k)| < 10^-6. (11 pts) (c) Using x_0 = 10, iterate Newton's method (you choose your alpha_k) until |x_k - x_{k-1}| < 10^-6. (15 pts) Problem 4. (30 pts.) Let D = [(1,2), (3,2), (4,3), (4,4)] be a collection of data points. Your task is to find...
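Parts (b) and (c) of Problem 3 can be sketched as follows, assuming the garbled function is f(x) = 3x^2 - 1, so f'(x) = 6x and f''(x) = 6. The step size alpha_k = 0.1 is an illustrative choice.

```python
def fprime(x):
    return 6 * x   # derivative of f(x) = 3x^2 - 1

# (b) Gradient descent from x_0 = 10 with fixed step alpha_k = 0.1,
# stopping when |f'(x_k)| < 1e-6.
xg = 10.0
while abs(fprime(xg)) >= 1e-6:
    xg = xg - 0.1 * fprime(xg)
print(xg)  # near the minimizer x = 0

# (c) Newton's method: x_{k+1} = x_k - f'(x_k) / f''(x_k).
# Since f is quadratic, a single full Newton step lands exactly on the minimizer.
xn = 10.0 - fprime(10.0) / 6.0
print(xn)
```

Newton's method converging in one step here illustrates why it is preferred when second derivatives are cheap: for a quadratic, the Newton model of the function is exact.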
Machine learning / stats questions
1. Choose all the valid answers to the description about linear regression and logistic regression from the options below: A. Linear regression is an unsupervised learning problem; logistic regression is a supervised learning problem. B. Linear regression deals with the prediction of continuous values; logistic regression deals with the prediction of class labels. C. We cannot use gradient descent to solve linear regression; we must resort to least-squares estimation to compute a closed-form...
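Option C can be checked empirically: gradient descent does solve linear regression, converging to the same answer as the closed-form least-squares solution. The data below is made up for the demonstration.

```python
import numpy as np

# Design matrix with an intercept column, plus illustrative targets.
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.0]])
y = np.array([1.1, 1.9, 3.2, 3.4])

# Closed-form least-squares solution (normal equations, via lstsq).
theta_closed = np.linalg.lstsq(X, y, rcond=None)[0]

# Gradient descent on the mean-squared error.
theta = np.zeros(2)
for _ in range(20000):
    grad = 2 * X.T @ (X @ theta - y) / len(y)
    theta -= 0.05 * grad

print(theta_closed.round(4), theta.round(4))  # the two agree
```

So C is false as stated; the closed form exists, but gradient descent is a perfectly valid (and for huge datasets, often preferable) alternative.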
Need help with linear algebra problem #15, will thumbs up! Thank you.
13. The gradient ∇h(x, y) =
14. For the function h(x, y), what is the isoparametric curve h(3, 0)?
15. For the function s(x, y) = x^2 + y^2, sketch the contour s(x, y) = 4.
16. Find the bilinear interpolant to the following four points at (0.5, 0.5): b_{0,0} = b_{1,0} = 0, b_{1,1} = 1, b_{0,1} = 1.
17. What are the isosurfaces of the trivariate function...
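Question 16 above has a standard formula: the bilinear interpolant on the unit square is f(x, y) = b_{0,0}(1-x)(1-y) + b_{1,0}x(1-y) + b_{0,1}(1-x)y + b_{1,1}xy. A quick check with the given corner values:

```python
# Bilinear interpolation on the unit square, with the corner values from
# question 16: b00 = b10 = 0, b01 = 1, b11 = 1.
def bilinear(x, y, b00=0.0, b10=0.0, b01=1.0, b11=1.0):
    return (b00 * (1 - x) * (1 - y) + b10 * x * (1 - y)
            + b01 * (1 - x) * y + b11 * x * y)

print(bilinear(0.5, 0.5))  # 0.5 at the center of the square
```

With the bottom edge at 0 and the top edge at 1, the interpolant reduces to f(x, y) = y, so the value at (0.5, 0.5) is 0.5. (For question 15, the contour s(x, y) = 4 is the circle of radius 2 centered at the origin.)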
Gradient descent weight update rule for a tanh unit. (2 pts) Assume throughout this exercise that we are using gradient descent to minimize the error as defined in formula (4.2) on p. 89 in the textbook: E(w) = (1/2) * sum_{d in D} (t_d - o_d)^2. Recall that the corresponding weight update rule for a sigmoid unit like the one in Figure 4.6 on p. 96 in the textbook is: Δw_i = η * sum_{d in D} (t_d - o_d) * o_d * (1 - o_d) * x_{i,d}. Let us replace the sigmoid function σ in Figure 4.6 by the function...
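The key step in deriving the tanh version is swapping the derivative factor: sigmoid'(net) = o(1 - o), while tanh'(net) = 1 - tanh(net)^2 = 1 - o^2. A minimal sketch of the two per-example weight updates, with illustrative names and values not taken from the textbook exercise:

```python
import math

# Per-example weight update for one weight w_i. Only the derivative of the
# activation differs between the two unit types.
def delta_w_sigmoid(eta, t, o, x_i):
    # sigmoid unit: derivative factor o * (1 - o)
    return eta * (t - o) * o * (1 - o) * x_i

def delta_w_tanh(eta, t, o, x_i):
    # tanh unit: derivative factor 1 - o^2, since tanh'(net) = 1 - tanh(net)^2
    return eta * (t - o) * (1 - o * o) * x_i

o = math.tanh(0.5)   # illustrative unit output
print(delta_w_sigmoid(0.1, 1.0, 0.6, 2.0))
print(delta_w_tanh(0.1, 1.0, o, 2.0))
```

Summing either update over all d in D recovers the batch rule in the form given above.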
Linear Algebra Explain why the nullspace of a matrix A is always nonempty. What is the definition of the column space of a matrix A? Briefly explain why this is different from the nullspace.
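The first question above has a one-line core: A @ 0 = 0 for every matrix A, so the zero vector is always in the nullspace, making it nonempty. A small illustrative check (the matrix here is arbitrary):

```python
import numpy as np

# Any matrix A sends the zero vector to zero, so null(A) always contains 0.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
zero = np.zeros(3)
print(A @ zero)   # the zero vector of R^2 -- so 0 is in null(A)

# The column space is a different object: it consists of all vectors A @ x,
# i.e. linear combinations of A's columns. Note the spaces even live in
# different places here: null(A) is a subspace of R^3, col(A) of R^2.
x = np.array([1.0, 0.0, -1.0])
print(A @ x)      # one element of the column space
```

The dimension mismatch for non-square A (inputs in R^n, outputs in R^m) is a quick way to see the two spaces are genuinely different.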
What is the geometry of water? Explain why it is not linear, as one might suspect. Illustrations are permissible.
We have a dataset that has real-valued labels and one feature. The dataset contains three training examples. (x_0 is the intercept term.)
x_0   x_1    y
1     0.4    0.21
1     0.8    0.86
1    -1.2    0.35
In all calculations below, keep four decimal digits for all intermediate results and use those rounded results for next-step calculations. Part 1: Stochastic Gradient Descent (12%). Perform linear regression with the stochastic gradient descent algorithm for three iterations and fill in the blanks in the following tables. The hypothesis is... For simplicity, let's process the three...
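One pass of the requested procedure can be sketched as follows, assuming the flattened numbers above read as (x_0, x_1, y) triples (1, 0.4, 0.21), (1, 0.8, 0.86), (1, -1.2, 0.35), the hypothesis h(x) = theta_0 + theta_1 * x_1, and squared-error loss. The learning rate eta = 0.1 is an illustrative choice since the original value is cut off; rounding to four decimals follows the problem's instruction.

```python
# One SGD iteration per training example, rounding each parameter to four
# decimal digits and reusing the rounded value, as the problem requires.
data = [(0.4, 0.21), (0.8, 0.86), (-1.2, 0.35)]   # (x_1, y); x_0 is always 1
eta = 0.1                                          # illustrative learning rate
theta0, theta1 = 0.0, 0.0                          # illustrative initialization
for x1, y in data:
    h = theta0 + theta1 * x1          # prediction with current parameters
    err = h - y                       # error BEFORE either update
    theta0 = round(theta0 - eta * err * 1.0, 4)    # intercept feature x_0 = 1
    theta1 = round(theta1 - eta * err * x1, 4)
print(theta0, theta1)
```

Note that the error is computed once per example and used for both parameter updates; updating theta_0 first and then recomputing the error would be a different (incorrect) algorithm.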