
Gradient Descent with Numpy

COMP 4332 Tutorial 3

Feb 25


Outline

  • Main idea of Gradient Descent

  • 1-dimensional example

  • 2-dimensional example

  • Useful tools in SciPy

  • Simple plotting


Main idea of Gradient Descent

  • Goal: find a (local) minimum of a function

  • To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.

  • Result: a local minimum (see the sketch below)
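
In symbols, each iteration performs x <- x - step * f'(x) for some small step > 0. A minimal generic sketch of one such iteration (gd_step, grad_f and step are placeholder names, not from the slides):

def gd_step(x, grad_f, step):
    # one gradient descent iteration: move against the gradient
    return x - step * grad_f(x)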


1-dimensional example

  • f(x) = x^2 + 2x + 3 (the function used in the code below)

  • Local minimum at (-1, 2)


1-dimensional example

  • Randomly assign a starting value to x, e.g. x = 3

  • Find the gradient f'(x) at that point and step against it: x <- x - step * f'(x)

  • Let the length of a step be 0.2

  • Repeat this process

  • … after many iterations x settles at the minimum (the first two updates are worked out below)
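
To make the update concrete, here are the first two steps by hand; they match rows 0 and 1 of the printed result:

x0 = 3:   f'(3) = 2*3 + 2 = 8,   x1 = 3 - 0.2*8 = 1.4,       f(1.4) = 7.76
x1 = 1.4: f'(1.4) = 4.8,         x2 = 1.4 - 0.2*4.8 = 0.44,  f(0.44) = 4.0736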


1-dimensional example

  • Python code

"""

1 dimension example to optimize

Created on Feb 1, 2012

@author: kxmo

"""

import numpy

def f(x):

y = x*x + 2*x +3

return y

def diff(x):

y = 2*x+2

return y

defgradient_descent():

print "gradient_descent \n"

x = 3

step = 0.2

for iter in xrange(100):

dfx = diff(x)

x = x-dfx*step

y = f(x)

print iter

print x,y

if __name__ == '__main__':

print "Begin"

gradient_descent()

print "End"


1-dimensional example

  • Python code result

0 1.4 7.76
1 0.44 4.0736
2 -0.136 2.746496
3 -0.4816 2.26873856
4 -0.68896 2.0967458816
5 -0.813376 2.03482851738
6 -0.8880256 2.01253826626
7 -0.93281536 2.00451377585
8 -0.959689216 2.00162495931
9 -0.9758135296 2.00058498535
10 -0.98548811776 2.00021059473
11 -0.991292870656 2.0000758141
12 -0.994775722394 2.00002729308
13 -0.996865433436 2.00000982551
…
48 -0.999999999946 2.0
49 -0.999999999968 2.0
50 -0.999999999981 2.0
51 -0.999999999988 2.0
52 -0.999999999993 2.0
53 -0.999999999996 2.0
54 -0.999999999997 2.0
55 -0.999999999998 2.0
56 -0.999999999999 2.0
57 -0.999999999999 2.0
58 -1.0 2.0
59 -1.0 2.0
60 -1.0 2.0
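
The geometric convergence in the printout is easy to verify: with step 0.2 the update is x <- x - 0.2*(2x + 2) = 0.6x - 0.4, which means x + 1 <- 0.6*(x + 1). The distance to the minimizer -1 therefore shrinks by a fixed factor of 0.6 per iteration.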


2-dimensional example

  • Rosenbrock function: f(x0, x1) = (1 - x0)^2 + 100*(x1 - x0^2)^2

  • Minimum at (1, 1), where f(1, 1) = 0


2-dimensional example

  • Input: a vector x = (x0, x1)

  • Gradient (as implemented in the code below):
    df/dx0 = -2*(1 - x0) - 400*x0*(x1 - x0^2)
    df/dx1 = 200*(x1 - x0^2)


2-dimensional example

  • Let the starting point be x = (1, -1.5) and the step length be 0.002 (as in the code below)

  • … repeat and repeat

  • The step length must be chosen very carefully (see the worked step below)
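
Working the first update by hand shows why: at the start x = (1, -1.5) the gradient is large, df/dx0 = -400*1*(-1.5 - 1) = 1000 and df/dx1 = 200*(-1.5 - 1) = -500. With step 0.002 the next point is (1 - 2, -1.5 + 1) = (-1, -0.5) with f = 229, which is row 0 of the result below; with the step 0.2 from the 1-dimensional example the iterate would jump to (-199, 98.5) and diverge.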


2-dimensional example

  • Python code

"""
2 dimension example to optimize Rosenbrock function

Created on Feb 2, 2012

@author: kxmo
"""
from numpy import array

def f(x):
    y = (1 - x[0])**2 + 100*((x[1] - x[0]**2)**2)
    return y

def diff(x):
    ## diff on x[0] and x[1]
    x1 = -2*(1 - x[0]) - 400*(x[1] - x[0]**2)*x[0]
    x2 = 200*(x[1] - x[0]**2)
    y = array([x1, x2])
    return y

def gradient_descent():
    print "gradient_descent \n"
    x = array([1, -1.5])
    step = 0.002
    for iter in xrange(10000):
        dfx = diff(x)
        x = x - dfx*step  # x -= dfx*step
        y = f(x)
        print iter,
        print x, y

if __name__ == '__main__':
    print "Begin"
    gradient_descent()
    print "End"


2-dimensional example

  • Python code result

0 [-1. -0.5] 229.0
1 [ 0.208 0.1 ] 0.9491613696
2 [ 0.22060887 0.0773056 ] 0.689460178665
3 [ 0.22878055 0.06585067] 0.613031790076
4 [ 0.23433811 0.06044662] 0.589298719311
5 [ 0.2384379 0.05823371] 0.580167571748
6 [ 0.24174759 0.05768128] 0.575004572635
7 [ 0.2446335 0.05798553] 0.570924522072
8 [ 0.24729094 0.05872954] 0.567158149611
9 [ 0.24982238 0.05969885] 0.563502163823
10 [ 0.252281 0.0607838] 0.559902757103
…
6571 [ 0.99850149 0.99699922] 2.24914102097e-06
6572 [ 0.99850269 0.99700162] 2.24554097151e-06
6573 [ 0.99850389 0.99700402] 2.2419466913e-06
6574 [ 0.99850508 0.99700642] 2.23835817108e-06
6575 [ 0.99850628 0.99700881] 2.2347754016e-06
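
Even after more than 6500 iterations, plain gradient descent is still creeping toward the minimum (1, 1) along the curved valley of the Rosenbrock function. This slow convergence is the motivation for the quasi-Newton solver on the next slides.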


Useful tools in SciPy

  • We can use the L-BFGS solver in SciPy, a quasi-Newton method that approximates second-order information from recent gradients

  • The inputs to the L-BFGS solver are:

    • Objective function: f

    • Gradient of f: diff

    • Initial guess: x0

    • x, f, d = fmin_l_bfgs_b(f, x0, fprime=diff)


Using the L-BFGS solver in SciPy

from numpy import *
from scipy.optimize import *

def f(x):
    y = (1 - x[0])**2 + 100*((x[1] - x[0]**2)**2)
    return y

def diff(x):
    ## diff on x[0] and x[1]
    x1 = -2*(1 - x[0]) - 400*(x[1] - x[0]**2)*x[0]
    x2 = 200*(x[1] - x[0]**2)
    y = array([x1, x2])
    return y

if __name__ == '__main__':
    print "Begin"
    ##gradient_descent(f,diff)
    x0 = array([1, -1.5])
    x, f, d = fmin_l_bfgs_b(f, x0, fprime=diff, iprint=1)
    print "The result is", x
    print "smallest value is", f
    print "End"


Result

>>>

Begin

The result is [ 1.00000001 1.00000002]

smallest value is 1.05243324104e-16

End

>>>


Simple Plotting

import numpy
import pylab

# Build a vector of 10000 normal deviates with variance 0.5^2 and mean 2
mu, sigma = 2, 0.5
v = numpy.random.normal(mu, sigma, 10000)

# Plot a normalized histogram with 50 bins
pylab.hist(v, bins=50, normed=1)  # matplotlib version (plot)
pylab.show()

# Compute the histogram with numpy and then plot it
(n, bins) = numpy.histogram(v, bins=50, normed=True)  # NumPy version (no plot)
pylab.plot(.5*(bins[1:] + bins[:-1]), n)
pylab.show()
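
Recent numpy and matplotlib releases have replaced the normed argument with density; assuming such a version, the two histogram calls above become:

pylab.hist(v, bins=50, density=True)
(n, bins) = numpy.histogram(v, bins=50, density=True)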