
RECITATION 2 APRIL 28





  1. RECITATION 2, APRIL 28 • Spline and kernel methods • Gaussian processes • Mixture modeling for density estimation

  2. Penalized Cubic Regression Splines • gam() in library "mgcv" • gam( y ~ s(x, bs="cr", k=n.knots), knots=list(x=c(…)), data=dataset) • By default, the smoothing parameter is selected by GCV • R Demo 1 (a minimal sketch follows below)
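A minimal R sketch of the gam() call above, under assumed toy data (the dataset, knot grid, and n.knots value are illustrative, not from the recitation):

    # Penalized cubic regression spline via mgcv; GCV selects the smoothing parameter
    library(mgcv)

    set.seed(1)
    dataset <- data.frame(x = seq(0, 1, length.out = 200))
    dataset$y <- sin(2 * pi * dataset$x) + rnorm(200, sd = 0.3)

    n.knots <- 10
    fit <- gam(y ~ s(x, bs = "cr", k = n.knots),
               knots = list(x = seq(0, 1, length.out = n.knots)),
               data = dataset, method = "GCV.Cp")  # GCV smoothness selection

    summary(fit)  # effective degrees of freedom, GCV score
    plot(fit)     # fitted spline with pointwise confidence band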

  3. Kernel Method • Nadaraya-Watson: a locally constant model • Local polynomial: a locally linear model • How to define "local"? By a kernel function, e.g. the Gaussian kernel • R Demo 1 • R package: "locfit" • Function: locfit(y ~ x, kern = "gauss", deg = , alpha = ) • Bandwidth selected by GCV: gcvplot(y ~ x, kern = "gauss", deg = , alpha = <range of candidate bandwidths>) (see the sketch below)
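A matching locfit sketch, reusing the toy data above (the bandwidth grid and the chosen alpha are illustrative assumptions):

    # Local regression with a Gaussian kernel; deg = 0 is Nadaraya-Watson,
    # deg = 1 is locally linear
    library(locfit)

    # GCV over a grid of nearest-neighbor bandwidth fractions alpha
    alphas <- seq(0.2, 0.8, by = 0.05)
    gcvplot(y ~ x, data = dataset, kern = "gauss", deg = 1, alpha = alphas)

    fit <- locfit(y ~ x, data = dataset, kern = "gauss", deg = 1, alpha = 0.4)
    plot(fit)
    points(dataset$x, dataset$y, cex = 0.3)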

  4. Gaussian Processes • A distribution on functions • f ~ GP(m, κ) • m: mean function • κ: covariance function • p(f(x_1), . . . , f(x_n)) ∼ N_n(μ, K) • μ = [m(x_1), ..., m(x_n)] • K_ij = κ(x_i, x_j) • Idea: if x_i and x_j are similar according to the kernel, then f(x_i) is similar to f(x_j) (illustrated by the prior draws below)
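A short sketch of this idea: drawing a function from a zero-mean GP prior with a squared-exponential kernel (the kernel choice, length-scale, and input grid are illustrative assumptions):

    # Squared-exponential kernel: nearby inputs get strongly correlated outputs
    sq_exp <- function(x1, x2, ell = 0.3, sigma_f = 1) {
      sigma_f^2 * exp(-0.5 * outer(x1, x2, "-")^2 / ell^2)
    }

    xs <- seq(0, 1, length.out = 100)
    K  <- sq_exp(xs, xs) + 1e-8 * diag(length(xs))  # jitter for numerical stability

    # (f(x_1), ..., f(x_n)) ~ N_n(0, K), sampled via the Cholesky factor
    R <- chol(K)                      # upper triangular, K = t(R) %*% R
    f <- t(R) %*% rnorm(length(xs))
    plot(xs, f, type = "l", ylab = "f(x)", main = "Draw from a GP prior")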

  5. Gaussian Processes – Noise-free observations • Example task: learn a function f(x) to estimate y from data (x, y) • A function can be viewed as a random variable of infinite dimension • A GP provides a distribution over functions

  6. Gaussian Processes – Noise-free observations • Model: (x, f) are the observed locations and values (training data); (x*, f*) are the test locations and values • After observing noise-free data (x, f), predict f* by conditioning the joint Gaussian (equations reconstructed below) • The kernel's length-scale controls how quickly correlation decays with distance between inputs • R Demo 2
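The equations on this slide did not survive extraction; the standard noise-free predictive equations from Rasmussen & Williams, Chapter 2 (cited on slide 8), which the slide presumably showed, are:

    \begin{bmatrix} f \\ f_* \end{bmatrix}
      \sim \mathcal{N}\!\left(
        \begin{bmatrix} \mu \\ \mu_* \end{bmatrix},
        \begin{bmatrix} K(x,x) & K(x,x_*) \\ K(x_*,x) & K(x_*,x_*) \end{bmatrix}
      \right),

    f_* \mid x_*, x, f \sim \mathcal{N}\big(
      \mu_* + K(x_*,x)\,K(x,x)^{-1}(f - \mu),\;
      K(x_*,x_*) - K(x_*,x)\,K(x,x)^{-1}K(x,x_*) \big).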

  7. Gaussian Processes – Noisy observations (GP for Regression) • Model: noisy versions of f are observed • (x, y) are the observed locations and values (training data) • (x*, f*) are the test locations and values • After observing noisy data (x, y), predict f* by conditioning as before, with the noise variance added to the training covariance (equations reconstructed below) • R Demo 3
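Again reconstructing the standard equations (Rasmussen & Williams, Ch. 2): assuming the usual observation model y = f(x) + \varepsilon with \varepsilon \sim \mathcal{N}(0, \sigma_n^2), the noise variance enters the training-covariance block, giving

    f_* \mid x_*, x, y \sim \mathcal{N}\big(
      \mu_* + K(x_*,x)\,[K(x,x) + \sigma_n^2 I]^{-1}(y - \mu),\;
      K(x_*,x_*) - K(x_*,x)\,[K(x,x) + \sigma_n^2 I]^{-1}K(x,x_*) \big).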

  8. References • Chapter 2 of Gaussian Processes for Machine Learning, Carl Edward Rasmussen and Christopher K. I. Williams • 527 lecture notes by Emily Fox

  9. Mixture Models – Density Estimation • EM algorithm vs. Bayesian Markov chain Monte Carlo (MCMC) • Remember: • the EM algorithm is an iterative algorithm that MAXIMIZES the LIKELIHOOD • MCMC DRAWS FROM the POSTERIOR (i.e. likelihood + prior)

  10. EM algorithm • An iterative procedure that attempts to maximize the log-likelihood, yielding MLE estimates of the mixture-model parameters • I.e. one final density estimate (a minimal sketch follows below)
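A minimal EM sketch for a two-component Gaussian mixture in one dimension (the toy data, initial values, and iteration count are illustrative assumptions):

    set.seed(2)
    dat <- c(rnorm(150, mean = -2, sd = 1), rnorm(100, mean = 3, sd = 0.8))

    pi1 <- 0.5; mu <- c(-1, 1); sdv <- c(1, 1)  # initial parameter guesses

    for (iter in 1:200) {
      # E-step: responsibility of component 1 for each point
      d1 <- pi1 * dnorm(dat, mu[1], sdv[1])
      d2 <- (1 - pi1) * dnorm(dat, mu[2], sdv[2])
      r1 <- d1 / (d1 + d2)

      # M-step: weighted maximum-likelihood updates
      pi1    <- mean(r1)
      mu[1]  <- sum(r1 * dat) / sum(r1)
      mu[2]  <- sum((1 - r1) * dat) / sum(1 - r1)
      sdv[1] <- sqrt(sum(r1 * (dat - mu[1])^2) / sum(r1))
      sdv[2] <- sqrt(sum((1 - r1) * (dat - mu[2])^2) / sum(1 - r1))
    }

    # One final density estimate from the MLE parameters
    curve(pi1 * dnorm(x, mu[1], sdv[1]) + (1 - pi1) * dnorm(x, mu[2], sdv[2]),
          from = -6, to = 6, ylab = "density")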

  11. Bayesian Mixture Modeling (MCMC) • Uses an iterative procedure to DRAW SAMPLES from the posterior (then you can average the draws, etc.) • You don't need to understand the fine details, but know that at every iteration you get a set of parameter estimates drawn from your posterior distribution (a Gibbs-sampling sketch follows below)
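A minimal Gibbs-sampling sketch for the same two-component mixture (assumptions: known unit component variances, N(0, 10^2) priors on the means, a Beta(1, 1) prior on the weight; reuses dat from the EM sketch):

    set.seed(3)
    n <- length(dat)
    pi1 <- 0.5; mu <- c(-1, 1)
    draws <- matrix(NA, nrow = 1000, ncol = 3)

    for (s in 1:1000) {
      # Sample each label z_i from its posterior probability of component 1
      p1 <- pi1 * dnorm(dat, mu[1], 1)
      p2 <- (1 - pi1) * dnorm(dat, mu[2], 1)
      z  <- rbinom(n, 1, p1 / (p1 + p2))

      # Sample each mean from its conjugate Normal full conditional
      for (k in 1:2) {
        idx <- if (k == 1) z == 1 else z == 0
        post_var  <- 1 / (sum(idx) + 1 / 100)
        post_mean <- post_var * sum(dat[idx])
        mu[k] <- rnorm(1, post_mean, sqrt(post_var))
      }

      # Sample the weight from its conjugate Beta full conditional
      pi1 <- rbeta(1, 1 + sum(z), 1 + n - sum(z))

      draws[s, ] <- c(pi1, mu)  # one set of parameter draws per iteration
    }

    colMeans(draws[-(1:200), ])  # posterior means after discarding burn-in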
