
The Role of Learning in Vision





Presentation Transcript


  1. The Role of Learning in Vision
  3.30pm: Rob Fergus
  3.40pm: Andrew Ng
  3.50pm: Kai Yu
  4.00pm: Yann LeCun
  4.10pm: Alan Yuille
  4.20pm: Deva Ramanan
  4.30pm: Erik Learned-Miller
  4.40pm: Erik Sudderth
  4.50pm: Spotlights - Qiang Ji, M-H Yang
  4.55pm: Discussion
  5.30pm: End
  Overview: Feature / Deep Learning • Compositional Models • Learning Representations • Low-level Representations • Learning on the fly

  2. An Overview of Hierarchical Feature Learning and Relations to Other Models Rob Fergus Dept. of Computer Science, Courant Institute, New York University

  3. Motivation
  • A multitude of hand-designed features is currently in use: SIFT, HOG, LBP, MSER, Color-SIFT, …
  • Could we learn the features instead?
  • Also, these descriptors capture only low-level edge gradients
  Yan & Huang (winner of the PASCAL 2010 classification competition); Felzenszwalb, Girshick, McAllester and Ramanan, PAMI 2009

  4. Beyond Edges?
  • Mid-level cues: continuation, parallelism, junctions, corners ("Tokens" from Vision by D. Marr)
  • High-level object parts
  • Difficult to hand-engineer → what about learning them?

  5. Deep/Feature Learning Goal
  • Build a hierarchy of feature extractors (≥ 1 layers)
  • All the way from pixels → classifier
  • Homogeneous structure per layer
  • Unsupervised training
  Pipeline: Image/Video Pixels → Layer 1 → Layer 2 → Layer 3 → Simple Classifier
  • Numerous approaches:
  • Restricted Boltzmann Machines (Hinton, Ng, Bengio, …)
  • Sparse coding (Yu, Fergus, LeCun)
  • Auto-encoders (LeCun, Bengio)
  • ICA variants (Ng, Cottrell)
  • & many more…

  6. Single Layer Architecture
  Input: Image Pixels / Features → Filter → Normalize → Pool → Output: Features / Classifier
  • Details in the boxes matter (especially in a hierarchy)
  • Links to neuroscience
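The filter → normalize → pool layer above can be sketched in a few lines of numpy. This is a minimal illustration, not any specific published architecture; the rectifying non-linearity, divisive normalization across features, and non-overlapping max-pooling are one possible choice for each box.

```python
import numpy as np

def single_layer(image, filters, pool=2, eps=1e-5):
    """One generic feature-extraction layer: filter -> rectify ->
    normalize across features -> spatial max-pool.
    image: (H, W) array; filters: (K, h, w) filter bank (illustrative shapes)."""
    K, fh, fw = filters.shape
    H, W = image.shape
    oh, ow = H - fh + 1, W - fw + 1
    # Filter: "valid" correlation of the image with each filter in the bank
    maps = np.empty((K, oh, ow))
    for k in range(K):
        for i in range(oh):
            for j in range(ow):
                maps[k, i, j] = np.sum(image[i:i+fh, j:j+fw] * filters[k])
    maps = np.maximum(maps, 0)  # non-linearity (rectification)
    # Divisive normalization: features at each location compete for magnitude
    maps = maps / (eps + np.linalg.norm(maps, axis=0, keepdims=True))
    # Spatial max-pooling over non-overlapping pool x pool cells
    ph, pw = oh // pool, ow // pool
    maps = maps[:, :ph*pool, :pw*pool].reshape(K, ph, pool, pw, pool)
    return maps.max(axis=(2, 4))
```

Stacking copies of this function (features of one layer fed as input to the next) gives the homogeneous hierarchy described on the previous slide.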

  7. Example Feature Learning Architectures Filter with Dictionary (patch/tiled/convolutional) + Non-linearity Pixels / Features Normalizationbetween feature responses (Group) Sparsity Max / Softmax Local Contrast Normalization (Subtractive / Divisive) Spatial/Feature (Sum or Max) Features

  8. SIFT Descriptor
  Image Pixels → Apply Gabor filters → Spatial pool (Sum) → Normalize to unit length → Feature Vector
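The point of this slide is that SIFT itself fits the filter → pool → normalize template. A toy sketch (not Lowe's full SIFT: gradient orientation histograms stand in for the filtering stage, and the grid/bin sizes are the conventional 4×4×8 layout):

```python
import numpy as np

def sift_like_descriptor(patch, grid=4, bins=8, eps=1e-7):
    """Toy SIFT-style descriptor: orientation histograms (filter stage),
    summed over a grid x grid spatial layout (pool stage),
    then scaled to unit length (normalize stage)."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ori = np.mod(np.arctan2(gy, gx), 2 * np.pi)  # orientation in [0, 2*pi)
    bin_idx = np.minimum((ori / (2 * np.pi) * bins).astype(int), bins - 1)
    h, w = patch.shape
    cy, cx = h // grid, w // grid
    desc = np.zeros((grid, grid, bins))
    for i in range(grid):  # spatial pooling (sum) over cells
        for j in range(grid):
            cb = bin_idx[i*cy:(i+1)*cy, j*cx:(j+1)*cx]
            cm = mag[i*cy:(i+1)*cy, j*cx:(j+1)*cx]
            for b in range(bins):
                desc[i, j, b] = cm[cb == b].sum()
    desc = desc.ravel()
    return desc / (np.linalg.norm(desc) + eps)  # normalize to unit length
```

With the default 4×4 grid and 8 bins this yields the familiar 128-dimensional vector.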

  9. Spatial Pyramid Matching
  Lazebnik, Schmid, Ponce [CVPR 2006]
  SIFT Features → Filter with Visual Words → Max → Multi-scale spatial pool (Sum) → Classifier
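The multi-scale pooling stage can be sketched as histograms of visual-word assignments over successively finer grids. This is a simplified sketch of the idea in Lazebnik et al., without the per-level kernel weights of the original:

```python
import numpy as np

def spatial_pyramid(word_map, vocab_size, levels=3):
    """Multi-scale spatial pooling of visual-word assignments.
    word_map: (H, W) int array of per-location visual-word indices.
    Returns concatenated word histograms over 1x1, 2x2, 4x4, ... grids."""
    H, W = word_map.shape
    feats = []
    for level in range(levels):
        cells = 2 ** level            # grid resolution at this pyramid level
        ch, cw = H // cells, W // cells
        for i in range(cells):
            for j in range(cells):
                cell = word_map[i*ch:(i+1)*ch, j*cw:(j+1)*cw]
                feats.append(np.bincount(cell.ravel(), minlength=vocab_size))
    return np.concatenate(feats)
```

The concatenated histogram is what gets handed to the classifier at the end of the pipeline.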

  10. Role of Normalization
  • Lots of different mechanisms (max, sparsity, LCN, etc.)
  • All induce local competition between features to explain the input
  • "Explaining away", just like in top-down models, but via a more local mechanism
  • Example: Convolutional Sparse Coding (an |·|₁ penalty on each feature map, convolved with the filters)
  Zeiler et al. [CVPR'10/ICCV'11], Kavukcuoglu et al. [NIPS'10], Yang et al. [CVPR'10]
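The "explaining away" effect of a sparsity penalty is easy to see in the non-convolutional case. Below is a minimal ISTA (iterative shrinkage-thresholding) sketch for the standard sparse-coding objective min_z ½‖x − Dz‖² + λ‖z‖₁; the soft-threshold step is what makes dictionary atoms compete to explain the input. This is a generic lasso solver, not the convolutional formulation of the cited papers:

```python
import numpy as np

def ista(D, x, lam=0.05, steps=500):
    """ISTA for sparse coding: min_z 0.5*||x - D z||^2 + lam*||z||_1.
    D: (d, k) dictionary with unit-norm columns; x: (d,) input."""
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(steps):
        g = D.T @ (D @ z - x)            # gradient of the quadratic term
        z = z - g / L
        # Soft threshold: small responses are suppressed by their competitors
        z = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return z
```

If x is exactly one dictionary atom, the code concentrates on that atom and the correlated-but-weaker atoms are "explained away" to (near) zero.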

  11. Role of Pooling
  • Spatial pooling: invariance to small transformations; larger receptive fields
  • Pooling across feature groups: gives AND/OR type behavior; compositional models of Zhu, Yuille; Zeiler, Taylor, Fergus [ICCV 2011]
  • Pooling with latent variables (& springs): pictorial structures models; Felzenszwalb, Girshick, McAllester, Ramanan [PAMI 2009]; Chen, Zhu, Lin, Yuille, Zhang [NIPS 2007]
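The first bullet, invariance to small transformations, can be demonstrated directly: shift a feature response within a pooling cell and the pooled output does not change. A minimal non-overlapping max-pool sketch:

```python
import numpy as np

def max_pool(fmap, size):
    """Non-overlapping spatial max-pooling over size x size cells."""
    H, W = fmap.shape
    h, w = H // size, W // size
    return fmap[:h*size, :w*size].reshape(h, size, w, size).max(axis=(1, 3))

# Invariance demo: a response peak moved by one pixel within the same
# 4x4 pooling cell leaves the pooled map identical.
a = np.zeros((8, 8)); a[2, 2] = 1.0
b = np.zeros((8, 8)); b[3, 3] = 1.0   # same peak, shifted by (1, 1)
```

Larger pooling cells buy more invariance, at the cost of localization, which is exactly the trade-off the latent-variable "springs" on this slide are designed to soften.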

  12. Object Detection with Discriminatively Trained Part-Based Models
  Felzenszwalb, Girshick, McAllester, Ramanan [PAMI 2009]
  HOG Pyramid → Apply object part filters → Pool part responses (latent variables & springs) → Non-max Suppression (Spatial) → Score
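The "latent variables & springs" pooling step maximizes, over part placements, the part's appearance score minus a quadratic deformation cost. A brute-force sketch of that step for one part (DPM computes this efficiently with generalized distance transforms; the quadratic spring here is the illustrative special case):

```python
import numpy as np

def best_part_placement(part_score, anchor, spring=0.1):
    """Pick the placement of one part that maximizes
    appearance score - spring * squared distance from the anchor.
    part_score: (H, W) filter-response map; anchor: (y, x) rest position."""
    H, W = part_score.shape
    ys, xs = np.mgrid[0:H, 0:W]
    deform = spring * ((ys - anchor[0]) ** 2 + (xs - anchor[1]) ** 2)
    total = part_score - deform
    idx = np.unravel_index(np.argmax(total), total.shape)
    return idx, total[idx]
```

A stiff spring keeps the part near its anchor even when a stronger response sits far away; a loose spring lets the part chase the response, which is the flexibility that makes the model deformable.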
