
Unsupervised Joint Alignment of Complex Images Gary B Huang, Vidit Jain, Erik Learned-Miller

Presentation Transcript


  1. Unsupervised Joint Alignment of Complex Images • Gary B Huang, Vidit Jain, Erik Learned-Miller

  2. Joint Face Alignment • The Recognition Pipeline • Most systems ignore the middle stage (alignment), relying on the initial detector to do a rough alignment • Alignment reduces variability and allows for conditioning on spatial position and analysis of structure • Two major drawbacks to current alignment methods • Designed for a single class • Require manual labeling of either specific features or pose • More involved than simple discrete labels for detection and recognition • AAM: ~80 landmarks for >100 training images • Unsupervised method with congealing • No manually selected landmarks or hand-selected parts • No image explicitly labeled as canonical pose • End result entirely determined by data

  3. Congealing • Intuition • Intra-class images have similar structure and shape • Thus, low variability of pixel values at a specific location • Distribution Field • Distribution over alphabet ({0,1} for binary images) at each pixel • Set of images defines an empirical distribution field • Congealing • Update distribution field from transformed images • Increase likelihood of image with respect to distribution field
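The two alternating steps on this slide can be sketched in a few lines of Python. This is only an illustrative toy under loud assumptions, not the authors' implementation: the images are tiny binary grids, the only transformations are circular translations found by exhaustive search (the slides do not specify the transformation family or optimizer), and the function names `shift`, `distribution_field`, `field_entropy`, and `congeal` are hypothetical. Each image is moved only when doing so strictly lowers the summed per-pixel entropy of the empirical distribution field.

```python
import math

def shift(img, dy, dx):
    """Circularly shift a 2-D image (list of rows) by (dy, dx)."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)] for y in range(h)]

def distribution_field(images):
    """Empirical P(pixel = 1) at each location over a stack of binary images."""
    n = len(images)
    h, w = len(images[0]), len(images[0][0])
    return [[sum(im[y][x] for im in images) / n for x in range(w)] for y in range(h)]

def field_entropy(field):
    """Sum of per-pixel binary entropies (in bits) of a distribution field."""
    total = 0.0
    for row in field:
        for p in row:
            for q in (p, 1.0 - p):
                if q > 0.0:
                    total -= q * math.log2(q)
    return total

def congeal(images, max_shift=1, iterations=5):
    """Greedy coordinate descent over small translations (assumed transform set)."""
    images = [[row[:] for row in im] for im in images]  # work on a copy
    moves = [(dy, dx)
             for dy in range(-max_shift, max_shift + 1)
             for dx in range(-max_shift, max_shift + 1)]
    for _ in range(iterations):
        changed = False
        for i in range(len(images)):
            best = images[i]
            best_h = field_entropy(distribution_field(images))
            for dy, dx in moves:
                cand = shift(images[i], dy, dx)
                trial = images[:i] + [cand] + images[i + 1:]
                h = field_entropy(distribution_field(trial))
                if h < best_h - 1e-12:  # accept strict improvements only
                    best, best_h = cand, h
            if best is not images[i]:
                images[i] = best
                changed = True
        if not changed:
            break
    return images
```

Because the field is recomputed from the transformed stack after every accepted move, lowering the total entropy and raising each image's likelihood under the field go hand in hand in this toy setting.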

  4. Congealing • How to align a new image after congealing? • Insert into training set, re-run algorithm • More efficient to save sequence of distribution fields from congealing • High-entropy to low-entropy sequence → “Image Funnel” • Funneling: increase likelihood of new image at each iteration according to corresponding distribution field • [Figure: New Image → Image Funnel → Aligned Image]
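Funneling as described above might look like the following minimal sketch. It assumes binary images, a saved list of per-pixel Bernoulli fields (one per congealing iteration), and circular translations as the only transformations. The names `log_likelihood` and `funnel`, and the clamping constant `eps`, are illustrative choices rather than anything stated in the slides.

```python
import math

def shift(img, dy, dx):
    """Circularly shift a 2-D image (list of rows) by (dy, dx)."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)] for y in range(h)]

def log_likelihood(img, field, eps=1e-3):
    """Log-probability of a binary image under a per-pixel Bernoulli field."""
    total = 0.0
    for img_row, p_row in zip(img, field):
        for v, p in zip(img_row, p_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0); eps is an assumption
            total += math.log(p if v else 1.0 - p)
    return total

def funnel(img, fields, max_shift=1):
    """Pass a new image through the saved fields, from high entropy to low entropy."""
    # (0, 0) comes first so that ties leave the image where it is.
    moves = [(0, 0)] + [(dy, dx)
                        for dy in range(-max_shift, max_shift + 1)
                        for dx in range(-max_shift, max_shift + 1)
                        if (dy, dx) != (0, 0)]
    for field in fields:
        candidates = [shift(img, dy, dx) for dy, dx in moves]
        img = max(candidates, key=lambda c: log_likelihood(c, field))
    return img
```

The early, blurrier fields pull a badly misaligned image roughly into place, and the later, sharper fields refine it, which is why the whole sequence is saved rather than only the final field.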

  5. Congealing Complex Images • Congealing has proven to work well on certain object classes • Traditionally applied directly to pixel values • Applied successfully to binary handwritten digits and MRI volumes • Our goal: Extend congealing to deal with noise in real-world images • Complex and variable lighting effects • Occlusions • Highly varied foreground objects (hair, hats, glasses…) • Highly varied backgrounds

  6. Congealing Complex Images • Extending Congealing to Complex Images • Traditionally congealing is done on pixel intensities • High variation due to lighting and variable foreground → high entropy even when correctly aligned • Congealing on edge values • No “basin of attraction”, plateaus in optimization landscape • Integrate over window → SIFT descriptor at each pixel • Each descriptor is a 32-dimensional vector, too large to estimate entropy

  7. Congealing Complex Images • Extending Congealing to Face Images (cont.) • Cluster SIFT descriptors using k-means • Congealing on hard assignments forces pixels to take a relatively small number of values • Similar local minima problems as with edge values • Initial experiments with hard assignments led to congealing terminating early with no significant changes from initial alignment • Use soft assignment of pixels to clusters • Each pixel is a multinomial distribution, with probabilities equal to probability of belonging to each cluster • Does not change nature of distribution field • Distribution field is still a set of distributions, one at each pixel, over the possible clusters • Analogy with grayscale using binary alphabet • Gray pixels are treated as mixtures of underlying black and white “subpixels”
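As a rough illustration of soft assignment, the sketch below turns one descriptor into a distribution over clusters and averages those distributions across images to get the field at a single pixel. The slides do not say how the membership probabilities are computed. Here they are assumed to come from equal-prior isotropic Gaussians around each cluster center with a shared width `sigma`. The function names and toy 2-D descriptors are hypothetical.

```python
import math

def soft_assign(descriptor, centers, sigma=1.0):
    """Posterior over clusters for one descriptor (assumed Gaussian model, equal priors)."""
    sq = [sum((d - c) ** 2 for d, c in zip(descriptor, center)) for center in centers]
    m = min(sq)  # subtract the smallest distance for numerical stability
    weights = [math.exp(-(s - m) / (2.0 * sigma ** 2)) for s in sq]
    z = sum(weights)
    return [w / z for w in weights]

def field_at_pixel(posteriors):
    """Distribution-field entry at one pixel: mean of the per-image posteriors there."""
    n, k = len(posteriors), len(posteriors[0])
    return [sum(p[j] for p in posteriors) / n for j in range(k)]

def entropy(dist):
    """Entropy (in bits) of a discrete distribution over clusters."""
    return -sum(p * math.log2(p) for p in dist if p > 0.0)
```

With hard assignments every posterior would be a one-hot vector, so small transformations often leave the field unchanged. Soft posteriors vary smoothly with the descriptor, which is the grayscale-as-mixture analogy drawn on the slide.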

  8. Congealing Complex Images • [Figure: Window around pixel → SIFT vector and clusters → Posterior distribution]

  9. Results (faces) • Congealed with 300 images from “Faces in the Wild” • Realistic data set of news photos with different people, complex backgrounds, variable illumination and foreground appearance

  10. Results (cars) • Congealed with 125 rear car images (variable background/lighting) • Achieved with no labeling and no changes to code

  11. Results on Recognition • Tested effect on recognition • Used trained hyper-feature based recognizer (Jain et al.) • Tested using outputs of Viola-Jones, Zhou (supervised), and funneling • Congealing improves recognition with no added supervision

  12. Future Work • Two-tier alignment process • Score alignment results based on likelihood under final distribution field, align low scoring images in separate stage
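One hedged reading of the scoring step is sketched below: each aligned image is scored by its mean per-pixel log-likelihood under the final distribution field, and images below a cutoff are set aside for a second alignment stage. The binary-image setting, the `threshold` parameter, and the names `log_likelihood` and `split_by_score` are assumptions made for illustration. The slide does not specify how scores would be thresholded.

```python
import math

def log_likelihood(img, field, eps=1e-3):
    """Log-probability of a binary image under a per-pixel Bernoulli field."""
    total = 0.0
    for img_row, p_row in zip(img, field):
        for v, p in zip(img_row, p_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0); eps is an assumption
            total += math.log(p if v else 1.0 - p)
    return total

def split_by_score(images, field, threshold):
    """Split images into (well aligned, needs a second pass) by mean log-likelihood."""
    good, poor = [], []
    for img in images:
        n_pixels = len(img) * len(img[0])
        score = log_likelihood(img, field) / n_pixels
        (good if score >= threshold else poor).append(img)
    return good, poor
```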
