Machine Learning: Foundations Course TAU – 2012A Prof. Yishay Mansour

Presentation Transcript


  1. Machine Learning: Foundations Course TAU – 2012A Prof. Yishay Mansour • TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation J. Shotton*, J. Winn†, C. Rother†, and A. Criminisi† • * University of Cambridge • † Microsoft Research Ltd, Cambridge, UK • Yaniv Bar, March 2013

  2. Goal • Simultaneous recognition and segmentation: • Efficiently detect a large number of object classes and give a pixel-perfect segmentation of an image into these classes.

  3. Data and Classes • Original Paper: 3 DBs. Main DB: MSRC 21. • MSRC 21-Class Object Recognition Database • 591 hand-labelled images • Original main DB was updated to MSRC 23. • MSRC 23-Class Object Recognition Database • 592 hand-labelled images

  4. High-Level Approach • Learn a classifier for each class based on relative texture locations; the classification is then refined. • Given an image, for each pixel: - Texture-layout features are calculated - A boosting classifier gives the probability of the pixel belonging to each class - The discriminative model combines the boosting output with low-level color, location, and edge information, and the pixel receives its final label.
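The per-pixel labelling step can be sketched roughly as follows. This is only an illustration under stated assumptions: it supposes the boosting stage yields one real-valued score per pixel and class, turns the scores into probabilities with a softmax, and picks the most probable class. It deliberately ignores the color, location, and edge refinement. The names `class_probabilities` and `label_image` are hypothetical and do not come from the presented code.

```python
import numpy as np

def class_probabilities(scores):
    """Softmax over the last axis: scores (H, W, C) -> probabilities (H, W, C)."""
    # Subtract the per-pixel maximum for numerical stability.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    expd = np.exp(shifted)
    return expd / expd.sum(axis=-1, keepdims=True)

def label_image(scores):
    """Assign each pixel the class with the highest probability (no refinement)."""
    return class_probabilities(scores).argmax(axis=-1)

# Toy example: a 1 x 2 image with 3 classes (made-up scores).
scores = np.array([[[2.0, 0.0, 0.0],
                    [0.0, 1.0, 3.0]]])
probs = class_probabilities(scores)
labels = label_image(scores)
```

In the full model the refinement terms can change labels near object boundaries, so this argmax is best read as the unrefined starting point.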

  5. Texture-Layout Features • The most important part of the model is the shape/context potential: it drives object recognition and gives a very rough segmentation. • Other potentials, such as edge and color, refine the segmentation results. • Figure: (a) original image, (b) shape, (c) (b) + edge, (d) (c) + color

  6. For modeling object shape, appearance, and context, we use new texton-based features. A texton is a compact and efficient characterisation of local texture.

  7. What Are Textons • The task is to recognize surfaces made from different materials on the basis of their texture appearance. • Different materials show different texture appearance. • Moreover, texture appearance of the same material changes dramatically due to different viewpoint/lighting settings (specularities, shadows, and occlusions).

  8. Calculating Texture-Layout Features • Computing texton maps: • Convolve a 17-D filter bank (composed of Gaussians, DoGs, and LoGs) with all training images • The filter responses are clustered with K-means • Each pixel is assigned a texton number • Diagram: input image → filter bank → clustering → texton map (colours ⇒ texton indices)
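A minimal sketch of the textonization steps listed on this slide, using NumPy only. It is not the presented MATLAB code: the filter bank here is a toy 4-D bank (two Gaussians and two differences of Gaussians) standing in for the 17-D bank, the K-means loop is a plain hand-written version, and all function names and parameter values are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D normalised Gaussian kernel truncated at 3 sigma."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def smooth(image, sigma):
    """Separable Gaussian blur with reflected borders; output has the input shape."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(image, r, mode="reflect")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def filter_responses(image):
    """Toy 4-D bank (assumption): two Gaussians and two DoGs, stacked per pixel."""
    g1, g2, g4 = smooth(image, 1.0), smooth(image, 2.0), smooth(image, 4.0)
    return np.stack([g1, g2, g1 - g2, g2 - g4], axis=-1)  # H x W x 4

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd iterations; returns the k cluster centres (the textons)."""
    rng = np.random.default_rng(seed)
    centres = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        dist = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centre if a cluster is empty
                centres[j] = members.mean(axis=0)
    return centres

def texton_map(image, centres):
    """Assign each pixel the index of its nearest cluster centre."""
    resp = filter_responses(image)
    flat = resp.reshape(-1, resp.shape[-1])
    dist = ((flat[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1).reshape(image.shape)

# Toy usage on a random grey-level "image" (made-up data).
image = np.random.default_rng(1).random((32, 32))
centres = kmeans(filter_responses(image).reshape(-1, 4), k=5)
textons = texton_map(image, centres)
```

In practice the centres would be learned from the responses of all training images rather than from a single image, as the slide states.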

  9. Capturing appearance:

  10. How Texture-Layout features jointly model texture and layout:
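As a rough illustration, a texture-layout feature pairs a rectangle (placed at an offset relative to the pixel being classified) with a texton index, and its response is the fraction of the rectangle covered by that texton. The sketch below computes this with one integral image per texton so that each response costs four lookups. The function names, the half-open `(dy0, dx0, dy1, dx1)` rectangle convention, and the choice to count out-of-image pixels as zero are assumptions for this example.

```python
import numpy as np

def texton_integral_images(texton_map, num_textons):
    """integral[t, y, x] = number of pixels with texton t in rows < y and cols < x."""
    h, w = texton_map.shape
    integral = np.zeros((num_textons, h + 1, w + 1), dtype=np.int64)
    for t in range(num_textons):
        mask = (texton_map == t).astype(np.int64)
        integral[t, 1:, 1:] = mask.cumsum(axis=0).cumsum(axis=1)
    return integral

def texture_layout_response(integral, pixel, rect, texton):
    """Fraction of the rectangle's area covered by the given texton.

    rect = (dy0, dx0, dy1, dx1): half-open offsets relative to the pixel.
    The part of the rectangle outside the image contributes zero.
    """
    h, w = integral.shape[1] - 1, integral.shape[2] - 1
    y, x = pixel
    dy0, dx0, dy1, dx1 = rect
    area = (dy1 - dy0) * (dx1 - dx0)
    y0, y1 = np.clip([y + dy0, y + dy1], 0, h)
    x0, x1 = np.clip([x + dx0, x + dx1], 0, w)
    ii = integral[texton]
    count = ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
    return count / area

# Toy texton map with three textons (made-up data).
tmap = np.array([[0, 0, 1, 1],
                 [0, 0, 1, 1],
                 [2, 2, 1, 1],
                 [2, 2, 1, 1]])
integral = texton_integral_images(tmap, num_textons=3)
right_of_origin = texture_layout_response(integral, (0, 0), (0, 2, 2, 4), texton=1)
around_centre = texture_layout_response(integral, (1, 1), (-1, -1, 1, 1), texton=0)
at_corner = texture_layout_response(integral, (3, 3), (0, 0, 2, 2), texton=1)
```

Because the rectangle is defined relative to the pixel, the same feature captures both which texture is present and where it lies with respect to the pixel.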

  11. Learning • Learning is done with the Joint Boost algorithm, a multi-class version of the Gentle Boost algorithm. • I've used both AdaBoost.M1 and AdaBoost.MH (the latter reduces the multiclass problem to binary problems, since basic AdaBoost handles only binary classification).
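The multiclass-to-binary reduction behind AdaBoost.MH can be sketched as follows. Each training example is expanded into one binary question per class ("is this example of class l?"), and boosting runs over all example/class pairs. This is a simplified discrete variant with threshold stumps written for illustration; it is neither Joint Boost nor the presented code, and the function names and toy data are assumptions.

```python
import numpy as np

def fit_adaboost_mh(X, y, num_classes, rounds=10):
    """Discrete AdaBoost.MH with decision stumps; returns a list of weak learners."""
    n, d = X.shape
    Y = -np.ones((n, num_classes))
    Y[np.arange(n), y] = 1.0                      # +1 for the true class, -1 otherwise
    D = np.full((n, num_classes), 1.0 / (n * num_classes))  # weights over (example, class)
    model = []
    for _ in range(rounds):
        best = None
        for f in range(d):
            for thr in np.unique(X[:, f]):
                s = np.where(X[:, f] > thr, 1.0, -1.0)
                corr = (D * Y * s[:, None]).sum(axis=0)     # per-class weighted correlation
                edge = np.abs(corr).sum()
                if best is None or edge > best[0]:
                    # Each class votes with the sign that agrees with its correlation.
                    best = (edge, f, thr, np.where(corr >= 0, 1.0, -1.0))
        edge, f, thr, votes = best
        err = np.clip((1.0 - edge) / 2.0, 1e-10, 1.0 - 1e-10)
        alpha = 0.5 * np.log((1.0 - err) / err)
        s = np.where(X[:, f] > thr, 1.0, -1.0)
        D *= np.exp(-alpha * Y * (s[:, None] * votes[None, :]))  # upweight mistakes
        D /= D.sum()
        model.append((alpha, f, thr, votes))
    return model

def predict_adaboost_mh(model, X, num_classes):
    """Sum the weighted per-class votes and return the highest-scoring class."""
    scores = np.zeros((X.shape[0], num_classes))
    for alpha, f, thr, votes in model:
        s = np.where(X[:, f] > thr, 1.0, -1.0)
        scores += alpha * s[:, None] * votes[None, :]
    return scores.argmax(axis=1)

# Toy 1-D data with three classes along a line (made-up data).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 1, 1, 2, 2])
model = fit_adaboost_mh(X, y, num_classes=3, rounds=2)
pred = predict_adaboost_mh(model, X, num_classes=3)
```

A single stump cannot separate the middle class here, but two boosted stumps with per-class votes can, which is the point of combining weak learners.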

  12. The Good and the Bad • The Good: Provides reasonable recognition and segmentation for many classes, and combines several good ideas. Most previous work did not tackle the problem as a whole; instead, its parts were treated separately. • The Bad: Does not beat past work in terms of quantitative recognition results, and is a bit hacky.

  13. Code: Sequence of Execution 1. imagesTextonization.m (extract an efficient characterization of the images) 2. calcModelFeatures.m (calculate the appearance (shape) context potential) 3. trainModel.m (build a classification model) 4. testModel.m (test the classification model with test data)

  14. Results