
Presentation Transcript


  1. Programme • 2pm Introduction • Andrew Zisserman, Chris Williams • 2.10pm Overview of the challenge and results • Mark Everingham (Oxford) • 2.40pm Session 1: The Classification Task • Frederic Jurie presenting work by • Jianguo Zhang (INRIA) 20 mins • Frederic Jurie (INRIA) 20 mins • Thomas Deselaers (Aachen) 20 mins • Jason Farquhar (Southampton) 20 mins • 4-4.30pm Coffee break • 4.30pm Session 2: The Detection Task • Stefan Duffner/Christophe Garcia (France Telecom) 30 mins • Mario Fritz (Darmstadt) 30 mins • 5.30pm Discussion • Lessons learnt, and future challenges

  2. The PASCAL Visual Object Classes Challenge • Mark Everingham, Luc Van Gool, Chris Williams, Andrew Zisserman

  3. Challenge • Four object classes • Motorbikes • Bicycles • People • Cars • Classification • Predict object present/absent • Detection • Predict bounding boxes of objects

  4. Competitions • Train on any (non-test) data • How well do state-of-the-art methods perform on these problems? • Which methods perform best? • Train on supplied data • Which methods perform best given specified training data?

  5. Data sets • train, val, test1 • Sampled from the same distribution of images • Images taken from PASCAL image databases • “Easier” challenge • test2 • Freshly collected for the challenge (mostly Google Images) • “Harder” challenge

  6. Training and first test set: train+val, test1

  7. Example images

  8. Example images

  9. Example images

  10. Example images

  11. Second test set: test2

  12. Example images

  13. Example images

  14. Example images

  15. Example images

  16. Annotation for training • Object class present/absent • Sub-class labels (partial) • Car side, Car rear, etc. • Bounding boxes • Segmentation masks (partial)
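
As a rough illustration of what these annotations amount to per image, here is a minimal Python sketch; the class and field names are hypothetical and do not reflect the challenge's actual annotation file format.

```python
from dataclasses import dataclass, field
from typing import Optional, List, Tuple

@dataclass
class ObjectAnnotation:
    class_name: str                      # "motorbike", "bicycle", "person", "car"
    sub_class: Optional[str] = None      # e.g. "car_side", "car_rear" (only partially labelled)
    bbox: Tuple[int, int, int, int] = (0, 0, 0, 0)   # x1, y1, x2, y2 (assumed convention)
    mask_file: Optional[str] = None      # segmentation mask, where available

@dataclass
class ImageAnnotation:
    image_file: str
    objects: List[ObjectAnnotation] = field(default_factory=list)

    def contains(self, class_name: str) -> bool:
        """Present/absent label used for the classification task."""
        return any(o.class_name == class_name for o in self.objects)
```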

  17. Issues in ground truth • What objects should be considered detectable? • Subjective judgement by size in image, level of occlusion, detection without ‘inference’ • Disagreements will cause noise in the evaluation, i.e. incorrectly-judged false positives • “Errors” in training data • Un-annotated objects • Requires machine learning algorithms robust to noise on class labels • Inaccurate bounding boxes • Hard to specify for some instances, e.g. bicycles • Detection threshold was set “liberally”

  18. Results:Classification

  19. Participants

  20. Methods • Interest points (LoG/Harris) + patches/SIFT • Histogram of clustered descriptors • SVM: INRIA: Dalal, INRIA: Zhang • Log-linear model: Aachen • Logistic regression: Edinburgh • Other: METU • No clustering step • SVM with other kernels: MPITuebingen, Southampton • Additional features • Color: METU, moments: Southampton
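
A minimal sketch of the bag-of-visual-words pipeline shared by most of these entries, assuming scikit-learn for the clustering and the SVM; the descriptor extractor is passed in as a placeholder for the interest-point (LoG/Harris) + SIFT step, and none of this is any participant's actual code.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def bow_histogram(descriptors, codebook):
    """Assign local descriptors to their nearest cluster centre ('visual word')
    and return a normalized histogram of word counts."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

def train_bow_classifier(train_images, train_labels, extract_fn, n_words=1000):
    """extract_fn(image) -> (n_points, 128) array is assumed to wrap the
    interest-point + SIFT descriptor step used by the entries."""
    all_desc = np.vstack([extract_fn(im) for im in train_images])
    codebook = KMeans(n_clusters=n_words, n_init=4).fit(all_desc)
    hists = np.array([bow_histogram(extract_fn(im), codebook)
                      for im in train_images])
    classifier = SVC(kernel="rbf", probability=True).fit(hists, train_labels)
    return codebook, classifier
```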

  21. Methods • Image segmentation and region features: HUT • MPEG-7 color, shape, etc. • Self organizing map • Classification by detection: Darmstadt • Generalized Hough transform/SVM verification

  22. Evaluation • Receiver Operating Characteristic (ROC) • Equal Error Rate (EER) • Area Under Curve (AUC)
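
A sketch of how these quantities can be computed from per-image confidence scores and ground-truth labels (NumPy-based, illustrative only); the numbers quoted on the following slides appear to be the true positive rate at the equal-error point, so higher is better.

```python
import numpy as np

def roc_points(scores, labels):
    """ROC curve: sweep a threshold over the confidences and record the
    false positive rate and true positive rate at each step."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    order = np.argsort(-scores)                # descending confidence
    labels = labels[order]
    tp = np.cumsum(labels == 1)
    fp = np.cumsum(labels == 0)
    tpr = tp / max(tp[-1], 1)                  # true positive rate
    fpr = fp / max(fp[-1], 1)                  # false positive rate
    return fpr, tpr

def eer_point(fpr, tpr):
    """Operating point where the false positive rate equals the miss rate
    (1 - TPR); returns the TPR there."""
    idx = np.argmin(np.abs(fpr - (1 - tpr)))
    return tpr[idx]

def auc(fpr, tpr):
    """Area under the ROC curve (trapezoidal rule)."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```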

  23. Competition 1: train+val/test1 • 1.1: Motorbikes • Max EER: 0.977 (INRIA: Jurie)

  24. Competition 1: train+val/test1 • 1.2: Bicycles • Max EER: 0.930 (INRIA: Jurie, INRIA: Zhang)

  25. Competition 1: train+val/test1 • 1.3: People • Max EER: 0.917 (INRIA: Jurie, INRIA: Zhang)

  26. Competition 1: train+val/test1 • 1.4: Cars • Max EER: 0.961 (INRIA: Jurie)

  27. Competition 2: train+val/test2 • 2.1: Motorbikes • Max EER: 0.798 (INRIA: Zhang)

  28. Competition 2: train+val/test2 • 2.2: Bicycles • Max EER: 0.728 (INRIA: Zhang)

  29. Competition 2: train+val/test2 • 2.3: People • Max EER: 0.719 (INRIA: Zhang)

  30. Competition 2: train+val/test2 • 2.4: Cars • Max EER: 0.720 (INRIA: Zhang)

  31. Classes and test1 vs. test2 • Mean EER of ‘best’ results across classes • test1: 0.946, test2: 0.741

  32. Conclusions? • Interest points + SIFT + clustering (histogram) + SVM did ‘best’ • Log-linear model (Aachen) a close second • Results with SVM (INRIA) significantly better than with logistic regression (Edinburgh) • Method using detection (Darmstadt) did not do so well • Cannot exploit context (= unintended bias?) of image • Used subset of training data and is able to localize

  33. Competitions 3 & 4 • Classification • Any (non-test) training data to be used • No entries submitted

  34. Results:Detection

  35. Participants

  36. Methods • Generalized Hough Transform • Interest points, clustered patches/descriptors, GHT • Darmstadt: (SVM verification stage), side views with segmentation mask used for training • INRIA: Dorko: SIFT features, semi-supervised clustering, single detection per image • “Sliding window” classifiers • Exhaustive search over translation and scale • FranceTelecom: Convolutional neural network • INRIA: Dalal: SVM with SIFT-based input representation
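
A rough sketch of the exhaustive "sliding window" search described above; `score_window` stands in for a trained window classifier (CNN or SVM), the image is assumed to be a 2-D grayscale array, and the stride, scales and threshold are arbitrary illustrative values.

```python
import numpy as np
from scipy.ndimage import zoom

def sliding_window_detections(image, score_window, window=(64, 128),
                              scales=(1.0, 1.5, 2.25), stride=8,
                              threshold=0.5):
    """Exhaustively score a fixed-size window over translation and scale.
    Returns (x1, y1, x2, y2, score) tuples above the confidence threshold,
    in the coordinate frame of the original image."""
    detections = []
    win_w, win_h = window
    for s in scales:
        scaled = zoom(image, 1.0 / s)          # shrinking the image = growing the window
        h, w = scaled.shape[:2]
        for y in range(0, h - win_h + 1, stride):
            for x in range(0, w - win_w + 1, stride):
                patch = scaled[y:y + win_h, x:x + win_w]
                score = score_window(patch)    # hypothetical trained classifier
                if score > threshold:
                    detections.append((x * s, y * s,
                                       (x + win_w) * s, (y + win_h) * s, score))
    return detections
```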

  37. Methods • Baselines: Edinburgh • Detection confidence • class prior probability • Whole-image classifier (SIFT + logistic regression) • Bounding box • Entire image • Scale-normalized mean bounding box from training data • Bounding box of all interest points • Bounding box of interest points weighted by ‘class purity’
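
As an illustration of the general shape of such a baseline (an assumption, not Edinburgh's actual implementation), the scale-normalized mean bounding box variant could look roughly like this: one predicted box per test image, with confidence set to the class prior.

```python
import numpy as np

def prior_box_baseline(train_boxes, train_shapes, class_prior):
    """Normalize each training box (x1, y1, x2, y2) by its image size (h, w),
    average, and predict that box in every test image with confidence equal
    to the class prior probability."""
    norm = np.array([[x1 / w, y1 / h, x2 / w, y2 / h]
                     for (x1, y1, x2, y2), (h, w) in zip(train_boxes, train_shapes)])
    mean_box = norm.mean(axis=0)

    def predict(test_shape):
        h, w = test_shape
        x1, y1, x2, y2 = mean_box * np.array([w, h, w, h])
        return (x1, y1, x2, y2, class_prior)

    return predict
```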

  38. Evaluation • Correct detection: 50% overlap between predicted and ground-truth bounding boxes • Multiple detections of one object counted as one true positive plus false positives • Precision/Recall • Average Precision (AP) as defined by TREC • Mean precision interpolated at recall = 0, 0.1, …, 0.9, 1
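
A sketch of the two pieces of this protocol, the overlap test and the TREC-style interpolated AP; the intersection-over-union overlap measure and the box coordinate convention are assumptions here, and the matching of detections to ground truth (with duplicates counted as false positives) is omitted for brevity.

```python
import numpy as np

def overlap(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

def interpolated_ap(recall, precision):
    """TREC-style AP: mean of the precision interpolated at
    recall = 0, 0.1, ..., 1.0 (max precision at recall >= r)."""
    recall, precision = np.asarray(recall), np.asarray(precision)
    ap = 0.0
    for r in np.arange(0.0, 1.1, 0.1):
        p = precision[recall >= r]
        ap += (p.max() if p.size else 0.0) / 11.0
    return ap
```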

  39. Competition 5: train+val/test1 • 5.1: Motorbikes • Max AP: 0.886 (Darmstadt)

  40. Competition 5: train+val/test1 • 5.2: Bicycles • Max AP: 0.119 (Edinburgh)

  41. Competition 5: train+val/test1 • 5.3: People • Max AP: 0.013 (INRIA: Dalal)

  42. Competition 5: train+val/test1 • 5.4: Cars • Max AP: 0.613 (INRIA: Dalal)

  43. Competition 6: train+val/test2 • 6.1: Motorbikes • Max AP: 0.341 (Darmstadt)

  44. Competition 6: train+val/test2 • 6.2: Bicycles • Max AP: 0.113 (Edinburgh)

  45. Competition 6: train+val/test2 • 6.3: People • Max AP: 0.021 (INRIA: Dalal)

  46. Competition 6: train+val/test2 • 6.4: Cars • Max AP: 0.304 (INRIA: Dalal)

  47. Classes and test1 vs. test2 • Mean AP of ‘best’ results across classes • test1: 0.408, test2: 0.195

  48. Conclusions? • GHT (Darmstadt) method did ‘best’ on classes entered • SVM verification stage effective • Limited to lower recall (by use of only side views) • SVM (INRIA: Dalal) comparable for cars, better on test2 • Smaller objects?, higher recall • Performance on bicycles, people was ‘poor’ • “Non-solid” objects, articulation?

  49. Competition 7: any train/test1 • One entry: 7.3: people (INRIA: Dalal) • AP: 0.416 • Use of own training data improved results dramatically (vs. AP: 0.013 with the supplied training data)

  50. Competition 8: any train/test2 • One entry: 8.3: people (INRIA: Dalal) • AP: 0.438 • Use of own training data improved results dramatically (vs. AP: 0.021 with the supplied training data)
