Coherent Scene Understanding with 3D Geometric Reasoning

Coherent Scene Understanding with 3D Geometric Reasoning. Jiyan Pan 12/3/2012. Task. Detect objects. Identify surface regions. Geometrically coherent in the 3D world. Estimate ground plane. Infer gravity direction. 3D geometric context. Coordinate system.

Presentation Transcript


  1. Coherent Scene Understanding with 3D Geometric Reasoning Jiyan Pan 12/3/2012

  2. Task • Detect objects • Identify surface regions • Geometrically coherent in the 3D world • Estimate ground plane • Infer gravity direction • 3D geometric context

  3. Coordinate system • Variables of global 3D geometries: ng ((inverse) gravity), np (ground plane orientation), hp (ground plane height) • Deterministic relationships [Figure: camera geometry diagram. Labels: image plane, camera center, focal length f, object vertical orientation nv, real world height H, object depth, object landmarks, object pitch and roll angles, ground plane; further symbols: xt, xb, dt, db, α, γ, θ]
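The slide does not spell out the deterministic relationships, so the following is only a sketch of the kind of relationship the diagram suggests, using standard pinhole-camera geometry. The conventions are assumptions, not taken from the slides: the camera center is the origin, the ground plane is the set of points X with np · X = -hp, the object stands on the ground, and its vertical orientation nv equals np. The function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def backproject_to_ground(x_b, f, n_p, h_p):
    """3D point where the viewing ray through image point x_b meets the ground.

    Assumed conventions: camera center at the origin, image point (u, v)
    measured from the principal point, ground plane {X : n_p . X = -h_p}.
    """
    ray = (x_b[0], x_b[1], f)
    denom = dot(n_p, ray)
    if denom >= 0:
        raise ValueError("ray does not hit the ground in front of the camera")
    t = -h_p / denom
    return tuple(t * c for c in ray)


def object_height(x_t, x_b, f, n_p, h_p):
    """Real-world height H of an upright object from its top/bottom landmarks.

    The top point X_b + H * n_p must lie on the viewing ray through x_t, so
    (X_b + H * n_p) x ray_t = 0; H is the least-squares solution of that.
    """
    X_b = backproject_to_ground(x_b, f, n_p, h_p)
    ray_t = (x_t[0], x_t[1], f)
    a = cross(X_b, ray_t)
    b = cross(n_p, ray_t)
    return -dot(a, b) / dot(b, b)


# Illustrative numbers only: image y axis points down, so "up" is (0, -1, 0).
n_p = (0.0, -1.0, 0.0)   # ground plane orientation (unit normal)
h_p = 1.5                # camera height above the ground
f = 500.0                # focal length in pixels
X_b = backproject_to_ground((0.0, 75.0), f, n_p, h_p)
H = object_height((0.0, -15.0), (0.0, 75.0), f, n_p, h_p)
print(X_b, H)
```

Under these assumptions, fixing np and hp turns the bottom landmark into a depth and the top landmark into a height, which is why a fixed global hypothesis makes per-object reasoning simple.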

  4. Coordinate system • Probabilistic relationships • Derived from appearance • Prior knowledge [Figure: same camera geometry diagram as slide 3]

  5. Can we solve them all for a coherent solution? • Non-linear • Non-deterministic • Even invalid equations from false detections

  6. [Figure: global 3D context and local 3D context, with entities marked √ or X]

  7. “Chicken and egg” problem: • Local entities could be validated by global 3D context • Global 3D context is induced from local entities [Figure: global 3D context and local 3D context, with entities marked √, X, or ?]

  8. Possible solution (All in PGM) • Put both global 3D geometries and local entities in a probabilistic graphical model (PGM) [1] • Precision issue: continuous variables have to be quantized • Complexity issue: a pairwise potential would contain up to ~1e6 entries: 100 (pitch) × 100 (roll) × 100 (height) [Figure: graphical model with nodes Ground, Gravity, o1, o2, …, ok] [1] D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008
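The ~1e6 figure is just the product of the bin counts quoted on the slide. A minimal sketch of that arithmetic follows; the helper name is hypothetical, and the count covers only the quantized global variable, before multiplying by the number of states of the object node it is paired with.

```python
def joint_states(bins_per_variable):
    """Number of joint states of a set of independently quantized variables."""
    total = 1
    for n in bins_per_variable.values():
        total *= n
    return total


# Bin counts as quoted on the slide.
bins = {"pitch": 100, "roll": 100, "height": 100}
entries = joint_states(bins)
print(entries)  # 1000000
```

Halving the bin width of every variable multiplies this count by 8, which is how the precision issue and the complexity issue pull against each other.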

  9. Possible solution (Fixed global geometries as hypotheses) • Task much easier under a fixed hypothesis of global 3D geometries [Figure: graphical model with nodes Ground, Gravity, o1, o2, …, ok, and × marks]

  10. Possible solution (Fixed global geometries as hypotheses) • Task much easier under a fixed hypothesis of global 3D geometries • How to generate global 3D geometry hypotheses? [Figure: nodes ω1, ω2, ω3 and o1, o2, …, ok]

  11. Possible solution (Hypotheses by exhaustive search) • Exhaustive search over the quantized space of global 3D geometries [2] • Computational complexity tends to limit the search space [2] S. Bao et al. Toward coherent object detection and scene layout understanding. IVC, 2011

  12. Possible solution (Hypotheses by Hough voting) • Each local entity casts a vote in the Hough voting space of the global 3D geometries, and peaks are selected [3] • False detections could corrupt the votes • Would applying EM help? Not likely, if false detections dominate [Figure: local entities L1 to L7] [3] M. Sun et al. Object detection with geometrical context feedback loop. BMVC, 2010
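To illustrate how false detections can corrupt the vote, here is a minimal one-dimensional sketch of Hough-style voting. It is not the method of [3]: the real voting space covers the full global 3D geometry, and the function name, bin width, and numbers are assumptions chosen for illustration.

```python
from collections import Counter


def hough_peak(estimates, bin_width):
    """Quantize each estimate into a bin, count votes, return the fullest bin.

    Returns (bin center, vote count). Ties go to the lower bin.
    """
    votes = Counter(int(e // bin_width) for e in estimates)
    best_bin, count = max(votes.items(), key=lambda kv: (kv[1], -kv[0]))
    return (best_bin + 0.5) * bin_width, count


# Hypothetical camera-height estimates, one per local entity.
true_detections = [1.55, 1.60, 1.62, 1.70]
false_detections = [3.05, 3.10, 3.15, 3.20, 3.22]

clean_peak = hough_peak(true_detections, bin_width=0.25)
corrupted_peak = hough_peak(true_detections + false_detections, bin_width=0.25)
print(clean_peak)      # (1.625, 4)
print(corrupted_peak)  # (3.125, 5)
```

Once the false detections outnumber the true ones in a single bin, the peak moves to the wrong value, which is the failure mode the slide describes.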

  13-14. Our solution • We take a RANSAC-like approach: randomly mix the contributions of local entities [Figure: local entities L1 to L7]

  15. Our solution • We take a RANSAC-like approach: randomly mix the contributions of local entities • Compared to averaging over all local entities: more robust against outliers • Compared to directly using estimates from each single local entity: more robust against noise [Figure: local entities L1 to L7]
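The slides do not say how the contributions are mixed, so the sketch below assumes one plausible reading: each hypothesis is a random convex combination of a small random subset of the local estimates. The estimates are scalars here for simplicity, and the function names, subset size, and numbers are all hypothetical.

```python
import random


def random_mixtures(local_estimates, num_hypotheses, subset_size, seed=0):
    """Generate hypotheses as random convex combinations of random subsets."""
    rng = random.Random(seed)
    hypotheses = []
    for _ in range(num_hypotheses):
        subset = rng.sample(local_estimates, subset_size)
        # Weights lie in (0, 1], so their sum is always positive.
        weights = [1.0 - rng.random() for _ in subset]
        total = sum(weights)
        hypotheses.append(sum(w * e for w, e in zip(weights, subset)) / total)
    return hypotheses


def min_error(hypotheses, truth):
    """Error of the best hypothesis in the set."""
    return min(abs(h - truth) for h in hypotheses)


# Hypothetical data: four noisy estimates of a true value 1.6, plus two outliers.
truth = 1.6
estimates = [1.50, 1.72, 1.55, 1.68, 3.10, 0.40]

individual = estimates                        # one hypothesis per local entity
average = [sum(estimates) / len(estimates)]   # a single averaged hypothesis
mixtures = random_mixtures(estimates, num_hypotheses=50, subset_size=2)

print(min_error(individual, truth))
print(min_error(average, truth))
print(min_error(mixtures, truth))
```

A mixture drawn only from inliers averages out some of their noise, while a mixture that happens to include an outlier is simply one bad hypothesis among many, to be rejected later when hypotheses are scored.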

  16. Gravity Direction [Plot: minimum hypothesis error vs. number of random mixtures (0 to 50). Curves: Individual, Mixture, Average]

  17. Ground Plane Orientation [Plot: minimum hypothesis error vs. number of random mixtures (0 to 50). Curves: Individual, Mixture, Average]

  18. [Figure: global 3D context and local 3D context, with entities marked √ or X]

  19. 3D geometric context #1: Common ground (global) [Figure: ground plane and ground plane orientation, with entities labeled valid or invalid (#1)]

  20. 3D geometric context #2: Gravity direction (global) [Figure: (inverse) gravity, ground plane, and ground plane orientation, with an entity labeled invalid (#2)]

  21. 3D geometric context #3: Depth ordering (local) [Figure: (inverse) gravity, ground plane, and ground plane orientation, with entities labeled incompatible (#3)]

  22. 3D geometric context #4: Space occupancy (local) [Figure: (inverse) gravity, ground plane, and ground plane orientation, with entities labeled incompatible (#4)]

  23. [Figure: six labeled items, numbered 1 to 6]

  24. Given a global 3D geometry hypothesis • Global geometric compatibility for an object: • Orientation [Figure: items numbered 1 to 6]

  25. Given a global 3D geometry hypothesis • Global geometric compatibility for an object: • Orientation • Height [Figure: items numbered 1 to 6]

  26. Given a global 3D geometry hypothesis • Global geometric compatibility for a surface: • Orientation: local estimates vs. […] or […] • Location: horizontal surface region vs. ground horizon [Figure: items numbered 1 to 6]

  27. Given a global 3D geometry hypothesis • Local geometric compatibility for two objects: • Depth ordering • Space occupancy [Figure: items numbered 1 to 6]

  28. Given a global 3D geometry hypothesis • Objective function of the CRF [Figure: items numbered 1 to 6]

  29. [Figure: global 3D context and local 3D context, with entities marked √ or X, and the best hypothesis]

  30-37. [Figures with legend: 3D reasoning agrees with raw detector; 3D reasoning recovers detection rejected by raw detector; 3D reasoning rejects detection accepted by raw detector]

  38. 3D geometric reasoning improves object detection performance [Plot: Deformable Part Model Detector. True positive rate vs. false positives per image (0 to 5). Curves: Baseline, Hoiem, Ours] D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008

  39. 3D geometric reasoning improves object detection performance [Plot: Dalal-Triggs Detector. True positive rate vs. false positives per image (0 to 1.2). Curves: Baseline, Hoiem, Ours] D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008

  40. 3D geometric reasoning improves object detection performance D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008 M. Sun et al. Object detection with geometrical context feedback loop. BMVC, 2010

  41. D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008 M. Sun et al. Object detection with geometrical context feedback loop. BMVC, 2010

  42. [Figure: global 3D context and local 3D context, with entities marked √ or X, and the best hypothesis]

  43. Contributions of different geometric context [Plot: detection ROC curve. True positive rate vs. false positives per image (0 to 5). Curves: Det, Det+IdvlGeo, Det+PairGeo, Det+FullGeo]

  44. Benefit is mutual
