## Coherent Scene Understanding with 3D Geometric Reasoning


Jiyan Pan, 12/3/2012

**Task**

- Detect objects
- Identify surface regions
- Estimate ground plane
- Infer gravity direction
- Geometrically coherent in the 3D world (3D geometric context)

**Coordinate system**

Variables of global 3D geometries: n_g, n_p, h_p

[Figure: camera geometry. Labels: camera center, image plane, focal length f, (inverse) gravity n_g, ground plane, ground plane orientation n_p, ground plane height h_p, object vertical orientation n_v, real-world height H, object landmarks x_t and x_b, object depth d_t and d_b, object pitch and roll angles. The symbols α, γ and θ also appear in the figure.]

- Deterministic relationships
- Probabilistic relationships: derived from appearance; prior knowledge

**Can we solve them all for a coherent solution?**

- Non-linear
- Non-deterministic
- Even invalid equations from false detections

**Global 3D context / Local 3D context**

[Diagram: global 3D context and local 3D context]

- "Chicken and egg" problem:
  - Local entities could be validated by global 3D context
  - Global 3D context is induced from local entities

**Possible solution (All in PGM)**

- Put both global 3D geometries and local entities in a PGM [1]
- Precision issue: have to quantize continuous variables
- Complexity issue: pairwise potential would contain up to ~1e6 entries
  - 100 (pitch) × 100 (roll) × 100 (height)

[Diagram: nodes Ground, Gravity, o1, o2, ok]

[1] D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008

**Possible solution (Fixed global geometries as hypotheses)**

- Task much easier under a fixed hypothesis of global 3D geometries
- How to generate global 3D geometry hypotheses?
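To illustrate why the task becomes easier once the global 3D geometries are fixed, here is a minimal sketch of one deterministic relationship of the kind the coordinate-system figure refers to: given a ground plane hypothesis (n_p, h_p), the 3D position and real-world height of an object follow from its image landmarks. The conventions are assumptions made for this example, not taken from the slides: a pinhole camera at the origin, x right, y down, z forward, principal point at the image origin, n_p as the upward unit normal of the ground plane, and h_p as the camera height above it. The function names are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length tuples."""
    return sum(x * y for x, y in zip(a, b))


def pixel_ray(x, f):
    """Viewing ray through pixel x = (u, v) for focal length f (assumed pinhole model)."""
    return (x[0], x[1], f)


def ground_point(x_b, f, n_p, h_p):
    """Intersect the ray through the bottom landmark x_b with the ground plane.

    Assumption: ground points X satisfy dot(n_p, X) = -h_p, where n_p is the
    upward unit normal and h_p > 0 is the camera height above the ground.
    """
    ray = pixel_ray(x_b, f)
    denom = dot(n_p, ray)
    if abs(denom) < 1e-12:
        raise ValueError("viewing ray is parallel to the ground plane")
    scale = -h_p / denom
    if scale <= 0:
        raise ValueError("pixel does not project onto the ground in front of the camera")
    return tuple(scale * r for r in ray)


def object_height(x_b, x_t, f, n_p, h_p, n_v):
    """Estimate real-world height H so that X_b + H * n_v lies on the ray through x_t.

    Solves H * n_v - mu * ray_t = -X_b for (H, mu) in the least-squares sense
    using the 2x2 normal equations.
    """
    X_b = ground_point(x_b, f, n_p, h_p)
    ray_t = pixel_ray(x_t, f)
    a11 = dot(n_v, n_v)
    a12 = -dot(n_v, ray_t)
    a22 = dot(ray_t, ray_t)
    b1 = -dot(n_v, X_b)
    b2 = dot(ray_t, X_b)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("object vertical direction is parallel to the top viewing ray")
    return (b1 * a22 - a12 * b2) / det


if __name__ == "__main__":
    n_up = (0.0, -1.0, 0.0)  # "up" when the image y axis points down
    X_b = ground_point((0.0, 150.0), 1000.0, n_up, 1.5)
    H = object_height((0.0, 150.0), (0.0, -30.0), 1000.0, n_up, 1.5, n_up)
    print("ground contact point:", X_b, "height:", H)
```

Under a different hypothesis (n_p, h_p) the same landmarks yield a different depth and height, which is what allows a hypothesis to be checked against prior knowledge about object sizes.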
[Diagram: hypotheses ω1, ω2, ω3 over objects o1, o2, ok]

**Possible solution (Hypotheses by exhaustive search)**

- Exhaustive search over the quantized space of global 3D geometries [2]
- Computational complexity tends to limit search space

[2] S. Bao et al. Toward coherent object detection and scene layout understanding. IVC, 2011

**Possible solution (Hypotheses by Hough voting)**

- Each local entity casts a vote in the Hough voting space of the global 3D geometries, and peaks are selected [3]
- False detections could corrupt the votes
- Would applying EM help? Not likely, if false detections overwhelm

[Diagram: local entities L1 to L7]

[3] M. Sun et al. Object detection with geometrical context feedback loop. BMVC, 2010

**Our solution**

- We take a RANSAC-like approach: randomly mix the contributions of local entities
- Compared to averaging over all local entities: more robust against outliers
- Compared to directly using estimates from each single local entity: more robust against noise

[Diagram: local entities L1 to L7]

**Gravity Direction**

[Plot: minimum hypothesis error vs. number of random mixtures (0 to 50). Curves: Individual, Mixture, Average.]

**Ground Plane Orientation**

[Plot: minimum hypothesis error vs. number of random mixtures (0 to 50). Curves: Individual, Mixture, Average.]

**3D geometric context**

- #1: Common ground (global). Figure labels: valid, invalid (#1), ground plane orientation, ground plane.
- #2: Gravity direction (global). Figure labels: (inverse) gravity, ground plane orientation, invalid (#2), ground plane.
- #3: Depth ordering (local). Figure labels: (inverse) gravity, ground plane orientation, incompatible (#3), ground plane.
- #4: Space occupancy (local). Figure labels: (inverse) gravity, ground plane orientation, incompatible (#4), ground plane.

**Given a global 3D geometry hypothesis**

- Global geometric compatibility for an object:
  - Orientation
  - Height
- Global geometric compatibility for a surface:
  - Orientation: local estimates vs. global orientations
  - Location: horizontal surface region vs. ground horizon
- Local geometric compatibility for two objects:
  - Depth ordering
  - Space occupancy
- Objective function of the CRF

[Diagram: global 3D context, local 3D context, best hypothesis]

**Qualitative results**

[Result images. Legend:]

- 3D reasoning agrees with raw detector
- 3D reasoning recovers detection rejected by raw detector
- 3D reasoning rejects detection accepted by raw detector

**3D geometric reasoning improves object detection performance**

[Plot: Deformable Part Model Detector. True positive rate vs. false positives per image (0 to 5). Curves: Baseline, Hoiem, Ours.]

[Plot: Dalal-Triggs Detector. True positive rate vs. false positives per image (0 to 1.2). Curves: Baseline, Hoiem, Ours.]

Compared methods:

- D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008
- M. Sun et al. Object detection with geometrical context feedback loop. BMVC, 2010

**Contributions of different geometric context**

[Plot: detection ROC curve. True positive rate vs. false positives per image (0 to 5). Curves: Det, Det+IdvlGeo, Det+PairGeo, Det+FullGeo.]
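The RANSAC-like hypothesis generation described in the "Our solution" slides can be sketched roughly as follows. This is an illustration under stated assumptions, not the author's implementation: each local entity is assumed to supply a unit-vector estimate of a global direction (for example the gravity direction), and a "random mixture" is assumed to be a randomly weighted blend of a small random subset. The subset size, the uniform random weights, and all function names are hypothetical choices.

```python
import math
import random


def normalize(v):
    """Return v scaled to unit length, or None if v is numerically zero."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm < 1e-12:
        return None
    return tuple(c / norm for c in v)


def angle_error_deg(a, b):
    """Angle in degrees between two unit vectors."""
    cos_angle = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    return math.degrees(math.acos(cos_angle))


def random_mixture(estimates, rng, subset_size=3):
    """Blend a random subset of local estimates with random positive weights."""
    chosen = rng.sample(estimates, min(subset_size, len(estimates)))
    weights = [rng.random() + 1e-6 for _ in chosen]  # strictly positive weights
    mixed = [sum(w * e[i] for w, e in zip(weights, chosen)) for i in range(3)]
    return normalize(mixed)


def generate_hypotheses(estimates, num_mixtures, seed=0, subset_size=3):
    """Draw up to num_mixtures hypotheses; degenerate (zero-length) blends are skipped."""
    rng = random.Random(seed)
    hypotheses = []
    for _ in range(num_mixtures):
        hypothesis = random_mixture(estimates, rng, subset_size)
        if hypothesis is not None:
            hypotheses.append(hypothesis)
    return hypotheses


if __name__ == "__main__":
    # Three noisy estimates near the assumed true direction, plus one outlier
    # standing in for a false detection.
    truth = (0.0, -1.0, 0.0)
    estimates = [
        normalize((0.06, -1.0, 0.01)),
        normalize((-0.05, -1.0, 0.03)),
        normalize((0.01, -1.0, -0.06)),
        (1.0, 0.0, 0.0),
    ]
    hypotheses = generate_hypotheses(estimates, num_mixtures=50)
    best = min(angle_error_deg(h, truth) for h in hypotheses)
    print("minimum hypothesis error (degrees):", best)
```

Because each mixture draws only a few entities, some mixtures exclude the outlier entirely, while blending several inliers averages out their individual noise. Each generated hypothesis would then be scored separately, and the best-scoring one kept.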