
Scene Interpretation in images and videos

This presentation discusses scene interpretation in computer vision and robotics, covering scene reconstruction and scene recognition. It presents a convex-optimization-based algorithm for piecewise planar 3D reconstruction, and a terrain recognition method that uses a single (monocular) camera.


Presentation Transcript


  1. Scene Interpretation in images and videos Chetan Jakkoju 200402009 CVIT

  2. Scene interpretation Humans can answer: • How many taxis? • How many cars? • What type of cars? • How many buildings? • How tall are the buildings? • What type of road junction? But machines cannot!

  3. Scene Interpretation Computer vision Robotics

  4. Some aspects

  5. Our interests (1) • Scene reconstruction (planar scenes)

  6. Our interests (2) • Scene recognition (outdoor roads)

  7. Piecewise Planar Reconstruction using Convex Optimization (ACCV 2009)

  8. Road Map • Introduction • Applications • Existing Solutions & Issues • New formulation using Convex Optimization

  9. Introduction • Input: Set of images of a piecewise planar scene. • Output: 3D model (plane normal, perpendicular distance) and camera parameters (rotation Ri, translation ti).

  10. Applications • Robot navigation • Path planning • Inserting virtual objects • 3D reconstruction • A. Davison, I. Reid, N. Molton, and O. Stasse. MonoSLAM: Real-Time Single Camera SLAM. PAMI 2007. • R. Azuma, Y. Baillot, R. Behringer, S. Feiner, S. Julier, and B. MacIntyre. Recent advances in augmented reality. IEEE Computer Graphics and Applications, 21(6):34–47, 2001. • N. Snavely, S. M. Seitz, and R. Szeliski. Photo tourism: Exploring photo collections in 3D. SIGGRAPH 2006.

  11. Homography • Simple scenario

  12. Existing solutions • SVD based methods (decompose the homography matrix) • Faugeras & Zhang methods • Problem: Highly sensitive to noise • Bundle adjustment methods • Problems: • Iterative non-linear method • High time and memory requirements, in addition to correctness concerns.

  13. Our Solution • New formulation in a convex optimization framework. • Advantages • Better solution than bundle adjustment. • Standard efficient solvers exist (proposed in the past 5 years).

  14. Advances in Vision using Convex optimization • Optimization algorithms in Vision (MVG) • Optimal solutions exist for • H from point correspondences • Pose from Essential matrix • Convex optimization is mature enough! • F. Kahl. Multiple view geometry and the l-infinity norm. ICCV 2005. • R. Hartley and F. Kahl. Global optimization through searching rotation space and optimal estimation of the essential matrix. ICCV 2007. • S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, New York, NY, USA, 2004.

  15. Basic formulation • H matrix: H = d R - t n^T • Highly non-linear. • Observation: Fixing either the pose parameters (R, t) or the plane parameters (n, d) makes H linear in the remaining parameters.
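A minimal numerical sketch of this formula, under conventions the slide does not state: points are in normalized image coordinates, the plane satisfies n^T X + d = 0 in the first camera's frame, and the second camera sees X2 = R X + t. Under those assumptions H = d R - t n^T maps the first view to the second up to scale. All function names and numeric values below are illustrative, not taken from the presentation.

```python
import numpy as np

def plane_homography(R, t, n, d):
    """Compose H = d R - t n^T (defined only up to scale)."""
    return d * R - np.outer(t, n)

def rotation_about_y(angle_rad):
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def points_on_plane(n, d, xy):
    """Lift (x, y) pairs to 3D points that satisfy n^T X + d = 0."""
    z = (-d - xy @ n[:2]) / n[2]
    return np.column_stack([xy, z])

def to_normalized(X):
    """Perspective division: 3D points -> homogeneous normalized image points."""
    return X / X[:, 2:3]

def transfer(H, x_h):
    """Map homogeneous points through H and rescale the last coordinate to 1."""
    y = x_h @ H.T
    return y / y[:, 2:3]

# Assumed test configuration (hypothetical values).
rng = np.random.default_rng(0)
n = np.array([0.1, -0.2, 1.0])
n /= np.linalg.norm(n)
d = -5.0
R = rotation_about_y(0.1)
t = np.array([0.5, 0.0, 0.1])

X1 = points_on_plane(n, d, rng.uniform(-1, 1, size=(20, 2)))  # camera-1 frame
X2 = X1 @ R.T + t                                             # camera-2 frame
H = plane_homography(R, t, n, d)

# Largest disagreement between H-transferred points and true projections.
max_err = np.abs(transfer(H, to_normalized(X1)) - to_normalized(X2)).max()
print(max_err)
```

The same function also illustrates the observation on the slide: with (R, t) held fixed, `plane_homography` is linear in (n, d), and with (n, d) held fixed it is linear in the entries of (R, t).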

  16. Formulation • Given H, decompose it into (n,d) and (R,t). • Recompute H’ from (n,d) and (R,t) • H != H’ in general • Goal: Vary (n,d) and (R,t) so that H’ is close to H

  17. Algorithm • Given H • Decompose H into R, t, n, d • While • Optimize F(n,d) (update n, d) • Optimize F(R,t) (update t) • end
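The alternation above can be sketched as follows. This is not the authors' solver: plain least squares on the entries of H stands in for the convex program, the initial decomposition step is replaced by a perturbed starting guess, and a fixed iteration count replaces the loop condition, which the slide does not give. As on the slide, R is kept fixed and only t is updated in the pose step. All names are illustrative.

```python
import numpy as np

def compose(R, t, n, d):
    """H' = d R - t n^T, the homography implied by the current estimates."""
    return d * R - np.outer(t, n)

def residual(H, R, t, n, d):
    """Frobenius distance between the measured H and the recomposed H'."""
    return np.linalg.norm(H - compose(R, t, n, d))

def update_plane(H, R, t):
    """With (R, t) fixed, H' is linear in (n, d): solve a 9x4 least-squares problem."""
    M = np.hstack([-np.kron(t.reshape(3, 1), np.eye(3)), R.reshape(9, 1)])
    sol, *_ = np.linalg.lstsq(M, H.reshape(9), rcond=None)
    return sol[:3], sol[3]

def update_translation(H, R, n, d):
    """With R and (n, d) fixed, H' is linear in t: closed-form least squares."""
    A = H - d * R
    return -(A @ n) / (n @ n)

def alternate(H, R, t, n, d, iters=20):
    """Alternate the two linear sub-problems; iters is an assumed stopping rule."""
    history = [residual(H, R, t, n, d)]
    for _ in range(iters):
        n, d = update_plane(H, R, t)
        t = update_translation(H, R, n, d)
        history.append(residual(H, R, t, n, d))
    return t, n, d, history

# Hypothetical ground truth and a perturbed starting point.
rng = np.random.default_rng(1)
angle = 0.2
R = np.array([[np.cos(angle), -np.sin(angle), 0.0],
              [np.sin(angle),  np.cos(angle), 0.0],
              [0.0, 0.0, 1.0]])
t_true = np.array([0.3, -0.1, 0.2])
n_true = np.array([0.0, 0.0, 1.0])
d_true = -4.0
H = compose(R, t_true, n_true, d_true)

t0 = t_true + 0.1 * rng.standard_normal(3)
n0 = n_true + 0.1 * rng.standard_normal(3)
t, n, d, history = alternate(H, R, t0, n0, d_true + 0.5)
print(history[0], history[-1])
```

Because each step exactly minimizes the residual over one block of variables while the other is held fixed, the residual cannot increase from one iteration to the next. Note that t and n are only recovered up to a reciprocal scale, since t n^T is unchanged when t is multiplied and n divided by the same factor.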

  18. Extensions • Extension to multiple views • All planes may not be visible in all views! • Solution: We use inter homographies (H23, H34, …)

  19. Sample reconstructions • Synthetic house (showing “visual accuracy”) • Oxford model house • Baity Hill

  20. Summary • Presented a convex-optimization-based algorithm for reconstruction • Applicable to videos. • Synthetic and real experiments show promising results • Much better optimization frameworks in future.

  21. Part 2: Monocular Terrain Recognition (ICPR 2010 & IROS 2010)

  22. Problem • Classify terrain as: • Grass • Mud • Hard mud • Road • Other

  23. Applications • Autonomous robot navigation • Path planning • Advanced driver assistance systems • Obstacle at 18 m • Obstacle at 10 m

  24. Existing solutions (1) (In Robotics) • Solve only a sub-problem • Obstacle vs. non-obstacle • Use multiple costly sensors (lasers, ladars, etc.) • Though they perform well, they can’t “feel” the terrain surface.

  25. Existing solutions (2) (In Robotics) • A good solution is to use IMU sensors • Advantages: • Solve the much wider problem of recognizing various types of terrain. • Problems: • They can only recognize the terrain after traversing it (“short-sightedness”). • IMU sensors are also costly.

  26. Ultimate goal • Solving the terrain recognition problem without using costly sensors • Using just a single camera • Advantages: • Lightweight • Low power • No “short-sightedness” • Direct applications • in mini-robots • in driver assistance systems.

  27. Dataset collection • Camera mounted on top of the car

  28. Sample dataset • 25 videos, each 1 min long, covering different kinds of scenarios

  29. Base method • Prepare a training set and a testing set • In each image, every 16x16 image block acts as a training sample. • Extract a feature F from each block, and train a classifier C.
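A minimal sketch of this base method, assuming mean RGB colour as the block feature F and a 1-nearest-neighbour rule as the classifier C. The slides mention colour features and several classifiers but do not specify this exact pairing. The synthetic scene, the two class colours, and all names are made up for illustration.

```python
import numpy as np

BLOCK = 16  # block side in pixels, as on the slide

def block_features(image):
    """Split an HxWx3 image into 16x16 blocks; return the mean RGB of each block."""
    h, w, _ = image.shape
    rows, cols = h // BLOCK, w // BLOCK
    blocks = image[:rows * BLOCK, :cols * BLOCK].reshape(rows, BLOCK, cols, BLOCK, 3)
    return blocks.mean(axis=(1, 3)).reshape(rows * cols, 3)

def block_labels(label_map):
    """Majority label of each 16x16 block of an HxW integer label map."""
    h, w = label_map.shape
    rows, cols = h // BLOCK, w // BLOCK
    blocks = label_map[:rows * BLOCK, :cols * BLOCK].reshape(rows, BLOCK, cols, BLOCK)
    blocks = blocks.transpose(0, 2, 1, 3).reshape(rows * cols, -1)
    return np.array([np.bincount(b).argmax() for b in blocks])

class NearestNeighbour:
    """1-NN classifier on block features (an assumed stand-in for classifier C)."""
    def fit(self, X, y):
        self.X, self.y = np.asarray(X, dtype=float), np.asarray(y)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        d2 = ((X[:, None, :] - self.X[None, :, :]) ** 2).sum(axis=2)
        return self.y[d2.argmin(axis=1)]

def make_scene(rng, size=64):
    """Synthetic frame: class 0 (greenish) on top, class 1 (greyish) below."""
    labels = np.zeros((size, size), dtype=int)
    labels[size // 2:] = 1
    colours = np.array([[40.0, 160.0, 40.0], [128.0, 128.0, 128.0]])
    image = colours[labels] + 10.0 * rng.standard_normal((size, size, 3))
    return image, labels

rng = np.random.default_rng(0)
train_img, train_lab = make_scene(rng)
test_img, test_lab = make_scene(rng)

clf = NearestNeighbour().fit(block_features(train_img), block_labels(train_lab))
pred = clf.predict(block_features(test_img))
accuracy = (pred == block_labels(test_lab)).mean()
print(accuracy)
```

On real footage the classes overlap far more than in this toy scene, which is where the choice of feature and classifier starts to matter.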

  30. Base method • Error rates on color features and base classifiers • Naïve Bayes (NB) • Artificial neural networks • K-nearest neighbours • Support vector machines (linear) (SVM-L) • Support vector machines (kernel) (SVM-K) • Random forest (RF)

  31. Interesting observations of data • Relative position of different terrains • E.g., a grass area is more likely to lie next to a mud area than next to a road area. • The scale of texture varies mainly in the vertical direction.

  32. Proposed method • Previously we trained one classifier on the whole image. • Training a different classifier on each partition should “capture” the previous observations. • Note: The number of partitions increases in squares {2², 3², 4², …}
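One possible reading of the partition idea, sketched under assumptions: the block grid is cut into k x k cells, each cell gets its own classifier, and the predictions of several classifier sets (k = 2, 3, 4) are combined per block by a mode (majority vote). A nearest-centroid rule is used purely for brevity, and the toy data are constructed so that two classes look identical and differ only in where they appear in the image. All names and data are illustrative.

```python
import numpy as np

def partition_index(rows, cols, k):
    """Partition id in [0, k*k) for every block of a rows x cols block grid."""
    r = (np.arange(rows) * k) // rows
    c = (np.arange(cols) * k) // cols
    return (r[:, None] * k + c[None, :]).reshape(-1)

class NearestCentroid:
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.centroids = np.stack([X[y == c].mean(axis=0) for c in self.classes])
        return self

    def predict(self, X):
        d2 = ((X[:, None, :] - self.centroids[None, :, :]) ** 2).sum(axis=2)
        return self.classes[d2.argmin(axis=1)]

class PartitionedClassifier:
    """One classifier per cell of a k x k partition (k = 1 is the global baseline)."""
    def __init__(self, rows, cols, k):
        self.part = partition_index(rows, cols, k)
        self.models = {}

    def fit(self, X, y):
        # X: (num_frames, num_blocks, F), y: (num_frames, num_blocks)
        for p in np.unique(self.part):
            mask = self.part == p
            feats = X[:, mask].reshape(-1, X.shape[2])
            self.models[p] = NearestCentroid().fit(feats, y[:, mask].reshape(-1))
        return self

    def predict(self, X):
        # X: (num_blocks, F) for a single frame
        out = np.empty(X.shape[0], dtype=int)
        for p, model in self.models.items():
            mask = self.part == p
            out[mask] = model.predict(X[mask])
        return out

def vote(predictions):
    """Per-block mode over the predictions of several classifier sets."""
    stacked = np.stack(predictions)
    return np.array([np.bincount(col).argmax() for col in stacked.T])

def make_frame(rng, rows=4, cols=4):
    """Toy frame: class 0 on the left; on the right, class 2 on top and class 1 below."""
    r, c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    labels = np.where(c < cols // 2, 0, np.where(r < rows // 2, 2, 1)).reshape(-1)
    brightness = np.where(labels == 0, 0.0, 1.0) + 0.05 * rng.standard_normal(labels.size)
    return brightness[:, None], labels

rng = np.random.default_rng(0)
frames = [make_frame(rng) for _ in range(5)]
X_train = np.stack([f[0] for f in frames])
y_train = np.stack([f[1] for f in frames])
X_test, y_test = make_frame(rng)

baseline = PartitionedClassifier(4, 4, 1).fit(X_train, y_train)
global_acc = (baseline.predict(X_test) == y_test).mean()
sets = [PartitionedClassifier(4, 4, k).fit(X_train, y_train) for k in (2, 3, 4)]
partition_acc = (vote([s.predict(X_test) for s in sets]) == y_test).mean()
print(global_acc, partition_acc)
```

In this toy setup the global classifier cannot reliably tell classes 1 and 2 apart because their features coincide, while the position-specific classifiers can, which mirrors the "relative position" observation from the previous slide.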

  33. Experiment-1 • Always decreases the error by ~10%!

  34. Experiment-2 • Error decreased from 25% to 15%! • (Using 4-8 classifier sets is desirable)

  35. Experiment -3 (Smoothness test)

  36. Other enhancement: Label Transfer • Track features from previous frames using optical flow • Transfer the labels • Result: labels for ~45% of the image are transferred

  37. Cons • Memoryless • Doesn’t perform well when the appearance of the terrain varies.
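A rough sketch of the label-transfer step, assuming the optical flow has already been computed elsewhere and is given as one (dy, dx) pixel displacement per 16x16 block. The slides do not say how the flow is represented or how tracked features map to blocks, so the block-level representation, the `UNLABELED` marker, and the function name are all assumptions.

```python
import numpy as np

BLOCK = 16
UNLABELED = -1  # hypothetical marker for blocks that receive no transferred label

def transfer_labels(prev_labels, flow):
    """Move block labels from the previous frame to the current one.

    prev_labels: (rows, cols) integer labels of the previous frame's blocks.
    flow: (rows, cols, 2) displacement (dy, dx) in pixels of each block centre.
    Blocks that no tracked block lands on stay UNLABELED.
    """
    rows, cols = prev_labels.shape
    new_labels = np.full((rows, cols), UNLABELED, dtype=int)
    r, c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    centre_y = (r + 0.5) * BLOCK + flow[..., 0]
    centre_x = (c + 0.5) * BLOCK + flow[..., 1]
    new_r = np.floor(centre_y / BLOCK).astype(int)
    new_c = np.floor(centre_x / BLOCK).astype(int)
    inside = (new_r >= 0) & (new_r < rows) & (new_c >= 0) & (new_c < cols)
    # If several blocks land on the same cell, one of them wins.
    new_labels[new_r[inside], new_c[inside]] = prev_labels[inside]
    return new_labels

# Toy example: everything moves down by exactly one block.
prev = np.arange(16).reshape(4, 4) % 3
flow = np.zeros((4, 4, 2))
flow[..., 0] = BLOCK
curr = transfer_labels(prev, flow)
fraction_transferred = (curr != UNLABELED).mean()
print(curr)
print(fraction_transferred)
```

Blocks left unlabeled after the transfer would still need to go through the classifier; only the transferred fraction is saved.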

  38. Adaptive algorithm • Track patches in the recent frames. • New training data

  39. Adaptive algorithm

  40. Experiment • Closed loop test • ~5% absolute decrease in error, i.e., ~20% relative error-rate reduction • Road Run 1 Run 2
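The two figures on this slide describe the same improvement in absolute and in relative terms. Taking the rounded values at face value, they imply a baseline error of about 25%; that baseline is an inference from the arithmetic, not a number stated on this slide.

```python
import math

def relative_reduction(error_before, error_after):
    """Fraction of the original error that was removed."""
    return (error_before - error_after) / error_before

def implied_baseline(absolute_drop, relative_drop):
    """Baseline error consistent with a given absolute and relative drop."""
    return absolute_drop / relative_drop

baseline = implied_baseline(0.05, 0.20)               # about 0.25
reduction = relative_reduction(baseline, baseline - 0.05)  # about 0.20
print(baseline, reduction)
```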

  41. Demo

  42. Summary • Presented a fast terrain classification method. • Extended the method to adapt online. • More video processing methods in future.

  43. Conclusions and Future work • New techniques in scene reconstruction and scene recognition. • Reconstruction of piecewise planar scenes. • Main advantages • Handles the case where all the planes may not be visible in all views. • We also add inter homographies in our framework. • Next we address terrain recognition. • Collected our own challenging dataset. • We conducted various empirical studies. • Proposed two algorithms • Partition based method & Adaptive algorithm • Conducted several experiments to validate them.

  44. Conclusions and Future work • Moving from quasi-convex objective functions to convex objective functions. • Handling outliers • In the partition based algorithm, one could replace the simple mode operator with a weighted map. • The adaptive algorithm could be enhanced using state-of-the-art semi-supervised ML algorithms.

  45. Publications • Visesh Chari, Anil Nelakanti, Chetan Jakkoju and C. V. Jawahar. “Piecewise Planar Reconstruction using Convex Optimization.” In proceedings of Asian Conference on Computer Vision (ACCV 2009). • Chetan J., Madhava Krishna and C. V. Jawahar. “Fast and Spatially-smooth Terrain Classification using Monocular Camera.” In proceedings of International Conference on Pattern Recognition (ICPR 2010). • Chetan J., Madhava Krishna and C. V. Jawahar. “An Adaptive Outdoor Terrain Classification Methodology using Monocular Camera.” In proceedings of International Conference on Intelligent Robots and Systems (IROS 2010).

  46. Thank you  chetan@research.iiit.ac.in
