
AN ACTIVE VISION APPROACH TO OBJECT SEGMENTATION


Presentation Transcript


  1. AN ACTIVE VISION APPROACH TO OBJECT SEGMENTATION – Paul Fitzpatrick – MIT CSAIL

  2. flexible perception
  • Humanoid form is general-purpose, mechanically flexible
  • Robots that really live and work amongst us will need to be as general-purpose and adaptive perceptually as they are mechanically

  3. flexible visual perception
  • In robotics, vision is often used to guide manipulation
  • But manipulation can also guide vision
  • Important for…
    • Correction – detecting and recovering from incorrect perception
    • Experimentation – disambiguating inconclusive perception
    • Development – creating or improving perceptual abilities through experience

  4. flexible visual object perception
  • Tsikos & Bajcsy, 1991 – “Segmentation via manipulation”: simplify cluttered scenes by moving overlapping objects
  • Sandini et al., 1993 – “Vision during action”: interpret motion during manipulation to deduce object boundaries

  5. segmentation via manipulation Tsikos, Bajcsy, 1991

  6. vision during action Sandini et al, 1993

  7. active segmentation
  • Object boundaries are not always easy to detect visually
  • Solution: Cog sweeps its arm through the ambiguous area
  • Any resulting object motion helps segmentation
  • The robot can learn to recognize and segment the object without further contact

  8. active segmentation

  9. evidence for segmentation
  • Areas where motion is observed upon contact: classify as ‘foreground’
  • Areas where motion is observed immediately before contact: classify as ‘background’
  • Textured areas where no motion was observed: classify as ‘background’
  • Textureless areas where no motion was observed: no information
  • No need to model the background!
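A minimal sketch of this evidence-labelling step, assuming boolean motion masks for the frame at contact and the frames just before it plus a texture map are already available; the function name `label_evidence` and the label values are illustrative, not from the original system:

```python
import numpy as np

FOREGROUND, BACKGROUND, UNKNOWN = 1, 0, -1

def label_evidence(motion_at_contact, motion_before_contact, textured):
    """All inputs are boolean HxW arrays. Returns per-pixel labels
    following the rules above: motion at contact -> foreground,
    motion just before contact -> background, textured but still ->
    background, textureless and still -> no information."""
    labels = np.full(motion_at_contact.shape, UNKNOWN, dtype=np.int8)
    labels[motion_before_contact] = BACKGROUND           # the arm sweeping in
    labels[motion_at_contact & ~motion_before_contact] = FOREGROUND
    still = ~motion_at_contact & ~motion_before_contact
    labels[still & textured] = BACKGROUND                # textured but did not move
    return labels
```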

  10. minimum cut
  • “Allegiance” = cost of assigning two nodes to different layers (foreground versus background)
  • [Graph diagram: pixel nodes connected to a foreground node and a background node, with edge weights for allegiance to foreground, allegiance to background, and pixel-to-pixel allegiance]

  11. minimum cut (repeats the graph diagram from slide 10)
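A compact sketch of this formulation, grouping pixels by solving a max-flow/min-cut problem on the graph described above; networkx is used purely for illustration, and the terminal names, the uniform `smoothness` weight, and `mincut_segment` itself are assumptions rather than the original implementation:

```python
import networkx as nx
import numpy as np

def mincut_segment(fg_cost, bg_cost, smoothness=1.0):
    """fg_cost, bg_cost: HxW arrays giving each pixel's allegiance to the
    foreground and background terminals (the cost of cutting that link).
    smoothness: pixel-to-pixel allegiance between 4-connected neighbors.
    Returns a boolean HxW foreground mask."""
    h, w = fg_cost.shape
    G = nx.Graph()
    FG, BG = "foreground", "background"
    for y in range(h):
        for x in range(w):
            p = (y, x)
            G.add_edge(FG, p, capacity=float(fg_cost[y, x]))    # allegiance to foreground
            G.add_edge(p, BG, capacity=float(bg_cost[y, x]))    # allegiance to background
            if x + 1 < w:
                G.add_edge(p, (y, x + 1), capacity=smoothness)  # pixel-to-pixel allegiance
            if y + 1 < h:
                G.add_edge(p, (y + 1, x), capacity=smoothness)
    # the cheapest set of edges separating the two terminals defines the two layers
    _, (fg_side, _) = nx.minimum_cut(G, FG, BG)
    mask = np.zeros((h, w), dtype=bool)
    for node in fg_side:
        if node != FG:
            mask[node] = True
    return mask
```

Removing the cheapest set of edges that disconnects the two terminals leaves each pixel attached to whichever layer it has the stronger total allegiance to.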

  12. grouping (on synthetic data)
  • [Figure: “figure” points (known motion) and “ground” points (stationary, or gripper), with the proposed segmentation]

  13. point of contact
  • Motion that spreads suddenly, faster than the arm itself → contact
  • Motion that spreads continuously → arm or its shadow
  • [Figure: frames 1–10 of an approach-and-contact sequence]
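A rough sketch of this contact-detection rule, assuming a per-frame motion mask is already computed; the name `detect_contact_frame` and the `growth_factor` threshold are arbitrary illustrative choices:

```python
import numpy as np

def detect_contact_frame(motion_masks, growth_factor=3.0):
    """motion_masks: sequence of boolean HxW masks, one per frame.
    Returns the index of the first frame whose moving area grows much
    faster than the recent average (sudden spread, likely contact),
    or None if motion only ever spreads continuously (arm or shadow)."""
    areas = np.array([float(m.sum()) for m in motion_masks])
    growth = np.diff(areas)                       # per-frame change in moving area
    for t in range(2, len(growth)):
        baseline = max(growth[:t].mean(), 1.0)    # typical spread rate so far
        if growth[t] > growth_factor * baseline:
            return t + 1                          # frame where the jump appears
    return None
```

A spread confined to the arm and its shadow grows at roughly the arm's own speed, so only the jump at impact trips the threshold.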

  14. point of contact

  15. segmentation examples
  • [Figure: side tap and back slap examples, showing the impact event, the motion caused (red = novel, purple/blue = discounted), and the segmentation (green/yellow)]

  16. segmentation examples

  17. segmentation examples (car, table)

  18. boundary fidelity
  • [Scatter plot: measure of anisotropy (square root of the second Hu moment) against measure of size (area at poking distance), with clusters for cube, car, bottle, and ball]
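A small sketch of how the two axes of this plot could be computed from a poked-object mask with OpenCV; the function name and the exact normalization are assumptions made for illustration:

```python
import cv2
import numpy as np

def shape_features(mask):
    """mask: uint8 binary segmentation of a poked object.
    Returns (size, anisotropy): the area of the mask, and the square
    root of the second Hu moment, which grows as the shape elongates."""
    m = cv2.moments(mask, binaryImage=True)
    hu = cv2.HuMoments(m).flatten()
    area = m["m00"]                               # pixel count of the mask
    anisotropy = float(np.sqrt(max(hu[1], 0.0)))  # second Hu moment, square-rooted
    return area, anisotropy
```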

  19. signal to noise

  20. active segmentation
  • Not always practical!
  • No good for objects the robot can view but not touch
  • No good for very big or very small objects
  • But fine for objects the robot is expected to manipulate
  • [Image caption: “Head segmentation the hard way!”]

  21. other approaches
  • Human manipulation, first-person perspective (Charlie Kemp)
  • Human manipulation, external perspective (Artur Arsenio)
  • Robot manipulation, first-person perspective (Paul Fitzpatrick, Giorgio Metta)

  22. from first contact to close encounters
  • [Diagram labels: active segmentation; segmentation catalog; affordance exploitation (rolling); manipulator detection (robot, human); edge catalog; object detection (recognition, localization, contact-free segmentation)]

  23. object recognition
  • Geometry-based
    • Objects and images modeled as sets of point/surface/volume elements
    • Example real-time method: store geometric relationships in a hash table
  • Appearance-based
    • Objects and images modeled as sets of features closer to the raw image
    • Example real-time method: use histograms of simple features (e.g. color)
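As a quick illustration of the appearance-based example mentioned here, a sketch of color-histogram matching with OpenCV; the bin count and the correlation metric are arbitrary choices, not details from the talk:

```python
import cv2

def color_histogram(bgr_image, bins=8):
    """Coarse 3-D color histogram of an image region, normalized so
    regions of different sizes can be compared."""
    hist = cv2.calcHist([bgr_image], [0, 1, 2], None,
                        [bins, bins, bins], [0, 256, 0, 256, 0, 256])
    cv2.normalize(hist, hist)
    return hist

def histogram_similarity(h1, h2):
    """Correlation between two histograms: 1.0 means identical color
    distributions, values near 0 mean unrelated."""
    return cv2.compareHist(h1, h2, cv2.HISTCMP_CORREL)
```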

  24. geometry+appearance
  • Advantages: more selective; fast
  • Disadvantages: edges can be occluded; 2D method
  • Property: no need for offline training
  • Features: angles + colors + relative sizes – invariant to scale, translation, in-plane rotation

  25. details of features
  • Distinguishing elements:
    • Angle between regions (edges)
    • Position of regions relative to their projected intersection point (normalized for scale, orientation)
    • Color at three sample points along the line between region centroids
  • Output of a feature match:
    • Predicts approximate center and scale of the object if a match exists
  • Weighting for combining features:
    • Summed at each possible position of the center; consistency check for scale
    • Weighted by frequency of occurrence of the feature in object examples, and by edge length
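A toy sketch of this kind of pairwise feature and the center-voting step; the descriptor layout, the sample positions, and the voting scheme are simplified assumptions rather than the original implementation:

```python
import numpy as np

def pair_feature(e1, e2, image):
    """e1, e2: edge segments as ((x0, y0), (x1, y1)) endpoint pairs.
    Returns a descriptor (angle between the edges, their length ratio,
    colors at three points between the edge centroids) together with the
    midpoint and baseline length a stored match would use to vote for
    the object's center and scale."""
    d1 = np.subtract(e1[1], e1[0]).astype(float)
    d2 = np.subtract(e2[1], e2[0]).astype(float)
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    angle = np.arctan2(cross, np.dot(d1, d2))                  # rotation-invariant
    ratio = np.linalg.norm(d1) / (np.linalg.norm(d2) + 1e-6)   # scale-invariant
    c1, c2 = np.mean(e1, axis=0), np.mean(e2, axis=0)
    colors = [image[int(round(p[1])), int(round(p[0]))]        # row = y, col = x
              for p in (c1 + t * (c2 - c1) for t in (0.25, 0.5, 0.75))]
    return (angle, ratio, colors), (c1 + c2) / 2.0, float(np.linalg.norm(c2 - c1))

def vote_for_center(matches, image_shape):
    """matches: list of ((cx, cy), weight) pairs, one per descriptor that
    matched a stored object feature. Sums the weighted votes in an
    accumulator; the peak is the hypothesized object center."""
    acc = np.zeros(image_shape[:2], dtype=float)
    for (cx, cy), weight in matches:
        ix, iy = int(round(cx)), int(round(cy))
        if 0 <= iy < acc.shape[0] and 0 <= ix < acc.shape[1]:
            acc[iy, ix] += weight
    return np.unravel_index(np.argmax(acc), acc.shape), acc
```

Because only an angle, a length ratio, and colors along the joining line are stored, the descriptor is unchanged by translation, in-plane rotation, and scaling, matching the invariances claimed on the previous slide.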

  26. real object in real images

  27. other examples

  28. other examples

  29. other examples

  30. just for fun
  • [Figure: “look for this…” / “…in this result”]

  31. multiple objects
  • [Figure panels: camera image; implicated edges found and grouped; response for each object]

  32. yellow on yellow

  33. first time seeing a ball
  • [Figure panels: robot’s current view; recognized object (as seen during poking)]
  • Sequence: sees ball, “thinks” it is the cube → pokes and segments the ball → correctly differentiates ball and cube

  34. open object recognition

  35. the attention system
  • [Architecture diagram modules: low-level salience filters; poking sequencer; motor control (arm); object recognition/localization (wide); tracker; motor control (eyes, head, neck); object recognition/localization (foveal); egocentric map]

  36. attention

  37. and so on…

  38. finding manipulators
  • Analogous to finding objects
  • Object
    • Definition: physically coherent structure
    • How to find one: poke around and see what moves together
  • Actor
    • Definition: something that acts on objects
    • How to find one: see what pokes objects

  39. similar human and robot actions
  • The object connects robot and human action

  40. catching manipulators in the act
  • [Video frames: manipulator approaches object → contact!]
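One plausible sketch of picking out the manipulator at the moment of contact, assuming the pre-contact motion mask and the object segmentation are already available; `find_manipulator` and the `touch_radius` dilation are illustrative assumptions, not the original method:

```python
import numpy as np
from scipy import ndimage

def find_manipulator(motion_before_contact, object_mask, touch_radius=3):
    """motion_before_contact: boolean mask of what was moving just before
    contact; object_mask: boolean mask of the object segmented at contact.
    Returns the connected moving region that touches the object, i.e. the
    thing that poked it, or None if nothing qualifies."""
    labels, n = ndimage.label(motion_before_contact)
    # grow the object slightly so "touching" tolerates a small gap
    near_object = ndimage.binary_dilation(object_mask, iterations=touch_radius)
    for region in range(1, n + 1):
        candidate = labels == region
        if np.any(candidate & near_object):
            return candidate
    return None
```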

  41. modeling manipulators

  42. manipulator recognition

  43. theoretical goal: a virtuous circle
  • Use the constraint of a familiar activity to discover an unfamiliar entity used within it
  • Reveal the structure of unfamiliar activities by tracking familiar entities into and through them
  • [Diagram: familiar activities ↔ familiar entities (objects, actors, properties, …)]

  44. conclusions
  • Active segmentation can make any segmentation strategy better
  • Ideal for learning about manipulable objects – an important class of object for humanoid robots
  • Doesn’t require a database of objects to be built up by hand (at least, not by human hand)
  • [Image caption: “Ow!”]
