Computational Vision Jitendra Malik University of California at Berkeley
Taxonomy of Vision Problems • Reconstruction: estimate parameters of the external 3D world. • Visual Control: visually guided locomotion and manipulation. • Segmentation: partition I(x,y,t) into subsets corresponding to separate objects. • Recognition: classes (face vs. non-face) and activities (gesture, expression).
Reconstruction • Computer graphics is the forward problem: given scene geometry, reflectances and lighting, synthesize an image. • Computer vision must address the inverse problem: given one or more images, reconstruct the scene geometry, reflectances and illumination.
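To make the forward/inverse distinction concrete, here is a minimal sketch for a single surface point. It assumes a Lambertian surface, known distant light directions, and no shadowing; none of these assumptions come from the slides, and the names `render`, `invert`, and `lights` are illustrative only. Recovering a normal from several known lights is usually called photometric stereo, and it is used here purely as a small, solvable instance of the inverse problem.

```python
import numpy as np

def render(normal, albedo, lights):
    """Forward problem: Lambertian intensity of one surface point under each light."""
    # Clip at zero so a light behind the surface contributes nothing.
    return albedo * np.clip(lights @ normal, 0.0, None)

def invert(intensities, lights):
    """Inverse problem: recover (normal, albedo), assuming known lights and no shadowing."""
    # Solve lights @ g = intensities, where g = albedo * normal.
    g, *_ = np.linalg.lstsq(lights, intensities, rcond=None)
    albedo = np.linalg.norm(g)
    return g / albedo, albedo

# Hypothetical scene: three unit-length light directions, one surface point.
lights = np.array([[0.0, 0.0, 1.0],
                   [0.6, 0.0, 0.8],
                   [0.0, 0.6, 0.8]])
true_normal = np.array([0.2, -0.1, 1.0])
true_normal /= np.linalg.norm(true_normal)
true_albedo = 0.7

image = render(true_normal, true_albedo, lights)   # graphics: scene -> image
est_normal, est_albedo = invert(image, lights)     # vision: image -> scene
```

The inverse step is well posed here only because the lighting is known and the reflectance model is simple. With unknown illumination or more general reflectance, the same image admits many explanations.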
Recovering geometry • Historical roots in photogrammetry and analysis of 3D cues in human vision • Single images adequate given knowledge of object class • Multiple images make the problem easier, but not trivial as corresponding points must be identified.
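Once corresponding points have been identified in two images, a 3D point can be recovered by triangulation. The sketch below uses a standard linear (DLT) formulation and assumes the two 3x4 camera matrices are already known. The cameras, the test point, and the function names are hypothetical and are not taken from the slides.

```python
import numpy as np

def project(P, X):
    """Perspective projection of a 3D point X (3,) by a 3x4 camera matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation from one correspondence x1 <-> x2."""
    # Each image point gives two linear constraints on the homogeneous 3D point.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    Xh = Vt[-1]
    return Xh[:3] / Xh[3]

# Two hypothetical cameras: one at the origin, one shifted one unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.5, -0.2, 4.0])
x1, x2 = project(P1, X_true), project(P2, X_true)   # a perfect correspondence
X_est = triangulate(P1, P2, x1, x2)
```

With noise-free correspondences the linear solution reproduces the original point. The hard part named on the slide, finding `x1` and `x2` automatically, is assumed solved here.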
Taj Mahal modeled from one photograph by G. Borshukov
Recovered Campus Model: Campanile + 40 Buildings (Debevec et al.)
Inverse Global Illumination (Yu et al.): Reflectance Properties, Radiance Maps, Light Sources, Geometry
Challenges in Reconstruction • Finding correspondences automatically • Optimal estimation of structure from n views under perspective projection • Models of reflectance and texture for natural materials and objects
Control • Visual feedback signal for control of manipulation tasks such as grasping, moving and assembly • Visual feedback for guiding locomotion • Obstacle avoidance for a moving robot • Lateral and longitudinal control of driving
Challenges in control • Delay in feedback loop due to visual processing • Hierarchies in sensory motor control • Open loop or closed loop • Discrete planning or continuous control
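The effect of delay in a closed loop can be illustrated with a toy simulation. This is a sketch under stated assumptions: a discrete-time integrating controller, a fixed target, and a measurement that arrives `delay` steps late. The gain and delay values are hypothetical and do not describe any particular robot or vehicle.

```python
def simulate(gain, delay, steps=200, target=1.0):
    """Track `target` using a measurement that is `delay` steps old.

    Returns the tracking error after every step.
    """
    x = [0.0] * (delay + 1)      # state history, so x[-1 - delay] is always defined
    errors = []
    for _ in range(steps):
        measured = x[-1 - delay]                      # stale visual measurement
        x.append(x[-1] + gain * (target - measured))  # correct toward the target
        errors.append(target - x[-1])
    return errors

immediate = simulate(gain=0.5, delay=0)   # error halves at every step
delayed = simulate(gain=0.5, delay=4)     # same gain, but acting on old data
```

In this toy model the same gain that converges quickly without delay produces growing oscillations when the measurement is four steps old. Lowering the gain restores stability at the cost of a slower response.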
Boundaries of image regions defined by a number of attributes • Brightness/color • Texture • Motion • Stereoscopic depth • Familiar configuration
Approaches • Fitting a piecewise smooth surface to the image (e.g. Mumford and Shah) • Probabilistic inference using a Markov Random Field model of the image (e.g. Geman and Geman) • Graph partitioning using spectral techniques (e.g. Shi and Malik)
Image Segmentation as Graph Partitioning • Build a weighted graph G=(V,E) from the image • V: image pixels • E: connections between pairs of nearby pixels • Partition the graph so that similarity within each group is large and similarity between groups is small: Normalized Cuts [Shi & Malik 97]
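The graph construction above can be sketched on a tiny one-dimensional "image". This is an illustrative simplification, not the authors' implementation: the affinity uses intensity difference only, the neighborhood radius and `sigma` are hypothetical values, and the split is made at zero on the second-smallest generalized eigenvector of (D - W) y = lambda D y rather than by searching for the best threshold.

```python
import numpy as np

def normalized_cut_bipartition(W):
    """Two-way split from the second-smallest eigenvector of (D - W) y = lambda D y."""
    d = W.sum(axis=1)                      # node degrees (diagonal of D)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Equivalent symmetric problem: D^{-1/2} (D - W) D^{-1/2} v = lambda v.
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)     # eigenvalues in ascending order
    y = d_inv_sqrt * eigvecs[:, 1]         # map back to the generalized eigenvector
    return y > 0                           # simplification: split at zero

# Hypothetical 1D image: four dark pixels followed by four bright pixels.
intensity = np.array([0.10, 0.12, 0.09, 0.11, 0.90, 0.88, 0.92, 0.91])
sigma, radius = 0.5, 2

idx = np.arange(len(intensity))
diff = intensity[:, None] - intensity[None, :]
dist = np.abs(idx[:, None] - idx[None, :])
# Edges connect distinct pixels within `radius`; weight is intensity similarity.
W = np.exp(-(diff ** 2) / sigma ** 2) * ((dist > 0) & (dist <= radius))

labels = normalized_cut_bipartition(W)     # boolean group label per pixel
```

The sign of an eigenvector is arbitrary, so which group is labeled `True` carries no meaning. Only the partition itself does.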
Challenges in Segmentation • Interaction of multiple cues • Local measurements to global percepts • Interplay of image-driven and object model driven processing
Recognition • Possible for both instances and object classes (Mona Lisa vs. faces, or Beetle vs. cars) • Tolerant to changes in pose and illumination, and to occlusion
Recognition of Gait and Gesture (figure: run; measurement, recognition, animation)
Challenges in recognition • Unified framework for segmentation and recognition • Representing shape variability in a category • Interplay of discriminative vs generative models
Core disciplines • Geometry: differential geometry, projective geometry • Probability and Statistics: reconstruction = estimation, control = decision theory, segmentation = clustering, recognition = classification