
OBJECT RECOGNITION



Presentation Transcript


  1. OBJECT RECOGNITION Presented by Vishal Dalmiya Department of Electrical Engineering University of Texas at Arlington Computer Vision EE6358

  2. Milestones • Introduction • System components • Complexity of object recognition • Object Representation • Object-centered representation • Observer-centered representation • Feature Detection • Recognition Strategies • Classification • Nearest Neighbor Classifiers • Bayesian classifier • Matching • Feature matching • Symbolic matching • Feature Indexing • Verification • Template • Morphological approach • Symbolic • Analog methods • Demos • Recent paper on object recognition

  3. Introduction • Object recognition can be defined as a labeling problem based on models of known objects. [Figure: components of an object recognition system: Image -> Feature Detector -> Features -> Hypothesis Formation -> Candidate Objects -> Hypothesis Verification -> Object Class, with the Modelbase feeding the hypothesis stages.]

  4. System Components • Model database (also called modelbase) • Contains qualitative or functional descriptions and geometric surface information. • Feature detector • Size, color and shape are common features. • The feature detector applies operators to images and identifies the locations of features, which help in forming object hypotheses. • Hypothesizer • Assigns likelihoods to objects possibly present in the scene. • This step reduces the search space for the recognizer. • Hypothesis verifier • Helps in selecting the object with the highest likelihood.

  5. Various factors for selection of appropriate methods in the process of object recognition • In some applications, such as pattern recognition, hypothesis formation alone is enough for identifying the objects. • In applications such as template matching, the hypothesis verifier plays the most crucial role. • The central issues that should be considered in designing an object recognition system are: • Object or model representation • Feature extraction • Feature-model matching • Hypothesis formation • Object verification • Depending on the above criteria, the importance of each block of object recognition is identified.

  6. Complexity of object recognition • Scene constancy • Depends on whether the images are acquired under similar conditions of illumination, background and viewpoint. • Image-model spaces • Depends on the nature of the image, which could be 2-D or 3-D, and affects feature detection. • Number of objects in the model database • Determines the requirements for the hypothesis formation stage. • Number of objects in an image and possibility of occlusion • The more objects in the image, the greater the chance of occlusion. • Occlusion leads to the absence of expected features and the generation of unexpected new ones. • It affects the hypothesis formation stage.

  7. Object Representation • Observer-centered representation: • Applicable to objects that appear in relatively few stable positions with respect to the camera. • Useful in recognizing remote objects in an image with relatively different characteristics. • Object-centered representation: • Uses descriptions of objects in a coordinate system attached to the objects. • Constructive solid geometry • Uses simple volumetric primitives such as blocks, cones, cylinders and spheres, and a set of Boolean operations: union, intersection and difference. • Spatial occupancy • There are many variants of this representation, such as voxel representation, octrees and tetrahedral cell decomposition. • Gives a detailed low-level description of the object. • Multiple view representation • A 3-D object can be represented by its aspect graph. • An aspect graph represents all stable views of an object.

  8. CSG and Spatial Occupancy [Figure: a voxel representation of an object, and a CSG representation of an object.]

  9. Surface Boundary representation, Sweep representation • Surface boundary representation • A solid object can be represented by defining the surfaces that bound it. • These representations vary from triangular patches to non-uniform rational B-splines (NURBS). • Sweep representation: Generalized Cylinders • An object shape is represented by a three-dimensional space curve that acts as the spine or axis of the cylinder, a two-dimensional cross-sectional figure, and a sweeping rule that defines how the cross section is swept along the space curve. [Figure: an object and its aspect graph.]

  10. Cylindrical Representation An object and its generalized cylindrical representation

  11. Feature detection • Global Features • Area (size), perimeter, Fourier descriptors and moments. • Can be evaluated for all points in a region or along a boundary. • Establish the location of points, intensity characteristics and spatial relations. • Local Features • Distinguishable small areas of a region. • Curvature, boundary segments and corners are commonly used local features. • Relational Features • Based on the relative positions of different entities such as regions, closed contours or local features. • These features include distances between features and relative orientation measurements.
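Global features such as area and centroid (from the region's moments) can be computed directly from a binary region mask. A minimal sketch, assuming NumPy; the region and the choice of features are illustrative:

```python
import numpy as np

def global_features(mask):
    """Compute simple global features of a binary region:
    area (zeroth moment m00) and centroid (m10/m00, m01/m00)."""
    ys, xs = np.nonzero(mask)
    area = len(xs)        # m00: number of foreground pixels
    cx = xs.mean()        # m10 / m00
    cy = ys.mean()        # m01 / m00
    return area, (cx, cy)

# A 3x3 square region inside a 5x5 image.
mask = np.zeros((5, 5), dtype=bool)
mask[1:4, 1:4] = True
area, (cx, cy) = global_features(mask)
```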

  12. Local and global representations • In most cases, the relative position of the entities defines the object. [Figure: an object and its partial representation using multiple local and global features.]

  13. Recognition strategies • A recognition strategy is the sequence of steps to be performed after appropriate feature detection. [Figure: alternative recognition strategies, e.g. Features -> Classifier -> Object (classification), and Features -> Hypothesizer -> Verifier -> Object (sequential matching).]

  14. Classification • Nearest Neighbor Classifiers • Assume each object class is characterized by N features, and let M denote the total number of object classes. • Let fij denote the ideal value of the jth feature for object class i, and let uj denote the jth feature value of the unknown object U. • The class decision is based on a measure of similarity between the unknown object and each object class. • A common similarity measure is the Euclidean distance di = [ sum over j of (fij - uj)^2 ]^(1/2); the unknown object is assigned to the class with the smallest di.

  15. Contd… • Two common approaches in such a situation are: • The centroid of the cluster is taken as the prototype feature point for the class, and the distance to this centroid is computed. • The distance to the closest point of the cluster is computed. [Figure: a feature space with axes F1 and F2; each class is represented by a cluster of points when more than one object belongs to the same class.]
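The centroid-prototype variant above can be sketched as a minimal nearest-neighbor rule, assuming NumPy; the class prototypes and the unknown feature vector are hypothetical:

```python
import numpy as np

def nearest_class(u, prototypes):
    """Assign unknown feature vector u to the class whose prototype
    (e.g. cluster centroid) is closest in Euclidean distance."""
    dists = [np.linalg.norm(u - p) for p in prototypes]
    return int(np.argmin(dists))

# Two hypothetical class prototypes in a 2-D feature space (F1, F2).
prototypes = [np.array([0.0, 0.0]), np.array([10.0, 10.0])]
u = np.array([1.0, 2.0])
label = nearest_class(u, prototypes)  # closest to the first prototype
```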

  16. Bayesian Classifier • Bayesian Classifier • This approach plays an important role when the feature values of different objects overlap. • It uses probabilistic knowledge about the features of objects and the frequency of objects. • Let P(wj) be the a priori probability of class j. With no feature information, an object is assigned to the class with maximum P(wj). • The class-conditional probability p(x/wj) gives the probability of observing feature value x given that the object belongs to class j. • By Bayes rule, the a posteriori probability is P(wj/x) = p(x/wj) P(wj) / p(x); the unknown object should be assigned to the class with the highest P(wj/x). • The same concept extends directly to multiple features.

  17. Conditional probability [Figure: the class-conditional density functions, and the a posteriori probabilities for two different values of the a priori probabilities.]
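The Bayes-rule computation of the a posteriori probabilities can be sketched in pure Python; the Gaussian class-conditional densities and the priors below are illustrative assumptions, not values from the slides:

```python
import math

def gaussian(mu, sigma):
    """Return a 1-D Gaussian density function with mean mu, std sigma."""
    return lambda x: math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def posterior(x, priors, likelihoods):
    """Bayes rule: P(w_j | x) = p(x | w_j) P(w_j) / p(x),
    where p(x) is the sum of the joint terms over all classes."""
    joint = [lik(x) * pr for lik, pr in zip(likelihoods, priors)]
    evidence = sum(joint)
    return [j / evidence for j in joint]

# Two hypothetical classes with overlapping feature distributions.
priors = [0.5, 0.5]
likelihoods = [gaussian(0.0, 1.0), gaussian(4.0, 1.0)]
post = posterior(1.0, priors, likelihoods)
label = max(range(len(post)), key=lambda j: post[j])
```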

  18. Off-Line Computations • The classification approaches above operate in the feature space. • Based on knowledge of the feature characteristics of objects, the feature space is partitioned and a class decision is assigned to each point in it. • All of these computations are done before recognition of unknown objects begins; this is called off-line computation. • Off-line computations reduce the computation required at run time: the recognition process is effectively converted to a table look-up and can therefore be implemented very quickly.
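The look-up-table idea can be sketched for a one-dimensional feature axis, assuming NumPy; the classification rule and the quantization are illustrative:

```python
import numpy as np

def build_lut(classify, lo, hi, bins):
    """Off-line: quantize the feature axis into cells and precompute a
    class label for each cell center."""
    centers = np.linspace(lo, hi, bins)
    return centers, np.array([classify(x) for x in centers])

def lut_classify(x, centers, labels):
    """Run time: recognition reduces to finding the nearest cell."""
    idx = np.argmin(np.abs(centers - x))
    return labels[idx]

# Hypothetical off-line rule: class 0 below a threshold of 5, else class 1.
centers, labels = build_lut(lambda x: 0 if x < 5 else 1, 0.0, 10.0, 101)
```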

  19. Matching • Feature Matching • The similarity of the object with the ith class can be written as Si = sum over j of wj sj, where sj is the similarity value of the jth feature and wj is the weight for the jth feature, selected based on the relative importance of the feature. • The object is labeled as belonging to class k if Sk is the highest similarity value.
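The weighted similarity Si = sum_j wj sj can be sketched in pure Python; the weights and the per-feature similarity values for the two classes are hypothetical:

```python
def weighted_similarity(sims, weights):
    """S_i = sum_j w_j * s_j: combine per-feature similarity values s_j
    using weights w_j chosen by the relative importance of each feature."""
    return sum(w * s for w, s in zip(weights, sims))

weights = [0.7, 0.3]                       # hypothetical feature weights
per_class_sims = [[0.9, 0.2], [0.4, 0.8]]  # s_j values for two classes
S = [weighted_similarity(s, weights) for s in per_class_sims]
label = max(range(len(S)), key=lambda i: S[i])  # class with highest S_k
```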

  20. Symbolic Matching • Symbolic Matching • An object can be represented by the relations among its features: a graph in which each node represents a feature and each edge represents a relation between features. • Rijk defines the relation between nodes j and k of graph i. A similarity measure between graphs considers the similarities of all nodes and relations. • This approach is very useful in applications where the objects to be recognized are only partially visible.

  21. Feature Indexing • Feature Indexing • This approach is adopted when the number of objects is very large. • Feature indexing uses features of the objects to structure the modelbase. • When a feature from the indexing set is detected in an image, it reduces the search space to the models that contain it. • Detecting additional indexing features reduces the search space further. • In the indexed database, in addition to the names of the objects and their models, information about the orientation and pose in which the indexing feature appears should always be kept; this information helps in the verification stage.
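A feature index can be sketched as a hash table mapping indexing features to candidate models, in pure Python; the model names and feature labels below are made up for illustration:

```python
from collections import defaultdict

# Off-line: index the modelbase by distinctive features, so that a
# detected feature retrieves only the candidate models containing it.
modelbase = {                      # hypothetical models and their features
    "cup":    ["handle-arc", "ellipse-rim"],
    "pliers": ["hinge-point", "jaw-corner"],
    "mug":    ["handle-arc", "cylinder-side"],
}
index = defaultdict(list)
for name, feats in modelbase.items():
    for f in feats:
        index[f].append(name)      # pose/orientation info could be stored too

# Run time: a detected indexing feature narrows the search space.
candidates = index["handle-arc"]
```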

  22. Verification • Template Matching • An instance of the template g[i,j] has to be found in an image f[i,j]. • The template is placed at a point in the image, and the intensity values in the template are compared with those in the image. • The sum of squared errors is the most popular measure for template matching. • A reasonable strategy for obtaining all locations and instances of the template is to shift the template and compute the match measure at every point in the image. Thus for an m x n template, M[i,j] = sum over k, l of ( g[k,l] - f[i+k, j+l] )^2. • The normalized cross-correlation can be computed as C[i,j] = sum over k, l of f[i+k, j+l] g[k,l], normalized by the energies of the template and of the image region under it.
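The shift-and-compare strategy can be sketched as a brute-force sum-of-squared-errors map over all valid template positions, assuming NumPy; the image and template are toy examples:

```python
import numpy as np

def match_ssd(f, g):
    """Slide template g over image f and return the map of
    sum-of-squared-errors M[i,j] at every valid position."""
    m, n = g.shape
    H, W = f.shape
    out = np.empty((H - m + 1, W - n + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            d = f[i:i+m, j:j+n] - g
            out[i, j] = np.sum(d * d)
    return out

f = np.zeros((5, 5))
f[2:4, 2:4] = 1.0                  # a bright 2x2 patch in the image
g = np.ones((2, 2))                # template matching that patch
ssd = match_ssd(f, g)
best = np.unravel_index(np.argmin(ssd), ssd.shape)  # best match location
```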

  23. Template Matching [Figure: a template (a), an image (b), the result of the template matching computation (c), and the thresholded result showing the match locations (d), with T = 240.]

  24. Morphological opening in object recognition A structuring element (a), an image (b), and the result of morphological opening (c)
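For binary images, morphological opening (erosion followed by dilation) is equivalent to the union of all placements of the structuring element that fit entirely inside the foreground, which gives a compact sketch, assuming NumPy; the image and structuring element are toy examples:

```python
import numpy as np

def opening(img, se):
    """Morphological opening of a binary image: stamp the structuring
    element back (dilation) at every position where it fits entirely
    inside the foreground (erosion hit)."""
    m, n = se.shape
    out = np.zeros_like(img)
    for i in range(img.shape[0] - m + 1):
        for j in range(img.shape[1] - n + 1):
            if img[i:i+m, j:j+n][se].all():   # se fits here
                out[i:i+m, j:j+n] |= se       # stamp it back
    return out

img = np.zeros((5, 5), dtype=bool)
img[1:4, 1:4] = True          # a 3x3 square: survives opening
img[0, 4] = True              # an isolated pixel: removed by opening
se = np.ones((2, 2), dtype=bool)
opened = opening(img, se)
```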

  25. Symbolic • Graph Isomorphism • Given two graphs (V1,E1) and (V2,E2), find a one-to-one, onto mapping (an isomorphism) f between V1 and V2 such that for each edge of E1 connecting any pair of nodes D1 and D1' of V1, there is an edge of E2 connecting f(D1) and f(D1'). • This approach can be used in the case of completely visible objects. • Subgraph Isomorphism • Finds an isomorphism between a graph (V1,E1) and subgraphs of another graph (V2,E2). • Many heuristics have been proposed to solve the graph matching problem; they consider: • Variability in properties and relations. • Absence of properties or relations. • The fact that a model is an abstraction of a class of objects. • One way to formulate similarity is to consider the arcs in the graph as springs connecting two masses at the nodes. • The quality of the match is then a function of the goodness of fit of the template locally and the amount of energy needed to stretch the springs to force the unknown onto the model reference data.
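A brute-force graph isomorphism check can be sketched in pure Python; it is exponential in the number of nodes, so it is only viable for tiny graphs (practical systems use heuristics, as the slide notes):

```python
from itertools import permutations

def find_isomorphism(nodes1, edges1, nodes2, edges2):
    """Search for a bijection f: V1 -> V2 such that the edge sets map
    onto each other exactly. Returns the mapping or None."""
    nodes1, nodes2 = list(nodes1), list(nodes2)
    if len(nodes1) != len(nodes2):
        return None
    e1 = {frozenset(e) for e in edges1}   # undirected edges
    e2 = {frozenset(e) for e in edges2}
    for perm in permutations(nodes2):
        f = dict(zip(nodes1, perm))
        if {frozenset((f[a], f[b])) for a, b in e1} == e2:
            return f
    return None

# A triangle is isomorphic to any relabeled triangle.
f = find_isomorphism("abc", [("a", "b"), ("b", "c"), ("a", "c")],
                     "xyz", [("x", "y"), ("y", "z"), ("x", "z")])
```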

  26. Analogical methods • Analogical Methods • A measure of similarity between two curves can be obtained by comparing them on the same frame of reference and directly measuring the difference between them at every point. [Figure: matching of two entities by directly measuring the errors between them.]
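The direct point-wise comparison can be sketched in pure Python; the sampled curves below are illustrative:

```python
def curve_error(c1, c2):
    """Compare two curves sampled at corresponding points on the same
    frame of reference by summing the squared differences."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2))

ref   = [0.0, 1.0, 4.0, 9.0]   # reference curve samples
cand  = [0.0, 1.1, 3.9, 9.2]   # a close candidate
other = [0.0, 2.0, 2.0, 2.0]   # a poor candidate
# The candidate with the smaller error is the better match.
```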

  27. Demos • http://bigwww.epfl.ch/demo/templatematching/tm_kernel33/demo.html\ • http://www.betaface.com/image_select.asp • Face tracking: http://nlpr-web.ia.ac.cn/english/irds/demo/demo1.htm • http://visual.ipan.sztaki.hu/corner/corner_click.html • Face Detection: http://demo.pittpatt.com/

  28. Distinctive Image Features from Scale-Invariant Keypoints • The features are invariant to image scaling and rotation, and partially invariant to changes in illumination and 3D camera viewpoint. • Features are highly distinctive, which allows a single feature to be matched with high probability against a large database of features. • Four major stages of computation are used to generate the set of image features: • Scale-space extrema detection: • Searches over all scales and image locations. • A difference-of-Gaussian function is used to identify candidate keypoints. • Keypoint localization: • At each candidate location, a detailed model is fit to determine location and scale. • Keypoints are selected based on measures of their stability. • Orientation assignment: • One or more orientations are assigned to each keypoint based on local image gradient directions. • All future operations are performed on image data that has been transformed relative to the assigned orientation, scale and location of each feature. • Keypoint descriptor: • The local image gradients are measured at the selected scale in the region around each keypoint. • These are transformed into a representation that allows for significant levels of local shape distortion and change in illumination.

  29. Contd…. • Fast nearest-neighbor algorithms are used for matching features and recognizing objects. The probability that a match is correct can be determined by taking the ratio of the distance to the closest neighbor to the distance to the second closest. [Figure: using a database of keypoints, the solid line shows the PDF of this ratio for correct matches, while the dotted line is for matches that were incorrect.]
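The ratio test can be sketched as follows, assuming NumPy; the 2-D descriptors are made-up toy data (real SIFT descriptors are 128-dimensional), and the 0.8 threshold follows the spirit of Lowe's test:

```python
import numpy as np

def ratio_test_match(desc, database, threshold=0.8):
    """Lowe-style ratio test: accept a match only when the distance to
    the nearest database descriptor is much smaller than the distance
    to the second nearest (ratio below threshold); otherwise reject
    the match as ambiguous and likely incorrect."""
    dists = np.linalg.norm(database - desc, axis=1)
    order = np.argsort(dists)
    nearest, second = dists[order[0]], dists[order[1]]
    if nearest < threshold * second:
        return int(order[0])
    return None

db = np.array([[0.0, 0.0], [5.0, 5.0], [5.0, 6.0]])
good = ratio_test_match(np.array([0.1, 0.0]), db)  # unambiguous match
bad  = ratio_test_match(np.array([5.0, 5.5]), db)  # two equally close: rejected
```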

  30. Results of SIFT (Scale Invariant Feature Transform)

  31. Contd….

  32. References • Book: • “Machine Vision” by Ramesh Jain, Rangachar Kasturi and Brian G. Schunck. • Papers: • David G. Lowe, 2004, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision. • Websites: • http://bigwww.epfl.ch/demo/templatematching/tm_kernel33/demo.html\ • http://www.betaface.com/image_select.asp • Face tracking: http://nlpr-web.ia.ac.cn/english/irds/demo/demo1.htm • http://visual.ipan.sztaki.hu/corner/corner_click.html • Face Detection: http://demo.pittpatt.com/
