360 likes | 365 Views
Object Recognition Using Genetic Algorithms. CS773C Advanced Machine Intelligence Applications Spring 2008: Object Recognition. 2D Case. Recover the geometric transformation that aligns the model(s) with the scene. (affine transformation). Image-Space Approaches.
E N D
Object Recognition UsingGenetic Algorithms CS773C Advanced Machine Intelligence Applications Spring 2008: Object Recognition
2D Case • Recover the geometric transformation that aligns the model(s) with the scene. (affine transformation)
Image-Space Approaches • Identify a set of features from the unknown scene which approximately match a set of features from a model object. • Recover the geometric transformation that the model object has undergone. • Examples: • Interpretation tree (Grimson & Lozano-Perez, 1987) • Alignment (Huttenlocher and Ullman, 1990) • Geometric hashing (Lamdan et al., 1990)
Transformation-Space Approaches • Search the transformation space. • Find a transformation which aligns a large number of model features with the scene. • Examples: • Hough-transform based methods (Ballard, 1981). • Pose clustering techniques (Cass, 1988)
Why Using GAs for Object Recognition? • Genetic algorithms were designed to efficiently search large solution spaces. • Both the image and transformation spaces are very large! Image space: M3S3 possible alignments Transformation space: much larger! G. Bebis. S. Louis, Y. Varol, and A. Yfantis, "Genetic Object Recognition Using Combinations of Views", IEEE Transactions on Evolutionary Computation, vol 6, no. 2, pp. 132-146, April 2002.
Genetic Algorithms (GAs) Review • What is a GA? • An optimization technique for searching very large spaces. • Inspired by the biological mechanisms of natural selection and reproduction. • What are the main characteristics of a GA? • Global optimization technique. • Uses objective function information, not derivatives. • Searches probabilistically using a population of structures (i.e., candidate solutions using some encoding). • Structures are modified at each iteration using selection, crossover, and mutation.
Structure of GA • 10010110… 10010110… • 01100010… 01100010… • 10100100... 10100100… • 10010010… 01111001… • 01111101… 10011101… Evaluation and Selection Crossover Mutation Current Generation Next Genaration
Encoding and Fitness Evaluation • Encoding scheme • Transforms solutions in parameter space into finite length strings (chromosomes) over some finite set of symbols. • Fitness function • Evaluates the goodness of a solution.
Selection Operator • Probabilistically filters out solutions that perform poorly, choosing high performance solutions to exploit. • Chromosomes with high fitness are copied over to the next generation. fitness
Crossover and Mutation Operators • Generate new solutions for exploration. • Crossover • Allows information exchange between points. • Mutation • Its role is to restore lost genetic material. Mutated bit
Object Recognition Using Gas:Two Approaches • Genetic search in the image space (GA-IS) • Genetic search in the transformation space (GA-TS) • Important issues: • How to encode solutions? • How to modify solutions ? • How to evaluate solutions?
GA-IS: Encoding • At least three model-scene point matches are need to compute the affine transformation. • Chromosome contains the binary encoded identities of the three pairs of points. • Model points: 19 (5 bits) • Scene points: 19 - 45 (6 bits) • Chromosome length: 3 x 5 + 3 x 6 = 33 bits
GA-IS: Decoding • Some encoded solutions might be invalid • 5 bits can encode at most 32 values. • [0, 31] was linearlymapped to [0, 18] • 6 bits can encode at most 64 values. • [0, 63] was linearly mapped to [0, 44]
GA-IS: Fitness Evaluation • Compute affine transformation. • Apply the transformation on all the model points. • Compute the back-projection error (BE) between the model and scene. (dj min distance between the j-th model point and the scene)
GA-TS: Encoding • Need to estimate the range of values that the parameters of affine transformation can assume.
GA-TS: Encoding (cont’d) • Each chromosome contains six fields. • Only the range of each coefficient needs to be represented. • e.g., a11 assumes values in [-0.408, 0.408] • Its range is: 0.408 - (-0.408) = 0.816 • 2 decimal digit accuracy: 82 values must be encoded. • 7 bits are needed to encode 82 values. • Chromosome length: 6 x 7 = 42 bits
GA-TS: Decoding • Some encoded solutions might be invalid • 7 bits can encode at most 128 values. • [0, 127] should be mapped to [0, 81]
GA-TS: Fitness Evaluation • Same as before • Less expensive to compute in this case …
Experiments • Three scenes (S1, S2, S3) of increasing complexity. • S2, S3 are shown below (S1 was the same as model). • 10 trials per scene – tried to find model1 each time
Experiments (cont’d) • We searched for the model below in a number of scenes.
GA Parameters • Two-point crossover (prcoss: 0.95). • Point mutation (pmut: 0.05). • Cross generational selection strategy. • Fitness scaling (scaling factor: 1.2).
Results – Scene 1 • Correct solutions were found in all 10 trials.
Results – Scene 2 • Correct solutions were found in all 10 trials.
Results – Scene 3 • Correct solution was missed once!
3D case – GA-TS • Apply Genetic Search in the space of the AFoVs parameters. • We used 2 views per model
Experiments • Four scenes (S1, S2, S3,S4) of increasing complexity. • S1 was the same as model • 10 trials per scene
Experiments (cont’d) • We searched for the model below in a number of scenes.
GA Parameters • Two-point crossover (prcoss: 0.95). • Point mutation (pmut: 0.05). • Cross generational selection strategy. • Fitness scaling (scaling factor: 1.2). • Population size: 200 • Number of generations: 150
Results - Scene 1 • Correct solutions were found in all 10 trials.
Results – Scene 2 • Correct solutions were found in all 10 trials.
Results – Scene 3 • Correct solutions were found in all 10 trials.
Results – Scene 4 • Correct solutions were found in all 10 trials.
Efficiency: GA-TS ~ 2 x 1015
More Challenging Scene • GAs can find near-exact matches. • Could be used as input to more sophisticated recognition • algorithms.