
Model Based Computer Vision



  1. Model Based Computer Vision Arun Agarwal University of Hyderabad aruncs@uohyd.ernet.in

  2. Examples of applications • Outdoor Scenes, Block World, Real World • Autonomous Land Vehicle • Aerial Imaging (Airport Scenes) • Natural Scenes • Biometrics • Diagnostic systems • Military applications • Face recognition, verification, retrieval • Fingerprint recognition • Medical diagnosis: X-ray • Automatic Target Recognition (ATR) • Image segmentation and analysis (recognition from aerial or satellite photographs)

  3. Hierarchy of Image Understanding System (IUS), from top to bottom: • High level: objects → labeled objects • Intermediate level: features, attributes and relationships • Low level: input image → images

  4. Block diagram of the bottom-up approach: IMAGE → REGION/EDGE FEATURE EXTRACTION → SYMBOLIC REPRESENTATION → SCENE MODELS → SEMANTIC INTERPRETATION → DESCRIPTION

  5. Block diagram of the top-down approach: IMAGE → REGION/EDGE FEATURE EXTRACTION → SYMBOLIC REPRESENTATION → INTERPRETATION (driven by the scene MODEL) → SCENE DESCRIPTION

  6. Representative block diagram of the combined top-down/bottom-up approach: the same stages — IMAGE, REGION/EDGE FEATURE EXTRACTION, SYMBOLIC REPRESENTATION, MODEL-driven INTERPRETATION, SCENE DESCRIPTION — with information flowing in both directions

  7. Blackboard model approach: independent KNOWLEDGE SOURCES (methods) read from and write to a shared BLACKBOARD under the control of a SCHEDULER

  8. Approaches • Statistical Model: based on an underlying statistical model of patterns and pattern classes. • Syntactic Model: pattern classes represented by means of formal structures such as grammars, automata, strings, graphs, constraint matrices, etc. • Structural Model: patterns described by the relationships among their parts, with the object and scene descriptions held in a knowledge base (e.g. knowledge networks).

  9. Statistical PR

  10. An Example • "Sorting incoming fish on a conveyor according to species using optical sensing" • Two species: sea bass and salmon

  11. Problem Analysis • Set up a camera and take some sample images to extract features • Length • Lightness • Width • Number and shape of fins • Position of the mouth, etc… • This is the set of all suggested features to explore for use in our classifier!

  12. Preprocessing • Use a segmentation operation to isolate fishes from one another and from the background • Information from a single fish is sent to a feature extractor whose purpose is to reduce the data by measuring certain features • The features are passed to a classifier
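The segmentation → feature extraction → classification pipeline of the slide can be sketched in code. This is a toy illustration only: the image, the connected-component segmenter, the (area, mean lightness) features and the threshold classifier are all my own stand-ins, not the system described in the slides.

```python
from collections import deque

def segment(image):
    """Isolate each fish: connected components of nonzero pixels (4-connectivity)."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for i in range(h):
        for j in range(w):
            if image[i][j] and not seen[i][j]:
                region, queue = [], deque([(i, j)])
                seen[i][j] = True
                while queue:
                    y, x = queue.popleft()
                    region.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and image[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                regions.append(region)
    return regions

def extract_features(region, image):
    """Reduce a region's data to a small feature vector: (area, mean lightness)."""
    area = len(region)
    mean = sum(image[y][x] for y, x in region) / area
    return (area, mean)

def classify(features, lightness_threshold=5.0):
    """Toy classifier: bright regions -> 'salmon', dark regions -> 'sea bass'."""
    _, mean = features
    return "salmon" if mean > lightness_threshold else "sea bass"

# Tiny synthetic "image": one bright fish and one dark fish on a zero background.
image = [
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 2],
    [0, 0, 0, 0, 2],
]
labels = [classify(extract_features(r, image)) for r in segment(image)]
```

Each stage only passes a reduced description to the next, which is the point of the slide: the classifier never sees raw pixels, only features.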

  13. Classification • Select the length of the fish as a possible feature for discrimination

  14. The length is a poor feature alone! Select the lightness as a possible feature.

  15. Threshold decision boundary and cost relationship • Move our decision boundary toward smaller values of lightness in order to minimize the cost (reduce the number of sea bass that are classified as salmon!) • This is the task of decision theory
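Choosing the threshold to minimize cost rather than raw error count can be sketched as a brute-force search. The sample lightness values, the class sides and the cost values below are invented for illustration; only the idea (asymmetric costs shift the decision boundary) comes from the slide.

```python
def best_threshold(samples, cost_bass_as_salmon=2.0, cost_salmon_as_bass=1.0):
    """Brute-force the lightness threshold t that minimizes total cost.
    samples: list of (lightness, label); decision rule: lightness > t -> 'salmon'.
    Misclassifying a sea bass as salmon is made more expensive than the reverse."""
    best_t, best_cost = None, float("inf")
    for t in sorted({x for x, _ in samples}):
        cost = 0.0
        for x, label in samples:
            predicted = "salmon" if x > t else "sea bass"
            if predicted != label:
                cost += cost_bass_as_salmon if label == "sea bass" else cost_salmon_as_bass
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost

samples = [(2, "sea bass"), (3, "sea bass"), (4, "salmon"),
           (5, "sea bass"), (6, "salmon"), (7, "salmon")]
t, cost = best_threshold(samples)
```

With symmetric costs the search would minimize the error count; raising the cost of one error type pushes the boundary so that fewer of those errors occur.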

  16. Adopt the lightness and add the width of the fish, giving a feature vector xᵀ = [x1, x2], where x1 = lightness and x2 = width

  17. We might add other features that are not correlated with the ones we already have; a precaution should be taken not to reduce performance by adding such "noisy features". • Ideally, the best decision boundary is the one that yields optimal performance, as in the following figure:

  18. Decision given the posterior probabilities • x is an observation for which: if P(ω1 | x) > P(ω2 | x), decide true state of nature = ω1; if P(ω1 | x) < P(ω2 | x), decide true state of nature = ω2. • Therefore, whenever we observe a particular x, the probability of error is: P(error | x) = P(ω1 | x) if we decide ω2, and P(error | x) = P(ω2 | x) if we decide ω1.

  19. Minimizing the probability of error • Decide ω1 if P(ω1 | x) > P(ω2 | x); otherwise decide ω2. • Therefore: P(error | x) = min [P(ω1 | x), P(ω2 | x)] (Bayes decision)
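The Bayes decision rule above is short enough to state directly in code. A minimal sketch, assuming the posteriors are already known (the class names and numbers are illustrative):

```python
def bayes_decide(posteriors):
    """Bayes decision: pick the class with the largest posterior P(w | x).
    The error probability at this x is the mass left on the losing classes,
    i.e. min of the posteriors in the two-class case."""
    decision = max(posteriors, key=posteriors.get)
    p_error = 1.0 - posteriors[decision]
    return decision, p_error

# Two-class example: P(w1 | x) = 0.7, P(w2 | x) = 0.3.
decision, p_error = bayes_decide({"omega1": 0.7, "omega2": 0.3})
```

For two classes this reproduces the slide exactly: P(error | x) = min[P(ω1 | x), P(ω2 | x)], and no other rule can do better at that x.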

  20. Optimal decision property “If the likelihood ratio exceeds a threshold value independent of the input pattern x, we can take optimal actions”
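For the minimum-error case, the threshold in the quoted property is the prior ratio P(ω2)/P(ω1), which indeed does not depend on x. A sketch (densities and priors below are example numbers):

```python
def likelihood_ratio_decision(p_x_given_1, p_x_given_2, prior1, prior2):
    """Minimum-error rule: decide class 1 iff the likelihood ratio
    p(x|w1)/p(x|w2) exceeds the threshold P(w2)/P(w1), which is
    fixed in advance and independent of the observation x."""
    threshold = prior2 / prior1
    return 1 if p_x_given_1 / p_x_given_2 > threshold else 2

# Equal priors: ratio 0.6/0.2 = 3 > 1, so decide class 1.
d_equal = likelihood_ratio_decision(0.6, 0.2, prior1=0.5, prior2=0.5)
# Strong prior for class 2: threshold 0.9/0.1 = 9 > 3, so decide class 2.
d_skewed = likelihood_ratio_decision(0.6, 0.2, prior1=0.1, prior2=0.9)
```

With unequal misclassification costs the same rule applies; only the threshold changes (scaled by the cost differences), which is why it remains independent of x.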

  21. Syntactic PR

  22. Syntactic Pattern Recognition Technique • Syntactic pattern recognition consists of three major steps: • Preprocessing, which improves the quality of an image, e.g. filtering, enhancement, etc. • Pattern representation, which segments the picture and assigns the segments to the parts in the model. • Syntax analysis, which recognizes the picture according to the syntactic model: once a grammar has been defined, some type of recognition device is required; the application of such a recognizer is called parsing. • Syntactic methods are best suited to problems with clear primitives and stable intermediate structures, with well-defined and known alternatives. • A very important issue in syntactic pattern recognition is that of grammatical inference (learning the grammar from sample patterns).

  23. Block Diagram of a Syntactic Pattern Recognition System • Recognition path: Input Pattern → Preprocessing → Segmentation → Primitive (and relation) recognition → Representation construction → Syntax analysis (parsing) → Classification • Training path: Training Patterns → Preprocessing → Primitive (and relation) selection → Grammatical inference → Automata construction

  24. Grammar • Grammars can be classified into four categories according to their productions: • Unrestricted grammar • Context-sensitive grammar • Context-free grammar • Regular grammar • Depending on the options available at each stage of the rewriting process, a grammar can be classified as: • Deterministic • Non-deterministic • Stochastic
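The four Chomsky categories can be told apart mechanically from the shape of the productions, which the following sketch does. The encoding (productions as (lhs, rhs) tuples of symbols, nonterminals given as a set) is my own; the classification criteria are the standard ones.

```python
def classify_grammar(productions, nonterminals):
    """Place a grammar in the Chomsky hierarchy from its production shapes.
    productions: list of (lhs, rhs), each a tuple of symbols."""
    def right_linear(rhs):
        # A -> a  or  A -> aB  (terminal, optionally followed by one nonterminal)
        return (len(rhs) == 1 and rhs[0] not in nonterminals) or (
            len(rhs) == 2 and rhs[0] not in nonterminals and rhs[1] in nonterminals)

    if all(len(l) == 1 and l[0] in nonterminals and right_linear(r)
           for l, r in productions):
        return "regular"
    if all(len(l) == 1 and l[0] in nonterminals for l, r in productions):
        return "context-free"
    if all(len(r) >= len(l) for l, r in productions):
        return "context-sensitive"
    return "unrestricted"

# S -> aS | b is right-linear, hence regular.
g1 = classify_grammar([(("S",), ("a", "S")), (("S",), ("b",))], {"S"})
# S -> Sa | b is left-linear here, so this check reports it as context-free.
g2 = classify_grammar([(("S",), ("S", "a")), (("S",), ("b",))], {"S"})
```

Note the check recognizes only right-linear grammars as regular; a fuller tool would also accept purely left-linear ones.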

  25. Hierarchical Structures • Semantic primitives: a Scene decomposes into Background and Objects (subpatterns); the Background into Floor (L) and Walls (K, G, F); each Object (e.g. Object A, Object B) into its visible Faces (e.g. H, I, J).

  26. An Example • Pattern: a rectangle, 3 units by 2 units. • Pattern primitives: unit segments a, b, c, d, one per direction. • With '+' denoting head-to-tail concatenation, the boundary is described by the string a+a+a+b+b+c+c+c+d+d, i.e. aaabbcccdd; the same description can be drawn as a tree structure.

  27. An Example Pattern and its Structural Description • The pattern decomposes into subpatterns, and each subpattern into pattern primitives, e.g. the primitive strings b + b + c + c, c + c + d and a + a + b.

  28. For this approach to be advantageous, the simplest subpatterns selected, called pattern primitives, should be much easier to recognize than the patterns themselves. The composition of primitives into patterns is accomplished by a grammar. Once the primitives of the pattern are identified, recognition is accomplished by performing a syntax analysis, or parsing.

  29. An Exhaustive Example: telocentric and submedian chromosomes, whose boundaries are coded as strings of the primitives a, b, c, d, e. • Context-free grammar G = (VN, VT, P, S), with start symbols S = {<submedian chromosome>, <telocentric chromosome>}, where VN = {<submedian chromosome>, <telocentric chromosome>, <arm pair>, <left part>, <right part>, <arm>, <side>, <bottom>} and VT = {a, b, c, d, e}.

  30. Productions P:
  <submedian chrom.> → <arm pair> <arm pair>
  <telocentric chrom.> → <bottom> <arm pair>
  <arm pair> → <side> <arm pair> | <arm pair> <side> | <arm> <right part> | <left part> <arm>
  <left part> → <arm> c
  <right part> → c <arm>
  <bottom> → b <bottom> | <bottom> b | e
  <side> → b <side> | <side> b | b | d
  <arm> → b <arm> | <arm> b | a
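The grammar above can be turned directly into a recognizer. The sketch below uses a memoized span parser rather than the bottom-up parse shown on the next slide, and the short nonterminal names (SUB, TEL, AP, LP, RP, BOT, SIDE, ARM) are my own abbreviations for the bracketed nonterminals.

```python
from functools import lru_cache

# SUB = <submedian chrom.>, TEL = <telocentric chrom.>, AP = <arm pair>,
# LP = <left part>, RP = <right part>, BOT = <bottom>.
RULES = {
    "SUB":  [("AP", "AP")],
    "TEL":  [("BOT", "AP")],
    "AP":   [("SIDE", "AP"), ("AP", "SIDE"), ("ARM", "RP"), ("LP", "ARM")],
    "LP":   [("ARM", "c")],
    "RP":   [("c", "ARM")],
    "BOT":  [("b", "BOT"), ("BOT", "b"), ("e",)],
    "SIDE": [("b", "SIDE"), ("SIDE", "b"), ("b",), ("d",)],
    "ARM":  [("b", "ARM"), ("ARM", "b"), ("a",)],
}

def parses(symbol, s):
    """Does nonterminal `symbol` derive the whole string s?"""
    @lru_cache(maxsize=None)
    def p(sym, i, j):
        if sym not in RULES:                 # terminal: must match one character
            return j - i == 1 and s[i] == sym
        for rhs in RULES[sym]:
            if len(rhs) == 1 and p(rhs[0], i, j):
                return True
            # binary rule: try every split point of the span s[i:j]
            if len(rhs) == 2 and any(p(rhs[0], i, k) and p(rhs[1], k, j)
                                     for k in range(i + 1, j)):
                return True
        return False
    return p(symbol, 0, len(s))
```

For example, `parses("SUB", "babcbabdbabcbabd")` accepts the string parsed on the next slide as a submedian chromosome, while strings outside the language are rejected.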

  31. Now, given the string babcbabdbabcbabd: bottom-up (data-driven) parsing reduces the terminals b a b c b a b d … step by step through <arm>, <side>, <left part> and <arm pair> until the start symbol <submedian chromosome> is reached, so the string is recognized as a submedian chromosome. (The slide shows the full parse tree.)

  32. Structural PR

  33. Example: Outdoor Scenes • What is Image Understanding? • Image understanding is the process of understanding an image by identifying the objects in a scene and establishing relationships among the objects. • Image understanding is the task-oriented reconstruction and interpretation of a scene by means of images.

  34. Different image understanding systems: VISION [C. C. Parma] (1980), ARGOS [Steven M. Rubin] (1978), ACRONYM [R. A. Brooks] (1981), MOSAIC [M. Herman and T. Kanade] (1984), SPAM [D. M. McKeown] (1985), SCERPO [Lowe] (1987), SIGMA [Takashi Matsuyama and Vincent Hwang] (1990), Knowledge-Based Medical Image Interpretation [Darryl N. Davis] (1991), and many more…

  35. An example: labeling a very small satellite picture of an agrarian region. • Knowledge network: ALFALFA at the top, CORN on the left, BARLEY on the right and at the bottom, with relative-position constraints (N, TOP, LEFT, RIGHT, BOTTOM). • Assume a satellite photograph of 4 × 2 pixels, i.e. 8 pixels; with three labels (A, B, C) per pixel there are over 6000 possible labellings, of which the constraints of the knowledge network admit only a few. (The slide tabulates some possible labellings of the 4 × 2 image.)

  36. Example 2: scene model and an instance. • SCENE MODEL: a scene contains other areas A (at least two), streets S (at least one), houses H (arbitrary) and cars C (arbitrary), linked by relations such as on (always / often / seldom), adjacent, intersections and junctions. • OBJECT MODEL (CAR): parts such as body, hood, front wheel, rear wheel, passenger room and luggage boot, with relations like on, above, adjacent, before and same size. • INSTANCE OF THE MODEL: concrete regions A1, A2, A3, H1, H2, S1, S2 and C matched to the model nodes.

  37. Given an object model and an image, how do we match them? Two approaches: (i) isomorphism; (ii) largest clique.
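The largest-clique approach can be sketched via an association graph: pair up model and image parts with compatible labels, connect two pairs when the relation between the parts agrees on both sides, and take a maximum clique as the best match. The tiny "car" model and image below (labels, parts and relations) are invented for illustration.

```python
from itertools import combinations

def association_graph(model, image):
    """Nodes pair model/image parts with the same label; an edge joins two
    nodes whose parts stand in the same relation on both sides.
    model/image: (labels, relations) with labels a dict part -> label and
    relations a dict (part, part) -> relation name."""
    m_labels, m_rel = model
    i_labels, i_rel = image
    nodes = [(m, i) for m in m_labels for i in i_labels
             if m_labels[m] == i_labels[i]]
    edges = set()
    for (m1, i1), (m2, i2) in combinations(nodes, 2):
        if m1 != m2 and i1 != i2 and m_rel.get((m1, m2)) == i_rel.get((i1, i2)):
            edges.add(((m1, i1), (m2, i2)))
    return nodes, edges

def largest_clique(nodes, edges):
    """Brute-force maximum clique (fine for the tiny graphs of this sketch)."""
    def connected(a, b):
        return (a, b) in edges or (b, a) in edges
    for r in range(len(nodes), 0, -1):
        for subset in combinations(nodes, r):
            if all(connected(a, b) for a, b in combinations(subset, 2)):
                return list(subset)
    return []

# Hypothetical model: two wheels below a body, wheels beside each other.
model = ({"w1": "wheel", "w2": "wheel", "b": "body"},
         {("w1", "b"): "below", ("w2", "b"): "below", ("w1", "w2"): "beside"})
# Hypothetical image regions with the same structure.
image = ({"r1": "wheel", "r2": "wheel", "r3": "body"},
         {("r1", "r3"): "below", ("r2", "r3"): "below", ("r1", "r2"): "beside"})

nodes, edges = association_graph(model, image)
clique = largest_clique(nodes, edges)
```

The clique found assigns each model part to exactly one image region while preserving every relation, which is exactly the consistent-match criterion behind the clique formulation.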

  38. Scene classes: e.g. a HOUSE SCENE (house, garage, driveway, trees at the side, car-like objects) versus a ROAD SCENE (road, number of cars, number of telephone poles, guardrails).

  39. Network of a general ROAD_SCENE: roads R (at least one), other areas (at least two), houses H (arbitrary), trees (arbitrary), cars C (arbitrary), telephone poles (arbitrary) and road rail (arbitrary length), linked by relations such as on (always / rarely / often), adjacent to (always), behind (always / often), and connection.

  40. Segmentation • The given input image is segmented into different regions using different segmentation methods: • Cluster-based segmentation • K-means clustering algorithm • Porter and Canagarajah method • Validity measure method • Ohlander-type segmentation • The algorithm proposed by Comaniciu and Meer
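The first method listed, k-means clustering, can be sketched on scalar pixel intensities. This is a bare-bones illustration, not any of the published methods on the slide; the pixel values are synthetic.

```python
def kmeans_1d(values, k, iters=20):
    """Plain k-means on scalar pixel intensities: assign each value to the
    nearest centre, then move each centre to the mean of its members."""
    lo, hi = min(values), max(values)
    # Spread the initial centres evenly over the intensity range.
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)] if k > 1 else [float(lo)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[min(range(k), key=lambda c: abs(v - centres[c]))].append(v)
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
    return centres, labels

# Synthetic intensities: a dark region (~10) and a bright region (~200).
pixels = [10, 12, 11, 200, 198, 202, 9, 201]
centres, labels = kmeans_1d(pixels, k=2)
```

A real segmenter would cluster in colour or colour-plus-position space and then extract connected regions from the label map; the update loop is the same.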

  41. Comparing the results of the above segmentation algorithms, only the segmented image produced by the Comaniciu and Meer algorithm is used for further processing. • Results obtained by the Comaniciu and Meer segmentation algorithm: Original Image / Segmented Image

  42. Feature Extraction • Two types of features are extracted from the segmented regions: • 1. Primary features: computed directly from the image array, e.g. area, mean intensities of R, G and B, contour length, mass centre, minimum bounding rectangle (MBR), etc. • 2. Secondary features: computed from a set of primary features, e.g. compactness of regions, linearity, normalized colours, and hue, saturation and intensity values.

  43. Formulae: r = R/(R+G+B), g = G/(R+G+B), b = B/(R+G+B), Intensity = (R+G+B)/3, Hue = arctan2(√3·(G−B), 2R−G−B), Saturation = 1 − 3·min(r, g, b), Compactness = 4π·Area/(Contour length)², and many more…
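The formulae translate directly into code. The example input (a grey pixel and a unit disc) is mine; the formulae are those on the slide.

```python
import math

def colour_features(R, G, B):
    """Normalized r, g, b, intensity, hue and saturation, as in the slide."""
    total = R + G + B
    r, g, b = R / total, G / total, B / total
    intensity = total / 3
    hue = math.atan2(math.sqrt(3) * (G - B), 2 * R - G - B)
    saturation = 1 - 3 * min(r, g, b)
    return r, g, b, intensity, hue, saturation

def compactness(area, contour_length):
    """4*pi*area / perimeter^2: equals 1 for a disc, smaller for elongated regions."""
    return 4 * math.pi * area / contour_length ** 2

# A grey pixel has zero saturation; a unit disc has compactness 1.
_, _, _, intensity, _, saturation = colour_features(100, 100, 100)
disc = compactness(math.pi, 2 * math.pi)
```

The normalization by R+G+B makes r, g, b invariant to overall brightness, which is why they appear among the secondary (derived) features.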
