Hierarchical Framework for Efficient Image Retrieval

A Hierarchical Framework for Content-Based Image Retrieval - Dipti Vaidya

Outline of Presentation • Introduction and Motivation • Related Work • Shortcomings of Related Work • Problem Statement • Solution Approach • Details • Future Work

Introduction and Motivation • Gigabytes of images are generated and stored every day • Simple matching methods based on text retrieval are not appropriate for these images • The information must be organized to allow efficient browsing, searching and retrieval

Introduction and Motivation • This information, or metadata, can be split into 3 main categories: • Catalogue Info: type of image, author, date, etc. • Syntactic Content: information about primary features such as color, texture, shape and spatial relationships • Semantic Content: information or knowledge about the content of the image, i.e. what the image represents. Example: a smiling girl represents a happy person.

Introduction and Motivation CBIR systems fall into 2 main categories: 1. Image retrieval by Syntactic Content: • Images can be presented to the system either as an actual image or as a sketch • These images are processed and the closest matches are returned 2. Image retrieval by Semantic Content: • Queries are posed and images are retrieved by matching the query to the encoded knowledge

Related Work Syntactic Image Retrieval (RIE): • Most RIE systems use color or texture features • General approach: record a distribution of colors or textures in each image. Images with the smallest difference values from the example image are the matches. • For example, a simple color histogram records the predominant colors of the example image.

Related Work This causes problems when querying. Instead of looking for pictures of a dog, the system looks for images that have a brown blob on a green background. Query result: image not found, because the system has no idea of semantics.
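The histogram matching described above, and the way it confuses subject with color, can be sketched as follows. This is a minimal illustration only: the pixel lists, the bin count, the L1 distance and all function names are assumptions made for the example, not the method of any particular RIE system.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=4):
    """Quantize RGB pixels into coarse bins and return a normalized histogram."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}

def histogram_distance(h1, h2):
    """L1 distance between two sparse histograms (0 = identical, 2 = disjoint)."""
    bins = set(h1) | set(h2)
    return sum(abs(h1.get(b, 0.0) - h2.get(b, 0.0)) for b in bins)

def rank_by_example(example, database):
    """Return database image names sorted by histogram distance to the example."""
    h_ex = color_histogram(example)
    return sorted(
        database,
        key=lambda name: histogram_distance(h_ex, color_histogram(database[name])),
    )

# Toy "images": flat lists of (R, G, B) pixels (hypothetical data).
brown, green, blue = (150, 90, 40), (30, 160, 40), (20, 40, 200)
dog_on_grass = [brown] * 30 + [green] * 70
database = {
    "sofa_on_lawn": [brown] * 35 + [green] * 65,  # different subject, similar colors
    "dog_on_beach": [brown] * 30 + [blue] * 70,   # same subject, different colors
}
ranking = rank_by_example(dog_on_grass, database)
print(ranking)  # the brown-on-green sofa ranks ahead of the other dog picture
```

The ranking depends only on color proportions, which is exactly the "brown blob on a green background" behavior.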

Related Work Retrieval by Semantic Content: • The knowledge-based spatial image model defines a 3-layer model for representing knowledge about the domain-specific content of the images (Chu, Hsu, Tiara) • Ontology-based photo annotation: agent, object, action approach using an ontology (University of Amsterdam) • Structured knowledge representation for image retrieval using DL, based on image regions (Meghini et al.)

Related Work Retrieval by semantic content has been shown to be successful, but it has the following drawbacks: • Requires significant effort by domain experts during development • Unlikely to be extensible beyond a specific problem domain

Related Work – Bob’s System Proposes a system which uses the complementary strengths of semantic and syntactic retrieval methods: • Create a small domain database that processes semantic queries related to the problem domain and generates a set of example images • These synthesized example images can be sent to a larger image database for matching by example.

Advantages of Hierarchical Framework • Provides a semantically relevant method of querying an image database • Decouples the knowledge organization from the image matching mechanism • Requires expert involvement only to encode knowledge about a smaller set of data • Images and image matching algorithms in the large target database can change and improve with no impact on the encoded knowledge • Multiple, distributed domain databases can be used against one target database

Hierarchical CBIR Framework Diagram

Related Work – Bob’s System Here is what Bob’s system uses: • Structured annotation (agent, action, object) to specify semantic values of interest • A domain-specific ontology to represent agents and objects, with an image stored for each concept introduced in the ontology • Spatial relationships to encode actions • A query is posed and semantically processed to generate a set of example images • These images are then sent to GIFT for retrieval by example

Example (Bob’s system) • Sample database: man, hat, bike • Ontology: agent (man); object (hat; two-wheelers (bike, motorbike)) • Roles: • Wears(agent, object) = object above agent • Rides(agent, two-wheeler) = agent above two-wheeler
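The two role definitions above reduce actions to a spatial test between image regions. A minimal sketch of that idea follows, assuming axis-aligned bounding boxes in image coordinates (y grows downward); the `Box` type, the "above" test and the sample coordinates are all hypothetical and not taken from Bob's system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Axis-aligned region; y grows downward, as in image coordinates."""
    left: float
    top: float
    right: float
    bottom: float

def overlaps_horizontally(a, b):
    """True if the two boxes share some horizontal extent."""
    return a.left < b.right and b.left < a.right

def is_above(upper, lower):
    """True if `upper` overlaps `lower` horizontally and its center is higher."""
    upper_cy = (upper.top + upper.bottom) / 2
    lower_cy = (lower.top + lower.bottom) / 2
    return overlaps_horizontally(upper, lower) and upper_cy < lower_cy

def wears(agent, obj):
    """Wears(agent, object) = object above agent."""
    return is_above(obj, agent)

def rides(agent, two_wheeler):
    """Rides(agent, two-wheeler) = agent above two-wheeler."""
    return is_above(agent, two_wheeler)

# Hypothetical regions for a composed image.
man = Box(40, 20, 80, 120)
hat = Box(45, 5, 75, 25)
bike = Box(20, 100, 100, 160)

print(wears(man, hat), rides(man, bike), wears(man, bike))
```

The same predicates can be read in the other direction during synthesis: to compose "man wears hat", place the hat region so that `wears(man, hat)` holds.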

Example (simple queries) Query: man wears hat Query: man rides a bike

Issues With Bob’s System • Scalability and maintenance: a new concept can only be introduced into the domain ontology manually • Cannot store composite images: there is no method to reuse the synthesized images • The results returned from GIFT (the image matching server) could be improved so that they are more closely related to the domain query.

Problem Statement • We need to build a system that represents image content in a way that is: hierarchical, so that we can make semantic queries; and compositional, so that we can build complex terms from simple terms and thus reuse the synthesized images. • We also need to retrieve better results from the Image Matching server, given example images.

Problem Formulation • Let $D_A$ be a domain database, $S_{D_A}$ be the semantic values in $D_A$, and $Q$ be a query that resolves to a specific set of semantic values in $D_A$: $\{S_{D_A}\} \leftarrow \mathrm{QUERY}(Q, D_A)$ (1) • Furthermore, let $D_A$ have the property that each semantic value can be mapped to a set of images $\{I\}$: $\{I\} \leftarrow \mathrm{MAP}(\{S_{D_A}\}, D_A)$ (2) • Then there is a resultant mapping of $Q$ to a set of example images $\{I_{ex}\}$ that can be represented as: $\{I_{ex}\} \leftarrow \mathrm{MAP}(\mathrm{QUERY}(Q, D_A), D_A)$ (3)

Problem Formulation (Cont) • Now suppose an RIE database $T$ exists and has a RETRIEVE function which returns all images $\{i\}$ that match any of a set of example images $\{I_{ex}\}$: $\{i\} \leftarrow \mathrm{RETRIEVE}(\{I_{ex}\}, T)$ (4) • Combining equations (3) and (4) we get: $\{i\} \leftarrow \mathrm{RETRIEVE}(\mathrm{MAP}(\mathrm{QUERY}(Q, D_A), D_A), T)$ (5) • Equation (5) shows that we can make a semantic query in one collection of knowledge ($D_A$) and retrieve matching images from another ($T$). • This approach can be considered a hierarchy of RIE and RSC systems.
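Equation (5) is a plain function composition, which a toy sketch makes concrete. Everything below is an assumption for illustration: the dictionaries stand in for the domain database and target database, the file names are invented, and the `SIGNATURE` lookup is a placeholder for a real image matcher such as GIFT.

```python
# Hypothetical domain database D_A: semantic value -> stored example images.
DOMAIN_DB = {
    "man": ["man_01.png"],
    "hat": ["hat_01.png"],
    "bike": ["bike_01.png"],
}

# Hypothetical target database T and a stand-in for low-level image features.
TARGET_DB = ["t1.jpg", "t2.jpg", "t3.jpg"]
SIGNATURE = {
    "man_01.png": "sig-a", "hat_01.png": "sig-b", "bike_01.png": "sig-c",
    "t1.jpg": "sig-a", "t2.jpg": "sig-c", "t3.jpg": "sig-b",
}

def query(q, domain_db):
    """QUERY(Q, D_A): semantic values of the domain database named in the query."""
    return {word for word in q.lower().split() if word in domain_db}

def map_to_images(semantic_values, domain_db):
    """MAP({S}, D_A): example images stored for each semantic value."""
    return {img for s in semantic_values for img in domain_db[s]}

def retrieve(examples, target_db, matches):
    """RETRIEVE({I_ex}, T): target images that match any example image."""
    return {t for t in target_db if any(matches(ex, t) for ex in examples)}

def same_signature(a, b):
    """Placeholder matcher: a real system would compare color, texture or shape."""
    return SIGNATURE[a] == SIGNATURE[b]

# Equation (5): RETRIEVE(MAP(QUERY(Q, D_A), D_A), T)
result = retrieve(
    map_to_images(query("man wears hat", DOMAIN_DB), DOMAIN_DB),
    TARGET_DB,
    same_signature,
)
print(sorted(result))
```

Note that `retrieve` never sees the query text and `query` never sees the target database, which is the decoupling the hierarchical framework relies on.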

Problem Formulation • In order to realize the hierarchical framework we need to solve several problems: • A method to define and encode domain knowledge so that it can be used for semantic queries • A method to represent the semantic content of the composite images and map it to the domain knowledge base • A method to define the actions • A method to reuse the synthesized images • A way to interface with the RIE so that we get more specific, domain-related results from the target database • Solutions to the above problems will be my contribution to the system

Solution Approach Our primary aim is to investigate whether Description Logics in general can be used to represent the contents of the domain-specific database in a way that is hierarchical and compositional. Or could we make do with Semantic Networks, thus reducing the computational power required?

Hierarchical CBIR System Diagram [Figure: system diagram. Labeled components: User Query; Racer DL system (domain knowledge base); Image DB; Image Synthesis; Synthesized Images; Feature selection; R.F.; RIE - GIFT; Target DB; Process Query Results; Query Results; Returned images]

Description Logic • What is Description Logic? It is a language that allows reasoning about information, in particular supporting the classification of descriptors. • Description Logic models a domain in terms of 3 things: • Individuals: represent instances of the objects we are modeling • Concepts: denote a collection of individuals or instances • Roles: relationships between, or attributes of, concepts or individuals

Example • Concept Example • person represents all human beings • fruit represents all the fruits • Individual Example • Man, woman are individuals of the concept person • Banana is an individual of the concept fruit • Role Example • Eat(person,fruit) is a relationship describing person and something they are eating

DL • Using these small blocks we can build more complex expressions • Example • Eat(person, fruit) • Eat(person, fruit) & Sits(person, chair) • Example • cool-student = student & drives(student, Ferrari)

Reasoning with DL • Subsumption ($\sqsubseteq$) • Basic inferencing tool • Checks whether one concept is more general than another • Example: mother $\sqsubseteq$ woman (woman is more general than mother)

Reasoning with DL: Classification • A collection of descriptions can be classified using subsumption, providing a hierarchy of descriptions ranging from general to specific. • Example hierarchy (most general first): person > person driving car > person driving car and wearing hat • New concept: person wearing hat? • After classification, person wearing hat sits below person and above person driving car and wearing hat, alongside person driving car.

Automatic Classification • Existing hierarchy (most general first): person > person driving car > person driving car and wearing hat • New concept or query: person wearing hat. Where does it belong in the hierarchy?
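The placement of a new concept can be sketched with a deliberately simplified model: each description is treated as a set of conjuncts, and one description subsumes another when its conjuncts are a subset of the other's. This is only an assumption for illustration; a real DL reasoner such as RACER handles far richer constructors than plain conjunction.

```python
def subsumes(general, specific):
    """A conjunction is more general when it imposes a subset of the constraints."""
    return general <= specific

def classify(new, taxonomy):
    """Return (parents, children): most specific subsumers and most general subsumees."""
    above = [n for n, c in taxonomy.items() if c != new and subsumes(c, new)]
    below = [n for n, c in taxonomy.items() if c != new and subsumes(new, c)]
    # Keep a subsumer only if no other subsumer is strictly more specific.
    parents = [p for p in above
               if not any(taxonomy[p] < taxonomy[o] for o in above)]
    # Keep a subsumee only if no other subsumee is strictly more general.
    children = [c for c in below
                if not any(taxonomy[o] < taxonomy[c] for o in below)]
    return parents, children

taxonomy = {
    "person": frozenset({"person"}),
    "person driving car": frozenset({"person", "drives car"}),
    "person driving car and wearing hat":
        frozenset({"person", "drives car", "wears hat"}),
}
new_concept = frozenset({"person", "wears hat"})  # "person wearing hat"

parents, children = classify(new_concept, taxonomy)
print(parents, children)
```

Used as a query rather than a new concept, the `children` list is the answer set: every stored description that is at least as specific as the query.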

Architecture of DL • A DL system is described as being split into 2 parts, the T-Box and the A-Box. • T-Box: subsumption and classification • A-Box: reasons about relationships between individuals, thus providing classification and retrieval • E.g. (figure labels): T-Box concepts: mammal, vehicle, person, dog, bus, bike, person wearing cap, person driving bus, person wearing cap & driving bus; A-Box: Mary’s neighbor driving nimbus; Ted

Details Describing the semantic contents of the image: • We must describe three types of spaces: that of the images themselves, that of the real-world concepts they contain, and that of what each action or role means. • E.g., an image instance Image1 can be described as: • Image1 contains (person driving car, wearing hat) • Image1 contains (person driving car) • Image1 contains (person wearing hat) • Image1 contains (person) • Image1 contains (car) • Image1 contains (hat)

Tools Used Describing the Semantic Content : In order to describe the world concepts type-space and progressively link the image instances with these concepts, we use a DL system called RACER

Racer DL • RACER is a semantic web inference engine for developing ontologies • RACER is a Description Logic reasoning system with support for • TBoxes with generalized concept inclusions • ABoxes

Example of Racer Files
T-box:
(signature
  :atomic-concepts (human person female male woman man parent mother father
                    grandmother aunt uncle sister brother only-child pet
                    organization politicalorganisation politician
                    malepolitician image)
  :roles ((has-descendant :transitive t)
          (has-pet :domain person :range pet)
          (covers :transitive t :domain image :range human)
          (has-child :parent has-descendant :domain parent :range person)))

Example of Racer Files
A-box:
(instance image01 image)
(instance image01 malepolitician)
(related image01 jonmajor covers)
(instance jonmajor malepolitician)
(instance alice mother)
(related alice betty has-child)
(related alice charles has-child)
Queries:
(concept-instances sister)
(concept-ancestors mother)
(concept-descendants man)
(individual-fillers alice has-descendant)
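To see why asking for the has-descendant fillers of alice is useful even though only has-child is asserted, here is a toy model of role hierarchy plus transitivity. It is a sketch under stated assumptions, not RACER's algorithm or API: the `fillers` function is hypothetical, and the betty/dora assertion is an extra one added for illustration that does not appear in the slide.

```python
def fillers(individual, role, assertions, parents, transitive):
    """Individuals related to `individual` via `role` or any of its sub-roles.

    assertions: iterable of (subject, object, role) triples
    parents:    role -> parent role (role hierarchy)
    transitive: set of roles declared transitive
    """
    def implies(asserted):
        # Walk up the role hierarchy from the asserted role.
        while asserted is not None:
            if asserted == role:
                return True
            asserted = parents.get(asserted)
        return False

    edges = {}
    for subj, obj, r in assertions:
        if implies(r):
            edges.setdefault(subj, set()).add(obj)

    found, frontier = set(), [individual]
    while frontier:
        node = frontier.pop()
        for nxt in edges.get(node, ()):
            if nxt not in found:
                found.add(nxt)
                if role in transitive:  # only chain steps for transitive roles
                    frontier.append(nxt)
    return found

assertions = [
    ("alice", "betty", "has-child"),
    ("alice", "charles", "has-child"),
    ("betty", "dora", "has-child"),  # hypothetical extra assertion, not in the slide
]
parents = {"has-child": "has-descendant"}
transitive = {"has-descendant"}

descendants = fillers("alice", "has-descendant", assertions, parents, transitive)
children = fillers("alice", "has-child", assertions, parents, transitive)
print(sorted(descendants), sorted(children))
```

With the slide's own A-box alone, this model yields betty and charles as has-descendant fillers of alice, because has-child is declared a sub-role of has-descendant.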

Identifying Objects in an Image • Segment sections of images and associate them with concepts [Figure: example image with regions labeled "2 men standing" and "mountains"]

Identifying Objects in an Image Implemented an image annotator which: • Allows the user to identify the objects in the image and stores information about each object's region of interest • Stores this data as an XML file, which can be parsed during the synthesis process.
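Parsing such an annotation file during synthesis could look like the sketch below. The XML layout (`annotation`, `region`, and the `concept`, `x`, `y`, `width`, `height` attributes) is entirely hypothetical, since the slides do not show the annotator's actual schema; only the standard-library `xml.etree.ElementTree` calls are real.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation file content; the real schema is not given in the slides.
ANNOTATION = """<annotation image="image01.jpg">
  <region concept="man" x="40" y="20" width="40" height="100"/>
  <region concept="hat" x="45" y="5" width="30" height="20"/>
</annotation>"""

def load_regions(xml_text):
    """Return (image name, {concept: (x, y, width, height)}) from annotation XML."""
    root = ET.fromstring(xml_text)
    regions = {
        region.get("concept"): tuple(
            int(region.get(key)) for key in ("x", "y", "width", "height")
        )
        for region in root.iter("region")
    }
    return root.get("image"), regions

image_name, regions = load_regions(ANNOTATION)
print(image_name, regions)
```

Keying regions by concept name keeps the link between the image pixels and the concepts in the domain knowledge base, so a synthesis step can crop the "hat" region by looking up the concept.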

Resolving queries • In DLs, the query language and the description language are unified. • Query: man driving a car and wearing a hat • 1 - Attempt to find an image described by the full query (no synthesis needed); if none is found, then • 2 - Break the query into components (synthesis needed): man driving a car, man wears a hat; if these are not found, then • 3 - Break the query into atomic components (synthesis needed): man, car, hat, actions. For actions, we can use a geo-spatial model that maps each action to a spatial relationship; Bob’s definitions can be used here.
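The three-step fallback above can be sketched as a lookup that tries the most specific description first. This is one possible reading of the steps, not the implemented system: the `annotated` index, its key format and the file names are hypothetical, and a real version would ask the DL reasoner instead of a dictionary.

```python
def resolve(query, annotated):
    """Resolve a query to (example images, needs_synthesis).

    query:     list of (agent, action, object) triples joined by "and"
    annotated: index from a description to stored image names; keys are either
               a frozenset of triples or a single atomic concept name
    """
    full = frozenset(query)
    if full in annotated:                    # step 1: whole query already stored
        return list(annotated[full]), False

    images = []
    for triple in query:
        key = frozenset([triple])
        if key in annotated:                 # step 2: one component already stored
            images.extend(annotated[key])
        else:                                # step 3: fall back to atomic concepts
            agent, _action, obj = triple
            images.extend(annotated[agent] + annotated[obj])
    return images, True

# Hypothetical index of the domain image database.
annotated = {
    frozenset([("man", "drives", "car")]): ["man_in_car.png"],
    "man": ["man.png"],
    "car": ["car.png"],
    "hat": ["hat.png"],
}
query = [("man", "drives", "car"), ("man", "wears", "hat")]

images, needs_synthesis = resolve(query, annotated)
print(images, needs_synthesis)
```

In step 3 the action itself is not looked up; it would be applied during synthesis as a spatial relationship between the atomic images.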

Example 2 (composite queries) Query: girl rides a bike and wears a hat

Hierarchical CBIR System Diagram [Figure: system diagram. Labeled components: User Query; Racer DL system (domain knowledge base); Image DB; Image Synthesis; Synthesized Images; Feature selection; R.F.; RIE - GIFT; Target DB; Process Query Results; Query Results; Returned images]

RIE • The query is posed to generate a set of example images. • These example images are then sent to the Image Matching Server for retrieval by example from the target database. • This server uses various image features such as color, texture and shape to retrieve similar images.

Problem 2 • To improve the quality of the images retrieved from the Image Matching server, the returned images should be more domain specific. • Suggestion: use feature selection (color, texture or shape) for handling queries in RIE.

Problem 2 • In a CBIR system that uses feature selection, the following problems must be solved: • It must select the subset of features that provides the best input for the server (GIFT) • Since we want feature selection to take place at every query, it must be time efficient • It must be able to handle example sets as small as 3 to 5 images
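One cheap heuristic that meets these constraints is to pick the feature on which the few example images agree most. The sketch below is an assumption made for illustration, not the method proposed in the presentation: the selection criterion (lowest mean pairwise distance), the feature vectors and all names are hypothetical, and the vectors are assumed to be normalized so that features are comparable.

```python
from itertools import combinations

def spread(vectors):
    """Mean pairwise L1 distance between feature vectors (needs at least 2)."""
    pairs = list(combinations(vectors, 2))
    total = sum(sum(abs(a - b) for a, b in zip(u, v)) for u, v in pairs)
    return total / len(pairs)

def select_feature(examples):
    """Pick the feature whose vectors are most consistent across the examples.

    examples: list of dicts mapping feature name -> normalized feature vector
    """
    names = list(examples[0])
    return min(names, key=lambda name: spread([ex[name] for ex in examples]))

# Three hypothetical example images: colors differ, shapes are nearly identical.
examples = [
    {"color": [0.9, 0.1], "texture": [0.4, 0.6], "shape": [0.8, 0.2]},
    {"color": [0.2, 0.8], "texture": [0.5, 0.5], "shape": [0.8, 0.2]},
    {"color": [0.5, 0.5], "texture": [0.3, 0.7], "shape": [0.7, 0.3]},
]
best = select_feature(examples)
print(best)
```

With n examples the cost is n(n-1)/2 vector comparisons per feature, so for 3 to 5 examples it is negligible at query time.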

Proposed Solution • Users can manually assign the combination of image features in our user interface for image retrieval. • The query request is computed and processed by the query server. • The server runs all procedures, finds the closest images, displays them in the query viewer, and transmits the user's feedback to index the images. • A more accurate query result is then obtained in the next round of search.

Future Work • Implement the image synthesis part • Populate the domain knowledge base with more concepts and images • Find a method to implement the improved RIE • Test the retrieval results using different algorithms based on texture, color and shape
