An Active Learning Framework for Content-Based Information Retrieval


- A typical example of a CBIR (content-based information retrieval) system is an “image retrieval system”.
- CBIR has three major aspects, an important one of which is “feature extraction”.
- Many features have been designed for general or specific CBIR systems. Some of them showed good retrieval performance and others did not.
- The gap between “low-level features” and the “high-level semantic meanings of the objects” has been the major obstacle to more successful retrieval systems.
- To overcome this gap, the paper discusses “relevance feedback” and “hidden annotation”.

- The paper therefore addresses two questions:
- How to reduce the gap between “low-level features” and the “high-level semantic meanings of the objects”.
- How to improve the performance of information retrieval using the framework the authors define.

- Relevance feedback moves the query point towards the relevant objects, or selectively weights the features in the low-level feature space, based on user feedback.
- These techniques are powerful tools for bridging the gap between low-level features and high-level semantics.
- However, relevance feedback has a limitation:
- If the low-level features of a set of semantically similar objects form several clusters in the feature space, a query with an object in one cluster cannot retrieve the semantically similar objects in the other clusters by reweighting the space.

- Hidden annotation is a preprocessing stage in CBIR. It can be regarded as a learning stage.
- Here we annotate all the objects in the database, or a manually selected subset of the database.
- However, this approach still poses some challenges that need to be sorted out.
- The challenges are:
- What is the best subset of objects to annotate?
- How many training samples need to be annotated?

- To overcome these challenges, the authors combine active learning with hidden annotation to determine which objects in the training database should be annotated.
- This is done by presenting sample objects to the annotator.
- The sample objects are selected according to which object, once annotated, would give the most information and reduce the uncertainty the most.
- So the question is how to select the object that would give the most information.

- For each object we maintain a list of probabilities, each indicating the probability that the object has the corresponding attribute.
- If an object is annotated, each probability is set to one or zero depending on whether the corresponding attribute characterizes the object.
- If an object is not annotated, we estimate its probabilities from its annotated neighbors. This is done using “kernel regression”.
- With this list of probabilities, we can tell which object the system is most uncertain about and propose it as a sample query to the annotator.
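
The bookkeeping described in these bullets can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `annotate` and `most_uncertain` are hypothetical, and closeness of each probability to 0.5 is used here as a simple stand-in for the uncertainty score.

```python
def annotate(prob_lists, annotated, index, has_attribute):
    """Record an annotation: each probability becomes 1.0 or 0.0."""
    prob_lists[index] = [1.0 if flag else 0.0 for flag in has_attribute]
    annotated.add(index)


def most_uncertain(prob_lists, annotated):
    """Return the index of the unannotated object the system is least sure about.

    Assumption: uncertainty is scored as the average closeness of the
    probabilities to 0.5 (1.0 at p = 0.5, 0.0 at p = 0 or p = 1).
    Returns None when every object has been annotated.
    """
    best_index, best_score = None, -1.0
    for i, probs in enumerate(prob_lists):
        if i in annotated:
            continue
        score = sum(1.0 - 2.0 * abs(p - 0.5) for p in probs) / len(probs)
        if score > best_score:
            best_index, best_score = i, score
    return best_index


# One attribute per object; nothing annotated yet.
prob_lists = [[0.9], [0.55], [0.3]]
annotated = set()
first_query = most_uncertain(prob_lists, annotated)   # object 1 (p = 0.55)
annotate(prob_lists, annotated, first_query, [True])  # annotator says "has attribute"
second_query = most_uncertain(prob_lists, annotated)  # object 2 (p = 0.3)
```

In a full system the probabilities of the remaining unannotated objects would be re-estimated after each annotation before the next query is chosen.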

- This resolves the challenge of which object to annotate.
- The remaining challenge is: how many training samples need to be annotated?

- The goal of “selective sampling” is to reduce the number of training samples that need to be annotated, by examining objects that are not yet annotated and selecting the most informative ones for the annotator.
- There are many selective sampling approaches for reducing the uncertainty about the objects.
- However, the authors use their own learning algorithm to reduce the number of training samples.

- To reduce the number of training samples, we need a general criterion that measures how much information an annotation provides to the system, i.e. the “information gain” (or “knowledge gain”).
- Let $O_i$, $i = 1, 2, \ldots, N$, be the objects in the database.
- Let $A_k$, $k = 1, 2, \ldots, K$, be the attributes the annotator wants to use for annotation.
- Let $P_{ik}$ be the probability that object $O_i$ has attribute $A_k$.
If object $O_i$ has been annotated, then $P_{ik} = 1$ when it has attribute $A_k$ and $P_{ik} = 0$ otherwise.

The expected information gain is then calculated by providing these probability values to the “uncertainty measurement”.

- To derive the expected information gain from annotating a certain object, the authors define an “uncertainty measurement” as follows.
- $U_i = \psi(P_{i1}, P_{i2}, \ldots, P_{iK}), \quad i = 1, 2, \ldots, N$
- where $U_i$ is the uncertainty measurement of object $O_i$,
- $\psi(\cdot)$ is a function of the probabilities of $O_i$,
- $N$ is the number of objects in the training database.

- $U_i = \psi(P_{i1}, P_{i2}, \ldots, P_{iK}), \quad i = 1, 2, \ldots, N$
- The authors want the uncertainty measurement to have the following properties:
- If object $O_i$ has been annotated, $U_i = 0$.
- If $P_{ik} = 0.5$ for $k = 1, 2, \ldots, K$, i.e. we know nothing about the object, the uncertainty is maximum.
- Given $P_{ik}$, $k = 1, 2, \ldots, K$, if it is uncertain whether object $O_i$ has or does not have some attributes, $U_i$ should be large.

- Since the third property of $U_i$ is not stated in a strict sense, the authors use “entropy”, which is a well-known uncertainty measurement.
- For instance, if $K = 1$ in the third property, only one attribute is considered. In this case entropy is a good uncertainty measurement.
- $U_i = \psi(P_{i1}) = H(P_{i1}) = -P_{i1} \log P_{i1} - (1 - P_{i1}) \log(1 - P_{i1})$
Where

- $H$ represents the entropy function.
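
As a quick check that entropy has the desired properties in the single-attribute case, here is a small Python sketch. The slides do not state the base of the logarithm, so base 2 is an assumption (it makes the maximum equal to 1), and the function name `binary_entropy` is illustrative.

```python
import math


def binary_entropy(p):
    """Entropy H(p) = -p log p - (1 - p) log(1 - p), using log base 2 (assumed).

    By the usual convention 0 * log 0 = 0, so H(0) = H(1) = 0, which matches
    the requirement that an annotated object has zero uncertainty.
    """
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


annotated_uncertainty = binary_entropy(1.0)  # annotated object: no uncertainty
unknown_uncertainty = binary_entropy(0.5)    # nothing known: maximum uncertainty
fairly_sure = binary_entropy(0.9)            # fairly confident: small uncertainty
```

The uncertainty falls smoothly from its maximum at 0.5 to zero at either end, so objects whose probability is near 0.5 are ranked as most uncertain.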

- The distribution of the objects in the low-level feature space is another factor that affects retrieval performance.
- Annotating objects in a high-probability region and in a low-probability region may give the system different amounts of information, so ignoring the distribution can lead to low retrieval performance.
- To account for this, the authors define the “knowledge gain” that the annotator gives to the system by annotating object $O_i$:
- $G_i = q_i U_i = q_i \, \psi(P_{i1}, P_{i2}, \ldots, P_{iK}), \quad i = 1, 2, \ldots, N$
Where

- $G_i$ is the knowledge gain,
- $q_i$ is the value of the probability density function around object $O_i$,
- $U_i$ is the uncertainty measurement.

- The criterion for choosing the next sample object is to find the unlabeled object $O_i$ with the maximum knowledge gain $G_i$.
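
The selection criterion can be sketched as follows. This is an illustration under assumptions, not the authors' code: the density values are taken as given inputs (how they are estimated is not covered here), the uncertainty function is assumed to be an unweighted sum of base-2 binary entropies, and the names `knowledge_gain` and `next_sample` are hypothetical.

```python
import math


def binary_entropy(p):
    """Binary entropy in bits; 0 for p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def knowledge_gain(density, probs):
    """G_i = q_i * U_i, with U_i assumed to be the sum of per-attribute entropies."""
    uncertainty = sum(binary_entropy(p) for p in probs)
    return density * uncertainty


def next_sample(densities, prob_lists, annotated):
    """Return the index of the unannotated object with maximum knowledge gain."""
    best_index, best_gain = None, -1.0
    for i, (q_i, probs) in enumerate(zip(densities, prob_lists)):
        if i in annotated:
            continue
        gain = knowledge_gain(q_i, probs)
        if gain > best_gain:
            best_index, best_gain = i, gain
    return best_index


# Objects 0 and 1 are equally uncertain, but object 1 lies in a denser region.
densities = [0.1, 0.8, 0.5]
prob_lists = [[0.5], [0.5], [0.9]]
choice = next_sample(densities, prob_lists, annotated=set())
```

Here `choice` is object 1: it is as uncertain as object 0 but sits in a denser region, so annotating it is expected to inform more neighbors.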

- Annotated models tend to pass their knowledge on to their nearby neighbors.
- If a model has some annotated neighbors, its probability list needs to be updated. If, however, an object is far from every annotated object, we do not want to link their semantic meanings. Such an extension of semantic meaning fits the framework of “kernel regression” very well.
- An example of kernel regression:
- Consider one of the attributes, $A_k$.
- Let $x_m$, $m = 1, 2, \ldots, M$, be the feature vectors of all the currently annotated objects.
- Let $P_{mk}$ be the corresponding probabilities.
Then $P_{mk} = 1$ if the object corresponding to $x_m$ has $A_k$,

and $P_{mk} = 0$ otherwise.

- The authors also propose a simple biased kernel regression algorithm to estimate, for an unannotated object with feature vector $x$, the probability that the object has attribute $A_k$.
Equation:

$P(x \in A_k) = \dfrac{\sum_{m=1}^{M} w_m P_{mk} + w_0 P(k)}{\sum_{m=1}^{M} w_m + w_0}$

Where

$w_m$ is the kernel weight of the $m$-th annotated object.

$P(k)$ is the prior probability that any object has attribute $A_k$.

$w_0$ is the tendency of the object towards the prior probability.

If $w_0 = 0$, the equation above reduces to normal kernel regression.
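
A minimal Python sketch of the biased kernel regression estimate is given below. The slides do not specify the kernel, so a Gaussian kernel with a fixed `bandwidth` is an assumption made here, as are the function names. Falling back to the prior when all weights vanish is likewise a choice made for this sketch.

```python
import math


def gaussian_weight(x, x_m, bandwidth):
    """Kernel weight w_m (assumed Gaussian): decays with squared distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, x_m))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def estimate_probability(x, annotated_features, annotated_probs,
                         prior, w0, bandwidth=1.0):
    """Biased kernel regression estimate of P(x in A_k).

    annotated_features: feature vectors x_m of the annotated objects
    annotated_probs:    their probabilities P_mk (1.0 or 0.0)
    prior:              prior probability P(k) of attribute A_k
    w0:                 bias weight pulling the estimate towards the prior
    """
    weights = [gaussian_weight(x, x_m, bandwidth) for x_m in annotated_features]
    numerator = sum(w * p for w, p in zip(weights, annotated_probs)) + w0 * prior
    denominator = sum(weights) + w0
    if denominator == 0.0:
        # No annotated neighbor carries any weight and there is no bias term.
        return prior
    return numerator / denominator


features = [[0.0], [1.0]]  # two annotated objects in a 1-D feature space
probs = [1.0, 0.0]         # the first has attribute A_k, the second does not
midway = estimate_probability([0.5], features, probs, prior=0.3, w0=0.0)
far_away = estimate_probability([100.0], features, probs, prior=0.3, w0=1.0)
```

An object midway between the two annotated objects gets an estimate of 0.5, while an object far from both falls back to the prior of 0.3, which is the behavior the bias term is meant to provide.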

- The authors performed experiments on both a synthetic database and a real database.
- The experiments led to the following conclusions:
- Active learning always performs better than random sampling. It can save more than 50% of the annotations that would otherwise be needed.
- An adaptive kernel bandwidth does not help improve performance.
- The bias weight $w_0$ has almost no impact on system performance.

- For their experiment, the authors chose a database of 3-D model objects containing around 1750 objects.
- About one third of them were aircraft. Ten features were extracted for each object.
- First, they used their active learning algorithm to distinguish between aircraft and non-aircraft.
- They measured the annotation efficiency by testing the final retrieval performance of their retrieval system.
- The retrieval performance measurement is defined as:
$e_q(s) = \dfrac{1}{R} \sum_{j \in \text{top } R \text{ results of query } q} d_{qj}(s)$

Where

$d_{qj}(s)$ is the semantic distance between the query $q$ and the $j$-th retrieved object.

$e_q(s)$ indicates the average matching error for the top $R$ retrieved objects with respect to the query $q$.
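
The average matching error can be computed as in the sketch below. The slides do not define the semantic distance, so a 0/1 distance (0 when the retrieved object has the same label as the query, 1 otherwise) is assumed purely for illustration; the function names are also hypothetical.

```python
def semantic_distance(query_label, result_label):
    """Assumed 0/1 semantic distance: 0 if the labels match, 1 otherwise."""
    return 0.0 if query_label == result_label else 1.0


def average_matching_error(query_label, ranked_labels, top_r):
    """Mean semantic distance between the query and its top R retrieved objects.

    ranked_labels holds the labels of the retrieved objects, best match first.
    If fewer than R objects were retrieved, the mean is taken over those available.
    """
    if top_r <= 0:
        raise ValueError("top_r must be positive")
    top = ranked_labels[:top_r]
    if not top:
        raise ValueError("need at least one retrieved object")
    return sum(semantic_distance(query_label, label) for label in top) / len(top)


# Query is an aircraft; one of the top four results is not.
ranked = ["aircraft", "aircraft", "car", "aircraft", "chair"]
error = average_matching_error("aircraft", ranked, top_r=4)
```

With one mismatch among the top four results the error is 0.25; a perfect retrieval gives 0.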

- Finally, based on the experimental results, the authors conclude that the proposed “active learning algorithm” outperforms the “random sampling algorithm” in all the experiments.
- This shows that hidden annotation with active learning is a very powerful tool for improving the performance of CBIR systems.

- Why is the annotation termed “hidden”?
- Is testing on a synthetic database credible?
- How do we know which technique to use, i.e. “relevance feedback” or “hidden annotation”?
- How is the uncertainty measure defined for multiple attributes?

- The uncertainty should be defined based on the joint probability of all the attributes.
- For a certain object $O_i$ and a certain attribute $A_k$, we define the individual entropy as
$H_{ik} = -P_{ik} \log P_{ik} - (1 - P_{ik}) \log(1 - P_{ik})$

The overall uncertainty for an object $O_i$ is defined as a weighted sum of the entropies for all the attributes, i.e.

$U_i = \sum_{k=1}^{K} w_k(s) H_{ik}$

where $K$ is the total number of attributes,

$w_k(s)$ is the semantic weight of attribute $A_k$.
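
The weighted sum can be sketched in Python as follows. How the semantic weights are chosen is not described here, so they are simply passed in as numbers; base-2 logarithms and the function names are assumptions of this sketch.

```python
import math


def binary_entropy(p):
    """Individual entropy H_ik in bits; 0 for p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def overall_uncertainty(probs, weights):
    """U_i as the weighted sum of the per-attribute entropies.

    probs:   P_i1 ... P_iK for one object
    weights: semantic weight of each attribute (given, not derived here)
    """
    if len(probs) != len(weights):
        raise ValueError("need exactly one weight per attribute")
    return sum(w * binary_entropy(p) for p, w in zip(probs, weights))


# Two attributes: the first is completely unknown, the second is settled.
u_mixed = overall_uncertainty([0.5, 1.0], [2.0, 3.0])
u_annotated = overall_uncertainty([1.0, 0.0], [2.0, 3.0])
```

Only the unknown attribute contributes, so `u_mixed` equals its weight of 2.0, and a fully annotated object has zero uncertainty as the first property requires.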