Mutual Information and Choice of AND and OR


### Mutual Information and Choice of AND and OR

Dayu

18 Nov 2005

An Example
• Query No.605 Great Britain health care
• We choose it because it consists of 4 terms
• Performance in MAP
Using Two terms
• Based on the MAP obtained when two terms are combined with “AND” versus “OR”, we infer how these two terms jointly affect relevance
• The terms Great, Britain, health, care are abbreviated G, B, H, C
What does “Yes” mean?
• If “Yes” (i.e. MAP_AND > MAP_OR), it means that these two terms complement or disambiguate each other and so yield more relevant results.
• Denoted by term1-term2
• If “No” (i.e. MAP_AND < MAP_OR), it means that these two terms
• (1) seldom co-occur or
• (2) are more or less synonyms
• Denoted by (term1,term2)
• If MAP_AND ≈ MAP_OR, it means that these two terms always co-occur
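The decision rule on this slide can be sketched as a small function that compares the MAP of the AND query with the MAP of the OR query. This is only an illustration: the name `classify_pair`, the return labels, and the tolerance `tol` used to decide when the two MAP values count as approximately equal are assumptions, since the slides give no threshold.

```python
def classify_pair(map_and, map_or, tol=0.005):
    """Label a term pair from the MAP of its AND query and its OR query.

    tol is a hypothetical threshold for "approximately equal";
    the slides do not specify one.
    """
    if abs(map_and - map_or) <= tol:
        # The two MAP values are about the same: the terms always co-occur
        return "EQUAL"
    if map_and > map_or:
        # "Yes": the terms complement or disambiguate each other (term1-term2)
        return "AND"
    # "No": the terms seldom co-occur or are near-synonyms ((term1,term2))
    return "OR"


# Illustrative MAP values only, not results from the slides
print(classify_pair(0.08, 0.05))
print(classify_pair(0.03, 0.06))
```

Checking the tolerance first keeps near-ties from being forced into a “Yes” or “No” on the basis of noise in the MAP values.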
Overall Relationships

In conclusion, the relationships between each pair of the four terms are consistent: (G,B)-(H,C).

• (G,B)-(H,C)
• Could we use (G or B) and (H or C)?
• Performance MAP=0.0762
• Compared with:
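The grouped form (G or B) and (H or C) asked about on this slide can be generated mechanically from the pair notation: terms inside one group are joined by OR, and the groups are joined by AND. The sketch below assumes a generic Boolean syntax with uppercase `AND`/`OR` and parentheses; the helper name `boolean_query` is hypothetical, and the retrieval system's real query syntax is not given in the slides.

```python
def boolean_query(groups):
    """Join terms within each group by OR and the groups themselves by AND."""
    clauses = []
    for group in groups:
        if len(group) == 1:
            # A single-term group needs no parentheses
            clauses.append(group[0])
        else:
            clauses.append("(" + " OR ".join(group) + ")")
    return " AND ".join(clauses)


# (G,B)-(H,C) for query No. 605
print(boolean_query([["Great", "Britain"], ["health", "care"]]))
# prints: (Great OR Britain) AND (health OR care)
```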
A Method to estimate the relationship using MI
• By mutual information.
• MI = P(A,B) / (P(A) P(B))
• P(A,B) = # of IUs containing both A and B / total # of IUs
• P(A) = # of IUs containing A / total # of IUs
• P(B) = # of IUs containing B / total # of IUs
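As a minimal sketch of this estimate, the ratio can be computed from IU counts, with P(B) taken as the fraction of IUs that contain B. The representation of each IU as a Python set of terms, the function name `mi_ratio`, and the toy data are assumptions made for illustration. The function follows the formula as written on the slide, without the logarithm used in pointwise mutual information, so a value of 1 corresponds to independent terms.

```python
def mi_ratio(ius, a, b):
    """Return P(A,B) / (P(A) * P(B)), with probabilities estimated as IU fractions.

    ius: list of sets of terms, one set per IU (an assumed representation).
    """
    total = len(ius)
    n_a = sum(1 for iu in ius if a in iu)
    n_b = sum(1 for iu in ius if b in iu)
    n_ab = sum(1 for iu in ius if a in iu and b in iu)
    if n_a == 0 or n_b == 0:
        raise ValueError("MI is undefined when a term occurs in no IU")
    p_a, p_b, p_ab = n_a / total, n_b / total, n_ab / total
    return p_ab / (p_a * p_b)


# Toy IUs, invented for illustration only
ius = [
    {"great", "britain"},
    {"great", "britain", "health"},
    {"health", "care"},
    {"care"},
]
print(mi_ratio(ius, "great", "britain"))  # co-occur more often than chance
print(mi_ratio(ius, "great", "care"))     # never co-occur
```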

Hypothesis: the larger the MI, the more confident we can be in using OR

[Figure: term graph for Social, Tax, Securities with pairwise MI on the edges: a = 1.07, b = 0.78, c = 0.79]

Variance of MI = 0.019
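The variance of the pairwise MI values can be reproduced approximately from the three edge values. The sketch below assumes the population variance (`statistics.pvariance`), which the slides do not confirm. Because the edge values are rounded, the result need not match the printed variance exactly. The function name `mi_variance` is hypothetical.

```python
from statistics import pvariance


def mi_variance(edge_mi):
    """Population variance of the pairwise MI values (assumed estimator)."""
    return pvariance(edge_mi)


# MI values 1.07, 0.78, 0.79 for the term pairs of Social, Tax, Securities
print(round(mi_variance([1.07, 0.78, 0.79]), 3))
```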

Query: SDI Star Wars

[Figure: term graph with pairwise MI on the edges: a = 1.1, b = 0.8, c = 0.4]

Variance of MI = 0.076

[Figure: term graph with pairwise MI on the edges: a = 0.41, b = 0.56, c = 0.23]

Variance of MI = 0.017

Future Work
• Investigate a wider range of queries.
• Does the variance of the MI across the term pairs affect whether to use AND or OR?
• Should we additionally incorporate the MI of two terms into the computation of the allo-T edge?