
Mutual Information and Choice of AND and OR

Dayu, 18 Nov 2005. An example: Query No. 605, “Great Britain health care”, chosen because it consists of four terms. Performance is measured in MAP.



  1. Mutual Information and Choice of AND and OR • Dayu • 18 Nov 2005

  2. An Example • Query No. 605: “Great Britain health care” • Chosen because it consists of four terms • Performance is measured in MAP (mean average precision)

  3. Using Two Terms • From the MAP obtained when two terms are combined with “AND” versus “OR”, we infer how the pair jointly affects relevance • Great Britain health care → G B H C

  4. What does “Yes” mean? • “Yes” (i.e. MAP_AND > MAP_OR) means that the two terms complement or disambiguate each other, so requiring both retrieves more relevant information • Denoted by term1-term2 • “No” (i.e. MAP_AND < MAP_OR) means that the two terms • (1) seldom co-occur or • (2) are more or less synonyms • Denoted by (term1,term2) • MAP_AND ≈ MAP_OR means that the two terms always co-occur
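
The Yes/No rule on this slide can be written as a small decision function. The following is only a sketch: the function name, the label strings, and the tolerance `rel_tol` used to decide when the two MAP values are "approximately equal" are assumptions, since the slides give no threshold.

```python
def classify_pair(map_and: float, map_or: float, rel_tol: float = 0.05) -> str:
    """Label a term pair from the MAP of its AND query and of its OR query."""
    smaller = min(map_and, map_or)
    # Assumed notion of "approximately equal": the gap is small relative to
    # the smaller of the two scores. The slides do not define a threshold.
    if abs(map_and - map_or) <= rel_tol * smaller:
        return "always co-occur"
    if map_and > map_or:
        return "complementary"    # "Yes": written term1-term2
    return "rare-or-synonym"      # "No": written (term1,term2)


# Hypothetical MAP values, for illustration only.
print(classify_pair(0.10, 0.05))
print(classify_pair(0.05, 0.10))
```

The relative tolerance is one of several reasonable choices; an absolute threshold would work equally well for a sketch like this.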

  5. Overall Relationships • The pairwise relationships among the four terms are mutually consistent • The overall structure is (G,B)-(H,C)

  6. Advanced Boolean Operation • (G,B)-(H,C) • Could we use (G OR B) AND (H OR C)? • Performance: MAP = 0.0762 • Compared with:
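
As an illustration of how the notation maps to a query, the grouping (G,B)-(H,C) can be turned into a Boolean string mechanically: terms inside a parenthesised group are joined by OR, and groups linked by "-" are joined by AND. This is a hedged sketch; `build_boolean_query` is a hypothetical helper and the output syntax is generic, not that of any particular retrieval engine.

```python
def build_boolean_query(groups):
    """Join the terms inside each group with OR, then join the groups with AND."""
    clauses = []
    for group in groups:
        clause = " OR ".join(group)
        # Only multi-term groups need parentheses.
        clauses.append(f"({clause})" if len(group) > 1 else clause)
    return " AND ".join(clauses)


query = build_boolean_query([["Great", "Britain"], ["health", "care"]])
print(query)  # (Great OR Britain) AND (health OR care)
```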

  7. A Method to Estimate the Relationship Using MI • By mutual information (MI) • MI = P(A,B) / (P(A) · P(B)) • P(A,B) = # of IUs containing both A and B / total # of IUs • P(A) = # of IUs containing A / total # of IUs • P(B) = # of IUs containing B / total # of IUs • Hypothesis: the larger the MI, the more confident we are in using OR
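
A minimal sketch of the estimate on this slide, assuming each IU can be represented as a set of terms. The function name and the toy collection are hypothetical and are not data from the experiments.

```python
def mutual_information(ius, term_a, term_b):
    """Estimate MI = P(A,B) / (P(A) * P(B)) from IU counts."""
    total = len(ius)
    n_a = sum(1 for iu in ius if term_a in iu)
    n_b = sum(1 for iu in ius if term_b in iu)
    n_ab = sum(1 for iu in ius if term_a in iu and term_b in iu)
    if n_a == 0 or n_b == 0:
        raise ValueError("both terms must occur in at least one IU")
    p_a, p_b, p_ab = n_a / total, n_b / total, n_ab / total
    return p_ab / (p_a * p_b)


# Toy collection of four IUs (made up for illustration).
ius = [
    {"great", "britain"},
    {"great", "britain", "health"},
    {"health", "care"},
    {"care"},
]
print(mutual_information(ius, "great", "britain"))  # 2.0
print(mutual_information(ius, "health", "care"))    # 1.0
print(mutual_information(ius, "great", "care"))     # 0.0
```

With this definition a value of 1 corresponds to statistical independence of the two terms, values above 1 to terms that co-occur more often than chance, and values below 1 to terms that co-occur less often. The logarithm of this ratio is the quantity usually called pointwise mutual information.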

  8. Relationship between MI and (MAP_OR - MAP_AND) / min(MAP_AND, MAP_OR)
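
The quantity compared against MI on this slide is a relative gap between the two MAP scores. A small sketch, with a hypothetical function name and made-up inputs:

```python
def relative_map_gap(map_and: float, map_or: float) -> float:
    """Return (MAP_OR - MAP_AND) / min(MAP_AND, MAP_OR)."""
    smaller = min(map_and, map_or)
    if smaller <= 0:
        raise ValueError("both MAP values must be positive")
    return (map_or - map_and) / smaller


# Hypothetical MAP values, for illustration only.
print(relative_map_gap(0.05, 0.10))  # positive: OR outperforms AND
print(relative_map_gap(0.10, 0.05))  # negative: AND outperforms OR
```

Dividing by the smaller score makes the measure symmetric in magnitude: swapping the two scores only flips the sign.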

  9. Terms: Social, Tax, Securities • Pairwise MI (edges a, b, c): a = 1.07, b = 0.78, c = 0.79 • Variance of MI = 0.019

  10. Query: SDI Star Wars • Pairwise MI (edges a, b, c): a = 1.1, b = 0.8, c = 0.4 • Variance of MI = 0.076

  11. Query: college education advantage • Pairwise MI (edges a, b, c): a = 0.41, b = 0.56, c = 0.23 • Variance of MI = 0.017
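
The "variance of MI" in the three examples above can be recomputed from the pairwise values roughly as follows. This is a sketch under assumptions: the slides do not say which variance estimator was used (the population variance is assumed here), and the inputs are the rounded values shown on the slides, so the results need not match the reported figures exactly.

```python
from statistics import pvariance

# Pairwise MI values as shown on the slides (rounded).
pairwise_mi = {
    "Social, Tax, Securities": [1.07, 0.78, 0.79],
    "SDI Star Wars": [1.1, 0.8, 0.4],
    "college education advantage": [0.41, 0.56, 0.23],
}


def mi_variance(values):
    """Population variance of a query's pairwise MI values (assumed estimator)."""
    return pvariance(values)


for query, values in pairwise_mi.items():
    print(f"{query}: variance of MI = {mi_variance(values):.3f}")
```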

  12. Future Work • Investigate a wider range of queries • Does the variance of the pairwise MI values affect the choice between AND and OR? • Should the MI of two terms also be brought into the computation of the allo-T edge?
