
Recommending Questions Using the MDL-based Tree Cut Model

Yunbo CAO, Huizhong DUAN, Chin-Yew LIN, Yong YU, and Hsiao-Wuen HON. Natural Language Computing Group, Microsoft Research Asia.


Presentation Transcript


  1. Recommending Questions Using the MDL-based Tree Cut Model • Yunbo CAO, Huizhong DUAN, Chin-Yew LIN, Yong YU, and Hsiao-Wuen HON • Natural Language Computing Group, Microsoft Research Asia

  2. Community-based Q&A Service • Question search • Other aspects about Hamburg or Berlin • More aspects (NOT DISCOVERED) • How far is it from Berlin to Hamburg? • Where to see between Hamburg and Berlin? • …

  3. Question Recommendation • The problem • You ask: • Any cool clubs in Berlin or Hamburg? • We recommend: • How far is it from Berlin to Hamburg? • Where to see between Hamburg and Berlin? • Any good hostels in Hamburg or Berlin? • The principle of question recommendation • A good recommendation should be different from the queried question in question focus but similar in question topic.

  4. Outline • Question recommendation • Our approach • A walk-through of our approach • The uses of the MDL-based tree cut model • The flow of question recommendation • Related work • Experimental results • Conclusions

  5. Our Approach • The principle: A good recommendation should be different from the queried question in question focus but similar in question topic. • Query: Any cool clubs in Hamburg or Berlin? • Topic terms: cool clubs, Hamburg, Berlin • Related question: Where to see in Hamburg or Berlin? • Topic terms: where to see, Hamburg, Berlin • Comparison: the question focus is different (cool clubs vs. where to see); the question topic is the same or close (Hamburg, Berlin) • How can we discriminate question topic from question focus?

  6. Specificity – Weighing Terms • Category tree (Travel @ Yahoo! Answers): Travel, with sub-categories Asia Pacific (China, Japan, …), Europe, … • Questions under China: Anyone know where to see the Dragon Boat Festival in Beijing? Where is a good (less expensive) place to shop in Beijing? What's the cheapest way to get from Beijing to Hong Kong? • Questions under Europe: How far is it from Berlin to Hamburg? What is the cheapest way from Berlin to Hamburg? Where to see between Hamburg and Berlin? How long does it take from Hamburg to Berlin on the train? • The specificity of a topic term is the inverse entropy of the distribution of the topic term over the sub-categories.
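The definition on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `specificity`, the smoothing constant `epsilon` (added here only to avoid division by zero when a term occurs in a single sub-category), and all counts are assumptions made for the example.

```python
import math

def specificity(counts_by_subcategory, epsilon=0.001):
    """Inverse entropy of a term's distribution over sub-categories.

    epsilon is an assumed smoothing constant: a term seen in only one
    sub-category has entropy 0, which would otherwise divide by zero.
    """
    total = sum(counts_by_subcategory.values())
    if total == 0:
        raise ValueError("term does not occur in any sub-category")
    entropy = 0.0
    for count in counts_by_subcategory.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log(p)
    return 1.0 / (entropy + epsilon)

# Hypothetical counts, for illustration only.
hamburg = {"Europe": 98, "China": 1, "Japan": 1}
cool_clubs = {"Europe": 35, "China": 30, "Japan": 35}
print(specificity(hamburg) > specificity(cool_clubs))  # True
```

A term concentrated in one sub-category (low entropy) gets a high specificity, while a term spread evenly across sub-categories gets a low one.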

  7. Order Topic Terms by Specificity • Query: Any cool clubs in Hamburg or Berlin? • Topic terms: cool clubs, Hamburg, Berlin • Topic chain: Hamburg → Berlin → cool clubs • Related questions: Where to see in Hamburg or Berlin? How far is it from Berlin to Hamburg? • Topic terms: where to see, Hamburg, Berlin • Topic chains: Hamburg → Berlin → where to see; Hamburg → Berlin → how far • Diagram: Hamburg → Berlin (question topic) branches to cool clubs, where to see, and how far (question focus)
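Ordering topic terms into a chain can be sketched as a sort by descending specificity. The specificity scores below are invented for illustration, and the cut position passed to `split_chain` is supplied by hand here; in the talk the cut is determined by the MDL-based tree cut model.

```python
def topic_chain(topic_terms, specificity):
    """Order topic terms from most to least specific."""
    return sorted(topic_terms, key=lambda term: specificity[term], reverse=True)

def split_chain(chain, cut):
    """Split a chain into (question topic, question focus) at index `cut`."""
    return chain[:cut], chain[cut:]

# Hypothetical specificity scores, for illustration only.
scores = {"Hamburg": 9.0, "Berlin": 8.5, "cool clubs": 0.9}

chain = topic_chain(["cool clubs", "Hamburg", "Berlin"], scores)
topic, focus = split_chain(chain, cut=2)  # cut position assumed, not computed

print(" -> ".join(chain))  # Hamburg -> Berlin -> cool clubs
print(topic, focus)        # ['Hamburg', 'Berlin'] ['cool clubs']
```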

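  8. Scoring the Candidates • The recommendation score over a queried question and a recommendation candidate is defined as [equation not preserved in the transcript], where the annotated parts of the equation correspond to the question topic and the question focus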

  9. The MDL-based Tree Cut Model • The MDL principle • Model description length: uniform prior • Parameter description length: number of parameters • Data description length: minus log likelihood • The tree cut model (Li and Abe, 1998)
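The three description-length components listed on this slide can be sketched as follows, loosely following the formulation of Li and Abe (1998). This is an assumption-laden illustration rather than the implementation used in the talk: the function name, the representation of a cut as a list of classes (each a list of leaf terms), and the toy counts are all hypothetical, and the model description length is omitted because a uniform prior makes it the same constant for every cut of a given tree.

```python
import math

def description_length(cut, counts):
    """Parameter plus data description length (in bits) of a tree cut.

    cut:    list of classes, each a list of leaf terms (assumed representation)
    counts: observed frequency of each leaf term
    """
    n = sum(counts.values())
    k = len(cut) - 1                      # free parameters of the cut
    param_dl = 0.5 * k * math.log2(n)     # parameter description length
    data_dl = 0.0                         # minus log likelihood of the data
    for cls in cut:
        freq = sum(counts.get(term, 0) for term in cls)
        if freq == 0:
            continue
        # A class's probability mass is shared uniformly among its leaves.
        p_leaf = (freq / n) / len(cls)
        data_dl -= freq * math.log2(p_leaf)
    return param_dl + data_dl

# Toy counts, for illustration only.
even = {"a": 4, "b": 4}
skewed = {"a": 7, "b": 1}

print(description_length([["a", "b"]], even))    # 8.0
print(description_length([["a"], ["b"]], even))  # 9.5
print(description_length([["a", "b"]], skewed) >
      description_length([["a"], ["b"]], skewed))  # True
```

With evenly distributed counts the merged class is cheaper to describe, so the coarser cut wins; with skewed counts the extra parameter pays for itself and the finer cut wins. That trade-off is what lets the cut decide which terms to generalize.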

  10. Reduction of Topic Terms

  11. Reduction of Topic Terms • Term tree before the reduction: hotel (3983), western (40), nice (224), beachfront (5), affordable (248), suite (3), good (14), inexpensive (12), nice (2), embassy (1), great (3), good (3) • Term tree after the reduction: hotel (3983), western (66), nice (224), beachfront (11), suite (6), affordable (248), good (14), inexpensive (12)

  12. Determining the Cut • Query: Any cool clubs in Berlin or Hamburg? • Tree nodes: cool club, where to see, Berlin, how far, Hamburg, good hostel, fun club • Related questions: Where to see between Hamburg and Berlin? How far is it from Berlin to Hamburg? Any good hostels in Hamburg or Berlin? What are the best/most fun clubs in Hamburg?

  13. Flow of Question Recommendation • Query: Any cool clubs in Berlin or Hamburg? • STEP 1: Retrieve related questions from the index: (1) Where to see between Hamburg and Berlin? (2) How far is it from Berlin to Hamburg? (3) Any good hostels in Hamburg or Berlin? (4) What are the best/most fun clubs in Hamburg? • STEP 2: Discriminate question topic from question focus (tree nodes: cool club, Berlin, where to see, how far, Hamburg, good hostel, fun club) • STEP 3: Rank questions on the basis of the cut • Recommendation: (1) Where to see between Hamburg and Berlin? (2) How far is it from Berlin to Hamburg? (3) Any good hostels in Hamburg or Berlin? • Search: (1) What are the best/most fun clubs in Hamburg?
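The three steps can be mimicked with a deliberately simplified sketch. It does not implement the MDL-based cut: the topic and focus terms of each question are assigned by hand, and the bucketing rule (recommend a candidate when it covers all of the query's topic terms and shares none of its focus terms) is an assumption chosen only to reproduce the split shown on the slide. The function name `recommend` is likewise hypothetical.

```python
def recommend(query_topic, query_focus, candidates):
    """Split related questions into recommendations and search results.

    candidates: list of (question, topic_terms, focus_terms) tuples,
    with terms given as sets. The rule below is a simplified stand-in
    for ranking on the basis of the MDL-based cut.
    """
    recommended, search = [], []
    for question, topic, focus in candidates:
        if not (topic & query_topic):
            continue  # STEP 1: keep only questions related to the query
        same_topic = query_topic <= topic
        different_focus = not (focus & query_focus)
        if same_topic and different_focus:
            recommended.append(question)
        else:
            search.append(question)
    return recommended, search

# Topic/focus assignments below are hand-made assumptions.
candidates = [
    ("Where to see between Hamburg and Berlin?", {"Hamburg", "Berlin"}, {"where to see"}),
    ("How far is it from Berlin to Hamburg?", {"Hamburg", "Berlin"}, {"how far"}),
    ("Any good hostels in Hamburg or Berlin?", {"Hamburg", "Berlin"}, {"good hostel"}),
    ("What are the best/most fun clubs in Hamburg?", {"Hamburg"}, {"fun club"}),
]

recommended, search = recommend({"Hamburg", "Berlin"}, {"cool club"}, candidates)
print(len(recommended), len(search))  # 3 1
```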

  14. Outline • Question recommendation • Our approach • A walk-through of our approach • The uses of the MDL-based tree cut model • The flow of question recommendation • Related work • Experimental results • Conclusions

  15. Related Work • Question search (Jeon et al., 2005; Sneiders, 2002; Lai et al., 2002; Burke et al., 1997) • Find semantically equivalent questions given queries • Satisfying different users’ needs when compared to question recommendation • Query suggestion (Cucerzan & White, 2007; Jensen et al., 2006; Fonseca et al., 2003) • Suggest related queries through query log mining • Query logs are usually absent for questions • Query substitution (Jones et al., 2006) • Generate queries by replacing query terms • New queries are close to the original queries

  16. Outline • Question recommendation • Our approach • A walk-through of our approach • The uses of the MDL-based tree cut model • The flow of question recommendation • Related work • Experimental results • Conclusions

  17. Data and Evaluation Measures • The data • Resolved questions from Yahoo! Answers: 314,616 about ‘travel’ and 210,785 about ‘computers & internet’ • The test set was developed via human judgments

  18. Experimental Results (Basic) • Travel • Computers & Internet

  19. Experimental Results (Basic) What's a good but cheap hotel/motel/anything in downtown Chicago?

  20. Effectiveness of MDL • The baseline methods • First = our approach without the MDL-based reduction of topic terms • Second = our approach without the MDL-based discrimination between question topic and question focus • Third = our approach without both the MDL-based reduction of topic terms and the MDL-based discrimination between question topic and question focus • The use of the MDL is significant • The size of the vocabulary is 289,251 before the reduction of topic terms and 173,202 after the reduction, a reduction of about 40% ((289,251 − 173,202) / 289,251 ≈ 0.40). • The contribution given by the MDL-based selection of substitution is statistically significant

  21. Conclusions • Studied question recommendation by identifying question topics and question foci • Used the MDL-based tree cut model for • Reducing the set of topic terms • Discriminating question topics from question foci • Empirically verified the effectiveness of our approach to question recommendation

  22. Questions and Discussions!
