
Automatic Question Generation from Queries (Natural Language Computing, Microsoft Research Asia)


Presentation Transcript


  1. Automatic Question Generation from Queries. Natural Language Computing, Microsoft Research Asia. Chin-Yew LIN (cyl@microsoft.com)

  2. Generating Questions from Queries: the query “Hannah Montana concert” → Q2Q → “Where is the next Hannah Montana concert?” • Q2Q as a question generation shared task

  3. Remember Ask Jeeves? “How large is British Columbia?”

  4. Live Search QnA (English)

  5. Naver Knowledge iN (Korea) Naver “Knowledge iN” service • Opened in October 2002 • 70 million Knowledge iN DB entries collected (as of June 2007) • # of users: 12 million • Upper-level users (higher than Kosu): 6,648 (0.05%) • Distribution of knowledge: • Education, Learning: 17.78% • Computer, Communication: 12.89% • Entertainment, Arts: 11.42% • Business, Economy: 11.42% • Home, Life: 7.44%

  6. Baidu Zhidao (China) 17,012,767 resolved questions in two years of operation; 8,921,610 are knowledge-related. 96.7% of questions are resolved. 10,000,000 daily visitors. 71,308 new questions per day. 3.14 answers per question. Source: http://www.searchlab.com.cn (Chinese Search Behavior Research / User Research Lab of Chinese Search)

  7. Yahoo! Answers (Global; Marciniak) Launched in December 2005. 20 million users in the U.S. (> 90 million worldwide). 33,557,437 resolved questions (US; April 2008). ~70,000* new questions per day (US). 6.76* answers per question (US).

  8. Question Taxonomy ISI’s question-answer typology (Hovy et al. 2001 & 2002) • Results of analyzing over 20K online questions • 140 different question types with examples • http://www.isi.edu/natural-language/projects/webclopedia/Taxonomy/taxonomy_toplevel.html Liu et al.’s (COLING 2008) cQA question taxonomy • Derived from Broder’s (SIGIR Forum 2002) web search taxonomy • Results of analyzing 100 randomly sampled questions from the top 4 Yahoo! Answers categories • Entertainment & Music, Society & Culture, Health, and Computer & Internet

  9. Main Task: Q2Q Generate questions given a query • Query: “Hannah Montana concert” • Questions: • “How do I get Hannah Montana concert tickets for a really good price?” • “What should I wear to a Hannah Montana concert?” • “How long is the Hannah Montana concert?” • … Subtasks • Predict user goals • Learn question templates • Normalize questions
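To make the template subtask concrete, here is a minimal Python sketch of template-based question generation. The goal labels, templates, and the `predict_user_goals` stub are illustrative assumptions, not the proposed system.

```python
# Minimal sketch of template-based Q2Q generation.
# Templates and goal labels are illustrative assumptions.
TEMPLATES = {
    "buy_tickets": "How do I get {query} tickets for a really good price?",
    "location": "Where is the next {query}?",
    "duration": "How long is the {query}?",
}

def predict_user_goals(query: str) -> list[str]:
    """Stub goal predictor; a real system would learn goals from query logs."""
    if "concert" in query.lower():
        return ["buy_tickets", "location", "duration"]
    return ["location"]

def generate_questions(query: str) -> list[str]:
    """Instantiate one question template per predicted user goal."""
    return [TEMPLATES[goal].format(query=query) for goal in predict_user_goals(query)]

print(generate_questions("Hannah Montana concert"))
# ['How do I get Hannah Montana concert tickets for a really good price?',
#  'Where is the next Hannah Montana concert?',
#  'How long is the Hannah Montana concert?']
```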

  10. Data Preparation • cQA archives • Live Search QnA • Yahoo! Answers • Ask.com • Other sources • Query logs • MSN/Live Search • Yahoo! • Ask.com • TREC and other sources • Possible process • Sample queries from search engine query logs • Ensure broad topic coverage • Find candidate questions from cQA archives given queries • Create mapped Q2Q corpus for training and testing
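The mapping step in the possible process above might look like the following sketch, which pairs sampled queries with cQA questions using a simple term-containment match; the data and the matching rule are placeholder assumptions.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercased word tokens, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def find_candidate_questions(query: str, cqa_questions: list[str]) -> list[str]:
    """Keep cQA questions that contain every query term (a crude matching rule)."""
    query_terms = tokens(query)
    return [q for q in cqa_questions if query_terms <= tokens(q)]

# Placeholder data standing in for a query-log sample and a cQA archive.
queries = ["Hannah Montana concert"]
cqa_archive = [
    "How long is the Hannah Montana concert?",
    "What should I wear to a Hannah Montana concert?",
    "Where can I watch Hannah Montana episodes?",
]

# The mapped Q2Q corpus: each query paired with its candidate questions.
q2q_corpus = {q: find_candidate_questions(q, cqa_archive) for q in queries}
print(q2q_corpus)
```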

  11. Intrinsic Evaluation Given a query term • Generate a ranked list of questions related to the query term • Open set – use a pooling approach • Pool all questions from participants • Rate each question as relevant or not • Compute recall/precision/F1 scores • Closed set – use test set data as the gold standard • Metrics • Diversity, interestingness, utility, and so on.
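For the open-set pooling approach, precision/recall/F1 over judged pool items could be computed as in this sketch; the pool and relevance judgments are made-up placeholders.

```python
def precision_recall_f1(system_questions: list[str], relevant: set[str]):
    """Score one participant's question list against pooled relevance judgments."""
    retrieved = set(system_questions)
    hits = retrieved & relevant
    p = len(hits) / len(retrieved) if retrieved else 0.0
    r = len(hits) / len(relevant) if relevant else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Pool = union of all participants' generated questions; judges mark each
# pooled question as relevant or not. Values here are placeholders.
pool = {"q1", "q2", "q3", "q4"}
relevant = {"q1", "q3"}  # judged relevant within the pool

print(precision_recall_f1(["q1", "q2"], relevant))  # (0.5, 0.5, 0.5)
```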

  12. Extrinsic Evaluation A straw-man scenario • Task – online information seeking • Setup • A user selects a topic (T) she is interested in. • Generate a set of N queries given T and a query log. • The user selects a query (q) from the set. • Generate a set of M questions given q. • The user selects the question (Q) that she has in mind. • If the user does not select any question, record the trial as unsuccessful. • Send q to a search engine (S); get results X. • Send q, Q, and anything inferred from Q to S; get results Y. • Compare results X and Y using standard IR relevance metrics.
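The last step compares result lists X and Y with standard IR relevance metrics; the sketch below uses NDCG as one common choice, with placeholder relevance grades for both lists.

```python
import math

def dcg(relevances: list[int]) -> float:
    """Discounted cumulative gain of a ranked list of relevance grades."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(relevances: list[int]) -> float:
    """DCG normalized by the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal else 0.0

# Placeholder relevance grades (0-2) for the two result lists.
x_rels = [1, 0, 2, 0, 1]  # results for the query q alone
y_rels = [2, 2, 1, 0, 0]  # results for q plus the selected question Q
print(f"NDCG(X) = {ndcg(x_rels):.3f}, NDCG(Y) = {ndcg(y_rels):.3f}")
```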

  13. Summary Task: Question generation from queries Data: • Search engine query logs • cQA question-answer archives • Question taxonomies Evaluation: • Intrinsic – evaluate specific technology areas • Extrinsic – evaluate its effect on real-world scenarios Real data, real task, and real impact

  14. Analyze cQA Questions (Liu et al. COLING 2008)
