learning surface text patterns for a question answering system l.
Skip this Video
Loading SlideShow in 5 Seconds..
Learning Surface Text Patterns for a Question Answering System PowerPoint Presentation
Download Presentation
Learning Surface Text Patterns for a Question Answering System

Loading in 2 Seconds...

play fullscreen
1 / 54

Learning Surface Text Patterns for a Question Answering System - PowerPoint PPT Presentation

  • Uploaded on

Learning Surface Text Patterns for a Question Answering System Deepak Ravichandran Eduard Hovy Information Sciences Institute University of Southern California From Proceedings of the ACL Conference, 2002 Goal Explore power of surface text patterns for open-domain QA systems Why This Paper

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Learning Surface Text Patterns for a Question Answering System' - jaden

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
learning surface text patterns for a question answering system

Learning Surface Text Patterns for a Question Answering System

Deepak Ravichandran

Eduard Hovy

Information Sciences Institute

University of Southern California

  • Explore power of surface text patterns for open-domain QA systems
why this paper
Why This Paper
  • Fall 2001 NLP project - QA system
winning team
Winning Team
  • Matt Myers & Henry Longmore
    • "If we were asked to design another question answering system, we would keep the same basic system as a foundation. We would then use more patterns and variations of patterns in the NE recognizer. We would use Machine Learning techniques, particularly for learning patterns for the NE recognizer."
meanwhile back at the batcave
Meanwhile, back at the batcave...
  • Automatic learning of surface text patterns for open-domain question answering
recent open domain systems
Recent Open Domain Systems
  • External knowledge, tools
    • Named Entity taggers
    • WordNet
    • parsers
    • hand-tagged corpora
    • ontology lists
recent o d systems cont
Recent O-D Systems (cont.)
  • Recent TREC-10 evaluation
    • winning system used just 1 resource
    • extensive list of surface patterns
    • surprised many
basic idea
Basic Idea
  • Investigate potential of surface patterns
    • Learn patterns
    • Measure accuracy
characteristic phrases
Characteristic Phrases
  • "When was <person> born”
    • Typical answers
      • "Mozart was born in 1756.”
      • "Gandhi (1869-1948)...”
    • Suggests phrases like
      • "<NAME> was born in <BIRTHDATE>”
      • "<NAME> ( <BIRTHDATE>-”
    • as Regular Expressions can help locate correct answer
auto learn patterns from web
Auto-learn Patterns from Web
  • Tagged corpus using AltaVista
  • Hand-crafted examples of each question type
  • Bootstrapping to build large tagged corpus as in Information Extraction (Riloff, 96)
  • Abundance of data on web - reliable statistical estimates
the system
The System
  • Assume sentence is a simple sequence of words
  • Search for repeated word orderings
  • Evidence for useful answer phrases
system cont
System (cont.)
  • Suffix trees to extract substrings of optimal length
  • Suffix trees from Computational Biology (Gusfield, 97)
  • Used to detect DNA sequences
  • Linear time on size of corpus
  • Don't restrict length of substrings
pattern learning algorithm
Pattern Learning Algorithm
  • Select example for question type
    • BIRTHYEAR questions select "Mozart 1756”
      • "Mozart" is question term
      • "1756" is answer term
  • Submit Q & A terms to AltaVista
  • Require both terms to be present
pattern learning cont
Pattern Learning (cont.)
  • Download top 1000 documents returned
  • Apply sentence breaker to documents
  • Keep only those sentences with both terms present
pattern learning cont16
Pattern Learning (cont.)
  • Terms can be present in various forms
    • e.g. Mozart as:
      • Wolfgang Amadeus Mozart
      • Mozart, Wolfgang Amadeus
      • Amadeus Mozart
      • Mozart
pattern learning cont17
Pattern Learning (cont.)
  • Specify ways in which Q term and A term can be specified in text
  • Easy to do for BIRTHDATE
  • Not so for Q types like DEFINITION
    • Many acceptable answers, all answers need to be used to ensure high confidence in precision
pattern learning cont18
Pattern Learning (cont.)
  • Process (tokenize, smooth whitespace, remove tags, etc.)
    • simplify input for egrep (or other regular expression tool)
  • Pass sentence through suffix tree constructor
    • finds substrings (and counts) of all lengths
pattern learning cont19
Pattern Learning (cont.)
  • Example:
      • “The great composer Mozart (1756-1791) achieved fame at a young age”
      • “Mozart (1756-1791) was a genius”
      • “The whole world would always be indebted to the great music of Mozart (1756-1791)”
    • Longest matching substring for all 3 sentences is "Mozart (1756-1791)”
    • Suffix tree would extract "Mozart (1756-1791)" as an output, with score of 3
pattern learning cont20
Pattern Learning (cont.)
  • Filter phrases in suffix tree
  • Keep phrases containing Q & A terms
  • Replace question term with <NAME>
  • Replace answer term with <ANSWER>
pattern learning cont21
Pattern Learning (cont.)
  • Repeat with different examples of same question type
    • “Gandhi 1869”, “Newton 1642”, etc.
  • Some patterns learned for BIRTHDATE
    • a. born in <ANSWER>, <NAME>
    • b. <NAME> was born on <ANSWER> ,
    • c. <NAME> ( <ANSWER> -
    • d. <NAME> ( <ANSWER> - )
pattern learning last one
Pattern Learning (last one!)
  • Strings partly overlapping (c & d) saved separately
    • Separate counts of occurrence frequencies
    • Can distinguish (in this case) between pattern for person still living (d) and more general pattern (c)
calculate precision
Calculate Precision
  • Submit query to AltaVista using only Q term ("Mozart")
  • Download top 1000 returned documents
  • Segment into sentences as in pattern learning algorithm
  • Keep sentences containing Q term
calculate precision cont
Calculate Precision (cont.)
  • For each pattern learned, check presence of pattern in sentence
    • pattern with <ANSWER> tag matched by any word
    • pattern with <ANSWER> tag matched by correct A term
      • Mozart was born in <ANY_WORD>
      • Mozart was born in 1756
calculate precision cont25
Calculate Precision (cont.)
  • Calculate precision of each pattern
  • P = Ca/Co
    • Ca = total # of patterns w/answer term present
    • Co = total # of patterns w/answer term replaced by any word
  • Keep only patterns matching sufficient # of examples (e.g. >5)
calculate precision cont26
Calculate Precision (cont.)
  • Obtain table of Regular Expression patterns
  • 1 table per question type
    • Precision of pattern
    • precision is probability pattern containing answer
    • principle of maximum likelihood estimation
calculate precision cont27
Calculate Precision (cont.)
  • BIRTHDATE table:
      • 1.0 <NAME> ( <ANSWER> - )
      • 0.85 <NAME> was born on <ANSWER>,
      • 0.6 <NAME> was born in <ANSWER>
      • 0.59 <NAME> was born <ANSWER>
      • 0.53 <ANSWER> <NAME> was born
      • 0.50 - <NAME> ( <ANSWER>
      • 0.36 <NAME> ( <ANSWER> -
calculate precision cont28
Calculate Precision (cont.)
  • Good range of patterns obtained with as few as 10 examples
  • Rather long list difficult to come up with manually
  • Largest number of examples the system required to get a good range of patterns?
calculate precision cont29
Calculate Precision (cont.)
  • Precision of patterns learned from one QA-pair calculated for other examples of same question type
  • Helps eliminate dubious patterns
    • Contents of two or more sites are the same
    • Same document appears in search engine output for learning & precision stages
finding answers
Finding Answers
  • To new questions!
  • Use existing QA system (Hovy et al., 2002b;2001)
  • Determine type of new question
  • Identify Question term
finding answers cont
Finding Answers (cont.)
  • Create query from Q term & do IR
    • use answer document corpus such as TREC-10 or web search
  • Segment returned documents into sentences & process as before
  • Replace Q term by Q tag
    • e.g. <NAME> in case of BIRTHYEAR type
finding answers cont32
Finding Answers (cont.)
  • Using pattern table developed for Q type, search for presence of each pattern
  • Select words matching <ANSWER> as potential answer
  • Sort answers by pattern's precision scores
  • Discard duplicate answers (string compare)
  • Return top 5
  • 6 different Q types
    • from Webclopedia QA Typology (Hovy et al., 2002a)
      • LOCATION
      • INVENTOR
      • WHY-FAMOUS
experiments cont
Experiments (cont.)
  • (BIRTHYEAR - previously shown)
      • 1.0 <ANSWER> invents <NAME>
      • 1.0 the <NAME> was invented by <ANSWER>
      • 1.0 <ANSWER> invented the <NAME> in
    • all have precision of 1.0
experiments cont35
Experiments (cont.)
      • 1.0 when <ANSWER> discovered <NAME>
      • 1.0 <ANSWER>'s discovery of <NAME>
      • 0.9 <NAME> was discovered by <ANSWER> in
      • 1.0 <NAME> and related <ANSWER>
      • 1.0 form of <ANSWER>, <NAME>
      • 0.94 as <NAME>, <ANSWER> and
experiments cont36
Experiments (cont.)
      • 1.0 <ANSWER> <NAME> called
      • 1.0 laureate <ANSWER> <NAME>
      • 0.71 <NAME> is the <ANSWER> of
      • 1.0 <ANSWER>'s <NAME>
      • 1.0 regional : <ANSWER> : <NAME>
      • 0.92 near <NAME> in <ANSWER>
experiments cont37
Experiments (cont.)
  • For each Q type, extract questions from TREC-10 set
  • Run through testing phase (precision)
  • Two sets of experiments
experiments cont38
Experiments (cont.)
  • Set one
    • TREC corpus is input
    • IR done by IR component of their QA system (Lin, 2002)
  • Set two
    • Web is input
    • IR performed by AltaVista
  • Measured by Mean Reciprocal Rank (?)
  • TREC
      • Question type # of Q's MRR
      • BIRTHYEAR 8 0.48
      • INVENTOR 6 0.17
      • DISCOVERER 4 0.13
      • DEFINITION 102 0.34
      • WHY-FAMOUS 3 0.33
      • LOCATION 16 0.75
results cont
Results (cont.)
  • Web
      • Q type # of Q’s MRR
      • BIRTHYEAR 8 0.69
      • INVENTOR 6 0.58
      • DISCOVERER 4 0.88
      • DEFINITION 102 0.39
      • WHY-FAMOUS 3 0.00
      • LOCATION 16 0.86
results cont41
Results (cont.)
  • System performs better on web data than on TREC corpus
  • Abundant web data makes it easier for system to locate answers with high precision scores
  • TREC corpus does not have enough candidate answers with high precision score
    • must settle for answers from low precision patterns
  • WHY-FAMOUS exception - may be due to small # of test Q's
shortcomings extensions
Shortcomings & Extensions
  • Need for POS &/or semantic types
      • "Where are the Rocky Mountains?”
      • "Denver's new airport, topped with white fiberglass cones in imitation of the Rocky Mountains in the background , continues to lie empty”
      • <NAME> in <ANSWER>
  • NE tagger &/or ontology could enable system to determine "background" is not a location
shortcomings cont
Shortcomings... (cont.)
  • DEFINITION Q's - match term too general, though correct technically
      • "What is nepotism?”
      • <ANSWER>, <NAME>
      • "...in the form of widespread bureaucratic abuses: graft, nepotism...”
      • "What is sonar?”
      • <NAME> and related <ANSWER>
      • "...while its sonar and related underseas systems are built...”
shortcomings cont44
Shortcomings... (cont.)
  • Long distance dependencies
      • "Where is London?”
      • "London, which has one of the most busiest airports in the world, lies on the banks of the river Thames”
      • would require pattern like:<QUESTION>, (<any_word>)*, lies on <ANSWER>
    • Abundance & variety of Web data helps system to find an instance of patterns w/o losing answers to long distance dependencies
shortcomings cont45
Shortcomings... (cont.)
  • More info in patterns regarding length of expected answer phrase
    • Searches in range of 50 bytes of answer phrase to capture pattern
    • fails under some conditions
      • "When was Lyndon B. Johnson born?”
      • "...lost to democratic Sen. Lyndon B. Johnson, who ran for both re-election and the vice presidency”
      • <NAME> <ANSWER> -
shortcomings cont46
Shortcomings... (cont.)
  • Lacks info that <ANSWER> in this case should be exactly replaced by 1 word
  • Could extend system to search for answer in range of 1-2 chunks
    • basic English phrases, NP, VP, PP, etc.
shortcomings cont47
Shortcomings... (cont.)
  • System doesn't work for Q types requiring multiple words from question to be in answer
      • "In which county does the city of Long Beach lie?”
      • "Long Beach is situated in Los Angeles County”
      • required pattern:<QUESTION_TERM_1> situated in <ANSWER> <QUESTION_TERM_2>
shortcomings cont48
Shortcomings... (cont.)
  • Performance of system depends greatly on having only 1 anchor word
  • Multiple anchor points
    • would help eliminate candidate answers
    • require all anchor words be present in candidate answer sentence
shortcomings cont49
Shortcomings... (cont.)
  • Does not use case
      • "What is micron?”
      • "...a spokesman for Micron, a maker of semiconductors, said SIMMs are..."
  • If Micron had been capitalized in question, would be a perfect answer
shortcomings cont50
Shortcomings... (cont.)
  • Canonicalization of words
      • BIRTHDATE for Gandhi: 1869; Oct. 2, 1869; 2nd October 1869; October 2 1869; 02 October 1869; etc.
    • Use date tagger to cluster all variations and tag with same term
    • Extend idea to smooth out variations in Q term for names: Gandhi, Mahatma Gandhi, Mohandas Karamchand Gandhi, etc.
  • Web results easily outperform TREC results
  • Suggests need to integrate outputs from Web & TREC
  • Word count to help eliminate unlikely answers
      • ? DEFINITION
conclusion cont
Conclusion (cont.)
  • But what about DEFINITION?
      • 102 Q's in TREC Corpus and in Web
      • Most Q's of all types
      • MRR-TREC == 0.34
      • MRR-Web == 0.39
      • All other Q's have # < 20, most < 10
  • If enough Q's are asked, will difference in performance on Web data vs. TREC data diminish?
conclusion cont53
Conclusion (cont.)
  • Simplicity - "perfect" for multilingual system QA
    • Low resource requirement - no NE taggers, no parsers, no ontologies, etc.
    • No adaptation of these to new language required
    • Need to create manual training terms & use appropriate web search engine
regular expressions from ask iggy
Regular Expressions from ask_iggy
  • “place called\\s+($cap_pattern+)” “home called\\s+($cap_pattern+)”
  • “at\\s((the)?\\s+($cap_pattern+))” “to\\s+($cap_pattern+)”
  • “place\\s+in\\s+($cap_pattern+)called\\s+($cap_pattern+)”
  • “in\\s+($cap_pattern+)” “up\\s+($cap_pattern+)”
  • “left\\s+($cap_pattern+)” “(($cap_pattern+)[Ii]slands)”
  • “(northern|southern|eastern|western)\\s+($cap_pattern+)”
  • “from\\s+($cap_pattern+)” “far\\s+as\\s+($cap_pattern+)”
  • “place\\s+in\\s+($cap_pattern+)” “home\\s+town”
  • “city\\s+of\\s+($cap_pattern+)”
  • “middle\\s+of\\s+((the)?\\s+($cap_pattern+))”
  • “(($cap_pattern+)[Ii]slands\\s+of\\s+($cap_pattern+))”
  • “place{1,1}d?\\s+near\\s+($cap_pattern+)”
  • “((above|over)\\s+($cap_pattern+))”