
Open Domain Question Answering:

Techniques, Resources and Systems

Bernardo Magnini

Itc-Irst

Trento, Italy

[email protected]

RANLP 2005 - Bernardo Magnini


Outline of the Tutorial

  • Introduction to QA

  • QA at TREC

  • System Architecture

    - Question Processing

    - Answer Extraction

  • Answer Validation on the Web

  • Cross-Language QA

RANLP 2005 - Bernardo Magnini


Previous lectures tutorials on qa
Previous Lectures/Tutorials on QA

  • Dan Moldovan and Sanda Harabagiu: Question Answering, IJCAI 2001.

  • Maarten de Rijke and Bonnie Webber: Knowledge-Intensive Question Answering, ESSLLI 2003.

  • Jimmy Lin and Boris Katz: Web-based Question Answering, EACL 2004.

RANLP 2005 - Bernardo Magnini


Introduction to Question Answering

  • What is Question Answering

  • Applications

  • Users

  • Question Types

  • Answer Types

  • Evaluation

  • Presentation

  • Brief history

RANLP 2005 - Bernardo Magnini


Query Driven vs Answer Driven Information Access

  • What does LASER stand for?

  • When did Hitler attack the Soviet Union?

    • Using Google we find documents containing the question itself, no matter whether or not the answer is actually provided.

  • Current information access is query driven.

  • Question Answering proposes an answer driven approach to information access.

RANLP 2005 - Bernardo Magnini


Question Answering

  • Find the answer to a question in a large collection of documents

    • questions (in place of keyword-based query)

    • answers (in place of documents)

RANLP 2005 - Bernardo Magnini


Why Question Answering?

From the Caledonian Star in the Mediterranean – September 23, 1990 (www.expeditions.com):

On a beautiful early morning the Caledonian Star approaches Naxos, situated on the east coast of Sicily. As we anchored and put the Zodiacs into the sea we enjoyed the great scenery. Under Mount Etna, the highest volcano in Europe, perches the fabulous town of Taormina. This is the goal for our morning.

After a short Zodiac ride we embarked our buses with local guides and went up into the hills to reach the town of Taormina.

Naxos was the first Greek settlement at Sicily. Soon a harbor was established but the town was later destroyed by invaders.[...]

Questions one would like to ask this document collection:

  • What continent is Taormina in? (searching for: Taormina)

  • What is the highest volcano in Europe? (searching for: Etna)

  • Where is Naxos? (searching for: Naxos)

RANLP 2005 - Bernardo Magnini


Alternatives to Information Retrieval

  • Document Retrieval

    • users submit queries corresponding to their information need

    • system returns (voluminous) list of full-length documents

    • it is the users' responsibility to locate the information they actually need within the returned documents

  • Open-Domain Question Answering (QA)

    • users ask fact-based, natural language questions

      What is the highest volcano in Europe?

    • system returns list of short answers

      … Under Mount Etna, the highest volcano in Europe, perches the fabulous town …

    • more appropriate for specific information needs

RANLP 2005 - Bernardo Magnini


What is QA?

  • Find the answer to a question in a large collection of documents

    What is the brightest star visible from Earth?

    1. Sirio A is the brightest star visible from Earth even if it is…

    2. the planet is 12-times brighter than Sirio, the brightest star in the sky…

RANLP 2005 - Bernardo Magnini


QA: a Complex Problem (1)

  • Problem: discovering implicit relations between questions and answers

    Who is the author of the “Star Spangled Banner”?

    …Francis Scott Key wrote the “Star Spangled Banner” in 1814.

    …comedian-actress Roseanne Barr sang her famous rendition of the “Star Spangled Banner” before …

RANLP 2005 - Bernardo Magnini


QA: a Complex Problem (2)

  • Problem: discovering implicit relations between questions and answers

    What is Mozart's birth date?

    …. Mozart (1756 – 1791) ….

RANLP 2005 - Bernardo Magnini


QA: a Complex Problem (3)

  • Problem: discovering implicit relations between questions and answers

    What is the distance between Naples and Ravello?

    “From the Naples Airport follow the sign to Autostrade (green road sign). Follow the directions to Salerno (A3). Drive for about 6 Km. Pay toll (Euros 1.20). Drive appx. 25 Km. Leave the Autostrade at Angri (Uscita Angri). Turn left, follow the sign to Ravello through Angri. Drive for about 2 Km. Turn right following the road sign "Costiera Amalfitana". Within 100m you come to traffic lights prior to narrow bridge. Watch not to miss the next Ravello sign, at appx. 1 Km from the traffic lights. Now relax and enjoy the views (follow this road for 22 Km). Once in Ravello ...”.

RANLP 2005 - Bernardo Magnini


QA: Applications (1)

  • Information access:

    • Structured data (databases)

    • Semi-structured data (e.g. comment field in databases, XML)

    • Free text

  • To search over:

    • The Web

    • Fixed set of text collections (e.g. TREC)

    • A single text (reading comprehension evaluation)

RANLP 2005 - Bernardo Magnini


QA: Applications (2)

  • Domain independent QA

  • Domain specific (e.g. help systems)

  • Multi-modal QA

    • Annotated images

    • Speech data

RANLP 2005 - Bernardo Magnini


QA: Users

  • Casual users, first time users

    • Understand the limitations of the system

    • Interpretation of the answer returned

  • Expert users

    • Difference between novel and already provided information

    • User Model

RANLP 2005 - Bernardo Magnini


QA: Questions (1)

  • Classification according to the answer type

    • Factual questions (What is the largest city …)

    • Opinions (What is the author's attitude …)

    • Summaries (What are the arguments for and against…)

  • Classification according to the question speech act:

    • Yes/NO questions (Is it true that …)

    • WH questions (Who was the first president …)

    • Indirect Requests (I would like you to list …)

    • Commands (Name all the presidents …)

RANLP 2005 - Bernardo Magnini


QA: Questions (2)

  • Difficult questions

    • Why, How questions require understanding causality or instrumental relations

    • What questions have little constraint on the answer type (e.g. What did they do?)

RANLP 2005 - Bernardo Magnini


QA: Answers

  • Long answers, with justification

  • Short answers (e.g. phrases)

  • Exact answers (named entities)

  • Answer construction:

    • Extraction: cut and paste of snippets from the original document(s)

    • Generation: from multiple sentences or documents

    • QA and summarization (e.g. What is this story about?)

RANLP 2005 - Bernardo Magnini


QA: Information Presentation

  • Interfaces for QA

    • Not just isolated questions, but a dialogue

    • Usability and user satisfaction

  • Critical situations

    • Real time, single answer

  • Dialog-based interaction

    • Speech input

    • Conversational access to the Web

RANLP 2005 - Bernardo Magnini


QA: Brief History (1)

  • NLP interfaces to databases:

    • BASEBALL (1961), LUNAR (1973), TEAM (1979), ALFRESCO (1992)

    • Limitations: structured knowledge and limited domain

  • Story comprehension: Schank (1977), Kintsch (1998), Hirschman (1999)

RANLP 2005 - Bernardo Magnini


QA: Brief History (2)

  • Information retrieval (IR)

    • Queries are questions

    • List of documents are answers

    • QA is close to passage retrieval

    • Well established methodologies (i.e. Text Retrieval Conferences TREC)

  • Information extraction (IE):

    • Pre-defined templates are questions

    • Filled templates are answers

RANLP 2005 - Bernardo Magnini


Research Context (1)

[Diagram: Question Answering positioned along several dimensions: domain-specific vs. domain-independent; structured data vs. free text; the Web vs. a fixed set of collections vs. a single document.]

Growing interest in QA (TREC, CLEF, NTCIR evaluation campaigns).

Recent focus on multilinguality and context-aware QA.

RANLP 2005 - Bernardo Magnini


Research Context (2)

[Diagram: a faithfulness/compactness plane. Automatic Summarization aims at output that is as compact as possible; Machine Translation aims at output that is as faithful as possible; Automatic Question Answering requires answers that are both faithful with respect to the question (correctness) and compact (exactness).]
RANLP 2005 - Bernardo Magnini


II. Question Answering at TREC

  • The problem simplified

  • Questions and answers

  • Evaluation metrics

  • Approaches

RANLP 2005 - Bernardo Magnini


The Problem Simplified: The Text Retrieval Conference

  • Goal

    • Encourage research in information retrieval based on large-scale collections

  • Sponsors

    • NIST: National Institute of Standards and Technology

    • ARDA: Advanced Research and Development Activity

    • DARPA: Defense Advanced Research Projects Agency

  • Since 1999

  • Participants are research institutes, universities, industries

RANLP 2005 - Bernardo Magnini


TREC Questions

Fact-based, short-answer questions:

Q-1391: How many feet in a mile?
Q-1057: Where is the volcano Mauna Loa?
Q-1071: When was the first stamp issued?
Q-1079: Who is the Prime Minister of Canada?
Q-1268: Name a food high in zinc.

Definition questions:

Q-896: Who was Galileo?
Q-897: What is an atom?

Reformulation questions:

Q-711: What tourist attractions are there in Reims?
Q-712: What do most tourists visit in Reims?
Q-713: What attracts tourists in Reims?
Q-714: What are tourist attractions in Reims?

RANLP 2005 - Bernardo Magnini


Answer Assessment

  • Criteria for judging an answer

    • Relevance: it should be responsive to the question

    • Correctness: it should be factually correct

    • Conciseness: it should not contain extraneous or irrelevant information

    • Completeness: it should be complete, i.e. a partial answer should not get full credit

    • Simplicity: it should be simple, so that the questioner can read it easily

    • Justification: it should be supplied with sufficient context to allow a reader to determine why this was chosen as an answer to the question

RANLP 2005 - Bernardo Magnini


Questions at TREC

RANLP 2005 - Bernardo Magnini


Exact Answers

  • Basic unit of a response: [answer-string, docid] pair

  • An answer string must contain a complete, exact answer and nothing else.

    What is the longest river in the United States?

    The following are correct, exact answers: Mississippi; the Mississippi; the Mississippi River; Mississippi River; mississippi.

    None of the following are correct exact answers: “At 2,348 miles the Mississippi River is the longest river in the US.”; “2,348 miles; Mississippi”; “Missipp”; “Missouri”.

RANLP 2005 - Bernardo Magnini


Assessments

  • Four possible judgments for a triple

    [ Question, document, answer ]

  • Right: the answer is appropriate for the question

  • Inexact: used for incomplete answers

  • Unsupported: answers without justification

  • Wrong: the answer is not appropriate for the question

RANLP 2005 - Bernardo Magnini


Example judgments (question → judgment, question id, document id, answer; R=Right, X=ineXact, U=Unsupported, W=Wrong):

What is the capital city of New Zealand? → R 1530 XIE19990325.0298 Wellington
What is the Boston Strangler's name? → R 1490 NYT20000913.0267 Albert DeSalvo
What is the world's second largest island? → R 1503 XIE19991018.0249 New Guinea
What year did Wilt Chamberlain score 100 points? → U 1402 NYT19981017.0283 1962
Who is the governor of Tennessee? → R 1426 NYT19981030.0149 Sundquist
What's the name of King Arthur's sword? → U 1506 NYT19980618.0245 Excalibur
When did Einstein die? → R 1601 NYT19990315.0374 April 18, 1955
What was the name of the plane that dropped the Atomic Bomb on Hiroshima? → X 1848 NYT19991001.0143 Enola
What was the name of FDR's dog? → R 1838 NYT20000412.0164 Fala
What day did Neil Armstrong land on the moon? → R 1674 APW19990717.0042 July 20, 1969
Who was the first Triple Crown Winner? → X 1716 NYT19980605.0423 Barton
When was Lyndon B. Johnson born? → R 1473 APW19990826.0055 1908
Who was Woodrow Wilson's First Lady? → R 1622 NYT19980903.0086 Ellen
Where is Anne Frank's diary? → W 1510 NYT19980909.0338 Young Girl

RANLP 2005 - Bernardo Magnini


1402: What year did Wilt Chamberlain score 100 points?

DIOGENE: 1962

ASSESSMENT: UNSUPPORTED

PARAGRAPH: NYT19981017.0283

Petty's 200 victories, 172 of which came during a 13-year

span between 1962-75, may be as unapproachable as Joe DiMaggio's

56-game hitting streak or Wilt Chamberlain's 100-point game.

RANLP 2005 - Bernardo Magnini


1506: What's the name of King Arthur's sword?

ANSWER: Excalibur

PARAGRAPH: NYT19980618.0245

ASSESSMENT: UNSUPPORTED

`QUEST FOR CAMELOT,' with the voices of Andrea Carr, Gabriel Byrne, Cary Elwes, John Gielgud, Jessalyn Gilsig, Eric Idle, Gary Oldman, Bronson Pinchot, Don Rickles and Bryan White. Directed by Frederik Du Chau (G, 100 minutes). Warner Brothers' shaky entrance into the Disney-dominated sweepstakes of the musicalized animated feature wants to be a juvenile feminist ``Lion King'' with a musical heart that fuses ``Riverdance'' with formulaic Hollywood gush. But its characters are too wishy-washy and visually unfocused to be compelling, and the songs (by David Foster and Carole Bayer Sager) so forgettable as to be extraneous. In this variation on the Arthurian legend, a nondescript Celtic farm girl named Kayley with aspirations to be a knight wrests the magic sword Excalibur from the evil would-be emperor Ruber (a Hulk Hogan look-alike) and saves the kingdom (Holden).

RANLP 2005 - Bernardo Magnini


1848: What was the name of the plane that dropped the Atomic Bomb on Hiroshima?

DIOGENE: Enola

PARAGRAPH: NYT19991001.0143

ASSESSMENT: INEXACT

Tibbets piloted the Boeing B-29 Superfortress Enola Gay,

which dropped the atomic bomb on Hiroshima on Aug. 6, 1945, causing an estimated 66,000 to 240,000 deaths. He named the plane after his mother, Enola Gay Tibbets.

RANLP 2005 - Bernardo Magnini


1716: Who was the first Triple Crown Winner?

DIOGENE: Barton

PARAGRAPH: NYT19980605.0423

ASSESSMENT: INEXACT

Not all of the Triple Crown winners were immortals. The first, Sir Barton, lost six races in 1918 before his first victory, just as Real Quiet lost six in a row last year. Try to find Omaha and Whirlaway on anybody's list of all-time greats.

RANLP 2005 - Bernardo Magnini


1510: Where is Anne Frank's diary?

DIOGENE: Young Girl

PARAGRAPH: NYT19980909.0338

ASSESSMENT: WRONG

Otto Frank released a heavily edited version of “B” for its first

publication as “Anne Frank: Diary of a Young Girl” in 1947.

RANLP 2005 - Bernardo Magnini


TREC Evaluation Metric: Mean Reciprocal Rank (MRR)

  • Reciprocal Rank = inverse of rank at which first correct answer was found:

    [1, 0.5, 0.33, 0.25, 0.2, 0]

  • MRR: average over all questions

  • Strict score: unsupported answers count as incorrect

  • Lenient score: unsupported answers count as correct
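A minimal sketch of how MRR with strict and lenient scoring could be computed; the data layout (one list of judgment codes per question) is a hypothetical one, not TREC's file format:

```python
def mrr(judgments, lenient=False):
    """judgments: one list per question with the judgment of each ranked answer,
    each entry 'R' (right), 'U' (unsupported), 'X' (inexact) or 'W' (wrong)."""
    correct = {'R', 'U'} if lenient else {'R'}
    total = 0.0
    for ranked in judgments:
        for rank, judgment in enumerate(ranked, start=1):
            if judgment in correct:
                total += 1.0 / rank   # reciprocal rank of the first correct answer
                break                 # later correct answers do not count
    return total / len(judgments)

# Two questions, five ranked answers each
runs = [['W', 'R', 'W', 'W', 'W'], ['U', 'W', 'W', 'W', 'W']]
print(mrr(runs))                 # strict:  (0.5 + 0.0) / 2 = 0.25
print(mrr(runs, lenient=True))   # lenient: (0.5 + 1.0) / 2 = 0.75
```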

RANLP 2005 - Bernardo Magnini


TREC Evaluation Metric: Confidence-Weighted Score (CWS)

Questions are ordered by the system's confidence in its answers; then

    CWS = (1/500) · Σ_{i=1..500} (number of correct answers in the first i questions) / i

Example over five questions (C = correct, W = wrong):

System A: C W C C W
    CWS = [ (1/1) + (1/2) + (2/3) + (3/4) + (3/5) ] / 5 = 0.70

System B: W W C C C
    CWS = [ (0/1) + (0/2) + (1/3) + (2/4) + (3/5) ] / 5 = 0.29
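The same computation as a small sketch; the input order is the system's confidence order, and the two five-question systems from the slide are used as examples:

```python
def confidence_weighted_score(judgments):
    """judgments: list of booleans (is the answer correct?), ordered from the
    question the system is most confident about to the least confident one."""
    correct_so_far = 0
    total = 0.0
    for i, is_correct in enumerate(judgments, start=1):
        correct_so_far += is_correct
        total += correct_so_far / i   # precision over the i most confident questions
    return total / len(judgments)

print(round(confidence_weighted_score([True, False, True, True, False]), 2))  # System A: 0.7
print(round(confidence_weighted_score([False, False, True, True, True]), 2))  # System B: 0.29
```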

RANLP 2005 - Bernardo Magnini


Evaluation

[Chart: best-system vs. average score at the first three TREC QA tracks (TREC-8, TREC-9, TREC-10); the values shown are 67%, 66% and 58% for the best systems and 23%, 25% and 24% for the averages.]

  • Best result: 67%

  • Average over 67 runs: 23%
RANLP 2005 - Bernardo Magnini


Main Approaches at TREC

  • Knowledge-Based

  • Web-based

  • Pattern-based

RANLP 2005 - Bernardo Magnini


Knowledge-Based Approach

  • Linguistic-oriented methodology

    • Determine the answer type from question form

    • Retrieve small portions of documents

    • Find entities matching the answer type category in text snippets

  • Majority of systems use a lexicon (usually WordNet)

    • To find answer type

    • To verify that a candidate answer is of the correct type

    • To get definitions

  • Complex architecture...

RANLP 2005 - Bernardo Magnini


Web-Based Approach

[Architecture diagram: the QUESTION goes through the Question Processing Component; the Search Component queries the WEB as well as the TREC corpus and an auxiliary corpus; the Answer Extraction Component produces the ANSWER.]

RANLP 2005 - Bernardo Magnini


Pattern-Based Approach (1/3)

  • Knowledge poor

  • Strategy

    • Search for predefined patterns of textual expressions that may be interpreted as answers to certain question types.

    • The presence of such patterns in answer string candidates may provide evidence of the right answer.

RANLP 2005 - Bernardo Magnini


Pattern-Based Approach (2/3)

  • Conditions

    • Detailed categorization of question types

      • Up to 9 types of the “Who” question; 35 categories in total

    • Significant number of patterns corresponding to each question type

      • Up to 23 patterns for the “Who-Author” type, average of 15

    • Find multiple candidate snippets and check for the presence of patterns (emphasis on recall)

RANLP 2005 - Bernardo Magnini


Pattern-Based Approach (3/3)

  • Example: patterns for definition questions

  • Question: What is A?

    1. <A; is/are; [a/an/the]; X> ...23 correct answers

    2. <A; comma; [a/an/the]; X; [comma/period]>…26 correct answers

    3. <A; [comma]; or; X; [comma]> …12 correct answers

    4. <A; dash; X; [dash]> …9 correct answers

    5. <A; parenthesis; X; parenthesis> …8 correct answers

    6. <A; comma; [also] called; X [comma]> …7 correct answers

    7. <A; is called; X> …3 correct answers

    total: 88 correct answers
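A rough sketch of how such surface patterns could be applied with regular expressions; the regexes and the snippet are illustrative, not the patterns actually used at TREC:

```python
import re

# Hypothetical regex versions of the definition patterns listed above;
# X captures the candidate definition of the term A.
def definition_patterns(a):
    a = re.escape(a)
    return [
        rf"{a}\s+(?:is|are)\s+(?:a|an|the)?\s*(?P<X>[^.,;]+)",   # <A; is/are; [a/an/the]; X>
        rf"{a},\s*(?:a|an|the)?\s*(?P<X>[^.,;]+)[,.]",           # <A; comma; [a/an/the]; X; comma/period>
        rf"{a},?\s+or\s+(?P<X>[^,]+),",                          # <A; [comma]; or; X; [comma]>
        rf"{a}\s*-\s*(?P<X>[^-]+)",                              # <A; dash; X; [dash]>
        rf"{a}\s*\((?P<X>[^)]+)\)",                              # <A; parenthesis; X; parenthesis>
        rf"{a},\s*(?:also\s+)?called\s+(?P<X>[^,]+)",            # <A; comma; [also] called; X>
        rf"{a}\s+is\s+called\s+(?P<X>[^.,;]+)",                  # <A; is called; X>
    ]

snippet = "An atom (the smallest unit of matter) is composed of protons, neutrons and electrons."
for pattern in definition_patterns("atom"):
    m = re.search(pattern, snippet, re.IGNORECASE)
    if m:
        print(m.group("X").strip())   # prints: the smallest unit of matter
```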

RANLP 2005 - Bernardo Magnini


Use of Answer Patterns

  • For generating queries to the search engine.

    How did Mahatma Gandhi die?

    Mahatma Gandhi die <HOW>

    Mahatma Gandhi die of <HOW>

    Mahatma Gandhi lost his life in <WHAT>

    The TEXTMAP system (ISI) uses 550 patterns, grouped in 105 equivalence blocks. On TREC-2003 questions, the system produced, on average, 5 reformulations for each question.

  • For answer extraction

    When was Mozart born?

    P=1 <PERSON> (<BIRTHDATE> - DATE)

    P=.69 <PERSON> was born on <BIRTHDATE>
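A toy sketch of the query-generation side: a couple of hypothetical reformulation templates are matched against the question and each match is expanded into the corresponding search-engine queries (real systems such as TEXTMAP use hundreds of patterns):

```python
import re

# Hypothetical reformulation templates: question regex -> query rewritings
REFORMULATIONS = {
    r"how did (?P<X>.+) die\?": ["{X} die <HOW>", "{X} die of <HOW>", "{X} lost his life in <WHAT>"],
    r"when was (?P<X>.+) born\?": ["{X} (<BIRTHDATE> -", "{X} was born on <BIRTHDATE>"],
}

def reformulate(question):
    for template, rewritings in REFORMULATIONS.items():
        m = re.match(template, question, re.IGNORECASE)
        if m:
            return [r.format(X=m.group("X")) for r in rewritings]
    return [question]   # fall back to the original question

print(reformulate("How did Mahatma Gandhi die?"))
# ['Mahatma Gandhi die <HOW>', 'Mahatma Gandhi die of <HOW>', 'Mahatma Gandhi lost his life in <WHAT>']
```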

RANLP 2005 - Bernardo Magnini


Acquisition of Answer Patterns

Relevant approaches:

  • Manually developed surface pattern library (Soubbotin and Soubbotin, 2001)

  • Automatically extracted surface patterns (Ravichandran, Hovy 2002)

Pattern learning:

  • Start with a seed, e.g. (Mozart, 1756)

  • Download Web documents using a search engine

  • Retain sentences that contain both question and answer terms

  • Construct a suffix tree for extracting the longest matching substring that spans <Question> and <Answer>

  • Calculate precision of patterns

    Precision = # of correct patterns with correct answer / # of total patterns
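An offline sketch of this learning loop: instead of downloading Web documents it uses a tiny hand-made sentence list, and the suffix-tree step is replaced by simply keeping the substring that spans the two anchors. The precision step (checking how often each learned pattern yields the correct answer for held-out question/answer pairs) is left out:

```python
from collections import Counter

seed_question, seed_answer = "Mozart", "1756"   # the seed pair from the slide
sentences = [                                   # stand-in for retrieved Web sentences
    "Mozart (1756 - 1791) was a prolific composer.",
    "Wolfgang Amadeus Mozart was born in 1756 in Salzburg.",
    "Mozart died in 1791.",
]

patterns = Counter()
for s in sentences:
    if seed_question in s and seed_answer in s:
        # keep the substring spanning both anchors and generalise them
        start = min(s.find(seed_question), s.find(seed_answer))
        end = max(s.find(seed_question) + len(seed_question),
                  s.find(seed_answer) + len(seed_answer))
        span = s[start:end]
        patterns[span.replace(seed_question, "<NAME>").replace(seed_answer, "<ANSWER>")] += 1

for pattern, freq in patterns.items():
    print(freq, pattern)
# 1 <NAME> (<ANSWER>
# 1 <NAME> was born in <ANSWER>
```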

RANLP 2005 - Bernardo Magnini


Capturing Variability with Patterns

  • Pattern based QA is more effective when supported by variable typing obtained using NLP techniques and resources.

    When was <A> born?

    <A:PERSON> ( <ANSWER:DATE> -

    <A:PERSON> was born in <ANSWER:DATE>

  • Surface patterns cannot deal with word reordering and appositive phrases: Galileo, the famous astronomer, was born in …

  • The fact that most QA systems use syntactic parsing shows that a successful solution to the answer extraction problem must go beyond surface-form analysis

RANLP 2005 - Bernardo Magnini


Syntactic Answer Patterns (1)

Answer patterns that capture the syntactic

relations of a sentence.

When was <A> invented?

RANLP 2005 - Bernardo Magnini


Syntactic Answer Patterns (2)

The matching phase turns out to be a problem

of partial match among syntactic trees.

[Parse tree: (S (NP The first phonograph) (VP was invented (PP in 1877)))]
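One toy way to make the partial-match idea concrete: trees as nested tuples, and a pattern tree with variable leaves (such as <A> and <ANSWER:DATE>) that matches if its structure is embedded in the sentence tree. This is only an illustration of the matching problem, not the algorithm of any particular system:

```python
# A pattern node matches a sentence node if the labels agree and every pattern
# child partially matches some child of the sentence node; leaves starting with
# "<" are variables and match anything (including whole constituents).
def matches(pattern, tree):
    if isinstance(pattern, str):                        # leaf: variable or literal word
        return pattern.startswith("<") or pattern == tree
    if isinstance(tree, str) or pattern[0] != tree[0]:  # labels must agree
        return False
    return all(any(matches(pc, tc) for tc in tree[1:]) for pc in pattern[1:])

sentence = ("S", ("NP", "The", "first", "phonograph"),
                 ("VP", "was", "invented", ("PP", "in", "1877")))
pattern = ("S", ("NP", "<A>"), ("VP", "invented", ("PP", "in", "<ANSWER:DATE>")))
print(matches(pattern, sentence))   # True: "1877" fills the <ANSWER:DATE> slot
```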

RANLP 2005 - Bernardo Magnini


III. System Architecture

  • Knowledge Based approach

    • Question Processing

    • Search component

    • Answer Extraction

RANLP 2005 - Bernardo Magnini


Knowledge-Based QA

[Architecture diagram]

Question Processing Component:
    QUESTION → TOKENIZATION & POS TAGGING → MULTIWORDS RECOGNITION → QUESTION PARSING → WORD SENSE DISAMBIGUATION → ANSWER TYPE IDENTIFICATION → KEYWORDS EXPANSION

Search Component:
    QUERY COMPOSITION → SEARCH ENGINE (over the document collection) → PARAGRAPH FILTERING

Answer Extraction Component:
    NAMED ENTITIES RECOGNITION → ANSWER IDENTIFICATION → ANSWER VALIDATION → ANSWER

RANLP 2005 - Bernardo Magnini


Question Analysis (1)

  • Input: natural language question

  • Output:

    • query for the search engine (i.e. a boolean composition of weighted keywords)

    • Answer type

    • Additional constraints: question focus, syntactic or semantic relations that should hold for a candidate answer entity and other entities

RANLP 2005 - Bernardo Magnini


Question Analysis (2)

  • Steps:

    • Tokenization

    • POS-tagging

    • Multi-words recognition

    • Parsing

    • Answer type and focus identification

    • Keyword extraction

    • Word Sense Disambiguation

    • Expansions

RANLP 2005 - Bernardo Magnini


Tokenization and POS-tagging

NL-QUESTION: Who was the inventor of the electric light?

Who        Who        CCHI  [0,0]
was        be         VIY   [1,1]
the        det        RS    [2,2]
inventor   inventor   SS    [3,3]
of         of         ES    [4,4]
the        det        RS    [5,5]
electric   electric   AS    [6,6]
light      light      SS    [7,7]
?          ?          XPS   [8,8]

RANLP 2005 - Bernardo Magnini


Multi-Words Recognition

NL-QUESTION: Who was the inventor of the electric light?

Who             Who             CCHI  [0,0]
was             be              VIY   [1,1]
the             det             RS    [2,2]
inventor        inventor        SS    [3,3]
of              of              ES    [4,4]
the             det             RS    [5,5]
electric_light  electric_light  SS    [6,7]
?               ?               XPS   [8,8]

RANLP 2005 - Bernardo Magnini


Syntactic Parsing

  • Identify syntactic structure of a sentence

    • noun phrases (NP), verb phrases (VP), prepositional phrases (PP) etc.

Why did David Koresh ask the FBI for a word processor?

[Parse tree: (SBARQ (WHADVP Why/WRB) (SQ did/VBD (NP David/NNP Koresh/NNP) (VP ask/VB (NP the/DT FBI/NNP) (PP for/IN (NP a/DT word/NN processor/NN)))))]

RANLP 2005 - Bernardo Magnini


Answer Type and Focus

  • Focus is the word that expresses the relevant entity in the question

    • Used to select a set of relevant documents

    • E.g.: Where was Mozart born?

  • Answer Type is the category of the entity to be searched for as the answer

    • PERSON, MEASURE, TIME PERIOD, DATE, ORGANIZATION, DEFINITION

    • E.g.: Where was Mozart born?

      • LOCATION

RANLP 2005 - Bernardo Magnini


Answer Type and Focus

What famous communist leader died in Mexico City?

RULENAME: WHAT-WHO

TEST: [“what” [¬ NOUN]* [NOUN:person-p]J +]

OUTPUT: [“PERSON” J]

Answer type: PERSON

Focus: leader

This rule matches any question starting with what, whose first noun, if any, is a person (i.e. satisfies the person-p predicate)
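A toy version of this kind of rule in code; person_p here is a stand-in for a WordNet-based test, and the tag set is simplified:

```python
# Hypothetical word list standing in for a WordNet-based person-p predicate
PERSON_NOUNS = {"leader", "president", "actor", "author", "inventor"}

def person_p(noun):
    return noun in PERSON_NOUNS

def what_who_rule(tagged_question):
    """tagged_question: list of (token, pos) pairs; returns (answer type, focus)
    if the WHAT-WHO rule fires, None otherwise."""
    if tagged_question and tagged_question[0][0].lower() == "what":
        for token, pos in tagged_question[1:]:
            if pos == "NOUN":   # the first noun after "what" decides
                return ("PERSON", token) if person_p(token) else None
    return None

q = [("What", "WH"), ("famous", "ADJ"), ("communist", "ADJ"), ("leader", "NOUN"),
     ("died", "VERB"), ("in", "PREP"), ("Mexico", "NOUN"), ("City", "NOUN"), ("?", "PUNCT")]
print(what_who_rule(q))   # ('PERSON', 'leader')
```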

RANLP 2005 - Bernardo Magnini


Keywords Extraction

NL-QUESTION: Who was the inventor of the electric light?

Who             Who             CCHI  [0,0]
was             be              VIY   [1,1]
the             det             RS    [2,2]
inventor        inventor        SS    [3,3]
of              of              ES    [4,4]
the             det             RS    [5,5]
electric_light  electric_light  SS    [6,7]
?               ?               XPS   [8,8]

Extracted keywords: inventor, electric_light

RANLP 2005 - Bernardo Magnini


Word Sense Disambiguation

What is the brightest star visible from Earth?

STAR     star#1: celestial body (ASTRONOMY)
         star#2: an actor who plays … (ART)

BRIGHT   bright#1: bright, brilliant, shining (PHYSICS)
         bright#2: popular, glorious (GENERIC)
         bright#3: promising, auspicious (GENERIC)

VISIBLE  visible#1: conspicuous, obvious (PHYSICS)
         visible#2: visible, seeable (ASTRONOMY)

EARTH    earth#1: Earth, world, globe (ASTRONOMY)
         earth#2: estate, land, landed_estate, acres (ECONOMY)
         earth#3: clay (GEOLOGY)
         earth#4: dry_land, earth, solid_ground (GEOGRAPHY)
         earth#5: land, ground, soil (GEOGRAPHY)
         earth#6: earth, ground (GEOLOGY)

RANLP 2005 - Bernardo Magnini


Expansions

- NL-QUESTION: Who was the inventor of the electric light?

- BASIC-KEYWORDS: inventor electric_light

inventor
    synonyms: discoverer, artificer
    derivation: invention
        synonyms: innovation
    derivation: invent
        synonyms: excogitate

electric_light
    synonyms: incandescent_lamp, light_bulb

RANLP 2005 - Bernardo Magnini


Keyword Composition

  • Keywords and expansions are composed in a boolean expression with AND/OR operators

  • Several possibilities:

    • AND composition

    • Cartesian composition

(   (inventor AND electric_light)
 OR (inventor AND incandescent_lamp)
 OR (discoverer AND electric_light)
 OR …
 OR inventor OR electric_light )
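A small sketch of the Cartesian composition: every combination of one expansion per keyword becomes an AND clause, and the clauses (plus the bare keywords as a fallback) are OR-ed together:

```python
from itertools import product

# keyword -> its expansions (the keyword itself, synonyms, derivations, ...)
expansions = {
    "inventor": ["inventor", "discoverer", "artificer"],
    "electric_light": ["electric_light", "incandescent_lamp", "light_bulb"],
}

def compose_query(expansions):
    and_clauses = [" AND ".join(combo) for combo in product(*expansions.values())]
    fallback = " OR ".join(expansions.keys())
    return "(" + " OR ".join(f"({c})" for c in and_clauses) + " OR " + fallback + ")"

print(compose_query(expansions))
# ((inventor AND electric_light) OR (inventor AND incandescent_lamp) OR ... OR inventor OR electric_light)
```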

RANLP 2005 - Bernardo Magnini


Document Collection Pre-processing

  • For real-time QA applications, off-line pre-processing of the text is necessary:

    • Term indexing

    • POS-tagging

    • Named Entities Recognition

RANLP 2005 - Bernardo Magnini


Candidate Answer Document Selection

  • Passage selection: identify small, relevant text portions

  • Given a document and a list of keywords:

    • Paragraph length (e.g. 200 words)

    • Consider the percentage of keywords present in the passage

    • Consider if some keyword is obligatory (e.g. the focus of the question).
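A sketch of such a passage-selection step; the window size, overlap and coverage threshold are illustrative assumptions:

```python
# Slide a fixed-size window over the document, score it by the fraction of query
# keywords it contains, and discard windows missing an obligatory keyword.
def select_passages(doc_tokens, keywords, obligatory=None, size=200, min_coverage=0.5):
    keywords = {k.lower() for k in keywords}
    passages = []
    for start in range(0, max(1, len(doc_tokens) - size + 1), max(1, size // 2)):
        window = [t.lower() for t in doc_tokens[start:start + size]]
        if obligatory and obligatory.lower() not in window:
            continue
        coverage = len(keywords & set(window)) / len(keywords)
        if coverage >= min_coverage:
            passages.append((coverage, start, " ".join(doc_tokens[start:start + size])))
    return sorted(passages, reverse=True)

text = ("Under Mount Etna , the highest volcano in Europe , "
        "perches the fabulous town of Taormina .").split()
print(select_passages(text, ["highest", "volcano", "Europe"], obligatory="volcano", size=20)[:1])
```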

RANLP 2005 - Bernardo Magnini


Candidate Answer Document Analysis

  • Passage text tagging

  • Named Entity Recognition

    Who is the author of the “Star Spangled Banner”?

    …<PERSON>Francis Scott Key </PERSON> wrote the “Star Spangled Banner” in <DATE>1814</DATE>

  • Some systems:

    • passage parsing (Harabagiu, 2001)

    • Logical form (Zajac, 2001)

RANLP 2005 - Bernardo Magnini


Answer Extraction (1)

  • Who is the author of the “Star Spangled Banner”?

    …<PERSON>Francis Scott Key </PERSON> wrote the “Star Spangled Banner” in <DATE>1814</DATE>

    Answer Type = PERSON

    Candidate Answer = Francis Scott Key

    Ranking candidate answers: keyword density in the passage, apply additional constraints (e.g. syntax, semantics), rank candidates using the Web
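A rough sketch of the keyword-density part of that ranking; candidates are assumed to be already filtered by answer type, and the positions and window size are illustrative:

```python
# Score each candidate by the fraction of question keywords found in a window of
# tokens around it, then sort best-first.
def rank_candidates(passage_tokens, candidates, keywords, window=6):
    keywords = {k.lower() for k in keywords}
    scored = []
    for cand, position in candidates:               # (candidate string, token index)
        lo, hi = max(0, position - window), position + window
        context = {t.lower() for t in passage_tokens[lo:hi]}
        scored.append((len(keywords & context) / len(keywords), cand))
    return sorted(scored, reverse=True)

tokens = ("Comedian Roseanne Barr sang the national anthem . "
          "Francis Scott Key wrote the Star Spangled Banner in 1814").split()
candidates = [("Roseanne Barr", 1), ("Francis Scott Key", 8)]
print(rank_candidates(tokens, candidates, ["wrote", "Star", "Spangled", "Banner"]))
# [(0.5, 'Francis Scott Key'), (0.0, 'Roseanne Barr')]
```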

RANLP 2005 - Bernardo Magnini


Answer Identification

[Figure: the answer “Thomas E. Edison” identified in a retrieved passage]

RANLP 2005 - Bernardo Magnini


IV. Answer Validation

  • Automatic answer validation

  • Approach:

    • web-based

    • use of patterns

    • combine statistics and linguistic information

  • Discussion

  • Conclusions

RANLP 2005 - Bernardo Magnini


QA Architecture

[Architecture diagram]

Question Processing Component:
    QUESTION → TOKENIZATION & POS TAGGING → QUESTION PARSING → WORD SENSE DISAMBIGUATION → ANSWER TYPE IDENTIFICATION → KEYWORDS EXPANSION

Search Component:
    QUERY COMPOSITION → SEARCH ENGINE (over the document collection) → PARAGRAPH FILTERING

Answer Extraction Component:
    NAMED ENTITIES RECOGNITION → ANSWER IDENTIFICATION → ANSWER RANKING → ANSWER

RANLP 2005 - Bernardo Magnini


The Problem: Answer Validation

Given a question q and a candidate answer a, decide if a is a correct answer for q

What is the capital of the USA?

Washington D.C.

San Francisco

Rome

RANLP 2005 - Bernardo Magnini


The Problem: Answer Validation

Given a question q and a candidate answer a, decide if a is a correct answer for q

What is the capital of the USA?

Washington D.C. correct

San Francisco wrong

Rome wrong

RANLP 2005 - Bernardo Magnini


Requirements for Automatic AV

  • Accuracy: it has to compare well with respect to human judgments

  • Efficiency: large scale (Web), real time scenarios

  • Simplicity: avoid the complexity of QA systems

RANLP 2005 - Bernardo Magnini


Approach

  • Web-based

    • take advantage of Web redundancy

  • Pattern-based

    • the Web is mined using patterns (i.e. validation patterns) extracted from the question and the candidate answer

  • Quantitative (as opposed to content-based)

    • check if the question and the answer tend to appear together in the Web considering the number of documents returned (i.e. documents are not downloaded)

RANLP 2005 - Bernardo Magnini


Web Redundancy

What is the capital of the USA?

Washington

RANLP 2005 - Bernardo Magnini


Validation Pattern

[capital NEAR USA NEAR Washington]

RANLP 2005 - Bernardo Magnini


Related Work

  • Pattern-based QA

    • Brill, 2001 – TREC-10

    • Soubbotin, 2001 – TREC-10

    • Ravichandran and Hovy, ACL-02

  • Use of the Web for QA

    • Clarke et al. 2001 – TREC-10

    • Radev, et al. 2001 - CIKM

  • Statistical approach on the Web

    • PMI-IR: Turney, 2001 and ACL-02

RANLP 2005 - Bernardo Magnini


Architecture

[Diagram: the question and the candidate answer are combined into a validation pattern, which is submitted to the Web; the number of returned documents (#doc) is counted. If #doc < k the candidate is filtered out; otherwise an answer validity score is computed and compared with a threshold t: above t the answer is accepted as correct, below t it is rejected as wrong.]

RANLP 2005 - Bernardo Magnini




Extracting Validation Patterns

[Diagram: the question goes through answer type identification, a stop-word filter and term expansion to produce the question pattern (Qp); the candidate answer goes through named entity recognition and a stop-word filter to produce the answer pattern (Ap).]

RANLP 2005 - Bernardo Magnini




Answer Validity Score

  • PMI-IR algorithm (Turney, 2001):

    PMI(Qp, Ap) = P(Qp, Ap) / ( P(Qp) * P(Ap) )

  • The result is interpreted as evidence that the validation pattern is consistent, which implies answer accuracy

RANLP 2005 - Bernardo Magnini


Answer Validity Score

    PMI(Qp, Ap) = hits(Qp NEAR Ap) / ( hits(Qp) * hits(Ap) )

  • Three searches are submitted to the Web:

    hits(Qp)
    hits(Ap)
    hits(Qp NEAR Ap)

RANLP 2005 - Bernardo Magnini


Example

What county is Modesto, California in?

    stop-word filter → Answer type: LOCATION
    Qp = [county NEAR Modesto NEAR California]
    hits(Qp) = 909, so P(Qp) = P(county, Modesto, California) = 909 / (3 * 10^8)

Candidate answers:
    A1 = "The Stanislaus County district attorney’s"
    A2 = "In Modesto, San Francisco, and"

RANLP 2005 - Bernardo Magnini


Example (cont.)

    A1 = "The Stanislaus County district attorney’s"  → NER(location) → A1p = [Stanislaus]
    A2 = "In Modesto, San Francisco, and"             → NER(location) → A2p = [San Francisco]

    hits(A1p) = 73641,   so P(Stanislaus)    = 73641 / (3 * 10^8)
    hits(A2p) = 4072519, so P(San Francisco) = 4072519 / (3 * 10^8)

RANLP 2005 - Bernardo Magnini


Example (cont.)

    P(Qp, A1p) = 552 / (3 * 10^8)
    P(Qp, A2p) = 11 / (3 * 10^8)

    PMI(Qp, A1p) = 2473
    PMI(Qp, A2p) = 0.89

    With the relative threshold t = 0.2 * MAX(AVS):
    A1p (Stanislaus) scores above t → correct answer
    A2p (San Francisco) scores below t → wrong answer

RANLP 2005 - Bernardo Magnini
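A minimal sketch of this validation computation; hits() is a stub hard-wired with the counts from the Modesto example above, standing in for real search-engine queries:

```python
# Stubbed document counts (a real system would query a Web search engine here)
STUB_HITS = {
    ("county", "Modesto", "California"): 909,
    ("Stanislaus",): 73641,
    ("San Francisco",): 4072519,
    ("county", "Modesto", "California", "Stanislaus"): 552,
    ("county", "Modesto", "California", "San Francisco"): 11,
}

def hits(*terms):
    return STUB_HITS.get(terms, 0)

def pmi(qp, ap, total_docs=3e8):   # total_docs: assumed size of the Web index
    joint = hits(*qp, *ap) / total_docs
    return joint / ((hits(*qp) / total_docs) * (hits(*ap) / total_docs))

qp = ("county", "Modesto", "California")
print(round(pmi(qp, ("Stanislaus",))))        # ~2474 (the slide reports 2473)
print(round(pmi(qp, ("San Francisco",)), 2))  # 0.89
```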


Experiments

  • Data set:

    • 492 TREC-2001 questions

    • 2726 answers: 3 correct answers and 3 wrong answers for each question, randomly selected from TREC-10 participants human-judged corpus

  • Search engine: Altavista

    • allows the NEAR operator

RANLP 2005 - Bernardo Magnini


Experiment: Answers

Q-916: What river in the US is known as the Big Muddy?

Candidate answer strings from the judged TREC-10 runs:

  • The Mississippi
  • Known as Big Muddy, the Mississippi is the longest
  • as Big Muddy, the Mississippi is the longest
  • messed with. Known as Big Muddy, the Mississip
  • Mississippi is the longest river in the US
  • the Mississippi is the longest river(Mississippi)
  • has brought the Mississippi to its lowest
  • ipes.In Life on the Mississippi,Mark Twain wrote t
  • Southeast;Mississippi;Mark Twain; officials began
  • Known; Mississippi; US; Minnesota; Gulf Mexico
  • Mud Island,;Mississippi;”The;--history,;Memphis

RANLP 2005 - Bernardo Magnini


Baseline

  • Consider the documents provided by NIST to TREC-10 participants (1000 documents for each question)

  • If the candidate answer occurs (i.e. string match) at least once in the top 10 documents, it is judged correct; otherwise it is considered wrong
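This baseline is a few lines of code (a sketch, with an illustrative document list):

```python
# An answer counts as correct if its string occurs in any of the top-k documents
# retrieved for the question (k = 10 in the baseline above).
def baseline_judgment(candidate_answer, top_documents, k=10):
    needle = candidate_answer.lower()
    return any(needle in doc.lower() for doc in top_documents[:k])

docs = ["Known as Big Muddy, the Mississippi is the longest river in the US.", "..."]
print(baseline_judgment("the Mississippi", docs))   # True
```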

RANLP 2005 - Bernardo Magnini


Asymmetrical Measures

  • Problem: some candidate answers (e.g. numbers) produce an enormous amount of Web documents

  • Scores for good (Ac) and bad (Aw) answers then tend to be similar, i.e. PMI(q, Ac) ≈ PMI(q, Aw), making the choice more difficult

    How many Great Lakes are there?

    … to cross all five Great Lakes completed a 19.2 …

RANLP 2005 - Bernardo Magnini


Asymmetric Conditional Probability (ACP)

    ACP(Qsp, Asp) = P(Qsp | Asp) / ( P(Qsp) * P(Asp)^(2/3) )

                  = hits(Qsp NEAR Asp) / ( hits(Qsp) * hits(Asp)^(2/3) )

RANLP 2005 - Bernardo Magnini


Comparing PMI and ACP

ACP increases the difference between the right and the wrong answer.

RANLP 2005 - Bernardo Magnini


Results

[Table: success rate (SR) of the validation measures, reported on all 492 TREC-2001 questions and on the 249 factoid questions.]

RANLP 2005 - Bernardo Magnini


Discussion (1)

  • Definition questions are the most problematic

    • on the subset of 249 named-entity questions the success rate is higher (i.e. 86.3%)

  • A relative threshold improves performance (+2%) over a fixed threshold

  • Non-symmetric measures of co-occurrence work better for answer validation (+2%)

  • Source of errors:

    • Answer type recognition

    • Named-entities recognition

    • TREC answer set (e.g. tokenization)

RANLP 2005 - Bernardo Magnini


Discussion (2)

  • Automatic answer validation is a key challenge for Web-based question answering systems

  • Requirements:

    • accuracy with respect to human judgments: 80% success rate is a good starting point

    • efficiency: documents are not downloaded

    • simplicity: based on patterns

  • At present, it is suitable for a generate-and-test component integrated into a QA system

RANLP 2005 - Bernardo Magnini


V. Cross-Language QA

RANLP 2005 - Bernardo Magnini


Motivations

  • Answers may be found in languages different from the language of the question.

  • Interest in QA systems for languages other than English.

  • Force the QA community to design real multilingual systems.

  • Check/improve the portability of the technologies implemented in current English QA systems.

RANLP 2005 - Bernardo Magnini


Cross-Language QA

Quanto è alto il Mont Ventoux?

(How tall is Mont Ventoux?)

“Le Mont Ventoux, impérial avec ses 1909 mètres et sa tour blanche telle un étendard, règne de toutes …”

1909 metri

Italian corpus

French corpus

English corpus

Spanish corpus

RANLP 2005 - Bernardo Magnini


CL-QA at CLEF

  • Adopt the same rules used at TREC QA

    • Factoid questions (i.e. no definition questions)

    • Exact answers + document id

  • Use the CLEF corpora (news, 1994 -1995)

  • Return the answer in the language of the text collection in which it has been found (i.e. no translation of the answer)

  • QA-CLEF-2003 was an initial step toward a more complex task organized at CLEF-2004 and 2005.

RANLP 2005 - Bernardo Magnini


QA @ CLEF 2004 (http://clef-qa.itc.it/2004)

  • Seven groups coordinated the QA track:

  • ITC-irst (IT and EN test set preparation)

  • DFKI (DE)

  • ELDA/ELRA (FR)

  • Linguateca (PT)

  • UNED (ES)

  • U. Amsterdam (NL)

  • U. Limerick (EN assessment)

  • Two more groups participated in the test set construction:

  • Bulgarian Academy of Sciences (BG)

  • U. Helsinki (FI)

RANLP 2005 - Bernardo Magnini


CLEF QA - Overview

[Workflow diagram:]

  1. Question generation (2.5 person/months per group): 100 monolingual Q&A pairs per language, with EN translations, against the respective document collections (IT, FR, NL, ES, …).
  2. Translation EN => 7 languages: 700 Q&A pairs in one language + EN, collected in Multieight-04, an XML collection of 700 Q&A pairs in 8 languages.
  3. Selection of additional 80 + 20 questions and extraction of the plain-text test sets.
  4. Exercise (10-23/5): experiments (1 week window), systems' answers.
  5. Manual assessment and evaluation (2 person/days for 1 run).

RANLP 2005 - Bernardo Magnini


CLEF QA – Task Definition

Given 200 questions in a source language, find one exact answer per question in a collection of documents written in a target language, and provide a justification for each retrieved answer (i.e. the docid of the unique document that supports the answer).

6 monolingual and 50 bilingual source-target tasks were set up; teams participated in 19 tasks.

RANLP 2005 - Bernardo Magnini


CLEF QA - Questions

  • All the test sets were made up of 200 questions:

  • ~90% factoid questions

  • ~10% definition questions

  • ~10% of the questions did not have any answer in the corpora (right answer-string was “NIL”)

  • Problems in introducing definition questions:

  • What’s the right answer? (it depends on the user’s model)

  • What’s the easiest and more efficient way to assess their answers?

  • Overlap with factoid questions:

    F: Who is the Pope?      → John Paul II
    D: Who is John Paul II?  → the Pope / the head of the Roman Catholic Church

RANLP 2005 - Bernardo Magnini


CLEF QA – Multieight

<q cnt="0675" category="F" answer_type="MANNER">

<language val="BG" original="FALSE">

<question group="BTB">Как умира Пазолини?</question>

<answer n="1" docid="">TRANSLATION[убит]</answer>

</language>

<language val="DE" original="FALSE">

<question group="DFKI">Auf welche Art starb Pasolini?</question>

<answer n="1" docid="">TRANSLATION[ermordet]</answer>

<answer n="2" docid="SDA.951005.0154">ermordet</answer>

</language>

<language val="EN" original="FALSE">

<question group="LING">How did Pasolini die?</question>

<answer n="1" docid="">TRANSLATION[murdered]</answer>

<answer n="2" docid="LA112794-0003">murdered</answer>

</language>

<language val="ES" original="FALSE">

<question group="UNED">¿Cómo murió Pasolini?</question>

<answer n="1" docid="">TRANSLATION[Asesinado]</answer>

<answer n="2" docid="EFE19950724-14869">Brutalmente asesinado en los arrabales de Ostia</answer>

</language>

<language val="FR" original="FALSE">

  <question group="ELDA">Comment est mort Pasolini ?</question>

  <answer n="1" docid="">TRANSLATION[assassiné]</answer>

  <answer n="2" docid="ATS.951101.0082">assassiné</answer>

  <answer n="3" docid="ATS.950904.0066">assassiné en novembre 1975 dans des circonstances mystérieuses</answer>

  <answer n="4" docid="ATS.951031.0099">assassiné il y a 20 ans</answer>

</language>

<language val="IT" original="FALSE">

  <question group="IRST">Come è morto Pasolini?</question>

  <answer n="1" docid="">TRANSLATION[assassinato]</answer>

  <answer n="2" docid="AGZ.951102.0145">massacrato e abbandonato sulla spiaggia di Ostia</answer>

  </language>

<language val="NL" original="FALSE">

  <question group="UoA">Hoe stierf Pasolini?</question>

  <answer n="1" docid="">TRANSLATION[vermoord]</answer>

  <answer n="2" docid="NH19951102-0080">vermoord</answer>

  </language>

<language val="PT" original="TRUE">

  <question group="LING">Como morreu Pasolini?</question>

  <answer n="1" docid="LING-951120-088">assassinado</answer>

  </language>

</q>

RANLP 2005 - Bernardo Magnini


CLEF QA - Assessment

  • Judgments taken from the TREC QA tracks:

  • Right

  • Wrong

  • ineXact

  • Unsupported

  • Other criteria, such as the length of the answer-strings (instead of X, which is underspecified) or the usefulness of responses for a potential user, have not been considered.

  • Main evaluation measure was accuracy (fraction of Right responses).

  • Whenever possible, a Confidence-Weighted Score was calculated:

    CWS = (1/Q) * Σ_{i=1..Q} (number of correct responses in first i ranks) / i

RANLP 2005 - Bernardo Magnini


Evaluation Exercise - Participants

Distribution of participating groups in different QA evaluation campaigns.

RANLP 2005 - Bernardo Magnini


Evaluation Exercise - Participants

[Table: number of participating teams / number of submitted runs for each source-target language pair at CLEF 2004.]

Comparability issue.

RANLP 2005 - Bernardo Magnini


Evaluation Exercise - Results

[Chart: systems' performance at the TREC and CLEF QA tracks, plotting best-system and average accuracy (%) for TREC-8 through TREC-12* and for CLEF-2003** and CLEF-2004 (monolingual and bilingual tasks); best-system scores reach 83%, averages go down to 14.7%.]

* considering only the 413 factoid questions
** considering only the answers returned at the first rank

RANLP 2005 - Bernardo Magnini


Evaluation Exercise – CL Approaches

[Diagram: where translation enters the cross-language QA pipeline. Some groups (U. Amsterdam, U. Edinburgh, U. Neuchatel) translate the input question from the source into the target language before question analysis; others (Bulg. Ac. of Sciences, ITC-Irst, U. Limerick, U. Helsinki, DFKI, LIMSI-CNRS) analyse the question in the source language and translate the extracted keywords / retrieved data. The remaining pipeline (candidate document selection, candidate document analysis, answer extraction over the preprocessed document collection) runs in the target language, and the output is produced in the target language.]

RANLP 2005 - Bernardo Magnini


Discussion on Cross-Language QA

  • CLEF multilingual QA track (like TREC QA) represents a formal evaluation, designed with an eye to replicability. As an exercise, it is an abstraction of the real problems.

  • Future challenges:

  • investigate QA in combination with other applications (for instance summarization)

  • access not only free text, but also different sources of data (multimedia, spoken language, imagery)

  • introduce automated evaluation along with judgments given by humans

  • focus on user’s need: develop real-time interactive systems, which means modeling a potential user and defining suitable answer types.

RANLP 2005 - Bernardo Magnini


References

  • Books

  • Pasca, Marius, Open Domain Question Answering from Large Text Collections, CSLI, 2003.

  • Maybury, Mark (Ed.), New Directions in Question Answering, AAAI Press, 2004.

  • Journals

  • Hirschman, Gaizauskas. Natural language question answering: the view from here. JNLE, 7 (4), 2001.

  • TREC

  • E. Voorhees. Overview of the TREC 2001 Question Answering Track.

  • M.M. Soubbotin, S.M. Soubbotin. Patterns of Potential Answer Expressions as Clues to the Right Answers.

  • S. Harabagiu, D. Moldovan, M. Pasca, M. Surdeanu, R. Mihalcea, R. Girju, V. Rus, F. Lacatusu, P. Morarescu, R. Bunescu. Answering Complex, List and Context Questions with LCC's Question-Answering Server.

  • C.L.A. Clarke, G.V. Cormack, T.R. Lynam, C.M. Li, G.L. McLearn. Web Reinforced Question Answering (MultiText Experiments for TREC 2001).

  • E. Brill, J. Lin, M. Banko, S. Dumais, A. Ng. Data-Intensive Question Answering.

RANLP 2005 - Bernardo Magnini


References

  • Workshop Proceedings

  • H. Chen and C.-Y. Lin, editors. 2002. Proceedings of the Workshop on Multilingual Summarization and Question Answering at COLING-02, Taipei, Taiwan.

  • M. de Rijke and B. Webber, editors. 2003. Proceedings of the Workshop on Natural Language Processing for Question Answering at EACL-03, Budapest, Hungary.

  • R. Gaizauskas, M. Hepple, and M. Greenwood, editors. 2004. Proceedings of the Workshop on Information Retrieval for Question Answering at SIGIR-04, Sheffield, United Kingdom.

RANLP 2005 - Bernardo Magnini


References

  • N. Kando and H. Ishikawa, editors. 2004. Working Notes of the 4th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, Question Answering and Summarization (NTCIR-04), Tokyo, Japan.

  • M. Maybury, editor. 2003. Proceedings of the AAAI Spring Symposium on New Directions in Question Answering, Stanford, California.

  • C. Peters and F. Borri, editors. 2004. Working Notes of the 5th Cross-Language Evaluation Forum (CLEF-04), Bath, United Kingdom.

  • J. Pustejovsky, editor. 2002. Final Report of the Workshop on TERQAS: Time and Event Recognition in Question Answering Systems, Bedford, Massachusetts.

  • Y. Ravin, J. Prager and S. Harabagiu, editors. 2001. Proceedings of the Workshop on Open-Domain Question Answering at ACL-01, Toulouse, France.

RANLP 2005 - Bernardo Magnini

