Multi-Perspective Question Answering - PowerPoint PPT Presentation

Presentation Transcript

  1. Multi-Perspective Question Answering ARDA NRRC Summer 2002 Workshop

  2. Participants: Janyce Wiebe, Eric Breck, Chris Buckley, Claire Cardie, Paul Davis, Bruce Fraser, Diane Litman, David Pierce, Ellen Riloff, Theresa Wilson

  3. Problem: Finding and organizing opinions in the world press and other text

  4. Our Work will Support • Finding a range of opinions expressed on a particular topic, event, issue • Clustering opinions and their sources • Attitude (positive, negative, uncertain) • Basis for opinion (supporting beliefs, experiences) • Expressive style (sarcastic, vehement, neutral) • Building perspective profiles of individuals and groups over many documents and topics

  5. Task: Annotation • Manual annotation scheme for linguistic expressions of opinions • Example: “It is heresy,” said Cao. “The ‘Shouters’ claim they are bigger than Jesus.” • Source annotations: (writer,Cao); (writer,Cao,Shouters); (writer,Cao); (writer,Cao)

  6. Task: Annotation • The Foreign Ministry said Thursday that it was “surprised, to put it mildly” by the U.S. State Department’s criticism of Russia’s human rights record and objected in particular to the “odious” section on Chechnya. • Source annotations: (writer,FM); (writer,FM,FM); (writer,FM); (writer,FM,FM,SD); (writer,FM); (writer,FM)

  7. Task: Conceptualization • Various ways perspective is manifested in language • Implications for higher-level tasks

  8. Task: Automate Manual Annotations • Machine learning • Identification of opinionated phrases, sources of opinions, …

  9. Task: Organizing Perspective Segments • Unsupervised clustering • Text features + features from the annotation scheme + higher-level features

  10. Solution Architecture • Annotation Architecture: Annotation Tool • Learning Architecture: Learning Algorithms → Trained Taggers (plus Other Taggers) • Application Architecture: Question → Document Retrieval → Perspective Tagging → Segment Clustering

  11. Evaluation • Exploratory manual clustering • Evaluation of automatic annotations against manual annotations • End-user evaluation of how well the system groups text segments into clusters of similar opinions about a given topic • Development of other end-user evaluation tasks
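One common way to score automatic annotations against manual gold-standard annotations is exact span matching with precision, recall, and F1; a minimal stdlib sketch (this is a standard metric, not necessarily the workshop's own):

```python
def span_f1(gold, predicted):
    """Exact-match precision/recall/F1 between two sets of annotation
    spans (start, end). A sketch of a standard evaluation; the
    workshop's actual metric may differ."""
    gold, predicted = set(gold), set(predicted)
    tp = len(gold & predicted)  # spans found in both sets
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# One of two predicted spans matches one of two gold spans.
p, r, f = span_f1({(215, 228), (10, 25)}, {(215, 228), (40, 52)})
print(p, r, f)  # 0.5 0.5 0.5
```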

  12. Example The Annual Human Rights Report of the US State Department has been strongly criticized and condemned by many countries. Though the report has been made public for 10 days, its contents, which are inaccurate and lacking good will, continue to be commented on by the world media. Many countries in Asia, Europe, Africa, and Latin America have rejected the content of the US Human Rights Report, calling it a brazen distortion of the situation, a wrongful and illegitimate move, and an interference in the internal affairs of other countries. Recently, the Information Office of the Chinese People's Congress released a report on human rights in the United States in 2001, criticizing violations of human rights there. The report quoting data from the Christian Science Monitor, points out that the murder rate in the United States is 5.5 per 100,000 people. In the United States, torture and pressure to confess crime is common. Many people have been sentenced to death for crime they did not commit as a result of an unjust legal system. More than 12 million children are living below the poverty line. According to the report, one American woman is beaten every 15 seconds. Evidence show that human rights violations in the United States have been ignored for many years.

  13. Example (annotated) • Overlay labels from the slide: <writer>:fact; <writer> neg-attitude → the report; <writer>:fact; Many countries in Asia, Europe, Africa, and Latin America neg-attitude → the content of the US Human Rights Report, → it; <writer>:fact; a report on human rights in the United States in 2001 neg-attitude → there; <writer>:fact; the report:fact; <writer>:fact; <writer>:subjective; <writer>:fact; <writer>:fact; the report:fact; <writer>:subjective • Annotated text: the passage from slide 12, repeated with these labels overlaid.

  14. Example • <writer>: subjectivity index 4/10 = 40%; expressive style: medium; neg-attitude (medium) toward <HR report> • <many countries>: subjectivity index 2/2 = 100%; expressive style: extreme; neg-attitude (strong) toward <HR report> • <Chinese HR report>: subjectivity index 1/3 = 33%; expressive style: medium; neg-attitude (medium) toward <USA>
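The subjectivity index on this slide is evidently the fraction of a source's segments that are subjective rather than factual; a minimal sketch of that ratio (tag names here are hypothetical, not the project's):

```python
def subjectivity_index(labels):
    """Fraction of a source's segments labeled subjective.

    `labels` is one tag per segment attributed to the source; any
    tag other than "fact" counts as subjective (hypothetical tags)."""
    if not labels:
        return 0.0
    subjective = sum(1 for tag in labels if tag != "fact")
    return subjective / len(labels)

# <writer> on the example slide: 4 subjective segments out of 10.
writer_labels = ["fact"] * 6 + ["subjective"] * 2 + ["neg-attitude"] * 2
print(subjectivity_index(writer_labels))  # 0.4
```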

  15. Support the following… • Describe the collective perspective w.r.t. issue/object presented in an individual article, across a set of articles, … • Describe the perspective of a particular writer/individual/government/news service w.r.t. issue/object in an individual article, across a set of articles, … • Create a perspective profile for agents, groups, news sources, etc.

  16. Outline • Annotation: Wiebe & Wilson • Conceptualization: Davis • Architecture: Pierce • End-user evaluation: Buckley

  17. Annotation • Find opinions, evaluations, emotions, speculations (private states) expressed in language

  18. Annotation • Explicit mentions of private states and speech events • The United States fears a spill-over from the anti-terrorist campaign • Expressive subjective elements • The part of the US human rights report about China is full of absurdities and fabrications.

  19. Annotation: Nested sources • “The US fears a spill-over,” said Xirao-Nima, a professor of foreign affairs at the central university for nationalities. → (writer, Xirao-Nima, US) • “The report is full of absurdities,” he continued. → (writer, Xirao-Nima)
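Nested source chains like (writer, Xirao-Nima, US) can be built by appending one source per level of embedding; a small sketch (tuple representation assumed for illustration, not the project's actual data structure):

```python
def nest(source_chain, new_source):
    """Extend a nested-source chain when a speech event or private
    state attributes text to a deeper source. Chains mirror the
    slide's notation, e.g. ("writer", "Xirao-Nima")."""
    return source_chain + (new_source,)

# "The US fears a spill-over", said Xirao-Nima.
quoted = nest(("writer",), "Xirao-Nima")  # speech event inside writer
fear = nest(quoted, "US")                 # private state inside the quote
print(fear)  # ('writer', 'Xirao-Nima', 'US')
```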

  20. Annotation • Whether opinions or other private states are expressed in speech • Type of private state (negative evaluation, positive evaluation, …) • Object of positive or negative evaluation • Strengths of expressive elements and private states

  21. Example • “It is heresy,” said Cao. “The ‘Shouters’ claim they are bigger than Jesus.” • Source annotations: (writer,Cao); (writer,Cao,Shouters); (writer,Cao); (writer,Cao)

  22. Example • The Foreign Ministry said Thursday that it was “surprised, to put it mildly” by the U.S. State Department’s criticism of Russia’s human rights record and objected in particular to the “odious” section on Chechnya. • Source annotations: (writer,FM); (writer,FM,FM); (writer,FM); (writer,FM,FM,SD); (writer,FM); (writer,FM)

  23. Accomplishments • Fairly mature annotation scheme and instructions • Representation supporting manual annotation using GATE (Sheffield) • Annotation corpus • Significant training of 3 annotators • Participants understand the annotation scheme

  24. Sample Gate Annotation

  25. Conceptualization • Ideology, emotions, and opinions are reflected in language • Language gives us a means to track and assess perspective • Goal: create document to support workshop annotation and experiments, and to extend to future applications

  26. Conceptualization Part I: Theoretical Background • Types of perspective: attitudes (subjectivity), spatial, temporal, sociological, etc. • Focuses on subjectivity expressed linguistically (e.g., opinions: criticized an unfair election; emotions: applauded the election; speculations: probably will be elected)

  27. Conceptualization: Subjectivity, Theoretical Background (continued) • Sources have Attitudes about Objects: (writer, criticizes, election) • An ontology of attitudes leading to different types of private states (distinctions can range from identification, to positive and negative, to more fine-grained: reliability, source, assessment, necessity, etc.) • This theoretical background informs the annotation strategy, experiments, and extensions

  28. Conceptualization Part II: Looking to higher levels: larger segments • Subjectivity beyond the immediate occurrence of the segment: • sentence and paragraph level • document level • discourse and topic level

  29. Conceptualization Part III: Looking forward: applications • Track perspective over time (identify changes) • Identify ideology (subjective expressions taken as a unit may approximate ideology) • Cluster agents with similar ideologies (similar expressions of opinions may help group those on the same side) • Infer ideology from limited expressions of perspective (some subjectivity for a source may suggest opinions on other topics)

  30. Architecture Overview • Solution architecture includes: • Application Architecture • supports high-level QA task • Annotation Architecture • supports document annotation • Learning Architecture • supports development of low- and mid-level system components via machine learning

  31. Solution Architecture • Annotation Architecture → annotated documents → Learning Architecture → automatic annotators → Application Architecture

  32. Solution Architecture • Annotation Architecture: Annotation Tool • Learning Architecture: Learning Algorithms → Trained Taggers (plus Other Taggers) • Application Architecture: Question → Document Retrieval → Perspective Tagging → Document Clustering

  33. Application Architecture (components) • Multi-perspective Classifiers • Document Clustering • Documents • Annotation Database • Gate, NE, CASS • Feature Generators

  34. Annotation Components • GATE’s ANNIE or MITRE Alembic • Tokenization, sentence-finding • Part-of-speech tagging • Name finding • Coreference resolution • CASS partial parser • SMART IR engine • Feature Generators

  35. Learning Architecture (components) • Evaluation • Training Data • Weka Learners • Annotation Database • Gate, NE, CASS • Feature Generators

  36. Learning: Tasks • Identify subjective phrases • Identify nested sources • Discriminate Facts and Views • Classify Opinion Strength

  37. Learning: Features • Name recognition • Syntactic features • Lists of words • Contextual features • Density • …
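Features of the kinds listed here (word lists, context, density) can be sketched for a candidate phrase; the feature names, lexicon, and window below are purely illustrative, not the project's actual feature set:

```python
def phrase_features(tokens, i, j, subjective_lexicon, window=5):
    """Toy feature generator for a candidate phrase tokens[i:j].

    Loosely follows the slide's feature types: word-list membership,
    a contextual feature, and a density of lexicon hits in a window.
    (All names here are illustrative, not the project's.)"""
    phrase = [t.lower() for t in tokens[i:j]]
    left = tokens[max(0, i - window):i]
    right = tokens[j:j + window]
    context = [t.lower() for t in left + right]
    hits = sum(1 for t in context if t in subjective_lexicon)
    return {
        "in_lexicon": any(t in subjective_lexicon for t in phrase),
        "prev_word": tokens[i - 1].lower() if i > 0 else "<s>",
        "context_density": hits / max(1, len(context)),
    }

# Hypothetical lexicon drawn from expressions seen on earlier slides.
lex = {"absurdities", "odious", "heresy", "fabrications"}
toks = "the report is full of absurdities and fabrications".split()
feats = phrase_features(toks, 5, 6, lex)  # candidate phrase: "absurdities"
print(feats)
```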

  38. Annotation Architecture • Topic Documents → Gate Annotation Tool (Human Annotators) → Gate XML → MPQA Database

  39. Annotation Tool (GATE) • Move headers and original markup to standoff annotation database • Initialize document annotations • Initial sources and speech events • Verify human annotations • Check id existence • Check attribute consistency

  40. Data Formats • Gate XML Format • standoff • structured • MPQA Annotation Format • standoff • flat • Machine Learning Formats (e.g., ARFF)

  41. Gate XML Format
<Annotation Type="expressive-subjectivity" StartNode="215" EndNode="228">
  <Feature>
    <Name>strength</Name>
    <Value>low</Value>
  </Feature>
  <Feature>
    <Name>source</Name>
    <Value>w,foreign-ministry</Value>
  </Feature>
</Annotation>
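An annotation in this style can be read with any XML library; a sketch using Python's xml.etree.ElementTree (the sample annotation is embedded directly, with quotes normalized to plain ASCII so it parses):

```python
import xml.etree.ElementTree as ET

# Standoff annotation in the Gate XML style from the slide above.
xml = """<Annotation Type="expressive-subjectivity" StartNode="215" EndNode="228">
  <Feature><Name>strength</Name><Value>low</Value></Feature>
  <Feature><Name>source</Name><Value>w,foreign-ministry</Value></Feature>
</Annotation>"""

ann = ET.fromstring(xml)
# StartNode/EndNode give the character offsets into the source text.
span = (int(ann.get("StartNode")), int(ann.get("EndNode")))
# Each <Feature> holds one name/value pair.
features = {f.findtext("Name"): f.findtext("Value")
            for f in ann.findall("Feature")}
print(span, features)  # (215, 228) {'strength': 'low', 'source': 'w,foreign-ministry'}
```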

  42. MPQA Annotation Format • Fields: id, span, type, name, content • Example record: 42  215,228  string  MPQA-agent  id="foreign-ministry"
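A line in this flat standoff format splits into four fixed fields (id, span, type, name) followed by free-form content; a minimal parser sketch inferred from the single example above (the real format may carry more detail):

```python
def parse_mpqa_line(line):
    """Parse one line of the flat MPQA standoff format shown above:
    id, byte span "start,end", type, name, then free-form content.
    (Field handling inferred from the slide's one example.)"""
    id_, span, type_, name, content = line.split(None, 4)
    start, end = map(int, span.split(","))
    return {"id": int(id_), "span": (start, end),
            "type": type_, "name": name, "content": content}

rec = parse_mpqa_line('42 215,228 string MPQA-agent id="foreign-ministry"')
print(rec["span"], rec["name"])  # (215, 228) MPQA-agent
```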

  43. End-User Evaluation: Goal • Establish framework for evaluating tasks that would be of direct interest to analyst users • Do an example evaluation

  44. Manual Clustering • Human exploratory effort • MPQA participants manually cluster documents from 1-2 topics • Analyze basis for cluster

  45. User Task: Topic • U1: User states topic of interest and interacts with IR system • S1: System retrieves set of “relevant” documents along with their perspective annotations

  46. Example: Topic • U1: 2002 election in Zimbabwe • S1: System returns • 03.47.06-11142 Mugabe confident of victory in • 04.33.07-17094 Mugabe victory leaves West in • 05.22.13-11526 Mugabe says he is wide awake • 06.21.57-1967 Mugabe predicts victory • 06.37.20-8125 Major deployment of troops • 06.47.23-22498 Zambia hails results

  47. User Task: Question • U2: User states a particular perspective question on the topic. • The question should: • identify the source type (e.g., governments, individuals, writers) of interest • be a yes/no (or pro/con) question, for now

  48. Example: Question • Range of perspective: national governments, groups of governments • Was the election process fair, valid, and free of voter intimidation?

  49. User Task: Question Response • S2: System clusters documents • based on the question, text, and annotations • goal: group together documents with the same answer and perspective (including expressive content) • The system, for now, does not attempt to label each group with specific answers. • Target a small number of clusters (4?)
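Grouping documents with the same answer can be approximated by similarity over bag-of-words vectors; a stdlib sketch of a greedy single-pass clusterer (the threshold, representation, and algorithm are illustrative only, not the system's actual method):

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def greedy_cluster(docs, threshold=0.5):
    """Greedy single-pass clustering: each document joins the first
    cluster whose seed document is similar enough, else starts a
    new cluster. Returns lists of document indices."""
    vecs = [Counter(d.lower().split()) for d in docs]
    clusters = []  # list of (seed_vector, member_indices)
    for i, v in enumerate(vecs):
        for seed, members in clusters:
            if cosine(v, seed) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((v, [i]))
    return [members for _, members in clusters]

docs = ["election was free and fair",
        "fair and free election process",
        "voter intimidation marred the vote"]
print(greedy_cluster(docs))  # [[0, 1], [2]]
```

A real pipeline would cluster on perspective annotations and question-focused features, not raw word overlap; this only illustrates the grouping step.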

  50. Example:Question Response • Cluster 1 <keywords> • 07.20.20-11694 • 08.12.40-1611 • 08.15.19-23507 • 09.35.06-27851 • 13.10.41-18948 • Cluster 2 <keywords> • 12.08.27-27397 • 13.44.36-19236 • 04.33.07-17094 • 05.22.13-11526 • Cluster 3 <keywords> • 06.47.23-22498 • 06.51.18-1222 • 06.56.31-3120 • 07.16.31-13271