
COGEX at the Second RTE

COGEX at the Second RTE. Marta Tatu, Brandon Iles, John Slavick, Adrian Novischi, Dan Moldovan. Language Computer Corporation. April 10th, 2006. LCC’s Submission to RTE2. Linear combination of three entailment scores. COGEX with constituency parse tree-derived logic forms


Presentation Transcript


  1. COGEX at the Second RTE Marta Tatu, Brandon Iles, John Slavick, Adrian Novischi, Dan Moldovan Language Computer Corporation April 10th, 2006

  2. LCC’s Submission to RTE2
     • Linear combination of three entailment scores
         • COGEX with constituency parse tree-derived logic forms
         • COGEX with dependency parse tree-derived logic forms
         • Lexical alignment between T and H
     • For each pair i (Ti, Hi): if [formula missing from the transcript] then Ti entails Hi
     • Lambda (λ) parameters learned on the development data for each task (IE, IR, QA, SUM)
     COGEX@RTE2
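
The decision rule on this slide did not survive in the transcript, so the following is only a minimal sketch of what a linear combination of the three scores could look like. The `Scores` fields, the `threshold` value, and the example weights are assumptions made for illustration; the slides say only that the λ parameters are learned per task on the development data.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Scores:
    cogex_c: float    # COGEX score from constituency-derived logic forms
    cogex_d: float    # COGEX score from dependency-derived logic forms
    lex_align: float  # lexical alignment score between T and H


def combined_score(s: Scores, lambdas: Tuple[float, float, float]) -> float:
    """Weighted sum of the three entailment scores."""
    lam_c, lam_d, lam_l = lambdas
    return lam_c * s.cogex_c + lam_d * s.cogex_d + lam_l * s.lex_align


def entails(s: Scores, lambdas: Tuple[float, float, float],
            threshold: float = 0.5) -> bool:
    # The threshold is a hypothetical decision boundary, not a value from the slides.
    return combined_score(s, lambdas) > threshold
```

In this sketch one weight triple would be kept per task (IE, IR, QA, SUM), matching the slide's statement that the parameters are learned separately for each task.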

  3. Approach to RTE with COGEX
     • Transform the two text fragments into 3-layered logic forms
         • Syntactic
         • Semantic
         • Temporal
     • Automatically create axioms to be used during the proof
         • Lexical Chains axioms
         • World Knowledge axioms
         • Linguistic transformation axioms
     • Load COGEX’s SOS (set of support) with T and H and its USABLE list of clauses with the generated axioms
     • Search for a proof by iteratively removing clauses from SOS and searching the USABLE list for possible inferences until a refutation is found
     • If no contradiction is detected
         • Relax arguments
         • Drop entire predicates from H
     • Compute proof score semantic and temporal axioms

  4. COGEX Enhancements (1/3)
     • Logic Form Transformation
         • Negations
             • not_RB(x1,e1) & walk_VB(e1,x2,x3) » -walk_VB(e1,x2,x3)
             • not_RB(x1,e1) & walk_VB(e1,x2,x3) & fast_RB(x4,e1) » -fast_RB(x4,e1)
             • no/DT case_NN(x1) & confirm_VB(e1,x2,x1) » -confirm_VB(e1,x2,x1)
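
The first two negation rewrites above can be sketched roughly as follows. The `Pred` tuple, the rule "prefer an adverb on the same event, otherwise negate the verb", and the choice to drop `not_RB` from the output are assumptions read off the two examples, not a description of COGEX's actual code. The `no/DT` case is not covered.

```python
from typing import List, NamedTuple, Tuple


class Pred(NamedTuple):
    name: str              # e.g. "walk_VB"
    args: Tuple[str, ...]  # e.g. ("e1", "x2", "x3")
    negated: bool = False


def apply_negation(preds: List[Pred]) -> List[Pred]:
    """Fold each not_RB(x, e) into a negated predicate on event e."""
    result = list(preds)
    for neg in [p for p in preds if p.name == "not_RB"]:
        event = neg.args[1]
        # Assumed rule: an adverb modifying the event takes the negation
        # if one exists; otherwise the verb itself is negated.
        adverbs = [p for p in result
                   if p.name.endswith("_RB") and p.name != "not_RB"
                   and p.args[-1] == event]
        verbs = [p for p in result
                 if p.name.endswith("_VB") and p.args[0] == event]
        targets = adverbs or verbs
        if not targets:
            continue  # nothing to negate; keep not_RB as it is
        target = targets[0]
        result[result.index(target)] = target._replace(negated=True)
        result.remove(neg)
    return result


def show(p: Pred) -> str:
    """Render a predicate in the slide's notation, with '-' for negation."""
    return ("-" if p.negated else "") + f"{p.name}({','.join(p.args)})"
```

For example, applying `apply_negation` to `not_RB(x1,e1) & walk_VB(e1,x2,x3)` and rendering the result with `show` gives `-walk_VB(e1,x2,x3)`.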

  5. COGEX Enhancements (1/3)
     • Logic Form Transformation
         • Temporal normalization of date/time predicates
             • 13th of January 1990 vs. January 13th, 1990
             • 13th_of_January_1990_NN(x1) vs. January_13th_1990_NN(x1)
             • time_TMP(BeginFN(x1), year, month, day, hour, minute, second) & time_TMP(EndFN(x1), year, month, day, hour, minute, second)
             • time_TMP(BeginFN(x1), 1990, 1, 13, 0, 0, 0) & time_TMP(EndFN(x1), 1990, 1, 13, 23, 59, 59)
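
As an illustration of the normalization idea, the sketch below maps both surface forms from the slide to the same begin/end tuples. The function name, the two regular expressions, and the restriction to full English month names are assumptions; the real system presumably handles many more date and time expressions.

```python
import re
from typing import Tuple

MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

# Only the two patterns shown on the slide are handled here.
DAY_FIRST = re.compile(r"(\d{1,2})(?:st|nd|rd|th)? of (\w+) (\d{4})")
MONTH_FIRST = re.compile(r"(\w+) (\d{1,2})(?:st|nd|rd|th)?, (\d{4})")

TimeTuple = Tuple[int, int, int, int, int, int]


def normalize_date(text: str) -> Tuple[TimeTuple, TimeTuple]:
    """Return (begin, end) as (year, month, day, hour, minute, second)."""
    m = DAY_FIRST.fullmatch(text)
    if m:
        day, month, year = int(m.group(1)), MONTHS[m.group(2)], int(m.group(3))
    else:
        m = MONTH_FIRST.fullmatch(text)
        if not m:
            raise ValueError(f"unsupported date format: {text!r}")
        month, day, year = MONTHS[m.group(1)], int(m.group(2)), int(m.group(3))
    # A date covers the whole day, from 00:00:00 to 23:59:59.
    return (year, month, day, 0, 0, 0), (year, month, day, 23, 59, 59)
```

Both `normalize_date("13th of January 1990")` and `normalize_date("January 13th, 1990")` produce the tuples used in the `time_TMP` predicates above, which is what lets the prover treat the two phrasings as the same interval.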

  6. COGEX Enhancements (1/3)
     • Logic Form Transformation
         • Temporal context SUMO predicates (Clark et al., 2005)
             • (S,E1,E2): S is the temporal signal linking two events E1 and E2
             • during_TMP(e1,x1), earlier_TMP(e1,x1), …

  7. Logic Forms Differences
     • Generate LF from two different sources
         • Constituency parse of the data
         • Dependency parse trees (data provided by the challenge organizers)

  8. Logic Forms Differences
     • Gilda Flores was kidnapped on the 13th of January 1990.
     • Constituency: Gilda_NN(x1) & Flores_NN(x2) & nn_NNC(x3,x1,x2) & _human_NE(x3) & kidnap_VB(e1,x9,x3) & on_IN(e1,x8) & 13th_NN(x4) & of_NN(x5) & January_NN(x6) & 1990_NN(x7) & nn_NNC(x8,x4,x5,x6,x7) & _date_NE(x8) & THM_SR(x3,e1) & TMP_SR(x8,e1) & time_TMP(BeginFN(x1), 1990, 1, 13, 0, 0, 0) & time_TMP(EndFN(x1), 1990, 1, 13, 23, 59, 59) & during_TMP(e1,x8)
     • Dependency: Gilda_Flores_NN(x2) & _human_NE(x2) & kidnap_VB(e1,x4,x2) & on_IN(e1,x3) & 13th_NN(x3) & of_IN(x3,x1) & January_1990_NN(x1)

  9. COGEX Enhancements (2/3)
     • Axioms on Demand
         • Lexical Chains
             • Consider the first k=3 senses for each word
             • Maximum length of a lexical chain = 3
             • DERIVATIONAL WordNet relation is ambiguous with respect to the role of the noun
                 • Derivation-ACT: employ_VB(e1,x1,x2) → employment_NN(e1)
                 • Derivation-AGENT: employ_VB(e1,x1,x2) → employer_NN(x1)
                 • Derivation-THEME: employ_VB(e1,x1,x2) → employee_NN(x2)
             • Morphological derivations between adjectives and verbs

  10. COGEX Enhancements (2/3)
      • Axioms on Demand
          • Lexical Chains
              • Augment with the NE predicate for NE target concepts
                  • nicaraguan_JJ(x1,x2) → Nicaragua_NN(x1) & _country_NE(x1)
              • Discard lexical chains
                  • with more than 2 HYPONYMY relations (H too specific)
                  • with a HYPONYMY followed by an ISA
                      • Chicago_NN(x1) → Detroit_NN(x1)
                  • which include general concepts: object/NN, act/VB, be/VB
                      • ni = number of hyponyms of concept ci
                      • N = number of concepts in ci’s hierarchy
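
The three discard criteria can be expressed as a simple filter. This is a sketch under assumptions: a chain is encoded here as a list of (relation, concept reached) steps, "followed by" is read as "immediately followed by", and the general-concept check uses only the three concepts named on the slide rather than the ni/N measure, whose formula is not in the transcript.

```python
from typing import List, Tuple

# General concepts named on the slide.
GENERAL_CONCEPTS = {"object/NN", "act/VB", "be/VB"}

# Hypothetical encoding: each step is (relation name, concept reached).
Chain = List[Tuple[str, str]]


def keep_chain(chain: Chain) -> bool:
    """Return False for chains the slide says should be discarded."""
    relations = [relation for relation, _ in chain]
    # More than 2 HYPONYMY relations makes H too specific.
    if relations.count("HYPONYMY") > 2:
        return False
    # A HYPONYMY step immediately followed by an ISA step.
    if any(a == "HYPONYMY" and b == "ISA"
           for a, b in zip(relations, relations[1:])):
        return False
    # Chains passing through overly general concepts.
    if any(concept in GENERAL_CONCEPTS for _, concept in chain):
        return False
    return True
```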

  11. More Axioms
      • Another 73 World Knowledge axioms
      • Semantic Calculus: combinations of two semantic relations (82 axioms)
          • ISA, KINSHIP, CAUSE are transitive relations
          • ISA_SR(x1,x2) & PAH_SR(x3,x2) → PAH_SR(x3,x1)
              • Mike is a rich man → Mike is rich
      • Temporal Reasoning Axioms (Clark et al., 2005) (65 axioms)
          • Dates entail more general times
              • October 2000 → year 2000
          • during_TMP(e1,e2) & during_TMP(e2,e3) → during_TMP(e1,e3)
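
The transitivity axiom for during_TMP can be illustrated with a small closure computation. This is only a sketch of what the axiom licenses; COGEX applies such axioms inside the prover rather than precomputing a closure, and the function name is hypothetical.

```python
from typing import Set, Tuple


def during_closure(facts: Set[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Apply during(a,b) & during(b,c) -> during(a,c) until nothing new appears."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure
```

Given the facts during(e1,e2) and during(e2,e3), the closure also contains during(e1,e3).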

  12. COGEX Enhancements (3/3)
      • Proof Re-Scoring
          • (T) ∃ smart people → ∃ people (H)
          • (T) ∃ people → ∃ smart people (H)
          • Entities mentioned in T and H are existentially quantified
          • Universally quantified T and H entities
              • (T) ∀ people → ∀ smart people (H)
              • (T) ∀ smart people → ∀ people (H)

  13. Shallow Lexical Alignment
      • Compute the edit distance between T and H
          • Cost (deletion of a word from T) = 0
          • Cost (replacement of a word from T with another in H) = ∞
          • Cost (insertion of a word from H) = [value missing from the transcript]
          • Edit distance between synonyms = 0
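
A word-level edit distance with these costs might look like the sketch below. The insertion cost is missing from the transcript, so `insert_cost` is a hypothetical parameter with an arbitrary default, and the `synonyms` callback (plain string equality by default) stands in for whatever synonym lookup the system uses.

```python
import math
from typing import Callable, List


def alignment_cost(t: List[str], h: List[str],
                   insert_cost: float = 1.0,
                   synonyms: Callable[[str, str], bool] = lambda a, b: a == b
                   ) -> float:
    """Cost of turning the words of T into the words of H."""
    # dp[i][j] = cheapest way to turn the first i words of T
    # into the first j words of H.
    dp = [[math.inf] * (len(h) + 1) for _ in range(len(t) + 1)]
    dp[0][0] = 0.0
    for i in range(len(t) + 1):
        for j in range(len(h) + 1):
            if i > 0:  # delete t[i-1] from T at no cost
                dp[i][j] = min(dp[i][j], dp[i - 1][j])
            if j > 0:  # insert h[j-1] from H (assumed cost)
                dp[i][j] = min(dp[i][j], dp[i][j - 1] + insert_cost)
            if i > 0 and j > 0 and synonyms(t[i - 1], h[j - 1]):
                dp[i][j] = min(dp[i][j], dp[i - 1][j - 1])  # match costs 0
            # Replacing a word with a non-synonym costs infinity,
            # so that transition is never taken.
    return dp[len(t)][len(h)]
```

With these costs the distance depends only on how many words of H have no counterpart in T, so a long T that contains every word of H in order scores 0.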

  14. Results
      • IE: score given by COGEXC with some correction from COGEXD
      • IR: the highest contribution is made by LexAlign (~62%)
      • COGEXD better on IE, IR, QA (~69% accuracy)
      • COGEXC better on SUM (~66% accuracy)
      • Three-way combination outperforms any individual result and any two-system combination
      • Learned parameters: [table missing from the transcript]

  15. Results, Future Work
      • Higher accuracy on the SUM task
          • SUM is the highest accuracy task for all systems (false entailment pairs had H completely unrelated to the texts T)
      • IE: highest number of false positives
      • Future enhancements
          • Other types of context: report, planning, etc.
          • Need for more axioms
              • Automatic gathering of semantic axioms
              • Paraphrase acquisition (phrase1 → phrase2)

  16. Thank You! Questions?
