Learning Semantic Parsers Using Statistical Syntactic Parsing Techniques

Presentation Transcript


  1. Learning Semantic Parsers Using Statistical Syntactic Parsing Techniques Ruifang Ge Supervising Professor: Raymond J. Mooney February 8, 2006

  2. Semantic Parsing • Semantic Parsing: maps a natural-language sentence to a complete, detailed and formal meaning representation (MR) in a meaning representation language • Applications • Core component in practical spoken language systems: • JUPITER (MIT weather 1-888-573-talk) • MERCURY (MIT flight 1-877-MIT-talk) • Advice taking (Kuhlmann et al., 2004)

  3. CLang: RoboCup Coach Language • In the RoboCup Coach competition, teams compete to coach simulated soccer players • The coaching instructions are given in a formal language called CLang • Example: "If our player 2 has the ball, our player 4 should stay in our half" → semantic parsing → CLang: ((bowner our {2}) (do our {4} (pos (half our)))) [Diagram: a coach and the simulated soccer field]

  4. Motivating Example • "If our player 2 has the ball, our player 4 should stay in our half" → ((bowner our {2}) (do our {4} (pos (half our)))) • Semantic parsing is a compositional process; sentence structures are needed for building meaning representations

  5. Roadmap • Related work on semantic parsing • SCISSOR • Experimental results • Proposed work • Conclusions

  6. Category I: Syntax-Based Approaches • Meaning composition follows the tree structure of a syntactic parse • The meaning of a constituent is composed from the meanings of its sub-constituents in a syntactic parse • Composition is specified using syntactic relations and semantic constraints in application domains • Miller et al. (1996), Zettlemoyer & Collins (2005)

  7. Category I: Example [SAPT for "our player 2 has the ball": S-bowner(player(our,2)) → NP-player(our,2) VP-bowner(_); NP-player(our,2) → PRP$-our NN-player(_,_) CD-2; VP-bowner(_) → VB-bowner(_) NP-null; NP-null → DT-null NN-null. Labels such as NN-player(_,_) and VB-bowner(_) require arguments; labels such as DT-null are semantically vacuous and require no arguments. Domain predicates: player(team,unum), bowner(player)]
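To make the composition process concrete, here is a minimal Python sketch of bottom-up meaning composition over a tree like the one above. The tuple encoding, the ARITY table, and the rule for passing partial predicates upward are illustrative assumptions, not the thesis' actual procedure.

```python
# Toy bottom-up meaning composition over a simplified SAPT.
ARITY = {"player": 2, "bowner": 1}   # predicate arities from the CLang domain

def compose(node):
    """node = (sem, children). sem is a constant ("our", "2"), a predicate
    name, or None for semantically vacuous constituents. Returns the MR
    string for the subtree, or None."""
    sem, children = node
    if not children:                      # leaf: return its semantic label
        return sem
    if sem is None:                       # vacuous constituent (e.g. "the ball")
        return None
    parts = [m for m in map(compose, children) if m is not None]
    args = [m for m in parts if m != sem] # non-head children fill argument slots
    if len(args) < ARITY[sem]:            # slots still open: pass the partial
        return sem                        # predicate up the tree
    return "%s(%s)" % (sem, ",".join(args))

# SAPT for "our player 2 has the ball"
np = ("player", [("our", []), ("player", []), ("2", [])])
vp = ("bowner", [("bowner", []), (None, [(None, []), (None, [])])])
print(compose(("bowner", [np, vp])))      # -> bowner(player(our,2))
```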

  10. Category II: Purely Semantic-Driven Approaches • No syntactic information is used in building tree structures • Non-terminals in this category correspond to semantic concepts in application domains • Tang & Mooney (2001), Kate (2005), Wong (2005)

  11. Category II: Example [Semantic tree for "our player 2 has the ball": the root concept bowner (introduced by "has the ball") dominates player (introduced by "player"), which dominates our ("our") and 2 ("2")]

  12. Category III: Hybrid Approaches • Utilizing syntactic information in semantic parsing approaches driven by semantics: • syntactic phrase boundaries • syntactic categories of semantic concepts • word dependencies • Kate, Wong & Mooney (2005)

  13. Our Approach • We introduce an approach falling into Category I: a syntax-driven approach • Reason: we can employ state-of-the-art statistical syntactic parsing techniques to help build tree structures for meaning composition • State-of-the-art statistical parsers are becoming increasingly robust and accurate [Collins (1997), Charniak & Johnson (2005)]

  14. Roadmap • Related work on semantic parsing • SCISSOR • Experimental results • Proposed work • Conclusions

  15. SCISSOR: Semantic Composition that Integrates Syntax and Semantics to get Optimal Representations

  16. SCISSOR • An integrated syntax-based approach • Allows both syntax and semantics to be used simultaneously to build meaning representations • A statistical parser is used to generate a semantically augmented parse tree (SAPT) • Translates a SAPT into a complete formal meaning representation (MR) using a meaning composition process [SAPT for "our player 2 has the ball": S-bowner → NP-player (PRP$-team NN-player CD-unum) VP-bowner (VB-bowner NP-null (DT-null NN-null)); MR: bowner(player(our,2))]

  17. SCISSOR • An integrated syntax-based approach • Allows both syntax and semantics to be used simultaneously to build meaning representations • A statistical parser is used to generate a semantically augmented parse tree (SAPT) • Translates a SAPT into a complete formal meaning representation (MR) using a meaning composition process • Allows statistical modeling of semantic selectional constraints in application domains, e.g., AGENT(pass) = PLAYER

  18. Overview of SCISSOR [Diagram. Training: SAPT training examples → learner → integrated semantic parser. Testing: NL sentence → integrated semantic parser → SAPT → ComposeMR → MR]

  19. Extending Collins' (1997) Syntactic Parsing Model • Collins (1997) introduced a lexicalized head-driven syntactic parsing model • Bikel (2004) provides an easily extended open-source version of the Collins statistical parser • We extend the parsing model to generate semantic labels simultaneously with syntactic labels, constrained by semantic constraints in application domains

  20. Example: Probabilistic Context-Free Grammar (PCFG) • Rule probabilities are independent of the words involved • Rules and probabilities: S → NP VP 0.4; NP → PRP$ NN CD 0.06; VP → VB NP 0.3; PRP$ → our 0.01; NN → player 0.001; CD → 2 0.0001; VB → has 0.02; DT → the 0.1; NN → ball 0.01 • For the parse tree of "our player 2 has the ball": P(Tree, S) = 0.4 × 0.06 × 0.3 × … × 0.01
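For illustration, a small Python sketch of this computation using the slide's toy grammar. The probability for NP → DT NN does not appear on the slide and is assumed here to make the example complete.

```python
# Toy PCFG from the slide, keyed by (parent, children).
PCFG = {
    ("S",    ("NP", "VP")):         0.4,
    ("NP",   ("PRP$", "NN", "CD")): 0.06,
    ("VP",   ("VB", "NP")):         0.3,
    ("NP",   ("DT", "NN")):         0.2,     # assumed; not on the slide
    ("PRP$", ("our",)):             0.01,
    ("NN",   ("player",)):          0.001,
    ("CD",   ("2",)):               0.0001,
    ("VB",   ("has",)):             0.02,
    ("DT",   ("the",)):             0.1,
    ("NN",   ("ball",)):            0.01,
}

def tree_prob(tree):
    """P(tree) is the product of the probabilities of the rules it uses."""
    label, children = tree
    if isinstance(children, str):            # preterminal -> word
        return PCFG[(label, (children,))]
    p = PCFG[(label, tuple(c[0] for c in children))]
    for child in children:
        p *= tree_prob(child)
    return p

tree = ("S", [("NP", [("PRP$", "our"), ("NN", "player"), ("CD", "2")]),
              ("VP", [("VB", "has"), ("NP", [("DT", "the"), ("NN", "ball")])])])
print(tree_prob(tree))    # 0.4 * 0.06 * 0.3 * ... * 0.01
```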

  21. Example: Lexicalized PCFG • Each non-terminal is annotated with the head word of its rule (heads were shown in purple on the slide): S(has) → NP(player) VP(has); NP(player) → PRP$ NN CD; VP(has) → VB NP(ball); NP(ball) → DT NN, for the sentence "our player 2 has the ball"

  22. Example: Estimating Rule Probability • Decompose the expansion of a non-terminal into primitive steps: P(NP(player) VP(has) | S(has)) = P(VP(has) | S(has)) × P(NP(player) | S(has), VP(has)) • In Collins' model, syntactic subcategorization frames are used to constrain the generation of modifiers, e.g., has requires an NP as its subject

  23. Integrating Semantics into the Model • Non-terminals now have both syntactic and semantic labels [SAPT for "our player 2 has the ball": S-bowner(has) → NP-player(player) VP-bowner(has); NP-player(player) → PRP$-team NN-player CD-unum; VP-bowner(has) → VB-bowner NP-null(ball); NP-null(ball) → DT-null NN-null]

  24. Estimating Rule Probability Including Semantic Labels • Head generation for the expansion of S-bowner(has): Ph(VP-bowner | S-bowner, has)

  25. Estimating Rule Probability Including Semantic Labels • Subcat generation: Ph(VP-bowner | S-bowner, has) × Plc({NP}-{player} | S-bowner, VP-bowner, has) × Prc({}-{} | S-bowner, VP-bowner, has) • {NP}: syntactic constraint to the left; {player}: semantic constraint to the left • (has also requires an NP as its object, but that NP is generated within the VP)

  26. Estimating Rule Probability Including Semantic Labels • Modifier generation: Ph(VP-bowner | S-bowner, has) × Plc({NP}-{player} | S-bowner, VP-bowner, has) × Prc({}-{} | S-bowner, VP-bowner, has) × Pd(NP-player(player) | S-bowner, VP-bowner, has, LEFT, {NP}-{player})

  27. Estimating Rule Probability Including Semantic Labels • Once NP-player(player) has been generated, the left subcat constraint is fulfilled and becomes { }-{ }: Ph(VP-bowner | S-bowner, has) × Plc({NP}-{player} | S-bowner, VP-bowner, has) × Prc({}-{} | S-bowner, VP-bowner, has) × Pd(NP-player(player) | S-bowner, VP-bowner, has, LEFT, {NP}-{player})
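Multiplying the factors gives the full expansion probability. Below is a minimal Python sketch of that product; the probability tables and their values are hypothetical stand-ins, and the real model also generates STOP symbols, conditions on distance, and smooths each factor with backed-off estimates, all omitted here.

```python
# Hypothetical probability tables; the numbers are stand-ins for illustration.
Ph  = {("VP-bowner", "S-bowner", "has"): 0.5}
Plc = {(("NP",), ("player",), "S-bowner", "VP-bowner", "has"): 0.6}
Prc = {((), (), "S-bowner", "VP-bowner", "has"): 0.9}
Pd  = {("NP-player", "S-bowner", "VP-bowner", "has", "LEFT",
        ("NP",), ("player",)): 0.7}

def expansion_prob(parent, head, word, left_subcat, right_subcat, left_mods):
    """Head factor x subcat factors x one dependency factor per modifier."""
    p = Ph[(head, parent, word)]
    syn, sem = left_subcat
    p *= Plc[(syn, sem, parent, head, word)]
    p *= Prc[(right_subcat[0], right_subcat[1], parent, head, word)]
    for mod in left_mods:              # generate left modifiers outward
        p *= Pd[(mod, parent, head, word, "LEFT", syn, sem)]
        syn, sem = (), ()              # NP-player fulfills the subcat constraint
    return p

print(expansion_prob("S-bowner", "VP-bowner", "has",
                     (("NP",), ("player",)), ((), ()), ["NP-player"]))
# 0.5 * 0.6 * 0.9 * 0.7 = 0.189
```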

  28. Parser Implementation • Supervised training on annotated SAPTs is just frequency counting • An augmented smoothing technique is employed to account for the additional data sparsity created by semantic labels • Test sentences are parsed to find the most probable SAPT using a variant of the standard CKY chart-parsing algorithm
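The frequency-counting step amounts to relative-frequency estimation over events read off the annotated trees. A minimal sketch, with a hypothetical event encoding and no smoothing:

```python
from collections import Counter

def estimate(events):
    """events: (context, outcome) pairs read off the training SAPTs.
    Returns maximum-likelihood estimates P(outcome | context)."""
    events = list(events)
    joint = Counter(events)                         # count each event
    marginal = Counter(ctx for ctx, _ in events)    # count each context
    return {(ctx, out): n / marginal[ctx] for (ctx, out), n in joint.items()}

# e.g. head-generation events, (parent, head word) -> head child:
events = [(("S-bowner", "has"), "VP-bowner"),
          (("S-bowner", "has"), "VP-bowner"),
          (("S-bowner", "has"), "NP-player")]
print(estimate(events)[(("S-bowner", "has"), "VP-bowner")])   # 0.666...
```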

  29. Roadmap • Related work on semantic parsing • SCISSOR • Experimental results • Proposed work • Conclusions

  30. Experimental Results: Experimental Corpora • CLang • 300 randomly selected rules from the log files of the 2003 RoboCup Coach Competition • Coaching advice is annotated with NL sentences by 4 annotators independently • 22.52 words per sentence • GeoQuery [Zelle & Mooney, 1996] • 250 queries for U.S. geography database • 6.87 words per sentence

  31. Experimental Methodology • Evaluated using standard 10-fold cross validation • Correctness • CLang: output exactly matches the correct representation • GeoQuery: query retrieves the correct answer

  32. Experimental Methodology • Metrics: precision (the percentage of completed parses that are correct), recall (the percentage of all test sentences that are parsed correctly), and F-measure, the harmonic mean of precision and recall: F = 2 × precision × recall / (precision + recall)

  33. Compared Systems • COCKTAIL (Tang & Mooney, 2001): a purely semantic-driven approach that learns a shift-reduce deterministic parser using inductive logic programming techniques • WASP (Wong, 2005): a purely semantic-driven approach using machine translation techniques • KRISP (Kate, 2005): a purely semantic-driven approach based on string kernels • The above systems all learn from sentences paired with meaning representations • SCISSOR needs extra annotation (SAPTs)

  34. Precision Learning Curve for CLang [Plot: precision vs. number of training examples; COCKTAIL's deterministic parsing ran into memory overflow and could not complete on larger training sets]

  35. Recall Learning Curve for CLang [Plot: recall vs. number of training examples]

  36. F-measure Learning Curve for CLang [Plot: F-measure vs. number of training examples] • SCISSOR is significantly better at the 95% confidence level

  37. Results on Sentences within Different Length Ranges • How does sentence complexity affect parsing performance? • Sentence complexity is difficult to measure directly • We use sentence length as an indicator

  38. Sentence Length Distribution (CLang)

  39. Detailed CLang Results by Sentence Length • Syntactic structure is needed on longer sentences, where semantic constraints alone cannot sufficiently eliminate ambiguities

  40. Precision Learning Curve for GeoQuery

  41. Recall Learning Curve for GeoQuery

  42. F-measure Learning Curve for GeoQuery [Plot: F-measure vs. number of training examples] • Not significantly better at the 95% confidence level

  43. Zettlemoyer & Collins (2005) • It introduces a syntax-based semantic parser based on combinatory categorical grammar (CCG) (Steedman, 2000) • Require a set of hand-built rules to specify possible syntactic categories for each type of semantic concepts

  44. Zettlemoyer & Collins (2005) • Provide results on a larger GeoQuery dataset (880 examples): • Using a different experimental setup • Prec/Recall: 96.25/79.29 (SCISSOR Prec/Recall: 92.08/72.27) • Performance on more complex domains such as CLang is not clear • Need to design another set of hand-built template rules

  45. Roadmap • Related work on semantic parsing • SCISSOR • Experimental results • Proposed work • Discriminative Reranking for Semantic Parsing • Automating the SAPT-Generation • Other issues • Conclusions

  46. Reranking for Semantic Parsing [Diagram: input sentence → SCISSOR (local features) → current ranked SAPTs S1, S2, S3, S4 → reranker (global features) → SAPTs after reranking, reordered (e.g., S3 promoted to first)] • Reranking has been successfully used in parsing, tagging, machine translation, …

  47. Reranking Features • Collins (2000) introduces syntactic features for reranking syntactic parses • One-level rules: f(NP → PRP$ NN CD) = 1 • Bigrams, two-level rules, … • To rerank SAPTs, we can introduce a semantic feature type for each syntactic feature type, based on the coupling of syntax and semantics • Example one-level rule, for the fragment NP-PLAYER → PRP$-TEAM NN-PLAYER CD-UNUM: f(PLAYER → TEAM PLAYER UNUM) = 1 (a sketch of the extraction follows)
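A minimal sketch of extracting the paired syntactic and semantic one-level-rule features from a SAPT; the ((syntactic label, semantic label), children) encoding is a hypothetical simplification.

```python
from collections import Counter

def one_level_rules(tree, feats):
    """tree = ((syn, sem), children); children is a word string at the leaves.
    Adds one syntactic and one semantic feature per internal node."""
    (syn, sem), children = tree
    if children and not isinstance(children, str):
        feats["SYN: %s -> %s" % (syn, " ".join(c[0][0] for c in children))] += 1
        feats["SEM: %s -> %s" % (sem, " ".join(c[0][1] for c in children))] += 1
        for child in children:
            one_level_rules(child, feats)
    return feats

np = (("NP", "PLAYER"), [(("PRP$", "TEAM"), "our"),
                         (("NN", "PLAYER"), "player"),
                         (("CD", "UNUM"), "2")])
print(one_level_rules(np, Counter()))
# Counter({'SYN: NP -> PRP$ NN CD': 1, 'SEM: PLAYER -> TEAM PLAYER UNUM': 1})
```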

  48. Reranking Evaluation • Rerank the top 50 parses generated by SCISSOR • Reranking algorithm: averaged perceptron (Collins, 2002), which is simple, fast and effective (sketched below) • On CLang, the reranked results are significantly better; reranking does not improve the results on GeoQuery
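For reference, a minimal sketch of averaged-perceptron reranking in the style of Collins (2002); feature vectors are plain Counters, and the naive per-example weight averaging below favors clarity over speed.

```python
def train_reranker(nbest_lists, epochs=10):
    """nbest_lists: list of (candidates, gold_index) pairs, where each
    candidate is a feature Counter. Returns averaged feature weights."""
    w, w_sum, t = {}, {}, 0
    for _ in range(epochs):
        for cands, gold in nbest_lists:
            score = lambda f: sum(w.get(k, 0.0) * v for k, v in f.items())
            pred = max(range(len(cands)), key=lambda i: score(cands[i]))
            if pred != gold:                       # standard perceptron update
                for k, v in cands[gold].items():
                    w[k] = w.get(k, 0.0) + v
                for k, v in cands[pred].items():
                    w[k] = w.get(k, 0.0) - v
            for k, v in w.items():                 # accumulate for averaging
                w_sum[k] = w_sum.get(k, 0.0) + v
            t += 1
    return {k: v / t for k, v in w_sum.items()}    # averaged weights
```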

  49. Further Investigation of Reranking Features • Semantic Role Labeling (SRL) features: identifying the semantic relations, or semantic roles, of a target word in a given sentence • Example: [giver John] gave [entity given to Mary] [thing given a pen]

  50. Roadmap • Related work on semantic parsing • SCISSOR • Experimental results • Proposed work • Discriminative Reranking for Semantic Parsing • Automating the SAPT-Generation • Other issues • Conclusions
