
Learning to Transform Natural to Formal Languages

Rohit J. Kate, Yuk Wah Wong, Raymond J. Mooney. July 13, 2005. Semantic parsing: transforming natural language sentences into executable, complete formal representations.


Presentation Transcript


  1. Learning to Transform Natural to Formal Languages • Rohit J. Kate, Yuk Wah Wong, Raymond J. Mooney • July 13, 2005

  2. Introduction • Semantic Parsing: Transforming natural language sentences into executable, complete formal representations • Different from Semantic Role Labeling, which involves only shallow semantic analysis • Two application domains: • CLang: RoboCup Coach Language • GeoQuery: A Database Query Application

  3. CLang: RoboCup Coach Language • In the RoboCup Coach competition, teams compete to coach simulated players • The coaching instructions are given in a formal language called CLang • Example: the coach's advice "If the ball is in our penalty area, then all our players except player 4 should stay in our half." is mapped by semantic parsing to the CLang expression ((bpos (penalty-area our)) (do (player-except our{4}) (pos (half our)))), which is sent to the simulated soccer field

  4. GeoQuery: A Database Query Application • Query application for a U.S. geography database containing about 800 facts [Zelle & Mooney, 1996] • Example: the user's question "How many cities are there in the US?" is mapped by semantic parsing to the query answer(A, count(B, (city(B), loc(B, C), const(C, countryid(USA))),A))

  5. Outline • Semantic Parsing using Transformation Rules • Learning Transformation Rules • Experiments • Conclusions

  6. Semantic Parsing using Transformation Rules • SILT (Semantic Interpretation by Learning Transformations) • Uses pattern-based transformation rules which map natural language phrases to formal language constructs • Transformation rules are repeatedly applied to the sentence to construct its formal language expression

  7. Formal Language Grammar • NL: If our player 4 has the ball, our player 4 should shoot. • CLang: ((bowner our {4}) (do our {4} shoot)) • CLang parse (tree figure): RULE has children CONDITION and DIRECTIVE; CONDITION spans bowner TEAM UNUM; DIRECTIVE spans do TEAM UNUM ACTION; the leaves are our, 4, our, 4, shoot • Non-terminals: RULE, CONDITION, ACTION… • Terminals: bowner, our, 4… • Productions: RULE → CONDITION DIRECTIVE; DIRECTIVE → do TEAM UNUM ACTION; ACTION → shoot
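The grammar on this slide can be written down as data. The following is a minimal sketch, not the authors' implementation: it assumes exactly one production per non-terminal, and the entries for CONDITION, TEAM and UNUM are read off the parse tree rather than the slide's production list. As on the slide, CLang's parentheses and braces are omitted from the productions. The names `PRODUCTIONS` and `expand` are hypothetical.

```python
# Toy grammar for the single example on the slide (one production per non-terminal).
# RULE, DIRECTIVE and ACTION come from the slide's production list; CONDITION, TEAM
# and UNUM are assumptions read off the parse tree.
PRODUCTIONS = {
    "RULE": ["CONDITION", "DIRECTIVE"],
    "CONDITION": ["bowner", "TEAM", "UNUM"],
    "DIRECTIVE": ["do", "TEAM", "UNUM", "ACTION"],
    "ACTION": ["shoot"],
    "TEAM": ["our"],
    "UNUM": ["4"],
}


def expand(symbol):
    """Expand a symbol depth-first into its sequence of terminals."""
    if symbol not in PRODUCTIONS:  # terminals expand to themselves
        return [symbol]
    tokens = []
    for child in PRODUCTIONS[symbol]:
        tokens.extend(expand(child))
    return tokens


terminals = expand("RULE")
print(" ".join(terminals))
```

Expanding RULE yields the terminal sequence of the CLang expression for the example sentence, without the bracketing.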

  8. Transformation Rule Representation • Rule has two components: a natural language pattern and an associated formal language template • Two versions of SILT: • String-based rules: used to convert natural language sentence directly to formal language • Tree-based rules: used to convert syntactic tree to formal language • [Figure: a syntactic-tree pattern over the nodes S, NP, VP, VBZ, DT, NN for "TEAM UNUM has the ball"; the string-pattern figure carries the label "word gap"]

  9. Example of Semantic Parsing • If our player 4 has the ball, our player 4 should shoot.

  10. Example of Semantic Parsing • Both occurrences of "our" are replaced by TEAM (our)

  11. Example of Semantic Parsing • If TEAM player 4 has the ball, TEAM player 4 should shoot. • TEAM: our

  12. Example of Semantic Parsing • Both occurrences of "player 4" are replaced by UNUM (4)

  13. Example of Semantic Parsing • If TEAM UNUM has the ball, TEAM UNUM should shoot. • TEAM: our • UNUM: 4

  14. Example of Semantic Parsing • "shoot" is replaced by ACTION (shoot)

  15. Example of Semantic Parsing • If TEAM UNUM has the ball, TEAM UNUM should ACTION. • TEAM: our • UNUM: 4 • ACTION: shoot

  16. Example of Semantic Parsing • "TEAM UNUM has the ball" is replaced by CONDITION (bowner our {4})

  17. Example of Semantic Parsing • If CONDITION, TEAM UNUM should ACTION. • CONDITION: (bowner our {4}) • TEAM: our • UNUM: 4 • ACTION: shoot

  18. Example of Semantic Parsing • "TEAM UNUM should ACTION" is replaced by DIRECTIVE (do our {4} shoot)

  19. Example of Semantic Parsing • If CONDITION, DIRECTIVE. • CONDITION: (bowner our {4}) • DIRECTIVE: (do our {4} shoot)

  20. Example of Semantic Parsing • "If CONDITION, DIRECTIVE." is replaced by RULE ((bowner our {4}) (do our {4} shoot))
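The walkthrough in slides 9 to 20 can be sketched as repeated application of string-based rules. This is an illustrative sketch only: the six rules, the names `RULES`, `NONTERMINALS`, `matches_at`, `apply_once` and `parse`, and the absence of word gaps are all assumptions chosen to reproduce this one example, not SILT's learned rules or its matching procedure.

```python
# Each rule: (pattern, non-terminal produced, template for the formal-language fragment).
# Pattern items named in NONTERMINALS match already-built constituents; all other
# items match words literally. These rules are hypothetical and hand-written.
NONTERMINALS = {"TEAM", "UNUM", "ACTION", "CONDITION", "DIRECTIVE", "RULE"}
RULES = [
    (["our"], "TEAM", "our"),
    (["player", "4"], "UNUM", "4"),
    (["shoot"], "ACTION", "shoot"),
    (["TEAM", "UNUM", "has", "the", "ball"], "CONDITION", "(bowner {0} {{{1}}})"),
    (["TEAM", "UNUM", "should", "ACTION"], "DIRECTIVE", "(do {0} {{{1}}} {2})"),
    (["If", "CONDITION", ",", "DIRECTIVE", "."], "RULE", "({0} {1})"),
]


def matches_at(items, start, pattern):
    """Return the meanings of matched non-terminals, or None if there is no match."""
    if start + len(pattern) > len(items):
        return None
    args = []
    for item, want in zip(items[start:], pattern):
        if want in NONTERMINALS:
            if not (isinstance(item, tuple) and item[0] == want):
                return None
            args.append(item[1])
        elif item != want:
            return None
    return args


def apply_once(items):
    """Apply the first rule that matches anywhere; return the new item list or None."""
    for pattern, symbol, template in RULES:
        for start in range(len(items)):
            args = matches_at(items, start, pattern)
            if args is not None:
                node = (symbol, template.format(*args))
                return items[:start] + [node] + items[start + len(pattern):]
    return None


def parse(sentence):
    """Rewrite the sentence until no rule applies; words are whitespace-separated."""
    items = sentence.split()
    while True:
        rewritten = apply_once(items)
        if rewritten is None:
            return items
        items = rewritten


result = parse("If our player 4 has the ball , our player 4 should shoot .")
print(result)
```

Each pass replaces one matched span with a (non-terminal, meaning) pair, so the sentence shrinks in the same order as the slides: TEAM, UNUM, ACTION, CONDITION, DIRECTIVE, and finally RULE.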

  21. Learning Transformation Rules • SILT induces rules from a corpus of NL sentences paired with their formal representations • Patterns are learned for each production by bottom-up rule learning • For every production: • Call those sentences positives whose formal representations’ parses use that production • Call the remaining sentences negatives
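The positive/negative split described on this slide is simple to state in code. The sketch below assumes each training sentence has already been paired with the set of productions used in the parse of its formal representation; the production sets shown are hypothetical and truncated for illustration, and `split_examples` is an invented name.

```python
def split_examples(corpus, production):
    """Split sentences into positives (parse uses the production) and negatives."""
    positives = [sentence for sentence, used in corpus if production in used]
    negatives = [sentence for sentence, used in corpus if production not in used]
    return positives, negatives


# Hypothetical, truncated production sets for three sentences from the slides.
corpus = [
    ("If the ball is in REGION then player 3 should intercept the ball .",
     {"CONDITION -> (bpos REGION)"}),
    ("Player 2 should pass the ball to REGION if it is in REGION .",
     {"CONDITION -> (bpos REGION)"}),
    ("If our player 6 has the ball then he should take a shot on goal .",
     {"CONDITION -> (bowner TEAM UNUM)"}),
]

positives, negatives = split_examples(corpus, "CONDITION -> (bpos REGION)")
print(len(positives), len(negatives))
```

The same corpus is re-split for every production, so a sentence can be a positive for one production and a negative for another.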

  22. Rule Learning for a Production • SILT applies a greedy-covering, bottom-up rule induction method that repeatedly generalizes positives until they start covering negatives • Production: CONDITION → (bpos REGION) • Positives: • The ball is in REGION , our player 7 is in REGION and no opponent is around our player 7 within 1.5 distance. • If the ball is in REGION and not in REGION then player 3 should intercept the ball. • During normal play if the ball is in the REGION then player 7 , 9 and 11 should dribble the ball to the REGION . • When the play mode is normal and the ball is in the REGION then our player 2 should pass the ball to the REGION . • All players except the goalie should pass the ball to REGION if it is in RP18. • If the ball is inside rectangle ( -54 , -36 , 0 , 36 ) then player 10 should position itself at REGION with a ball attraction of REGION . • Player 2 should pass the ball to REGION if it is in REGION . • Negatives: • If our player 6 has the ball then he should take a shot on goal. • If player 4 has the ball , it should pass the ball to player 2 or 10. • If the condition DR5C3 is true , then player 2 , 3 , 7 and 8 should pass the ball to player 3. • During play on , if players 6 , 7 or 8 is in REGION , they should pass the ball to players 9 , 10 or 11. • If "Clear_Condition" , players 2 , 3 , 7 or 5 should clear the ball REGION . • If it is before the kick off , after our goal or after the opponent's goal , position player 3 at REGION . • If the condition MDR4C9 is met , then players 4-6 should pass the ball to player 9. • If Pass_11 then player 11 should pass to player 9 and no one else.
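The idea of generalizing positives until they start covering negatives can be illustrated with a toy learner. This sketch makes several simplifying assumptions that are not taken from the slides: patterns are plain token subsequences, generalization is an unscored longest common subsequence, word gaps are unlimited, and pairs are tried in list order. The function names and the toy phrases are hypothetical.

```python
from itertools import combinations


def lcs(a, b):
    """Longest common subsequence of two token tuples (simple dynamic program)."""
    table = [[()] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if x == y:
                table[i + 1][j + 1] = table[i][j] + (x,)
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j], key=len)
    return table[-1][-1]


def matches(pattern, tokens):
    """True if pattern occurs in tokens as an in-order (possibly gappy) subsequence."""
    remaining = iter(tokens)
    return all(tok in remaining for tok in pattern)


def learn_rules(positives, negatives):
    """Start with one pattern per positive; merge pairs while no negative is covered."""
    rules = [tuple(s.split()) for s in positives]
    negs = [s.split() for s in negatives]
    merged = True
    while merged:
        merged = False
        for r1, r2 in combinations(rules, 2):
            general = lcs(r1, r2)
            if general and not any(matches(general, n) for n in negs):
                rules = [r for r in rules if r not in (r1, r2)] + [general]
                merged = True
                break
    return rules


# Toy phrases loosely based on the slide's examples (hypothetical data).
positives = ["the ball is in REGION", "if the ball is in the REGION", "it is in REGION"]
negatives = ["our player 6 has the ball", "player 7 is in REGION"]
rules = learn_rules(positives, negatives)
print(rules)
```

On this toy data the first two positives merge into "the ball is in REGION", but merging that with "it is in REGION" would give "is in REGION", which covers the negative "player 7 is in REGION", so generalization stops with two patterns.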

  23. Generalization of String Patterns • Production: ACTION → (pos REGION) • Pattern 1: Always position player UNUM at REGION . • Pattern 2: Whenever the ball is in REGION , position player UNUM near the REGION . • Find the highest scoring common subsequence

  24. Generalization of String Patterns • Production: ACTION → (pos REGION) • Pattern 1: Always position player UNUM at REGION . • Pattern 2: Whenever the ball is in REGION , position player UNUM near the REGION . • Find the highest scoring common subsequence • Generalization: position player UNUM [2] REGION .
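The generalization on this slide can be reproduced with a short sketch. It assumes a plain longest common subsequence stands in for the slide's "highest scoring" common subsequence (the scoring function is not given in the transcript), and that a gap `[k]` records the largest number of words skipped between two adjacent matched tokens in either pattern. The function names are hypothetical.

```python
def lcs_alignment(a, b):
    """Return index pairs (i, j) of one longest common subsequence of a and b."""
    n, m = len(a), len(b)
    # length[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    length = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                length[i][j] = 1 + length[i + 1][j + 1]
            else:
                length[i][j] = max(length[i + 1][j], length[i][j + 1])
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i, j = i + 1, j + 1
        elif length[i + 1][j] >= length[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def generalize(pattern1, pattern2):
    """Keep the common tokens; insert [k] where up to k words were skipped."""
    a, b = pattern1.split(), pattern2.split()
    out, prev = [], None
    for i, j in lcs_alignment(a, b):
        if prev is not None:
            gap = max(i - prev[0] - 1, j - prev[1] - 1)
            if gap > 0:
                out.append(f"[{gap}]")
        out.append(a[i])
        prev = (i, j)
    return " ".join(out)


p1 = "Always position player UNUM at REGION ."
p2 = "Whenever the ball is in REGION , position player UNUM near the REGION ."
print(generalize(p1, p2))  # position player UNUM [2] REGION .
```

Pattern 1 skips one word ("at") and Pattern 2 skips two ("near the") between UNUM and REGION, which is where the `[2]` comes from.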

  25. Generalization of Tree Patterns • Production: REGION → (penalty-area TEAM) • Pattern 1: (NP (PRP$ TEAM) (NN penalty) (NN area)) • Pattern 2: (NP (NP TEAM (POS ’s)) (NN penalty) (NN box)) • Find common subgraphs

  26. Generalization of Tree Patterns • Production: REGION → (penalty-area TEAM) • Pattern 1: (NP (PRP$ TEAM) (NN penalty) (NN area)) • Pattern 2: (NP (NP TEAM (POS ’s)) (NN penalty) (NN box)) • Find common subgraphs • Generalization: (NP (* TEAM) (NN penalty) (NN))

  27. Rule Learning for a Production • Production: CONDITION → (bpos REGION) • Positives: • The ball is in REGION , our player 7 is in REGION and no opponent is around our player 7 within 1.5 distance. • If the ball is in REGION and not in REGION then player 3 should intercept the ball. • During normal play if the ball is in the REGION then player 7 , 9 and 11 should dribble the ball to the REGION . • When the play mode is normal and the ball is in the REGION then our player 2 should pass the ball to the REGION . • All players except the goalie should pass the ball to REGION if it is in REGION. • If the ball is inside REGION then player 10 should position itself at REGION with a ball attraction of REGION . • Player 2 should pass the ball to REGION if it is in REGION . • Negatives: • If our player 6 has the ball then he should take a shot on goal. • If player 4 has the ball , it should pass the ball to player 2 or 10. • If the condition DR5C3 is true , then player 2 , 3 , 7 and 8 should pass the ball to player 3. • During play on , if players 6 , 7 or 8 is in REGION , they should pass the ball to players 9 , 10 or 11. • If "Clear_Condition" , players 2 , 3 , 7 or 5 should clear the ball REGION . • If it is before the kick off , after our goal or after the opponent's goal , position player 3 at REGION . • If the condition MDR4C9 is met , then players 4-6 should pass the ball to player 9. • If Pass_11 then player 11 should pass to player 9 and no one else. • [Figure label: Bottom-up Rule Learner]

  28. Rule Learning for a Production • Production: CONDITION → (bpos REGION) • Positives: • The CONDITION , our player 7 is in REGION and no opponent is around our player 7 within 1.5 distance. • If the CONDITION and not in REGION then player 3 should intercept the ball. • During normal play if the CONDITION then player 7 , 9 and 11 should dribble the ball to the REGION . • When the play mode is normal and the CONDITION then our player 2 should pass the ball to the REGION . • All players except the goalie should pass the ball to REGION if CONDITION. • If the CONDITION then player 10 should position itself at REGION with a ball attraction of REGION . • Player 2 should pass the ball to REGION if CONDITION . • Negatives: • If our player 6 has the ball then he should take a shot on goal. • If player 4 has the ball , it should pass the ball to player 2 or 10. • If the condition DR5C3 is true , then player 2 , 3 , 7 and 8 should pass the ball to player 3. • During play on , if players 6 , 7 or 8 is in REGION , they should pass the ball to players 9 , 10 or 11. • If "Clear_Condition" , players 2 , 3 , 7 or 5 should clear the ball REGION . • If it is before the kick off , after our goal or after the opponent's goal , position player 3 at REGION . • If the condition MDR4C9 is met , then players 4-6 should pass the ball to player 9. • If Pass_11 then player 11 should pass to player 9 and no one else. • [Figure label: Bottom-up Rule Learner]

  29. Rule Learning for All Productions • Transformation rules for productions should cooperate globally to generate complete semantic parses • Redundantly cover every positive example by β = 5 best rules • Find the subset of these rules which best cooperate to generate complete semantic parses on the training data • [Figure labels: accuracy, coverage]

  30. Experimental Corpora • CLang • 300 randomly selected pieces of coaching advice from the log files of the 2003 RoboCup Coach Competition • 22.52 words on average in NL sentences • 14.24 tokens on average in formal expressions • GeoQuery [Zelle & Mooney, 1996] • 250 queries for the given U.S. geography database • 6.87 words on average in NL sentences • 5.32 tokens on average in formal expressions

  31. Experimental Methodology • Evaluated using standard 10-fold cross validation • Syntactic parses needed by the tree-based version were obtained by training Collins’ parser [Bikel, 2004] on the WSJ treebank and gold-standard parses of training sentences • Correctness • CLang: output exactly matches the correct representation • GeoQuery: the resulting query retrieves the same answer as the correct representation • Metrics: precision and recall
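The transcript names the metrics but does not reproduce their formulas. The sketch below therefore assumes the usual convention for semantic parsers that may fail to produce an output: precision is correct outputs over produced outputs, recall is correct outputs over all test sentences, and F-measure is their harmonic mean. The CLang correctness test is exact string match, as stated on the slide. The function names and the third gold expression are hypothetical.

```python
def evaluate(outputs, gold, is_correct):
    """Return (precision, recall, f_measure); outputs[i] is None if no parse was produced."""
    produced = [(out, ref) for out, ref in zip(outputs, gold) if out is not None]
    correct = sum(1 for out, ref in produced if is_correct(out, ref))
    precision = correct / len(produced) if produced else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def clang_correct(output, reference):
    """CLang criterion from the slide: the output must match the reference exactly."""
    return output == reference


gold = [
    "((bpos (penalty-area our)) (do (player-except our{4}) (pos (half our))))",
    "((bowner our {4}) (do our {4} shoot))",
    "((bpos (penalty-area opp)) (do our {4} shoot))",  # hypothetical third example
]
outputs = [
    "((bpos (penalty-area opp)) (do (player-except our{4}) (pos (half our))))",  # wrong team
    "((bowner our {4}) (do our {4} shoot))",  # exact match
    None,  # no complete parse produced
]

precision, recall, f_measure = evaluate(outputs, gold, clang_correct)
print(precision, recall, f_measure)
```

For GeoQuery, `is_correct` would instead run both queries against the database and compare the retrieved answers; that step is not sketched here.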

  32. Compared Systems • CHILL • Learns control rules for shift-reduce parsing using Inductive Logic Programming (ILP) • CHILLIN [Zelle & Mooney, 1996] • COCKTAIL [Tang & Mooney, 2001] • GEOBASE • Hand-built parser for GeoQuery [Borland International, 1988]

  33. Precision Learning Curves for CLang

  34. Recall Learning Curves for CLang

  35. Precision Learning Curves for GeoQuery

  36. Recall Learning Curves for GeoQuery

  37. Related Work • SCISSOR [Ge & Mooney, 2005] • Integrates semantic and syntactic statistical parsing • Requires extensive annotations but gives better results • PRECISE [Popescu et al., 2003] • Designed to work specially on NL database interfaces • Speech Recognition Community [Zue & Glass, 2000] • Simpler queries in ATIS corpus

  38. Conclusions • New approach for semantic parsing, SILT, which uses transformation rules • SILT learns transformation rules by doing bottom-up rule induction exploiting the target language grammar • Tested on two very different domains, performs better than previous ILP-based approaches

  39. Thank You! Our corpora can be downloaded from: http://www.cs.utexas.edu/~ml/nldata.html Questions??

  40. F-measure Learning Curves for CLang

  41. F-measure Learning Curves for GeoQuery

  42. Extra Slide: Average Training Time in Minutes

  43. Extra Slide: Variations of Rule Representation • Context in the patterns:

  44. Extra Slide: Variations of Rule Representation • Context in the patterns: • Production: CONDITION → (bpos REGION) • Pattern: TEAM UNUM has the ball in REGION

  45. Extra Slide: Variations of Rule Representation • Context in the patterns: • Templates with multiple productions:

  46. Extra Slide: Experimental Methodology • Correctness • CLang: output exactly matches the correct representation • GeoQuery: the resulting query retrieves the same answer as the correct representation • Example: If the ball is in our penalty area, all our players except player 4 should stay in our half. • Correct: ((bpos (penalty-area our)) (do (player-except our{4}) (pos (half our)))) • Output: ((bpos (penalty-area opp)) (do (player-except our{4}) (pos (half our))))

  47. Extra Slide: Future Work • Hard-matching symbolic patterns are sometimes too brittle; exploit string and tree kernels as classifiers instead [Lodhi et al., 2002] • Unified implementation of the string-based and tree-based versions for direct comparisons
