
CS626: NLP, Speech and the Web



Presentation Transcript


  1. CS626: NLP, Speech and the Web. Pushpak Bhattacharyya, CSE Dept., IIT Bombay. Lectures 15 and 17: Parsing Ambiguity, Probabilistic Parsing, sample seminar. 17th and 24th September, 2012 (the 16th lecture was on UNL, by Avishek).

  2. Dealing With Structural Ambiguity • Multiple parses for a sentence • The man saw the boy with a telescope. • The man saw the mountain with a telescope. • The man saw the boy with the ponytail. At the level of syntax, all these sentences are ambiguous, but semantics can disambiguate the 2nd and 3rd sentences.

  3. Prepositional Phrase (PP) Attachment Problem: V – NP1 – P – NP2 (here P is a preposition). Does P NP2 attach to NP1, or does it attach to V?

  4. Xerox Parsers • XLE Parser (freely available) • XIP Parser (expensive) • Linguistic Data Consortium (LDC, at UPenn) • Europe: European Language Resources Association (ELRA) • India: Linguistic Data Consortium for Indian Languages (LDC-IL)

  5. Parse Trees for a Structurally Ambiguous Sentence. Let the grammar be: S → NP VP, NP → DT N | DT N PP, PP → P NP, VP → V NP PP | V NP. For the sentence “I saw a boy with a telescope”

  6. Parse Tree 1 (PP attached inside the object NP): [S [NP [N I]] [VP [V saw] [NP [Det a] [N boy] [PP [P with] [NP [Det a] [N telescope]]]]]]

  7. Parse Tree 2 (PP attached to the VP): [S [NP [N I]] [VP [V saw] [NP [Det a] [N boy]] [PP [P with] [NP [Det a] [N telescope]]]]]

  8. Parsing Structural Ambiguity

  9. Parsing for Structurally Ambiguous Sentences. Sentence: “I saw a boy with a telescope”. Grammar: S → NP VP, NP → ART N | ART N PP | PRON, VP → V NP PP | V NP, PP → P NP, ART → a | an | the, N → boy | telescope, PRON → I, V → saw, P → with

  10. Ambiguous Parses • Two possible parses: • PP attached to the verb (i.e. I used a telescope to see): (S (NP (PRON “I”)) (VP (V “saw”) (NP (ART “a”) (N “boy”)) (PP (P “with”) (NP (ART “a”) (N “telescope”))))) • PP attached to the noun (i.e. the boy had a telescope): (S (NP (PRON “I”)) (VP (V “saw”) (NP (ART “a”) (N “boy”) (PP (P “with”) (NP (ART “a”) (N “telescope”))))))
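
The two parses can be reproduced mechanically with a chart parser. The sketch below uses NLTK purely as an illustration (an assumption; the lecture does not name a tool) and encodes the slide 9 grammar:

import nltk

# Grammar from slide 9 (including PP -> P NP and P -> 'with').
grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    NP -> ART N | ART N PP | PRON
    VP -> V NP PP | V NP
    PP -> P NP
    ART -> 'a' | 'an' | 'the'
    N -> 'boy' | 'telescope'
    PRON -> 'I'
    V -> 'saw'
    P -> 'with'
""")

parser = nltk.ChartParser(grammar)
tokens = "I saw a boy with a telescope".split()

# Prints two trees: one with the PP under VP (verb attachment) and
# one with the PP inside the object NP (noun attachment).
for tree in parser.parse(tokens):
    print(tree)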

  11.–25. Top Down Parse (a step-by-step top-down derivation of the sentence, shown as a sequence of tree-diagram figures)

  26. Top Down Parsing - Observations • Top down parsing gave us the verb attachment parse tree (i.e., I used a telescope to see). • To obtain the alternate parse tree, the backup state in step 5 has to be invoked. • Is there an efficient way to obtain all parses?
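
To make the "backup state" idea concrete, here is a minimal backtracking top-down parser (my own sketch, not code from the lecture; all names are illustrative). Every alternative right-hand side is a choice point, and exhausting all choice points yields every parse, including both attachments:

GRAMMAR = {
    "S":    [["NP", "VP"]],
    "NP":   [["ART", "N"], ["ART", "N", "PP"], ["PRON"]],
    "VP":   [["V", "NP", "PP"], ["V", "NP"]],
    "PP":   [["P", "NP"]],
    "ART":  [["a"], ["an"], ["the"]],
    "N":    [["boy"], ["telescope"]],
    "PRON": [["I"]],
    "V":    [["saw"]],
    "P":    [["with"]],
}

def expand(symbol, words, start):
    """Yield (subtree, next_position) for every way `symbol` can derive a
    prefix of words[start:]. Each alternative rule is a backup state."""
    for rhs in GRAMMAR[symbol]:
        if len(rhs) == 1 and rhs[0] not in GRAMMAR:        # lexical rule
            if start < len(words) and words[start] == rhs[0]:
                yield (symbol, rhs[0]), start + 1
            continue
        for children, end in expand_sequence(rhs, words, start):
            yield (symbol, children), end

def expand_sequence(symbols, words, start):
    """Yield (children, next_position) for a sequence of symbols."""
    if not symbols:
        yield [], start
        return
    for first, mid in expand(symbols[0], words, start):
        for rest, end in expand_sequence(symbols[1:], words, mid):
            yield [first] + rest, end

tokens = "I saw a boy with a telescope".split()
parses = [tree for tree, end in expand("S", tokens, 0) if end == len(tokens)]
for tree in parses:
    print(tree)      # two parses: verb attachment and noun attachment

NLTK's RecursiveDescentParser implements the same top-down backtracking strategy.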

  27. Bottom Up Parse. Sentence with word-boundary positions: 1 I 2 saw 3 a 4 boy 5 with 6 a 7 telescope 8. Colour scheme: blue for normal parse, green for verb attachment parse, purple for noun attachment parse, red for invalid parse. The chart grows cumulatively over the following slides; each step below lists the entries it adds.

  28. Bottom Up Parse. Adds: NP12 → PRON12; S1? → NP12 VP2?

  29. Bottom Up Parse. Adds: VP2? → V23 NP3?; VP2? → V23 NP3? PP??

  30. Bottom Up Parse. Adds: NP35 → ART34 N45; NP3? → ART34 N45 PP5?

  31. Bottom Up Parse. Adds: duplicate copies of NP35 → ART34 N45 and NP3? → ART34 N45 PP5?

  32. Bottom Up Parse. Adds: VP25 → V23 NP35; VP2? → V23 NP35 PP5?; S15 → NP12 VP25

  33. Bottom Up Parse. Adds: PP5? → P56 NP6?

  34. Bottom Up Parse. Adds: NP68 → ART67 N7?; NP6? → ART67 N78 PP8?

  35. Bottom Up Parse. Adds: NP68 → ART67 N78

  36. Bottom Up Parse. Adds: PP58 → P56 NP68

  37. Bottom Up Parse. Adds: NP38 → ART34 N45 PP58

  38. Bottom Up Parse. Adds: VP28 → V23 NP35 PP58; VP28 → V23 NP38

  39. Bottom Up Parse. Adds: S18 → NP12 VP28

  40. Bottom Up Parsing - Observations • Both the noun attachment and the verb attachment parses are obtained simply by systematically applying the rules. • The subscript position numbers help in verifying the parse and in extracting chunks from it.
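
The subscripted chart entries can be generated programmatically. The following bottom-up span-filling sketch (again my own illustration; the names and data structures are assumptions) records every constituent with its start and end word-boundary positions and ends up with exactly the NP35, PP58, VP28 and S18 entries seen on the slides:

from itertools import product

RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("ART", "N")), ("NP", ("ART", "N", "PP")), ("NP", ("PRON",)),
    ("VP", ("V", "NP", "PP")), ("VP", ("V", "NP")),
    ("PP", ("P", "NP")),
]
LEXICON = {"I": "PRON", "saw": "V", "a": "ART", "an": "ART", "the": "ART",
           "boy": "N", "telescope": "N", "with": "P"}

def bottom_up_parse(words):
    n = len(words)
    # chart[(i, j)] = set of symbols spanning word boundaries i..j (1-based),
    # so NP over (3, 5) corresponds to the slides' NP35.
    chart = {(i, i + 1): {LEXICON[w]} for i, w in enumerate(words, start=1)}
    for width in range(1, n + 1):                  # smaller spans first
        for i in range(1, n - width + 2):
            j = i + width
            cell = chart.setdefault((i, j), set())
            for lhs, rhs in RULES:
                # every way of cutting [i, j) into len(rhs) adjacent sub-spans
                for cuts in product(range(i + 1, j), repeat=len(rhs) - 1):
                    bounds = (i,) + cuts + (j,)
                    if list(bounds) != sorted(set(bounds)):
                        continue                   # cut points must strictly increase
                    if all(sym in chart.get((bounds[k], bounds[k + 1]), set())
                           for k, sym in enumerate(rhs)):
                        cell.add(lhs)
    return chart

chart = bottom_up_parse("I saw a boy with a telescope".split())
for (i, j), symbols in sorted(chart.items()):
    for sym in sorted(symbols):
        print(f"{sym}{i}{j}")   # prints entries such as NP35, PP58, VP28, S18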

  41. Exercise For the sentence, “The man saw the boy with a telescope” & the grammar given previously, compare the performance of top-down, bottom-up & top-down chart parsing.

  42. Start of Probabilistic Parsing

  43. Example of Sentence Labeling: Parsing [S1 [S [S [VP [VB Come] [NP [NNP July]]]] [, ,] [CC and] [S [NP [DT the] [JJ IIT] [NN campus]] [VP [AUX is] [ADJP [JJ abuzz] [PP [IN with] [NP [ADJP [JJ new] [CC and] [VBG returning]] [NNS students]]]]]] [. .]]]

  44. Noisy Channel Modeling. Source sentence → Noisy Channel → Target parse. T* = argmax_T P(T|S) = argmax_T P(T) · P(S|T) = argmax_T P(T), since given the parse the sentence is completely determined and P(S|T) = 1.
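
Written out with the Bayes rule step made explicit (the P(S) denominator is constant over T, which is why it can be dropped; this expansion is standard and not spelled out on the slide):

T^* = \arg\max_T P(T \mid S)
    = \arg\max_T \frac{P(T)\, P(S \mid T)}{P(S)}
    = \arg\max_T P(T)\, P(S \mid T)
    = \arg\max_T P(T) \qquad \text{since } P(S \mid T) = 1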

  45. Corpus • A collection of text, called a corpus, is used for collecting various language data. • With annotation: more information, but manual-labour intensive. • Practice: label automatically, then correct manually. • The famous Brown Corpus contains 1 million tagged words. • Switchboard: a very famous corpus with 2400 conversations, 543 speakers and many US dialects, annotated with orthography and phonetics.

  46. Discriminative vs. Generative Model W* = argmax_W P(W|SS) • Discriminative model: compute P(W|SS) directly. • Generative model: compute it from P(W) · P(SS|W).

  47. Language Models • N-grams: sequences of n consecutive words/characters • Probabilistic / Stochastic Context Free Grammars: • simple probabilistic models capable of handling recursion • a CFG with probabilities attached to rules • Rule probabilities → how likely is it that a particular rewrite rule is used?

  48. PCFGs • Why PCFGs? • Intuitive probabilistic models for tree-structured languages • Algorithms are extensions of HMM algorithms • Better than the n-gram model for language modeling.
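
To make "probabilities attached to rules" concrete, here is a toy PCFG sketch in NLTK (the rule probabilities are invented for illustration, not estimated from a treebank); the Viterbi parser multiplies the probabilities of the rules used in a tree and returns the most probable tree, which is how a PCFG resolves the PP attachment ambiguity:

import nltk

# A toy PCFG for the telescope sentence; probabilities for each LHS sum to 1.
# The numbers are illustrative assumptions, not corpus-estimated values.
pcfg = nltk.PCFG.fromstring("""
    S    -> NP VP          [1.0]
    NP   -> ART N [0.5] | ART N PP [0.2] | PRON [0.3]
    VP   -> V NP PP [0.4] | V NP [0.6]
    PP   -> P NP           [1.0]
    ART  -> 'a' [0.6] | 'the' [0.4]
    N    -> 'boy' [0.5] | 'telescope' [0.5]
    PRON -> 'I'            [1.0]
    V    -> 'saw'          [1.0]
    P    -> 'with'         [1.0]
""")

parser = nltk.ViterbiParser(pcfg)
tokens = "I saw a boy with a telescope".split()

# The parser scores each candidate tree by the product of its rule
# probabilities and returns the best one; with these numbers the verb
# attachment (VP -> V NP PP) wins, roughly 0.0027 vs 0.0016.
for tree in parser.parse(tokens):
    print(tree.prob(), tree)

Changing the relative probabilities of VP → V NP PP and NP → ART N PP flips which attachment wins.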

  49. Data driven parsing is tied up with what is seen in the training data • “Stand right walk left.” Often a message on airport escalators. • “Walk left” can come from: “After climbing, we realized the walk left was still considerable.” • “Stand right” can come from: “The stand, right though it seemed at that time, does not stand the scrutiny now.” • Then the given sentence “Stand right walk left” will never be parsed correctly.

  50. Chomsky Hierarchy. Properties of the hierarchy: • The containment is proper: Type 3 (regular grammar) ⊂ Type 2 (CFG) ⊂ Type 1 (CSG) ⊂ Type 0. • Grammar is not learnable from positive examples alone, even for the lowest type.
