Parsing with Context Free Grammars

Reading: Chap 13, Jurafsky & Martin

This slide set was adapted from J. Martin, U. Colorado

Instructor: Paul Tarau, based on Rada Mihalcea's original slides

Parsing
  • Parsing with CFGs refers to the task of assigning correct trees to input strings
  • "Correct" here means a tree that covers all and only the elements of the input and has an S at the top
  • It doesn't actually mean that the system can select the correct tree from among the possible trees
  • As with most interesting problems, parsing involves a search, and searching means making choices
Some assumptions…
  • Assume…
    • You have all the words already in some buffer
    • The input is (or isn't) POS tagged
    • All the words are known
  • These are all (quite) feasible
    • State of the art in POS tagging?
    • "All words are known"?
Top-Down Parsing
  • Since we are trying to find trees rooted with an S (Sentence), start with the rules that give us an S.
  • Then work your way down from there to the words.
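The top-down strategy can be sketched as a small recursive-descent recognizer. The grammar and lexicon below are illustrative assumptions (not from the slides), and this grammar deliberately avoids left recursion:

```python
# A minimal top-down (recursive-descent) recognizer for a toy CFG.
# Grammar and lexicon are illustrative assumptions, not from the slides.
GRAMMAR = {
    "S":       [["NP", "VP"], ["VP"]],          # declarative or imperative
    "NP":      [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
    "VP":      [["Verb", "NP"], ["Verb"]],
}
LEXICON = {"Det": {"the", "a", "that"},
           "Noun": {"flight", "book"},
           "Verb": {"book", "list"}}

def parse(symbols, words):
    """Expand `symbols` top-down, trying to cover exactly `words`."""
    if not symbols:
        return not words                      # success iff all words consumed
    first, rest = symbols[0], symbols[1:]
    if first in LEXICON:                      # preterminal: match one word
        return bool(words) and words[0] in LEXICON[first] and parse(rest, words[1:])
    return any(parse(list(rhs) + rest, words)  # nonterminal: try each rule
               for rhs in GRAMMAR.get(first, []))

print(parse(["S"], "book that flight".split()))   # True
```

Starting from `["S"]`, the recognizer only ever proposes trees rooted in S, but it may expand rules that can never match the input words, which is exactly the weakness noted below.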
Bottom-Up Parsing
  • Of course, we also want trees that cover the input words. So start with trees that link up with the words in the right way.
  • Then work your way up from there.
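The bottom-up strategy can likewise be sketched as a tiny shift-reduce recognizer over the same kind of toy grammar (again an illustrative assumption; exhaustive backtracking keeps the sketch simple at the cost of efficiency):

```python
# A minimal bottom-up (shift-reduce) recognizer for a toy CFG.
# Rules and lexicon are illustrative assumptions, not from the slides.
RULES = [("S", ("VP",)), ("S", ("NP", "VP")),
         ("NP", ("Det", "Nominal")), ("Nominal", ("Noun",)),
         ("VP", ("Verb", "NP")), ("VP", ("Verb",))]
LEXICON = {"that": {"Det"}, "the": {"Det"},
           "flight": {"Noun"}, "book": {"Noun", "Verb"}}

def recognize(stack, words):
    if stack == ("S",) and not words:          # a single S covering all input
        return True
    # reduce: replace a matching suffix of the stack with the rule's LHS
    for lhs, rhs in RULES:
        if stack[-len(rhs):] == rhs:
            if recognize(stack[:-len(rhs)] + (lhs,), words):
                return True
    # shift: push the next word's possible categories onto the stack
    if words:
        for cat in LEXICON.get(words[0], ()):
            if recognize(stack + (cat,), words[1:]):
                return True
    return False

print(recognize((), tuple("book that flight".split())))   # True
```

Every partial structure it builds is anchored in the words, but many of the stacks it explores (e.g. reducing "book" to a Noun) can never combine into an S, which is the bottom-up weakness noted in the next slide.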
Top-Down vs. Bottom-Up
  • Top-down
    • Only searches for trees that can be answers
    • But suggests trees that are not consistent with the words
    • Guarantees that tree starts with S as root
    • Does not guarantee that tree will match input words
  • Bottom-up
    • Only forms trees consistent with the words
    • But suggests trees that make no sense globally
    • Guarantees that tree matches input words
    • Does not guarantee that parse tree will lead to S as a root
  • Combine the advantages of the two by doing a search constrained from both sides (top and bottom)
Example (cont'd)

[Parse-tree figures over "flight"; the images did not survive in the transcript]
Possible Problem: Left-Recursion
  • What happens in the following situation?
    • S -> NP VP
    • S -> Aux NP VP
    • NP -> NP PP
    • NP -> Det Nominal
    • With the sentence starting with
      • Did the flight…
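The failure is easy to make concrete: a depth-first top-down parser that always tries the first rule keeps rewriting NP into NP PP and never reaches a word, so it never even gets to the Aux rule that "Did the flight…" needs. The sketch below (toy grammar, illustrative assumption) uses a step budget to make the non-termination observable, and then previews the rule-ordering fix:

```python
# Why left recursion breaks naive top-down parsing: expanding the leftmost
# nonterminal with NP -> NP PP first predicts NP forever. A step budget
# makes the non-termination observable. (Toy grammar; illustrative.)
GRAMMAR = {"S":  [["NP", "VP"], ["Aux", "NP", "VP"]],
           "NP": [["NP", "PP"], ["Det", "Nominal"]]}

def expand_leftmost(symbols, budget):
    """Expand the leftmost nonterminal depth-first, always taking rule 1.
    Returns steps used, or None if the budget runs out first."""
    steps = 0
    while symbols and symbols[0] in GRAMMAR:
        if steps >= budget:
            return None                  # NP -> NP PP -> NP PP PP -> ...
        symbols = GRAMMAR[symbols[0]][0] + symbols[1:]
        steps += 1
    return steps

looping = expand_leftmost(["S"], budget=1000)          # None: never halts
GRAMMAR["NP"] = [["Det", "Nominal"], ["NP", "PP"]]     # base case first
terminating = expand_leftmost(["S"], budget=1000)      # 2 steps to a word
print(looping, terminating)
```

With the recursive NP rule tried first the budget is exhausted; with the base case ordered first the leftmost expansion bottoms out immediately, which is the rule-ordering workaround of the next slide.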
Solution: Rule Ordering
  • S -> Aux NP VP
  • S -> NP VP
  • NP -> Det Nominal
  • NP -> NP PP
  • The key for the NP rules is to order the recursive option after any base case.
Avoiding Repeated Work
  • Parsing is hard, and slow. It's wasteful to redo stuff over and over and over.
  • Consider an attempt to top-down parse the following as an NP:
    • A flight from Indianapolis to Houston on TWA
[Parse-tree figures showing "flight" constituents being re-derived; the images did not survive in the transcript]
Dynamic Programming
  • We need a method that fills a table with partial results that
    • Does not do (avoidable) repeated work
    • Does not fall prey to left-recursion
    • Solves an exponentially large search problem in polynomial time (for Earley, O(N³))
Earley Parsing
  • Fills a table in a single sweep over the input words
    • Table is of length N+1, where N is the number of words
    • Table entries represent
      • Completed constituents and their locations
      • In-progress constituents
      • Predicted constituents
States
  • The table-entries are called states and are represented with dotted-rules.
    • S -> · VP A VP is predicted
    • NP -> Det · Nominal An NP is in progress
    • VP -> V NP · A VP has been found
States/Locations
  • It would be nice to know where these things are in the input so…
    • S -> · VP [0,0] Predictor
    • A VP is predicted at the start of the sentence
    • NP -> Det · Nominal [1,2] Scanner
    • An NP is in progress; the Det goes from 1 to 2
    • VP -> V NP · [0,3] Completer
    • A VP has been found starting at 0 and ending at 3
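Dotted rules with spans map directly onto a small data structure. The representation below is one reasonable encoding (an illustrative assumption, not the book's exact code): a state is complete when the dot has reached the end of the right-hand side, and otherwise it names the symbol it is waiting for.

```python
# One way to represent Earley chart states as dotted rules with spans.
# (Representation is an illustrative assumption, not the book's code.)
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    lhs: str      # left-hand side, e.g. "NP"
    rhs: tuple    # right-hand side symbols, e.g. ("Det", "Nominal")
    dot: int      # position of the dot within rhs
    start: int    # input position where the constituent begins
    end: int      # input position of the dot

    def complete(self):
        return self.dot == len(self.rhs)

    def next_symbol(self):
        return None if self.complete() else self.rhs[self.dot]

s = State("NP", ("Det", "Nominal"), 1, 1, 2)   # NP -> Det . Nominal [1,2]
print(s.complete(), s.next_symbol())           # False Nominal
```

Making states hashable (`frozen=True`) matters later: the chart can then be a set, and the "never add a duplicate state" check becomes a simple membership test.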
Earley
  • As with most dynamic programming approaches, the answer is found by looking in the table in the right place.
  • In this case, there should be a complete S state in the final column that spans from 0 to N.
  • If that's the case you're done.
    • S -> α · [0,N]
  • So sweep through the table from 0 to N…
    • Predictor: new predicted states are created from states in the current chart entry
    • Scanner: states are advanced over the next input word when it matches the category they expect
    • Completer: states are advanced when a constituent they are waiting for is completed
Earley
  • More specifically…

1. Predict all the states you can upfront

2. Read a word

      • Extend states based on matches
      • Add new predictions
      • Go to 2

3. Look in the final column (position N) to see if you have a winner
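The steps above fit in a compact recognizer. The grammar and lexicon are illustrative assumptions (for the "Book that flight" example); states are `(lhs, rhs, dot, start)` tuples, a dummy `GAMMA -> . S` state seeds the chart, and the scanner here advances a state directly over a word whose lexical category matches:

```python
# A compact Earley recognizer, run on "book that flight".
# Grammar/lexicon are illustrative assumptions; states are (lhs, rhs, dot, start).
GRAMMAR = {
    "S":       [("NP", "VP"), ("VP",)],
    "NP":      [("Det", "Nominal")],
    "Nominal": [("Noun",)],
    "VP":      [("Verb",), ("Verb", "NP")],
}
LEXICON = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}

def earley(words):
    chart = [set() for _ in range(len(words) + 1)]
    chart[0].add(("GAMMA", ("S",), 0, 0))            # dummy start state
    for i in range(len(words) + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, start = agenda.pop()
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in GRAMMAR:                    # Predictor
                    for prod in GRAMMAR[nxt]:
                        new = (nxt, prod, 0, i)
                        if new not in chart[i]:
                            chart[i].add(new); agenda.append(new)
                elif i < len(words) and nxt in LEXICON.get(words[i], ()):
                    chart[i + 1].add((lhs, rhs, dot + 1, start))   # Scanner
            else:                                     # Completer
                for plhs, prhs, pdot, pstart in list(chart[start]):
                    if pdot < len(prhs) and prhs[pdot] == lhs:
                        new = (plhs, prhs, pdot + 1, pstart)
                        if new not in chart[i]:
                            chart[i].add(new); agenda.append(new)
    return ("GAMMA", ("S",), 1, 0) in chart[len(words)]   # complete S over [0,N]

print(earley("book that flight".split()))   # True
```

Acceptance is exactly the check described above: a completed state spanning the whole input (here via the advanced dummy state) in the final column.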

Earley and Left Recursion
  • So Earley solves the left-recursion problem without having to alter the grammar or artificially limit the search
    • Never place a state into the chart that's already there
    • Copy states before advancing them
  • S -> NP VP
  • NP -> NP PP
  • Predicting from the first rule
    • S -> · NP VP [0,0] adds
    • NP -> · NP PP [0,0]
    • and prediction stops there, since any further prediction would only duplicate a state already in the chart
  • When a state gets advanced make a copy and leave the original alone
    • Say we have NP -> · NP PP [0,0]
    • We find an NP from 0 to 2 so we create NP -> NP · PP [0,2]
    • But we leave the original state as is
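The duplicate check that tames left recursion is easy to isolate: running just the predictor to a fixed point on a left-recursive grammar (toy rules below, an illustrative assumption) terminates because re-predicting `NP -> · NP PP [0,0]` finds the state already in the chart.

```python
# The duplicate check that tames left recursion: predicting from
# NP -> . NP PP [0,0] would re-add the same state, so the chart (a set)
# stops growing and prediction halts. (Toy grammar; illustrative.)
GRAMMAR = {"S": [("NP", "VP")], "NP": [("NP", "PP"), ("Det", "Nominal")]}

def predict_closure(column_index=0):
    chart = {("GAMMA", ("S",), 0, 0)}          # dummy start state
    agenda = list(chart)
    while agenda:
        lhs, rhs, dot, start = agenda.pop()
        nxt = rhs[dot] if dot < len(rhs) else None
        if nxt in GRAMMAR:                     # Predictor only
            for prod in GRAMMAR[nxt]:
                new = (nxt, prod, 0, column_index)
                if new not in chart:           # duplicates are skipped
                    chart.add(new)
                    agenda.append(new)
    return chart

closure = predict_closure()
print(len(closure))   # 4: GAMMA, S, and the two NP predictions, then it stops
```

A naive top-down expander would recurse on NP forever here; the set-membership test bounds the chart column to finitely many dotted rules.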
Example

Book that flight

We should find an S from 0 to 3 that is a completed state.