basic parsing with context free grammars
Download
Skip this Video
Download Presentation
Basic Parsing with Context-Free Grammars

Loading in 2 Seconds...

play fullscreen
1 / 26

Basic Parsing with Context-Free Grammars - PowerPoint PPT Presentation


  • 122 Views
  • Uploaded on

Basic Parsing with Context-Free Grammars. Analyzing Linguistic Units. Morphological parsing: analyze words into morphemes and affixes rule-based, FSAs, FSTs Ngrams for Language Modeling POS Tagging Syntactic parsing: identify constituents and their relationships

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Basic Parsing with Context-Free Grammars' - manton


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
analyzing linguistic units
Analyzing Linguistic Units
  • Morphological parsing:
    • analyze words into morphemes and affixes
    • rule-based, FSAs, FSTs
  • Ngrams for Language Modeling
  • POS Tagging
  • Syntactic parsing:
    • identify constituents and their relationships
    • to see if a sentence is grammatical
    • to assign an abstract representation of meaning
syntactic parsing
Syntactic Parsing
  • Declarative formalisms like CFGs, FSAs define the legal strings of a language -- but only tell you ‘this is a legal string of the language X’
  • Parsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses
  • Parsing useful for grammar checking, semantic analysis, MT, QA, information extraction, speech recognition…almost every task in NLP…but…
parsing as a form of search
Parsing as a Form of Search
  • Searching FSAs
    • Finding the right path through the automaton
    • Search space defined by structure of FSA
  • Searching CFGs
    • Finding the right parse tree among all possible parse trees
    • Search space defined by the grammar
  • Constraints provided by the input sentence and the automaton or grammar
cfg for fragment of english
CFG for Fragment of English

LC’s

TopDBotUp

E.g.

rule expansion

S  NP VP

VP  V

S  Aux NP VP

PP -> Prep NP

S  VP (1)

N  book | flight | meal | money

NP  Det Nom (3)

V  book | include | prefer

NP PropN

Aux  does

Nom  N Nom

Prep from | to | on

Nom  N (4)

PropN  Houston | TWA

Nom  Nom PP

VP  V NP (2)

Rule Expansion

Det  that | this | a

LC’s

TopDBotUp

E.g.

top down parser
Top-Down Parser
  • Builds from the root S node to the leaves
  • Assuming we build all trees in parallel:
    • Find all trees with root S (or all rules w/lhs S)
    • Next expand all constituents in these trees/rules
    • Continue until leaves are pos
    • Candidate trees failing to match pos of input string are rejected (e.g. Book that flight matches only one subtree)
top down search space for cfg expanding only leftmost leaves
Top-Down Search Space for CFG (expanding only leftmost leaves)

S S S

NP VP Aux NP VP VP

S S S S S S

NP VP NP VP Aux NP VP Aux NP VP VP VP

Det Nom PropN Det Nom PropN V NP V

Det Nom

N

bottom up parsing
Bottom-Up Parsing
  • Parser begins with words of input and builds up trees, applying grammar rules whose rhs match
    • Book that flight

N Det N V Det N

Book that flight Book that flight

    • ‘Book’ ambiguous (2 pos appear in grammar)
    • Parse continues until an S root node reached or no further node expansion possible
two candidates one successful parse
Two Candidates: One Successful Parse

S

VP

VP NP NP

Nom Nom

V Det N V Det N

Book that flight Book that flight

S ~  VP NP

what s right wrong with
What’s right/wrong with….
  • Top-Down parsers – they never explore illegal parses (e.g. which can’t form an S) -- but waste time on trees that can never match the input
  • Bottom-Up parsers – they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)
  • For both: find a control strategy -- how explore search space efficiently?
    • Pursuing all parses in parallel or backtrack or …?
    • Which rule to apply next?
    • Which node to expand next?
a possible top down parsing strategy
A Possible Top-Down Parsing Strategy
  • Depth-first search:
    • Agenda of search states: expand search space incrementally, exploring most recently generated state (tree) each time
    • When you reach a state (tree) inconsistent with input, backtrack to most recent unexplored state (tree)
  • Which node to expand?
    • Leftmost or rightmost
  • Which grammar rule to use?
    • Order in the grammar? How?
top down depth first left right strategy
Top-Down, Depth-First, Left-Right Strategy
  • Initialize agenda with ‘S’ tree and ptr to first word (cur)
  • Loop: Until successful parse or empty agenda
    • Apply next applicable grammar rule to leftmost unexpanded node (n) of current tree (t) on agenda and push resulting tree (t’) onto agenda
      • If n is a POS category and matches the POS of cur, push new tree (t’’) onto agenda and increment cur
      • Else pop t’ from agenda
    • Final agenda contains history of successful parse
  • Does this flight include a meal?
left corners top down parsing with bottom up filtering
Left Corners: Top-Down Parsing with Bottom-Up Filtering
  • We saw: Top-Down, depth-first, L2R parsing
    • Expands non-terminals along the tree’s left edge down to leftmost leaf of tree
    • Moves on to expand down to next leftmost leaf…
    • Note: In successful parse, current input word will be first word in derivation of node the parser currently processing
    • So….look ahead to left-corner of the tree
      • B is a left-corner of A if A =*=> Bα
      • Build table with left-corners of all non-terminals in grammar and consult before applying rule
left recursion vs right recursion
Left Recursion vs. Right Recursion
  • Depth-first search will never terminate if grammar is left recursive (e.g. NP --> NP PP)
slide20
Solutions:
    • Rewrite the grammar (automatically?) to a weakly equivalent one which is not left-recursive

e.g. The man {on the hill with the telescope…}

NP  NP PP (wanted: Nom plus a sequence of PPs)

NP  Nom PP

NP  Nom

Nom  Det N

…becomes…

NP  Nom NP’

Nom  Det N

NP’  PP NP’ (wanted: a sequence of PPs)

NP’  e

      • Not so obvious what these rules mean…
slide21
Harder to detect and eliminate non-immediate left recursion
      • NP --> Nom PP
      • Nom --> NP
  • Fix depth of search explicitly
  • Rule ordering: non-recursive rules first
    • NP --> Det Nom
    • NP --> NP PP
an exercise the city hall parking lot in town
An Exercise: The city hall parking lot in town
  • NP NP NP PP
  • NP  Det Nom
  • NP  Adj Nom
  • NP  Nom Nom
  • Nom  NP Nom
  • Nom  N
  • PP  Prep NP
  • N  city | hall | lot | town
  • Adj  parking
  • Prep  to | for | in
another problem structural ambiguity
Another Problem: Structural ambiguity
  • Multiple legal structures
    • Attachment (e.g. I saw a man on a hill with a telescope)
    • Coordination (e.g. younger cats and dogs)
    • NP bracketing (e.g. Spanish language teachers)
slide25
Solution?
    • Return all possible parses and disambiguate using “other methods”
summing up
Summing Up
  • Parsing is a search problem which may be implemented with many control strategies
    • Top-Down or Bottom-Up approaches each have problems
      • Combining the two solves some but not all issues
    • Left recursion
    • Syntactic ambiguity
  • Next time: Making use of statistical information about syntactic constituents
    • Read Ch 12
ad