Syntax

  • The study of how words are ordered and grouped together

  • Key concept: constituent = a sequence of words that acts as a unit


Phrase Structure

Example: parse of "She saw a tall man with a telescope"

    (S (NP (PN She))
       (VP (VBD saw)
           (NP a tall man)
           (PP (PRP with)
               (NP a telescope))))
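The same bracketed structure can be built and displayed programmatically; a minimal sketch using NLTK's Tree class (the tree itself is the one from the slide):

    # Build the phrase-structure tree for "She saw a tall man with a telescope"
    # and render it as ASCII art (requires the nltk package).
    from nltk import Tree

    tree = Tree.fromstring(
        "(S (NP (PN She))"
        "   (VP (VBD saw)"
        "       (NP a tall man)"
        "       (PP (PRP with) (NP a telescope))))"
    )
    tree.pretty_print()    # draws the tree
    print(tree.leaves())   # ['She', 'saw', 'a', 'tall', 'man', 'with', 'a', 'telescope']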


Noun Phrases

    [det That] [adj old] [adj green] [head couch] [PP of yours] [relative clause that I want to throw out]

  • Contains a noun plus descriptors, including:

    • Determiner: the, a, this, that

    • Adjective phrases: green, very tall

    • Head: the main noun in the phrase

    • Post-modifiers: prepositional phrases or relative clauses


Verb Phrases

  • Contains a verb (the head) with modifiers and other elements that depend on the verb

  • Examples:

    want to throw out

    [adv previously] [head saw] [direct object the man] [PP in the park with her telescope]

    [modal might] [aux have] [head showed] [indirect object his boss] [DObj the code] [adverb yesterday]


Adjective Phrases

  • Adjective as head with modifiers

    [adv extremely] [head sure] [relative clause that he would win]

Prepositional Phrases

  • Preposition as head and NP as complement

    [head with] [complement her grey poodle]


Shallow Parsing

  • Extract phrases from text as ‘chunks’

  • Flat, no tree structures

  • Usually based on patterns of POS tags

  • Full parsing can be conceived of as two steps:

    • Chunking / Shallow parsing

    • Attachment of chunks to each other


Noun Phrases

  • Base Noun Phrase: A noun phrase that does not contain other noun phrases as a component

  • Or, no modification to the right of the head

    a large green cow

    The United States Government

    every poor shop-owner’s dream ?

    other methods and techniques ?


Manual Methodology

  • Build a regular expression over POS tags

  • E.g.:

    DT? (ADJ | VBG)* (NN)+

  • Very hard to do accurately

  • Lots of manual labor

  • Cannot be easily tuned to a specific corpus
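A minimal sketch of such a pattern-based chunker, using NLTK's RegexpParser; the pattern mirrors the rule above, but the mapping to Penn Treebank tags (DT, JJ, VBG, NN...) is an assumption:

    # Regex-over-POS base-NP chunker (sketch). Requires the nltk package.
    import nltk

    grammar = "NP: {<DT>?<JJ|VBG>*<NN.*>+}"   # DT? (ADJ | VBG)* (NN)+
    chunker = nltk.RegexpParser(grammar)

    tagged = [("the", "DT"), ("tall", "JJ"), ("man", "NN"),
              ("ran", "VBD"), ("with", "IN"),
              ("blinding", "VBG"), ("speed", "NN")]
    print(chunker.parse(tagged))
    # (S (NP the/DT tall/JJ man/NN) ran/VBD with/IN (NP blinding/VBG speed/NN))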


Chunk Tags

  • Represent NPs by tags:

    [the tall man] ran with [blinding speed]

        the  tall  man  ran  with  blinding  speed
        DT   ADJ   NN1  VBD  PRP   VBG       NN0
        I    I     I    O    O     I         I

  • Need B tag for adjacent NPs:

    On [Tuesday] [the company] went bankrupt

        On  Tuesday  the  company  went  bankrupt
        O   I        B    I        O     O
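A small sketch of producing these tags from bracketed chunks (the helper name and span representation are just for illustration):

    # Convert bracketed NP spans into I/O/B chunk tags: tokens inside an NP
    # get I, tokens outside get O, and the first token of an NP that starts
    # immediately after another NP gets B.
    def chunks_to_iob(tokens, np_spans):
        """tokens: list of words; np_spans: list of (start, end) spans, end exclusive."""
        tags = ["O"] * len(tokens)
        prev_end = None
        for start, end in sorted(np_spans):
            for i in range(start, end):
                tags[i] = "I"
            if prev_end == start:          # adjacent to the previous NP
                tags[start] = "B"
            prev_end = end
        return tags

    tokens = ["On", "Tuesday", "the", "company", "went", "bankrupt"]
    print(chunks_to_iob(tokens, [(1, 2), (2, 4)]))
    # ['O', 'I', 'B', 'I', 'O', 'O']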


Transformational Learning

  • Baseline tagger:

    • Most frequent chunk tag for POS or word

  • Rule templates (100 total): patterns over nearby chunk tags (T), POS tags (P), and words (W) that trigger a change of the current chunk tag


Some Rules Learned

  • (T1=O, P0=JJ): I → O

  • (T-2=I, T-1=I, P0=DT): → B

  • (T-2=O, T-1=I, P-1=DT): → I

  • (T-1=I, P0=WDT): I → B

  • (T-1=I, P0=PRP): I → B

  • (T-1=I, W0=who): I → B

  • (T-1=I, P0=CC, P1=NN): O → I
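A sketch of how one such learned rule is applied to a tagged sentence (the encoding of the rule and the data layout are illustrative, not the original system):

    # Apply the learned rule (T-1=I, P0=WDT): I -> B
    # i.e. if the previous chunk tag is I and the current POS is WDT,
    # change the current chunk tag from I to B.  (Word conditions such as
    # W0=who would be tested the same way.)
    def apply_rule(chunk_tags, pos_tags):
        new_tags = list(chunk_tags)
        for i in range(1, len(chunk_tags)):
            if chunk_tags[i - 1] == "I" and pos_tags[i] == "WDT" and chunk_tags[i] == "I":
                new_tags[i] = "B"
        return new_tags

    # "the man which I saw" -- the baseline lumps everything into one NP
    pos  = ["DT", "NN1", "WDT", "PRP", "VBD"]
    tags = ["I",  "I",   "I",   "I",   "O"]
    print(apply_rule(tags, pos))   # ['I', 'I', 'B', 'I', 'O']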


Results

  • Precision = fraction of NPs predicted that are correct

  • Recall = fraction of actual NPs that are found
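A minimal sketch of computing both measures over predicted and gold NP spans (the span representation is illustrative):

    # Precision and recall for NP chunking, comparing predicted spans
    # against gold-standard spans; each span is a (start, end) pair.
    def precision_recall(predicted, gold):
        predicted, gold = set(predicted), set(gold)
        correct = len(predicted & gold)
        precision = correct / len(predicted) if predicted else 0.0
        recall = correct / len(gold) if gold else 0.0
        return precision, recall

    pred = [(0, 3), (5, 7)]
    gold = [(0, 3), (4, 7)]
    print(precision_recall(pred, gold))   # (0.5, 0.5)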


Memory-Based Learning

  • Match test data to previously seen data and classify based on the most similar previously seen instances

  • E.g., instances stored in memory:

        boy  saw  three
        she  saw  the
        boy  saw  the
        the  saw  was
        boy  ate  the


k-Nearest Neighbor (kNN)

  • Find k most similar training examples

  • Let them ‘vote’ on the correct class for the test example

    • Weight neighbors by distance from test

  • Main problem: defining ‘similar’

    • Shallow parsing – overlap of words and POS

    • Use feature weighting...
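A minimal kNN sketch with a weighted feature-overlap similarity; the training labels and weights below are made up for illustration (a fuller version would also weight the votes by similarity):

    # k-nearest-neighbor classification with weighted feature overlap.
    from collections import Counter

    def similarity(x, y, weights):
        # x, y: equal-length feature tuples; weights: one weight per feature
        return sum(w for xi, yi, w in zip(x, y, weights) if xi == yi)

    def knn_classify(test, training, weights, k=3):
        # training: list of (feature_tuple, label) pairs
        neighbors = sorted(training,
                           key=lambda ex: similarity(test, ex[0], weights),
                           reverse=True)[:k]
        return Counter(label for _, label in neighbors).most_common(1)[0][0]

    train = [(("boy", "saw", "three"), "O"),
             (("she", "saw", "the"),   "O"),
             (("boy", "saw", "the"),   "O"),
             (("the", "saw", "was"),   "I"),
             (("boy", "ate", "the"),   "O")]
    weights = [0.5, 2.0, 0.5]     # the middle (focus) feature weighted most
    print(knn_classify(("the", "saw", "the"), train, weights, k=3))   # 'O'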


Information Gain

  • Not all features are created equal (e.g. ‘saw’ in the previous example is more important)

  • Weight the features by information gain

    = how much the value of a feature f distinguishes between the different classes
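A minimal sketch of the computation, using IG(f) = H(C) - sum_v P(f=v) * H(C | f=v); the instances are the ones from the memory-based learning slide and the class labels are made up so that the middle (verb) feature comes out as the most informative:

    # Per-feature information gain over a small labelled training set.
    import math
    from collections import Counter, defaultdict

    def entropy(labels):
        total = len(labels)
        return -sum((c / total) * math.log2(c / total)
                    for c in Counter(labels).values())

    def information_gain(instances, labels, feature_index):
        by_value = defaultdict(list)
        for inst, label in zip(instances, labels):
            by_value[inst[feature_index]].append(label)
        remainder = sum(len(ls) / len(labels) * entropy(ls)
                        for ls in by_value.values())
        return entropy(labels) - remainder

    instances = [("boy", "saw", "three"), ("she", "saw", "the"),
                 ("boy", "saw", "the"),   ("the", "saw", "was"),
                 ("boy", "ate", "the")]
    labels = ["x", "x", "x", "x", "y"]     # illustrative classes
    for i in range(3):
        print(i, round(information_gain(instances, labels, i), 3))
    # 0 0.171   1 0.722   2 0.171  -> the middle feature carries the most information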


    [Figure: two feature-value distributions over classes C1–C4, one illustrating low information gain and one illustrating high information gain]


Base Verb Phrase

  • Verb phrase not including NPs or PPs

    [NP Pierre Vinken NP] , [NP 61 years NP] old ,
    [VP will soon be joining VP] [NP the board NP]
    as [NP a nonexecutive director NP] .


Results

  • Context: 2 words and POS tags on the left and 1 word and POS tag on the right


Efficiency of MBL

  • Finding the neighbors can be costly

  • Possibility:

    Build decision tree based on information gain of features to index data = approximate kNN

    [Figure: decision-tree index over the training memory: the root tests W0 (with branches such as ‘saw’, ‘boy’, ‘the’) and lower levels test W-1, P-1, and P-2]
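A rough sketch of this idea: order the features by decreasing information gain and index the memory as a tree, so classification follows a single path instead of scanning all instances (a simplification in the spirit of IGTree-style indexing; all names here are illustrative):

    # Index training instances by features in decreasing order of
    # information gain; lookup then approximates full kNN retrieval.
    def build_index(instances, labels, feature_order):
        index = {}
        for inst, label in zip(instances, labels):
            node = index
            for f in feature_order:
                node = node.setdefault(inst[f], {})
            node.setdefault("_labels", []).append(label)
        return index

    def lookup(index, inst, feature_order):
        node = index
        for f in feature_order:
            if inst[f] not in node:     # no exact path: stop early (approximation)
                break
            node = node[inst[f]]
        return node.get("_labels", [])

    instances = [("boy", "saw", "three"), ("she", "saw", "the"),
                 ("boy", "saw", "the"),   ("the", "saw", "was"),
                 ("boy", "ate", "the")]
    labels = ["O", "O", "O", "I", "O"]
    order = [1, 0, 2]                   # highest-information-gain feature first
    idx = build_index(instances, labels, order)
    print(lookup(idx, ("boy", "saw", "the"), order))   # ['O']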


MBSL

  • Memory-based technique relying on sequential nature of the data

    • Use “tiles” of phrases in memory to “cover” a new candidate (and context), and compute a tiling score

    Candidate (with context):  VBD PRP [[ DT ADJ NN1 ]] PRP NN1
                               (went to the white house for dinner)

    Tiles from memory covering it:

        PRP [NP DT
        PRP [NP DT ADJ
        [NP DT ADJ NN1
        ADJ NN1 NP]
        NN1 NP] PRP


Tile Evidence

  • Memory:

  • [NP DT NN1 NP] VBD [NP DT NN1 NN1 NP] [NP NN2 NP] .

  • [NP ADJ NN2 NP] AUX VBG PRP [NP DT ADJ NN1 NP] .

  • Some tiles:

        [NP DT          pos=3  neg=0
        [NP DT NN1      pos=2  neg=0
        DT NN1 NP]      pos=1  neg=1
        NN1 NP]         pos=3  neg=1
        NN1 NP] VBD     pos=1  neg=0

  • Score tile t by ft(t) = pos / (pos + neg)

    • Only keep tiles that pass a threshold: ft(t) > θ
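A minimal sketch of this scoring and filtering step (the counts are the ones above; how pos and neg are collected from memory is only summarized in the comments):

    # Score each tile by f_t(t) = pos / (pos + neg) and keep those above a
    # threshold theta.  pos = occurrences of the tile in memory with this
    # bracketing; neg = occurrences of the same POS sequence bracketed otherwise.
    def tile_score(pos_count, neg_count):
        total = pos_count + neg_count
        return pos_count / total if total else 0.0

    def keep_tiles(tile_counts, theta):
        # tile_counts: {tile: (pos, neg)}
        return {t: tile_score(p, n) for t, (p, n) in tile_counts.items()
                if tile_score(p, n) > theta}

    counts = {"[NP DT": (3, 0), "[NP DT NN1": (2, 0), "DT NN1 NP]": (1, 1),
              "NN1 NP]": (3, 1), "NN1 NP] VBD": (1, 0)}
    print(keep_tiles(counts, theta=0.6))
    # keeps every tile except 'DT NN1 NP]' (score 0.5)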


Covers

    Example: the candidate  VBD PRP [[ DT ADJ NN1 ]] PRP NN1  can be covered by the connecting tiles

        PRP [NP DT
        [NP DT ADJ
        NN1 NP] PRP

  • Tile t1 connects to t2 in a candidate if:

    • t2 starts after t1

    • there is no gap between them (may be overlap)

    • t2 ends after t1

  • A sequence of tiles covers a candidate if

    • each tile connects to the next

    • the tiles collectively match the entire candidate including brackets and maybe some context
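A small sketch of these two definitions, representing each matching tile by the (start, end) span of candidate positions it occupies (this representation is an assumption for illustration):

    # connects(): t2 starts after t1, ends after t1, and leaves no gap
    # (overlap is allowed).  is_cover(): consecutive tiles connect and the
    # whole candidate span, brackets included, is matched.
    def connects(t1, t2):
        return t2[0] > t1[0] and t2[1] > t1[1] and t2[0] <= t1[1]

    def is_cover(tiles, candidate_span):
        tiles = sorted(tiles)
        if any(not connects(a, b) for a, b in zip(tiles, tiles[1:])):
            return False
        lo, hi = candidate_span
        return tiles[0][0] <= lo and tiles[-1][1] >= hi

    # Candidate VBD PRP [[ DT ADJ NN1 ]] PRP NN1 occupies positions 0..8;
    # the bracketed part [[ ... ]] is the span (2, 7), end exclusive.
    tiles = [(1, 4), (2, 5), (5, 8)]   # PRP [[ DT | [[ DT ADJ | NN1 ]] PRP
    print(is_cover(tiles, (2, 7)))     # True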


Cover Graph

    [Figure: a directed graph with START and END nodes and one node per matching tile (PRP [NP DT, PRP [NP DT ADJ, [NP DT ADJ NN1, ADJ NN1 NP], NN1 NP] PRP); edges link connecting tiles, so each path from START to END corresponds to a cover of the candidate VBD PRP [[ DT ADJ NN1 ]] PRP NN1]


Measures of ‘Goodness’

  • Number of different covers

  • Size of smallest cover (fewest tiles)

  • Maximum context in any cover (left + right)

  • Maximum overlap of tiles in any cover

  • Grand total positive evidence divided by grand total positive+negative evidence

    Combine these measures by linear weighting


Scoring a Candidate

CandidateScore(candidate, T)

  • G CoverGraph(candidate, T)

  • Compute statistics by DFS on G

  • Compute candidate score as linear function of statistics

    Complexity (O(l) tiles in candidate of length l):

    • Creating the cover graph is O(l2)

    • DFS is O(V+E)=O(l2)


Full Algorithm

MBSL(sent, θC, T)

  • For each subsequence of sent, do:

    • Construct a candidate s by adding brackets [[ and ]] before and after the subsequence

    • fC(s) ← CandidateScore(s, T)

    • If fC(s) > θC, then add s to candidate-set

  • For each c in candidate-set, in decreasing order of fC(c), do:

    • Remove all candidates overlapping with c from candidate-set

  • Return candidate-set as target instances
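A compact sketch of this main loop; CandidateScore is passed in as a black box, and the toy scorer at the bottom is purely an assumption for the demo:

    # MBSL main loop: score every bracketed subsequence, keep those above
    # the threshold, then greedily select non-overlapping candidates in
    # decreasing order of score.
    def mbsl(sent, theta_c, candidate_score):
        # sent: list of POS tags; candidate_score: function of (start, end)
        candidates = []
        for start in range(len(sent)):
            for end in range(start + 1, len(sent) + 1):
                score = candidate_score(start, end)    # score of [[ sent[start:end] ]]
                if score > theta_c:
                    candidates.append((score, start, end))
        selected = []
        for score, start, end in sorted(candidates, reverse=True):
            if all(end <= s or start >= e for _, s, e in selected):
                selected.append((score, start, end))
        return [(s, e) for _, s, e in selected]

    tags = ["VBD", "PRP", "DT", "ADJ", "NN1", "PRP", "NN1"]
    def toy_score(s, e):                                # illustrative scorer only
        return 1.0 / (e - s) if tags[s] == "DT" and tags[e - 1] == "NN1" else 0.0
    print(mbsl(tags, 0.1, toy_score))   # [(2, 5)] -> the base NP 'DT ADJ NN1'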


