



HW7 Extracting Arguments for %

Ang Sun

[email protected]

March 25, 2012



Outline

  • File Format

  • Training

    • Generating Training Examples

    • Extracting Features

    • Training of MaxEnt Models

  • Decoding

  • Scoring



File Format

  • Statistics Canada said service-industry <ARG1> output </ARG1> in August <SUPPORT> rose </SUPPORT> 0.4 <PRED class="PARTITIVE-QUANT"> % </PRED> from July .
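The inline annotation can be turned into token indices with a small parser. A minimal sketch, assuming whitespace-separated tokens and the tag names shown above (the function and marker names are illustrative, not part of the assignment's code):

```python
import re

def parse_annotated(line):
    """Return (tokens, spans) where spans maps ARG1/SUPPORT/PRED to a token index."""
    # Rewrite close and open tags as standalone marker tokens.
    line = re.sub(r'</(ARG1|SUPPORT|PRED)>', r'[CLOSE-\1]', line)
    line = re.sub(r'<(ARG1|SUPPORT|PRED)\b[^>]*>', r'[OPEN-\1]', line)
    tokens, spans, role = [], {}, None
    for tok in line.split():
        if tok.startswith('[OPEN-'):
            role = tok[6:-1]
        elif tok.startswith('[CLOSE-'):
            role = None
        else:
            if role is not None:
                # For a multi-token span this keeps the last index; ARG1,
                # SUPPORT, and PRED are single tokens in this task.
                spans[role] = len(tokens)
            tokens.append(tok)
    return tokens, spans

line = ('Statistics Canada said service-industry <ARG1> output </ARG1> '
        'in August <SUPPORT> rose </SUPPORT> 0.4 '
        '<PRED class="PARTITIVE-QUANT"> % </PRED> from July .')
tokens, spans = parse_annotated(line)
```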


Generating Training Examples

  • Positive Example

    • Only one positive example per sentence

    • The token annotated as ARG1


  • Negative Examples

    • Two methods!

    • Method 1: consider any token that has one of the following POS tags

      • NN 1150

      • NNS 905

      • NNP 205

      • JJ 25

      • PRP 24

      • CD 21

      • DT 16

      • NNPS 13

      • VBG 2

      • FW 1

      • IN 1

      • RB 1

      • VBZ 1

      • WDT 1

      • WP 1

Too many negative examples!
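Method 1 can be sketched as a simple POS filter; the data structures below are assumptions for illustration, not the assignment's own code:

```python
# Every token whose POS tag is in the candidate set, other than the
# annotated ARG1 (and the PRED token itself), becomes a negative example.
CANDIDATE_POS = {'NN', 'NNS', 'NNP', 'JJ', 'PRP', 'CD', 'DT', 'NNPS',
                 'VBG', 'FW', 'IN', 'RB', 'VBZ', 'WDT', 'WP'}

def generate_examples(pos_tags, arg1_idx, pred_idx):
    """Return (positive index, list of negative candidate indices)."""
    negatives = [i for i, tag in enumerate(pos_tags)
                 if tag in CANDIDATE_POS and i not in (arg1_idx, pred_idx)]
    return arg1_idx, negatives

# Toy sentence: 'output' (index 3) is the ARG1, '%' (index 6) the PRED.
pos = ['NNPS', 'NNP', 'VBD', 'NN', 'VBD', 'CD', 'NN']
positive, negatives = generate_examples(pos, arg1_idx=3, pred_idx=6)
```

Even in this short toy sentence the filter yields three negatives for one positive, which is the imbalance the slide points out.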



    • Method 2: only consider head tokens
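Method 2 can be sketched with BIO chunk tags, taking the last token of each NP chunk as its head and letting only those heads be candidates. The last-token head rule is a common approximation, not necessarily the assignment's exact rule:

```python
def np_heads(bio_tags):
    """Indices of the final token of every NP chunk in a BIO tag sequence."""
    heads = []
    for i, tag in enumerate(bio_tags):
        if tag in ('B-NP', 'I-NP'):
            nxt = bio_tags[i + 1] if i + 1 < len(bio_tags) else 'O'
            if nxt != 'I-NP':  # the NP chunk ends at token i
                heads.append(i)
    return heads

bio = ['B-NP', 'I-NP', 'B-VP', 'B-NP', 'B-PP', 'B-NP', 'I-NP']
heads = np_heads(bio)
```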


Extracting Features

  • Extracting Features (illustrated on the example sentence; candidate token = output, PRED = %)

    f:candToken=output
    f:tokenBeforeCand=service-industry
    f:tokenAfterCand=in
    f:tokensBetweenCandPRED=in_August_rose_0.4
    f:numberOfTokensBetween=4
    f:existVerbBetweenCandPred=true
    f:existSUPPORTBetweenCandPred=true
    f:candTokenPOS=NN
    f:posBeforeCand=NN
    f:posAfterCand=IN
    f:possBetweenCandPRED=IN_NNP_VBD_CD
    f:BIOChunkChain=I-NP_B-PP_B-NP_B-VP_B-NP_I-NP
    f:chunkChain=NP_PP_NP_VP_NP
    f:candPredInSameNP=False
    f:candPredInSameVP=False
    f:candPredInSamePP=False
    f:shortestPathBetweenCandPred=NP_NP-SBJ_S_VP_NP-EXT
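The lexical features above can be computed directly from the token and POS sequences. A sketch, with feature names following the slides (the POS tags in the example are hand-assigned assumptions; the chunk and parse features would additionally require chunker or parser output):

```python
def extract_features(tokens, pos_tags, cand, pred, support=None):
    """Lexical features for one (candidate, PRED) pair."""
    lo, hi = min(cand, pred), max(cand, pred)
    between = tokens[lo + 1:hi]  # tokens strictly between candidate and PRED
    return {
        'candToken': tokens[cand],
        'tokenBeforeCand': tokens[cand - 1] if cand > 0 else 'BOS',
        'tokenAfterCand': tokens[cand + 1] if cand + 1 < len(tokens) else 'EOS',
        'tokensBetweenCandPRED': '_'.join(between),
        'numberOfTokensBetween': str(len(between)),
        'existVerbBetweenCandPred': str(any(
            pos_tags[i].startswith('VB') for i in range(lo + 1, hi))).lower(),
        'existSUPPORTBetweenCandPred': str(
            support is not None and lo < support < hi).lower(),
        'candTokenPOS': pos_tags[cand],
        'posBeforeCand': pos_tags[cand - 1] if cand > 0 else 'BOS',
        'posAfterCand': pos_tags[cand + 1] if cand + 1 < len(tokens) else 'EOS',
        'possBetweenCandPRED': '_'.join(pos_tags[lo + 1:hi]),
    }

tokens = ('Statistics Canada said service-industry output in August '
          'rose 0.4 % from July .').split()
pos = ['NNPS', 'NNP', 'VBD', 'NN', 'NN', 'IN', 'NNP',
       'VBD', 'CD', 'NN', 'IN', 'NNP', '.']
feats = extract_features(tokens, pos, cand=4, pred=9, support=7)
```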


Training of MaxEnt Models

  • Each training example is one line

    • candToken=output . . . . . class=Y

    • candToken=Canada . . . . . class=N

  • Put all examples in one file, the training file

  • Use the MaxEnt wrapper or the program you wrote in HW5 to train your relation extraction model
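The training-file format above (one example per line: space-separated feature=value pairs ending with the class label) can be serialized like this; the helper name is illustrative:

```python
def training_line(feats, label):
    """One training example as 'feature=value ... class=Y/N'."""
    pairs = ' '.join(f'{name}={value}' for name, value in sorted(feats.items()))
    return f'{pairs} class={label}'

pos_line = training_line({'candToken': 'output', 'candTokenPOS': 'NN'}, 'Y')
neg_line = training_line({'candToken': 'Canada', 'candTokenPOS': 'NNP'}, 'N')
```

Writing every example, positive and negative, into a single file then gives the training file the MaxEnt trainer consumes.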



Decoding

  • For each sentence

    • Generate testing examples as you did for training

      • One example per line of features (without the class=Y/N label)

    • Apply your trained model to each of the testing examples

    • Choose the example with the highest probability returned by your model as the ARG1

    • So there must be exactly one ARG1 for each sentence
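The decoding step above reduces to an argmax over the sentence's candidates. A sketch, where `toy_model` is a stand-in scoring function, not a trained MaxEnt model:

```python
def decode(candidates, prob_arg1):
    """Return the index of the candidate with the highest P(class=Y)."""
    return max(range(len(candidates)), key=lambda i: prob_arg1(candidates[i]))

def toy_model(feats):
    # Pretend the trained model has learned to prefer common-noun candidates.
    return 0.9 if feats.get('candTokenPOS') == 'NN' else 0.1

cands = [{'candToken': 'Canada', 'candTokenPOS': 'NNP'},
         {'candToken': 'output', 'candTokenPOS': 'NN'}]
best = decode(cands, toy_model)
```

Because `max` always returns one index, every sentence gets exactly one ARG1, as required.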



Scoring

  • You are required to tag exactly one ARG1 for each sentence

  • Your system will therefore be evaluated on accuracy

    • Accuracy = #correct_ARG1s / #sentences
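The metric is a one-liner; here gold and predicted ARG1s are represented as token indices, one per sentence (an assumed representation):

```python
def accuracy(gold_arg1s, predicted_arg1s):
    """Accuracy = #correct_ARG1s / #sentences."""
    correct = sum(g == p for g, p in zip(gold_arg1s, predicted_arg1s))
    return correct / len(gold_arg1s)

acc = accuracy([4, 2, 7, 1], [4, 2, 5, 1])  # 3 of 4 sentences correct
```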



Good Luck!

