Extracting LTAGs from Treebanks

Fei Xia, 04/26/07


Presentation Transcript


  1. Extracting LTAGs from Treebanks Fei Xia 04/26/07

  2. Q1: How does grammar extraction work?

  3. Two types of elementary tree in LTAG
     Initial tree: (S NP (VP (V draft) NP))
     Auxiliary tree: (VP (ADVP (ADV still)) VP*)
     • Arguments and adjuncts are in different types of elementary trees

  4. Adjoining operation (figure: an auxiliary tree with foot node Y* is adjoined at a node labeled Y)
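The adjoining operation can be illustrated with a minimal sketch. The nested-list tree encoding, the `adjoin` and `leaves` helpers, and the path convention are assumptions made for this example and do not come from the slides; the initial tree is shown with its NP slots already filled so that the result can be read off as a sentence.

```python
import copy

# A tree is a nested list: [label, child, child, ...].
# A child is either another list or a word (str). A foot node is ["Y*"].

def leaves(tree):
    """Collect the words at the frontier, left to right."""
    words = []
    for child in tree[1:]:
        if isinstance(child, list):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words

def adjoin(tree, path, aux):
    """Adjoin `aux` at the node reached by following the child indices in `path`."""
    tree, aux = copy.deepcopy(tree), copy.deepcopy(aux)
    parent, slot, site = None, None, tree
    for i in path:
        parent, slot, site = site, i + 1, site[i + 1]  # +1 skips the label
    if aux[0] != site[0]:
        raise ValueError("auxiliary root must match the adjunction site")
    foot = [site[0] + "*"]

    def plug(node):
        """Replace the foot node below `node` with the excised subtree."""
        for k in range(1, len(node)):
            if node[k] == foot:
                node[k] = site
                return True
            if isinstance(node[k], list) and plug(node[k]):
                return True
        return False

    if not plug(aux):
        raise ValueError("auxiliary tree has no foot node")
    if parent is None:  # adjunction at the root
        return aux
    parent[slot] = aux
    return tree

# Hypothetical example: the "draft" tree with both NP slots already filled.
initial = ["S", ["NP", "they"], ["VP", ["V", "draft"], ["NP", "policies"]]]
auxiliary = ["VP", ["ADVP", ["ADV", "still"]], ["VP*"]]
derived = adjoin(initial, [1], auxiliary)  # [1] = second child of S, the VP
print(" ".join(leaves(derived)))  # they still draft policies
```

The original VP subtree ends up under the foot node, so the adverb is added without touching the verb's argument structure.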

  5. They still draft policies

  6. The treebank tree

  7. Step 1: Distinguish head/argument/adjunct
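One way Step 1 might look in code is sketched below. The head table, the adjunct categories, and the choice of function tags are illustrative assumptions only, not the tables used in the talk; a real extractor would need much larger, language-specific tables.

```python
# Hypothetical head table: for each parent label, candidate head labels in priority order.
HEAD_TABLE = {
    "S": ["VP"],
    "VP": ["VBP", "VBZ", "VBD", "VB", "VP"],
    "NP": ["NNS", "NN", "PRP", "NP"],
    "ADVP": ["RB"],
}
# Assumption: these categories and Penn-Treebank-style function tags mark adjuncts.
ADJUNCT_CATEGORIES = {"ADVP"}
ADJUNCT_FUNCTION_TAGS = {"TMP", "LOC", "MNR", "ADV", "DIR"}

def base(label):
    """Strip function tags: 'NP-SBJ' -> 'NP'."""
    return label.split("-")[0]

def classify_children(parent, children):
    """Label each child of `parent` as 'head', 'argument', or 'adjunct'."""
    head = len(children) - 1  # fallback: rightmost child
    for candidate in HEAD_TABLE.get(base(parent), []):
        matches = [i for i, c in enumerate(children) if base(c) == candidate]
        if matches:
            head = matches[0]
            break
    roles = []
    for i, child in enumerate(children):
        function_tags = set(child.split("-")[1:])
        if i == head:
            roles.append("head")
        elif base(child) in ADJUNCT_CATEGORIES or function_tags & ADJUNCT_FUNCTION_TAGS:
            roles.append("adjunct")
        else:
            roles.append("argument")
    return roles

vp_roles = classify_children("VP", ["ADVP", "VBP", "NP"])
s_roles = classify_children("S", ["NP-SBJ", "VP"])
tmp_roles = classify_children("VP", ["VBD", "NP", "NP-TMP"])
print(vp_roles)  # ['adjunct', 'head', 'argument']
```

The third call shows why function tags matter: the two NPs have the same category, and only the `-TMP` tag separates the adjunct from the argument.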

  8. Step 2: Insert additional nodes
     Before: (S (NP (PRP they)) (VP (ADVP (RB still)) (VBP draft) (NP (NNS policies))))
     After: (S (NP (PRP they)) (VP (ADVP (RB still)) (VP (VBP draft) (NP (NNS policies)))))
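A possible implementation of Step 2 is sketched here, assuming each node already carries the role assigned in Step 1. The `Node` class, the `insert_nodes` and `show` helpers, and the policy of peeling adjuncts off the left and right edges one level at a time are assumptions for this sketch, not details given in the slides.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    role: str = "head"  # role relative to the parent: head, argument, or adjunct
    children: list = field(default_factory=list)
    word: str = ""

def show(node):
    """Render a tree in bracket notation."""
    if node.word:
        return f"({node.label} {node.word})"
    return f"({node.label} " + " ".join(show(k) for k in node.children) + ")"

def insert_nodes(node):
    """Give every adjunct its own level by inserting a copy of the parent label."""
    kids = [insert_nodes(k) for k in node.children]

    def nest(kids):
        # Peel one adjunct off the left or right edge; the rest moves under a new node.
        if len(kids) > 1 and kids[0].role == "adjunct":
            return [kids[0], Node(node.label, "head", nest(kids[1:]))]
        if len(kids) > 1 and kids[-1].role == "adjunct":
            return [Node(node.label, "head", nest(kids[:-1])), kids[-1]]
        return kids

    return Node(node.label, node.role, nest(kids), node.word)

treebank_tree = Node("S", children=[
    Node("NP", "argument", [Node("PRP", word="they")]),
    Node("VP", "head", [
        Node("ADVP", "adjunct", [Node("RB", word="still")]),
        Node("VBP", "head", word="draft"),
        Node("NP", "argument", [Node("NNS", word="policies")]),
    ]),
])
bracketed = insert_nodes(treebank_tree)
print(show(bracketed))
```

After this step every adjunct has exactly one sibling, a node with the same label as their parent, which is the shape an auxiliary tree needs.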

  9. Step 3: Build elementary trees (figure: the tree from Step 2 split into elementary trees #1 to #4)

  10. Extracted grammar
      #1: (NP (PRP they))
      #2: (VP (ADVP (RB still)) VP*)
      #3: (S NP (VP (VBP draft) NP))
      #4: (NP (NNS policies))
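Step 3 can be sketched as a recursive walk along head paths. The `Node` class and the `extract` function are assumptions for this example; the input is assumed to have gone through Steps 1 and 2 already, so every adjunct has exactly one sibling with the parent's label. The order in which trees are returned here is an artifact of the traversal and is not meant to match the numbering on the slide.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    role: str = "head"  # role relative to the parent: head, argument, or adjunct
    children: list = field(default_factory=list)
    word: str = ""

def extract(root):
    """Split a role-annotated tree into elementary trees (bracketed strings)."""
    grammar = []

    def build(node):
        """Return the elementary tree rooted at `node`; collect the others in `grammar`."""
        if node.word:
            return f"({node.label} {node.word})"
        adjuncts = [k for k in node.children if k.role == "adjunct"]
        if adjuncts:
            # Assumed shape after Step 2: one adjunct plus one same-label head child.
            adjunct = adjuncts[0]
            head = next(k for k in node.children if k is not adjunct)
            foot = f"{node.label}*"
            parts = [build(adjunct), foot]
            if node.children[0] is not adjunct:
                parts.reverse()  # adjunct on the right, foot node on the left
            grammar.append(f"({node.label} " + " ".join(parts) + ")")
            return build(head)  # the adjunction level is factored out of this tree
        parts = []
        for k in node.children:
            if k.role == "argument":
                grammar.append(build(k))  # the argument anchors its own initial tree
                parts.append(k.label)     # and leaves a substitution node behind
            else:
                parts.append(build(k))
        return f"({node.label} " + " ".join(parts) + ")"

    grammar.append(build(root))
    return grammar

tree = Node("S", children=[
    Node("NP", "argument", [Node("PRP", word="they")]),
    Node("VP", "head", [
        Node("ADVP", "adjunct", [Node("RB", word="still")]),
        Node("VP", "head", [
            Node("VBP", "head", word="draft"),
            Node("NP", "argument", [Node("NNS", word="policies")]),
        ]),
    ]),
])
grammar = extract(tree)
for elementary_tree in grammar:
    print(elementary_tree)
```

For the example sentence this yields two initial NP trees, one auxiliary VP tree for "still", and one initial S tree anchored by "draft".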

  11. Q2: What info was missing in the source treebank?
      • Head/argument/adjunct distinction
        • Use function tags and heuristics
      • Raising verbs (e.g., seem, appear) vs. other verbs
        • He seems to be late
        • He wants to be late
        → Need a list of raising verbs in that language
      • Features, feature equations (e.g., agreement), …

  12. Q3: What methodological lessons can be drawn?
      • The algorithm for extracting LTAGs from treebanks is straightforward.
      • Some missing information can be “recovered” based on heuristics; other information cannot.
        → The extracted LTAGs are not as rich as the ones built by hand.
      • Nevertheless, the grammars have been shown to be useful for parsing, SuperTagging, etc.

  13. Q4: What are the advantages of a PS or DS treebank?
      • The original extraction algorithm assumes the input is a PS treebank.
      • But it can be easily extended if the input is a DS treebank.
        • Extract tree segments from DS
        • Run DS → PS algorithm on the segments to get elementary trees

  14. Q5: Building a treebank for a formalism or building a general treebank?
      • I prefer the latter because
        • A general treebank can be used for different formalisms.
        • Different grammars under the same formalism can be extracted.
        • Annotating a general treebank is often easier.
