
Arabic Syntactic Trees

Arabic Syntactic Trees: from Constituency to Dependency. Zdeněk Žabokrtský, Otakar Smrž. Center for Computational Linguistics, Faculty of Mathematics and Physics, Charles University in Prague.


Presentation Transcript


  1. Arabic Syntactic Trees: from Constituency to Dependency
     Zdeněk Žabokrtský, Otakar Smrž
     Center for Computational Linguistics, Faculty of Mathematics and Physics, Charles University in Prague

  2. Motivation & Background
     • Linguistic Data Consortium Arabic Treebank
         • Constituent-syntax bracketing, ~100k words published
         • Modification from English to Arabic
     • Prague Arabic Dependency Treebank
         • Dependency approach to syntax, ~50k words in progress
         • Pre-step to tectogrammatical description
     • Motivation: co-operation and resource exchange
     • Our goal: transform the data from one annotation scheme to the other

  3. Constituency X Dependency
     • Constituency (Linguistic Data Consortium, University of Pennsylvania)
         • Non-terminal nodes + text tokens
         • Constituent labeling on non-terminals
         • Slots and traces
     • Dependency (CCL & IFAL & ICL, Charles University in Prague)
         • Sentence root node + text tokens
         • Analytical function for every tree node
         • Government and roles

  4. Model Arabic Phrase I
     • Trace of the antecedent subject
     • Compound function of the head of the clause: outer and inner perspectives
     • Free word-order compliant

  5. Outline of the Transformation
     1. Build temporary dependency tree
         • Contraction of the input phrase-structure tree
         • Uniquely determined by the head selection function
         • Implementation: simple recursive procedure
     2. Create analytical tree topology
         • Post-processing (corrections) of the temporary dependency tree, e.g., replacing traces with their coindexed fillers
         • Re-arrangement of special complex constructs
     3. Assign analytical functions
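Step 1 of the outline can be sketched as a short recursive procedure. This is only a minimal illustration, not the authors' implementation: the `Node` class, the `contract` function, the use of integer token ids, and the toy tree are all assumptions made for the example. The head selection function is passed in as a parameter and returns the index of the head child; the default used here is simply "rightmost child".

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical phrase-structure node; terminals carry a token id."""
    label: str
    token_id: Optional[int] = None
    children: List["Node"] = field(default_factory=list)


def rightmost(children: List[Node]) -> int:
    """Placeholder head selection: index of the rightmost child."""
    return len(children) - 1


def contract(
    node: Node,
    select_head: Callable[[List[Node]], int] = rightmost,
    edges: Optional[List[Tuple[int, int]]] = None,
) -> Tuple[int, List[Tuple[int, int]]]:
    """Contract a phrase-structure tree into dependency edges.

    Returns the token id of the lexical head of `node` and the list of
    (dependent, governor) edges collected so far. Non-terminals are
    assumed to have at least one child.
    """
    if edges is None:
        edges = []
    if node.token_id is not None:          # terminal: it is its own head
        return node.token_id, edges
    # Recursively find the lexical head of every child constituent.
    heads = [contract(child, select_head, edges)[0] for child in node.children]
    h = select_head(node.children)
    # Heads of all non-head children become dependents of the head child's head.
    for i, dependent in enumerate(heads):
        if i != h:
            edges.append((dependent, heads[h]))
    return heads[h], edges


# Toy input: X( A(t0, t1), t2 ), with invented labels.
tree = Node("X", children=[
    Node("A", children=[Node("t", token_id=0), Node("t", token_id=1)]),
    Node("t", token_id=2),
])
root, edges = contract(tree)
print(root, edges)
```

Because every non-terminal contributes exactly one head choice, the resulting tree is fully determined by `select_head`, which matches the "uniquely determined by the head selection function" point above.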

  6. Head Selection Function
     • For each constituent, select the head constituent among its children
     • Based on (ordered) handcrafted rules
     • Examples:
         • If there is a node with tag=PREP among the children, then it is the head
         • If there is a node with phrase_label=VP among the children, then it is the head
         • ... etc ...
     • If nothing was selected by the rules, then the rightmost child is selected
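The ordered rules on this slide could be encoded as a list of predicates tried in sequence. The sketch below covers only the two example rules and the rightmost-child fallback quoted on the slide; the `Child` class, the `RULES` list, and the choice of the leftmost match when several children satisfy the same rule are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Child:
    """Hypothetical view of a child constituent: a POS tag or a phrase label."""
    tag: Optional[str] = None
    phrase_label: Optional[str] = None


# Ordered rules: earlier entries take priority over later ones.
RULES: List[Callable[[Child], bool]] = [
    lambda c: c.tag == "PREP",           # a preposition heads its phrase
    lambda c: c.phrase_label == "VP",    # otherwise a VP child is the head
]


def select_head(children: List[Child]) -> int:
    """Return the index of the head child according to the ordered rules."""
    for rule in RULES:
        for i, child in enumerate(children):
            if rule(child):
                return i                 # assumption: leftmost match wins
    return len(children) - 1             # fallback: rightmost child


print(select_head([Child(tag="PREP"), Child(phrase_label="NP")]))
print(select_head([Child(tag="NOUN"), Child(tag="ADJ")]))
```

Keeping the rules in a plain list makes their order explicit, so adding or re-prioritising a rule does not require touching the selection logic.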

  7. Analytical Function Assignment
     • Based on (ordered) handcrafted rules and lexical lists
     • Completes the process, does not override previous assignments
     • Examples:
         • phrase_label=NP-SBJ → afun=Sb
         • lemma=wa- → afun=Coord
         • pos_tag=CONJ → afun=AuxC
         • ... etc ...
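A rule table of this kind might look as follows. This is a hedged sketch restricted to the three example rules from the slide: representing nodes as dictionaries, the `AFUN_RULES` name, and the `assign_afuns` function are assumptions, and the lexical lists mentioned on the slide are not modelled.

```python
from typing import Callable, Dict, List, Optional, Tuple

NodeDict = Dict[str, Optional[str]]

# Ordered (predicate, afun) pairs; the first matching rule wins.
AFUN_RULES: List[Tuple[Callable[[NodeDict], bool], str]] = [
    (lambda n: n.get("phrase_label") == "NP-SBJ", "Sb"),
    (lambda n: n.get("lemma") == "wa-", "Coord"),
    (lambda n: n.get("pos_tag") == "CONJ", "AuxC"),
]


def assign_afuns(nodes: List[NodeDict]) -> List[NodeDict]:
    """Fill in missing analytical functions without overriding existing ones."""
    for node in nodes:
        if node.get("afun"):
            continue                      # keep any earlier assignment
        for predicate, afun in AFUN_RULES:
            if predicate(node):
                node["afun"] = afun
                break
    return nodes


# Invented toy nodes, for illustration only.
nodes = assign_afuns([
    {"phrase_label": "NP-SBJ"},
    {"lemma": "wa-", "pos_tag": "CONJ"},      # matches two rules, first wins
    {"pos_tag": "CONJ"},
    {"pos_tag": "CONJ", "afun": "Coord"},     # assigned earlier, left untouched
    {"pos_tag": "NOUN"},                      # no rule applies
])
print([n.get("afun") for n in nodes])
```

The second toy node shows why rule order matters: a conjunction with lemma wa- matches both the Coord and the AuxC rule, and only the earlier one is applied.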

  8. Model Arabic Phrase II
     • Sister-like co-ordination
     • Conjunction of co-ordination
     • Status constructus

  9. Model Arabic Phrase III
     • Non-expressed subject (?)
     • Complex modality constructs
     • Principal discrepancies between descriptions, both in topology and labeling

  10. Model Arabic Sentence
     • Wa lam yakun mina ’s-sahli `alay hi muwāğahatu kāmīrāti ’t-tilfizyūni wa `adasāti ’l-muşawwirīna wa huwa yaş`adu ’l-bāşa.
     • It was not easy for him to face the television cameras and the lenses of photographers as he was getting on the bus.

  11. Constituency Annotation

  12. Dependency Annotation

  13. Evaluation & Conclusion
     • Implementation still in progress, fine-tuning needed
     • 10,000 words manually annotated in both styles
     • ~60% of dependencies correctly aimed
     • 2nd Prague Penn Arabic Treebanking Workshop, May 2003 in Prague
     • Transfer from dependency to constituency?
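The slides do not define how the share of correctly aimed dependencies was computed. One plausible reading, assumed here, is the fraction of tokens whose automatically derived governor matches the manually annotated one. The function name and the toy head lists below are invented for the example and are not taken from the evaluation data.

```python
from typing import Sequence


def attachment_accuracy(gold_heads: Sequence[int],
                        predicted_heads: Sequence[int]) -> float:
    """Fraction of tokens whose predicted governor equals the gold governor."""
    if len(gold_heads) != len(predicted_heads):
        raise ValueError("both annotations must cover the same tokens")
    if not gold_heads:
        raise ValueError("cannot score an empty annotation")
    correct = sum(g == p for g, p in zip(gold_heads, predicted_heads))
    return correct / len(gold_heads)


# Toy data: governor index per token (0 = sentence root), made up for the example.
gold = [2, 0, 2, 3, 3]
predicted = [2, 0, 1, 3, 5]
print(attachment_accuracy(gold, predicted))
```

Under this reading the score ignores analytical functions entirely and measures only tree topology.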

  14. Related Work
     • New tool for assignment of analytical functions
     • Based on machine learning (C5-trained decision trees)
     • Error rate 17% (supposing the topology of the tree is correct)
     • First experiments with Arabic dependency parser
     • Incorporated into the process of annotation of Prague Arabic Dependency Treebank
