
Constraint based Dependency Telugu Parser


Presentation Transcript


  1. Constraint based Dependency Telugu Parser
     Guided by - Dr. Rajeev Sangal, Dr. Dipti Misra, Samar Hussain
     Team members - Phani Chaitanya, Ravi Kiran

  2. Overview
     • Motivation
     • A word about the language
     • Overview of constraint based parser
     • Analysis of special cases
       • Genitives
       • Copula
       • “ani” construction
       • Conjuncts
     • Future work

  3. Motivation
     • We set out to build a question answering system in Telugu, mainly for the medical and tourism domains, that could help native Telugu speakers (as a preliminary diagnosis tool and a travel guide). Such a system needs a parser, and none was available.

  4. A word about the language
     • Telugu is a South Asian language
     • Features
       • Morphologically rich
       • Free word order
       • Agglutinative
     • Challenges
       • No treebank
       • No parser
       • No wordnet

  5. Overview of constraint based parser
     Pipeline:
       Raw sentence
       -> POS tagging and chunking
       -> Identify source and demand groups
       -> Load frames (demand and transformation)
       -> Identify source groups satisfying demands and draw arcs
       -> Apply the 3 constraints and form equations for each demand
       -> Integer programming module (solves the equations)
       -> Final parse
     Raw sentence:
       Telugu  : rAmudu iMtiki vaccAka paMduni wiMtAdu
       Gloss   : Rama home after_coming apple eats
       English : Ram eats an apple after coming home

  6. Overview of constraint based parser (contd.): POS tagging and chunking
     1    ((       NP
     1.1  rAmudu   NN    <af=rAma,n,,,,0,,adj_vAdu,>
          ))
     2    ((       NP
     2.1  iMtiki   NN    <af=illu,n,,s,,0,,ki,>
          ))
     3    ((       VG
     3.1  vaccAka  VRB   <af=vaccu,v,,,any,0,,ina_Aka,>
          ))
     4    ((       NP
     4.1  paMdu    NN    <af=paMdu,n,,s,,0,,0,>|<af=paMdu,n,,s,,0,,obl,>
     4.2  ni       PREP  <af=ni,n,,s,,0,,0,>
          ))
     5    ((       VG
     5.1  wiMtAdu  VFM   <af=winu,v,,,3_p,0,,wA,>
     5.2  .        SYM
          ))
     ))
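The feature structures in the chunker output can be read mechanically. The following is a minimal sketch, not the authors' code: the function name `parse_af` and the field names are hypothetical, and the positions used (root first, category second, number fourth, person fifth, marker eighth) are an assumption read off the examples on this slide.

```python
import re

# Matches the body of one feature structure such as <af=illu,n,,s,,0,,ki,>.
# Stops at "/" so that extra attributes (drel, name) are not swallowed.
AF_PATTERN = re.compile(r"<\s*af=([^>/]*)")


def parse_af(feature: str) -> dict:
    """Split the first af attribute into named fields (names are assumed)."""
    match = AF_PATTERN.search(feature)
    if match is None:
        raise ValueError(f"no af attribute in {feature!r}")
    fields = match.group(1).split(",")
    fields += [""] * (8 - len(fields))  # pad short analyses
    return {
        "root": fields[0],
        "category": fields[1],
        "number": fields[3],
        "person": fields[4],
        "marker": fields[7],  # vibhakti for nouns, TAM for verbs
    }


print(parse_af("<af=illu,n,,s,,0,,ki,>"))
```

When a token carries several alternative analyses separated by `|`, this sketch only reads the first one.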

  7. Overview of constraint based parser (contd.): Identify source and demand groups
     1    ((       NP    Source
     1.1  rAmudu   NN    <af=rAma,n,,,,0,,adj_vAdu,>
          ))
     2    ((       NP    Source
     2.1  iMtiki   NN    <af=illu,n,,s,,0,,ki,>
          ))
     3    ((       VG    Demand
     3.1  vaccAka  VRB   <af=vaccu,v,,,any,0,,ina_Aka,>
          ))
     4    ((       NP    Source
     4.1  paMdu    NN    <af=paMdu,n,,s,,0,,0,>
     4.2  ni       PREP  <af=ni,n,,s,,0,,0,>
          ))
     5    ((       VG    Demand
     5.1  wiMtAdu  VFM   <af=winu,v,,,3_p,0,,wA,>
     5.2  .        SYM
          ))
     ))
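The labelling on this slide follows the chunk tag: noun chunks are sources and verb chunks are demands. A small sketch of that step, assuming the mapping is exactly NP to source and VG to demand (the names `ROLE_BY_TAG` and `label_groups` are hypothetical):

```python
# Assumed mapping from chunk tag to role, as read off the slide.
ROLE_BY_TAG = {"NP": "source", "VG": "demand"}


def label_groups(chunks):
    """Attach a role to every (head word, chunk tag) pair."""
    return [(word, tag, ROLE_BY_TAG.get(tag, "unknown")) for word, tag in chunks]


CHUNKS = [
    ("rAmudu", "NP"),
    ("iMtiki", "NP"),
    ("vaccAka", "VG"),
    ("paMdu ni", "NP"),
    ("wiMtAdu", "VG"),
]

for word, tag, role in label_groups(CHUNKS):
    print(f"{word:10} {tag:3} {role}")
```

Note that a verb group such as vaccAka can later also act as a source for another verb (the vmod arc), which this simple mapping does not capture.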

  8. Overview of constraint based parser (contd.): Load frames (demand and transformation)
     Frame for winu (eat; in basic form, so no transformation required)
     -------------------------------------------------------------------
     arc-label | necessity | vibhakti | lextype | posn | reln
     -------------------------------------------------------------------
     k1        | m         | 0        | n       | l    | c
     k2        | m         | ni       | n       | l    | c
     -------------------------------------------------------------------
     Frame for vaccu (come)
     -------------------------------------------------------------------
     arc-label | necessity | vibhakti | lextype | posn | reln
     -------------------------------------------------------------------
     k1        | m         | 0        | n       | l    | c
     k2        | m         | ki       | n       | l    | c
     -------------------------------------------------------------------
     Transformation chart [ina_Aka (after + -ing)]
     ----------------------------------------------------------------------------
     arc-label | necessity | vibhakti | lextype | posn | reln | op
     ----------------------------------------------------------------------------
     k1        | m         | 0        | n       | l    | c    | remove
     vmod      | m         | -        | v       | r    | p    | insert
     ----------------------------------------------------------------------------
     [Figure: dependency trees for the two verbs. winu[wA] (eat) with paMdu (fruit), rAmudu (Ram) and vaccu[ina_Aka] (after coming) below it; vaccu (come) with rAmudu and iMtiki (house) below it. Arc labels shown: k1, k2, vmod.]
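Applying a transformation chart to a base frame can be sketched as a list operation. This is an illustration under assumptions, not the authors' implementation: frames are modelled as lists of dictionaries, `apply_transformation` is a hypothetical name, and a `remove` entry is assumed to delete the row with the same arc label.

```python
def apply_transformation(frame, chart):
    """Return a new frame with the chart's remove/insert operations applied."""
    result = [dict(row) for row in frame]
    for entry in chart:
        row = {key: value for key, value in entry.items() if key != "op"}
        if entry["op"] == "remove":
            result = [r for r in result if r["label"] != row["label"]]
        elif entry["op"] == "insert":
            result.append(row)
        else:
            raise ValueError(f"unknown operation {entry['op']!r}")
    return result


def row(label, necessity, vibhakti, lextype, posn, reln, **extra):
    return dict(label=label, necessity=necessity, vibhakti=vibhakti,
                lextype=lextype, posn=posn, reln=reln, **extra)


# Base frame for vaccu (come) and the ina_Aka chart, as given on the slide.
VACCU = [row("k1", "m", "0", "n", "l", "c"),
         row("k2", "m", "ki", "n", "l", "c")]
INA_AKA = [row("k1", "m", "0", "n", "l", "c", op="remove"),
           row("vmod", "m", "-", "v", "r", "p", op="insert")]

for r in apply_transformation(VACCU, INA_AKA):
    print(r["label"], r["necessity"], r["vibhakti"], r["lextype"], r["posn"], r["reln"])
```

The result keeps k2 and gains vmod, so the transformed verb no longer demands its own k1.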

  9. Overview of constraint based parser (contd.): Identify source groups satisfying demands and draw arcs
     Frame for vaccAka (after transformation)
     -------------------------------------------------------------
     arc-label | necessity | vibhakti | lextype | posn | reln
     -------------------------------------------------------------
     k2        | m         | ki       | n       | l    | c
     vmod      | m         | -        | v       | r    | p
     -------------------------------------------------------------
     Frame for winu
     -------------------------------------------------------------
     arc-label | necessity | vibhakti | lextype | posn | reln
     -------------------------------------------------------------
     k1        | m         | 0        | n       | l    | c
     k2        | m         | ni       | n       | l    | c
     -------------------------------------------------------------
     Sentence : rAmudu iMtiki vaccAka paMduni wiMtAdu
     Arcs drawn over the sentence : X1:k1  X2:k2  X3:k2  X4:vmod
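Matching sources against frame rows can be sketched as below. Several things here are assumptions rather than facts from the slides: `posn` is read as the position of the other group relative to the demand (l = left, r = right), `reln` as whether the other group becomes the child (c) or the parent (p), and the chunk-level vibhakti values (0, ki, ni) are supplied by hand. `Chunk` and `candidate_arcs` are hypothetical names.

```python
from typing import NamedTuple


class Chunk(NamedTuple):
    pos: int        # chunk number in the sentence
    word: str
    lextype: str    # "n" or "v"
    vibhakti: str   # chunk-level marker (assumed, entered by hand)


def candidate_arcs(chunks, frames):
    """Return (child, label, parent) for every source that satisfies a frame row."""
    arcs = []
    for demand in chunks:
        for row in frames.get(demand.pos, []):
            for other in chunks:
                if other.pos == demand.pos or other.lextype != row["lextype"]:
                    continue
                if row["vibhakti"] != "-" and other.vibhakti != row["vibhakti"]:
                    continue
                if (row["posn"] == "l") != (other.pos < demand.pos):
                    continue
                if row["reln"] == "c":
                    arcs.append((other.pos, row["label"], demand.pos))
                else:
                    arcs.append((demand.pos, row["label"], other.pos))
    return arcs


CHUNKS = [Chunk(1, "rAmudu", "n", "0"), Chunk(2, "iMtiki", "n", "ki"),
          Chunk(3, "vaccAka", "v", "ina_Aka"), Chunk(4, "paMduni", "n", "ni"),
          Chunk(5, "wiMtAdu", "v", "wA")]
FRAMES = {
    3: [dict(label="k2", vibhakti="ki", lextype="n", posn="l", reln="c"),
        dict(label="vmod", vibhakti="-", lextype="v", posn="r", reln="p")],
    5: [dict(label="k1", vibhakti="0", lextype="n", posn="l", reln="c"),
        dict(label="k2", vibhakti="ni", lextype="n", posn="l", reln="c")],
}

for child, label, parent in sorted(candidate_arcs(CHUNKS, FRAMES)):
    print(f"{CHUNKS[child - 1].word} --{label}--> {CHUNKS[parent - 1].word}")
```

For this sentence the sketch yields four arcs, one per frame row, which is why the equations on the next slide are all of the form X = 1.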

  10. Overview of constraint based parser (contd.): Apply the 3 constraints and form equations for each demand
     C1 : For each of the mandatory karakas in a karaka chart for each demand group, there should be exactly one outgoing edge labeled by the karaka from the demand group.
     C2 : For each of the optional or desirable karakas in a karaka chart for each demand group, there should be at most one outgoing edge labeled by the karaka from the demand group.
     C3 : There should be exactly one incoming arc into each source group.
     Equations formed by applying the above constraints are:
     C1 : X1 = 1   X2 = 1   X3 = 1   X4 = 1
     C2 : No optional field found
     C3 : X1 = 1   X2 = 1   X3 = 1   X4 = 1
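The three constraints can be checked on a small example by enumerating all 0/1 assignments. This is a brute-force stand-in for the integer programming module, not the module itself, and it only scales to a handful of arcs. The `Arc` type, the `solve` function and the assignment of X2 to the iMtiki arc and X3 to the paMduni arc are assumptions made for illustration.

```python
from itertools import product
from typing import NamedTuple


class Arc(NamedTuple):
    child: int
    label: str
    parent: int
    owner: int  # demand group whose frame row licensed this arc


def solve(arcs, mandatory, optional, sources):
    """Return every 0/1 assignment to the arcs that satisfies C1, C2 and C3."""
    solutions = []
    for bits in product((0, 1), repeat=len(arcs)):
        chosen = [arc for arc, bit in zip(arcs, bits) if bit]

        def uses(frame_row):
            return sum(1 for a in chosen if (a.owner, a.label) == frame_row)

        c1 = all(uses(r) == 1 for r in mandatory)   # exactly one per mandatory karaka
        c2 = all(uses(r) <= 1 for r in optional)    # at most one per optional karaka
        c3 = all(sum(1 for a in chosen if a.child == s) == 1 for s in sources)
        if c1 and c2 and c3:
            solutions.append(bits)
    return solutions


# X1..X4 for "rAmudu iMtiki vaccAka paMduni wiMtAdu" (chunks numbered 1..5).
ARCS = [Arc(1, "k1", 5, 5), Arc(2, "k2", 3, 3),
        Arc(4, "k2", 5, 5), Arc(3, "vmod", 5, 3)]
MANDATORY = {(5, "k1"), (5, "k2"), (3, "k2"), (3, "vmod")}
SOURCES = {1, 2, 3, 4}

print(solve(ARCS, MANDATORY, set(), SOURCES))  # [(1, 1, 1, 1)]
```

With only one candidate per frame row the solution is forced; the enumeration becomes useful once several sources compete for the same karaka.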

  11. Overview of constraint based parser (contd.): Final parse
     1    ((       NP    <af=rAma,n,,,,0,,adj_vAdu,/drel=k1:5/name=1>
     1.1  rAmudu   NN    <af=rAma,n,,,,0,,adj_vAdu,>
          ))
     2    ((       NP    <af=illu,n,,s,,0,,ki,/drel=k2:3/name=2>
     2.1  iMtiki   NN    <af=illu,n,,s,,0,,ki,>
          ))
     3    ((       VG    <af=vaccu,v,,,any,0,,ina_Aka,/drel=vmod:5/name=3>
     3.1  vaccAka  VRB   <af=vaccu,v,,,any,0,,ina_Aka,>
          ))
     4    ((       NP    <af=paMdu,n,,s,,0,,0,/drel=k2:5/name=4>
     4.1  paMdu    NN    <af=paMdu,n,,s,,0,,0,>|<af=paMdu,n,,s,,0,,obl,>
     4.2  ni       PREP  <af=ni,n,,s,,0,,0,>
          ))
     5    ((       VG    <af=winu,v,,,3_p,0,,wA,/name=5>
     5.1  wiMtAdu  VFM   <af=winu,v,,,3_p,0,,wA,>
     5.2  .        SYM
          ))
     ))
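The dependency arcs can be read back out of the final parse from the `drel` and `name` attributes on the chunk lines. A minimal sketch, assuming the attribute syntax is exactly as shown on this slide; `read_dependencies` and the shortened `SAMPLE` string are hypothetical.

```python
import re

# A chunk line looks like: 2 (( NP <af=...,/drel = k2:3/name=2>
CHUNK_HEAD = re.compile(r"\(\(\s+(\w+)\s+<([^>]*)>")
NAME = re.compile(r"name\s*=\s*(\w+)")
DREL = re.compile(r"drel\s*=\s*(\w+):(\w+)")


def read_dependencies(ssf: str) -> dict:
    """Map chunk name -> (relation, parent name), or None for the root."""
    deps = {}
    for head in CHUNK_HEAD.finditer(ssf):
        features = head.group(2)
        name = NAME.search(features)
        if name is None:
            continue
        drel = DREL.search(features)
        deps[name.group(1)] = (drel.group(1), drel.group(2)) if drel else None
    return deps


SAMPLE = """
1 (( NP <af=rAma,n,,,,0,,adj_vAdu,/drel=k1:5/name=1>
1.1 rAmudu NN <af=rAma,n,,,,0,,adj_vAdu,>
))
2 (( NP <af=illu,n,,s,,0,,ki,/drel = k2:3/name=2>
))
3 (( VG <af=vaccu,v,,,any,0,,ina_Aka,/drel = vmod:5/name=3>
))
4 (( NP <af=paMdu,n,,s,,0,,0,/drel = k2:5/name=4>
))
5 (( VG <af=winu,v,,,3_p,0,,wA,/name = 5>
))
"""

for child, arc in read_dependencies(SAMPLE).items():
    print(child, arc)
```

The chunk without a `drel` attribute (here chunk 5, wiMtAdu) is the root of the tree.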

  12. Analysis of special cases
     • Genitives
     • Copula
     • “ani” construction
     • Conjuncts

  13. Genitives
     • The genitive is the case that marks a noun as the possessor of another noun (e.g. his, her, its, etc.)
     • Cases
       • Genitive marker exists
         • Telugu : rAmudi yoVkka puswakaM
         • Gloss  : ram 's book
         • When the marker is present the analysis is straightforward: the noun preceding “yoVkka” holds an r6 relation with the noun following “yoVkka”.
       • Genitive marker is dropped
         • Telugu : rAmudi puswakaM
         • Gloss  : ram book
         • Here the suffix “udi” in “rAmudi” signals the genitive.

  14. Genitives contd..
     • Exceptions in the case where the genitive marker can be dropped
       • Telugu : raGu puswakaM rAmudiki icCAdu
       • Gloss  : Raghu book to_Ram gave
       • English (sense 1): Raghu gave the book to Ram.
       • English (sense 2): Raghu's book is given to Ram.
       For nouns that do not take the masculine suffix (such as Raghu and Sita), Telugu has no marker for the genitive.
     • So we output all possible parses for this case. The parses include:
       Parse 1:
         icCAdu
           k1 -> raGu
           k2 -> puswakam
           k4 -> rAmudiki
       Parse 2:
         icCAdu
           k2 -> puswakam
             r6 -> raGu
           k4 -> rAmudiki

  15. Copula
     • Examples: is, are, were, etc.
     • The copula is generally dropped in Telugu. For example:
       • Telugu : rAmudu maMci bAludu
       • Gloss  : Ram good boy
       • Eng    : Ram is a good boy.
     • We handle these cases by introducing a “NULL_VG”.
     Frame for NULL_VG
     -------------------------------------------------------------
     arc-label | necessity | vibhakti | lextype | posn | reln
     -------------------------------------------------------------
     k1        | m         | 0        | n       | l    | c
     k1S       | m         | 0        | n       | l    | c
     -------------------------------------------------------------
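Introducing the NULL_VG can be sketched as a pre-processing step on the chunk sequence. The slides do not say when or where the null chunk is inserted, so two assumptions are made here: it is added only when the sentence has no verb group at all, and it is placed at the end because Telugu is typically verb-final. `add_null_vg` is a hypothetical name.

```python
def add_null_vg(chunks):
    """Append a NULL_VG demand group if the sentence has no verb group."""
    if any(tag == "VG" for tag, _ in chunks):
        return list(chunks)
    return list(chunks) + [("NULL_VG", "NULL")]


COPULA_SENTENCE = [("NP", "rAmudu"), ("NP", "maMci bAludu")]
print(add_null_vg(COPULA_SENTENCE))
```

Once the null chunk is present, it can be treated like any other demand group and matched against the NULL_VG frame.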

  16. “ani” construction
     • “ani” in Telugu is sometimes similar to “that” in English.
     • There are three different ways of using “ani”:
       • Used as a complementizer:
         • Telugu  : rAmudu paMdu wiMtAdu ani mohan ceVppAdu.
         • Gloss   : Ram fruit will_eat that Mohan said .
         • English : Mohan said that Ram will eat a fruit.
       • Used as a verb:
         • Telugu  : mohan rAmudu paMdu wiMtAdu ani vellipoyAdu.
         • English : Mohan left saying Ram eats an apple.
       • Used to state a reason:
         • Telugu  : mohan rAmudu paMdu winnAdani vellipoyAdu.
         • Gloss   : Mohan Ram fruit had_eaten went.
         • English : Mohan went because Ram had eaten the fruit.

  17. “ani” construction contd …
     So we created a demand frame for “ani”.
     Frame for ani
     -------------------------------------------------------------
     arc-label | necessity | vibhakti | lextype | posn | reln
     -------------------------------------------------------------
     Ccof      | m         | -        | v_fin   | l    | c
     Ccof      | m         | -        | v_fin   | r    | p
     -------------------------------------------------------------

  18. Conjuncts
     • In Telugu, conjuncts occur as suffixes (the TAM of the verb), as DheergAs, and as lexical items such as “inkA”, “anduke”, “mariyu”, “kAni”, “aiwe” and “anwe”.
     • Suffixes:
       • Here, applying the corresponding transformation chart of the verb is enough to handle the case.
         Telugu  : nenu iMtiki velwe nixrapowAnu.
         Gloss   : I home if go will_sleep .
         English : I will sleep if I go home.

  19. Conjuncts contd …
     • Lexical items: here we have a frame for each lexical entry, which does the corresponding job. In the case of “mariyu”:
     Frame 1 :
     -------------------------------------------------------------
     arc-label | necessity | vibhakti | lextype | posn | reln
     -------------------------------------------------------------
     Ccof      | m         | -        | v       | l    | c
     Ccof      | m         | -        | v       | r    | c
     -------------------------------------------------------------
     Frame 2 :
     -------------------------------------------------------------
     arc-label | necessity | vibhakti | lextype | posn | reln
     -------------------------------------------------------------
     Ccof      | m         | -        | n       | l    | c
     Ccof      | m         | -        | n       | r    | c
     -------------------------------------------------------------

  20. Conjuncts contd …
     • DheergAs:
       • Often the conjunction is expressed implicitly by lengthening the vowel at the end of a lexical item, with no explicit lexical entry such as “mariyu”.
       • Telugu  : rAmudU siwA iMtiki vellAru.
       • Gloss   : Ram (implicit conj) Sita home went .
       • English : Ram and Sita went home.
       • In such cases a NULL_CCP is introduced, which serves as an explicit conjunct lexical entry, and we have frames for the NULL_CCP similar to the ones on the previous slide.

  21. Future work !!
     • A thorough analysis of relative clauses.
     • Analysis and handling of NULL VERBS in the case of complex constructions.
     • And their implementation.
     • Verb and TAM classification.

  22. THANKS !!

  23. Any Queries ??
