Dravidian WordNet

S.Arulmozi

Dravidian University


Tamil Thesaurus

  • Preliminary work on lexical semantics.

  • Monumental work on Tamil Thesaurus.

    • Ontological classification of Tamil vocabulary

  • Rajendran, S. (2001). tamizhc coRkaLanjciyam (in Tamil). Tamil University Publication.


Domains in Tamil Thesaurus

  • Tamil vocabulary is classified into four major domains:

    • Entities

    • Abstracts

    • Events and

    • Relationals


Lexical Hierarchy of the Domain 'Construction'

  • parumaippeyarkaL 'concrete nouns'

    • aHRinaippeyarkaL 'irrational nouns'

      • uyirillaatavai 'non-living beings'

        • uruvaakkiya maRRum patananjceyta poruTkaL 'manufactured and processed items'

          • kaTTappaTTavai 'constructed'
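
The hierarchy on this slide can be read as a single hypernym chain, from the most general category down to the 'constructed' domain. A minimal sketch of that reading, assuming each level is a direct hyponym of the one before it; the list layout and the function names `hypernyms_of` and `render` are illustrative, not part of the Tamil Thesaurus itself:

```python
# Hypernym chain for the 'construction' domain, most general first.
# Terms and glosses come from the slide; the data layout is an assumption.
CONSTRUCTION_CHAIN = [
    ("parumaippeyarkaL", "concrete nouns"),
    ("aHRinaippeyarkaL", "irrational nouns"),
    ("uyirillaatavai", "non-living beings"),
    ("uruvaakkiya maRRum patananjceyta poruTkaL",
     "manufactured and processed items"),
    ("kaTTappaTTavai", "constructed"),
]


def hypernyms_of(term, chain=CONSTRUCTION_CHAIN):
    """Return every ancestor of `term`, nearest ancestor first."""
    names = [name for name, _gloss in chain]
    idx = names.index(term)  # raises ValueError for an unknown term
    return list(reversed(names[:idx]))


def render(chain=CONSTRUCTION_CHAIN):
    """Render the chain as an indented outline, one level per line."""
    return "\n".join(
        "  " * depth + f"{name} '{gloss}'"
        for depth, (name, gloss) in enumerate(chain)
    )


if __name__ == "__main__":
    print(render())
    print(hypernyms_of("uyirillaatavai"))
```

A real thesaurus branches at every level, so a tree or graph would replace the flat list; the list is enough to show the single path the slide depicts.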


Nouns

Relation: Example

  • Synonymy: viiTu 'house' – illam 'house'

  • Hyponymy-Hypernymy: paLLi 'school' – kalviccaalai 'educational institution'

  • Hypernymy-Hyponymy: kalluuri 'college' – aracukkalluuri 'govt college'

  • Holonymy-Meronymy: ndaaRkaali 'chair' – kaal 'leg'

  • Meronymy-Holonymy: cakkaram 'wheel' – vaNTi 'cart'

  • Related Verb: paTittal 'reading' – paTi 'read'

  • Coordinate terms: kooyil 'temple' – macuuti 'mosque'
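
The noun relations above come in inverse pairs (hypernym/hyponym, holonym/meronym), so storing one direction implies the other. A minimal sketch of such a store, using the slide's examples; the class `RelationStore`, the relation labels and the `INVERSE` table are hypothetical names chosen for illustration, not the project's actual data model:

```python
from collections import defaultdict

# Each relation maps to its inverse; symmetric relations map to themselves.
INVERSE = {
    "hypernym": "hyponym",
    "hyponym": "hypernym",
    "holonym": "meronym",
    "meronym": "holonym",
    "synonym": "synonym",
    "coordinate": "coordinate",
}


class RelationStore:
    """Typed word-to-word edges that keep inverse relations in sync."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, source, relation, target):
        # Read as: `source` has `relation` `target`.
        self._edges[(source, relation)].add(target)
        self._edges[(target, INVERSE[relation])].add(source)

    def related(self, word, relation):
        return sorted(self._edges.get((word, relation), set()))


def build_noun_examples():
    """Load the noun examples from the slide into a fresh store."""
    store = RelationStore()
    store.add("viiTu", "synonym", "illam")             # house / house
    store.add("paLLi", "hypernym", "kalviccaalai")     # school -> educational institution
    store.add("kalluuri", "hyponym", "aracukkalluuri") # college -> govt college
    store.add("ndaaRkaali", "meronym", "kaal")         # chair has part leg
    store.add("cakkaram", "holonym", "vaNTi")          # wheel is part of cart
    store.add("kooyil", "coordinate", "macuuti")       # temple / mosque
    return store


if __name__ == "__main__":
    store = build_noun_examples()
    print(store.related("kalviccaalai", "hyponym"))
    print(store.related("kaal", "holonym"))
```

Adding only `paLLi -> hypernym -> kalviccaalai` is enough for a lookup of the hyponyms of kalviccaalai to return paLLi, which is why a WordNet typically records each relation once and derives its inverse.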


Verbs

Relation: Example

  • Synonymy: paTi 'read' – payilu 'read'

  • Hypernymy: cuvai 'taste' – uNar

  • Troponymy: keeL 'ask' – kenjcu 'plead'

  • Nominal: paruku 'drink' – parukutal 'drinking'

  • Related Noun: kaNTupiTi 'discover' – kaNTupiTippu 'discovery'


Tamil WordNet

  • Objective: To build a WordNet for Tamil to enhance machine translation

  • Resources: Tamil Thesaurus, Technical Glossaries (Tamil University Publications), Princeton English WordNet

  • Funding Agency: Tamil Software Development Fund, Tamil Virtual University - 4 lacs

  • Time Frame: 18 months


Details

  • Software used

    • Front-end – Java

    • Back-end – MySQL database

  • Project Deliverables

    • 50k root words

    • Relationships coded

    • Stand-alone and web-based interface

    • Embedded morphological analyser
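
The slides do not show how words and relationships are stored in the MySQL back-end. As a rough sketch only, a relational layout could look like the following. It uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server, and every table name, column name and the `troponym_of` label is an assumption made for illustration:

```python
import sqlite3

# Hypothetical two-table layout: one row per word, one row per typed relation.
SCHEMA = """
CREATE TABLE word (
    id    INTEGER PRIMARY KEY,
    lemma TEXT NOT NULL,
    pos   TEXT NOT NULL CHECK (pos IN ('noun', 'verb', 'adjective', 'adverb')),
    gloss TEXT
);
CREATE TABLE relation (
    source_id INTEGER NOT NULL REFERENCES word(id),
    target_id INTEGER NOT NULL REFERENCES word(id),
    kind      TEXT NOT NULL,
    PRIMARY KEY (source_id, target_id, kind)
);
"""


def build_demo_db():
    """Create an in-memory database holding one verb pair from the slides."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO word (id, lemma, pos, gloss) VALUES (?, ?, ?, ?)",
        [(1, "keeL", "verb", "ask"), (2, "kenjcu", "verb", "plead")],
    )
    # kenjcu 'plead' is a troponym (a more specific manner) of keeL 'ask'.
    conn.execute(
        "INSERT INTO relation (source_id, target_id, kind) VALUES (?, ?, ?)",
        (2, 1, "troponym_of"),
    )
    conn.commit()
    return conn


def troponyms_of(conn, lemma):
    """Return the lemmas recorded as troponyms of `lemma`."""
    rows = conn.execute(
        """
        SELECT s.lemma
        FROM relation AS r
        JOIN word AS s ON s.id = r.source_id
        JOIN word AS t ON t.id = r.target_id
        WHERE r.kind = 'troponym_of' AND t.lemma = ?
        ORDER BY s.lemma
        """,
        (lemma,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(troponyms_of(build_demo_db(), "keeL"))
```

Keeping relations in their own table lets new relation types be added as data rather than as schema changes, which suits a resource whose relation inventory differs between nouns and verbs.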


Statistics

  • Total Words: 50497

  • Unique Senses: 41013

  • Nouns: 46710

  • Verbs: 2881

  • Adjectives: 416

  • Adverbs: 490
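
As a quick consistency check on these figures, the four part-of-speech counts can be summed and compared with the reported total. The snippet below is only an arithmetic sketch over the numbers on this slide; the variable and function names are illustrative:

```python
# Figures copied from the Statistics slide.
POS_COUNTS = {"Nouns": 46710, "Verbs": 2881, "Adjectives": 416, "Adverbs": 490}
TOTAL_WORDS = 50497
UNIQUE_SENSES = 41013


def pos_shares(counts=POS_COUNTS):
    """Return each part of speech as a percentage of all words (1 decimal)."""
    total = sum(counts.values())
    return {pos: round(100 * n / total, 1) for pos, n in counts.items()}


if __name__ == "__main__":
    assert sum(POS_COUNTS.values()) == TOTAL_WORDS
    for pos, share in pos_shares().items():
        print(f"{pos}: {share}%")
```

The counts add up to the stated 50497 words, and nouns account for the overwhelming majority of entries.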


Project Completed (2004)

http://www.nrcfosshelpline.in/code/wiki/TamilWordnet


Standalone version – Tamil WordNet (Snapshot)


Web-version – Tamil WordNet (Snapshot)


First Effort on Dravidian Languages

  • National Workshop on WordNet for Dravidian Languages

  • 2-3 June 2003

  • Organized by AU-KBC Research Centre, Chennai, Central Institute of Indian Languages, Mysore and Tamil University.

  • Hands-on experience on specified domain – construction

  • Report available on Global WordNet website


MHRD Project

  • Creation of Machine Translation tools and resources for English to Dravidian Languages: Pilot Study

  • Aim: to develop a Machine Translation (MT) system, and the linguistic resources it needs, for English to the Dravidian languages (Tamil, Malayalam, Telugu and Kannada).

  • This would facilitate the creation of rich educational content in Indian languages.

  • All tools and the translation system are to be based on machine learning methodologies, so that computer graduates and other non-linguists can immediately take part in the national mission on literacy by contributing additional tools for language translation.


Modules

  • Module 1: Machine Translation

    • Aims to develop teaching material corresponding to the tools developed, so that it can be delivered as part of the undergraduate computer science and engineering curriculum on data mining/machine learning.

    • This will ensure the critical mass of manpower required to sustain the translation effort needed for the national mission on education.

  • Module 2: Training

    • Aims to train 500 faculty members, selected from across the country, in machine translation methodologies using machine learning techniques.

  • Module 3: Dravidian WordNet

    • Aims to develop the Dravidian WordNet required for translation.


Total Budget

  • IIT Bombay – 15 lacs

  • Amrita University – 40 lacs

  • Tamil University – 15 lacs

  • University of Hyderabad – 15 lacs

  • Dravidian University – 15 lacs

  • Time Frame

    • 12 months

    • March 30, 2009 – March 29, 2010


Work done

  • Part of a one-year pilot project involving Tamil, Telugu, Malayalam and Kannada

  • Funding Agency: Ministry of HRD

  • Duration: 18 months (July 2009-Dec 2010)

  • Deliverable: 13k synsets

  • 7k synsets linked to IndoWordNet, available at http://www.cfilt.iitb.ac.in/wordnet/webhwn/wn.php


Statistics on Dravidian WordNet


Publications

  • 'Tamil WordNet', Proceedings of the Fifth Global WordNet Conference, IIT-Bombay, 31 Jan-4 Feb 2010 (S.Rajendran)

  • 'Building a WordNet for Dravidian Languages', Proceedings of the Fifth Global WordNet Conference, IIT-Bombay, 31 Jan-4 Feb 2010 (S.Rajendran, S.Gopakumar, V.Dhanalakshmi)

  • 'Representation of Kinship in WordNet', Proceedings of the 9th International Tamil Internet Conference, Coimbatore, 23-27 June 2010 (S.Arulmozi)

  • 'Polysemy in Tamil and other Indian Languages', Proceedings of the Fifth Global WordNet Conference, IIT-Bombay, 31 Jan-4 Feb 2010 (S.Arulmozi & Panchanan Mohanty)

  • 'Telugu WordNet', Proceedings of the Fifth Global WordNet Conference, IIT-Bombay, 31 Jan-4 Feb 2010 (S.Arulmozi)


First IndoWordNet Workshop

  • Amrita University

  • 11-14 June 2009

  • The necessity of developing linked WordNets for the different languages of India was stressed

    • With linked WordNets, challenges such as language divergence, lexical semantics, and embedding WordNet in MT and cross-lingual search applications can be addressed

  • Participation from groups: Hindi, Marathi, Sanskrit, Nepali, Assamese, Bodo, Manipuri, Konkani, Kashmiri, Tamil, Telugu, Malayalam, Kannada

  • Proposal on Indhradhanush


Dravidian WordNet

  • Present Project

  • Funded by DIT.


Links

  • Tamil WordNet – Open Source

    http://www.nrcfosshelpline.in/code/wiki/TamilWordnet

  • VerbNet (English)

    http://verbs.colorado.edu/~mpalmer/projects/verbnet.html

  • Princeton English WordNet

    http://wordnet.princeton.edu/

  • Global WordNet Association

    http://www.globalwordnet.org/

  • WordNets in the World

    http://www.globalwordnet.org/gwa/wordnet_table.htm

  • WordNet Bibliography

    http://lit.csci.unt.edu/~wordnet/

  • IndoWordNet

    http://www.cfilt.iitb.ac.in/wordnet/webhwn/wn.php


Thank you!

