
Standardization of Lexicon

Team Members: Jaya Saraswati, Gajanan K. Rane, Kunal K. Patel. The dictionary is the major source of information in the Enconversion and Deconversion process.



  1. Standardization of Lexicon. Team Members: Jaya Saraswati, Gajanan K. Rane, Kunal K. Patel

  2. INTRODUCTION:
     • The dictionary is the major source of information in the Enconversion and Deconversion process
     • The current Hindi dictionary contains about 80,000 common words, described by about 200 morphological, grammatical and semantic attributes

  3. FORMAT OF THE DICTIONARY:
     • General form: [HW]{} "UW(icl>restriction)" (attributes);
     • Example: [Am]{} "mango(icl>fruit)" (N,MALE,EDBL,OBJCT,INANI,Na);
     • HW is the HeadWord, UW is the Universal Word, and the attributes are the grammatical, morphological and semantic attributes
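
The entry format on this slide can be split into its parts with a short parser. This is only a minimal sketch: the names `Entry`, `ENTRY_RE` and `parse_entry` are hypothetical, and the regular expression is an assumption based solely on the two example lines shown here, not the project's actual tooling.

```python
import re
from typing import NamedTuple, Tuple


class Entry(NamedTuple):
    headword: str
    universal_word: str
    restriction: str
    attributes: Tuple[str, ...]


# Assumed shape: [HW]{} "UW(restriction)" (attr1,attr2,...);
# Straight and curly double quotes are both accepted.
ENTRY_RE = re.compile(
    r'\[(?P<hw>[^\]]+)\]\s*\{[^}]*\}\s*'
    r'["“”](?P<uw>[^"“”(]+)(?:\((?P<restriction>[^"“”]*)\))?["“”]\s*'
    r'\((?P<attrs>[^)]*)\)\s*;'
)


def parse_entry(line: str) -> Entry:
    """Parse one dictionary line into headword, UW, restriction and attributes."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a dictionary entry: {line!r}")
    attrs = tuple(a.strip() for a in match.group("attrs").split(",") if a.strip())
    return Entry(
        headword=match.group("hw"),
        universal_word=match.group("uw").strip(),
        restriction=match.group("restriction") or "",
        attributes=attrs,
    )


entry = parse_entry('[Am]{} "mango(icl>fruit)"(N,MALE,EDBL,OBJCT,INANI,Na);')
print(entry.headword, entry.universal_word, entry.restriction, entry.attributes)
```

Keeping the restriction separate from the Universal Word makes it easy to compare or replace restrictions without touching the headword or the attribute list.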

  4. THE NEED FOR STANDARDIZING THE DICTIONARIES:
     • The dictionary contains Universal Words, which represent concepts present in all languages
     • Currently, different dictionaries use different restrictions for the same concept
     • Currently, the semantic attributes also differ from one dictionary to another

  5. Continued… Example: "The boy is running"
     • English dictionary:
       [run]{} "run(icl>walk)" (V,VINT);
       [boy]{} "boy(icl>living thing)" (N,ANI,CONCRETE);
       UNL: agt(run(icl>walk), boy(icl>living thing))
     • Hindi dictionary:
       [xOdZ]{} "run(icl>act)" (V,VINT,Va,VOA-MOT);
       [ladZak]{} "boy(icl>person)" (N,MALE,ANIMT,MML,PRSN,NAA);
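
The mismatch in this example can be detected mechanically. The sketch below is an illustration only: it assumes each dictionary has already been reduced to a mapping from concept to restriction, and the function name `restriction_conflicts` is hypothetical. The data is taken from the two dictionaries on this slide.

```python
from typing import Dict


def restriction_conflicts(
    dictionaries: Dict[str, Dict[str, str]]
) -> Dict[str, Dict[str, str]]:
    """Return the concepts whose restriction differs between dictionaries."""
    by_concept: Dict[str, Dict[str, str]] = {}
    for language, entries in dictionaries.items():
        for concept, restriction in entries.items():
            by_concept.setdefault(concept, {})[language] = restriction
    # A concept conflicts when more than one distinct restriction is in use.
    return {
        concept: per_language
        for concept, per_language in by_concept.items()
        if len(set(per_language.values())) > 1
    }


dictionaries = {
    "English": {"run": "icl>walk", "boy": "icl>living thing"},
    "Hindi": {"run": "icl>act", "boy": "icl>person"},
}

for concept, per_language in restriction_conflicts(dictionaries).items():
    print(concept, per_language)
```

Both "run" and "boy" are reported here, which is the situation standardization is meant to eliminate.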

  6. KNOWLEDGE BASE TO BE USED FOR STANDARDIZING THE DICTIONARIES
     • UNU, Tokyo has provided a knowledge base, which is a hierarchy of concepts
     • We have created a set of semantic attributes and incorporated them into the knowledge base, e.g.: "glass": ARTFCT, OBJCT
     • Our task is to map each word of the dictionary to the concepts provided in the knowledge base

  7. CURRENT ACTIVITIES
     • The dictionary is divided into four parts: Nouns, Verbs, Adjectives and Adverbs
     • For standardizing the Noun part, a program has been created that lets the user quickly select a restriction for a dictionary entry
     • For each restriction selected, the semantic attributes corresponding to that restriction are automatically entered in the dictionary entry
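
The automatic attribute entry described on this slide might look roughly like the sketch below. Everything here is an assumption for illustration: the table `RESTRICTION_ATTRIBUTES` is hypothetical (the real restriction-to-attribute mapping lives in the knowledge base and is not shown in the slides), and `standardize_entry` is not the name of the actual program.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical mapping from a restriction to its semantic attributes.
# The real values come from the knowledge base and are not given in the slides.
RESTRICTION_ATTRIBUTES: Dict[str, Tuple[str, ...]] = {
    "icl>fruit": ("EDBL", "OBJCT", "INANI"),
    "icl>person": ("ANIMT", "PRSN"),
}


def standardize_entry(
    headword: str,
    universal_word: str,
    restriction: str,
    grammatical_attrs: Iterable[str],
    table: Dict[str, Tuple[str, ...]] = RESTRICTION_ATTRIBUTES,
) -> str:
    """Build a dictionary line, appending the attributes tied to the restriction."""
    attrs = list(grammatical_attrs)
    for attribute in table[restriction]:  # KeyError if the restriction is unknown
        if attribute not in attrs:
            attrs.append(attribute)
    joined = ",".join(attrs)
    return f'[{headword}]{{}} "{universal_word}({restriction})" ({joined});'


print(standardize_entry("Am", "mango", "icl>fruit", ["N", "MALE"]))
```

The user only chooses the restriction; the semantic attributes follow from the table, so two entries with the same restriction always end up with the same semantic attributes.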

  8. Continued…
     • Efforts are being made to standardize the Verb, Adjective and Adverb parts of the dictionary automatically
     • For the Adverb part, adverbs that end with "-ly" are given the restriction (icl>how), while those that do not end with "-ly" are given the restriction (icl>how(obj>thing))
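
The adverb rule on this slide is simple enough to sketch directly. The function names below are hypothetical, and the check is a plain suffix test on the English adverb, which is an assumption about how "ends with -ly" is applied.

```python
def adverb_restriction(adverb: str) -> str:
    """Pick the restriction for an adverb using the "-ly" suffix rule."""
    if adverb.lower().endswith("ly"):
        return "icl>how"
    return "icl>how(obj>thing)"


def adverb_universal_word(adverb: str) -> str:
    """Format the adverb together with its restriction as a Universal Word."""
    return f"{adverb}({adverb_restriction(adverb)})"


for word in ("quickly", "often"):
    print(adverb_universal_word(word))
```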

  9. FINAL GOAL
     • All the dictionaries should have uniform restrictions and semantic attributes for similar concepts
