Macedonian DELAS – first results

Macedonian DELAS – first results. Aleksandar Petrovski, Tetovo, Macedonia. Process of constructing: scanning a traditional dictionary, OCR, eliminating errors, defining grammatical categories, developing inflectional graphs, and assigning inflectional classes to lexical entries.

Presentation Transcript


  1. Macedonian DELAS – first results. Aleksandar Petrovski, Tetovo, Macedonia

  2. Process of constructing
     • Scanning traditional dictionary
     • OCR
     • Eliminating errors
     • Defining grammatical categories
     • Developing inflectional graphs
     • Assigning inflectional classes to lexical entries

  3. Word groups, DELAS entries

     Word group      Code    Number of entries
     Nouns           N       30,538
     Adjectives      ADJ     9,522
     Pronouns        PRO     28
     Verbs           V       17,978
     Adverbs         ADV     2,820
     Prepositions    PREP    62
     Conjunctions    CONJ    61
     Particles       PART    60
     Interjections   INT     135
     Numerals        NUM     92
     Total:                  61,296

  4. Nouns, grammatical categories

     ATT            VAL                  Example         Code
     Gender         Masculine            list            m
                    Feminine             devojka         f
                    Neuter               dete            n
     Number         Singular             list            s
                    Common plural        listovi         p
                    Count plural         lista           c
                    Collective plural    lisja, lisje    l
     Case           Nominative           list            0
                    Vocative             listu, liste    v
                    Other case forms     brata           a
     Definiteness   No                   list            0
                    Yes                  listot          x
                    Yes, close           listov          y
                    Yes, far             liston          z
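The noun codes above can be read as a small positional tag set. The sketch below decodes a four-character tag such as `ms0x` into the attribute values listed on the slide. The slot order (gender, number, case, definiteness), the tag `ms0x`, and the function name `decode_noun` are assumptions made for illustration; the slides do not show how the codes are combined in actual dictionary entries.

```python
# Code tables copied from the noun slide. The positional slot order
# (gender, number, case, definiteness) is an assumption for illustration.
NOUN_CODES = {
    "gender": {"m": "masculine", "f": "feminine", "n": "neuter"},
    "number": {
        "s": "singular",
        "p": "common plural",
        "c": "count plural",
        "l": "collective plural",
    },
    "case": {"0": "nominative", "v": "vocative", "a": "other case form"},
    "definiteness": {"0": "no", "x": "yes", "y": "yes, close", "z": "yes, far"},
}


def decode_noun(tag: str) -> dict:
    """Decode a hypothetical 4-character noun tag into named attribute values."""
    slots = list(NOUN_CODES)
    if len(tag) != len(slots):
        raise ValueError(f"expected {len(slots)} codes, got {len(tag)}: {tag!r}")
    decoded = {}
    for slot, code in zip(slots, tag):
        if code not in NOUN_CODES[slot]:
            raise ValueError(f"unknown {slot} code {code!r} in tag {tag!r}")
        decoded[slot] = NOUN_CODES[slot][code]
    return decoded


# "listot" (masculine, singular, nominative, definite) under the assumed slot order.
print(decode_noun("ms0x"))
```

Decoding by position matters here because the code `0` is used for both the nominative case and the indefinite form, so a single character is not enough to identify its category.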

  5. Adjectives, grammatical categories

     ATT            VAL            Example     Code
     Gender         Masculine      dobar       m
                    Feminine       dobra       f
                    Neuter         dobro       n
                    Not defined    dobri       g
     Number         Singular       dobar       s
                    Plural         dobri       p
     Degree         Positive       dobar       7
                    Comparative    podobar     8
                    Superlative    najdobar    9
     Definiteness   No             dobar       0
                    Yes            dobriot     x
                    Yes, close     dobriov     y
                    Yes, far       dobrion     z

  6. Pronouns, grammatical categories

     ATT       VAL                Example   Code
     Gender    Masculine          toj       m
               Feminine           taa       f
               Neuter             toa       n
               Not defined        jas       g
     Number    Singular           koj       s
               Plural             koi       p
               Not defined        shto      e
     Case      Nominative         nie       o
               Dative long        nam       d
               Accusative long    nas       a
               Dative short       ni        q
               Accusative short   ne        w
     Person    First              jas       1
               Second             ti        2
               Third              toj       3
               Not defined        sebe      4

  7. Adverbs, grammatical categories

     ATT            VAL            Example     Code
     Degree         Positive       malku       7
                    Comparative    pomalku     8
                    Superlative    najmalku    9
     Definiteness   No             mnogu       0
                    Yes            mnogute     x

  8. Present situation

     Word group      DELAS entries   DELAF entries   Inflect. factor   Classes
     Nouns           30,538          274,014         9.04              140
     Adjectives      9,522           151,709         15.93             21
     Pronouns        28              120             15                8
     Verbs           17,978
     Adverbs         2,820
     Prepositions    62              62              1                 1
     Conjunctions    61              61              1                 1
     Particles       60              60              1                 1
     Interjections   135             135             1                 1
     Numerals        92
     Total:          61,296          426,161                           173
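As a rough check on these figures, the sketch below recomputes the DELAF total and an inflectional factor per word group, assuming the factor is simply DELAF entries divided by DELAS entries (the slide does not define it). Under that assumption the adjective figure of 15.93 and the DELAF total of 426,161 are reproduced; the plain ratio for nouns and pronouns comes out differently from the values on the slide, so those may have been computed on another basis. The names `COUNTS` and `inflection_factor` are illustrative.

```python
# (DELAS entries, DELAF entries) per word group, taken from the slide.
# Groups with no DELAF count on the slide (verbs, adverbs, numerals) are left out.
COUNTS = {
    "Nouns": (30538, 274014),
    "Adjectives": (9522, 151709),
    "Pronouns": (28, 120),
    "Prepositions": (62, 62),
    "Conjunctions": (61, 61),
    "Particles": (60, 60),
    "Interjections": (135, 135),
}


def inflection_factor(delas: int, delaf: int) -> float:
    """Average number of inflected forms per lemma (assumed definition)."""
    return delaf / delas


# Sum of the DELAF column for the groups that have been inflected so far.
total_delaf = sum(delaf for _, delaf in COUNTS.values())

for group, (delas, delaf) in COUNTS.items():
    print(f"{group:<14}{delas:>8,}{delaf:>10,}{inflection_factor(delas, delaf):>8.2f}")
print(f"{'Total DELAF':<14}{total_delaf:>18,}")
```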

  9. Applying dictionaries - results
     • Different tokens: 5,368
     • Different simple forms: 5,340
     • Different simple words: 3,074
     • Different unknown tokens: 2,266
     • Simple words lexical entries: 3,356
     • Unknown simple words: 2,266
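These counts can be turned into a coverage figure. The sketch below assumes that "different simple words" are the distinct simple forms found in the dictionaries and that "unknown simple words" are the distinct forms not found; the slide does not spell out these definitions, and the name `coverage` is illustrative.

```python
# Counts taken from the results slide.
SIMPLE_FORMS = 5340   # different simple forms in the text
KNOWN = 3074          # different simple words (assumed: found in the dictionaries)
UNKNOWN = 2266        # unknown simple words (assumed: not found)


def coverage(known: int, total: int) -> float:
    """Share of distinct simple forms recognised by the dictionaries."""
    return known / total


# The two classes partition the distinct simple forms exactly.
assert KNOWN + UNKNOWN == SIMPLE_FORMS

print(f"recognised: {coverage(KNOWN, SIMPLE_FORMS):.1%}")
print(f"unknown:    {coverage(UNKNOWN, SIMPLE_FORMS):.1%}")
```

Under these assumptions a little over half of the distinct forms are recognised, which is consistent with verbs, adverbs, and numerals not yet having DELAF entries.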
