Paradigm based Morphological Analyzers

Dr. Radhika Mamidi

Morphological Analyzers

They are tools to automatically decompose a word into its root and affixes and give related features.

Example:

1st stage – identifying morphemes

ate: root = eat

suffix = ed

2nd stage – analyzing morphemes

ate: root = eat

tense = past
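
The two stages above can be sketched in a few lines of Python. This is only an illustration: the lookup tables `IRREGULAR_FORMS` and `SUFFIX_FEATURES` and both function names are hypothetical, and they hold just the one entry needed for "ate".

```python
# Hypothetical toy data: one irregular word form and one suffix entry.
IRREGULAR_FORMS = {"ate": ("eat", "ed")}
SUFFIX_FEATURES = {"ed": {"tense": "past"}}


def identify_morphemes(word):
    """Stage 1: split a word form into its root and suffix."""
    return IRREGULAR_FORMS[word]


def analyze_morphemes(root, suffix):
    """Stage 2: replace the suffix by the features it carries."""
    return {"root": root, **SUFFIX_FEATURES[suffix]}


root, suffix = identify_morphemes("ate")
print(root, suffix)                     # eat ed
print(analyze_morphemes(root, suffix))  # {'root': 'eat', 'tense': 'past'}
```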

Some Applications
  • Machine Translation
  • Speech Processing
Machine Translation
  • A POS tagger gives only the part of speech. More information is needed to translate a word correctly.
  • This includes the tense, aspect and mood of verbs, and the gender, number and person of nouns.
Example: [Eng Hindi translation]

ENGLISH: Shewent home.

HINDI: vaha ghar gayi.

ENGLISH: Hewent home.

HINDI: vaha ghar gayaa.

  • The gender of the pronoun is essential for the translation in Hindi.
  • The morph analyzer will give the gender information.
Example: [Hindi Eng translation]

In Hindi ‘vaha’ can have different senses – ‘he’, ‘she’ or ‘that’.

“vaha ghar gayaa”

If we were to translate this, the extra information on the verb would help us translate the sentence correctly as

“He went home”

  • The suffix ‘yaa’ indicates past tense as well as singular number and masculine gender.
  • The morph analyzer will give this information.
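
As a rough sketch of how this information could drive the choice of pronoun, the snippet below reads the gender off the verb ending. The table and function names are hypothetical. The features for ‘yaa’ are the ones stated above; the entry for ‘yi’ is an assumption inferred from the "She went home" example.

```python
# Features carried by the verb ending (romanized Hindi).
VERB_SUFFIX_FEATURES = {
    "yaa": {"tense": "past", "number": "singular", "gender": "masculine"},
    # Assumed entry, inferred from "vaha ghar gayi" = "She went home".
    "yi": {"tense": "past", "number": "singular", "gender": "feminine"},
}
PRONOUN_BY_GENDER = {"masculine": "He", "feminine": "She"}


def verb_features(verb):
    """Return the features of the first suffix that matches the verb."""
    for suffix, features in VERB_SUFFIX_FEATURES.items():
        if verb.endswith(suffix):
            return features
    return {}


def pronoun_for_vaha(sentence):
    """Pick the English pronoun for 'vaha' from the gender on the final verb.

    Returns None when the verb gives no gender information.
    """
    verb = sentence.split()[-1]
    gender = verb_features(verb).get("gender")
    return PRONOUN_BY_GENDER.get(gender)


print(pronoun_for_vaha("vaha ghar gayaa"))  # He
print(pronoun_for_vaha("vaha ghar gayi"))   # She
```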
Speech Processing
  • In text-to-speech tools too, a morph analyzer is essential, along with a part-of-speech tagger.
  • With extra information about the words, the tool performs better.
  • The intonation, pauses, stress etc. can come close to the way humans speak.
  • This additional information is given by morph analyzers.
Approaches
  • Paradigm based
  • Finite State based

We will discuss the first approach.

Requirements for building paradigm based Morph Analyzers
  • Knowledge of Lexeme and Word forms
  • Root and Affix dictionaries
  • Paradigm Table
  • Paradigm class
  • The lexemes are stored in the dictionaries and the word forms as paradigms.
Lexeme and Word form

APPLE: apple, apples

CHURCH: church, churches

BOY: boy, boys

WATCH: watch, watches

SPY: spy, spies

  • The word in upper case is called the LEXEME and the inflected forms are its WORD FORMS.
  • Lexemes are the headwords in a dictionary.
Lexeme and Word form

Another example:

played is a word form of the lexeme PLAY

plays is a word form of the lexeme PLAY(1)

plays is a word form of the lexeme PLAY(2)

where PLAY(1) is a verb and PLAY(2) is a noun.

PLAY(1) and PLAY(2) are two different lexemes.
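
One way to picture the relation between lexemes and word forms is an index from each word form back to the lexemes it can belong to. The sketch below is illustrative only: the names `LEXEMES`, `FORM_INDEX` and `lexemes_of` are made up, and the listed word forms are a small sample.

```python
from collections import defaultdict

# Each lexeme is identified by its headword and its category.
LEXEMES = {
    ("PLAY", "verb"): ["play", "plays", "played", "playing"],  # PLAY(1)
    ("PLAY", "noun"): ["play", "plays"],                       # PLAY(2)
    ("APPLE", "noun"): ["apple", "apples"],
}

# Invert the mapping: word form -> all lexemes that have this form.
FORM_INDEX = defaultdict(list)
for lexeme, forms in LEXEMES.items():
    for form in forms:
        FORM_INDEX[form].append(lexeme)


def lexemes_of(word_form):
    """Return every lexeme the given word form can belong to."""
    return FORM_INDEX.get(word_form, [])


print(lexemes_of("played"))  # [('PLAY', 'verb')]
print(lexemes_of("plays"))   # [('PLAY', 'verb'), ('PLAY', 'noun')]
```

A word form with more than one entry in the index, like "plays", is ambiguous between lexemes.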

Exercise 1

Give the lexeme of each of the following word forms:

ate

played

manufactured

glasses

players

bites

Exercise 2

“manufactured” can be a verb in the past tense or an adjective, so it belongs to two different lexemes: MANUFACTURE and MANUFACTURED.

Which of the following words belong to more than one lexeme?

ate

wanted

wrote

written

finished

Root and Affix dictionaries

The root dictionary contains a list of roots, i.e. the base forms to which affixation takes place.

Each root is usually stored with its part of speech.

The affix dictionary contains a list of all the affixes in a language.

The features of the affixes are stored here.

The features are stored as attribute-value pairs.

Example entries in a dictionary

Root dictionary

eat <root=‘eat’, category=‘verb’>

book <root=‘book’, category=‘verb’>

book <root=‘book’, category=‘noun’>

Affix dictionary

+s <tense = ‘present’>

+ed <tense = ‘past’>

+en <aspect = ‘perfective’>

+ing <aspect = ‘progressive’>
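
The entries above map naturally onto Python dictionaries of attribute-value pairs. The following is a minimal sketch, not a fixed format: the variable and function names are hypothetical. Since every affix listed here is a verbal affix, the sketch combines affixes with verb entries only, which is an assumption.

```python
# Root dictionary: a root may have several entries (one per category).
ROOT_DICTIONARY = {
    "eat": [{"root": "eat", "category": "verb"}],
    "book": [
        {"root": "book", "category": "verb"},
        {"root": "book", "category": "noun"},
    ],
}

# Affix dictionary: each affix carries its features.
AFFIX_DICTIONARY = {
    "s": {"tense": "present"},
    "ed": {"tense": "past"},
    "en": {"aspect": "perfective"},
    "ing": {"aspect": "progressive"},
}


def analyses(root, affix):
    """Merge the features of the affix into each verb entry of the root."""
    features = AFFIX_DICTIONARY[affix]
    return [
        {**entry, **features}
        for entry in ROOT_DICTIONARY.get(root, [])
        if entry["category"] == "verb"  # assumption: verbal affixes only
    ]


print(analyses("eat", "en"))
# [{'root': 'eat', 'category': 'verb', 'aspect': 'perfective'}]
```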

Paradigm table

A paradigm table represents the inflected forms of a particular word.

It includes the conjugation of verbs and declensions of nouns, adjectives, pronouns etc.

Example:

apple, apples

eat, eats, ate, eaten, eating

smart, smarter, smartest

Conjugation of English verbs
  • play plays played played playing
  • eat eats ate eaten eating
  • look looks looked looked looking
  • dance dances danced danced dancing
  • push pushes pushed pushed pushing
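
A paradigm table like the one above can be stored as a mapping from a slot to a word form. In the sketch below the slot labels in `VERB_SLOTS` are assumptions (the rows above list only the forms, not names for the columns), and `slots_of` is a hypothetical helper.

```python
# Assumed labels for the five columns of the verb table.
VERB_SLOTS = (
    "base",
    "present_3sg",
    "past",
    "past_participle",
    "present_participle",
)

PARADIGM_TABLES = {
    "play": dict(zip(VERB_SLOTS, ("play", "plays", "played", "played", "playing"))),
    "eat": dict(zip(VERB_SLOTS, ("eat", "eats", "ate", "eaten", "eating"))),
}


def slots_of(root, form):
    """Return every slot of the root's paradigm that is filled by the form."""
    return [slot for slot, f in PARADIGM_TABLES[root].items() if f == form]


print(slots_of("eat", "ate"))      # ['past']
print(slots_of("play", "played"))  # ['past', 'past_participle']
```

The second call shows why "played" appears twice in the row for "play": one word form fills two slots.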
Declension of English nouns
  • apple, apples
  • boy, boys
  • church, churches
  • watch, watches
  • spy, spies
Exercise 3
  • Give the paradigm table for 5 different nouns and 5 different verbs in English.
Paradigm Class
  • A paradigm class groups words together: it contains the prototypical root and all the other roots that fall in its class.
  • By the term ‘root’ we mean the base form or stem to which affixation takes place.
  • Words which decline or conjugate in exactly the same way fall into one class.
The English verbs ‘play’ and ‘look’ have the following paradigm:
  • play plays played played playing
  • look looks looked looked looking

So they belong to the same class.

But ‘push’ falls in another class, since it differs in its present tense form: it takes ‘-es’ and not ‘-s’. Its paradigm is as follows:

  • push pushes pushed pushed pushing
The English nouns ‘play’ and ‘boy’ have the following paradigm:
  • play plays
  • boy boys

So they belong to the same class.

But ‘spy’ falls in another class. Its paradigm is as follows:

  • spy spies
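
The idea that words inflecting "in exactly the same way" share a class can be checked mechanically. The sketch below is one possible approach, not a standard algorithm: it strips the longest common prefix of a paradigm and compares what is left. The function names are hypothetical.

```python
import os
from collections import defaultdict


def suffix_pattern(forms):
    """Strip the longest common prefix and return the remaining endings."""
    stem = os.path.commonprefix(list(forms))
    return tuple(form[len(stem):] for form in forms)


def group_by_paradigm(paradigms):
    """Group base forms whose paradigms share the same ending pattern."""
    classes = defaultdict(list)
    for forms in paradigms:
        classes[suffix_pattern(forms)].append(forms[0])
    return dict(classes)


verbs = [
    ("play", "plays", "played", "played", "playing"),
    ("look", "looks", "looked", "looked", "looking"),
    ("push", "pushes", "pushed", "pushed", "pushing"),
]
nouns = [("play", "plays"), ("boy", "boys"), ("spy", "spies")]

print(group_by_paradigm(verbs))
# {('', 's', 'ed', 'ed', 'ing'): ['play', 'look'],
#  ('', 'es', 'ed', 'ed', 'ing'): ['push']}
print(group_by_paradigm(nouns))
# {('', 's'): ['play', 'boy'], ('y', 'ies'): ['spy']}
```

For ‘spy’ the common prefix is only "sp", so the pattern records that "y" is replaced by "ies".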
A paradigm class is represented by one member of the class.

Representative   Category   Members
eat              V          eat
play             V          play, talk, walk, train
push             V          push, fish
play             N          play, boy, day
spy              N          spy, sky
church           N          church, watch
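
Putting the pieces together, a paradigm based analyzer can be sketched as follows. The class memberships are the ones listed above. The `RULES` table, which states for each class what to delete from the root and what to add, and the `number` feature for nouns are assumptions made for this illustration. The irregular class ‘eat’ and bare roots are left out to keep the sketch short.

```python
# (representative, category) -> member roots, as listed above.
PARADIGM_CLASSES = {
    ("play", "V"): ["play", "talk", "walk", "train"],
    ("push", "V"): ["push", "fish"],
    ("play", "N"): ["play", "boy", "day"],
    ("spy", "N"): ["spy", "sky"],
    ("church", "N"): ["church", "watch"],
}

# Assumed rules: (deleted from root, added to root, features of the form).
PAST = ("", "ed", {"tense": "past"})
PROG = ("", "ing", {"aspect": "progressive"})
RULES = {
    ("play", "V"): [("", "s", {"tense": "present"}), PAST, PROG],
    ("push", "V"): [("", "es", {"tense": "present"}), PAST, PROG],
    ("play", "N"): [("", "s", {"number": "plural"})],
    ("spy", "N"): [("y", "ies", {"number": "plural"})],
    ("church", "N"): [("", "es", {"number": "plural"})],
}


def analyze(word):
    """Return every (root, category, features) analysis of a word form."""
    results = []
    for (rep, category), members in PARADIGM_CLASSES.items():
        for deleted, added, features in RULES[(rep, category)]:
            if not word.endswith(added):
                continue
            # Undo the rule: remove what was added, restore what was deleted.
            root = word[: len(word) - len(added)] + deleted
            if root in members:
                results.append({"root": root, "category": category, **features})
    return results


print(analyze("spies"))  # [{'root': 'spy', 'category': 'N', 'number': 'plural'}]
print(analyze("plays"))
# [{'root': 'play', 'category': 'V', 'tense': 'present'},
#  {'root': 'play', 'category': 'N', 'number': 'plural'}]
```

Because "plays" matches a rule in both the verb class and the noun class of ‘play’, the analyzer returns two analyses, one for each lexeme.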

Exercise 4

Which of the following verbs belong to the same paradigm class?

mince ride walk speak

shake play dance take

Which of the following nouns belong to the same paradigm class?

girl house dish book

mouse beach flower pencil