Natural language processing

Natural Language Processing. Spring 2007 V. “Juggy” Jagannathan. Course Book. Foundations of Statistical Natural Language Processing. By Christopher Manning & Hinrich Schutze. Chapter 3. Linguistic Essentials January 22, 2007. Parts of Speech and Morphology.


Natural Language Processing

Spring 2007

V. “Juggy” Jagannathan


Course Book

Foundations of Statistical Natural Language Processing

By

Christopher Manning & Hinrich Schutze


Chapter 3

Linguistic Essentials

January 22, 2007


Parts of Speech and Morphology

  • Syntactic/Grammatical categories – Parts of Speech (POS)

    • Nouns – refer to people, animals, concepts and things

    • Verbs – express the action in a sentence

    • Adjectives – describe properties of nouns

      • Substitution test for adjectives

      • Ex: The {sad, intelligent, green, fat…} one is in the corner.


Word class/lexical categories

  • Open or lexical categories

    • Nouns, verbs and adjectives: categories with a large membership that continually grows as new words are added to the language

  • Closed or functional categories

    • Prepositions and determiners

      • Ex. Of, on, the, a

  • Words are listed in a “dictionary” referred to by linguists as the “lexicon”


Tags

  • Parts of speech tagging – traditional grammar distinguishes about 8 categories; the labels assigned to words are referred to as POS tags.

  • Corpus linguists use more fine-grained tag sets

  • Various corpora have been tagged extensively; the pioneering one is the Brown corpus.

    • Adjectives in the Brown corpus are referred to by the tag “JJ”
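
As a rough illustration of what tagging produces, the sketch below assigns Brown-style tags by dictionary lookup. The tiny lexicon, the `tag` function and the `UNK` placeholder are hypothetical, made up for this example; real taggers use context and statistics rather than a fixed word list.

```python
# Toy lexicon mapping words to Brown-style tags (illustrative, not a real resource).
TOY_LEXICON = {
    "the": "AT",     # article
    "green": "JJ",   # adjective
    "sad": "JJ",     # adjective
    "dog": "NN",     # singular noun
    "corner": "NN",  # singular noun
    "is": "BEZ",     # the verb form "is"
    "in": "IN",      # preposition
}


def tag(sentence):
    """Return (word, tag) pairs; unknown words get the placeholder 'UNK'."""
    return [(word, TOY_LEXICON.get(word.lower(), "UNK"))
            for word in sentence.split()]


print(tag("The green dog is in the corner"))
```

A lookup like this cannot handle words that belong to more than one category, which is one reason taggers look at the surrounding words.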


Morphological process

  • Source: http://www.sil.org/LINGUISTICS/GlossaryOfLinguisticTerms/WhatIsAMorphologicalProcess.htm

    • Definition: “A morphological process is a means of changing a stem to adjust its meaning to fit its syntactic and communicational context.”

  • Examples

    • Plural form (dog-s) derived from (dog)


Morphological processes

  • Major forms of morphological processes

    • Inflection

      • Systematic modification of a root (stem) form by means of prefixes and suffixes

      • Inflection does not significantly change the word class or meaning of the word, but it does change word features such as tense and plurality.

      • All of the inflectional forms of a word are grouped as manifestations of a single “lexeme”

    • Derivation

      • Can dramatically change the meaning of the derived word.

      • Ex: Adverb “widely” derived from adjective “wide”

      • Ex: suffix use – weak-en; soft-en; understand-able; accept-able; teach-er; lead-er;

    • Compounding

      • Merging of two or more words into a new word (concept)

      • Ex. Disk drive, tea kettle, college degree, down market, mad cow disease, overtake
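
The difference between inflectional and derivational suffixes can be sketched with a naive suffix stripper. The suffix table and the `analyze` function are assumptions made for this illustration; the approach over-segments many words (for example, it would split “water” into “wat” + “er”), so it is not a real morphological analyzer.

```python
# Hypothetical suffix table: (suffix, morphological process).
# Longer suffixes are listed first so they are tried first.
SUFFIXES = [
    ("able", "derivation"),  # accept-able
    ("en", "derivation"),    # weak-en
    ("er", "derivation"),    # teach-er
    ("ly", "derivation"),    # wide-ly
    ("s", "inflection"),     # dog-s
]


def analyze(word):
    """Split a word into (stem, suffix, process) using the toy suffix table."""
    for suffix, process in SUFFIXES:
        # Require at least two characters of stem to remain.
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[:-len(suffix)], suffix, process
    return word, "", "none"


for w in ["dogs", "weaken", "acceptable", "teacher", "widely", "dog"]:
    print(w, analyze(w))
```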


Nouns and Pronouns

  • Nouns – refer to people, animals and things

    • Dog, tree, person, hat, speech, idea, philosophy

    • Inflection is a process by which the stem of a word is modified to create a new form of the word

    • In English, the only form of noun inflection is the one indicating whether a noun is singular or plural

    • Ex. Dogs, trees, hats, speeches, persons

    • Irregular inflection examples: women

    • Other languages use inflection to convey “gender” (masculine, feminine, neuter) and “case” (nominative, genitive, dative, accusative).
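
English plural inflection, including an irregular form, can be sketched as a lookup plus a spelling rule. The `pluralize` function and its rules are a simplification assumed for this example; it ignores many cases (for instance nouns ending in “y”).

```python
# Irregular plurals must be listed explicitly (only one entry in this toy table).
IRREGULAR_PLURALS = {"woman": "women"}


def pluralize(noun):
    """Return the plural of a noun using a small rule set."""
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    # Sibilant endings take "-es" (speech -> speeches).
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    return noun + "s"


for n in ["dog", "tree", "hat", "speech", "person", "woman"]:
    print(n, "->", pluralize(n))
```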


Gender forms

  • Pronouns

    • Masculine (he), feminine (she), neuter (it)

  • Case relationship in English – the genitive case

    • Ex: the woman’s house; the students’ grievances

  • Possessive pronouns

    • Ex: my car

    • Second possessive form of pronoun: a friend of mine

  • Reflexive pronouns – ex. Herself, myself

    • Ex:

      • Mary saw herself in the mirror.

      • Mary saw her in the mirror.

    • Reflexive pronouns, also referred to as “anaphors”, must refer to something nearby in the text.


Brown tags

** Examples from: http://www.tameri.com/edit/doubles.html


Pronoun forms and Brown Tags


Words that accompany nouns: determiners and adjectives

  • Determiners – describe the particular reference of a noun

    • Articles – signal whether the noun refers to someone or something already known

    • “the” refers to someone or something we already know about

      • Ex. “the tree” refers to a known tree.

    • “a” or “an” introduces a new reference to something that has not appeared before and whose identity cannot be inferred from the context.


Determiners and adjectives

  • Demonstratives

    • “this” or “that”

  • Adjectives

    • Describe properties of nouns

    • ex: a red rose, this long journey, many intelligent children, a very trendy magazine.

    • This use of an adjective, directly modifying a noun, is referred to as attributive or adnominal.

    • Predicative form of adjective (appearing as the complement of a verb such as “be”)

      • Ex. The rose is red. The journey will be long.


Agreement

  • Agreement here refers to congruence in gender, case and number between the determiner, the adjective and the noun. In many languages, this can be quite complex.


Adjectives and Brown tags

  • Positive – the basic form of an adjective [JJ]

    • Ex. Rich, trendy, intelligent

  • Comparative [JJR]

    • Ex. Richer, trendier

  • Superlative [JJT]

    • Ex. Richest, trendiest

  • Semantically superlative adjectives [JJS]

    • Ex. Chief, main and top

  • Numbers – a subclass of adjectives

    • Cardinals [CD]

      • Ex. One, two, and 6,000,000

    • Ordinals [OD]

      • Ex. First, second, tenth

  • Periphrastic forms - forms made by using auxiliary words

    • Ex. More intelligent, most intelligent
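
The adjective tags above can be approximated from the surface form of a word. The `adjective_tag` function below is a hypothetical heuristic written for this illustration; it assumes the input is already known to be an adjective, and it mislabels words such as “honest” or “clever” whose endings only look like suffixes.

```python
# Semantically superlative adjectives have to be listed; they carry no suffix.
SEMANTIC_SUPERLATIVES = {"chief", "main", "top"}


def adjective_tag(word):
    """Guess the Brown adjective tag (JJ, JJR, JJT, JJS) from the word form."""
    w = word.lower()
    if w in SEMANTIC_SUPERLATIVES:
        return "JJS"
    if w.endswith("est"):
        return "JJT"  # morphological superlative
    if w.endswith("er"):
        return "JJR"  # comparative
    return "JJ"       # positive (basic) form


for a in ["rich", "richer", "richest", "chief", "trendier"]:
    print(a, adjective_tag(a))
```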


Brown tags for determiners, quantifiers

  • Determiners

    • Articles [AT]

    • Singular determiners [DT]

      • This, that

    • Plural determiners [DTS]

      • These, those

    • Determiners that can be either singular or plural [DTI]

      • Some, any

    • Double conjunction determiners [DTX]

      • Either, neither

  • Quantifiers

    • Words that express ideas like “all”, “many”, “some”

    • Pre-quantifier [ABN]

      • All, many

    • Nominal pronoun [PN]

      • One, something, anything, somebody

  • Interrogative pronouns

    • [WDT] – wh-determiner – what, which

    • [WP$] – possessive wh-pronoun: whose

    • [WPO] – objective wh-pronoun: whom, which, that

    • [WPS] – nominative wh-pronoun: who, which, that


Verbs


Phrase Structure

  • Noun phrases [NP]

    • Noun is the head of the noun phrase

  • Prepositional phrases [PP]

    • Headed by a preposition and contain an NP complement

  • Verb phrases [VP]

    • Headed by a verb

      • Ex. Getting to school on time was a struggle.

  • Adjective phrases [AP]

    • She is very sure of herself

    • He seemed a man who was quite certain to succeed.


Phrase Structure Grammars

  • Syntactic analysis helps us infer meaning: the following two sentences use the same words but mean completely different things

    • Mary gave Peter a book

    • Peter gave Mary a book

  • In some languages the order of the words matters much less – these are called free word order languages


Rewrite rules
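
Rewrite rules of a phrase structure grammar have the form “category → sequence of categories or words”. The sketch below uses a small hypothetical rule set (the category names and the rules are assumptions chosen to cover the “Mary gave Peter a book” example) and enumerates every sentence the rules generate.

```python
from itertools import product

# Each category rewrites to one of several right-hand sides.
# Symbols that do not appear as keys are treated as words (terminals).
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Mary"], ["Peter"], ["AT", "NN"]],
    "VP": [["VBD", "NP", "NP"]],
    "AT": [["a"]],
    "NN": [["book"]],
    "VBD": [["gave"]],
}


def expand(symbol):
    """Yield every word sequence that can be derived from the symbol."""
    if symbol not in RULES:  # terminal: a word
        yield [symbol]
        return
    for rhs in RULES[symbol]:
        # Combine every expansion of each right-hand-side symbol.
        for parts in product(*(list(expand(s)) for s in rhs)):
            yield [word for part in parts for word in part]


sentences = {" ".join(words) for words in expand("S")}
print(len(sentences))
print("Mary gave Peter a book" in sentences)
```

The rules also generate odd sentences such as “a book gave Mary Peter”: a grammar of this kind constrains structure, not meaning.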


Labeled bracketing
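
Labeled bracketing writes a parse tree on one line, with each constituent wrapped in brackets and prefixed by its category label. A minimal sketch, assuming a tree is stored as nested tuples of the form `(label, child, ...)`; this representation and the `bracket` function are choices made for the example.

```python
def bracket(tree):
    """Render a nested-tuple tree as a labeled bracketing string."""
    if isinstance(tree, str):  # a leaf is just a word
        return tree
    label, *children = tree
    return "[" + label + " " + " ".join(bracket(c) for c in children) + "]"


tree = ("S",
        ("NP", "Mary"),
        ("VP",
         ("V", "gave"),
         ("NP", "Peter"),
         ("NP", ("Det", "a"), ("N", "book"))))

print(bracket(tree))
# [S [NP Mary] [VP [V gave] [NP Peter] [NP [Det a] [N book]]]]
```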


Non-local and long-distance dependencies

  • Subject-verb agreement

    • The women who found the wallet were given a reward.

  • Long-distance relationship

    • Which book should Peter buy?

  • These dependencies impact statistical NLP approaches


Dependency: Arguments and adjuncts

  • Dependency

    • Concept of dependents

    • “Sue watched the man at the next table”

      • Sue and man are dependents of watched.

      • The PP “at the next table” is a dependent of man. It modifies man.

      • The two noun phrases “Sue” and “the man at the next table” can be viewed as “arguments” of the verb “watched”.

  • Semantic roles

    • Agent – the person or thing doing the action [typically the subject of an active sentence]

    • Patient – the person or thing that is being acted on [typically the object of an active sentence]


Active & Passive voice

  • Example

    • Children eat candy.

    • Candy is eaten by children


Adjuncts


Subcategorization Frame

The set of arguments that a verb can appear with is referred to as its subcategorization frame.
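
One way to picture subcategorization frames is as a table from verbs to the complement patterns they allow. The verbs, the frames and the `licenses` function below are hypothetical choices for this sketch, and only complements after the verb are modeled.

```python
# Toy table: each verb maps to the complement sequences it accepts.
SUBCAT_FRAMES = {
    "sleep": [[]],                         # intransitive: no complement
    "eat": [[], ["NP"]],                   # object is optional
    "give": [["NP", "NP"], ["NP", "PP"]],  # gave Peter a book / gave a book to Peter
}


def licenses(verb, complements):
    """Return True if the verb allows exactly this sequence of complements."""
    return list(complements) in SUBCAT_FRAMES.get(verb, [])


print(licenses("give", ["NP", "NP"]))
print(licenses("sleep", ["NP"]))
```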


Selectional restrictions or selectional preferences


X’ Theory

  • N’ – “N bar nodes”

  • http://en.wikipedia.org/wiki/X-bar_theory


Phrase Structure Ambiguity


Garden Paths

  • Parsing the following sentence

    • The horse raced past the barn fell.

    • A garden path is the phenomenon in which the parse built for “the horse raced past the barn” has to be abandoned to accommodate “fell”.


Ungrammatical constructs

  • Parsing may fail, or produce multiple parses, because of ungrammatical constructs

    • Slept children the

  • Some sentences may be grammatically correct but meaningless

    • Colorless green ideas sleep furiously.

    • The cat barked.
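
How a parser rejects an ungrammatical word order can be sketched with a small CYK recognizer. The three-word grammar below is an assumption made for this example (binary rules plus a lexicon, i.e. Chomsky normal form); it is not a grammar of English.

```python
# Binary rules: (left category, right category) -> parent categories.
BINARY_RULES = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}}
# Lexical rules: word -> categories.
LEXICAL_RULES = {"the": {"Det"}, "children": {"N"}, "slept": {"VP"}}


def recognizes(sentence):
    """Return True if the toy grammar derives the sentence (CYK algorithm)."""
    words = sentence.lower().split()
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the categories that span words[i..j] inclusive.
    chart = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        chart[i][i] = set(LEXICAL_RULES.get(word, ()))
    for span in range(2, n + 1):
        for start in range(n - span + 1):
            end = start + span - 1
            for split in range(start, end):
                for left in chart[start][split]:
                    for right in chart[split + 1][end]:
                        chart[start][end] |= BINARY_RULES.get((left, right), set())
    return "S" in chart[0][n - 1]


print(recognizes("The children slept"))   # True
print(recognizes("Slept children the"))   # False
```

A purely syntactic check like this would still accept a sentence that is grammatical but meaningless, provided its words and categories were in the grammar.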


Semantics and Pragmatics

Lexical semantics: study of the meanings of individual words. A second part of semantics studies how the meanings of individual words are combined into the meaning of sentences.

Hypernymy vs Hyponymy

animal is a hypernym of cat

cat is a hyponym of animal

Antonym – words with opposite meanings

Meronymy – part belonging to a whole

tire is a meronym of a car

Holonym – whole corresponding to a part

Synonyms – words with similar meanings

Homonyms – words that are spelled the same but have different meanings

bank – river bank; bank – a financial institution

Polysemy – when the different senses (meanings) of a word are related. Example: “branch” could mean part of a tree or a dependent part of an organization.

Ambiguity – lexical ambiguity refers to both homonymy and polysemy

Homophony – homonyms that are also pronounced the same. “bass”, for example, could mean a fish or a low-pitched sound; the two are pronounced differently, so it is NOT a homophone.
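
Hypernymy and hyponymy form a hierarchy that can be walked programmatically. The sketch below uses a tiny hand-made table; apart from the cat/animal pair, the entries and the function names are assumptions for this illustration.

```python
# Each word points to its immediate hypernym (toy data, no cycles).
HYPERNYM_OF = {"cat": "animal", "dog": "animal", "animal": "organism"}


def hypernyms(word):
    """Return all hypernyms of a word, nearest first."""
    chain = []
    while word in HYPERNYM_OF:
        word = HYPERNYM_OF[word]
        chain.append(word)
    return chain


def is_hyponym(word, candidate):
    """True if 'word' is a hyponym of 'candidate' (directly or indirectly)."""
    return candidate in hypernyms(word)


print(hypernyms("cat"))             # ['animal', 'organism']
print(is_hyponym("cat", "animal"))  # True
print(is_hyponym("animal", "cat"))  # False
```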


Compositionality

  • Once we have the meaning of individual words, we need to assemble them into the meaning of a whole sentence. This is not easy…

    • White paper, white hair, white skin, white wine

    • Only the paper is white!

    • These are examples of collocations

  • Idioms – individual word meaning does not predict the meaning of the whole

    • Kick the bucket

    • Carriage return


Scope and discourse analysis

  • Scope of quantifiers can be tricky

  • Discourse analysis requires resolution of “anaphoric relations”

  • Ex. Mary helped Peter get out of the cab. He thanked her.

  • Resolving the anaphoric relations here means correctly mapping “he” to Peter and “her” to Mary.
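
A very naive way to resolve the pronouns in the example is to link each pronoun to the most recently mentioned name of matching gender. The name and pronoun tables and the `resolve` function are hypothetical; real anaphora resolution needs far more than gender matching.

```python
# Toy knowledge assumed for the example sentence.
NAME_GENDER = {"Mary": "f", "Peter": "m"}
PRONOUN_GENDER = {"he": "m", "him": "m", "she": "f", "her": "f"}


def resolve(text):
    """Return (pronoun, antecedent) pairs using most-recent gender match."""
    last_seen = {}  # gender -> most recently mentioned name
    links = []
    for token in text.replace(".", " ").split():
        if token in NAME_GENDER:
            last_seen[NAME_GENDER[token]] = token
        elif token.lower() in PRONOUN_GENDER:
            gender = PRONOUN_GENDER[token.lower()]
            links.append((token, last_seen.get(gender)))
    return links


print(resolve("Mary helped Peter get out of the cab. He thanked her."))
# [('He', 'Peter'), ('her', 'Mary')]
```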


Other areas in linguistics

  • Phonetics – study of physical sounds of language – phenomena like consonants, vowels and intonations.

  • Phonology – structure of sound system in languages

  • Sociolinguistics – interactions of social organization and language

  • Historical linguistics – study of how language changes over time

  • Psycholinguistics – study of how language is perceived

  • Mathematical linguistics – use of mathematical modeling approaches in linguistics

