CS 385 Fall 2006, Chapter 14: Understanding Natural Language (omit 14.4)
The Problem
Language is fuzzy:
• I feel funny
• Fruit flies like bananas
• Is there water in the fridge?
Early history:
• dictionary translation, word by word
• "out of sight, out of mind" → "the person is blind and insane"
• did not address interrelations among words
• more to it: what you know beyond the simple meaning of a word
Doug Lenat's CYC project (1984), now Cyc Corporation:
• represent world knowledge via logic and frames
• 12 years, 35 million dollars, questionable results
• http://www.cyc.com/cyc/cycrandd/areasofrandd_dir/nlu
Levels of Analysis (big picture)
Prosody
• rhythm and intonation of language
Phonology
• the sounds that comprise language (phonemes)
• speech analysis: identify phonemes and conglomerate them into words
Morphology
• the components that make up words (ing, ed, ...)
Syntax
• rules for combining words into legal (syntactically correct) sentences
• used to parse a sentence
• the most successful level, because it is formalized
Semantics
• attaching meaning to words, phrases, and sentences
Pragmatics
• how is language usually used? "How are you?" → "fine"
World knowledge
• general background necessary to interpret text or conversation
• "My thesis draft is due tomorrow" makes you think of ...?
Today
Acceptance that a general conversationalist is unlikely
Scale back to interpretation in restricted applications:
• MS Word grammar and style checker
• others?
Audio to text, but little interpretation:
• cell phone speed dial
• United Airlines customer service
• medical transcription
• Acura TL recognizes 650 voice commands
Normal steps of linguistic analysis:
• parsing
• semantic interpretation
• expanded representation
Specifying a Grammar
Rewrite rules:
1. sentence ↔ np vp
2. np ↔ n
3. np ↔ art n
4. vp ↔ v
5. vp ↔ v np
Terminals:
6. art ↔ a
7. art ↔ the
8. n ↔ man
9. n ↔ dog
10. v ↔ likes
11. v ↔ bites
Legal sentence: a string of terminals that can be derived from these rules.
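As a concrete sketch (names and representation are my own, not from the slides), the rewrite rules above can be encoded as Python data, and — since this grammar derives only finitely many sentences — we can enumerate every legal sentence and test membership:

```python
# The slide's rewrite rules as a dict: non-terminal -> list of right-hand sides.
GRAMMAR = {
    "sentence": [["np", "vp"]],
    "np": [["n"], ["art", "n"]],
    "vp": [["v"], ["v", "np"]],
    "art": [["a"], ["the"]],
    "n": [["man"], ["dog"]],
    "v": [["likes"], ["bites"]],
}
TERMINALS = {"a", "the", "man", "dog", "likes", "bites"}

def expansions(symbol):
    """All terminal strings derivable from one grammar symbol."""
    if symbol in TERMINALS:
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # cross-product of the expansions of each rhs symbol
        partial = [[]]
        for sym in rhs:
            partial = [p + e for p in partial for e in expansions(sym)]
        results.extend(partial)
    return results

legal = {" ".join(s) for s in expansions("sentence")}
print("the man bites the dog" in legal)  # True  (derivable)
print("dog the bites" in legal)          # False (not derivable)
```

A "legal sentence" is then exactly a member of the `legal` set.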
Interpret it with a Semantic Net Construct a semantic net describing mammals: • mammals are covered with hair • tigers are a subclass with stripes that growls • Tony is a tiger • humans are a subclass of mammal that are frightened by tigers • Bob is a human
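A minimal sketch of that semantic net (node and slot names are illustrative assumptions) as frames with isa/instance links, plus inheritance lookup:

```python
# The mammal net from the slide: each node is a frame; properties are
# inherited up the instance_of / isa hierarchy.
semantic_net = {
    "mammal": {"isa": None, "covering": "hair"},
    "tiger":  {"isa": "mammal", "has": "stripes", "sound": "growl"},
    "human":  {"isa": "mammal", "frightened_by": "tiger"},
    "tony":   {"instance_of": "tiger"},
    "bob":    {"instance_of": "human"},
}

def inherited(node, prop):
    """Look a property up the instance/isa hierarchy (inheritance)."""
    while node is not None:
        frame = semantic_net[node]
        if prop in frame:
            return frame[prop]
        node = frame.get("instance_of") or frame.get("isa")
    return None

print(inherited("tony", "covering"))  # hair — inherited via tiger -> mammal
```

Tony is covered with hair even though neither the tony nor the tiger frame says so directly: the property flows down from mammal.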
Semantic Interpretation (conceptual graph)
Growling has an agent and an object (from the parse tree).
Expanded representation of the sentence meaning (from the semantic net): we know Tony is a tiger and Bob should be frightened, so do a join:
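A hedged sketch of that join (the triple representation and the binding names are my assumptions, not the book's notation): the parse yields a generic graph for growling, the knowledge base supplies referents, and the join specializes each generic concept to its known individual:

```python
# Generic graph from the parse: growl has an agent (some tiger) and an
# object (some person); None marks an unbound referent.
parse_graph = [("growl", "agent", ("tiger", None)),
               ("growl", "object", ("person", None))]
# From the semantic net: the tiger in question is Tony, the person is Bob.
kb_bindings = {"tiger": "tony", "person": "bob"}

def join(graph, bindings):
    """Specialize each generic concept to its known referent."""
    return [(rel, role, (ctype, bindings.get(ctype, ref)))
            for rel, role, (ctype, ref) in graph]

print(join(parse_graph, kb_bindings))
# [('growl', 'agent', ('tiger', 'tony')), ('growl', 'object', ('person', 'bob'))]
```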
Fig 14.2 Stages in producing an internal representation of a sentence.
Parsing
The man bites the dog
Top down: start at the sentence symbol and work down to a string of terminals:
sentence
→ np vp
→ art n vp
→ the n vp
→ the man vp
→ the man v np
→ the man bites np
→ the man bites art n
→ the man bites the n
→ the man bites the dog
Grammar:
1. sentence ↔ np vp
2. np ↔ n
3. np ↔ art n
4. vp ↔ v
5. vp ↔ v np
6. art ↔ a
7. art ↔ the
8. n ↔ man
9. n ↔ dog
10. v ↔ likes
11. v ↔ bites
Resulting parse tree for “The man bites the dog”
Problem: we needed to know where we are going
Goal driven: need to back up and retrace a lot
New approach: transition net parsers
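The goal-driven (top-down, backtracking) strategy can be sketched directly over the grammar: repeatedly rewrite the leftmost non-terminal, trying each rule in turn and backing up on failure. This is my own minimal recognizer, not the book's code:

```python
# The slide's grammar; a sentence is legal iff ["sentence"] can be
# rewritten to exactly the input word list.
GRAMMAR = {
    "sentence": [["np", "vp"]],
    "np": [["n"], ["art", "n"]],
    "vp": [["v"], ["v", "np"]],
    "art": [["a"], ["the"]],
    "n": [["man"], ["dog"]],
    "v": [["likes"], ["bites"]],
}
TERMINALS = {"a", "the", "man", "dog", "likes", "bites"}

def derive(symbols, words):
    """Can this symbol list rewrite to exactly this word list?"""
    if not symbols:
        return not words
    first, rest = symbols[0], symbols[1:]
    if first in TERMINALS:
        return bool(words) and words[0] == first and derive(rest, words[1:])
    # non-terminal: try each rule in turn, backtracking on failure
    return any(derive(rhs + rest, words) for rhs in GRAMMAR[first])

print(derive(["sentence"], "the man bites the dog".split()))  # True
print(derive(["sentence"], "the man bites the".split()))      # False
```

The `any(...)` over alternative rules is exactly the "back up and retrace" the slide complains about.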
Transition Net Parsers
Grammar: a set of finite state machines or transition nets, one for each non-terminal
Successful transition through a network == replacing the non-terminal by the rhs of a grammar rule
E.g. the first arc in the sentence network is replaced by a path through the np network
sentence: noun phrase
What paths would be examined to parse it?
• Begin with the sentence network and try to move along the top arc
• Go to the np network
• Try to move along the bottom arc
• Go to the noun network
• Try man: fail
• Try dog: fail
• Try to move along the top arc
• Go to the article network
• Try a: fail
• Try the: fail
• Fail the article network
• Fail the np network
sentence: verb phrase
• Try to move along the bottom arc
• Go to the vp network
• Go to the v network
• Try likes: fail
• Try bites: fail
• Try bite: success
• Go to the np network
• Go to the art network
• Try a: fail
• Try the: succeed
• Go to the n network
• Try man: fail
• Try dog: succeed
• Succeed (np)
• Succeed (vp)
What Next? Note, this does not build the parse tree, it just identifies correct sentences
To build a tree:
• Each terminal returns success and a tree with the terminal as a single node
• Each non-terminal network returns a tree whose root is the non-terminal symbol and whose subtrees are the trees returned by the arcs taken
Add the tree to the steps for "bite the dog"
sentence: verb phrase (building the tree)
• Go to the vp network
• Go to the v network
• Try likes: fail
• Try bites: fail
• Try bite: success {return verb bite}
• Go to the np network
• Go to the art network
• Try a: fail
• Try the: succeed {return article the}
• Go to the n network
• Try man: fail
• Try dog: succeed {return noun dog}
• Succeed (np) {return np (article the) (noun dog)}
• Succeed (vp) {return vp (verb bite) (np (article the) (noun dog))}
Pseudo-code for a transition network parser Defined using two mutually recursive functions, parse and transition function parse(grammar_symbol) continued…
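The figure's pseudo-code is elided here, so the following is a hedged Python reconstruction of the two mutually recursive functions the slide names — `parse` handles terminals and dispatches non-terminals to `transition`, which walks one arc sequence of a net and collects the subtrees (net encoding and tree format are my assumptions):

```python
# Each net: the alternative arc sequences for one non-terminal.
NETS = {
    "sentence": [["np", "vp"]],
    "np": [["art", "n"], ["n"]],
    "vp": [["v", "np"], ["v"]],
    "art": [["a"], ["the"]],
    "n": [["man"], ["dog"]],
    "v": [["likes"], ["bites"]],
}
TERMINALS = {"a", "the", "man", "dog", "likes", "bites"}

def parse(symbol, words, pos):
    """Return (tree, new_pos) on success, or None."""
    if symbol in TERMINALS:
        if pos < len(words) and words[pos] == symbol:
            return symbol, pos + 1         # terminal becomes a leaf node
        return None
    return transition(symbol, words, pos)

def transition(symbol, words, pos):
    """Try each arc sequence of the net; return subtree rooted at symbol."""
    for arcs in NETS[symbol]:
        children, p, ok = [], pos, True
        for arc in arcs:
            result = parse(arc, words, p)
            if result is None:
                ok = False                 # this path fails: back up
                break
            tree, p = result
            children.append(tree)
        if ok:
            return [symbol] + children, p
    return None

tree, end = parse("sentence", "the man bites the dog".split(), 0)
print(tree)
```

Unlike the recognizer, each successful network call now returns a subtree, so the top-level call yields the full parse tree.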
Fig 14.5 Trace of a transition network parse of the sentence “Dog bites.”
parse(sentence) → transition(Noun_phrase) → parse(Article): terminals don't match Dog → parse(Noun): terminal matches Dog
(In the figure, red corresponds to function calls.)
14.2.3 The Chomsky Hierarchy and Context-Sensitive Languages
Chomsky hierarchy: a classification of languages by increasing linguistic complexity
• we will be concerned with context-free and context-sensitive languages
Context-free:
• one non-terminal symbol on the lhs of a rewrite rule
• problem: no requirement that dog is followed by bites, not bite
• i.e. no relation between dog and its appropriate verb, because the two can't both be on the lhs
Is a programming language (C++) context-free? Consider cast expressions and template syntax.
Context-Sensitive Grammars
More than one symbol on the lhs → a noun and verb can be related; singular and plural are part of the spec via "number"
Example:
sentence ↔ noun_phrase verb_phrase
noun_phrase ↔ article number noun
article singular ↔ a singular
article singular ↔ the singular
article plural ↔ the plural
singular noun ↔ man singular
singular verb_phrase ↔ singular verb
singular verb ↔ bites
Parse "The man bites":
sentence
noun_phrase verb_phrase
article singular noun verb_phrase
The singular noun verb_phrase
The man singular verb_phrase
The man singular verb
The man bites
Data-Driven Parse?
Example:
1. sentence ↔ noun_phrase verb_phrase
2. noun_phrase ↔ article number noun
3. article singular ↔ a singular
4. article singular ↔ the singular
5. article plural ↔ the plural
6. singular noun ↔ man singular
7. singular verb_phrase ↔ singular verb
8. singular verb ↔ bites
The man bites:
Rule 8 matches bites: The man singular verb
Rule 7: The man singular verb_phrase
Rule 6: The singular noun verb_phrase
Rule 4: article singular noun verb_phrase
Rule 2: noun_phrase verb_phrase
Rule 1: sentence
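The data-driven parse above can be sketched as a search over reductions: repeatedly replace a substring matching some rule's right-hand side by its left-hand side until only "sentence" remains. This is my own sketch; one assumption is that the slide's "number" category is flattened to "singular" so the small rule set is closed:

```python
# Rules as (lhs, rhs); a reduction replaces rhs by lhs in the string.
RULES = [
    (["sentence"], ["noun_phrase", "verb_phrase"]),       # rule 1
    (["noun_phrase"], ["article", "singular", "noun"]),   # rule 2 (number flattened)
    (["article", "singular"], ["the", "singular"]),       # rule 4
    (["singular", "noun"], ["man", "singular"]),          # rule 6
    (["singular", "verb_phrase"], ["singular", "verb"]),  # rule 7
    (["singular", "verb"], ["bites"]),                    # rule 8
]

def reductions(symbols):
    """Every string reachable by one rhs -> lhs replacement."""
    for lhs, rhs in RULES:
        for i in range(len(symbols) - len(rhs) + 1):
            if symbols[i:i + len(rhs)] == rhs:
                yield symbols[:i] + lhs + symbols[i + len(rhs):]

def parses(symbols):
    """Depth-first search for a reduction path ending in [sentence]."""
    if symbols == ["sentence"]:
        return [symbols]
    for step in reductions(symbols):
        path = parses(step)
        if path:
            return [symbols] + path
    return None

for line in parses(["the", "man", "bites"]):
    print(" ".join(line))
```

The printed path retraces the slide's bottom-up derivation; dead-end reduction orders are explored and abandoned by the search.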
Problems with Context-Sensitive Grammars
• More rules
• Obscured phrase structure: semantics gets mixed in with syntax
• Still no semantic representation
Next step: ATN parsers
• Terminals and non-terminals represented as identifiers (frames) with attached features (slots)
• Procedures attached to arcs of the network
• executed when the ATN traverses an arc
• values assigned to grammatical features
• tests performed, and a transition can fail, e.g. if there is no number agreement
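A hedged sketch of the ATN idea (the lexicon entries and feature names are illustrative assumptions, not the book's figures): each arc fills grammatical features from the word's frame and runs a test that can make the transition fail — here, number agreement between subject and verb:

```python
# Word frames: category plus a "number" feature slot.
LEXICON = {
    "dog":   {"cat": "noun", "number": "singular"},
    "dogs":  {"cat": "noun", "number": "plural"},
    "bites": {"cat": "verb", "number": "singular"},
    "bite":  {"cat": "verb", "number": "plural"},
}

def parse_sentence(words):
    """Noun arc then verb arc; the verb arc tests number agreement."""
    subject, verb = (LEXICON.get(w, {}) for w in words[:2])
    if subject.get("cat") != "noun" or verb.get("cat") != "verb":
        return None
    if subject["number"] != verb["number"]:
        return None                      # agreement test fails the arc
    return {"sentence": {"subject": subject, "verb": verb}}

print(parse_sentence(["dog", "bites"]) is not None)  # True: numbers agree
print(parse_sentence(["dog", "bite"]) is not None)   # False: no agreement
```

"dog bites" and "dogs bite" pass; "dog bite" fails the agreement test even though it is fine by the context-free rules alone.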
Fig 14.8 An ATN grammar that checks number agreement and builds a parse tree.
Combining Syntax and Semantics
Build a conceptual graph from the parse tree, e.g. the representation for a sentence:
• get the representation for the subject from the noun phrase
• get the representation for the verb phrase
• bind the subject to the agent of the graph for the verb phrase
When you reach a terminal, retrieve information from a knowledge base:
• concepts, e.g. dog, man, as in a type hierarchy (next slide)
• conceptual relations, as on the next slide
Knowledge Base Type hierarchy: Frames for likes and bites
Parse tree → Semantic Representation
1. call sentence
2. sentence calls noun_phrase
3. noun_phrase calls noun
4. noun returns the concept for dog (1)
5. article is definite → bind a marker to dog (2)
6. sentence calls verb_phrase
7. verb_phrase calls verb, which retrieves the frame for likes (3)
8. verb_phrase calls noun_phrase, which calls noun to retrieve man (4)
9. article is indefinite → leave the concept generic (7)
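The steps above can be sketched as a traversal that pulls concepts from a knowledge base at the terminals and binds the subject into the verb's frame. The tree shape, knowledge-base entries, and marker notation are my assumptions:

```python
# Tiny knowledge base: concepts and a relation frame for "likes".
KB = {
    "dog": {"type": "dog"},
    "man": {"type": "man"},
    "likes": {"relation": "likes", "agent": None, "object": None},
}

def interpret(tree):
    """tree = (verb, subject_np, object_np); each np = (article, noun)."""
    verb, (s_art, s_noun), (o_art, o_noun) = tree
    frame = dict(KB[verb])
    # definite article -> bind an individual marker; indefinite -> generic
    frame["agent"] = {"concept": s_noun,
                      "referent": "#1" if s_art == "the" else None}
    frame["object"] = {"concept": o_noun,
                       "referent": "#2" if o_art == "the" else None}
    return frame

graph = interpret(("likes", ("the", "dog"), ("a", "man")))
print(graph)
```

"The dog" gets an individual marker (`#1`) because its article is definite; "a man" stays a generic concept, mirroring steps 5 and 9.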
14.5 Natural Language Applications Story understanding and question answering • goal: a program that can read a story and answer questions • why useful? What can we do so far? • parse and interpret a sentence (perform network joins between semantic interpretation of the input and conceptual graphs in the knowledge base) • can we expand this? Yes • answer questions • scripts • join semantic representations for multiple sentences
Answer Questions
Answer questions:
• Fido bit Tony.
• What did Fido bite Tony with?
Scripts:
• Fido bit Tony. Tony has blood on his coat.
• A script might infer that the blood came from the bite.
Join Semantic Representations of Sentences
Given:
• Fido bit Tony
• Fido has no teeth
What can we conclude?
14.5.2 Database Front End
Information is structured.
What is John Smith's salary?
select salary from employee_salary
where employee = 'John Smith'
List the salaries of employees who work for Ed Angel:
select salary from employee_salary, manager_of_hire
where manager = 'Ed Angel'
and manager_of_hire.employee = employee_salary.employee
Entity-relationship diagrams Knowledge base entry
Database query from natural language input "Who hired john smith?"
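One common front-end approach is template matching: match the question against a pattern tied to the database schema and fill in a SQL query. This is my own hedged sketch — the table and column names come from the earlier slide, but the template itself is an assumption:

```python
import re

def to_sql(question):
    """Map one question template to SQL over the employee_salary table."""
    m = re.match(r"What is (.+)'s salary\?", question)
    if m:
        return ("select salary from employee_salary "
                f"where employee = '{m.group(1)}'")
    return None  # question doesn't match any known template

print(to_sql("What is John Smith's salary?"))
# select salary from employee_salary where employee = 'John Smith'
```

A real front end would need many templates (or a parser plus the entity-relationship model) and must reject questions it cannot map, as the `None` return does here.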
Fig 14.20 An architecture for information extraction, from Cardie (1997). As on preceding slide