LING/C SC/PSYC 438/538 Computational Linguistics

LING/C SC/PSYC 438/538 Computational Linguistics. Sandiway Fong. Lecture 2: 8/24. Administrivia. Textbook: Speech and Language Processing by Jurafsky & Martin, Prentice-Hall 2000 (2nd edition to come summer 2007). One copy on reserve in the library.


Presentation Transcript


  1. LING/C SC/PSYC 438/538 Computational Linguistics • Sandiway Fong • Lecture 2: 8/24

  2. Administrivia • Textbook • Speech and Language Processing by Jurafsky & Martin. Prentice-Hall 2000 • (2nd edition to come summer 2007) • One copy on reserve in the library • Homepage • errata, updated chapters from the forthcoming 2nd edition etc. • http://www.cs.colorado.edu/~martin/slp.html • Background Reading Assignment • Chapter 1: Introduction • background, history • available on-line • http://www.cs.colorado.edu/~martin/slp-ch1.pdf

  3. Today’s Topic • Gentle introduction to Prolog • not used in the SLP textbook, but you’ll need it for the homeworks • we’ll be using it to encode finite state automata (FSA), grammars, inference rules • Assignment • install SWI-Prolog on your computer (free) • current version: 5.6.17 • www.swi-prolog.org • manual is downloadable from the website • run today’s exercises • Something to consider • Computer Lab Class today for LING 388: a hands-on first look at Prolog • Time: 3:30pm–4:45pm • Place: Social Sciences 224

  4. Some Background • Prolog = Programming in Logic • based on Horn clause logic • subset of first-order predicate calculus • meaning: can’t do everything 1st order logic can • however, this does not limit Prolog • Roots in Mechanical Theorem Proving • people investigating automated proof methods for mathematics • Resolution Rule (Robinson, 65) • Language invented by Colmerauer (implementor) and Kowalski (theoretician, textbook: Logic for Problem Solving) in the early 70s • designed to support natural language processing • … has grammar rules • There was one experiment to teach Prolog to schoolkids • 13 year olds in London (early 1980s?) • yes, you can learn it too!

  5. Interesting History • Prolog was adopted for Japan’s Fifth Generation Computer Project (80s) • 54.2 billion yen project • ≈ 500 million dollars • Was supposed to leapfrog the rest of the world • well, we know what (didn’t) happen [The term "fifth generation" was intended to convey the system as being a leap beyond existing machines. Computers using vacuum tubes were called the first generation, transistors and diodes the second, ICs the third, and those using microprocessors the fourth.] are we really still stuck here?

  6. Interesting History • see also: http://en.wikipedia.org/wiki/Fifth_generation_computer • The project imagined a parallel processing computer running on top of massive databases (as opposed to a traditional filesystem) using a logic programming language to access the data. • They envisioned building a prototype machine with performance between 100M and 1G LIPS (LIPS = Logical Inference Per Second). • Echoes of this in the more recent Human Genome Project • W/MIT: LabBase • “Queries (and updates) are posed in a non-recursive logic programming language: the syntax and semantics are essentially those of a subset of Prolog” • http://www.icot.or.jp/ MUSEUM

  7. Prolog by Example • Key Concepts • Name things (symbols) • State facts (things that are true) • State rules (relations between things that are true) • Facts and rules are stored in Prolog’s database • Prolog has the ability to do logical inference over its database • We communicate with this database by posing logic queries • Closed World Assumption: things are true only when given positive evidence • Prolog never says “I don’t know” • Prolog only knows what it can infer from the database • if it cannot prove a query, it says the answer is “No” • Example • Medals: gold, silver, bronze • Facts: medal(gold). medal(silver). medal(bronze). • Database query: • ?- medal(silver). Yes • ?- medal(aluminum). No
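The Closed World Assumption on this slide can be tried out directly. The following is a minimal sketch, assuming the three medal facts are saved in a file and loaded into SWI-Prolog; the predicate name not_a_medal/1 is a hypothetical helper introduced here for illustration and does not come from the lecture. It relies on the standard \+ operator ("not provable").

```prolog
% Facts from the slide.
medal(gold).
medal(silver).
medal(bronze).

% not_a_medal/1 is a hypothetical helper (not from the lecture).
% It succeeds exactly when medal(X) cannot be proved from the database,
% which is how the Closed World Assumption shows up in practice.
not_a_medal(X) :- \+ medal(X).
```

With this loaded, ?- medal(aluminum). fails and ?- not_a_medal(aluminum). succeeds: Prolog treats "cannot prove" as "false". Recent SWI-Prolog versions print true/false where the slides show Yes/No.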

  8. Prolog by Example • Database query with variables • ?- medal(X). • X = gold ; • X = silver ; • X = bronze ; • No • Notation • There are no variable declarations in Prolog • How do we know a variable from a symbol? • By convention: • Variable names begin with an uppercase letter, e.g. X • Ordinary symbols begin with a lowercase letter, e.g. gold • Other things you need to know: • ; (disjunction): here it means “next answer please” • ?- is the Prolog interpreter’s “prompt”, meaning it’s ready to receive a query • . (the period) terminates a query or fact
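Instead of typing ; after each answer, all bindings for X can be gathered in one go. A small sketch, assuming the medal facts from the previous slide; all_medals/1 is a hypothetical name chosen for this illustration, while findall/3 is a standard built-in.

```prolog
medal(gold).
medal(silver).
medal(bronze).

% all_medals/1 is a hypothetical helper (not from the lecture).
% findall/3 collects every X for which medal(X) can be proved,
% in the order the facts appear in the database.
all_medals(Medals) :- findall(X, medal(X), Medals).
```

The query ?- all_medals(Ms). answers Ms = [gold, silver, bronze], the same three answers the slide obtains one at a time with ;.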

  9. Prolog by Example • Facts can be relations: • better(gold,silver). • better(silver,bronze). • better(bronze,nothing). • Encoding: “gold is better than silver” ... etc. • Database query: • ?- better(silver,bronze). Yes • ?- better(bronze,X). X = nothing • Another query: • ?- better(gold,nothing). No (WHAT!!) • Prolog can’t infer this from mere facts • We have to explicitly state the rule involved in making this natural deduction • Rule: (transitivity) • X is better than Y if X is better than (some) Z, and Z is better than Y • better(X,Y) :- better(X,Z), better(Z,Y). • Notation: :- means “if” and , means “and”

  10. Prolog by Example • Retry query: • ?- better(gold,nothing). Yes (OK!!) • How did Prolog prove it via database matching? • ?- better(gold,nothing). • Tries to match query to the facts (fails) • Tries to match query to the rule (succeeds) • better(X,Y) :- better(X,Z), better(Z,Y). • when X = gold and Y = nothing • Original query reduces to two subproblems or subqueries • ?- better(gold,Z). • ?- better(Z,nothing). • Both subqueries must succeed for original query to succeed • Notes: • Prolog takes one query at a time (serial not parallel) • in chronological order (of definition) • behavior: depth-first search

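  11. Prolog by Example • Knowledge Base • better(gold,silver). • better(silver,bronze). • better(bronze,nothing). • better(X,Y) :- better(X,Z), better(Z,Y). • Computation tree • better(gold,nothing) splits into better(gold,Z) and better(Z,nothing) • better(gold,Z) matches better(gold,silver), so better(Z,nothing) becomes better(silver,nothing) • better(silver,nothing) splits into better(silver,Z’) and better(Z’,nothing) • better(silver,Z’) matches better(silver,bronze), so better(Z’,nothing) becomes better(bronze,nothing), a database fact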

  12. Prolog by Example • Let’s trace the sequence of database queries for ?- better(gold,nothing). • ?- better(gold,Z). Z = silver (by database fact) • ?- better(Z,nothing). • ?- better(silver,nothing). • ?- better(silver,Z’). Z’ = bronze (by database fact) • ?- better(Z’,nothing). • ?- better(bronze,nothing). (database fact) • Yes • Notation: Z’ is used to distinguish the instance of Z in this subquery from the main Z • Computation tree can be explored mechanically in this fashion • Simple strategy lies at the heart of Prolog • Each inference step takes microseconds or less • Computers can make millions of inferences a second • Exploring long chains of inference is not a problem
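The trace above can be reproduced mechanically. What follows is a sketch of a tiny "meta-interpreter", assuming SWI-Prolog; the predicate show/1 is a hypothetical name introduced here, not part of the lecture. It proves a goal the same way Prolog does (clauses top to bottom, subqueries left to right) and prints each subquery as it is attempted. The dynamic declaration is there so that clause/2 is allowed to inspect the better/2 clauses.

```prolog
:- dynamic better/2.   % lets clause/2 look up the clauses below

better(gold,silver).
better(silver,bronze).
better(bronze,nothing).
better(X,Y) :- better(X,Z), better(Z,Y).

% show/1 is a hypothetical helper (not from the lecture).
% show(Goal): prove Goal like Prolog would, printing every subquery.
show(true) :- !.                      % body of a fact: nothing left to prove
show((A,B)) :- !, show(A), show(B).   % conjunction: left subquery first
show(Goal) :-
    format("?- ~w.~n", [Goal]),       % print the subquery being tried
    clause(Goal, Body),               % pick a matching fact or rule, in order
    show(Body).                       % prove the body of that clause
```

Running ?- show(better(gold,nothing)). prints the subqueries in the order listed on the slide, with fresh variables displayed as names like _123 rather than Z and Z’. Asking for further answers with ; sends this sketch into the same endless search as the left-recursive rule itself. SWI-Prolog also has a built-in tracer, started with ?- trace., that shows similar information.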

  13. Prolog by Example • Prolog’s simple mechanical exploration strategy has its limitations • Query: • ?- better(silver,gold). No (EXPECTED ANSWER) • What actually happens with Prolog? • Doesn’t work!!!

  14. Prolog by Example • Computation tree: • ?- better(silver,gold). • ?- better(silver,Z). • ?- better(Z,gold). • ?- better(silver,Z). (FIRST SUBQUERY FROM ABOVE) • Z = bronze • ?- better(bronze,gold). (SECOND SUBQUERY WITH Z = BRONZE) • ?- better(bronze,Z’). • ?- better(Z’,gold). • ?- better(bronze,Z’). (FIRST SUBQUERY FROM SECOND SUBQUERY ABOVE) • Z’ = nothing • ?- better(nothing,gold). (SECOND SUBQUERY WITH Z’ = NOTHING) • ?- better(nothing,Z”). • ?- better(Z”,gold). • ?- better(nothing,Z”). (FIRST SUBQUERY FROM IMMEDIATELY ABOVE) • ?- better(nothing,Z”’). • ?- better(Z”’,Z”). • ?- better(nothing,Z”’). (OH NO!! WE’RE REPEATING OURSELVES AND GOING ROUND IN CIRCLES)

  15. Prolog by Example • What is happening to the computation tree? • We’re in an infinite loop • What does Prolog do with infinite loops? • Nothing • It just keeps going round and round, generating sub-query after sub-query of the form ?- better(nothing,Zn). until memory is exhausted • (the n in Zn stands for an ever-increasing number of primes (’)) • Then it declares an error • What would a more intelligent system do? • Detect infinite loops automatically • And declare failure to prove when one is encountered • Well... Prolog has no loop detector • Computational tradeoff: it is too expensive, computationally speaking, to check on every inference whether we are in a loop • Besides, we can shift the burden to the programmer (you!) • i.e. rely instead on the programmer to be smart enough to write rules that don’t generate infinite loops
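For comparison, some later Prolog systems offer an opt-in mechanism called tabling that behaves much like the "more intelligent system" described here: answers to a predicate are memoized, and a repeated subquery is not explored again. The sketch below assumes a recent SWI-Prolog release with tabling support; it is not available in version 5.6.17 named in these slides, and it is not the fix the lecture goes on to develop.

```prolog
% Assumption: a recent SWI-Prolog with tabling support.
% The table directive asks the system to memoize better/2, so a
% subquery that repeats is not expanded again.
:- table better/2.

better(gold,silver).
better(silver,bronze).
better(bronze,nothing).
better(X,Y) :- better(X,Z), better(Z,Y).   % still left recursive
```

Under that assumption, ?- better(gold,nothing). succeeds and ?- better(silver,gold). fails instead of looping. The cost is the bookkeeping the slide mentions, which is why it has to be requested per predicate.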

  16. Prolog by Example • How can a programmer rewrite this rule to fix the problem? • X is better than Y if X is better than (some) Z and Z is better than Y • better(X,Y) :- better(X,Z), better(Z,Y). • Definition is highly recursive • recursive in the sense that to prove better, we call better itself as a subquery • i.e. better is defined in terms of better • Notice: better calls better left recursively (FATAL MISTAKE) • i.e. better is true if better is true, and ... • left recursive: means leftmost (i.e. first) call is to the same named predicate • Idea: restate definition without using left recursion • X is better_than Y if X is better than Y, or X is better than Z and Z is better_than Y • better_than(X,Y) :- better(X,Y). • better_than(X,Y) :- better(X,Z), better_than(Z,Y).

  17. Prolog by Example • better_than(X,Y) :- better(X,Y). • better_than(X,Y) :- better(X,Z), better_than(Z,Y). • New query: • ?- better_than(silver,gold). • Computation tree: • ?- better_than(silver,gold). • ?- better(silver,Z). • ?- better_than(Z,gold). • ?- better(silver,Z). (FIRST SUBQUERY FROM ABOVE) • Z = bronze • ?- better_than(bronze,gold). (SECOND SUBQUERY WITH Z = BRONZE) • ?- better(bronze,Z’). • ?- better_than(Z’,gold). • ?- better(bronze,Z’). (FIRST SUBQUERY FROM SECOND SUBQUERY ABOVE) • Z’ = nothing • ?- better_than(nothing,gold). (SECOND SUBQUERY WITH Z’ = NOTHING) • ?- better(nothing,Z”). • ?- better_than(Z”,gold). • ?- better(nothing,Z”). • No
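The facts and the two better_than rules can be put in one file and tested. This is a minimal sketch using only predicates from the slides plus the standard built-in findall/3.

```prolog
% Fact database from the slides.
better(gold,silver).
better(silver,bronze).
better(bronze,nothing).

% Non-left-recursive definition from the slides: the first call in the
% recursive clause is to better/2 (facts only), so it always terminates
% on this database.
better_than(X,Y) :- better(X,Y).
better_than(X,Y) :- better(X,Z), better_than(Z,Y).
```

Here ?- better_than(gold,nothing). succeeds, ?- better_than(silver,gold). fails as traced above, and ?- findall(Y, better_than(gold,Y), Ys). answers Ys = [silver, bronze, nothing].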

  18. Recursion • Difference now is • subquery fails instead of looping (GOOD!!) • [Smarter interpreter (not Prolog): fail when a loop is detected.] • Another way to think about better_than: • better_than(X,Y) :- better(X,Y). • better_than(X,Y) :- better(X,Z), better_than(Z,Y). • [finite state machine diagram; labels: better, start state, final state] • Fact database: • better(gold,silver). • better(silver,bronze). • better(bronze,nothing).

  19. Recursion • Yet another way to look at it • Given a relation R (transitivity) • xRy & yRz → xRz • Transform definition into: • Given relations R and R′ • xRy → xR′y • xRy & yR′z → xR′z (non-left recursive version) • This is exactly what we did with “better” • Given • better(X,Y) :- better(X,Z), better(Z,Y). • We transformed it into: • better_than(X,Y) :- better(X,Y). • better_than(X,Y) :- better(X,Z), better_than(Z,Y). • Note: the different directions for → and :- • antecedent → consequent • consequent :- antecedent
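The same transformation can be written once for any two-place relation instead of being repeated per predicate. The sketch below is an illustration only: closure/3 is a hypothetical name introduced here, and it assumes the standard call/3 built-in, which calls the relation named by its first argument with the two remaining arguments.

```prolog
better(gold,silver).
better(silver,bronze).
better(bronze,nothing).

% closure/3 is a hypothetical helper (not from the lecture).
% closure(R, X, Y): Y can be reached from X by one or more R steps.
% Same shape as better_than/2: the first call is to R itself,
% never to closure/3, so the definition is not left recursive.
closure(R, X, Y) :- call(R, X, Y).
closure(R, X, Y) :- call(R, X, Z), closure(R, Z, Y).
```

With the better facts loaded, ?- closure(better, gold, nothing). succeeds and ?- closure(better, silver, gold). fails, matching better_than. Termination still depends on the facts: if the relation contains a cycle, this sketch can loop just as better_than would.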

  20. Recursion • Recursion is a powerful concept • not unique to Prolog • or to programming languages in general • compact and powerful way to write certain kinds of definitions • a way to code infinite sets in a finite manner • use it carefully or your programs (or grammars) may not terminate • given Prolog’s depth-first strategy

  21. Prolog Resources (slide from Lecture 1) • Useful Online Tutorials • An introduction to Prolog • (Michel Loiseleur & Nicolas Vigier) • http://invaders.mars-attacks.org/~boklm/prolog/ • Learn Prolog Now! • (Patrick Blackburn, Johan Bos & Kristina Striegnitz) • http://www.coli.uni-saarland.de/~kris/learn-prolog-now/lpnpage.php?pageid=online
