
Prolog for Linguists Symbolic Systems 139P/239P



  1. Prolog for Linguists Symbolic Systems 139P/239P John Dowding Week 4, October 29, 2001 jdowding@stanford.edu

  2. Office Hours • We have reserved 4 workstations in the Unix Cluster in Meyer library, fables 1-4 • Skipping 4:30-5:30 on Thursday this week • Friday 3:30-4:30, after NLP Reading Group this week • If not, contact me and we can make other arrangements

  3. Course Schedule • Oct. 8 • Oct. 15 • Oct. 22 • Oct. 29 • Nov. 5 (double up) • Nov. 12 • Nov. 26 (double up) • Dec. 3 • No class on Nov. 19

  4. Homework • By now, we have covered most of Chapters 3 and 4 of Clocksin and Mellish. Read them and let me know if you have any questions. • Homework to be handed in by noon on the 29th.

  5. subterm/2
     % subterm(+SubTerm, +Term)
     subterm(Term1, Term2) :-
         Term1 == Term2, !.
     subterm(SubTerm, Term) :-
         compound(Term),
         functor(Term, _Functor, Arity),
         subterm_helper(Arity, SubTerm, Term).

  6. subterm_helper/3
     % subterm_helper(+Index, +SubTerm, +Term)
     subterm_helper(Index, SubTerm, Term) :-
         Index > 0,
         arg(Index, Term, Arg),
         subterm(SubTerm, Arg), !.
     subterm_helper(Index, SubTerm, Term) :-
         Index > 0,
         NextIndex is Index - 1,
         subterm_helper(NextIndex, SubTerm, Term).

  7. occurs_in/2
     % occurs_in(-Var, +Term)
     occurs_in(Var, Term) :-
         var(Var),
         Var == Term, !.
     occurs_in(Var, Term) :-
         compound(Term),
         functor(Term, _Functor, Arity),
         occurs_in_helper(Arity, Var, Term).

  8. occurs_in_helper/3
     % occurs_in_helper(+Index, -Var, +Term)
     occurs_in_helper(Index, Var, Term) :-
         Index > 0,
         arg(Index, Term, Arg),
         occurs_in(Var, Arg).
     occurs_in_helper(Index, Var, Term) :-
         Index > 0,
         NextIndex is Index - 1,
         occurs_in_helper(NextIndex, Var, Term).

  9. (Or) occurs_in/2
     % occurs_in(-Var, +Term)
     occurs_in(Var, Term) :-
         var(Var),
         subterm(Var, Term).

  10. replace_all/4
      % replace_all(+Element, +Term, +NewElement, -ResultTerm)
      replace_all(Element, Term, NewElement, NewElement) :-
          Element == Term, !.
      replace_all(_Element, Term, _NewElement, Term) :-
          atomic(Term).
      replace_all(_Element, Term, _NewElement, Term) :-
          var(Term).
      replace_all(Element, Term, NewElement, ResultTerm) :-
          compound(Term),
          functor(Term, Functor, Arity),
          functor(ResultTerm, Functor, Arity),
          replace_all_helper(Arity, Element, Term, NewElement, ResultTerm).

  11. replace_all_helper/5
      replace_all_helper(0, _Element, _Term, _NewElement, _ResultTerm) :- !.
      replace_all_helper(Index, Element, Term, NewElement, ResultTerm) :-
          arg(Index, Term, Arg),
          arg(Index, ResultTerm, ResultArg),
          replace_all(Element, Arg, NewElement, ResultArg),
          NextIndex is Index - 1,
          replace_all_helper(NextIndex, Element, Term, NewElement, ResultTerm).
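
The helpers above all walk a compound term by counting an argument index down from its arity, using the built-ins functor/3 and arg/3. As a minimal sketch of that same pattern in isolation, the hypothetical predicate term_args/2 below (not part of the course material) collects a term's arguments into a list:

```prolog
% term_args(+Term, -Args)
% Args is the list of Term's arguments, first to last.
% Hypothetical helper for illustration only.
term_args(Term, Args) :-
    functor(Term, _Functor, Arity),          % look up the arity
    term_args_helper(Arity, Term, [], Args).

% term_args_helper(+Index, +Term, +Partial, -Args)
% Counts Index down to 0, pushing each argument onto Partial.
term_args_helper(0, _Term, Args, Args) :- !.
term_args_helper(Index, Term, Partial, Args) :-
    arg(Index, Term, Arg),                   % fetch argument number Index
    NextIndex is Index - 1,
    term_args_helper(NextIndex, Term, [Arg|Partial], Args).
```

For example, the goal term_args(f(a, g(b)), Args) binds Args to [a, g(b)]. An atom has arity 0, so term_args(foo, Args) gives the empty list.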

  12. flatten/2
      % flatten(+List, -ListOfAtoms)
      flatten([], []) :- !.
      flatten(Atomic, [Atomic]) :-
          atomic(Atomic).
      flatten([Head|Tail], ListOfAtoms) :-
          flatten(Head, ListOfAtoms1),
          flatten(Tail, ListOfAtoms2),
          append(ListOfAtoms1, ListOfAtoms2, ListOfAtoms).

  13. flatten_dl/2
      % flatten_dl(+List, -ListOfAtoms)
      flatten_dl(List, ListOfAtoms) :-
          flatten_dl_helper(List, (ListOfAtoms-[])).
      % flatten_dl_helper(+List, ?DiffList)
      flatten_dl_helper([], (Empty-Empty)) :- !.
      flatten_dl_helper(Atomic, ([Atomic|Back]-Back)) :-
          atomic(Atomic).
      flatten_dl_helper([Head|Tail], (Front-Back)) :-
          flatten_dl_helper(Head, (Front-NextBack)),
          flatten_dl_helper(Tail, (NextBack-Back)).
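
The reason flatten_dl/2 needs no call to append/3 is that two difference lists can be joined by a single unification. A minimal sketch of that idea, using a hypothetical predicate name append_dl/3 that does not appear in the slides:

```prolog
% append_dl(?DiffList1, ?DiffList2, ?Result)
% Joins two difference lists in one step: the open tail of the
% first list is unified with the front of the second.
% Hypothetical helper for illustration only.
append_dl(Front-Middle, Middle-Back, Front-Back).
```

For example, the goal append_dl([a,b|T1]-T1, [c|T2]-T2, List-[]) binds List to [a,b,c]. This is the same linking that flatten_dl_helper/2 performs through its shared NextBack variable.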

  14. Could have written subterm/2 as:
      % subterm(+SubTerm, +Term)
      subterm(SubTerm, Term) :-
          replace_all(SubTerm, Term, _AnyThing, NewTerm),
          \+ Term == NewTerm.
      • But this would be slower

  15. Accumulators • Build up partial results to return at the end
      % Without an accumulator:
      list_length([], 0).
      list_length([_Head|Tail], Result) :-
          list_length(Tail, N),
          Result is N + 1.
      % With an accumulator (an alternative definition of list_length/2):
      list_length(List, Result) :-
          list_length_helper(List, 0, Result).
      list_length_helper([], Result, Result).
      list_length_helper([_Head|Tail], Partial, Result) :-
          NextPartial is Partial + 1,
          list_length_helper(Tail, NextPartial, Result).
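
List reversal is another common place to use the accumulator pattern. The sketch below uses the hypothetical names reverse_acc/2 and reverse_acc_helper/3, which are not from the slides; each element is pushed onto the partial result, so no append/3 is needed:

```prolog
% reverse_acc(+List, -Reversed)
% Hypothetical example for illustration only.
reverse_acc(List, Reversed) :-
    reverse_acc_helper(List, [], Reversed).

% reverse_acc_helper(+List, +Partial, -Result)
% Partial holds the elements seen so far, in reverse order.
reverse_acc_helper([], Result, Result).
reverse_acc_helper([Head|Tail], Partial, Result) :-
    reverse_acc_helper(Tail, [Head|Partial], Result).
```

For example, reverse_acc([a,b,c], R) binds R to [c,b,a].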

  16. flatten/2 with an accumulator
      % flatten_acc(+ListOfLists, -ListOfAtoms)
      flatten_acc(List, ListOfAtoms) :-
          flatten_acc_helper(List, [], ListOfAtoms).
      % flatten_acc_helper(+ListOfLists, +PartialResult, -FinalResult)
      flatten_acc_helper([], PartialResult, PartialResult) :- !.
      flatten_acc_helper(Atomic, Partial, [Atomic|Partial]) :-
          atomic(Atomic), !.
      flatten_acc_helper([Head|Tail], PartialResult, FinalResult) :-
          flatten_acc_helper(Tail, PartialResult, NextResult),
          flatten_acc_helper(Head, NextResult, FinalResult).

  17. Difference Lists • Use two logical variables that point to different portions of the same list. • Compare stacks with queues:

  18. Queues • Queue represented as a pair of lists (Front-Back) • Back is always a variable
      % empty_queue(?Queue): true if the queue is empty
      empty_queue(Queue-Queue).
      % add_to_queue(+Element, +Queue, -NewQueue)
      add_to_queue(Element, (Front-[Element|Back]), (Front-Back)).
      % remove_from_queue(+Queue, -Element, -NewQueue)
      remove_from_queue(([Element|Front]-Back), Element, (Front-Back)).
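
To see the first-in, first-out behaviour, the sketch below repeats the three queue predicates from the slide so that it can be loaded on its own, and adds a hypothetical driver, queue_demo/2, that is not part of the course material:

```prolog
% The three queue predicates, copied from the slide above.
empty_queue(Queue-Queue).
add_to_queue(Element, (Front-[Element|Back]), (Front-Back)).
remove_from_queue(([Element|Front]-Back), Element, (Front-Back)).

% queue_demo(-First, -Second)
% Hypothetical driver: add a, then b, then remove two elements.
queue_demo(First, Second) :-
    empty_queue(Queue0),
    add_to_queue(a, Queue0, Queue1),
    add_to_queue(b, Queue1, Queue2),
    remove_from_queue(Queue2, First, Queue3),
    remove_from_queue(Queue3, Second, _Queue4).
```

The goal queue_demo(First, Second) binds First to a and Second to b, so elements leave in the order they arrived. One caveat: as written, remove_from_queue/3 does not check for emptiness, so calling it on an empty queue succeeds with an unbound element instead of failing.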

  19. Generate-and-Test • Popular (and sometimes efficient) way to write a program.
      Goal :-
          Generator,    % generates candidate solutions
          Tester.       % verifies correct answers

  20. One more generate and test example • N-Queens Problem
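
The slide does not include the code, so the following is only one possible generate-and-test formulation of N-Queens, not necessarily the one presented in class. All predicate names here (queens/2, numbers/3, perm/2, pick/3, safe/1, no_attack/3) are hypothetical. A board is a list of row numbers, one per column; the generator enumerates permutations of 1..N, and the tester rejects any permutation with two queens on a diagonal.

```prolog
% queens(+N, -Queens)
% Queens is a list of N row numbers, one for each column.
queens(N, Queens) :-
    numbers(1, N, Rows),
    perm(Rows, Queens),      % generator: candidate placements
    safe(Queens).            % tester: no two queens attack

% numbers(+Low, +High, -List): List is [Low, ..., High].
numbers(Low, High, []) :-
    Low > High.
numbers(Low, High, [Low|Rest]) :-
    Low =< High,
    Next is Low + 1,
    numbers(Next, High, Rest).

% perm(+List, -Permutation): enumerates permutations on backtracking.
perm([], []).
perm(List, [X|Perm]) :-
    pick(X, List, Rest),
    perm(Rest, Perm).

% pick(?X, +List, -Rest): X is an element of List, Rest is what remains.
pick(X, [X|Tail], Tail).
pick(X, [Head|Tail], [Head|Rest]) :-
    pick(X, Tail, Rest).

% safe(+Queens): no queen shares a diagonal with a later queen.
% Rows and columns are already distinct because Queens is a permutation.
safe([]).
safe([Queen|Others]) :-
    no_attack(Queen, Others, 1),
    safe(Others).

% no_attack(+Queen, +Others, +Distance)
% Distance is the column distance between Queen and the head of Others.
no_attack(_Queen, [], _Distance).
no_attack(Queen, [Other|Others], Distance) :-
    Queen =\= Other + Distance,
    Queen =\= Other - Distance,
    NextDistance is Distance + 1,
    no_attack(Queen, Others, NextDistance).
```

For N = 4, the first answer to queens(4, Queens) is [2,4,1,3]. Because the generator tries all N! permutations in the worst case, this version is only practical for small boards.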

  21. Unification • Two terms unify iff there is a substitution of terms for variables that makes the terms identical • True unification disallows cyclic terms: • X = f(X) ought to fail because there is no finite term that can substitute for X to make those terms identical. • The test that rules this out is called the occurs check. • Prolog unification does not enforce the occurs check, and may create cyclic terms • Occurs check is expensive • Without it: O(n), where n is the size of the smaller of the two terms • With it: O(n+m), where n and m are the sizes of the two terms • In Prolog, it is quite typical to unify a variable with a larger term
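
ISO Prolog also provides a built-in that does enforce the check, unify_with_occurs_check/2. The sketch below, with the hypothetical name occurs_check_demo/0, assumes a system that implements this ISO built-in and shows it rejecting the cyclic case while accepting an ordinary one:

```prolog
% occurs_check_demo
% Succeeds if unify_with_occurs_check/2 behaves as described above.
% Hypothetical example for illustration only.
occurs_check_demo :-
    \+ unify_with_occurs_check(X, f(X)),   % X occurs in f(X): must fail
    unify_with_occurs_check(Y, f(a)),      % no cycle: succeeds
    Y == f(a).
```

By contrast, what the plain goal X = f(X) does is system dependent: some systems build a cyclic term, while others may loop when they later try to print or traverse it.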

  22. unify_woc/2 (with occurs check)
      % unify_woc(?Term1, ?Term2)
      % Identical terms (including the same variable twice) unify trivially;
      % without this clause, unify_woc(X, X) would fail the occurs check.
      unify_woc(Term1, Term2) :-
          Term1 == Term2, !.
      unify_woc(Var1, Term2) :-
          var(Var1), !,
          \+ occurs_in(Var1, Term2),
          Var1 = Term2.
      unify_woc(Term1, Var2) :-
          var(Var2), !,
          \+ occurs_in(Var2, Term1),
          Var2 = Term1.
      unify_woc(Atomic1, Atomic2) :-
          atomic(Atomic1), atomic(Atomic2), !,
          Atomic1 == Atomic2.
      unify_woc(Term1, Term2) :-
          compound(Term1), compound(Term2),
          functor(Term1, Functor, Arity),
          functor(Term2, Functor, Arity),
          unify_woc_helper(Arity, Term1, Term2).

  23. unify_woc_helper/3
      % unify_woc_helper(+Index, +Term1, +Term2)
      unify_woc_helper(0, _Term1, _Term2) :- !.
      unify_woc_helper(Index, Term1, Term2) :-
          arg(Index, Term1, Arg1),
          arg(Index, Term2, Arg2),
          unify_woc(Arg1, Arg2),
          NextIndex is Index - 1,
          unify_woc_helper(NextIndex, Term1, Term2).

  24. More about cut! • Common to distinguish between red cuts and green cuts • Red cuts change the solutions of a predicate • Green cuts do not change the solutions, but affect the efficiency • Most of the cuts we have used so far are red cuts
      % delete_all(+Element, +List, -NewList)
      delete_all(_Element, [], []).
      delete_all(Element, [Element|List], NewList) :- !,
          delete_all(Element, List, NewList).
      delete_all(Element, [Head|List], [Head|NewList]) :-
          delete_all(Element, List, NewList).

  25. Green cuts • Green cuts can be used to avoid unproductive backtracking
      % identical(?Term1, ?Term2)
      identical(Var1, Var2) :-
          var(Var1), var(Var2), !,
          Var1 == Var2.
      identical(Atomic1, Atomic2) :-
          atomic(Atomic1), atomic(Atomic2), !,
          Atomic1 == Atomic2.
      identical(Term1, Term2) :-
          compound(Term1), compound(Term2),
          functor(Term1, Functor, Arity),
          functor(Term2, Functor, Arity),
          identical_helper(Arity, Term1, Term2).

  26. Technique: moving unifications after the cut
      % parent(+Person, -NumParents)
      parent(adam, 0) :- !.
      parent(eve, 0) :- !.
      parent(_EverybodyElse, 2).
      • The goal parent(eve, 2) succeeds, because the head of the second clause does not unify and its cut is never reached.
      % parent(+Person, ?NumParents)
      parent(adam, NumParents) :- !, NumParents = 0.
      parent(eve, NumParents) :- !, NumParents = 0.
      parent(_EverybodyElse, 2).
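
The same technique applies to the textbook example of computing a maximum. The sketch below uses the hypothetical names max_green/3 and max_red/3, which are not from the slides, and contrasts a green cut with a red cut whose output unification has been moved after the cut:

```prolog
% max_green(+X, +Y, -Max)
% Green cut: each clause is correct on its own, and the cut only
% saves a redundant comparison.
max_green(X, Y, X) :- X >= Y, !.
max_green(X, Y, Y) :- X < Y.

% max_red(+X, +Y, ?Max)
% Red cut: the second clause is only correct because the first
% clause commits. The unification Max = X comes after the cut, so
% a wrong answer supplied by the caller cannot slip through to
% the second clause.
max_red(X, Y, Max) :- X >= Y, !, Max = X.
max_red(_X, Y, Y).
```

With this definition the goal max_red(3, 2, 2) fails, as it should. If the first clause were written max_red(X, Y, X) :- X >= Y, !. instead, the same goal would wrongly succeed through the second clause, which is exactly the parent(eve, 2) problem above.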

  27. Last Call Optimization • Generalization of Tail-Recursion Optimization • Turns recursion into iteration by reusing the stack frame • The frame can be reused when about to execute the last Goal in a clause, • If there are no more choice points for the predicate, • And no choice points from earlier Goals in the clause
      delete_all(_Element, [], []).
      delete_all(Element, [Element|List], NewList) :- !,
          delete_all(Element, List, NewList).
      delete_all(Element, [Head|List], [Head|NewList]) :-
          delete_all(Element, List, NewList).

  28. Advice on cuts • Dangerous, easy to misuse • Rules of thumb: • Use sparingly • Use with as narrow scope as possible • Know which choice points you are removing • Green cuts may be unnecessary; sometimes the compiler can figure it out.

  29. Input/Output of Terms • Input and Output in Prolog takes place on Streams • By default, input comes from the keyboard, and output goes to the screen. • Three special streams: • user_input • user_output • user_error • read(-Term) • write(+Term) • nl

  30. Example: Input/Output • repeat/0 is a built-in predicate that always succeeds again on backtracking
      % classifying terms
      classify_term :-
          repeat,
          write('What term should I classify? '), nl,
          read(Term),
          process_term(Term),
          Term == end_of_file.

  31. I/O Example (cont)
      process_term(Atomic) :-
          atomic(Atomic), !,
          write(Atomic), write(' is atomic.'), nl.
      process_term(Variable) :-
          var(Variable), !,
          write(Variable), write(' is a variable.'), nl.
      process_term(Term) :-
          compound(Term),
          write(Term), write(' is a compound term.'), nl.

  32. Streams • You can create streams with open/3 open(+FileName, +Mode, -Stream) • Mode is one of read, write, or append. • When finished reading or writing from a Stream, it should be closed with close(+Stream) • There are Stream-versions of other Input/Output predicates • read(+Stream, -Term) • write(+Stream, +Term) • nl(+Stream)
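
As a minimal sketch of these stream predicates working together, the hypothetical predicate save_and_load/3 below writes one term to a file and reads it back. The predicate name, and whatever file name the caller passes in, are assumptions made for illustration. It uses writeq/2 rather than write/2 so that atoms needing quotes are written in a form read/2 can parse, and it adds the full stop that read/2 expects at the end of a term.

```prolog
% save_and_load(+FileName, +Term, -Copy)
% Writes Term to FileName, then reads it back as Copy.
% Hypothetical example for illustration only.
save_and_load(FileName, Term, Copy) :-
    open(FileName, write, Out),
    writeq(Out, Term),          % quoted, so it can be read back
    write(Out, '.'),            % read/2 needs a terminating full stop
    nl(Out),
    close(Out),
    open(FileName, read, In),
    read(In, Copy),
    close(In).
```

For example, save_and_load('demo_terms.tmp', likes(mary, 'New York'), Copy) binds Copy to likes(mary, 'New York'). Variables in Term do not keep their identity across the round trip; they come back as fresh variables.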

  33. Characters and character I/O • Prolog represents characters in two ways: • Single character atoms 'a', 'b', 'c' • Character codes • Numbers that represent the character in some character encoding scheme (like ASCII) • By default, the character encoding scheme is ASCII, but others are possible for handling international character sets. • Input and Output predicates for characters follow a naming convention: • If the predicate deals with single character atoms, its name ends in _char. • If the predicate deals with character codes, its name ends in _code. • Characters are character codes in traditional "Edinburgh" Prolog, but single character atoms were introduced in the ISO Prolog Standard.

  34. Special Syntax I • Prolog has a special syntax for typing character codes: • 0'a is an expression that means the character code that represents the character a in the current character encoding scheme.

  35. Special Syntax II • A sequence of characters enclosed in double quote marks is a shorthand for a list containing those character codes. • "abc" = [97, 98, 99] • It is possible to change this default behavior to one that uses single character atoms instead of character codes, but we won't do that here.
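
A short sketch of the two character representations, assuming an ISO-conforming system in which atom_codes/2 yields character codes and atom_chars/2 yields single character atoms. The three wrapper predicates are hypothetical names used only for illustration. The example deliberately avoids double-quoted text, because how it is read depends on the system's double_quotes setting (for instance, SWI-Prolog 7 and later read it as a string object by default, not as a code list).

```prolog
% code_of_a(-Code): 0'a is read as an integer character code.
code_of_a(Code) :-
    Code = 0'a.

% atom_code_list(+Atom, -Codes): the atom's character codes.
atom_code_list(Atom, Codes) :-
    atom_codes(Atom, Codes).

% atom_char_list(+Atom, -Chars): the atom's single character atoms.
atom_char_list(Atom, Chars) :-
    atom_chars(Atom, Chars).
```

With an ASCII-compatible encoding, code_of_a(Code) binds Code to 97, atom_code_list(abc, Codes) binds Codes to [97, 98, 99], and atom_char_list(abc, Chars) binds Chars to [a, b, c].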

  36. Built-in Predicates: • atom_chars(Atom, CharacterCodes) • Converts an Atom to its corresponding list of character codes, • Or, converts a list of CharacterCodes to an Atom. • (In ISO Prolog this conversion is atom_codes/2; ISO atom_chars/2 works with single character atoms.) • put_code(Code) and put_code(Stream, Code) • Write the character represented by Code • get_code(Code) and get_code(Stream, Code) • Read a character, and return its corresponding Code • Checking the status of a Stream: • at_end_of_file(Stream) • at_end_of_line(Stream)

  37. Tokenizer • A token is a sequence of characters that constitute a single unit • What counts as a token will vary • A token for a programming language may be different from a token for, say, English. • We will start to write a tokenizer for English, and build on it in further classes

  38. Tokenizer for English • Most tokens are consecutive alphabetic characters, separated by white space • Except for some characters that always form a single token on their own: . ' ! ? -
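
The slide gives only the specification, so what follows is a rough sketch of one way to start, not the tokenizer developed in class. It works on a list of character codes and does not read from a stream. All predicate names (tokenize/2, tokenize_atom/2, word_codes/3, white_space/1, single_char_token/1) are hypothetical. As a simplification, any run of characters that is neither white space nor one of the five special characters counts as a word; the sketch does not check that the characters are alphabetic.

```prolog
% tokenize_atom(+Atom, -Tokens): convenience wrapper for testing.
tokenize_atom(Atom, Tokens) :-
    atom_codes(Atom, Codes),
    tokenize(Codes, Tokens).

% tokenize(+Codes, -Tokens): Tokens is a list of atoms.
tokenize([], []).
tokenize([Code|Codes], Tokens) :-
    white_space(Code), !,                 % skip white space
    tokenize(Codes, Tokens).
tokenize([Code|Codes], [Token|Tokens]) :-
    single_char_token(Code), !,           % a token on its own
    atom_codes(Token, [Code]),
    tokenize(Codes, Tokens).
tokenize([Code|Codes], [Token|Tokens]) :-
    word_codes(Codes, WordRest, Rest),    % collect the rest of the word
    atom_codes(Token, [Code|WordRest]),
    tokenize(Rest, Tokens).

% word_codes(+Codes, -WordCodes, -Rest)
% WordCodes is the longest prefix of Codes with no white space
% and no single character token; Rest is what follows it.
word_codes([Code|Codes], [Code|Word], Rest) :-
    \+ white_space(Code),
    \+ single_char_token(Code), !,
    word_codes(Codes, Word, Rest).
word_codes(Rest, [], Rest).

% White space: space, tab, newline, carriage return.
white_space(32).
white_space(9).
white_space(10).
white_space(13).

% Characters that always form a token on their own.
single_char_token(46).    % full stop
single_char_token(39).    % apostrophe
single_char_token(33).    % exclamation mark
single_char_token(63).    % question mark
single_char_token(45).    % hyphen
```

For example, tokenize_atom('is it so?', Tokens) binds Tokens to [is, it, so, ?]. Because the apostrophe is always a token on its own, a contraction such as can't comes out as three tokens (can, the apostrophe, and t).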

  39. Homework • Read the section in the SICStus Prolog manual on Input/Output • This material corresponds to Ch. 5 in Clocksin and Mellish, but the Prolog manual is more up to date and consistent with the ISO Prolog Standard • Improve the tokenizer by adding support for contractions • can't, won't, haven't, etc. • would've, should've • I'll, she'll, he'll • He's, She's (contracted is and contracted has, and possessive) • Don't hand this in, but hold on to it; you'll need it later.
