CS 4700: Foundations of Artificial Intelligence

CS 4700:Foundations of Artificial Intelligence Carla P. Gomes gomes@cs.cornell.edu Module: FOL Inference (Reading R&N: Chapter 8)

Inference • How to perform inference in First Order Logic? How to derive new information? • “Similar” to propositional logic but it requires new “tricks” to deal with: • quantifiers and variables. • Unification a substitution to match atomic sentences involving variables • Skolemization  instantiations of existential variables to remove existential quantifiers

Outline • Reducing first-order inference to propositional inference • Unification • Generalized Modus Ponens • Forward chaining • Backward chaining • Resolution

Universal instantiation (UI) • Every instantiation of a universally quantified sentence is entailed by it: • vαSubst({v/g}, α) • for any variable v and ground term g • E.g., x King(x) Greedy(x)  Evil(x) yields: King(John) Greedy(John) Evil(John) King(Richard) Greedy(Richard) Evil(Richard) King(Father(John)) Greedy(Father(John)) Evil(Father(John)) . . .

Existential instantiation (EI) • For any sentence α, variable v, and constant symbol k that does not appear elsewhere in the knowledge base: • vα • Subst({v/k}, α) • E.g., xCrown(x) OnHead(x,John) yields: • Crown(C1) OnHead(C1,John) • provided C1 is a new constant symbol, called a Skolem constant

Reduction to propositional inference • Suppose the KB contains just the following: x King(x)  Greedy(x)  Evil(x) King(John) Greedy(John) Brother(Richard,John) • Instantiating the universal sentence in all possible ways, we have: King(John)  Greedy(John)  Evil(John) King(Richard)  Greedy(Richard)  Evil(Richard) King(John) Greedy(John) Brother(Richard,John) • The new KB is propositionalized: proposition symbols are King(John), Greedy(John), Evil(John), King(Richard), etc.

Reduction to Propositional Inference • Every FOL KB can be propositionalized so as to preserve entailment • (A ground sentence is entailed by new KB iff entailed by original KB) • Idea: propositionalize KB and query, apply resolution, return result Often quite effective to propositionalize a theory to take advantage of fast propositional solvers!!!

Reduction to Propositional Inference • But, at a more theoretical level…. • Problem: with function symbols, there are infinitely many ground terms, • e.g., Father(Father(Father(John))) • Theorem: Herbrand (1930). If a sentence αis entailed by a FOL KB, it is entailed by a finite subset of the propositionalized KB • Problem: works if α is entailed, loops if α is not entailed • Theorem: Turing (1936), Church (1936) Entailment for FOL is semidecidable (algorithms exist that say yes to every entailed sentence, but no algorithm exists that also says no to every nonentailed sentence.) Theoretical aspects beyond this course

Unification • Dealing with variable: • Unification: a substitution to match atomic sentences, that makes two clauses resolvable: • Unify (P.Q) takes two atomic sentences P and Q and returns a substitution that makes P and Q look the same. • Rules for substitutions: • Can replace a variable by a constant. (v1  C;) • Can replace a variable by a variable. (v2  v3; ) • Can replace a variable by a function expression, as long as the function expression does not contain the variable. (v4  f(…))

Unification • Given: • Knows(John,x)  Hates(John,x) • Knows(John,Jim) • To perform resolution we need: • Unifier  = {x/Jim} • Hates(John,Jim)

Unification • Knows(John,x)  Hates(John,x) • And • Knows(John,Jim) • How to resolve them? First match them • Solution: UNIFY(Knows(John,x),Knows(John,Jim)) = {x/Jim}) • Unifier  = {x/Jim} • Gives • Knows(John,Jim)  Hates(John,Jim) • And • Knows(John,Jim) • Conclude by resolution: Hates(John,Jim)

Unification • General rule: • Knows(John,x)  Hates(John,x) • Facts: • Knows(John , Jim) • Knows(y , Leo) • Knows(y , Mother(y)) • Knows(y , Jane) Matching facts to the general rules

Unification:Standardizing Variables • UNIFY(Knows(John,x),Knows(John,Jim)) = {x/Jim}) • UNIFY(Knows(John,x),Knows(y,Leo)) = {x/Leo,y/John}) • UNIFY(Knows(John,x),Knows(y,Mother(y))) = {y/John,x/Mother(John)}) • UNIFY(Knows(John,x),Knows(x,Jane)) = fail •  but intuitively we know that everyone John knows he hates and that everyone knows Jane so we should be able to infer that John hates Jane. • This is why we require, if possible, that every variable has a separate name. • UNIFY(Knows(John,x),Knows(y,Jane))  works!!! {y/John,x/Jane}) • Standardizing apart eliminates overlap of variables, e.g., Knows(y,Jane)

Unification:Most General Unifier • To unify Knows(John,x) and Knows(y,z), • θ = {y/John, x/z } or θ= {y/John, x/z, z/Mary} or θ= {y/John, x/John, z/John} • Choose the substitutionthat makes the least commitment • (most general) about the bindings • The first unifier is more general than the second. • MGU = { y/John, x/z } • There is a single most general unifier (MGU) that is unique up to renaming of variables. See page 277 and 278 for unification algorithm, O(n2) (size of expressions being checked).

Generalized Modus Ponens (GMP) • p1', p2', … , pn', ( p1 p2 …  pnq) • qθ • p1' is King(John) p1 is King(x) • p2' is Greedy(y) p2 is Greedy(x) • θ is {x/John,y/John} q is Evil(x) • q θ is Evil(John) • All variables assumed universally quantified • Horn Definite clauses (exactly one positive literal) are a suitable normal • form for use with GMP. where pi'θ = piθ for all i

Example knowledge base • The law says that it is a crime for an American to • sell weapons to hostile nations. The country Nono, • an enemy of America, has some missiles, and all of • its missiles were sold to it by Colonel West, who is • American. • Prove that Colonel West is a criminal

facts Example knowledge base contd. • ... it is a crime for an American to sell weapons to hostile nations: American(x)  Weapon(y)  Sells(x,y,z)  Hostile(z)  Criminal(x) • Nono … has some missiles, i.e., x Owns(Nono,x)  Missile(x): Owns(Nono,M1) and Missile(M1) • … all of its missiles were sold to it by Colonel West Missile(u)  Owns(Nono,u)  Sells(West,u,Nono) • Missiles are weapons: Missile(v)  Weapon(v) • An enemy of America counts as "hostile“: Enemy(t,America)  Hostile(t) • West, who is American … American(West) • The country Nono, an enemy of America … Enemy(Nono,America) Added skolem constant M1 All variables assumed universally quantified; quantifiers omitted.

Forward chaining proof All variables assumed universally quantified; quantifiers omitted.

Forward chaining proof {v/M1} {t/Nono} {u/M1} • Missile(u)  Owns(Nono,u)  Sells(West,u,Nono) • Enemy(t,America)  Hostile(t) • Missile(v)  Weapon(v) All variables assumed universally quantified; quantifiers omitted.

Forward chaining proof {x/West,y/M1,z/Nono} {v/M1} {u/Nono} {u/M1} American(x)  Weapon(y)  Sells(x,y,z)  Hostile(z)  Criminal(x) All variables assumed universally quantified; quantifiers omitted.

Properties of forward chaining • Sound and complete for first-order Horn definite clauses • Datalog = first-order definite clauses + no functions • FC terminates for Datalog in finite number of iterations • May not terminate in general if α is not entailed • This is unavoidable: entailment with definite clauses is semidecidable • Improvements: Incremental forward chaining Forward chaining is widely used in deductive databases

Backward chaining example

Backward chaining example American(x)  Weapon(y)  Sells(x,y,z)  Hostile(z)  Criminal(x)

Backward chaining example American(x)  Weapon(y)  Sells(x,y,z)  Hostile(z)  Criminal(x) Missile(x)  Weapon(x)

Backward chaining example American(x)  Weapon(y)  Sells(x,y,z)  Hostile(z)  Criminal(x) Missile(x)  Weapon(x) Missile(M1)

Backward chaining example American(x)  Weapon(y)  Sells(x,y,z)  Hostile(z)  Criminal(x) Owns(Nono,M1) Missile(x)  Weapon(x) Missile(M1)

Backward chaining example Note: once one subgoal succeeds in a conjunction, its substitution is applied to subsequent sub-goals. E.g. y is bound to M1 and and z is bound to Nono. American(x)  Weapon(y)  Sells(x,y,z)  Hostile(z)  Criminal(x) Owns(Nono,M1) Missile(x)  Weapon(x) Enemy(Nono,America) Missile(M1)

Backward chaining example American(x)  Weapon(y)  Sells(x,y,z)  Hostile(z)  Criminal(x) Owns(Nono,M1) Missile(x)  Weapon(x) Enemy(Nono,America) Missile(M1)

Properties of backward chaining • Depth-first recursive proof search: space is linear in size of proof • Incomplete due to infinite loops •  fix by checking current goal against every goal on stack • Inefficient due to repeated subgoals (both success and failure) •  fix using caching of previous results (extra space) • Widely used for logic programming

Logic programming: Prolog • Algorithm = Logic + Control • Basis: backward chaining with Horn clauses + bells & whistles • Widely used in Europe, Japan (basis of 5th Generation project) • Interpreted or Compiled (intermediate language, e.g., Lisp C) • Program = set of clauses = head :- literal1, … literaln. • criminal(X) :- american(X), weapon(Y), sells(X,Y,Z), hostile(Z). • Depth-first, left-to-right backward chaining • Built-in predicates for arithmetic etc., e.g., X is Y*Z+3 • Built-in predicates that have side effects (e.g., input and output • predicates, assert/retract predicates) • Closed-world assumption ("negation as failure") • e.g., given alive(X) :- not dead(X). • alive(joe) succeeds if dead(joe) fails

Prolog (Programming in Logic) • What is Prolog? • Full-featured programming language • Programs consist of logical formulas • Running a program means proving a theorem • Syntax of Prolog • Predicates, objects, and functions: • cat(tuna), append(a,pair(b)) • Variables: X, Y, List (capitalized) • Facts: • university(cornell). • prepend(a,pair(a,X)). • Rules: • animal(X) :- cat(X). • student(X) :- person(X), enrolled(X,Y), university(Y). • implication “:-” with single predicate on left and only non-negated predicates on the right. All variables implicitly “forall” quantified. • Queries: • student(X). • All variables implicitly “exists” quantified.

Resolution

Resolution • p q • p  r •  q  r Propositional logic FOL: “Similar” to propositional logic but it requires new “tricks” to deal with: quantifiers and variables.

Resolution • 1 – Put in clausal form • All variables universally quantified (standardize names apart) • Main trick: “Skolemization” to remove existential quantifiers. • Idea: Invent names for unknown objects known to exist. • Two cases: • Constant - existential variable not in the scope of any other variable  skolem constant • Function – existential variable in the scope of other variables •  Skolem function • 2 – Use unification to match atomic sentences • 3 – Apply resolution rule to the clausal set combine with the negated goal. Attempt to derive empty clause.

Eliminate Existential Quantifiers: Skolemization • Existential variable not in the scope of any other variable Existential variable in the scope of other variable There is one argument for each universally quantified variable whose scope contains the Skolem function.

Resolution: brief summary • The two clauses are assumed to be standardized apart so that they share no variables. • For example, • Rich(x) Unhappy(x) • Rich(Ken) • Unhappy(Ken) • with θ = {x/Ken} • Apply resolution steps to CNF(KB α); refutation complete for FOL

Algorithm: Putting Axioms into Clausal Form • Eliminate biconditionals and implications. • Move the negations inwards. • Eliminate the existential quantifiers. • Rename the variables, if necessary. • Move the universal quantifiers to the left. • Move the disjunctions down to the literals. • Eliminate the conjunctions. • Rename the variables, if necessary. • Eliminate the universal quantifiers.

Conversion to CNF • Everyone who loves all animals is loved by someone: x { [y Animal(y) Loves(x,y)]  [y Loves(y,x)] } • 1. Eliminate biconditionals and implications • x {[y Animal(y) Loves(x,y)] [yLoves(y,x)]} (eliminate main implication) • x {[y (Animal(y) Loves(x,y))] [yLoves(y,x)]} (eliminate other implication) • 2. Move  inwards: x p ≡x p,  x p ≡x p x [y (Animal(y) Loves(x,y))]  [y Loves(y,x)] x [y Animal(y) Loves(x,y)]  [y Loves(y,x)] de Morgan’s law x [y Animal(y) Loves(x,y)]  [y Loves(y,x)] double negation

Conversion to CNF contd. • Standardize variables: each quantifier should use a different one x [y Animal(y) Loves(x,y)]  [z Loves(z,x)] • Skolemize: a more general form of existential instantiation. Each existential variable is replaced by a Skolem function of the enclosing universally quantified variables: x [Animal(F(x)) Loves(x,F(x))] Loves(G(x),x) • Drop universal quantifiers: [Animal(F(x)) Loves(x,F(x))] Loves(G(x),x) • Distribute  over  : [Animal(F(x)) Loves(G(x),x)]  [Loves(x,F(x)) Loves(G(x),x)]

Resolution:Example • Jack owns a dog. • Every dog owner is an animal lover. • No animal lover kills an animal. • Either Jack or Curiosity killed the cat, who is named Tuna. • Did Curiosity kill the cat?

: : (v) (w,v) Original Sentences (Plus Background Knowledge) Jack owns a dog. Skolem constant  x p ≡x p Every dog owner is an animal lover. No animal lover kills an animal. Either Jack or Curiosity killed the cat, who is named Tuna. Theorem: Kills (Curiosity,Tuna)

{w/Jack, v/Tuna} Animal(z)Cat(z) AnimalLover(Jack) Animal(Tuna) {z/Tuna} AnimalLover(Jack) Cat(Tuna) Cat(Tuna) {} AnimalLover(Jack) Dog(y) Owns(x,y) AnimalLover(x) {x/Jack} Dog(y)  Owns(Jack,y) Dog(D) Owns(Jack,D) {y/D} Owns(Jack,D) NIL Proof by Resolution Negation of theorem kills(Curiosity,Tuna) kills(Jack,Tuna)  kills(Curiosity,Tuna) {} kills(Jack,Tuna) AnimalLover(w) Animal(v)  kills(w,v)

Resolution proof: definite clauses

Resolution is Refutation Complete • Resolution with unification applied to clausal form, is refutation complete! • Interesting proof! Based on building an “artificial” domain of interpretation called the Herbrand universe. • See R&N pages 300-303.

Proofs can be Lengthy • A relatively straightforward KB can quickly overwhelm general resolution methods. • Theorem provers are in general based on resolution strategies that can reduce the problem somewhat, but not completely. • As a consequence, many practical Knowledge Representation formalisms in AI use a restricted form and specialized inference. • Logic programming (Prolog) • Datalog – definite clauses, no functions • Production or expert systems (rule based systems) • Frame systems and semantic networks (chapter 10) • Description logics (chapter 10)

Successes in Rule-Based Reasoning • Expert systems • DENDRAL (Buchanan et al., 1969) • MYCIN (Feigenbaum, Buchanan, Shortliffe) • PROSPECTOR (Duda et al., 1979) • R1 (McDermott, 1982)

Successes in Rule-Based Reasoning • DENDRAL (Buchanan et al., 1969) • Infers molecular structure from the information provided by a mass spectrometer • Generate-and-test method

Successes in Rule-Based Reasoning • MYCIN (Feigenbaum, Buchanan, Shortliffe) • Diagnosis of blood infections • 450 rules; performs as well as experts • Incorporated certainty factors

Successes in Rule-Based Reasoning • PROSPECTOR (Duda et al., 1979) • Correctly recommended exploratory drilling at geological site • Rule-based system founded on probability theory • R1 (McDermott, 1982) • Designs configurations of computer components • About 10,000 rules • Uses meta-rules to change context

CS 4700: Foundations of Artificial Intelligence