
Logical reasoning systems



1. Logical reasoning systems

Recall that the two most important characteristics of AI agents are:
• Clear separation between the agent’s knowledge and inference engine.
• High degree of modularity of the agent’s knowledge.

We have already seen how these features are utilized in forward and backward chaining programs (these are referred to as production systems). Next, we discuss three different implementations of AI agents based on the same ideas, namely:
• AI agents utilizing logic programming (typically implemented in a PROLOG-like language).
• AI agents utilizing frame representation languages.
• AI agents utilizing semantic networks.

2. Logic programming and Prolog

Consider a knowledge base containing only Horn formulas, and a backward chaining program where all of the inferences are performed along a given path until a dead end is encountered (i.e. the underlying control strategy is depth-first search); when a dead end is encountered, the program backs up to the most recent step which has an alternative continuation.

Example: Let the KB contain the following statements represented as PROLOG clauses (next to each clause, a LISP-based implementation is given):

    mammal(bozo).                      (remember-assertion ’(Bozo is a mammal))
    mammal(Animal) :- hair(Animal).    (remember-rule ’(identify1 ((? animal) has hair)
                                                                  ((? animal) is a mammal)))
    ?- mammal(deedee).                 (backward-chain ’(Deedee is a mammal))
    ?- mammal(X).                      (backward-chain ’((? X) is a mammal))

Note that PROLOG uses prefix notation, variables start with uppercase letters, and constants are lowercase. Each statement ends with a period.
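
For concreteness, here is a version of this KB that loads in a standard Prolog system. A sketch only: the fact hair(deedee) is an assumption added so that the first query succeeds, since the slide does not say how Deedee’s hair is known.

    % Knowledge base from the slide.
    mammal(bozo).                      % fact: Bozo is a mammal
    mammal(Animal) :- hair(Animal).    % rule: anything that has hair is a mammal
    hair(deedee).                      % assumed supporting fact (not on the slide)

    % Queries:
    % ?- mammal(deedee).   succeeds through the hair/1 rule
    % ?- mammal(X).        enumerates X = bozo ; X = deedee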

3. PROLOG syntax

If a rule has more than one premise, a comma is used to separate the premises, i.e. A & B & C & D => H is written in PROLOG as

    H :- A, B, C, D.

where H is the head of the rule and A, B, C, D form its body. A rule of the form A v B v C v D => H can be represented as follows:

    H :- A; B; C; D.

Note that this is equivalent to the four separate rules H :- A. H :- B. H :- C. H :- D. To represent negated antecedents, PROLOG uses the negation as failure operator, i.e. A & ¬B => H is represented in PROLOG as

    H :- A, not(B).
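
A minimal, self-contained illustration of the three rule forms. The proposition names a through e are invented for the example; most modern Prologs write not(B) as \+ B.

    :- dynamic e/0.       % e has no clauses: calling it simply fails
    a.  b.  c.  d.        % premises
    h1 :- a, b, c, d.     % A & B & C & D => H1   (conjunctive body)
    h2 :- a ; e.          % A v E => H2           (disjunctive body)
    h3 :- a, \+ e.        % A & ¬E => H3          (negation as failure)

    % ?- h1.  ?- h2.  ?- h3.   all succeed; h3 relies on e being unprovable.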

4. The negation as failure operator

Note that not is not a logical negation. This operator works as follows: to satisfy not(B), PROLOG tries to prove B; if it fails to prove B, not(B) is considered proved. The negation as failure operator makes PROLOG a non-monotonic reasoning system. It is based on the so-called Closed World Assumption (CWA), which states that if a statement were true, then an axiom would exist stating it as being true. If such an axiom does not exist, we can assume that the statement is false. Compare this to the OWA in Description Logics.

Note: CWA may introduce inconsistencies in the KB. For example, let not(B) be assumed, because B is not declared true. If later B is entered as an axiom, the KB will contain both B and not(B).
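
The non-monotonic behaviour is easy to reproduce; a sketch in SWI-Prolog syntax, where not/1 is written \+:

    :- dynamic b/0.       % b starts out with no clauses
    d :- \+ b.            % d holds whenever b cannot be proved (CWA)

    % ?- d.               succeeds: b is unprovable, so not(b) is assumed
    % ?- assert(b), d.    fails: once b becomes an axiom, the earlier
    %                     conclusion d is withdrawn (non-monotonicity)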

5. Managing assumptions and retractions

Consider the following PROLOG program:

    D :- not(A), B.     (rules)
    D :- not(C).
    F :- E, H.
    N :- D, F.
    E.                  (premises)
    H.
    ?- N.               (query)

To prove N, Prolog searches for a rule whose head is N. Here N :- D, F. is such a rule. For this rule to fire, D and F must be proven in turn:
• To prove D, consider rule D :- not(A), B. This fails, because B cannot be proved.
• To prove D, consider rule D :- not(C). This succeeds, because C cannot be proved.
• F can easily be proved, because E and H are declared as premises.
Therefore, N is proved (based on the fact that E and H are true, and assuming that C is false).
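
The same program runs verbatim in Prolog once lower-cased. The dynamic declarations are needed in SWI-Prolog so that calling the clause-less a, b and c fails instead of raising an undefined-procedure error:

    :- dynamic a/0, b/0, c/0.   % no clauses: goals a, b, c simply fail
    d :- \+ a, b.               % fails: b cannot be proved
    d :- \+ c.                  % succeeds: c cannot be proved
    f :- e, h.
    n :- d, f.
    e.
    h.

    % ?- n.   succeeds via the second clause for d, plus e and h.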

6. Example (cont.)

Assume now that the system learns C. What happens to D and N, respectively? Obviously, D and N must be retracted. The BIG question is how to do this. Note that there are other reasons for which we may want to retract a sentence, such as:
• To make the system “forget”.
• To update the current model of the world.

Note the difference between retracting a sentence and adding the negation of that sentence. In our example, if not(C) is retracted, the system will not be able to infer either C or ¬C. Whereas, if ¬C ∈ KB and we add C, then the system can infer both C and ¬C.

PROLOG cannot do such dependency-directed retractions. To be able to do them, the system must keep track of dependencies between statements, so that it knows which additional statements must be retracted when we retract not(C) (D and N in our example). Explicitly recording dependencies between statements is called truth maintenance.
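
In pure backward chaining nothing actually needs to be retracted, because every query re-derives its conclusions from scratch; the problem appears once conclusions are cached. A sketch, continuing the program above (proved/1 is an invented caching predicate):

    % ?- n, assert(proved(n)).   prove n once and cache the conclusion
    % ?- assert(c).              the system now learns c
    % ?- n.                      correctly fails (re-derived from scratch)
    % ?- proved(n).              still succeeds: the stale cache survives,
    %                            because nothing records that n depended on not(c).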

7. Truth Maintenance Systems

A Truth Maintenance System (TMS) is the part of the KBS responsible for:
• Enforcing logical relations among beliefs.
• Generating explanations for conclusions.
• Finding solutions to search problems (dependency-directed backtracking vs chronological backtracking).
• Supporting default reasoning.
• Identifying causes for failure and recovering from inconsistencies.

The TMS / IE relationship is the following: within the knowledge-based system, the inference engine passes justifications and assumptions to the TMS, and the TMS returns beliefs and contradictions to the inference engine.
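
A minimal sketch of the dependency-recording idea in Prolog. The predicates are invented for illustration; a real JTMS also maintains IN/OUT labels and dedicated contradiction nodes.

    :- dynamic believed/1, justification/2.

    % Record a belief together with the antecedents that justify it.
    add_belief(P, Antecedents) :-
        assertz(believed(P)),
        assertz(justification(P, Antecedents)).

    % Retract a belief and, recursively, every belief that depends on it.
    retract_belief(P) :-
        retractall(believed(P)),
        forall(( justification(Q, Antecedents),
                 member(P, Antecedents),
                 believed(Q) ),
               retract_belief(Q)).

With the justifications d from {not(c)} and n from {d, f} recorded, retract_belief(not(c)) removes d and then n: exactly the propagation slide 6 called for.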

8. Enforcing logical relations (constraints) among beliefs

Every AI problem which is not completely specified requires search. Search utilizes assumptions, which may eventually change. Changing assumptions requires updating the consequences of beliefs. Re-deriving those consequences is most often neither desirable nor efficient; therefore, we need a mechanism to maintain and update relations among beliefs.

Example:

    If (cs-501) and (math-218) then (cs-570).
    If (cs-570) and (CIT-core-completed) then (AI-related-capstone).
    If (AI-related-capstone) then (AI-experience).

The following are relations among beliefs following from these statements:

    (AI-experience) if (AI-related-capstone).
    (AI-related-capstone) if (cs-570), (CIT-core-completed).
    etc.

Beliefs can be viewed as propositional variables, and a TMS can be viewed as a mechanism for processing large collections of logical relations on propositional variables.
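
These relations translate directly into propositional Prolog (a sketch, with the names flattened into atoms):

    % Rules from the slide.
    cs570         :- cs501, math218.
    ai_capstone   :- cs570, cit_core_completed.
    ai_experience :- ai_capstone.

    % Once the leaf facts are asserted as beliefs...
    cs501.
    math218.
    cit_core_completed.

    % ...all consequences follow:
    % ?- ai_experience.   succeeds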

9. Generating explanations for conclusions

The ability to explain its reasoning is one of the most important features of a KBS. There are different ways to do that:
• To record the reasoning process, and keep track of the data upon which the conclusion depends.
• To keep track of the sources of each data item, for example “provided by the user”, “inferred by rule xx”, etc.
• To keep a special note as part of the rule that contains an explanation.

Example: Given the following rules

    H & B => Y,  C => B,  ¬C => ¬B,  A => X & Y,  C => D,  ¬A => C

and given that H is true, prove Y.

Explanation 1: Y is a conclusion of rule H & B => Y and premises B and H (how B was proved is not relevant for the explanation of Y).

Explanation 2: Y because A is unknown (meaning that by the negation as failure rule we can assume ¬A) and H is true. Or: Y because of H while ¬A.
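
The first option, recording the reasoning process, can be sketched as a Prolog meta-interpreter that returns the proof tree it builds; negation-as-failure steps are recorded as assumptions, which yields exactly Explanation 2. The predicate prove/2 and the KB encoding below are illustrative, not from the slide.

    % prove(Goal, Proof): Proof is the derivation used to establish Goal.
    prove(true, true) :- !.
    prove((A, B), (PA, PB)) :- !, prove(A, PA), prove(B, PB).
    prove(\+ A, assumed(\+ A)) :- !, \+ prove(A, _).   % negation as failure
    prove(H, H :- PB) :- clause(H, B), prove(B, PB).

    % The rules of the example that matter for proving y:
    :- dynamic a/0.    % a is unknown
    h.
    c :- \+ a.         % ¬A => C
    b :- c.            % C => B
    y :- h, b.         % H & B => Y

    % ?- prove(y, P).   binds P to a proof tree showing that y follows
    %                   from h and the assumption that a is unprovable.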

10. That is, to provide explanations, the TMS uses cached inferences. The fundamental assumption behind this idea is that caching inferences once is more beneficial than re-running the inference rules that generated them each time they are needed.

Example:
Q: Shall I have an AI experience after completing the CIT program?
A: Yes, because of the AI-related capstone.
Q: What do I need to enroll in an AI-related capstone?
A: CS 570 and a completed core.

Note: There are different types of TMSs that provide different ways of explaining conclusions (JTMS vs ATMS). In this example, explaining conclusions in terms of their immediate predecessors works better.
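
Tabling, available in systems such as SWI-Prolog and XSB, is a built-in form of this caching: a tabled goal is computed once and later calls are answered from the stored table. A sketch only; note that plain tabling caches but does not retract, which is the TMS’s job.

    :- table ai_experience/0, ai_capstone/0.   % cache these conclusions

    ai_capstone   :- cs570, cit_core_completed.
    ai_experience :- ai_capstone.
    cs570.
    cit_core_completed.

    % The first ?- ai_experience. runs the rules and fills the table;
    % subsequent queries are answered from the cache without re-inference.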

11. Reasoning with incomplete knowledge: default reasoning (AIMA, page 458)

Consider the conditions under which Y is true in the above example: for Y to be true, H must be true and it must be reasonable to assume ¬A. This can be represented by means of the following rule:

    H : ¬A
    ------
      Y

Rules of this type are called default rules. The general form of a default rule is:

    A : B1, B2, … , Bn
    ------------------
            C

where A, B1, B2, … , Bn, C are FOL sentences; A is called the prerequisite of the default rule; B1, B2, … , Bn are called the justifications of the default rule; C is called the consequent of the default rule.

12. Example

Let Tweety be a bird, and because we know that birds fly, we want to be able to conclude that Tweety flies. The KB contains:

    Bird(Tweety)

    Bird(X) : Flies(X)
    ------------------
         Flies(X)

This rule says “If X is a bird, and it is consistent to believe that X flies, then X flies”. Given only this information about Tweety, we can infer Flies(Tweety). Assume that we learn Penguin(Tweety). Because penguins do not fly, we must have the following rule in the KB to handle penguins:

    Penguin(X) : ¬Flies(X)
    ----------------------
          ¬Flies(X)

We can now infer Flies(Tweety) (according to the first rule) and ¬Flies(Tweety) (according to the second rule). To resolve this contradiction, we may want to always prefer the “more specific” rule, which in this case will first derive ¬Flies(Tweety), making the first rule inapplicable.

13. Dependency networks

The following dependency network presents exceptions in a more descriptive graphical form. Rather than enumerating exceptions, we may “group” them under the property “abnormal”.

(Diagram: Bird(X) supports Flies(X) unless Abnormal(X) holds; Penguin(X), Ostrich(X), Dead(X) and Stuffed(X) each imply Abnormal(X).)
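
This network has a standard negation-as-failure encoding, a common Prolog idiom rather than anything stated on the slide; the dynamic declarations let the unstated predicates fail cleanly:

    :- dynamic penguin/1, ostrich/1, dead/1, stuffed/1.

    flies(X)    :- bird(X), \+ abnormal(X).   % birds fly unless abnormal
    abnormal(X) :- penguin(X).                % grouped exceptions
    abnormal(X) :- ostrich(X).
    abnormal(X) :- dead(X).
    abnormal(X) :- stuffed(X).
    bird(X)     :- penguin(X).                % penguins are still birds
    bird(tweety).

    % ?- flies(tweety).                            succeeds by default
    % ?- assert(penguin(tweety)), flies(tweety).   fails: the more specific
    %                                              penguin fact defeats the default.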

14. Semi-normal default rules allow us to capture exceptions

Defaults where the justification and the conclusion of the rule are the same are called normal defaults; otherwise, the default is called semi-normal. Although semi-normal defaults allow “exceptions” to be explicitly enumerated (as part of the rule), they cannot guarantee the correctness of derived conclusions.

Example: Consider the following set of sentences:

    Bird(Tom)
    Penguin(Tom) v Ostrich(Tom)

    Bird(X) : Flies(X) & ¬Penguin(X)
    --------------------------------
               Flies(X)

    Bird(X) : Flies(X) & ¬Ostrich(X)
    --------------------------------
               Flies(X)

We can infer Flies(Tom), which is semantically incorrect, because neither penguins nor ostriches fly. (The inference goes through because neither Penguin(Tom) nor Ostrich(Tom) follows from the disjunction alone, so the justification of each rule remains consistent.)

15. Non-monotonic TMSs and default reasoning

The problem with the example above is that in default reasoning systems, once a conclusion is inferred, it is no longer related to its justification. Non-monotonic TMSs have a mechanism for retracting sentences from the KB as a result of the retraction of another sentence. They offer an implementation framework for default reasoning by identifying causes for inconsistencies and providing a mechanism for recovering from them.

Example: Consider the following set of rules:

    : A        : B        : C
    ---        ---        ---
     A          B          C

    A & B --> ⊥
    B & C --> ⊥

16. Example (cont.)

Assume that A, B and C are all declared as enabled assumptions, so all three are labeled IN in the corresponding network. The justification J1 (recording A & B --> ⊥) makes a contradiction IN. To retract it, the IE must retract one of the two antecedents of J1.

(Diagram: nodes A, B, C all IN; contradiction node justified by J1, IN.)

17. Example (cont.)

Assume the IE decides to retract A. The resulting network is the following: A becomes OUT, so the contradiction justified by J1 becomes OUT as well, but a new justification, J2 (recording B & C --> ⊥), is recorded and another contradiction becomes IN. To get rid of the new contradiction, assume that the IE decides to retract B.

(Diagram: A OUT; B, C IN; J1’s contradiction OUT; J2’s contradiction IN.)

18. Example (cont.)

The resulting network, with both A and B OUT, does not satisfy the requirement that in a default theory an assumption must be IN unless it causes a contradiction: once B is OUT, A no longer participates in any contradiction. To correct this, A must be enabled again. The resulting network has A and C IN, B OUT, and both contradictions OUT.

19. Alternative knowledge representations: semantic networks

A semantic net is a labeled directed graph, where each node represents an object (a proposition), and each link represents a relationship between two objects.

Example: (Diagram: a family network over the individuals Ann, Carol, David, Bill, Tom, Susan and John, connected by sister-of, husband-of, wife-of, mother-of and father-of links.)

Note that only binary relationships can be represented in this model.
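
Each labeled link of a semantic net is just a binary fact, so Prolog is a natural host. The exact edges of the slide’s picture are not recoverable from the transcript, so the tuples below are assumed purely for illustration:

    % One fact per labeled edge (assumed edges).
    husband_of(john, susan).
    wife_of(susan, john).
    father_of(john, tom).
    mother_of(susan, tom).
    sister_of(ann, carol).

    % Traversing links is ordinary inference:
    parent_of(X, Y) :- father_of(X, Y) ; mother_of(X, Y).

    % ?- parent_of(P, tom).   P = john ; P = susan.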

20. Semantic nets represent propositional information. Relations between propositions are of primary interest because they provide the basic structure for organizing knowledge. Some important relations are:
• “IS-A” (is an instance of). Refers to a member of a class, where a class is a group of objects with one or more common attributes (properties). For example, “Tom IS-A bird”.
• “A-KIND-OF”. Relates one class to another, for example “Birds are A-KIND-OF animals”.
• “HAS-A”. Relates attributes to objects, for example “Mary HAS-A cat”.
• “CAUSE”. Expresses a causal relationship, for example “Fire CAUSES smoke”.

Note that semantic nets can be easily converted into a set of FOL formulas, and vice versa. Semantic nets, however, have two important advantages:
• A very simple execution model.
• A very readable representation, which makes it easy to visualize inference steps.

21. Inference in semantic networks

The inference procedure in semantic nets is called inheritance, and it allows one node’s characteristics to be duplicated by a descendant node.

Example: Consider a class “aircraft”, and assume that “balloons”, “propeller-driven objects” and “jets” are subclasses of it, i.e. “Balloons are A-KIND-OF aircraft”, “Propeller-driven objects are A-KIND-OF aircraft”, etc. Assume that the following attributes are assigned to aircraft: “Aircraft IS-A flying object”, “Aircraft HAS-A wings”, “Aircraft HAS-A engines”. All properties assigned to the superclass, “aircraft”, will be inherited by its subclasses, unless there is an “exception” link capturing a non-monotonic inference relation.
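
A sketch of inheritance with exception links in Prolog; the predicate names are invented, and the exception/2 facts play the role of the slide’s “exception” links:

    % Class hierarchy and attributes.
    a_kind_of(balloon, aircraft).
    a_kind_of(jet, aircraft).
    has_a(aircraft, wings).
    has_a(aircraft, engines).
    exception(balloon, wings).     % balloons override the defaults
    exception(balloon, engines).

    % A class has an attribute if it is stated directly, or inherited
    % from a superclass and not blocked by an exception link.
    inherits(Class, Attr) :- has_a(Class, Attr).
    inherits(Class, Attr) :-
        a_kind_of(Class, Super),
        inherits(Super, Attr),
        \+ exception(Class, Attr).

    % ?- inherits(jet, wings).       succeeds (inherited from aircraft)
    % ?- inherits(balloon, wings).   fails (blocked by the exception link)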

22. Multiple inheritance may result in conflicting inferences

In some semantic networks, one class can inherit properties of more than one superclass. The “Nixon diamond” example: it is widely accepted that Quakers tend to be pacifists, and Republicans tend not to be. Nixon is known to be both a Quaker and a Republican.

(Diagram: Nixon IS-A Quaker and IS-A Republican; Quakers IS-A Pacifists; Republicans NOT IS-A Pacifists.)

The resulting conflict can be resolved only if additional information stating a preference for one of the conflicting inferences is provided.
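
In a negation-as-failure encoding, the conflict shows up as two independently derivable, mutually contradictory conclusions. An illustrative sketch; ab_quaker/1 and ab_republican/1 are invented abnormality predicates:

    :- dynamic ab_quaker/1, ab_republican/1.

    quaker(nixon).
    republican(nixon).
    pacifist(X)     :- quaker(X),     \+ ab_quaker(X).
    non_pacifist(X) :- republican(X), \+ ab_republican(X).

    % ?- pacifist(nixon), non_pacifist(nixon).   both succeed: a conflict.
    % Stating a preference, e.g. ?- assert(ab_quaker(nixon)).,
    % blocks one of the defaults and resolves it.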

23. Object-attribute-value triplets

One problem with semantic nets is that there is no standard definition of link names. To avoid this ambiguity, we can restrict this formalism to a very simple kind of semantic network with only two types of links, “HAS-A” and “IS-A”. Such a formalism is called Object-Attribute-Value (OAV) triplets, and it provides an underlying knowledge representation framework for the Semantic Web, called the Resource Description Framework (RDF).

Example: Consider the object “airplane”. Some of its attributes are:
• number of engines;
• type of engines;
• type of wing design.

Possible values of these attributes are:
• number of engines: 2, 3, 4;
• type of engines: jet, propeller-driven;
• type of wing design: conventional, swept-back.

24. Object-attribute-value triples (example cont’d)

    Object      Attribute          Value
    Airplane    NumberOfEngines    2
    Airplane    NumberOfEngines    3
    Airplane    NumberOfEngines    4
    Airplane    TypeOfEngines      Jet
    Airplane    TypeOfEngines      Propeller
    Airplane    TypeOfWings        Conventional
    Airplane    TypeOfWings        SweptBack

Or, as predicates:

    NumberOfEngines(Airplane, 2), …
    TypeOfEngines(Airplane, Jet), …
    TypeOfWings(Airplane, Conventional), …
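
Equivalently, the whole table can be held in one ternary predicate, which is essentially the RDF subject-predicate-object scheme (a sketch):

    % oav(Object, Attribute, Value): one fact per triplet.
    oav(airplane, number_of_engines, 2).
    oav(airplane, number_of_engines, 3).
    oav(airplane, number_of_engines, 4).
    oav(airplane, type_of_engines,   jet).
    oav(airplane, type_of_engines,   propeller).
    oav(airplane, type_of_wings,     conventional).
    oav(airplane, type_of_wings,     swept_back).

    % ?- oav(airplane, type_of_engines, V).   V = jet ; V = propeller.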

25. Problems with semantic nets and OAV triplets

• There is no standard definition of link and node names. This makes it difficult to understand the network and to tell whether or not it is designed in a consistent manner.
• Inheritance is a combinatorially explosive search, especially if the response to a query is negative. Moreover, it is the only inference mechanism built into semantic nets, which may be insufficient for some applications.
• Initially, semantic nets were proposed as a model of human associative memory (Quillian, 1968). But are they an adequate model? It is believed that the human brain contains about 10^10 neurons and 10^15 links. Consider how long it takes a human to answer “NO” to the query “Are there trees on the moon?” Obviously, humans process information in a very different way than suggested by the proponents of semantic networks.
• Semantic nets are logically and heuristically very weak. Statements such as “Some books are more interesting than others”, “No book is available on this subject”, or “If a fiction book is requested, do not consider books on history, health and mathematics” cannot be represented in a semantic network.

26. Frames (Minsky, 1975)

Semantic nets represent shallow knowledge, because all of the information must be represented in terms of nodes (propositions) and links (binary relations). What if the objects in the domain are, in turn, complex structures? For example, consider the object “animal”. We may want to incorporate, as part of the object’s description, all of the important properties of this object, with attributes and relations to other objects or categories.

The underlying assumption of frame theory is that when one encounters a new situation (or a substantial change in one’s view of the situation occurs), one selects from memory the frame representing the given concept and changes it to reflect the new reality. Case-based reasoning systems exclusively utilize frames as a KR formalism.
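
One simple way to realize frames is as terms carrying a list of slot-value pairs, with slot lookup falling back on isa inheritance. A sketch with invented frame and slot names:

    % frame(Name, Slots): each slot is a Slot-Value pair.
    frame(animal, [covering-skin, can_move-true]).
    frame(bird,   [isa-animal, covering-feathers, flies-true]).

    % A slot value is found locally, or inherited along the isa link.
    slot_value(Frame, Slot, Value) :-
        frame(Frame, Slots),
        member(Slot-Value, Slots).
    slot_value(Frame, Slot, Value) :-
        frame(Frame, Slots),
        member(isa-Super, Slots),
        slot_value(Super, Slot, Value).

    % ?- slot_value(bird, can_move, V).   V = true, inherited from animal.
    % ?- slot_value(bird, covering, V).   V = feathers first; on backtracking
    %                                     also V = skin from the superclass.
    %                                     A fuller version would let the local
    %                                     value shadow the inherited one.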

27. Problems with frames

• Frames cannot represent exceptions. By definition, frames represent typical objects. But consider Tom, an ostrich, which does not fly. Is he a bird?
• Frames lack a well-defined semantics. Consider Sam, a bat. Is he a typical mammal?
• It is difficult to incorporate heuristic information into a frame. For example, consider a medical ES which uses frames to represent a typical patient or a typical disease. It is difficult to represent how specific symptoms relate to each other and how they must be used in the diagnostic process.

In general, frames are good for representing class hierarchies, to model domains where objects are well defined and the classification is clear-cut.
