Empirical Explorations with The Logical Theory Machine: A Case Study in Heuristics

Empirical Explorations with The Logical Theory Machine:A Case Study in Heuristics by Allen Newell, J. C. Shaw, & H. A. Simon

The Logic Theory Machine (LT) : a system for finding proofs of theorems in elementary symbolic logic : devised to understand the complex problem-solving process : the first heuristic program (1956)

The Logic Theory Machine in Operation - LT’s Task is to prove theorems in the sentential calculus ( in Principia Mathematica ) - The proof is the sequence of expressions that leads from the axioms and known theorems to the desired expression e.g.) - Why are such problems difficult ? - What features of LT account for its successes and failures ?

Problems, Algorithms, and Heuristics • problem : if he is given some process for generating the elements of a set of possible solutions in some order, and a test for verifying whether a given element of this set is in fact a solution to his problem. • problem solving : if the costs are not too large in relation to the time and computing power available for solution

Problems, Algorithms, and Heuristics • algorithm : the process that guarantee the solution of a given problem. ex) opening a combination safe • heuristic : the process that may solve a given problem but offers no guarantees of doing so. ex) playing chess

E : all sequences of logic expressions P :proof sequence Tx : sequences ending in X Proofs of X • The Problem of Proving Theorems in Logic - the set of all possible sequences of logic expression : E ( given implicitly by rules for generating) - the set of sequences that are proofs : P - the expression to be proved : X - the set of sequences that have X as their final expression : Tx

The difficulty of proving theorems depends on 1. The scarcity of elements in the intersection of P and Tx , relative to the number of elements in E  the cost and speed of the available generators, the cost and speed of making tests (whether P or Tx) 2. Whether generators can be found that guarantee that any element they produce automatically satisfies some of the conditions. 3. What heuristics can be found to guide the selection.

The British Museum Algorithm : reveal the basic nature of theorem proving • the algorithm constructs all possible proofs in systematic manner, checking each time (1) to eliminate duplicates, and (2) to see if the final theorem in the proof coincides with the expression to be proved • the set of n-step proofs : obtained from the set of (n-1)-step proofs by making all the permissible substitutions and replacements, detachments.

Figure 2 . Number of proofs generated by first few steps of British Museum algorithm : only use the simple substitution and the replacements : If specializations are included 11-steps, 1000 theorems : among the first 246 theorems, only 5 theorem ( in chap. 2 of principia), 1000 theorems, one more • how many theorem to prove all theorems (60-odd) : a hundred million • how many times : hundreds of thousands of years

The Logic Theory Machine • Methods : the major type of heuristic that LT uses The Substitution Method : by finding an axiom or proved theorem that can be transformed, by a series of substitutions for variables and replacements of connectives The Detachment Method : if problem B , search for an axiom or theorem “ A implies B ” , A is set up as a new subproblem

Methods (continued) The Chaining Method : use the tansitivity of the relation of implication to create a new subproblem - forward chaining method : “ a implies c ”, search an axiom or theorem “ a implies b ”, set up “ b implies c ” as new subproblem - backward chaining method : search “ b implies c ”, set up “ a implies b ” as subproblem

Methods (continued) • no guarantee : a target axiom or a theorem can be found : the generated subproblem is part of the desired proof sequence, or part of any proof sequence : the beginning of the sequence can be completed with axioms or proved theorems : the combination of four methods comprise a sufficient set of methods to prove all theorems ex) “ p or not-not-not-p ” (2.13)

Methods (continued) • Why do the methods transform LT into an effective problem-solver ? 1. unitizing effect : the methods organize the sequences of individual processing steps into larger units 2. backward approach : only one theorem to be proved, a number of known true theorems

The Executive Routine 1. For a theorem list, the substitution method is tried 2. The detachment method is tried, following the substitution, if fail, the subproblem is added to the subproblem list 3. Forward chaining, backward chaining. 4. For the subproblem list, repeat.

The Executive Routine (continued) This process continues until (1) a proof is found, (2) the time allotted for finding a proof is used up, (3) there is no more available memory space, (4) no untried problems remain on the subproblem list Examples

The Executive Routine (continued) not (p or q) implies not-p (A implies B) implies (not-B implies not-A) (theorem 2.16) subprblems [ p implies (p or q)] by substitution method A implies (A or B)

1 3 2 7 6 4 5 11 10 8 9 12 • The Matching Process • What allows the efficient search ? • the required times : BM algorithm vs LT ( 5~100:1) ex) “ p implies (q implies p) ” (158 sec : 10 sec) p implies (q or p) (axiom 1.3) p implies (not-q or p) p imlies (q implies p)

The Matching Process (continued) • Matching in the Substitution Methods : componentwise matching- a feedback of the results of a substitution or replacement : Of the 52 theorems of Principia, 38 proofs substitution alone - 17 proof • Matching in Detachment and Chaining : matching the subexpressions of the problem or theorems.

Similarity Tests and Descriptions • Screening Process : reject any theorem for matching that has low likelihood of success • The Similarity Test : 1) the maximum numbers of levels , 2) the number of distinct variables, 3) the number of variable places • table 1.

Similarity Tests and Descriptions (cont.) • similarity test의 문제점 : type II error ex) p implies (p or p)  p implies (q or p) : type I error : the cost of similarity test • Effort in LT • measuring efforts : the total number of primitives executed • computing effort and performance : substitution vs detach. vs chain. ( 1: 2 : 3)

Similarity Tests and Descriptions (cont.) - precomputed, recomputed description - the effort required is proportional to the number of theorems considered. - the number of theorems considered is an effort measure for evaluating a heuristic • Evaluation of the Similarity Test : the full similarity vs the modified similarity vs no similarity : when precomputed, full : no (10 to 60 perc. 3534/5206) when recomputed, (26,739/22,914) : modified similarity test (18,035/22,914)

Subproblems figure 6. Distribution of LT’s proofs by effort • the limitation of LT : a “plateau” require quite different heuristics • The Subproblem Tree

The Subproblem Tree • theorem 2.17 : the proof that cost LT 89,000 primitives. (not-q implies not-p) implies (p implies q) (theorem 2.17) 1. A implies not-not-A 2. p implies not-not-p 3. (A implies B) implies [(B implies C) implies (A implies C)] (theorem 2.06) 4. (p implies not-not-p) implies [(not-not-p implies q) implies (p implies q) 5. (not-not-p implies q) implies (p implies q) (det. 4 from 3) 6. (not-A implies B) implies (not-B implies A) 7. (not-q implies not-p) implies (not-not-p implies q) 8. (not-q implies not-p) implies (p implies q) (chain 7 and 5)

The Subproblem Tree step 3 step 4 : the difference in total effort can be attributed largely to the difference in number of subproblems generated : an algorithmic procedure to govern its generation of subproblem, an algorithm to determine the order to try

Modification of the Logic Theory Machine 1. The unit cost of processing subproblems can be reduced. 2. LT can be modified to screen subproblems before they are put on the subproblem list. 3. Another way is to reduce selectively the number of subproblems generated, by limiting the lists for theorems available for generating subproblems. Cf) Fig. 7. A list fo 20 theorems, fig. 8. 5 theorems ex) theorem (2.48) two condition: all theorems vs axiom + one theorem the longer list : in two steps, 51,450 primitives of effort the shorter list : in tree step, 18,558 primitives

Conclusion - Heuristics give the program power to solve problems in reasonable computing time - The limitations of the present program of LT - Some of directions that improvement would have to take to extend it powers to problems at new levels of difficulty

The Definitions of the Sentential Calculus - variablesp, q, r, … A, B, C … - connectives not, or, implies - expression “not-p”, “p or q”, “p implies q” - axiomthe universally true expressions - theorem the true expressions that are derived from axioms by means of various rules of inference - rules of inference formalization of logical operation ( e.g. deduction)

the system of axioms ( p or p ) implies p (1.2) p implies ( q or p ) (1.3) ( p or q ) implies ( q or p ) (1.4) [ p or ( q or r ) ] implies [ q or ( p or r ) ] (1.5) ( p implies q ) implies [ ( r or p ) implies ( r or q ) ] (1.6) • three rules of inference 1. Substitution : “ A” for “ p ” , “ p or q ” for “ p ” 2. Replacement : “ p implies q ” with “ not-p or q ” 3. Detachment : if “ A ” and “ A implies B ”, then “ B”

( p implies not-p) implies not-p(theorem 2.01) 1. ( A or A ) implies A (axiom 1.2) 2. ( not-A or not-A ) implies not-A 3. ( A implies not-A ) implies not-A 4. ( p implies not-p ) implies not-p : LT works for about 10 seconds

not ( p or q ) implies not-p (theorem 2.45) 1. A implies (A or B ) (theorem 2.2) 2. p implies ( p or q ) 3. (A implies B ) implies (not-B implies not-A) (theorem 2.16) 4. [ p implies ( p or q )] implies [ not (p or q) implies not-p) 5. not ( p or q ) implies not-p  works for about 12 minutes and success • [ p or ( q or r ) ] implies [ ( p or q ) or r ] (theorem 2.31)  works for about 23 minutes and fails to prove

Empirical Explorations with The Logical Theory Machine: A Case Study in Heuristics