Automatic Generation of First Order Theorems

Simon Colton, Universities of Edinburgh and York. Funded by EPSRC grant GR/M98012 and the Calculemus Network.

Overview of Talk • Automated Theory Formation • Principles • Implementation in the HR system • Applications • Application to Theorem Generation • HR adds to the TPTP library • HR becomes a MathWeb service • Future Directions

Scientific Theories • Scientific theories about a domain contain: • Concepts, examples, definitions, • hypotheses, explanations, etc. • e.g. chemistry: acids • Concepts: Acid, Base, Salt • Hypothesis: Acid + Base → Salt + Water • Experiments for plausibility/evidence • Reaction pathways for explanation

Theories in Pure Mathematics • Concepts have examples and definitions • Hypotheses are “conjectures” • Explanations are proofs • Proved conjectures become “theorems” • e.g. pure maths: group theory • Concepts: cyclic groups, Abelian groups • Conjecture: cyclic groups are Abelian • Examples provide empirical evidence • Proof for explanation

HR: Theory Formation Cycle • Start with background knowledge • user-supplied axioms + concepts • Invent a new concept (machine learning) • Look for conjectures empirically (data mining) • Prove the conjectures (theorem proving) • Disprove the conjectures (model generation) • Assess all concepts w.r.t. the new concept • Invent a new concept • Build it from the most interesting old concepts

Inventing New Concepts • Ten General Production Rules (PR) • Work in all domains (math + non math) • Build new concept from one (or two) old ones • Example: Abelian groups • Given: [G,a,b,c] : a*b=c • Compose PR: [G,a,b,c] : a*b=c & b*a=c • Exists PR: [G,a,b] : ∃ c (a*b=c & b*a=c) • Forall PR: [G] : ∀ a b (∃ c (a*b=c & b*a=c))
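The Abelian example above can be traced on concrete multiplication tables. The following is a minimal sketch, not HR's implementation: the function names, the dictionary encoding of a group table, and the choice of Z_4 and S_3 as test groups are all assumptions made for illustration.

```python
from itertools import permutations, product


def cyclic_group(n):
    """Table of Z_n under addition mod n: maps (a, b) to a*b."""
    return {(a, b): (a + b) % n for a, b in product(range(n), repeat=2)}


def symmetric_group_s3():
    """Table of S_3, with permutations as tuples and * as composition."""
    perms = list(permutations(range(3)))
    return {(p, q): tuple(p[q[i]] for i in range(3)) for p in perms for q in perms}


def elements(table):
    return sorted({a for a, _ in table})


def base_concept(table):
    """Given: [G,a,b,c] : a*b=c"""
    return {(a, b, c) for (a, b), c in table.items()}


def compose_rule(table):
    """Compose PR: [G,a,b,c] : a*b=c & b*a=c"""
    return {(a, b, c) for (a, b, c) in base_concept(table) if table[(b, a)] == c}


def exists_rule(triples):
    """Exists PR: [G,a,b] : exists c (a*b=c & b*a=c)"""
    return {(a, b) for (a, b, _) in triples}


def forall_rule(table, pairs):
    """Forall PR: [G] : forall a b (exists c (a*b=c & b*a=c))"""
    els = elements(table)
    return all((a, b) in pairs for a in els for b in els)


def is_abelian(table):
    # Chain the three production rules in the order used on the slide.
    return forall_rule(table, exists_rule(compose_rule(table)))


if __name__ == "__main__":
    print("Z_4 Abelian:", is_abelian(cyclic_group(4)))
    print("S_3 Abelian:", is_abelian(symmetric_group_s3()))
```

Each rule only filters or projects the tuples of the previous concept, which is why the same three steps apply unchanged to any finite table.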

Making Conjectures • Theory formation step • Attempt to invent a new concept • Concept has same examples as previous one • HR makes an equivalence conjecture • Concept has no examples • HR makes a non-existence conjecture • HR can also make implication conjectures • Examples of one concept are all examples of another concept
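The three conjecture types above reduce to comparisons between example sets. The sketch below assumes each concept is stored simply as a set of examples; the function name, the tuple encoding of a conjecture, and the integer concepts used in the demonstration are hypothetical and not taken from HR.

```python
def conjectures_for(new_name, new_examples, known):
    """Compare a new concept's examples with those of known concepts.

    known maps a concept name to its set of examples.
    Returns a list of (kind, left, right) tuples.
    """
    if not new_examples:
        # No examples at all: conjecture that nothing satisfies the definition.
        return [("non-existence", new_name, None)]
    found = []
    for name, examples in known.items():
        if examples == new_examples:
            found.append(("equivalence", new_name, name))
        elif new_examples < examples:
            # Every example of the new concept is an example of the old one.
            found.append(("implication", new_name, name))
        elif examples < new_examples:
            found.append(("implication", name, new_name))
    return found


if __name__ == "__main__":
    # Illustrative concepts over the integers 1..20 (made up for this sketch).
    known = {"even": set(range(2, 21, 2))}
    print(conjectures_for("divisible by 4", set(range(4, 21, 4)), known))
    print(conjectures_for("divisible by 2", set(range(2, 21, 2)), known))
    print(conjectures_for("odd and even", set(), known))
```

Conjectures found this way are only empirically supported by the examples at hand, which is why the cycle hands them to a prover and a model generator next.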

Proving Theorems • HR relies on third party theorem provers • Equivalence conjectures: • Sets of implication conjectures • From which prime implicates are extracted • E.g. ∀ a (a*a=a ↔ a=id) • a*a=a → a=id, a=id → a*a=a • HR uses the Otter theorem prover • William McCune • Only uses this for finite algebras
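To make the split of the example equivalence concrete, the two implications can be checked separately by brute force on small groups. This is only an empirical check on cyclic groups Z_n, chosen here as an assumption for illustration; it is not a proof of the kind Otter would produce, and the helper names are hypothetical.

```python
from itertools import product


def cyclic_table(n):
    """Table of Z_n under addition mod n: maps (a, b) to a*b."""
    return {(a, b): (a + b) % n for a, b in product(range(n), repeat=2)}


def find_identity(table, n):
    """Return the element e with e*a = a*e = a for every a."""
    for e in range(n):
        if all(table[(e, a)] == a and table[(a, e)] == a for a in range(n)):
            return e
    raise ValueError("no identity element")


def split_equivalence_holds(n):
    """Check a*a=a -> a=id and a=id -> a*a=a separately in Z_n."""
    table = cyclic_table(n)
    e = find_identity(table, n)
    forward = all(a == e for a in range(n) if table[(a, a)] == a)
    backward = table[(e, e)] == e
    return forward, backward


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, split_equivalence_holds(n))
```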

Disproving Non-Theorems • Any conjectures which Otter can’t prove • HR looks for a counterexample • Using the MACE model generator • Also written by William McCune • Other possibilities: CAS, CSP • Counterexamples are added to the theory • Fewer similar non-theorems are made later
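The role of a counterexample can be illustrated with a deliberately false conjecture about groups, ∀ a (a*a=id). The sketch below is not how MACE works: MACE searches for finite models of the axioms, whereas this code merely scans the cyclic groups Z_1, Z_2, … in order. The conjecture, the restriction to cyclic groups, and the function names are assumptions for illustration.

```python
from itertools import product


def cyclic_table(n):
    """Table of Z_n under addition mod n; the identity element is 0."""
    return {(a, b): (a + b) % n for a, b in product(range(n), repeat=2)}


def violating_element(table, n):
    """Return some a with a*a != id, or None if the conjecture holds here."""
    for a in range(n):
        if table[(a, a)] != 0:
            return a
    return None


def find_counterexample(max_order):
    """Return (n, a) for the smallest Z_n refuting forall a (a*a=id)."""
    for n in range(1, max_order + 1):
        witness = violating_element(cyclic_table(n), n)
        if witness is not None:
            return n, witness
    return None


if __name__ == "__main__":
    print(find_counterexample(6))
```

Once a refuting group is stored among the examples, later empirical checks fail on it immediately, so similar non-theorems are no longer proposed.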

Assessing Interestingness • New concepts from interesting old ones • Concepts measured in terms of: • Intrinsic values, e.g. complexity of definition • Relational values, e.g. novelty of categorisation • Concepts also assessed by conjectures • Quality, quantity of conjectures involving the concept • Conjectures also assessed • Difficulty of proof (proof length from Otter) • Surprisingness (of lhs and rhs definitions)

Bootstrapping ATF Cycle

Applications of ATF • Machine Learning • Learn concept definitions: e.g. sequence extrapolation • Theory for prediction tasks • Theory for puzzle generation • Constraint Satisfaction Problems • Conjectures: induced constraints • Concepts: implied constraints • Mathematical Discovery • Exploration of new domains • Invention of Integer Sequences (NWN)

Application to ATP • Big project: using ATF to improve ATP • Sub-project: • Using ATF to assess ATP programs • Compare first order ATP programs • Using a large set of HR’s conjectures • Facilitate comparison: • Using MathWeb (Zimmer, Franke, …) • Using SystemOnTPTP (Sutcliffe)

First Attempt • Aim: add to the TPTP library • 5882 test problems for first order provers • Otter, SPASS, E, Vampire, etc. • New provers are tested using TPTP • HR produced 46,000 group conjectures in ten minutes • Around 200 of these were worthy of TPTP • All provable by SPASS in 120 seconds • 153 provable only by SPASS and E • 42 provable only by SPASS

Example Theorem • Otter and E could not prove this: • ∀ x y ((∃ z (inv(z)=x & z*y=x) & ∃ u (x*u=y & ∃ v (v*x=u & inv(v)=x))) ↔ (∃ a (inv(a)=x & a*y=x) & ∃ b (b*y=x & inv(b)=y))) [about pairs of identity elements]

Interface of HR into MathWeb • MathWeb project in Saarbrücken • Has access to many first order ATP programs • E, Otter, SPASS, Vampire, Bliksem, … • Idea: HR passes conjectures to MathWeb • MathWeb translates conjectures using tptp2x • MathWeb calls the provers • Interface • Via sockets at the moment • Later by XMLRPC for better standardization

Additional Implementation • By Zimmer, Colton and Franke • Changes to HR • Improvements in quantity of theorems • Ability to write conjectures in TPTP format • Changes to MathWeb • Calling one prover after another (1000s of times in a row) • Quicker interaction with tptp2x • Integration of the E system
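As an illustration of what writing a conjecture in TPTP format involves, the sketch below renders a group theory conjecture in TPTP's first order form (FOF) syntax, where `!` is ∀, `<=>` is ↔ and variables are upper case. The slides do not show HR's actual output, so the symbol names `multiply`, `inverse` and `identity`, the choice of axioms, and the helper functions are assumptions.

```python
def fof(name, role, formula):
    """Render one annotated formula in TPTP FOF syntax."""
    return f"fof({name},{role},(\n    {formula} ))."


# A standard axiomatisation of groups, assumed here for illustration.
GROUP_AXIOMS = [
    fof("left_identity", "axiom", "! [A] : multiply(identity,A) = A"),
    fof("left_inverse", "axiom", "! [A] : multiply(inverse(A),A) = identity"),
    fof(
        "associativity",
        "axiom",
        "! [A,B,C] : multiply(multiply(A,B),C) = multiply(A,multiply(B,C))",
    ),
]


def problem(conjecture_name, formula):
    """Return the text of a problem file: axioms followed by one conjecture."""
    lines = GROUP_AXIOMS + [fof(conjecture_name, "conjecture", formula)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        problem(
            "idempotent_iff_identity",
            "! [A] : ( multiply(A,A) = A <=> A = identity )",
        )
    )
```

A file of this shape is the kind of input a translator such as tptp2x converts into each prover's native syntax.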

Experiments • Possible experiments: • Which prover proves most of HR’s theorems first • Compare the average times • How many timeouts for each prover • Watch this space for results… • Saturday: 9000 group theory theorems proved by SPASS, E & Otter, before a crash! • Preliminary (unsurprising) result • Average times: SPASS < E < Otter
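The three measurements proposed above can be computed from a flat log of prover runs. The sketch below assumes a record layout of (theorem, prover, seconds), with `None` marking a timeout; that layout, the function name, and the placeholder prover names are hypothetical, and the sample records are made up purely to exercise the code. They are not results from the experiments.

```python
from collections import defaultdict


def summarise(results):
    """Summarise runs given as (theorem_id, prover, seconds or None)."""
    times = defaultdict(list)
    timeouts = defaultdict(int)
    best = {}  # theorem_id -> (seconds, prover) for the fastest proof so far
    for theorem, prover, seconds in results:
        if seconds is None:
            timeouts[prover] += 1
            continue
        times[prover].append(seconds)
        if theorem not in best or seconds < best[theorem][0]:
            best[theorem] = (seconds, prover)
    first = defaultdict(int)
    for _, prover in best.values():
        first[prover] += 1
    summary = {}
    for prover in {p for _, p, _ in results}:
        proved = times[prover]
        summary[prover] = {
            "first": first[prover],
            "mean_time": sum(proved) / len(proved) if proved else None,
            "timeouts": timeouts[prover],
        }
    return summary


if __name__ == "__main__":
    # Made-up records with placeholder prover names.
    sample = [
        ("t1", "prover_a", 0.5),
        ("t1", "prover_b", 1.5),
        ("t2", "prover_a", None),
        ("t2", "prover_b", 2.5),
    ]
    for prover, stats in sorted(summarise(sample).items()):
        print(prover, stats)
```

Mean time is taken over successful proofs only, so it should be read alongside the timeout count rather than on its own.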

Future Work: MathWeb #1 • Try HR on more provers in MathWeb • Vampire, Bliksem • Offer HR as a new MathWeb service • User says: “Give me 1,000 theorems which SPASS and E take over 10 secs. to prove” • Interface HR and model generators in MW • Use MACE, etc. to disprove theorems • Interface HR and CSP, CAS in MW • Infinite Group theory with Bundy and Sorge

Future Work: MathWeb #2 • Aim: Beat SPASS…… • SPASS is too good for HR in group theory • 46,000 theorems and SPASS proved them all! • Part two of my Calculemus project: • With Jacques Calmet & Clemens Ballarin in Karlsruhe • HR invents new domains • Adds and constrains new operators for finite algebras • “Grow” difficult theorems from prime implicates

Future Work: HR Project • Colton: Express HR as a ML program • Try domains other than maths • Walsh: Integrate HR • With every maths program ever written • Bundy: • Build an automated mathematician

Web Pages • Mathweb: • www.mathweb.org • HR: • www.dai.ed.ac.uk/~simonco/research/hr • NumbersWithNames program: • www.machine-creativity.com/programs/nwn • Demonstration: Tomorrow @ 2pm? Room 208.
