A Theory of Theory Formation Simon Colton Universities of Edinburgh and York.



Overview
What is a theory?
Four components of the theory of automated theory formation (ATF)
 – Techniques inside the components
Cycles of theory formation
 – Case studies
Applications (briefly)
 – Of both the theories and the process

What is a Theory?
Theories are (minimally) a collection of:
 – Objects of interest
 – Concepts about the objects
 – Hypotheses relating the concepts
 – Explanations which prove the hypotheses
Finite Group Theory:
 – All cyclic groups are Abelian
Inorganic Chemistry:
 – Acid + Base → Salt + Water

So, We Require:
Object Generator
Concept Generator
Hypothesis Generator
Explanation Generator

In Principle, These Could Be:
Object Generator: Database, CAS, CSP, Model Generator
Concept Generator: Machine Learning Program
Hypothesis Generator: Data Mining Program
Explanation Generator: ATP System, Pathway Finder, Visualisation

In Practice, Current Implementation:
Database, Model Generator (CAS, CSP nearly)
The HR Program
ATP Systems

Object Generation and Explanation Generation
Object generation:
 – Machine learning
 – Reading a file, database
In mathematics:
 – CSP (e.g., FINDER, Solver), CAS (e.g., Maple)
 – Davis-Putnam method (e.g., MACE)
 – Resolution theorem proving (e.g., Otter)
HR must be able to communicate:
 – Read models and concepts from MACE’s output
 – Read proofs and statistics from Otter’s output

Concept Generation
Build a new concept from old ones
 – 10 general production rules (demonstrated later)
 – Produce both a definition and examples
Throw away concepts using definitions
 – Tidy definitions up
 – Repetitions, function conflict, negation conflict
Decide which concepts to use for construction
 – Plethora of measures of interestingness
 – Weighted sum of measures

Concept Generation: Lakatos-inspired Techniques
Monster Barring
 – Remove an object of interest from theory
Counterexample Barring
 – Except a finite subset of objects from a theorem
 – E.g., all primes except 2 are odd
Concept Barring
 – Except a concept from a theorem
 – E.g., all integers other than squares have an even number of divisors
Credit to Alison Pease
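The two barring examples on this slide can be checked empirically. The following is a minimal Python sketch, not HR's implementation: the helper names (`num_divisors`, `bar_concept`) and the data range 1 to 100 are assumptions made for illustration.

```python
from math import isqrt

def num_divisors(n: int) -> int:
    """Count the positive divisors of n (the tau function)."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def is_square(n: int) -> bool:
    return isqrt(n) ** 2 == n

def is_prime(n: int) -> bool:
    return num_divisors(n) == 2

def bar_concept(objects, conjecture, barred):
    """Objects that still break `conjecture` after excluding every
    object that satisfies the `barred` concept (hypothetical helper)."""
    return [x for x in objects if not barred(x) and not conjecture(x)]

integers = range(1, 101)  # assumed data range

def has_even_tau(n: int) -> bool:
    return num_divisors(n) % 2 == 0

# Faulty conjecture: "all integers have an even number of divisors".
counterexamples = [n for n in integers if not has_even_tau(n)]

# Concept barring: except the concept "square" from the theorem.
remaining = bar_concept(integers, has_even_tau, is_square)

# Counterexample barring: the finite set of primes that are not odd.
even_primes = [n for n in integers if is_prime(n) and n % 2 == 0]
```

On this range every counterexample to the faulty conjecture is a square, so barring the concept of squares leaves no counterexamples, while the prime conjecture needs only the single object 2 barred.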

Hypothesis Generation: Finding Empirical Relationships
Equivalence conjectures
 – One concept has the same examples as another
Subsumption conjectures
 – All examples of one concept are examples of the other
Non-existence conjectures
 – A concept has no examples
Assessment of conjectures
 – Used to assess the concepts mentioned in them
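The three conjecture types can be read directly off the example sets of the concepts. The sketch below assumes each concept is stored as a set of examples; the function name `find_conjectures` and the sample concepts are hypothetical and are not taken from HR.

```python
from itertools import combinations
from math import isqrt

def find_conjectures(concepts):
    """concepts maps a name to a frozenset of examples.
    Returns tuples describing empirically true conjectures."""
    found = []
    non_empty = {}
    for name, examples in concepts.items():
        if examples:
            non_empty[name] = examples
        else:
            found.append(("non-existence", name))
    for (a, ex_a), (b, ex_b) in combinations(non_empty.items(), 2):
        if ex_a == ex_b:
            found.append(("equivalence", a, b))
        elif ex_a < ex_b:   # proper subset: every a is a b
            found.append(("subsumption", a, b))
        elif ex_b < ex_a:
            found.append(("subsumption", b, a))
    return found

def tau(n):
    return sum(1 for d in range(1, n + 1) if n % d == 0)

integers = range(1, 31)  # assumed data range
concepts = {
    "even": frozenset(n for n in integers if n % 2 == 0),
    "multiple_of_4": frozenset(n for n in integers if n % 4 == 0),
    "square": frozenset(n for n in integers if isqrt(n) ** 2 == n),
    "odd_tau": frozenset(n for n in integers if tau(n) % 2 == 1),
    "prime_square": frozenset(
        n for n in integers if tau(n) == 2 and isqrt(n) ** 2 == n
    ),
}
conjectures = find_conjectures(concepts)
```

Empty concepts are reported as non-existence conjectures and then left out of the pairwise comparison, since the empty set is trivially a subset of everything.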

Hypothesis Generation: Extracting Prime Implicates
Extract implications, then prime implicates
Equivalence conjectures are split:
 – A & B & C ↔ D & E & F becomes
 – A & B & C → D, A & B & C → E, etc.
Non-existence conjectures are split:
 – ¬(A & B & C) becomes: A & B → ¬C, etc.
Extract prime implicates:
 – A & B & C → D, try A → D, then B → D, C → D, then A & B → D, etc.
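The search order on this slide (single premises first, then pairs, and so on) can be sketched as a subset enumeration. This is an illustrative Python sketch only: the `holds` check below tests an implication against data, which is an assumption standing in for whatever decides that a candidate implicate is true, and all names are hypothetical.

```python
from itertools import combinations
from math import isqrt

def prime_implicates(premises, goal, holds):
    """Return the minimal premise subsets S with holds(S, goal) true,
    trying smaller subsets before larger ones."""
    minimal = []
    for size in range(1, len(premises) + 1):
        for subset in combinations(premises, size):
            if any(set(m) <= set(subset) for m in minimal):
                continue  # a smaller implicate already covers this one
            if holds(subset, goal):
                minimal.append(subset)
    return minimal

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

predicates = {
    "prime": is_prime,
    "greater_than_2": lambda n: n > 2,
    "less_than_50": lambda n: n < 50,
    "odd": lambda n: n % 2 == 1,
}
objects = range(1, 101)  # assumed data range

def holds(subset, goal):
    """Empirical check: every object satisfying all premises satisfies goal."""
    return all(
        predicates[goal](x)
        for x in objects
        if all(predicates[p](x) for p in subset)
    )

result = prime_implicates(
    ["prime", "greater_than_2", "less_than_50"], "odd", holds
)
```

Here the three-premise implication reduces to the single implicate "prime & greater_than_2 → odd": no single premise suffices, and the full conjunction is skipped because a smaller subset already covers it.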

Hypothesis Generation: Imperfect Conjectures
User sets a percentage minimum, say 80%
Near-subsumption conjectures
 – E.g., primes → odd (99% true)
 – Also returns the counterexamples: here, 2
Near-equivalence conjectures
 – Prime ↔ odd (70% true)
Applicability conjectures
 – A concept has a (small) finite number of examples
 – E.g., even prime numbers: 2 is the only example
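A near-subsumption conjecture can be scored as the fraction of examples of the first concept that are also examples of the second. The Python sketch below assumes that scoring rule and a data range of 1 to 100; neither is stated on the slide, so the percentage it yields depends on the range chosen and need not match the figure quoted above.

```python
from math import isqrt

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

def is_odd(n):
    return n % 2 == 1

def near_subsumption(objects, left, right, minimum=0.8):
    """Return (score, counterexamples) if `left -> right` holds for at
    least `minimum` of the examples of `left`, otherwise None."""
    examples = [x for x in objects if left(x)]
    if not examples:
        return None
    counterexamples = [x for x in examples if not right(x)]
    score = (len(examples) - len(counterexamples)) / len(examples)
    if score < minimum:
        return None
    return score, counterexamples

integers = range(1, 101)  # assumed data range
primes_are_odd = near_subsumption(integers, is_prime, is_odd)
odds_are_prime = near_subsumption(integers, is_odd, is_prime)
```

With the 80% minimum, "primes → odd" is kept and returned together with its single counterexample, while the reverse direction falls below the threshold and is discarded.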

Cycles of Theory Formation
How the individual techniques are employed
Concept driven conjecture making
 – Finding conjectures to help understand concepts
 – Exploration techniques
Conjecture driven concept formation
 – Inventing concepts to fix faulty conjectures
 – Imperfect conjectures, Lakatos techniques

Concept Driven Cycle (cut-down)
[Flowchart. Node labels: Invent Concept; Equivalence; Non Existence; New Concept; Subsumptions; Implications; Reject]

Concept Driven Cycle Continued
[Flowchart. Node labels: Implications; Counterexample; Proof; Prime Implicates; Counterexample; Proof]

Conjecture Driven Cycle
[Flowchart. Node labels: Invent Concept; Reject; Near Equivalence; Applicability; Near Subsumption; Concept Barring; Counterex Barring; New/Old Concept; Equivalence; Implications; Monster Barring; New Concept; Counterex Barring; Concept Barring; New/Old Concept]

Case Study: Groups. Given: group theory axioms.

Case Study: Groups. MACE model generator finds a model of size 1. (Davis-Putnam Method)

Case Study: Groups. Extracts concepts: Element, Multiplication, Identity, Inverse. (HR Reads MACE’s Output)

Case Study: Groups. Invents the concept of idempotent elements (a*a=a). (Match Production Rule)

Case Study: Groups. Makes conjecture: a*a=a ↔ a is the identity element. (Equivalence Finding)

Case Study: Groups. Otter proves this in less than a second. (Resolution Theorem Proving)

Case Study: Groups. a*a=a → a=identity, a=identity → a*a=a. End of cycle. (Extracts Prime Implicates)
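For reference, the equivalence in this cycle has a short textbook proof from the group axioms. The derivation below is a standard hand proof, not Otter's output; the symbols $e$ (identity) and $a^{-1}$ (inverse of $a$) are notation introduced here.

```latex
\begin{align*}
a * a = a
  &\implies a^{-1} * (a * a) = a^{-1} * a \\
  &\implies (a^{-1} * a) * a = e \\
  &\implies e * a = e \\
  &\implies a = e,
\end{align*}
and conversely $e * e = e$ by the identity axiom.
```

The two directions correspond to the two implicates extracted at the end of the cycle.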

Case Study: Groups. Later: invents the concept of triples of elements (a,b,c) for which a*b=c & b*a=c. (Compose Production Rule)

Case Study: Groups. Invents the concept of pairs (a,b) for which there exists an element c such that a*b=c & b*a=c. (Exists Production Rule)

Case Study: Groups. Invents the concept of groups for which all pairs of elements have such a c: Abelian groups. (Forall Production Rule)

Case Study: Groups. Makes the conjecture: G is a group if and only if it is Abelian. (Equivalence Finding)

Case Study: Groups. Otter fails to prove this conjecture. (Sorry)

Case Study: Groups. MACE finds a counterexample: dihedral group of size 6 (non-Abelian). (Davis-Putnam Method)
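The counterexample can be reproduced by hand. The Python sketch below builds the dihedral group of size 6 as the permutations of three points (it is isomorphic to the symmetric group S3) and looks for pairs that fail to commute. It uses brute-force enumeration, not MACE's model-finding method, and the names are chosen for illustration.

```python
from itertools import permutations

def compose(p, q):
    """Product p*q of two permutations given as tuples: apply q, then p."""
    return tuple(p[i] for i in q)

# All 6 permutations of {0, 1, 2}: the dihedral group of size 6.
elements = list(permutations(range(3)))
identity = (0, 1, 2)

# Pairs (a, b) with no common value c such that a*b = c and b*a = c.
non_commuting = [
    (a, b)
    for a in elements
    for b in elements
    if compose(a, b) != compose(b, a)
]

is_abelian = not non_commuting
```

Any single non-commuting pair is enough to refute the conjecture that every group is Abelian.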

Case Study: Groups. Concept of Abelian groups allowed into theory. Theory recalculated in light of new object of interest. (Assessment of Concepts)

Case Study: Goldbach. Given: integers 1 to 100; concepts: divisors, addition.

Case Study: Goldbach. Invents: even numbers (divisible by 2). (Split Production Rule)

Case Study: Goldbach. Invents: number of divisors (tau function). (Size Production Rule)

Case Study: Goldbach. Invents: prime numbers (2 divisors). (Split Production Rule)

Case Study: Goldbach. Half an hour later, invents: Goldbach numbers (sum of 2 primes). (Compose Production Rule)

Case Study: Goldbach. Conjectures: even numbers are Goldbach numbers (with one exception, the number 2). (Near Equivalence Finding)

Case Study: Goldbach. Forces: concept of being the number 2. (Counterexample Barring: Split)

Case Study: Goldbach. Forces concept: even numbers except 2. (Counterexample Barring: Negate)

Case Study: Goldbach. Conjectures: even numbers except 2 are Goldbach numbers (Goldbach’s Conjecture). (Subsumption Finding)
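The chain of concepts in this case study can be replayed on the same data, the integers 1 to 100. The Python sketch below writes each concept out by hand as a set comprehension; it does not model HR's production rules or its search, and the variable names are illustrative.

```python
def tau(n: int) -> int:
    """Number of divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

integers = range(1, 101)

# Concepts built in the order the case study invents them.
evens = {n for n in integers if n % 2 == 0}
primes = {n for n in integers if tau(n) == 2}
goldbach = {n for n in integers if any((n - p) in primes for p in primes)}

# Near conjecture: even numbers are Goldbach numbers, with exceptions.
exceptions = sorted(evens - goldbach)

# Counterexample barring: remove the exceptions from the concept.
evens_except_2 = evens - set(exceptions)

# The repaired conjecture now holds on the data as a subsumption.
repaired_holds = evens_except_2 <= goldbach
```

The check only confirms the repaired conjecture empirically on 1 to 100; it says nothing about a proof.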

Case Study: Goldbach. Passes the conjecture to an inductive theorem prover? (Absolutely no chance)

Applications of Theories
Puzzle generation
 – Which is the odd one out: 4, 9, 16, 24?
 – Which is the odd one out: 2, 9, 8, 3?
Problem generation
 – TPTP library: find theorem to differentiate Spass & E
 – See AI and Maths paper
Prediction tests (e.g., Progol animals file)
 – P(mammal | has_milk) = 1.0
 – P(mammal | habitat(water)) =
 – Take average over all Bayesian probabilities

Applications of Theory Formation
Identifying concepts (e.g., Michalski trains)
 – Forward look ahead mechanism (see ICML-00 paper)
Simplifying problems
 – Lemma generation for ATP
 – Constraint generation for CSP (see CP-01 paper)
Identifying outliers
 – How unique an object of interest is
Inventing concepts
 – Integer sequences (and conjectures)
 – See AAAI-00 paper, Journal of Integer Sequences

Conclusions
Presented a snapshot of the theory of ATF
 – Autonomous
 – Four components, numerous techniques
 – Uses third party software
 – Concept driven and conjecture driven cycles
Applies to many machine learning tasks
 – Concept identification, puzzle generation
 – Predictions, problem simplification

Welcome to the Next Level
For any of the four components
 – Substitute a human for interactive ATF
 – Roy McCasland (hopefully), mathematician
 – Work on Zariski spaces with HR
For any of the four components
 – Substitute another agent for multi-agent ATF
 – Alison Pease’s PhD, cognitive modelling
 – Lakatos style reasoning and machine creativity

Theory Formation in Bioinformatics?
Can work with non-maths data
Can form near-conjectures
Needs to relax notion of equality
Multi-agent approach definitely needed