Combining Inductive Logic Programming, Active Learning and Robotics to Discover the Function of Genes by C.H. Bryant, S.H. Muggleton, S.G. Oliver, D.B. Kell, P. Reiser and R.D. King


1 Combining Inductive Logic Programming, Active Learning and Robotics to Discover the Function of Genes
by C.H. Bryant, S.H. Muggleton, S.G. Oliver, D.B. Kell, P. Reiser and R.D. King
Presenter: Mark H. Rich
2/7/2003
University of Wisconsin - Madison
CS 838 Learning and Modeling Biological Networks

2 Discovering Gene Function
Yeast (S. cerevisiae) has 6,000 protein-encoding genes.
Only 60% can be assigned function with confidence.
The cell is a bio-chemical machine.
Logic can help us discover these metabolic functions and networks.

3 ASE-Progol Robot Scientist
[Diagram labels: Background Knowledge, Analysis, Learning Engine, Results, Experiment Selection, New Knowledge]

4 Outline
Introduction
Abduction and Active Learning
Functional Genomics
Metabolism in Logic
Experiments
Results

5 Logic in AI
Deduction: Given facts with a sound and complete proof theory, show that other facts can be proven.
Induction: Given positive and negative examples of facts and background knowledge, find a hypothesis that explains the difference between the positives and the negatives.

6 Abduction and TCIE (Theory Completion by Inverse Entailment)
Given a theory and partial facts, discover what facts are missing to form one consistent hypothesis.
Lateral Thinking Puzzles:
You are presented with a confusing situation.
There is an Oracle that knows what happened.
You can only ask yes or no questions.

7 The Mysterious Package One day a man received a parcel in the post. Carefully packed was a human arm. He examined it, repacked it and then sent it on to another man. The second man also carefully examined the arm before taking it to the woods and burying it. Why did they do this?

8 The Mysterious Package
Was the arm cut off intentionally?
Is the person the arm came from still alive?
Is he a doctor?
Did the three men know each other?
Are the other men also missing an arm?
Were they ever stuck on a desert island with no food, where they made a pact to each cut off an arm to eat and survive, but were rescued before the doctor could cut off his own arm, so that the doctor later fulfilled his commitment?
YES!

9 Lateral Thinking Lessons
Certain questions are valuable and lead to large leaps of information...
How do we form hypotheses?
How can we pick good questions? Two factors matter:
  the probability that the question leads to consistent hypotheses
  the cost of asking the question
We want to find the quickest, cheapest path to consistent hypotheses.

10 Hypothesis Generation
Use contra-positives for inverse entailment.
Background Knowledge:
  hasbeak(X) :- bird(X).
  bird(X) :- vulture(X).
Example:
  hasbeak(tweety).
Hypotheses:
  bird(tweety).
  bird(X).
  vulture(tweety).
  vulture(X).
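The hypothesis list on this slide can be reproduced with a small sketch. It is not Progol's inverse entailment procedure; it only walks the single-literal rules backwards from the example, which is enough for this toy theory. The names `RULES`, `abducibles` and `candidate_hypotheses` are illustrative assumptions, not part of ASE-Progol.

```python
# Background knowledge as (head, body) pairs, each meaning: head(X) :- body(X).
RULES = [("hasbeak", "bird"), ("bird", "vulture")]


def abducibles(goal, rules):
    """Predicates that would entail `goal` through a chain of the given rules."""
    found, frontier = [], [goal]
    while frontier:
        head = frontier.pop()
        for rule_head, body in rules:
            if rule_head == head and body not in found:
                found.append(body)
                frontier.append(body)
    return found


def candidate_hypotheses(pred, const, rules):
    """For each abducible predicate, emit a ground fact and its generalisation."""
    hypotheses = []
    for p in abducibles(pred, rules):
        hypotheses.append(f"{p}({const}).")  # specific to the observed constant
        hypotheses.append(f"{p}(X).")        # generalised over a variable
    return hypotheses


# Example from the slide: explain hasbeak(tweety).
for h in candidate_hypotheses("hasbeak", "tweety", RULES):
    print(h)
```

Each printed clause, added to the background knowledge, makes hasbeak(tweety) provable, which is the property the slide's four hypotheses share.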

11 Trial Selection
Theory:
      e1  e2  e3  e4
  H1   0   1   1   1
  H2   1   1   0   1
  H3   1   0   1   1
One possible trial path:
  e1 = f: H1
  e1 = t: e2
    e2 = t: H2
    e2 = f: H3
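A minimal sketch of how a trial path narrows the hypothesis set, using the 0/1 table from this slide. The function names and the `oracle_for` helper are assumptions made for illustration; the real oracle is a laboratory experiment.

```python
# Predicted outcome (1 = true, 0 = false) of each trial under each hypothesis.
TABLE = {
    "H1": {"e1": 0, "e2": 1, "e3": 1, "e4": 1},
    "H2": {"e1": 1, "e2": 1, "e3": 0, "e4": 1},
    "H3": {"e1": 1, "e2": 0, "e3": 1, "e4": 1},
}


def partition(hyps, trial, table):
    """Split hypotheses into those predicting true and those predicting false."""
    yes = [h for h in hyps if table[h][trial] == 1]
    no = [h for h in hyps if table[h][trial] == 0]
    return yes, no


def follow_path(trials, oracle, table):
    """Ask trials in order, keeping only hypotheses that agree with the oracle."""
    remaining = sorted(table)
    for trial in trials:
        if len(remaining) <= 1:
            break
        yes, no = partition(remaining, trial, table)
        remaining = yes if oracle(trial) else no
    return remaining


def oracle_for(true_hypothesis):
    """Hypothetical oracle that answers as if `true_hypothesis` were correct."""
    return lambda trial: TABLE[true_hypothesis][trial] == 1


print(follow_path(["e1", "e2"], oracle_for("H3"), TABLE))  # ['H3']
```

Note that e4 has the same predicted outcome under all three hypotheses, so asking it would never shrink the set.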

12 Hypothesis Probability
Each trial partitions H into {H[t], H[t']}.
Assuming optimal encoding scheme…
Prior probability of each hypothesis
Compression is rounded f measure

13 Experiment Cost
C_t is the cost of a trial t.

14 Functional Genomics
We want to learn the gene-enzyme mapping.
Genes encode enzymes that catalyze reactions between metabolites to eventually create amino acid products.
Perform auxotrophic growth experiments to determine phenotype.

15 Functional Genomics: Simple
A, B and C are enzymes.
X is a ubiquitous metabolite; Y and Z are optional.
If we knock out gene2, we need to add nutrient Z to produce Trp.
We want to learn codes(gene2, B, [Y], [Z]), but can only ask:
  pheno_effect(gene2,[Y]) is false
  pheno_effect(gene2,[Z]) is true
  pheno_effect(gene2,[Y,Z]) is true
[Diagram: X -> Y -> Z -> Trp, with the three steps catalyzed by enzymes A, B and C, encoded by gene1, gene2 and gene3 respectively]
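The three answers on this slide can be checked with a toy simulation. The sketch assumes the linear pathway X -> Y -> Z -> Trp with one gene per step, which is a reading of the slide's diagram, and it treats each answer as "can the knockout still produce Trp on this medium". The names `REACTIONS`, `reachable` and `makes_trp` are hypothetical.

```python
# Assumed toy pathway: each gene encodes the enzyme for one step.
REACTIONS = {
    "gene1": ("X", "Y"),    # enzyme A
    "gene2": ("Y", "Z"),    # enzyme B
    "gene3": ("Z", "Trp"),  # enzyme C
}


def reachable(starts, knocked_out):
    """Metabolites obtainable from `starts` when one gene is removed."""
    have = set(starts)
    changed = True
    while changed:
        changed = False
        for gene, (reactant, product) in REACTIONS.items():
            if gene != knocked_out and reactant in have and product not in have:
                have.add(product)
                changed = True
    return have


def makes_trp(knocked_out, nutrients):
    """True if Trp is produced; X is ubiquitous, nutrients are added to the medium."""
    return "Trp" in reachable({"X"} | set(nutrients), knocked_out)


for medium in (["Y"], ["Z"], ["Y", "Z"]):
    print(medium, makes_trp("gene2", medium))
```

Under these assumptions only media containing Z rescue the gene2 knockout, which is the evidence pointing to gene2 encoding the Y-to-Z step.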

16 Aromatic amino acid pathway
[Figure: aromatic amino acid pathway; legend: aromatic amino acids, enzymes, metabolites]

17 Metabolism in Logic
Hypotheses:
  codes('YDR254W', '4.2.1.11', ['C00631'], ['C00074']).
  codes('YDR254W', '5.3.1.24', ['C04302'], ['C01302']).
  etc...
Background Knowledge:
  enzyme('4.2.1.11', ['C00631'], ['C00074']).
  enzyme('5.3.1.24', ['C04302'], ['C01302']).
  etc...
  generated_by_other_pathways(['C00002', 'C00005', 'C00006', ..., 'C03356']).
  ends(['C00078', 'C00079', 'C00082']).

18 Metabolism in Logic
What the Oracle answers:
  phenotypic_effect(ORF, Growth_medium) :-
      generated_by_other_pathways(Ubiquitous_metabolites),
      union(Ubiquitous_metabolites, Growth_medium, Starts),
      connected(Starts, Wild_products),
      ends(Ends),
      subset(Wild_products, Ends),
      enz(Enzyme, Reactants, Products),
      encodes(ORF, Enzyme, Reactants, Products),
      connected_without_this_step(Starts, Mutant_products, Enzyme, Reactants, Products),
      not(subset(Mutant_products, Ends)).

19 Experiments
Learn the function of 17 genes by removing each ORF.
Growth media: 13 optional nutrients, at most 3 at a time; 378 possible experiments for each ORF.
Cost of optional nutrients: determined from the www.sigmaaldrich.com catalog.
Strategies for comparison: Random, Naïve Cheapest, ASE-Progol.
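The figure of 378 experiments per ORF follows from counting every growth medium with 0, 1, 2 or 3 of the 13 optional nutrients: 1 + 13 + 78 + 286 = 378. A short check, with placeholder nutrient names (`n1` to `n13`) that are assumptions, not the nutrients used in the study:

```python
from itertools import combinations
from math import comb

N_NUTRIENTS = 13
MAX_PER_MEDIUM = 3

# Closed form: sum of binomial coefficients C(13, k) for k = 0..3.
count = sum(comb(N_NUTRIENTS, k) for k in range(MAX_PER_MEDIUM + 1))

# Explicit enumeration of the candidate growth media.
nutrients = [f"n{i}" for i in range(1, N_NUTRIENTS + 1)]  # placeholder names
media = [
    combo
    for k in range(MAX_PER_MEDIUM + 1)
    for combo in combinations(nutrients, k)
]

print(count, len(media))  # 378 378
```

The count includes the medium with no optional nutrients added.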

20 Experiments
Remove all codes(…) facts.
Loop:
  Generate a random sample of trials.
  Generate hypotheses using Theory Completion by Inverse Entailment.
  Find the trial with minimum EC(H,T) and perform it.
  Add the results to the known examples.
until the hypotheses are consistent with the trials.
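The shape of this loop can be sketched as follows. This is not ASE-Progol: hypothesis generation and trial sampling are replaced by a fixed table (the one from the Trial Selection slide), and the EC(H,T) criterion is replaced by a plainly simpler stand-in that picks the cheapest trial still able to split the surviving hypotheses. The trial costs and all names (`COSTS`, `run_loop`, `oracle_for`) are made up for illustration.

```python
# Predicted outcomes per hypothesis, reused from the Trial Selection slide.
TABLE = {
    "H1": {"e1": 0, "e2": 1, "e3": 1, "e4": 1},
    "H2": {"e1": 1, "e2": 1, "e3": 0, "e4": 1},
    "H3": {"e1": 1, "e2": 0, "e3": 1, "e4": 1},
}
COSTS = {"e1": 3, "e2": 1, "e3": 2, "e4": 1}  # hypothetical trial costs


def run_loop(table, costs, oracle):
    """Repeat: choose a trial, perform it, discard inconsistent hypotheses."""
    hyps = sorted(table)
    performed = []  # (trial, observed outcome) pairs
    total_cost = 0
    while len(hyps) > 1:
        # Only trials on which the surviving hypotheses disagree are informative.
        useful = [t for t in costs if len({table[h][t] for h in hyps}) > 1]
        if not useful:
            break
        # Stand-in for "minimum EC(H,T)": cheapest informative trial.
        trial = min(useful, key=lambda t: costs[t])
        outcome = oracle(trial)
        total_cost += costs[trial]
        performed.append((trial, outcome))
        hyps = [h for h in hyps if table[h][trial] == outcome]
    return hyps, performed, total_cost


def oracle_for(true_hypothesis):
    """Hypothetical oracle answering as if `true_hypothesis` were correct."""
    return lambda trial: TABLE[true_hypothesis][trial]


print(run_loop(TABLE, COSTS, oracle_for("H1")))
```

With H1 as the truth, the sketch asks e2 and then e3 for a total cost of 3; a criterion that also weighs how likely each outcome is could choose differently, which is the point of comparing strategies.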

21 Results: Cost

22 Results: Time

23 Conclusions and Future Work
ASE-Progol finds hypotheses inexpensively and quickly.
5 of 17 genes had only negative examples… why? Look into inhibitors and nonmonotonic logics.
Answers are limited to yes/no. Probabilities?
Can this be applied to gene regulatory networks, using microarray technology?
What other networks have similar frameworks?

