Education and Computational Biology
Dean L. Zeller
Kent State University
OCCBIO ‘06, July 28-30, 2006

Education of Computational Biology, Slide 2 of 29
“…the great Tree of Life fills with its dead and broken branches the crust of the earth, and covers the surface with its ever-branching and beautiful ramifications.”
Charles Darwin (1809-1882), Father of Evolution

Slide 3 of 29: Initial Inspiration
Colloquium by Dr. Lonnie Welsh on March 15th for the KSU Department of Computer Science: “Extraterrestrials, Cryptanalysis, and Genomes: Perspectives on Bioinformatics Research”
Looking for new perspectives in bioinformatics.
My perspective: educate a younger audience of computational biologists.

Slide 4 of 29: Outline
Goals of research
Evolution trees
Assignment 1 – Atlas of Evolution Trees
Assignment 2 – Atlas of Distance Graphs
Assignment 3 – Phylogeny Reconstruction
Future work

Slide 5 of 29: Goals of Research
Specific Goals
–Create “teachable” lessons on bioinformatics suitable for a mid-level computer science, mathematics, or biology class.
–Use existing evolution models and create more adequate ones.
Long-Term Goals
–Discover methods of phylogeny reconstruction from a new perspective.
–Educate the next generation of computational biologists.

Slide 6 of 29: Evolution Tree Example
Figure: tree inferred by Unweighted Pair Group Method with Arithmetic mean (UPGMA) clustering of the Sarich (1969) immunological distance data set. [Felsenstein, p. 166]
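
As a rough illustration of the UPGMA clustering named on this slide, the sketch below repeatedly merges the two closest clusters and sets the distance to a merged cluster to the size-weighted average of its parts. The function name `upgma`, the nested-tuple tree encoding, and the four-taxon distances are hypothetical choices made for this example; they are not the Sarich data and not the author's code.

```python
from itertools import combinations


def upgma(labels, dist):
    """Return a nested-tuple tree built by UPGMA.

    `dist` maps frozenset({x, y}) to the distance between leaves x and y.
    """
    sizes = {label: 1 for label in labels}  # cluster -> number of leaves in it
    d = dict(dist)                          # working copy; grows as clusters merge
    while len(sizes) > 1:
        # Pick the pair of current clusters with the smallest distance.
        x, y = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        nx, ny = sizes.pop(x), sizes.pop(y)
        merged = (x, y)
        # Distance to the merged cluster is the size-weighted average.
        for z in sizes:
            d[frozenset((merged, z))] = (
                nx * d[frozenset((x, z))] + ny * d[frozenset((y, z))]
            ) / (nx + ny)
        sizes[merged] = nx + ny
    return next(iter(sizes))


# Hypothetical toy distances (NOT the Sarich data set).
toy = {
    frozenset("ab"): 2, frozenset("ac"): 6, frozenset("bc"): 6,
    frozenset("ad"): 10, frozenset("bd"): 10, frozenset("cd"): 10,
}
tree = upgma("abcd", toy)
print(tree)
```

With these toy values, a and b are joined first, then c, then d. Branch heights are omitted to keep the sketch short.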

Slide 7 of 29: Evolution Tree Example

Slide 8 of 29: Class Assignments
Assignment 1 – Drawing Trees
–The student will use a graphics package to create diagrams of binary evolution trees.
Assignment 2 – Phylogenetic Distance Graphs
–The student will use a graphics package to construct distance graphs (k-leaf powers) for the evolution trees created in Assignment 1.
Assignment 3 – Phylogeny Reconstruction
–The student will demonstrate an algorithm for phylogeny reconstruction from the results of theoretical experiments using the incremental k-leaf power.
(Tested on CS10051 students, Spring 2006)

Slide 9 of 29: Assumptions
A few simple assumptions greatly reduce the problem complexity.
1. Redundant nodes are removed.
2. Multiple-split nodes are replaced with isomorphic approximations.
3. Only isomorphically unique trees are considered.

Slide 10 of 29: Assumption #1
Redundant nodes are removed without loss of data.
The species is already assumed to change slowly over time, so considering a single point along a branch adds nothing to the problem.
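
One way to picture Assumption #1 in code, assuming a "redundant node" means an internal node with exactly one child (a single point along a branch). The nested-tuple encoding and the name `remove_redundant` are hypothetical, chosen only for this sketch.

```python
def remove_redundant(tree):
    """Collapse every internal node that has exactly one child.

    A leaf is a string; an internal node is a tuple of subtrees.
    """
    if isinstance(tree, str):
        return tree
    children = tuple(remove_redundant(child) for child in tree)
    if len(children) == 1:
        # A single-child node carries no branching information: skip it.
        return children[0]
    return children


# Hypothetical tree with three redundant nodes: ('a',), ('c',) and (('c',),).
before = ((("a",), "b"), (("c",),))
after = remove_redundant(before)
print(after)
```

The leaves and the branching order are unchanged; only the pass-through nodes disappear.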

Slide 11 of 29: Assumption #2
Multiple-split nodes are replaced with isomorphic approximations.
Some data is lost, but the problem complexity is greatly reduced.

Slide 12 of 29: Assumption #3
Only isomorphically unique trees are considered.
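
To make "isomorphically unique" concrete, the sketch below enumerates rooted binary tree shapes with unlabeled leaves and keeps one canonical string per shape, so mirror images count once. This assumes the trees in question are rooted, strictly binary, and compared without leaf labels; the function name `shapes` and the string encoding are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def shapes(n):
    """Canonical strings for all rooted binary tree shapes with n leaves."""
    if n == 1:
        return ("*",)                      # a single leaf
    found = set()
    # Split the leaves between two subtrees; left <= n - left avoids repeats.
    for left in range(1, n // 2 + 1):
        for a in shapes(left):
            for b in shapes(n - left):
                lo, hi = sorted((a, b))    # order children so mirrors coincide
                found.add(f"({lo}{hi})")
    return tuple(sorted(found))


for n in range(1, 8):
    print(n, len(shapes(n)))
```

Under these assumptions the counts for 1 through 7 leaves are 1, 1, 1, 2, 3, 6, 11, so an atlas of this kind would hold three shapes for 5 leaves and six for 6 leaves.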

Slide 13 of 29: Assignment 1: Atlas of Evolution Trees
Inspired by An Atlas of Graphs [Read and Wilson, 1999].
An elegant yet simple way to analyze graphs and trees, useful for instructional purposes.
Apply the same style to phylogenies.

Slide 14 of 29: Atlas of Evolution Trees (5 leaves)

Slide 15 of 29: Atlas of Evolution Trees (6 leaves)

Slide 16 of 29: Assignment 2: Atlas of Distance Graphs (k-leaf powers)
Builds on Assignment 1: create the associated k-leaf power for each tree.
Useful as a reference for studying the relationship between cliques, k-leaf powers, and k-leaf roots.
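
A minimal sketch of how a k-leaf power can be computed from a tree: the k-leaf power is the graph on the tree's leaves in which two leaves are adjacent when their distance in the tree is at most k. The example tree, the adjacency-dictionary encoding, and the function names are hypothetical and not taken from the assignment.

```python
from collections import deque


def leaf_distances(adj):
    """Map each leaf pair (x, y), x < y, to its path length in the tree."""
    leaves = sorted(v for v, nbrs in adj.items() if len(nbrs) == 1)
    dist = {}
    for src in leaves:
        depth = {src: 0}
        queue = deque([src])
        while queue:                       # breadth-first search from src
            v = queue.popleft()
            for w in adj[v]:
                if w not in depth:
                    depth[w] = depth[v] + 1
                    queue.append(w)
        for dst in leaves:
            if src < dst:
                dist[(src, dst)] = depth[dst]
    return dist


def k_leaf_power(adj, k):
    """Edges of the k-leaf power: leaf pairs at tree distance at most k."""
    return {pair for pair, d in leaf_distances(adj).items() if d <= k}


# Hypothetical tree: root r, internal nodes p and q, leaves a, b, c, d.
edges = [("r", "p"), ("r", "q"), ("p", "a"), ("p", "b"), ("q", "c"), ("q", "d")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

print(sorted(k_leaf_power(adj, 2)))
print(len(k_leaf_power(adj, 4)))
```

For this tree the 2-leaf power joins only the sibling pairs (a, b) and (c, d), and the 4-leaf power is the complete graph on the four leaves.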

Slide 17 of 29: Atlas of Distance Graphs
Figure panels: k = 2, k = 3; k = 2, k = 3, k = 4

Slide 18 of 29: Atlas of Distance Graphs
Figure panels: k = 2, k = 3, k = 4; k = 2, k = 3, k = 4, k = 5

Slide 19 of 29: Distance Graph Simulator
Figure: simulator display with nodes a to i, labels k = 2 through k = 8, and “Graph complete”

Slide 20 of 29: Phylogeny Reconstruction from Binary Genetic Data
A genetic test returns 1 if species x and y are genetically close to within a given threshold, and 0 otherwise.
The results are collected into a similarity grid and a distance graph (k-leaf power).

Slide 21 of 29: Reconstruction
Step 1 – Difference Summary Table
    a  b  c  d  e  f
a      1  1  0  0  0
b         1  0  0  0
c            1  0  0
d               1  1
e                  1
f
Step 2 – k-leaf power
Step 3 – phylogeny (k-leaf root)
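
Step 2 can be sketched as reading the upper-triangular 0/1 grid from Step 1 into a set of graph edges, where each 1 becomes an edge between two species. The row strings below are copied from the table on this slide; the function name `grid_to_edges` and the dictionary layout are hypothetical.

```python
def grid_to_edges(species, rows):
    """Turn an upper-triangular 0/1 grid into a set of graph edges."""
    edges = set()
    for i, x in enumerate(species):
        # Row x lists test results against the species that follow x.
        for y, bit in zip(species[i + 1:], rows.get(x, "")):
            if bit == "1":
                edges.add((x, y))
    return edges


# Rows as read from the Difference Summary Table on this slide.
rows = {"a": "11000", "b": "1000", "c": "100", "d": "11", "e": "1"}
edges = grid_to_edges("abcdef", rows)
print(sorted(edges))
```

The resulting graph has seven edges: the triangles a-b-c and d-e-f, linked by the single edge c-d. Step 3, finding a k-leaf root for this graph, is the hard part and is not attempted here.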

Slide 22 of 29: Reconstruction
A linear-time solution exists for k = 3 [Brandstädt and Le, 2006] and for k = 4 [Brandstädt et al, 2006].
The problem remains open for k ≥ 5.
–This severely limits analysis capability.

Slide 23 of 29: Assignment 3: Phylogeny Reconstruction from Discrete Genetic Data
A genetic test returns a discrete value (k = 2, 3, 4, …) denoting the distance between x and y in the tree.
The results are collected into a distance grid.
k-leaf powers are created incrementally.

Slide 24 of 29: Reconstruction
Difference Summary Table
    a  b  c  d  e  f
a      2  3  5  6  6
b         3  5  6  6
c            4  5  5
d               3  3
e                  2
f
Figure panels: k 2, k 3, k 4, k 5, k 6

Slide 25 of 29: Incremental k-leaf power
Distance 2: direct neighbors
Distance 3: close relatives
Distance 4: tree complete
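
The incremental idea can be sketched by grouping species pairs by the smallest k at which they become adjacent; the k-leaf power at step k is then the union of all groups up to k. The distances below are read from the Difference Summary Table on slide 24; the function name `new_edges_by_k` and the data layout are hypothetical.

```python
# Distances as read from the Difference Summary Table on slide 24.
rows = {
    "a": [2, 3, 5, 6, 6],
    "b": [3, 5, 6, 6],
    "c": [4, 5, 5],
    "d": [3, 3],
    "e": [2],
}
species = "abcdef"
dist = {
    (x, y): d
    for i, x in enumerate(species)
    for y, d in zip(species[i + 1:], rows.get(x, []))
}


def new_edges_by_k(dist):
    """Group leaf pairs by the smallest k at which they become adjacent."""
    steps = {}
    for pair, d in sorted(dist.items()):
        steps.setdefault(d, []).append(pair)
    return dict(sorted(steps.items()))


steps = new_edges_by_k(dist)
for k, pairs in steps.items():
    print(k, pairs)
```

For this table, k = 2 adds the pairs (a, b) and (e, f), k = 3 adds four more pairs, and k = 4 adds the single pair (c, d) that links the two groups. By k = 6 all 15 pairs are present.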

Slide 26 of 29: Literature Review of Related Methods
Additive and Ultrametric Trees [Wu and Chao, 2004]
Minimum Increment Evolution Tree (MEIT) [Wu and Chao, 2004]
Evolutionary Tree Insertion with Minimum Increment (ETIMI) [Wu and Chao, 2004]
Maximum Homeomorphic Agreement Subtree (MHT) [Gąsieniec et al, 1997]
Maximum Agreement Subtree (MAST) [Gąsieniec et al, 1997]
Maximum Inferred Consensus Tree (MICT) [Lingas et al, 1999]
Maximum Inferred Local Consensus Tree (MILCT) [Lingas et al, 1999]
Balanced Randomized Tree Splitting (BRTS) [Kao et al, 1999]
Merging Partial Evolution Trees (MPET) [Lingas et al, 1999]

Slide 27 of 29: Future Work
Additional class assignments
Implement the Phylogeny Reconstruction Simulator using NetworkX
Remove the redundant-node and isomorphic-approximation assumptions

Slide 28 of 29: References
[Br06a] Brandstädt, A. and V.B. Le (2006). “Structure and Linear Time Recognition of 3-Leaf Powers,” Information Processing Letters 98, pp. 133-138.
[Br06b] Brandstädt, A., V.B. Le, and R. Sritharan (2005). “Structure and Linear Time Recognition of 4-Leaf Powers,” unpublished manuscript.
[Fe04] Felsenstein, J. (2004). Inferring Phylogenies, Sinauer Associates, Inc.
[Ga97] Gąsieniec, L., J. Jansson, A. Lingas, and A. Östlin (1997). “On the Complexity of Computing Evolutionary Trees,” Proceedings of Computing and Combinatorics, Third Annual International Conference (COCOON ’97), Shanghai, China, pp. 134-145, August 1997.
[Ka99] Kao, Y., A. Lingas, and A. Östlin (1999). “Balanced Randomized Tree Splitting with Applications to Evolutionary Tree Constructions,” Proceedings of the 16th Annual Symposium on Theoretical Aspects of Computer Science, Trier, Germany, pp. 184-196, March 1999.
[Li99] Lingas, A., H. Olsson, and A. Östlin (1999). “Efficient Merging, Construction, and Maintenance of Evolutionary Trees,” Proceedings of the 26th International Colloquium on Automata, Languages, and Programming (ICALP ’99), Prague, Czech Republic, pp. 544-553, July 1999.
[Re99] Read, R.C. and R.J. Wilson (1999). An Atlas of Graphs, Oxford Science Publications.
[Wu04] Wu, B.Y. and K.M. Chao (2004). Spanning Trees and Optimization Problems, Chapman & Hall/CRC.

Slide 29 of 29: Thank You
The full text of the paper, assignments, this presentation, and student examples are available on the author’s web page: http://www.cs.kent.edu/~dzeller/research