MATH:7450 (22M:305) Topics in Topology: Scientific and Engineering Applications of Algebraic Topology Nov 15, 2013: Brief intro to tangles & Phylogeny.

Presentation transcript:

MATH:7450 (22M:305) Topics in Topology: Scientific and Engineering Applications of Algebraic Topology. Nov 15, 2013: Brief intro to tangles & Phylogeny and Persistent Homology. Fall 2013 course offered through the University of Iowa Division of Continuing Education. Isabel K. Darcy, Department of Mathematics, Applied Mathematical and Computational Sciences, University of Iowa. http://www.math.uiowa.edu/~idarcy/AppliedTopology.html

Recombination:

Different recombinases have different topological mechanisms. Ex: Cre recombinase can act on both directly and inversely repeated sites. Xer recombinase acting on psi sites gives a unique product: it uses a topological filter to perform only deletions, not inversions.

PNAS 2013

Tangle Analysis of Protein-DNA complexes

Mathematical model: protein = three-dimensional ball; protein-bound DNA = strings.

C. Ernst, D. W. Sumners, A calculus for rational tangles: applications to DNA recombination, Math. Proc. Camb. Phil. Soc. 108 (1990), 489-515.

Protein-DNA complex: Heichman and Johnson. Slide (modified) from Soojeong Kim.

Solving tangle equations. Tangle equation from: Pathania S, Jayaram M, Harshey RM. Path of DNA within the Mu transpososome. Transposase interactions bridging two Mu ends and the enhancer trap five DNA supercoils. Cell. 2002 May 17;109(4):425-36.

vol. 110 no. 46, 18566–18571, 2013 http://www.pnas.org/content/110/46/18566.full

Background

http://ghr.nlm.nih.gov/handbook/mutationsanddisorders/possiblemutations


Recombination:

Homologous recombination http://en.wikipedia.org/wiki/File:HR_in_meiosis.svg

http://www.web-books.com/MoBio/Free/Ch8D2.htm

Where do we get distances from?

Distances can be derived from Multiple Sequence Alignments (MSAs). The most basic distance is just a count of the number of sites which differ between two sequences, divided by the sequence length. These are sometimes known as p-distances.

         Dog   Rat   Cow
  Cat    0.2   0.4   0.7
  Dog          0.5   0.6
  Rat                0.3

  Cat  ATTTGCGGTA
  Dog  ATCTGCGATA
  Rat  ATTGCCGTTT
  Cow  TTCGCTGTTT

http://www.allanwilsoncentre.ac.nz/massey/fms/AWC/download/SK_DistanceBasedMethods.ppt
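As a rough illustration of the p-distance definition on this slide, the sketch below counts mismatched sites between each pair of the four aligned sequences and divides by the alignment length. The function names `p_distance` and `p_distance_matrix` are made up for this example and do not come from the lecture.

```python
from itertools import combinations

# The four aligned sequences shown on the slide (all of length 10).
alignment = {
    "Cat": "ATTTGCGGTA",
    "Dog": "ATCTGCGATA",
    "Rat": "ATTGCCGTTT",
    "Cow": "TTCGCTGTTT",
}


def p_distance(seq_a, seq_b):
    """Fraction of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return mismatches / len(seq_a)


def p_distance_matrix(aln):
    """p-distance for every unordered pair of names in the alignment."""
    return {(a, b): p_distance(aln[a], aln[b]) for a, b in combinations(aln, 2)}


if __name__ == "__main__":
    for (a, b), d in p_distance_matrix(alignment).items():
        print(f"{a}-{b}: {d:.1f}")
```

Run on the slide's alignment, this gives the same six values as the slide's distance table. The sketch treats every mismatch equally and does not handle gaps, which a real MSA pipeline would need to decide how to count.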

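Perfectly “tree-like” distances

         Cat   Dog   Rat
  Dog     3
  Rat     4     5
  Cow     6     7     6

Tree: Cat (branch length 1) and Dog (2) attach to one internal node; Rat (2) and Cow (4) attach to the other; the internal edge joining the two nodes has length 1. Every entry in the table equals the path length between the two leaves in this tree.

http://www.allanwilsoncentre.ac.nz/massey/fms/AWC/download/SK_DistanceBasedMethods.ppt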
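One way to see what "perfectly tree-like" means is to compute leaf-to-leaf path lengths in a tree and compare them with the distance table. The sketch below assumes a particular placement of the slide's branch lengths (Cat 1, Dog 2, internal edge 1, Rat 2, Cow 4); that placement is inferred from the numbers on the slide, and the internal node names `u` and `v` are hypothetical. It also checks the four-point condition, which holds exactly when the two largest of the three pairwise sums are equal.

```python
from itertools import combinations

# Assumed tree: Cat and Dog hang off internal node "u", Rat and Cow off "v".
edges = {
    ("Cat", "u"): 1,
    ("Dog", "u"): 2,
    ("u", "v"): 1,
    ("Rat", "v"): 2,
    ("Cow", "v"): 4,
}


def build_adjacency(edge_lengths):
    """Undirected adjacency list with branch lengths."""
    adj = {}
    for (a, b), w in edge_lengths.items():
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    return adj


def path_length(adj, start, goal):
    """Length of the unique path between two nodes of a tree (depth-first search)."""
    stack = [(start, None, 0)]
    while stack:
        node, parent, dist = stack.pop()
        if node == goal:
            return dist
        for nxt, w in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, dist + w))
    raise ValueError("nodes are not connected")


def leaf_distances(edge_lengths, leaves):
    """Path length between every unordered pair of leaves."""
    adj = build_adjacency(edge_lengths)
    return {(a, b): path_length(adj, a, b) for a, b in combinations(leaves, 2)}


def four_point_holds(d, a, b, c, e):
    """Four-point condition: the two largest of the three pair sums are equal."""
    def get(x, y):
        return d[(x, y)] if (x, y) in d else d[(y, x)]
    sums = sorted([get(a, b) + get(c, e), get(a, c) + get(b, e), get(a, e) + get(b, c)])
    return sums[1] == sums[2]


if __name__ == "__main__":
    d = leaf_distances(edges, ["Cat", "Dog", "Rat", "Cow"])
    print(d)
    print("four-point condition:", four_point_holds(d, "Cat", "Dog", "Rat", "Cow"))
```

Under this assumed tree, the computed path lengths reproduce the five entries readable on the slide (3, 4, 5, 6, 7), and the Rat-Cow distance comes out as 2 + 4 = 6.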


Figure: the same Cat/Dog/Rat/Cow distance table (3, 4, 5, 6, 7) repeated beside trees on the four taxa, with branch labels 1, 2, 4.

Linking algebraic topology to evolution. (A) A tree depicting vertical evolution. (B) A reticulate structure capturing horizontal evolution, as well. (C) A tree can be compressed into a point. (D) The same cannot be done for a reticulate structure without destroying the hole at the center. Chan J M et al. PNAS 2013;110:18566-18571 ©2013 by National Academy of Sciences

Reticulation (label added to the same figure, Linking algebraic topology to evolution). Chan J M et al. PNAS 2013;110:18566-18571 ©2013 by National Academy of Sciences
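The contrast between panels (C) and (D) can be made concrete for graphs: a connected tree has no independent loops, while a reticulate graph has at least one. The sketch below computes b0 (number of connected components) and b1 (number of independent cycles) of a graph using the standard identity b1 = E - V + b0. The two small example graphs and their node names are invented for illustration and are not taken from the paper.

```python
def graph_betti_numbers(vertices, edges):
    """Return (b0, b1) for a simple undirected graph.

    b0 is the number of connected components (found with union-find);
    b1 = E - V + b0 is the number of independent cycles.
    Assumes vertices and edges contain no duplicates.
    """
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    components = len(parent)
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1

    b0 = components
    b1 = len(edges) - len(vertices) + components
    return b0, b1


# Hypothetical example: a small tree (vertical evolution only).
tree_vertices = ["root", "a", "b", "c", "d"]
tree_edges = [("root", "a"), ("root", "b"), ("a", "c"), ("a", "d")]

# Same tree plus one extra edge joining two lineages (a reticulate event).
reticulate_edges = tree_edges + [("c", "b")]

if __name__ == "__main__":
    print("tree:", graph_betti_numbers(tree_vertices, tree_edges))
    print("reticulate:", graph_betti_numbers(tree_vertices, reticulate_edges))
```

The tree gives b1 = 0, consistent with it being contractible to a point; adding the single cross edge gives b1 = 1, the "hole at the center" that cannot be removed by contraction.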

Reassortment http://www.virology.ws/2009/06/29/reassortment-of-the-influenza-virus-genome/

Reconstructing phylogeny from persistent homology of avian influenza HA.

(A) Barcode plot in dimension 0 of all avian HA subtypes. Each bar represents a connected simplex of sequences given a Hamming distance of ε. When a bar ends at a given ε, it merges with another simplex. Gray bars indicate that two simplices of the same HA subtype merge together at a given ε. Solid color bars indicate that two simplices of different HA subtypes but same major clade merge together. Interpolated color bars indicate that two simplices of different major clades merge together. Colors correspond to known major clades of HA. For specific parameters, see SI Appendix, Supplementary Text.

(B) Phylogeny of avian HA reconstructed from the barcode plot in A. Major clades are color-coded.

(C) Neighbor-joining tree of avian HA (SI Appendix, Supplementary Text).

Chan J M et al. PNAS 2013;110:18566-18571 ©2013 by National Academy of Sciences
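The dimension-0 barcode described in panel (A) can be sketched with a union-find pass over sorted pairwise distances: every sequence starts its own bar at ε = 0, and a bar ends at the ε where its component merges into another. This is the usual single-linkage view of 0-dimensional persistence for a Vietoris-Rips filtration, not the authors' actual pipeline. The toy sequences below are placeholders, not influenza HA data, and the name `h0_barcode` is hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def h0_barcode(seqs):
    """Dimension-0 bars (birth, death) for a Hamming-distance Rips filtration.

    All components are born at 0. Processing pairs in order of increasing
    distance, each merge of two distinct components ends one bar at that
    distance. One component never dies, giving a single infinite bar.
    """
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    pairs = sorted(
        (hamming(seqs[a], seqs[b]), a, b) for a, b in combinations(names, 2)
    )
    deaths = []
    for dist, a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            deaths.append(dist)

    return [(0, d) for d in deaths] + [(0, float("inf"))]


# Placeholder toy alignment (NOT influenza data).
toy = {
    "s1": "ATTTGCGGTA",
    "s2": "ATCTGCGATA",
    "s3": "ATTGCCGTTT",
    "s4": "TTCGCTGTTT",
}

if __name__ == "__main__":
    for bar in h0_barcode(toy):
        print(bar)
```

Reading the finite death values from smallest to largest gives the order in which clusters join, which is the information a tree reconstruction from a dimension-0 barcode would draw on.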

Persistent homology of reassortment in avian influenza.

Analysis of (A) HA and (B) NA reveal no significant one-dimensional topological structure. (C) Concatenated segments reveal rich 1D and 2D topology, indicating reassortment. For specific parameters, see SI Appendix, Supplementary Text.

(D) Network representing the reassortment pattern of avian influenza deduced from high-dimensional topology. Line width is determined by the probability that two segments reassort together. Node color ranges from blue to red, correlating with the sum of connected line weights for a given node. For specific parameters, see SI Appendix, Supplementary Text.

(E) b2 polytope representing the triple reassortment of H7N9 avian influenza. Concatenated genomic sequences forming the polytope were transformed into 3D space using PCoA (SI Appendix, Supplementary Text). Two-dimensional barcoding was performed using Vietoris–Rips complex and a maximum scale ε of 4,000 nucleotides.

Chan J M et al. PNAS 2013;110:18566-18571 ©2013 by National Academy of Sciences

Persistent homology of reassortment in avian influenza (figure repeated). http://www.pnas.org/content/110/46/18566.full http://www.sciencemag.org/content/312/5772/380.full http://www.virology.ws/2009/04/30/structure-of-influenza-virus/

Reassortment http://www.virology.ws/2009/06/29/reassortment-of-the-influenza-virus-genome/

Barcoding plots of HIV-1 reveal evidence of recombination in (A) env, (B) gag, (C) pol, and (D) the concatenated sequences of all three genes. One-dimensional topology present for alignments of individual genes as well as the concatenated sequences suggests recombination.

(E) b2 polytope representing a complex recombination event with multiple parental strains. Sequences of the [G4] generator of concatenated HIV-1 gag, pol, and env were transformed into 3D space using PCoA (SI Appendix, Supplementary Text) of the Nei–Tamura pairwise genetic distances. Two-simplices from the [G4] generator defined a polytope whose cavity represents a complex recombination. Each vertex of the polytope corresponds to a sequence that is color-coded by HIV-1 subtype. For specific parameters, see SI Appendix, Supplementary Text.

Chan J M et al. PNAS 2013;110:18566-18571 ©2013 by National Academy of Sciences

Left-over slides

Persistent homology characterizes topological features of vertical and horizontal evolution. Evolution was simulated with and without reassortment (SI Appendix, Supplementary Text).

(A) A metric space of pairwise genetic distances d(i,j) can be calculated for a given population of genomic sequences g1,…, gn. We visualize these data points using principal coordinate analysis (PCoA) (SI Appendix, Supplementary Text).

(B) In the construction of simplicial complexes, two genomes are considered related (joined by a line) if their genetic distance is smaller than ε. Three genomes within ε of each other form a triangle, and so on (SI Appendix, Supplementary Text). From there, we calculate the homology groups at different genetic scales. In the barcode, each bar in different dimensions represents a topological feature of a filtration of simplicial complexes persisting over an interval of ε. A one-dimensional cycle (red highlight) exists at ε = [0.13, 0.16] (Hamming distance) and corresponds to a reticulate event. The evolutionary scales I where b1 = 0 are highlighted in gray.

Chan J M et al. PNAS 2013;110:18566-18571 ©2013 by National Academy of Sciences
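The construction in panel (B) can be sketched at a single scale ε: join two sequences by an edge when their distance is smaller than ε, fill in a triangle when all three pairwise distances are smaller than ε, then read off b0 and b1 from the ranks of the boundary maps. The sketch below works over GF(2) and stops at triangles, which is enough for b0 and b1. The four two-letter "genomes" are invented for illustration, with `TT` playing the role of a recombinant of `AT` and `TA`. This is not the software used in the paper.

```python
from itertools import combinations


def hamming(a, b):
    """Number of aligned sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    basis = {}  # highest set bit -> reduced row with that leading bit
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in basis:
                basis[top] = row
                break
            row ^= basis[top]
    return len(basis)


def rips_betti(seqs, eps):
    """(b0, b1) of the Vietoris-Rips complex at scale eps, GF(2) coefficients.

    Following the caption, two points are joined when distance < eps.
    """
    n = len(seqs)
    edges = [
        (i, j) for i, j in combinations(range(n), 2)
        if hamming(seqs[i], seqs[j]) < eps
    ]
    edge_index = {e: k for k, e in enumerate(edges)}
    triangles = [
        (a, b, c) for a, b, c in combinations(range(n), 3)
        if (a, b) in edge_index and (a, c) in edge_index and (b, c) in edge_index
    ]
    # Boundary of an edge is its two endpoints (bitmask over vertices).
    d1 = [(1 << i) | (1 << j) for i, j in edges]
    # Boundary of a triangle is its three edges (bitmask over edges).
    d2 = [
        (1 << edge_index[(a, b)]) | (1 << edge_index[(a, c)]) | (1 << edge_index[(b, c)])
        for a, b, c in triangles
    ]
    r1, r2 = rank_gf2(d1), rank_gf2(d2)
    b0 = n - r1
    b1 = len(edges) - r1 - r2
    return b0, b1


# Hypothetical toy genomes: neighbours on the square differ at one site,
# opposite corners differ at two sites.
genomes = ["AA", "AT", "TT", "TA"]

if __name__ == "__main__":
    for eps in (0.5, 1.5, 2.5):
        print(eps, rips_betti(genomes, eps))
```

In this toy case the one-dimensional cycle is present at ε = 1.5 and gone at ε = 2.5, once the diagonals and triangles fill it in. Sweeping ε and recording where b1 is nonzero gives the kind of interval a barcode bar represents.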

The tangle equations corresponding to the electron micrograph:

Different recombinases have different topological mechanisms:

TopoICE in Rob Scharein’s KnotPlot.com. There are an infinite number of solutions to [tangle equation; figure not in transcript]. Can solve by using TopoICE in Rob Scharein’s KnotPlot.com.

http://en.wikipedia.org/wiki/File:Cdmb.svg

http://en.wikipedia.org/wiki/File:Centraldogma_nodetails.GIF http://ghr.nlm.nih.gov/handbook/mutationsanddisorders/possiblemutations


http://www.sciencemag.org/content/277/5326/690.full.pdf, Science

www.biophysics.org/Portals/1/PDFs/Education/Vologodskii.pdf

Knotplot.com


GEL VELOCITY IDENTIFIES KNOT COMPLEXITY. Vologodskii et al, JMB 278 (1998), 1

DNA is Crowded in the Cell

Radial Loop Chromosome

Replication Obstruction

DNA Structure

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC38453/?page=1

Typical simulated conformation of a knotted DNA with a hairpin G segment (red). Another segment of the 7-kb model chain is inside the hairpin in this conformation, which was selected from the set generated by a Metropolis Monte Carlo procedure. Vologodskii A V et al. PNAS 2001;98:3045-3049 ©2001 by National Academy of Sciences