Poster: Session B #114: 1pm-2pm

Presentation transcript:

HopLand: Single-cell pseudotime recovery using continuous Hopfield network based modeling of Waddington’s epigenetic landscape
Jing Guo (1,2) and Jie Zheng (1,3,4)
1 Biomedical Informatics Lab, School of Computer Science and Engineering, Nanyang Technological University
2 Bioinformatics Institute, Agency for Science, Technology, and Research (A*STAR)
3 Genome Institute of Singapore, A*STAR
4 Complexity Institute, Nanyang Technological University
ISMB/ECCB 2017

Pseudo-time recovery in single-cell data: What
Input: D (size N*S)
Output: Tree, where Tree = Mapping(D)
Figure labels: Biological Time, Physical Time, multiple time points.

Pseudo-time recovery in single-cell data: Why
Understand the transition of gene expression profiles.
Increase the temporal resolution.

Current methods
Wanderlust (Bendall et al., 2014)
Wishbone (Setty et al., 2016)
Monocle (Trapnell et al., 2014)
Diffusion map (Haghverdi et al., 2015)
Dimensionality reduction
SCUBA (Marco et al., 2014)
Modeling system dynamics
Topslam (Zwiessele and Lawrence, 2016)
Waddington’s epigenetic landscape
Current methods… 1. Common… 2. Specific one…
Summary: dimensionality reduction; mapping cells to the latent space.
Drawbacks… 1. Common… 2. Specific one…

The overview of HopLand
A landscape is inferred from single-cell sequencing data.
Individual cells are placed on the surface regions corresponding to their developmental stages.
The order of cells is determined by the geodesic distances in the landscape.

Waddington’s epigenetic landscape
Attractor system
Figure from Mohammad and Baylin (2010).

Continuous Hopfield network (John Hopfield, 1984)
1) It uses continuous variables and predicts continuous responses.
2) Neurons correspond to genes (Neurons == Genes), so the network describes the regulatory function of each gene.
The discrete Hopfield network (John Hopfield, 1982), which uses discrete variables and predicts discrete responses, has been used to study biological systems with each neuron representing a gene (Taherian Fard et al., 2016; Maetschke and Ragan, 2014; Lang et al., 2014).
Figure label: W.
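To make the "neurons == genes" idea concrete, here is a minimal sketch of a continuous Hopfield network in the general form of Hopfield's 1984 model, C_i du_i/dt = -u_i/R_i + sum_j W_ij g(u_j) + I_i, with g = tanh. It is not HopLand's actual model or code: the function names, the tanh activation, the Euler integrator and the two-gene mutual-inhibition example are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_chn(W, I, C, u0, R=1.0, dt=0.01, steps=2000):
    """Euler integration of C du/dt = -u/R + W @ g(u) + I with g = tanh.

    W, I, C, R and the integrator settings are illustrative assumptions.
    Returns the trajectory of outputs v = tanh(u), shape (steps + 1, n_genes).
    """
    u = np.asarray(u0, dtype=float).copy()
    traj = [np.tanh(u)]
    for _ in range(steps):
        v = np.tanh(u)                      # continuous "expression" output
        u = u + dt * (-u / R + W @ v + I) / C
        traj.append(np.tanh(u))
    return np.array(traj)

def energy(W, I, v, R=1.0):
    """Hopfield (1984) energy for g = tanh and symmetric W.

    The last term is (1/R) * sum_i of the integral of arctanh from 0 to v_i,
    which equals v*arctanh(v) + 0.5*log(1 - v^2).
    """
    v = np.clip(v, -1 + 1e-9, 1 - 1e-9)
    leak = np.sum(v * np.arctanh(v) + 0.5 * np.log1p(-v ** 2)) / R
    return float(-0.5 * v @ W @ v - I @ v + leak)

# Hypothetical two-gene toggle switch: each gene represses the other.
W = np.array([[0.0, -2.0],
              [-2.0, 0.0]])
I = np.zeros(2)
traj = simulate_chn(W, I, C=1.0, u0=[0.1, -0.05])
E_start = energy(W, I, traj[0])
E_end = energy(W, I, traj[-1])
```

With a symmetric W the energy cannot increase along the continuous dynamics, so the state settles into an attractor; in this toy case one gene ends up high and the other low. That attractor picture is what links the network to a landscape with valleys.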

1. Data preprocessing
Input: temporal single-cell data (physical time). Data types: qPCR, scRNA-seq.
Identify differentially expressed genes. Tools: edgeR, DESeq, MAST, SCDE…
Issues: dropout effect and over-dispersion. Genes are detected in only a subset of cells, and such dropout events are thought to be rooted in the stochasticity of single-cell library preparation [8].
(Scadden, 2014, Nature Methods)

2. Parameter estimation
Figure labels: CHN, initial parameters, generate real trajectories, optimize objective function, W.
There are several parameters in the ODE model of kinetics in Eq. 4.1. To infer these parameters, including I_i, C_i and W_ij (i, j = 1, 2, ..., N), from the sequencing data, an optimization method (Algorithm 4) is proposed that fits the simulated single-cell data to the observed data. It is based on the premise that a realistic model should be able to generate simulated data consistent with the real data.
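The premise above (a good parameter set should reproduce the observed data when simulated) can be illustrated with a small fitting loop. This is only a sketch under stated assumptions and is not Algorithm 4: the objective is a plain mean squared error between simulated and observed expression at a few time indices, the optimizer is a naive random-search hill climb swapped in for whatever the authors use, and the ODE is a simplified Hopfield-style model with only W and I as free parameters. All names are hypothetical.

```python
import numpy as np

def simulate(W, I, u0, dt=0.05, steps=200):
    """Euler-integrate du/dt = -u + W @ tanh(u) + I; return outputs tanh(u)."""
    u = np.asarray(u0, dtype=float).copy()
    out = []
    for _ in range(steps):
        u = u + dt * (-u + W @ np.tanh(u) + I)
        out.append(np.tanh(u))
    return np.array(out)

def objective(params, n, u0, observed, idx):
    """Mean squared error between simulated and observed data at indices idx."""
    W = params[:n * n].reshape(n, n)
    I = params[n * n:]
    sim = simulate(W, I, u0)
    return float(np.mean((sim[idx] - observed) ** 2))

def fit(n, u0, observed, idx, iters=300, step=0.1, seed=0):
    """Random-search hill climbing: keep a perturbation only if it lowers the loss."""
    rng = np.random.default_rng(seed)
    best = np.zeros(n * n + n)
    best_loss = objective(best, n, u0, observed, idx)
    for _ in range(iters):
        cand = best + step * rng.standard_normal(best.size)
        loss = objective(cand, n, u0, observed, idx)
        if loss < best_loss:
            best, best_loss = cand, loss
    return best, best_loss

# Synthetic "observed" data generated from known (hypothetical) parameters.
n = 2
W_true = np.array([[0.0, -2.0], [-2.0, 0.0]])
I_true = np.array([0.2, 0.0])
u0 = np.array([0.1, -0.05])
idx = [49, 99, 199]                       # sampled time indices
observed = simulate(W_true, I_true, u0)[idx]

true_params = np.concatenate([W_true.ravel(), I_true])
loss_true = objective(true_params, n, u0, observed, idx)
loss0 = objective(np.zeros(n * n + n), n, u0, observed, idx)
best, best_loss = fit(n, u0, observed, idx)
```

Random search is used here only because it needs no gradients and fits in a few lines; a gradient-based optimizer would be the more usual choice for a model with many genes.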

3. Landscape construction
Dimensionality reduction (GP-LVM)
Generate a grid
Get high-dimensional vectors
Calculate energies
Plot the landscape
Figure labels: Altitude, Comp2, Comp1, Interpolation.
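The steps listed above can be sketched roughly as follows. Two substitutions are made and should be read as assumptions: PCA stands in for GP-LVM (it is linear, so mapping grid points back to gene space is a single matrix product), and the altitude is the standard Hopfield energy for a tanh activation rather than HopLand's own energy function. The expression matrix is assumed to be scaled into (-1, 1), and the function names are hypothetical.

```python
import numpy as np

def pca_fit(X, k=2):
    """Return the data mean and the top-k principal axes (rows)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def landscape(X, W, I, grid_size=20):
    """Build a 2-D energy landscape from cells X (n_cells x n_genes)."""
    # 1. Dimensionality reduction (PCA as a stand-in for GP-LVM).
    mean, comps = pca_fit(X, k=2)
    Z = (X - mean) @ comps.T
    # 2. Generate a grid covering the latent space.
    g1 = np.linspace(Z[:, 0].min(), Z[:, 0].max(), grid_size)
    g2 = np.linspace(Z[:, 1].min(), Z[:, 1].max(), grid_size)
    G1, G2 = np.meshgrid(g1, g2)
    grid = np.column_stack([G1.ravel(), G2.ravel()])
    # 3. Map grid points back to high-dimensional expression vectors.
    V = np.clip(grid @ comps + mean, -0.999, 0.999)
    # 4. Calculate the energy (altitude) of every grid point.
    quad = -0.5 * np.einsum("ni,ij,nj->n", V, W, V)
    leak = np.sum(V * np.arctanh(V) + 0.5 * np.log1p(-V ** 2), axis=1)
    E = quad - V @ I + leak
    # 5. Return arrays ready for a surface plot (Comp1, Comp2, Altitude).
    return G1, G2, E.reshape(G1.shape)

# Hypothetical data: 50 cells, 4 genes, symmetric random interaction matrix.
rng = np.random.default_rng(0)
X = rng.uniform(-0.9, 0.9, size=(50, 4))
A = rng.standard_normal((4, 4))
W = (A + A.T) / 2
I = np.zeros(4)
G1, G2, E = landscape(X, W, I)
```

The three returned arrays can be passed directly to a surface-plotting routine, with the two latent components as the horizontal axes and the energy as the altitude.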

4. Pseudotime recovery
Calculate geodesic distances
Build the minimum spanning tree
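The two steps above can be sketched on a toy example. The slide does not say how HopLand discretizes the surface or how it combines the geodesic distances with the tree, so the following is an assumption throughout: cells are points (Comp1, Comp2, Altitude), a k-nearest-neighbour graph approximates the surface, Dijkstra's algorithm gives graph geodesics from a chosen root cell, and Prim's algorithm builds a minimum spanning tree on the same graph.

```python
import heapq
import numpy as np

def knn_graph(P, k=3):
    """Symmetric k-nearest-neighbour graph as {node: {neighbour: length}}."""
    D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
    graph = {i: {} for i in range(len(P))}
    for i in range(len(P)):
        for j in np.argsort(D[i])[1:k + 1]:     # position 0 is the point itself
            graph[i][int(j)] = graph[int(j)][i] = float(D[i, j])
    return graph

def geodesic_from(graph, src):
    """Dijkstra shortest-path lengths from src, used as graph geodesics."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                             # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def minimum_spanning_tree(graph, root=0):
    """Prim's algorithm; returns a list of (parent, child, length) edges."""
    seen = {root}
    heap = [(w, root, v) for v, w in graph[root].items()]
    heapq.heapify(heap)
    edges = []
    while heap:
        w, u, v = heapq.heappop(heap)
        if v in seen:
            continue
        seen.add(v)
        edges.append((u, v, w))
        for x, wx in graph[v].items():
            if x not in seen:
                heapq.heappush(heap, (wx, v, x))
    return edges

# Hypothetical cells rolling down a valley: altitude falls as t grows.
t = np.linspace(0.0, 1.0, 30)
P = np.column_stack([t, 0.3 * np.sin(3 * t), 1.0 - t])
graph = knn_graph(P, k=3)
dist = geodesic_from(graph, src=0)
order = sorted(dist, key=dist.get)               # pseudotime ordering
tree = minimum_spanning_tree(graph, root=0)
```

Here `order` ranks cells by geodesic distance from the root, which serves as a simple pseudotime; choosing the root cell (for example, a cell from the earliest sampled stage) is left to the user in this sketch.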

Evaluation
Synthetic dataset:
Applied to real single-cell gene expression data from different types of biological experiments and compared against other methods, HopLand outperformed most of them in most cases. In addition, the method can be used to identify key regulators and interactions, which helps in understanding the underlying mechanisms.

Dataset: mouse pre-implantation development
Expression profiles of 438 cells, with 48 genes per cell, covering development from the 1-cell stage to the 64-cell stage (Guo et al., 2010).

(a) Waddington’s epigenetic landscape recovered using HopLand. (b) The minimum spanning tree constructed from Waddington’s epigenetic landscape. The dots are colored according to the developmental stages of the cells in the GUO2010 dataset.

Mapping gene expression values to the landscape using (a) FGF4, (b) GATA4, (c) CDX2 and (d) SOX2. The value decreases from dark red to white.

The core network found by searching GeneGo MetaCore for interactions overlapping with HopLand’s top interactions and genes.

Summary
Continuous Hopfield network: provides a non-linear mapping between Waddington's epigenetic landscape and the phenotype space.
Waddington’s epigenetic landscape: the landscape is constructed from biological interactions between genes, which makes it possible to simulate real biological processes.
Geodesic distance: the geodesic distances in the landscape are used to refine the estimated distances between cells.
Waddington's epigenetic landscape, which merges the genotype-phenotype mapping with the latest developments in genomics on transcriptional regulation, epigenetic modification and signalling pathways, is applied to analyze and visualize the system kinetics.
Limitations & Future work:
Supervised method ---- unsupervised clustering methods
Computationally costly ---- stochastic gradient descent (SGD), HPC
Data noise ---- different types of sequencing technologies

Acknowledgement
Jie Zheng
The SCE-BII joint PhD program