A Framework for Modelling Short, High-Dimensional Multivariate Time Series: Preliminary Results in Virus Gene Expression Data Analysis

Presentation transcript:

1 A Framework for Modelling Short, High-Dimensional Multivariate Time Series: Preliminary Results in Virus Gene Expression Data Analysis
Paul Kellam (1), Xiaohui Liu (2), Nigel Martin (3), Christine Orengo (4), Stephen Swift (2), Allan Tucker (2)
(1) Dept of Immunology and Molecular Pathology, UCL, UK
(2) Dept of Information Systems and Computing, Brunel University, UK
(3) Dept of Computer Science, Birkbeck College, London, WC1E 7HX, UK
(4) Dept of Biochemistry and Molecular Biology, UCL, WC1E 6BT, UK

2 Framework
[Flow diagram: Expression Data → Clustering Algorithms → (Clusters) → Cluster Fusion → (Robust Clusters) → Model Building → Forecasts and Explanations]

3 Clustering Algorithms
• Hierarchical
• The Grouping Genetic Algorithm
• K-Means
• The Self Organising Map

4 Cluster Fusion (1)
[Diagram: Cluster Method 1, Cluster Method 2, ..., Cluster Method N → Construct Agreement Matrix → Clusterfusion]

5 The Agreement Matrix
[Figure: the agreement matrix F, with axes labelled "To Gene" and "From Gene"]
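The slides do not reproduce the entries of F, so the following is only a minimal sketch of one common way to build an agreement matrix: entry (i, j) counts how many clustering methods place gene i and gene j in the same cluster. The function name `agreement_matrix` and the toy labelings are hypothetical and are not taken from the presentation.

```python
from typing import List, Sequence


def agreement_matrix(labelings: Sequence[Sequence[int]]) -> List[List[int]]:
    """Count, for every gene pair, how many clusterings co-cluster the pair.

    Assumption: each clustering is given as a list of cluster labels,
    one label per gene, in the same gene order for every method.
    """
    n_genes = len(labelings[0])
    if any(len(labels) != n_genes for labels in labelings):
        raise ValueError("every clustering must label the same genes")
    matrix = [[0] * n_genes for _ in range(n_genes)]
    for labels in labelings:
        for i in range(n_genes):
            for j in range(n_genes):
                # The diagonal is left at zero: a gene is not compared with itself.
                if i != j and labels[i] == labels[j]:
                    matrix[i][j] += 1
    return matrix


# Hypothetical toy input: three clustering methods applied to five genes.
toy_labelings = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
]
toy_agreement = agreement_matrix(toy_labelings)
```

In this toy example genes 0 and 1 are grouped together by all three methods, so their entry equals the number of methods, while genes 0 and 3 are never grouped together.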

6 Viral Gene Expression Data
• Kaposi's Sarcoma-Associated Human Herpesvirus 8 (HHV8)
• 106 viral and human genes
• Induced with 12-O-Tetradecanoylphorbol 13-Acetate (TPA)
• 13 measurements over time
• Normalised expression levels

7 Evaluation
• Compare cluster similarity using Weighted-Kappa
• Compare clusters against biological domain knowledge
• Clusterfusion
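The slides do not state how Weighted-Kappa is computed. As a hedged illustration, the sketch below assumes that two clusterings are compared through pairwise co-membership: every gene pair is classed as "same cluster" or "different cluster" under each method, giving a 2x2 table. For a 2x2 table with symmetric weights, weighted kappa coincides with Cohen's kappa, which is what the hypothetical function `pairwise_kappa` returns.

```python
from itertools import combinations
from typing import Sequence


def pairwise_kappa(labels_a: Sequence[int], labels_b: Sequence[int]) -> float:
    """Cohen's kappa on pairwise co-membership of two clusterings.

    Assumption: agreement is judged per gene pair (same cluster or not),
    which may differ from the exact metric used in the presentation.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same genes")
    if len(labels_a) < 2:
        raise ValueError("at least two genes are needed to form a pair")
    both = only_a = only_b = neither = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            both += 1
        elif same_a:
            only_a += 1
        elif same_b:
            only_b += 1
        else:
            neither += 1
    total = both + only_a + only_b + neither
    observed = (both + neither) / total
    p_a = (both + only_a) / total  # fraction of pairs co-clustered by method A
    p_b = (both + only_b) / total  # fraction of pairs co-clustered by method B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Degenerate case: both methods put all genes together or all apart.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy clusterings of four genes.
kappa_identical = pairwise_kappa([0, 0, 1, 1], [5, 5, 7, 7])
kappa_crossed = pairwise_kappa([0, 0, 1, 1], [0, 1, 0, 1])
```

Cluster labels are arbitrary, so two clusterings that induce the same partition score 1 even when their label values differ; values near or below zero indicate agreement no better than chance.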

8 Weighted-Kappa Results
Hx: Hierarchical Clustering with x Clusters
Kx: K-Means Clustering with x Clusters
Sx: Self Organising Map with x Clusters
Gx: Grouping Genetic Algorithm with x Clusters

9 Domain Knowledge Results

10 Clusterfusion Results
• 48 out of 106 genes unassigned
• Mostly pairs or triples
• Only 3 of feature 2 are present!
• Although there are some interesting results, e.g. unknown function genes placed with those of known function
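One way to see how a consensus step can leave many genes unassigned and produce mostly small groups is a strict full-agreement rule: two genes share a robust cluster only if every method puts them together. This rule is an assumption made for illustration; the slides do not spell out the Clusterfusion criterion, and the function name `full_agreement_groups` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def full_agreement_groups(
    labelings: Sequence[Sequence[int]],
) -> Tuple[List[List[int]], List[int]]:
    """Group genes that every clustering method places together.

    Assumption: "robust" means unanimous agreement across methods.
    Genes with no unanimous partner are reported as unassigned.
    """
    n_genes = len(labelings[0])
    groups: Dict[Tuple[int, ...], List[int]] = defaultdict(list)
    for gene in range(n_genes):
        # Genes with an identical label in every method share a signature.
        signature = tuple(labels[gene] for labels in labelings)
        groups[signature].append(gene)
    robust = [members for members in groups.values() if len(members) > 1]
    unassigned = [members[0] for members in groups.values() if len(members) == 1]
    return robust, unassigned


# Hypothetical toy input: three clustering methods applied to five genes.
robust_groups, unassigned_genes = full_agreement_groups(
    [
        [0, 0, 1, 1, 2],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1],
    ]
)
```

Because unanimous co-membership is transitive, grouping by signature is equivalent to intersecting the partitions. The more methods are combined, the finer the intersection becomes, so small groups and unassigned genes are the expected outcome under this rule.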

11 Modelling
• We have focussed on the Dynamic Bayesian Network
  • Models a temporal domain probabilistically
  • Consists of a graphical representation and conditional probability distributions
  • Facilitates the combining of expert knowledge and data
  • Models can be queried to investigate the relationships discovered from data
  • Requires data discretisation
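The slides say that the DBN requires discretised data but do not name the discretisation method. The sketch below assumes equal-width binning into a fixed number of states numbered from 1; both the method and the default of three states are illustrative assumptions, and `discretise` is a hypothetical name.

```python
from typing import List, Sequence


def discretise(series: Sequence[float], n_states: int = 3) -> List[int]:
    """Map a normalised expression series to states 1..n_states.

    Assumption: equal-width bins between the series minimum and maximum.
    """
    if n_states < 1:
        raise ValueError("n_states must be at least 1")
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series carries no variation; put every point in state 1.
        return [1] * len(series)
    width = (hi - lo) / n_states
    states = []
    for value in series:
        index = int((value - lo) / width)
        # The maximum value would fall just past the last bin, so clamp it.
        states.append(min(index, n_states - 1) + 1)
    return states


# Hypothetical normalised expression values for one gene.
toy_states = discretise([0.0, 0.1, 0.5, 0.9, 1.0], n_states=3)
```

Equal-frequency binning is a common alternative when expression values are heavily skewed; the choice affects how many observations support each entry of the conditional probability tables.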

12 Dynamic Bayesian Networks
[Figure: example DBN over genes g0 to g4 (axis: Genes) across time lags t-5 to t (axis: Time Lag)]
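A DBN link from one gene at time t-lag to another gene at time t carries a conditional probability distribution. As a minimal sketch, assuming discretised series and a single parent per child, that distribution can be estimated by relative frequency. The function name `lagged_cpt` and the toy series are hypothetical; the presentation does not describe its parameter estimation.

```python
from collections import Counter, defaultdict
from typing import Dict, Sequence


def lagged_cpt(
    parent: Sequence[int], child: Sequence[int], lag: int
) -> Dict[int, Dict[int, float]]:
    """Estimate P(child[t] = c | parent[t - lag] = p) by counting.

    Assumption: one parent per child and maximum-likelihood estimates
    without smoothing, which is fragile for series as short as 13 points.
    """
    if len(parent) != len(child):
        raise ValueError("parent and child series must have equal length")
    if not 0 <= lag < len(child):
        raise ValueError("lag must be between 0 and the series length minus 1")
    counts: Dict[int, Counter] = defaultdict(Counter)
    for t in range(lag, len(child)):
        counts[parent[t - lag]][child[t]] += 1
    cpt = {}
    for parent_state, child_counts in counts.items():
        total = sum(child_counts.values())
        cpt[parent_state] = {
            state: count / total for state, count in child_counts.items()
        }
    return cpt


# Hypothetical discretised series for two genes, linked with a lag of one step.
toy_cpt = lagged_cpt(
    parent=[1, 2, 1, 2, 1, 2],
    child=[1, 1, 2, 1, 2, 2],
    lag=1,
)
```

Each extra step of lag removes one usable observation, which matters when only 13 time points are available.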

13 Modelling Results Example DBNs (compact representation without lags included):

14 Forecast Results

15 Explanation
• Apply inference given observations about certain nodes:
  • Insert observations into DBN
  • Apply inference back in time
  • Construct explanations using posterior probabilities
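Inference "back in time" can be illustrated on the smallest possible case: one parent node at t-1 and one observed child node at t. The sketch below applies Bayes' rule to obtain the posterior over the parent state. The prior, the conditional probability table, and the function name `posterior_over_parent` are hypothetical, and a full DBN would need a general inference algorithm rather than this two-node shortcut.

```python
from typing import Dict


def posterior_over_parent(
    prior: Dict[int, float],
    cpt: Dict[int, Dict[int, float]],
    observed_child: int,
) -> Dict[int, float]:
    """Return P(parent[t-1] = s | child[t] = observed_child) via Bayes' rule.

    prior[s] is P(parent = s); cpt[s][c] is P(child = c | parent = s).
    """
    joint = {
        state: prior[state] * cpt[state].get(observed_child, 0.0)
        for state in prior
    }
    evidence = sum(joint.values())
    if evidence == 0.0:
        raise ValueError("the observation has zero probability under the model")
    return {state: value / evidence for state, value in joint.items()}


# Hypothetical two-state example: observing the child in state 2 at time t.
toy_posterior = posterior_over_parent(
    prior={1: 0.5, 2: 0.5},
    cpt={1: {1: 0.9, 2: 0.1}, 2: {1: 0.2, 2: 0.8}},
    observed_child=2,
)
```

Here the observation makes parent state 2 the more probable explanation, with posterior 0.4 / 0.45. Statements of the form "P(gene is state) = value" in an explanation are posteriors of this kind.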

16 Explanation - Results
An example explanation using a discovered DBN:
P(C7 is 2) = 1.000
P(H8 is 2) = 0.999
P(B12 is 2) =
P(C7 is 1) = 1.000
P(B6 is 2) =
P(A7 is 1) =
P(B12 is 1) =

17 Conclusions
• Modelling gene expression data is a challenging task
• Introduced a framework for modelling such data
• Encouraging preliminary results when applied to viral gene expression data
• More rigorous testing on different datasets

18 Acknowledgements
• Biotechnology and Biological Sciences Research Council (BBSRC), UK
• The Engineering and Physical Sciences Research Council (EPSRC), UK