Reverse engineering gene networks using singular value decomposition and robust regression. M. K. Stephen Yeung, Jesper Tegner, James J. Collins.


General idea
Reverse-engineer gene networks:
- on a genome-wide scale
- from a small amount of data
- with no prior knowledge
Use SVD to obtain a family of possible solutions, then use robust regression to choose among them.

If the system is near a steady state, its dynamics can be approximated by a linear system of N ODEs:

dx_i/dt = -λ_i x_i + Σ_j W_ij x_j + b_i + ξ_i,  i = 1, …, N

where
x_i = concentration of the ith mRNA (reflects the expression level of gene i)
λ_i = self-degradation rate
b_i = external stimulus
ξ_i = noise
W_ij = type and strength of the effect of the jth gene on the ith gene
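A minimal sketch of this model in code (not from the paper; the array names W, lam, b are illustrative):

```python
import numpy as np

def dxdt(x, W, lam, b, noise_scale=0.0, rng=None):
    """Right-hand side of the linearized model:
    dx_i/dt = -lam_i * x_i + sum_j W_ij * x_j + b_i + xi_i."""
    xi = 0.0
    if noise_scale > 0.0:
        rng = rng or np.random.default_rng()
        xi = noise_scale * rng.standard_normal(len(x))
    return -lam * x + W @ x + b + xi
```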

Assumptions made:
- Connections are not time-dependent (so W is constant) and are not changed by the experiments
- The system is near a steady state
- Noise is discarded, so exact measurements are assumed
- The derivatives dx_i/dt can be calculated accurately enough

In each of M experiments with N genes:
- apply stimuli (b_1, …, b_N) to the genes
- measure the concentrations of the N mRNAs (x_1, …, x_N) using a microarray
This yields measurements x_i^j, where the subscript i is the mRNA number and the superscript j is the experiment number.

The goal is to use as few measurements as possible. With this method (given exact measurements), M = O(log N) experiments suffice, as the first test below illustrates.

In matrix form the system becomes:

Ẋ = AX + B,  with A = W + diag(-λ_i)

Compute Ẋ from several measurements of the data X (e.g. using interpolation).
Goal: deduce W (or A) from the rest.
If M = N, one could compute (X^T)^(-1) directly, but usually M << N (and this is our goal: M = O(log N)).
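The paper only says the derivatives are obtained by interpolation; the sketch below assumes a per-gene cubic-spline fit, with illustrative names:

```python
import numpy as np
from scipy.interpolate import CubicSpline

def estimate_derivatives(t, x_timeseries, t_eval):
    """x_timeseries: (num_timepoints, N) expression measurements at times t.
    Returns the estimated dx/dt at the times t_eval, one column per gene."""
    n_genes = x_timeseries.shape[1]
    xdot = np.empty((len(t_eval), n_genes))
    for i in range(n_genes):
        spline = CubicSpline(t, x_timeseries[:, i])
        xdot[:, i] = spline(t_eval, 1)   # first derivative of the fitted spline
    return xdot
```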

Therefore, use SVD (to find the least-squares solution):

X^T = U · diag(w_1, …, w_N) · V^T

Here U and V are orthogonal (U^T = U^(-1)) and the w_i are the singular values of X. Suppose the zero singular values come first, so w_i = 0 for i = 1, …, L and w_i ≠ 0 for i = L+1, …, N.

Then the least-squares (L2) solution to the problem is:

A_0 = (Ẋ − B) · U · diag(1/w_j) · V^T

with 1/w_j replaced by 0 if w_j = 0. This formula tries to match every data point as closely as possible.
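A possible numpy rendering of this SVD step, under the convention that X, Ẋ and B are stored as N x M arrays (genes x experiments); a sketch, not the authors' code:

```python
import numpy as np

def least_squares_A(Xdot, X, B, tol=1e-10):
    """Minimum-norm L2 solution A0 of Xdot = A X + B, via SVD of X^T."""
    U, w, Vt = np.linalg.svd(X.T, full_matrices=False)   # X^T = U diag(w) V^T
    w_inv = np.where(w > tol * w.max(), 1.0 / w, 0.0)    # replace 1/w_j by 0 when w_j ~ 0
    A0 = (Xdot - B) @ U @ np.diag(w_inv) @ Vt            # A0 = (Xdot - B) U diag(1/w) V^T
    return A0
```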

But the full family of solutions is:

A = A_0 + C · V^T

with C = (c_ij) an N x N matrix where c_ij = 0 if j > L, and otherwise c_ij is a free scalar coefficient.
How to choose from this family of solutions? The least-squares method tries to match every data point as closely as possible, which yields a not-so-sparse matrix with many small entries.
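A small illustration of this degeneracy, assuming the same array shapes as above; scipy's null_space(X.T) stands in for "the columns of V whose singular values are zero":

```python
import numpy as np
from scipy.linalg import null_space

def random_family_member(A0, X, rng=None):
    """Return a random member of the family A = A0 + C V^T; every member fits the data equally well."""
    rng = rng or np.random.default_rng()
    Vl = null_space(X.T)                     # N x L basis: the columns of V with w_j = 0
    C = rng.standard_normal((A0.shape[0], Vl.shape[1]))
    A = A0 + C @ Vl.T
    # (A - A0) @ X == C @ Vl.T @ X is ~0, so A X + B reproduces Xdot just as well as A0 does
    return A
```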

Two ways to choose:
1. Based on prior biological knowledge, impose constraints on the solutions. E.g. when we know two genes are related, the solution matrix must reflect this.
2. Work from the assumption that natural gene networks are sparse, and look for the sparsest matrix: search for the c_ij that maximize the number of zero entries in A.

So:
- get as many zero entries as you can, i.e. a sparse matrix; the non-zero entries form the connections
- fit as many measurements as you can exactly: "robust regression" (this assumes exact measurements)

Do this using L1 regression. Thus, when considering A = A_0 + C·V^T, we want to "minimize" A: the L1-regression idea is to look for the C for which Σ_ij |A_ij| is minimal. This produces as many zero entries as possible. The implementation was done using the simplex method (a linear-programming method).
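A sketch of this L1 step for a single row of A, using scipy.optimize.linprog in place of a hand-rolled simplex routine; the per-row formulation and the variable names are assumptions, with Vl again denoting the null-space columns of V from the previous sketch:

```python
import numpy as np
from scipy.optimize import linprog

def sparsest_row(a0_row, Vl):
    """Choose the free coefficients c so that ||a0_row + Vl @ c||_1 is minimal."""
    N, L = Vl.shape
    # LP variables: [c (L entries, free sign), t (N entries, t >= 0)]; minimize sum(t)
    cost = np.concatenate([np.zeros(L), np.ones(N)])
    # Enforce -t <= a0_row + Vl @ c <= t
    A_ub = np.block([[ Vl, -np.eye(N)],
                     [-Vl, -np.eye(N)]])
    b_ub = np.concatenate([-a0_row, a0_row])
    bounds = [(None, None)] * L + [(0, None)] * N
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    c = res.x[:L]
    return a0_row + Vl @ c   # the sparsest row consistent with the data
```

Because each row of A is solved independently, this step is what allows the row-by-row parallelization mentioned in the discussion.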

Thus, to reverse-engineer a network of N genes, we "only" need M_c = O(log N) experiments. Then M_c << N, and the computational cost is O(N^4). (Brute-force methods would cost O(N!/(k!(N-k)!)) with k non-zero entries.)

Test 1
Create a random connectivity matrix: for each row, select k entries to be non-zero
- k < k_max << N (to impose sparseness)
- each non-zero entry drawn from a uniform distribution
Apply random perturbations.
Take measurements while the system relaxes back to its previous steady state → X.
Compute Ẋ by interpolation.
Repeat this M times.
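A sketch of this synthetic setup; k_max, the perturbation and the time grid are illustrative choices rather than the paper's exact values:

```python
import numpy as np
from scipy.integrate import solve_ivp

def make_random_network(N, k_max=10, rng=None):
    """Random sparse connectivity: each row gets k < k_max non-zero entries, uniform in [-1, 1]."""
    rng = rng or np.random.default_rng()
    W = np.zeros((N, N))
    for i in range(N):
        k = rng.integers(1, k_max)
        cols = rng.choice(N, size=k, replace=False)
        W[i, cols] = rng.uniform(-1.0, 1.0, size=k)
    return W

def relax_and_measure(W, lam, b, x0, t_grid):
    """Start from the perturbed state x0 and record the relaxation on t_grid.
    lam should be large enough that the system actually relaxes back."""
    rhs = lambda t, x: -lam * x + W @ x + b
    sol = solve_ivp(rhs, (t_grid[0], t_grid[-1]), x0, t_eval=t_grid)
    return sol.y.T   # (num_timepoints, N), ready for the interpolation step above
```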

Test 1 (continued)
Then apply the algorithm to obtain an approximation of A, and compute the error between the reconstructed and the true connectivity matrix.

Results: M_c = O(log N). This is better than using SVD alone, without the regression step.

Test 2: a one-dimensional cascade of genes. Result for N = 400: M_c = 70.

Test 3: a large sparse gene network with random connections, external stimuli, etc. Results are the same as in the previous tests.

Discussion: advantages
- Very little data needed, compared with neural networks or Bayesian models
- No prior knowledge needed
- Easy to parallelize, as it recovers the connectivity matrix row by row (gene by gene)
- Also applicable to protein networks

Discussion: disadvantages
- Less efficient for small networks (M ≈ N)
- No quantification yet of the necessary "sparseness", though on average 10 connections per gene works well for a network of > 200 genes
- Works best with exact data, which we don't have; performance on noisy data is uncertain

Improvements
- Other algorithms to impose sparseness: alternatives exist both for L1 (the basic criterion) and for the simplex method (the implementation)
- By using a deterministic linear system of ODEs, a lot has been neglected (noise, time delays, nonlinearities)
- Connections could be changed by the experiments themselves; a time-dependent W would then be necessary