Construction of large signaling pathways using an adaptive perturbation approach with phosphoproteomic data Ioannis N. Melas, Alexander Mitsos, Dimitris.


Construction of large signaling pathways using an adaptive perturbation approach with phosphoproteomic data
Ioannis N. Melas, Alexander Mitsos, Dimitris E. Messinis, Thomas S. Weiss, Julio Saez-Rodriguez, Leonidas G. Alexopoulos
Naga Srinivas Sooraj Vedula, NetID: nvedul2, Spring 2015

Outline
- Introduction
- Proteomic technologies
- Experimental procedure: Day 1 (collecting sample cells from patients)
- Day 2 (ligand selection and GMD)
- Day 3 (combinatorial experiment, Hill function and ILP formulation)
- Results
- Research questions

Introduction
- Cell signaling refers to how information (a message) moves inside the cytoplasm of a cell.
- Ligand (stimulus): a molecule that binds a receptor and triggers signaling.
- Signaling pathway: the entire set of changes in the cell induced by receptor activation.
- Perturbations are caused by stimuli.

Introduction (Contd.)
- Hepatocytes: the main cells of the liver.
- Phosphoproteins: proteins chemically bound to a phosphate group (phosphoric acid).
- Phosphorylation signals: the signal flow initiated by a key phosphoprotein.

Proteomic technologies
1) Technologies that make no prior assumption about the sample's protein content, e.g. mass spectrometry (MS): proteins are broken down to the peptide level and identified from the peptide sequences. Tedious.
2) Affinity-based methods, which measure the response to stimuli, e.g. xMAP technology: uses spheres dyed with different combinations of dyes and makes use of a fluorophore.
xMAP technology is used here because it can test thousands of cells and generates results quickly.

Experiment: Day 1 (Patient Interaction)
- Liver tissue samples are obtained from patients with liver tumors (secondary or higher-degree cancer).
- Hepatocytes are isolated from the samples obtained.
- Primary human hepatocytes are placed in 96-well plates.

Day 2: Ligand Screening and Data Acquisition
- Ligand screening: a library of 81 stimuli (e.g. cytokines, chemokines) was put together with specific concentrations (text mining).
- Key phosphoproteins were chosen based on the significance of the pathways involved.
- Results are read out after exposure to a laser.

Day 2: Ligand Selection
- A Gaussian mixture distribution (GMD) was used for the ligand selection procedure.
- A smooth bell curve is appropriate because phosphorylation activity is a continuous random variable that can take many values.
- The probability density function is shown in the figure.
[Figure: probability density of AKT phosphorylation activity; x-axis: phosphorylation activity, y-axis: frequency]

Contd. Gaussian Mixture Model

Contd. Gaussian Mixture Model
- The experimental data are discretized by comparing the bell curves of the two modes.
- Discrete part: the probability density functions of the two mixture components are compared for each phosphorylation signal, and the component with the higher density (frequency) gives the state of the signal (ON or OFF).
- gmdistribution.fit() and pdf() from the MATLAB Statistics Toolbox were used.
- Ultimately, the 15 out of 81 stimuli that activated at least one of the signals were allowed to progress.
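
The ON/OFF discretization described above can be sketched roughly as follows. This is a minimal illustration, not the authors' MATLAB code: it swaps in a hand-written two-component EM fit in plain Python for gmdistribution.fit(), and the function names (fit_two_gaussians, discretize) and the sample values are assumptions made up for the example.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def fit_two_gaussians(values, iterations=200):
    """Fit a two-component 1-D Gaussian mixture with plain EM (illustrative)."""
    lo, hi = min(values), max(values)
    spread = (hi - lo) or 1.0
    means = [lo, hi]                      # start one component at each extreme
    stds = [spread / 4.0, spread / 4.0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point
        resp = []
        for x in values:
            p = [weights[k] * normal_pdf(x, means[k], stds[k]) for k in range(2)]
            total = sum(p) or 1e-300      # guard against underflow to zero
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and standard deviation
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            stds[k] = max(math.sqrt(var), 1e-3 * spread)  # floor avoids collapse
    return weights, means, stds

def discretize(x, weights, means, stds):
    """Return 1 (ON) if the higher-mean component is more likely at x, else 0 (OFF)."""
    off, on = (0, 1) if means[0] <= means[1] else (1, 0)
    p_off = weights[off] * normal_pdf(x, means[off], stds[off])
    p_on = weights[on] * normal_pdf(x, means[on], stds[on])
    return 1 if p_on > p_off else 0

# Hypothetical phosphorylation readouts: a low (OFF) and a high (ON) cluster.
signal = [0.9, 1.0, 1.1, 1.05, 0.95, 4.8, 5.0, 5.2, 5.1, 4.9]
weights, means, stds = fit_two_gaussians(signal)
states = [discretize(x, weights, means, stds) for x in signal]
print(states)
```

A stimulus would then be kept if at least one of its signals is discretized to ON, which mirrors the selection rule stated on the slide.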

Day 3: Combinatorial Experiment

Day 3: Combinatorial Experiment (contd.)

Normal Hill Function
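
The slide gives only the title, so the following is a sketch of the standard Hill function and a rescaled variant, not the exact form used in the study. The parameter names k (half-activation point) and n (Hill coefficient), their default values, and the reading of "normal" as "normalized so that an input of 1 gives an output of 1" are all assumptions.

```python
def hill(x, k=0.5, n=2.0):
    """Standard Hill function x^n / (k^n + x^n) for a non-negative input x."""
    if x < 0:
        raise ValueError("x must be non-negative")
    xn = x ** n
    return xn / (k ** n + xn)

def normalized_hill(x, k=0.5, n=2.0):
    """Hill function rescaled so that an input of 1 maps to an output of 1."""
    return hill(x, k, n) / hill(1.0, k, n)

# The curve is sigmoidal: close to 0 for small inputs, 0.5 at x == k.
for x in (0.0, 0.25, 0.5, 1.0):
    print(x, round(hill(x), 3), round(normalized_hill(x), 3))
```

A function of this shape maps a raw measurement in [0, 1] onto a smooth scale that is near 0 for weak signals and near 1 for strong ones.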

Day 3: Generic Pathway
[Figure: generic pathway; legend: ligands/stimuli, reactions, phosphoproteins, active phosphoproteins]

Pathway pre-processing: controllability, observability and feedback loops
- Performed using CellNetOptimizer.
- Feedback loops are removed using depth-first search (DFS).
- Controllability and observability are checked using Warshall's algorithm, and unnecessary edges are removed.
[Figure: example pathway with nodes egf, egfr, shc, grb2]
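
The controllability/observability check can be sketched with Warshall's transitive-closure algorithm. This is an illustration under assumptions, not CellNetOptimizer's code: a node is treated as controllable if it is reachable from a stimulus and observable if a measured signal is reachable from it. The edges between egf, egfr, shc and grb2, and the extra nodes dead_end and orphan, are hypothetical. The DFS loop-removal step is omitted.

```python
def transitive_closure(nodes, edges):
    """Warshall's algorithm: reach[u][v] is True if v is reachable from u."""
    reach = {u: {v: False for v in nodes} for u in nodes}
    for u, v in edges:
        reach[u][v] = True
    for k in nodes:
        for i in nodes:
            if reach[i][k]:
                for j in nodes:
                    if reach[k][j]:
                        reach[i][j] = True
    return reach

def prune(nodes, edges, stimuli, measured):
    """Keep only nodes that are both controllable and observable."""
    reach = transitive_closure(nodes, edges)

    def controllable(n):
        return n in stimuli or any(reach[s][n] for s in stimuli)

    def observable(n):
        return n in measured or any(reach[n][m] for m in measured)

    keep = {n for n in nodes if controllable(n) and observable(n)}
    kept_edges = [(u, v) for u, v in edges if u in keep and v in keep]
    return keep, kept_edges

# Hypothetical toy pathway: egf is the stimulus, grb2 is the measured signal.
nodes = ["egf", "egfr", "shc", "grb2", "dead_end", "orphan"]
edges = [
    ("egf", "egfr"),
    ("egfr", "shc"),
    ("shc", "grb2"),
    ("egfr", "dead_end"),  # not observable: leads to no measured signal
    ("orphan", "grb2"),    # not controllable: not downstream of any stimulus
]
keep, kept_edges = prune(nodes, edges, stimuli={"egf"}, measured={"grb2"})
print(sorted(keep))
print(kept_edges)
```

The closure costs O(n^3) in the number of nodes, which is acceptable for a pathway with a few hundred species.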

Observable-controllable pathway

Integer Linear Programming formulation

ILP formulation
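
The formulation itself is not reproduced in the transcript, so the following only illustrates the general idea of choosing a subset of reactions that best explains binary measurements. It assumes an objective of the form (number of mismatches between predicted and measured states) + (a small penalty per retained reaction), and it replaces the ILP solver with exhaustive enumeration, which is only feasible for toy examples. The reactions, the measurements and the size_penalty value are hypothetical.

```python
from itertools import product

def simulate(reactions, selected, stimuli):
    """Propagate activation from the stimuli through the selected reactions."""
    active = set(stimuli)
    changed = True
    while changed:
        changed = False
        for (src, dst), keep in zip(reactions, selected):
            if keep and src in active and dst not in active:
                active.add(dst)
                changed = True
    return active

def best_pathway(reactions, experiments, size_penalty=0.1):
    """Enumerate all reaction subsets and return (cost, selection) of the best."""
    best = None
    for selected in product((0, 1), repeat=len(reactions)):
        mismatch = 0
        for stimuli, measured in experiments:
            active = simulate(reactions, selected, stimuli)
            mismatch += sum((sp in active) != bool(state)
                            for sp, state in measured.items())
        cost = mismatch + size_penalty * sum(selected)
        if best is None or cost < best[0]:
            best = (cost, selected)
    return best

# Hypothetical generic pathway and one experiment: egf is applied,
# grb2 is measured ON and akt is measured OFF.
reactions = [("egf", "egfr"), ("egfr", "shc"), ("shc", "grb2"), ("egf", "akt")]
experiments = [({"egf"}, {"grb2": 1, "akt": 0})]
cost, selected = best_pathway(reactions, experiments)
print(cost, selected)
```

In this toy case the reaction egf -> akt is dropped because keeping it would contradict the OFF measurement, while the chain leading to grb2 is retained. An ILP solver encodes the same trade-off with binary decision variables instead of enumerating every subset.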

Optimized pathway conserved by ILP

Statistics of ILP
- Before optimization there were 365 reactions; 204 were removed using ILP.
- 53 reactions are included in the minimum pathway.
- 161 reactions are included in the maximum pathway.

Results
- The 14 phosphoprotein signals used in this study were sufficient to give a pathway coverage equal to 68.5% of the generic pathway.
- Predicted reactions are close to experimentally observed reactions.
- The authors were able to effectively identify the cell's reaction to stimuli by identifying optimal pathways.


Thank You