Learning disjunctions in Geronimo’s regression trees Felix Sanchez Garcia supervised by Prof. Dana Pe’er.

Motivation
- Glioblastoma: the most common primary brain tumour in adults.
- Newly diagnosed patients have an average survival of 1 year.
- Need for better models of the regulatory network.
- Data used to create models: microarrays
  - # genes: 8000
  - # candidate regulators: 800
  - # samples: 120

Module networks
- Bayesian model that benefits from the high correlation within groups of variables [2].
- Algorithm similar to EM, but with hard decisions. Loop:
  - Module assignment step: assign variables to modules.
  - Structure search step: calculate the conditional probability distribution (CPD) for each module.
[Figure: variables grouped into Module 1, Module 2, Module 3 and Module 4]
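The alternating loop on this slide can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the per-module CPD is replaced by a plain mean expression profile (a stand-in, not the regression-tree CPD of module networks), the fit is measured by squared error rather than a Bayesian score, and the names `fit_modules`, `expression` and `assignment` are hypothetical.

```python
def fit_modules(expression, assignment, n_modules, max_iters=20):
    """Alternate hard module assignment and per-module model fitting.

    expression: dict gene -> list of expression values (one per sample).
    assignment: dict gene -> initial module index.
    Assumption: each module's "CPD" is just its mean profile.
    """
    assignment = dict(assignment)
    n_samples = len(next(iter(expression.values())))
    profiles = {}
    for _ in range(max_iters):
        # Structure search step (simplified): refit each module's profile.
        profiles = {}
        for m in range(n_modules):
            members = [g for g, a in assignment.items() if a == m]
            if members:
                profiles[m] = [
                    sum(expression[g][s] for g in members) / len(members)
                    for s in range(n_samples)
                ]
        # Module assignment step: hard decision, best-fitting module wins.
        new_assignment = {}
        for gene, values in expression.items():
            new_assignment[gene] = min(
                profiles,
                key=lambda m: sum((v - p) ** 2 for v, p in zip(values, profiles[m])),
            )
        if new_assignment == assignment:
            break  # converged: no gene changed module
        assignment = new_assignment
    return assignment, profiles


expression = {
    "g1": [1.0, 1.0, -1.0, -1.0],
    "g2": [0.9, 1.1, -1.0, -0.9],
    "g3": [-1.0, -1.0, 1.0, 1.0],
    "g4": [-1.1, -0.9, 1.0, 1.1],
}
initial = {"g1": 0, "g2": 0, "g3": 1, "g4": 0}
final_assignment, final_profiles = fit_modules(expression, initial, n_modules=2)
```

Because every gene is committed to exactly one module in each pass, the loop makes hard decisions, unlike EM, which would keep soft membership weights.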

Regression trees as CPD
- A regression tree is used as each module's CPD.
- Internal nodes: condition on a single variable.
- Leaf nodes: parameters of a normal distribution.
- Bayesian score: combines the pdf of the normal-gamma with a prior on structure (complexity + biological penalties).
- The score is calculated exhaustively for each split of each regulator, with the target gene's values sorted by regulator.
[Figure: example tree with node conditions x<0.3 and y>-0.2]
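As a rough illustration of the exhaustive split search, the sketch below scores every split of every regulator with the closed-form marginal likelihood of a normal model under a normal-gamma prior. The hyperparameters (`mu0`, `kappa0`, `alpha0`, `beta0`) are arbitrary assumptions, the structure prior (complexity and biological penalties) is left out, and the function names are hypothetical.

```python
import math


def log_marginal_normal_gamma(values, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Log marginal likelihood of values under a normal-gamma prior."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    sum_sq = sum((v - mean) ** 2 for v in values)
    kappa_n = kappa0 + n
    alpha_n = alpha0 + n / 2
    beta_n = beta0 + 0.5 * sum_sq + kappa0 * n * (mean - mu0) ** 2 / (2 * kappa_n)
    return (
        math.lgamma(alpha_n) - math.lgamma(alpha0)
        + alpha0 * math.log(beta0) - alpha_n * math.log(beta_n)
        + 0.5 * (math.log(kappa0) - math.log(kappa_n))
        - 0.5 * n * math.log(2 * math.pi)
    )


def best_split(regulators, target):
    """Try every split of every regulator; return (score, regulator, threshold).

    regulators: dict name -> list of values (one per sample).
    target: list of target-gene values (one per sample).
    """
    best = None
    for name, reg_values in regulators.items():
        # Sort the target gene's values by the regulator's values.
        order = sorted(range(len(target)), key=lambda i: reg_values[i])
        sorted_reg = [reg_values[i] for i in order]
        sorted_target = [target[i] for i in order]
        for k in range(1, len(target)):
            if sorted_reg[k] == sorted_reg[k - 1]:
                continue  # no threshold separates equal regulator values
            threshold = (sorted_reg[k - 1] + sorted_reg[k]) / 2
            score = (log_marginal_normal_gamma(sorted_target[:k])
                     + log_marginal_normal_gamma(sorted_target[k:]))
            if best is None or score > best[0]:
                best = (score, name, threshold)
    return best


regulators = {
    "x": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],
    "y": [5.0, 1.0, 9.0, 2.0, 8.0, 3.0],
}
target = [-2.0, -2.1, -1.9, 2.0, 2.1, 1.9]
score, regulator, threshold = best_split(regulators, target)
```

In this toy data the target is low whenever `x` is small and high otherwise, so the search selects a threshold on `x` between 3.0 and 7.0.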

Incorporating pathway information
- Biological pathways: contain sets of genes and represent chains of biochemical reactions that perform some function.
- Aberrations in glioblastoma tend to occur as disjunctions within pathways: deregulating one component is usually enough to alter the function of the whole pathway [4].
- Idea: use pathway information to obtain a better model.
- Methodology: extend node conditions to disjunctions of conditions on pathway elements.
- We will use 15 sets of regulators (20-30 genes per set):
  - 5 sets of regulators of pathways known to be related to cancer.
  - 5 sets of regulators of other pathways.
  - 5 sets of regulators chosen at random.
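A node condition extended to a disjunction could look like the following minimal sketch. The tuple representation `(gene, operator, threshold)` and the function names are assumptions made for illustration; the example conditions reuse `x<0.3` and `y>-0.2` purely as placeholders for conditions on two pathway elements.

```python
def disjunction_holds(sample, conditions):
    """Return True if at least one threshold condition holds for the sample.

    sample: dict gene -> expression value.
    conditions: list of (gene, operator, threshold), operator "<" or ">".
    """
    for gene, operator, threshold in conditions:
        value = sample[gene]
        if operator == "<" and value < threshold:
            return True
        if operator == ">" and value > threshold:
            return True
    return False


def split_samples(samples, conditions):
    """Partition samples into those that satisfy the disjunction and the rest."""
    satisfied = [s for s in samples if disjunction_holds(s, conditions)]
    rest = [s for s in samples if not disjunction_holds(s, conditions)]
    return satisfied, rest


conditions = [("x", "<", 0.3), ("y", ">", -0.2)]
samples = [
    {"x": 0.5, "y": 0.0},   # second condition holds
    {"x": 0.5, "y": -0.5},  # neither condition holds
    {"x": 0.1, "y": -0.5},  # first condition holds
]
satisfied, rest = split_samples(samples, conditions)
```

A single deregulated pathway element is enough to route a sample down the "satisfied" branch, which mirrors the biological observation that motivates the extension.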

Problem setting
- Concept class: disjunctions of threshold functions, each on a single variable.
- Loss functions:
  - Bayesian score (biological penalty?)
- Potential number of hypotheses: 2^{m}
- A related classification problem was tackled by Marchand and Shah (2005) and Kestler et al. (2006).
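One way to read the hypothesis count, assuming that m denotes the number of candidate single-variable threshold conditions (the slide does not define m) and that each candidate has a fixed threshold:

```latex
% Assumption: c_1, ..., c_m are the candidate threshold conditions,
% e.g. c_j(x) = 1[x_{g_j} > \theta_j] for gene g_j and threshold \theta_j.
\[
  h_S(x) = \bigvee_{j \in S} c_j(x), \qquad S \subseteq \{1, \dots, m\}
\]
% Every subset S defines one disjunction, so
\[
  |\mathcal{H}| = \sum_{k=0}^{m} \binom{m}{k} = 2^{m}
\]
% Example: a pathway set with m = 25 candidate conditions gives
% 2^{25} = 33{,}554{,}432 hypotheses, too many to score exhaustively at every node.
```

If thresholds are also free to vary per gene, the space is larger still, which is why an exhaustive search over disjunctions is not practical in the way it is for single-variable splits.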

Bibliography
1. Pe'er, D., Bayesian Network Analysis of Signaling Networks: A Primer. Sci. STKE, (281): p. pl4.
2. Segal, E., et al., Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat Genet, (2): p.
3. Lee, S.-I., et al., Identifying regulatory mechanisms using individual variation reveals key role for chromatin modification. Proceedings of the National Academy of Sciences, (38): p.
4. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature, (7216): p.
5. Kestler, H., W. Lindner, and A. Müller, Learning and Feature Selection Using the Set Covering Machine with Data-Dependent Rays on Gene Expression Profiles, in Artificial Neural Networks in Pattern Recognition. p.
6. Marchand, M. and M. Shah, PAC-Bayes Learning of Conjunctions and Classification of Gene-Expression Data, in Advances in Neural Information Processing Systems, MIT Press: Cambridge, MA. p.