Issues in Bayesian Tree Modeling of Clinical and Gene Expression Data. Predictive sub-typing of subjects. Retrospective and prospective studies. Exploration of clinico-genomic data. Identify relevant gene expression patterns.

Current Areas of Application. Breast cancer: lymph node status, disease recurrence. Ovarian cancer: tumor location.

Lymph Node Involvement Is a Key Breast Cancer Risk Factor. But lymph node dissection also carries morbidity and inaccuracy.

Identifying Metagenes Associated With Lymph Node Status. [Figure axes: Tumor Sample, Gene]

Metagenes/Expression Signatures. Dimension reduction: signal improvement. Clustering. Singular value decomposition. Empirical or model-based factor analysis. Characterize patterns in data.
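One way to read the clustering and singular value decomposition steps is as a two-stage reduction: group genes into clusters, then summarize each cluster by its dominant singular factor. The sketch below covers only the second stage and is an illustration, not the authors' implementation. The function name `metagene`, the row-centering step, and the use of power iteration in place of a full SVD routine are all assumptions made to keep the example dependency-free.

```python
import math

def metagene(cluster, iters=100):
    """Dominant right singular vector of a genes x samples matrix.

    `cluster` is a list of gene rows, each a list of expression values
    across the same samples. Returns one unit-norm score per sample.
    Hypothetical helper: power iteration stands in for a full SVD.
    """
    # Center each gene across samples so the factor tracks variation.
    centered = []
    for row in cluster:
        mean = sum(row) / len(row)
        centered.append([x - mean for x in row])
    n_genes, n_samples = len(centered), len(centered[0])
    # Start from the average centered profile (assumed not orthogonal
    # to the dominant factor, which holds for a coherent cluster).
    avg = [sum(r[j] for r in centered) / n_genes for j in range(n_samples)]
    v = avg[:]
    for _ in range(iters):
        # u = A v, then w = A^T u: one power-iteration step on A^T A.
        u = [sum(r[j] * v[j] for j in range(n_samples)) for r in centered]
        w = [sum(centered[i][j] * u[i] for i in range(n_genes))
             for j in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("cluster has no variation across samples")
        v = [x / norm for x in w]
    # Resolve the sign ambiguity: align with the average profile.
    if sum(a * b for a, b in zip(v, avg)) < 0:
        v = [-x for x in v]
    return v

# Three co-expressed genes measured on four samples (made-up numbers).
demo = metagene([[1, 2, 3, 4], [2, 4, 6, 8], [0, 1, 2, 3]])
```

The resulting per-sample scores can then serve as a single candidate predictor in place of the many correlated genes in the cluster.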

Gene Clustering

Gene Clustering (cont’d)

Factor extraction (SVD)

Differential Gene Expression

Differential Gene Expression (Threshold 1)

Differential Gene Expression (Threshold 2)

Differential Gene Expression (Threshold 3)

Nonlinear Expression

Nonlinear Expression (Threshold 1)

Nonlinear Expression (Threshold 2)

Lymph Node Metastasis Metagenes

Ovarian Tumor Site Genes

Statistical Tree Models for Clinico-Genomic Prediction. Regression trees: non-linear, interactions. Recursive partitioning. Retrospective studies. Many trees: model uncertainty. Predictions average across trees.

Binary Outcomes: Retrospective Sampling. [Figure label: LN+]

Binary Outcomes: Prospective Inference from Retrospective Model
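The step from a retrospective model to a prospective prediction is Bayes' theorem. The display below is a generic statement of that step, not a formula recovered from the slide. The symbol \pi, standing for the population prevalence Pr(Y=1), is an assumption introduced here, because under retrospective (case-control) sampling the prevalence has to be supplied from outside the sample.

```latex
\Pr(Y=1 \mid x)
  = \frac{\pi \, p(x \mid Y=1)}
         {\pi \, p(x \mid Y=1) + (1-\pi) \, p(x \mid Y=0)},
\qquad
\frac{\Pr(Y=1 \mid x)}{\Pr(Y=0 \mid x)}
  = \frac{\pi}{1-\pi} \cdot \frac{p(x \mid Y=1)}{p(x \mid Y=0)} .
```

In odds form, the retrospective model only needs to supply the likelihood ratio p(x | Y=1) / p(x | Y=0).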

Binary Outcomes: Retrospective Model. Model conditionals for predictors. Nonparametric Bayes: Dirichlet model. Modeling in x space: joint structure. Implies Beta priors on [symbol missing]

Growing Binary Trees. Node split: each candidate predictor:threshold pair gives a 2x2 table: 2 Bernoullis, fixed columns (Y=0/1). Assess and select split, or stop. Conservative Bayesian tests. Multiple trees: multiple splits at any node.
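The 2x2 assessment can be sketched as a Bayes factor comparing two Bernoulli models with the column totals (Y=0/1) held fixed. This is a minimal illustration under stated assumptions, not the test used in the talk. The Beta(a, b) prior with a = b = 1, the cut-off `min_log_bf`, the midpoint thresholds, and the names `split_log_bayes_factor` and `best_split` are all hypothetical choices.

```python
from math import lgamma

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def split_log_bayes_factor(k0, n0, k1, n1, a=1.0, b=1.0):
    """Log Bayes factor for a candidate split at one node.

    k_z of the n_z cases with Y=z have the predictor above the threshold.
    H1: Pr(x > threshold | Y=z) differs by z, independent Beta(a, b) priors.
    H0: one common probability with a Beta(a, b) prior.
    Binomial coefficients cancel because the columns are fixed.
    """
    log_m1 = (log_beta(a + k0, b + n0 - k0) - log_beta(a, b)
              + log_beta(a + k1, b + n1 - k1) - log_beta(a, b))
    log_m0 = log_beta(a + k0 + k1, b + n0 + n1 - k0 - k1) - log_beta(a, b)
    return log_m1 - log_m0

def best_split(x, y, min_log_bf=1.0):
    """Return (threshold, log BF) of the best split, or None to stop.

    Candidate thresholds are midpoints between sorted distinct values.
    """
    n0 = sum(1 for yi in y if yi == 0)
    n1 = len(y) - n0
    values = sorted(set(x))
    best = None
    for lo, hi in zip(values, values[1:]):
        tau = (lo + hi) / 2.0
        k0 = sum(1 for xi, yi in zip(x, y) if yi == 0 and xi > tau)
        k1 = sum(1 for xi, yi in zip(x, y) if yi == 1 and xi > tau)
        lbf = split_log_bayes_factor(k0, n0, k1, n1, )
        if lbf >= min_log_bf and (best is None or lbf > best[1]):
            best = (tau, lbf)
    return best
```

Raising `min_log_bf` makes the test more conservative, so fewer nodes are split. Keeping every threshold that clears the cut-off, rather than only the best one, would be one route to the multiple splits per node that generate multiple trees.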

Inference with Many Binary Trees. Within-tree inference & prediction: sequences of beta posteriors for [symbol missing]. Simulate: impute Pr(Y=1|leaf). Multiple trees: likelihood across trees. Average predictions across trees. Model (predictor:threshold) uncertainty. "Smoothing" classification boundaries.
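A rough sketch of both halves of this slide follows, under assumptions the slide does not spell out. Within a tree, each split on the path to a leaf is taken to contribute a likelihood ratio theta1/theta0, where theta_z is the probability that a Y=z case follows that branch and has a Beta posterior. Across trees, predictions are weighted by each tree's likelihood. The `prevalence` argument, the Beta(1, 1) prior, and both function names are hypothetical.

```python
import math
import random

def leaf_probability_draws(path_counts, prevalence, n_draws=2000,
                           a=1.0, b=1.0, seed=0):
    """Simulate posterior draws of Pr(Y=1 | leaf).

    path_counts: one (k0, n0, k1, n1) tuple per split on the path, where
    k_z of the n_z cases with Y=z reaching that node follow the branch
    toward the leaf. prevalence is an assumed population Pr(Y=1).
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        odds = prevalence / (1.0 - prevalence)
        for k0, n0, k1, n1 in path_counts:
            theta0 = rng.betavariate(a + k0, b + n0 - k0)
            theta1 = rng.betavariate(a + k1, b + n1 - k1)
            # Guard against a numerically zero denominator.
            odds *= theta1 / max(theta0, 1e-300)
        draws.append(odds / (1.0 + odds))
    return draws

def average_over_trees(tree_probs, tree_log_liks):
    """Likelihood-weighted average of per-tree predictions."""
    top = max(tree_log_liks)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(ll - top) for ll in tree_log_liks]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, tree_probs)) / total
```

Averaging over trees that split on different predictor:threshold pairs is what smooths the otherwise step-shaped classification boundary of a single tree.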

Binary Outcome: Lymph Node Metastasis. Predictive trees: nonparametric Bayes. Metagene expression. Retrospective sampling. Lancet 2003 (Huang, West et al.). [Figure axes: Tumor Sample, Gene]

Predicting Lymph Node Status With Metagenes. [Figure: out-of-sample cross validation; probability of LN+ by sample, for LN+ and LN- cases]

Forests of Clinico-Genomic Trees. Select from potential clinical and genomic predictor variables. Multiple trees. Variable combination (co-occurrence). Multiple subtypes.

… With Metagenes and Clinical Predictors. [Figure: out-of-sample cross validation; probability of LN+ by sample, for LN+ and LN- cases]

Lymph Node Clinico-Genomic Predictors

Predicting Ovarian Tumor Site. [Figure: out-of-sample cross validation; probability of omentum by sample, for omentum and ovary cases]

Gene Identification. Implicated metagenes: gene subsets. Genes correlated with key metagenes.
Breast cancer, nodal metastasis: interferon pathway/inducible gene subset. Interferons mediate anti-tumor response. Evidence of dysfunction of normal anti-tumor response?
Ovarian cancer, tumor site: growth regulatory pathway/inducible gene subset. Evidence of dysfunction of normal cell growth?

Ongoing Research. Stochastic search (sequential, annealing). Representation of tree 'forest'. Metagene definition/creation. Cluster implementation of tree models.

Computational & Applied Genomics Program, Duke University: Joseph Nevins, Mike West, Erich Huang, Ed Iversen, Holly Dressman.
Koo Foundation-Sun Yat Sen Cancer Center: Andrew Huang, Skye Cheng, Mei-Hua Tsou.
Department of Obstetrics and Gynecology: John Lancaster, Andrew Berchuck.

Growing Binary Trees (2x2) ?