Genomic Signal Processing: Ensemble Dependence Model for Classification and Prediction of Cancer Based on Gene Expression Data Joseph DePasquale Engineering.

Genomic Signal Processing: Ensemble Dependence Model for Classification and Prediction of Cancer Based on Gene Expression Data
Joseph DePasquale
Engineering Frontiers
26 Apr 07

Overview
Motivation
Background
–Genes, Cancer, DNA Microarrays
Ensemble Dependence Model
–Basic structure
–Inclusion in a classification system
Results
Conclusions

Motivation
An estimated 1.4 million new cases of cancer
–Roughly 550,000 will die from their disease
In New Jersey, 43,910 new cases
–17,720 deaths
In 2005, the NIH estimated the overall cost of cancer at 210 billion dollars

Background
What is cancer?
–Uncontrolled division of damaged cells
  Apoptosis
–Risk increases with age
  Cause of unregulated cell growth

Background
What is a gene?
–Components
–Functionality
What is the importance of protein?
–Essential to all living things
–Participate in all functions within cells
What is the significance of gene products?

DNA Microarrays
Expression profiling
–Represents the simultaneous activity of thousands of individual genes
Publicly available data
–Complexity has led to a need for the standardization of experimental setup
  MIAME
  MAQC

Taken from:

Ensemble Dependence Model
Genes with similar expression profiles are combined into clusters
–Expression profile of each cluster is the average profile of all genes in that cluster
Taken from: P. Qiu, Z. J. Wang, and K.J.R. Liu. “Genomic Processing for Cancer Classification and Prediction,” IEEE Signal Processing Magazine, vol. 24, no. 1, pp , Jan
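The cluster averaging described on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `cluster_profiles`, the toy expression values, and the cluster labels are all hypothetical, and the labels are assumed to come from a separate clustering step.

```python
from collections import defaultdict

def cluster_profiles(expression, labels):
    """Average the expression profiles of all genes assigned to each cluster.

    expression: one list of sample values per gene
    labels: cluster index for each gene (assumed given by a clustering step)
    """
    groups = defaultdict(list)
    for profile, label in zip(expression, labels):
        groups[label].append(profile)
    # Column-wise mean over the genes in each cluster
    return {
        label: [sum(col) / len(col) for col in zip(*profiles)]
        for label, profiles in groups.items()
    }

# Toy data (hypothetical): 4 genes measured in 3 samples, grouped into 2 clusters
genes = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [10.0, 0.0, 2.0],
    [12.0, 2.0, 4.0],
]
profiles = cluster_profiles(genes, [0, 0, 1, 1])
print(profiles)  # {0: [2.0, 3.0, 4.0], 1: [11.0, 1.0, 3.0]}
```

Each cluster is then treated as a single variable whose value in a sample is the mean of its member genes, which is what lets the model work with a handful of clusters instead of thousands of genes.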

Ensemble Dependence Model

Model-driven method
–Feature selection
  Not all genes are relevant
  T-test
–Gene clustering
  Number of clusters
  Gaussian mixture model
–Model learning/classification
  Dependence matrices generated for two cases
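As an illustration of the feature selection step only, the sketch below ranks genes by a two-sample t-statistic between the normal and cancer classes and keeps the top-scoring ones. The slide says only "T-test"; the choice of Welch's unequal-variance form, the function names, and the toy gene values are assumptions made for this example.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t-statistic (unequal variances)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def select_genes(normal, cancer, top_k):
    """Rank genes by |t| between the two classes and keep the top_k names.

    normal, cancer: dicts mapping a gene name to its expression values per class.
    """
    scores = {g: abs(welch_t(normal[g], cancer[g])) for g in normal}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Hypothetical toy data: geneA differs strongly between classes, geneB not at all
normal = {
    "geneA": [1.0, 1.2, 0.9, 1.1],
    "geneB": [5.0, 5.1, 4.9, 5.0],
    "geneC": [2.0, 3.0, 1.0, 2.5],
}
cancer = {
    "geneA": [3.0, 3.2, 2.9, 3.1],
    "geneB": [5.0, 4.9, 5.1, 5.0],
    "geneC": [2.2, 2.8, 1.5, 2.4],
}
selected = select_genes(normal, cancer, top_k=1)
print(selected)  # ['geneA']
```

The genes that survive this filter would then be passed to the clustering step; the Gaussian mixture model and the learning of the dependence matrices are not covered by this sketch.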

Classification
Maximum likelihood rule
–Binary hypothesis-testing problem
–Tests fit of unknown samples to each model
  Normal Case:
  Cancer Case:
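The two hypothesis equations did not survive in this transcript, so the following is only a sketch of how a maximum likelihood rule of this kind might look. It assumes a linear dependence model x = A x + n with isotropic Gaussian noise, one dependence matrix per class, and it scores a sample by the likelihood of its residual x - A x (a simplification that ignores any Jacobian term). The matrices, noise levels, and function names are hypothetical and are not taken from the slides.

```python
from math import log, pi

def residual(A, x):
    """r = x - A x for a dependence matrix A (list of rows) and cluster vector x."""
    return [xi - sum(a * xj for a, xj in zip(row, x)) for row, xi in zip(A, x)]

def log_likelihood(A, sigma, x):
    """Gaussian log-likelihood of the residual under noise N(0, sigma^2 I)."""
    r = residual(A, x)
    k = len(x)
    return -0.5 * k * log(2 * pi * sigma ** 2) - sum(v * v for v in r) / (2 * sigma ** 2)

def classify(x, A_normal, A_cancer, sigma_normal=1.0, sigma_cancer=1.0):
    """Return the label of the model under which the sample x is more likely."""
    ll_n = log_likelihood(A_normal, sigma_normal, x)
    ll_c = log_likelihood(A_cancer, sigma_cancer, x)
    return "normal" if ll_n >= ll_c else "cancer"

# Hypothetical 3-cluster dependence matrices with zero diagonals
A_normal = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
A_cancer = [[0.0, -1.0, -1.0], [-1.0, 0.0, -1.0], [-1.0, -1.0, 0.0]]

print(classify([1.0, 1.0, 1.0], A_normal, A_cancer))   # normal
print(classify([1.0, -2.0, 1.0], A_normal, A_cancer))  # cancer
```

In both calls the chosen model reproduces the sample exactly (zero residual), while the other model leaves a large residual, so the decision is unambiguous.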

EDM-Based Cancer Classification
Taken from: P. Qiu, Z. J. Wang, and K.J.R. Liu. “Genomic Processing for Cancer Classification and Prediction,” IEEE Signal Processing Magazine, vol. 24, no. 1, pp , Jan

Results
Taken from: P. Qiu, Z. J. Wang, and K.J.R. Liu. “Genomic Processing for Cancer Classification and Prediction,” IEEE Signal Processing Magazine, vol. 24, no. 1, pp , Jan

Results
Here, 200 different subsets of the gastric data are used to calculate 200 different dependence matrices; the eigenvalues of these matrices are plotted.
Taken from: P. Qiu, Z. J. Wang, and K.J.R. Liu. “Genomic Processing for Cancer Classification and Prediction,” IEEE Signal Processing Magazine, vol. 24, no. 1, pp , Jan

Results
Eigenvalues = {1, 1, 1, -3}
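The slide gives only the eigenvalue set, not the matrix behind it. One 4×4 matrix with exactly this spectrum has a zero diagonal and -1 everywhere else; whether this is the matrix the slide refers to is an assumption. The sketch below checks the spectrum by verifying four linearly independent eigenvectors directly, so it needs no linear algebra library.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

k = 4
# Assumed matrix: zero diagonal, -1 in every off-diagonal entry
A = [[0 if i == j else -1 for j in range(k)] for i in range(k)]

# Candidate eigenpairs: the all-ones vector plus three independent zero-sum vectors
pairs = [
    (-3, [1, 1, 1, 1]),
    (1, [1, -1, 0, 0]),
    (1, [1, 0, -1, 0]),
    (1, [1, 0, 0, -1]),
]
for lam, v in pairs:
    # A v must equal lam * v for (lam, v) to be an eigenpair
    assert matvec(A, v) == [lam * x for x in v]

eigenvalues = sorted((lam for lam, _ in pairs), reverse=True)
print(eigenvalues)  # [1, 1, 1, -3]
```

For this matrix, x = A x means each cluster value equals the negative sum of the other three, that is, the four cluster values sum to zero.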

Results
Taken from: P. Qiu, Z. J. Wang, and K.J.R. Liu. “Genomic Processing for Cancer Classification and Prediction,” IEEE Signal Processing Magazine, vol. 24, no. 1, pp , Jan

In Summary
Taken from: P. Qiu, Z. J. Wang, and K.J.R. Liu. “Genomic Processing for Cancer Classification and Prediction,” IEEE Signal Processing Magazine, vol. 24, no. 1, pp , Jan

Conclusions
EDM is a model-based system used for cancer classification and prediction based on publicly available gene expression data
–Dependence of clusters on other clusters
Classification results are comparable with those of a widely accepted ML algorithm
Eigenvalues of the dependence matrix could be a valuable cancer prediction tool

References
[1] P. Qiu, Z. J. Wang, and K.J.R. Liu. “Genomic Processing for Cancer Classification and Prediction,” IEEE Signal Processing Magazine, vol. 24, no. 1, pp , Jan
[2] P. Qiu, Z. J. Wang, and K.J.R. Liu. “Ensemble dependence model for classification and prediction of cancer and normal gene expression data,” Bioinformatics, vol. 21, no. 14, pp , May
[3] D. Anastassiou. “Genomic Signal Processing,” IEEE Signal Processing Magazine, vol. 18, no. 4, pp. 8-20, July
[4] J. Astola, I. Tabus, I. Shmulevich, and E. Dougherty. “Genomic Signal Processing,” Signal Processing (Elsevier), vol. 83, pp ,
[5] American Cancer Society. “Cancer Facts and Figures 2006,” ACS :: Statistics for 2006 [Online]. Available:
[6]
[7]
[8]
[9]
[10] M. Karnick. “Genomic Signal Processing,” Engineering Frontiers, The presentation directly previous to mine, Apr 2007.