A Bioinformatics Meta-analysis of Differentially Expressed Genes in Colorectal Cancer Simon Chan, Thursday Trainee Seminar – October 11th, 2007

Introduction to Colorectal Cancer (CRC)
Cancerous growths in the colon, rectum or appendix
In 2007, an estimated 20,800 Canadians will be diagnosed with CRC and approximately 8,700 will die of it (Source: Canadian Cancer Society)
Stages of CRC (Image Source: Cardoso J, et al. 2007)

High throughput gene expression analysis
Many high throughput gene expression analyses have been performed and published:
– Cancer versus Normal
– Cancer versus Adenoma
– Adenoma versus Normal
Various technologies used:
– Serial Analysis of Gene Expression (SAGE)
– Oligonucleotide microarrays
– cDNA microarrays
Goal: To determine candidate diagnostic and prognostic molecular biomarkers in CRC

Problems
Unfortunately, there is low overlap between expression profiling studies. Why?
– Different methods of obtaining tissues (e.g. Laser Capture Microdissection vs Microdissection)
– Tissue heterogeneity
– Inadequate sample numbers
– Use of different gene expression platforms (SAGE, microarray, etc.)
– Different statistical methods, fold change thresholds, etc. applied
Questions:
– Which genes are actually differentially expressed in CRC?
– Which genes would make good CRC biomarkers?

One Solution
Determine the intersection of a comprehensive collection of high throughput gene expression studies.
Expect that genes biologically relevant to CRC will be reported the most often.
System-specific spurious genes should be under-represented.

However, the statistical significance of this overlap is often not considered
A certain level of overlap among studies can be expected due to chance alone
Table source: Cardoso J et al, 2007
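To illustrate why some overlap is expected by chance alone, here is a minimal Python sketch for the simplest case of two studies run on the same platform. It assumes each study's reported genes were drawn at random from the platform, so the shared count follows a hypergeometric distribution. The function names and the numbers in the usage lines are hypothetical and are not taken from the talk.

```python
from math import comb

def expected_overlap(n_genes, x1, x2):
    """Expected number of shared genes when two studies pick x1 and x2
    genes at random from the same n_genes platform genes."""
    return x1 * x2 / n_genes

def prob_overlap_at_least(n_genes, x1, x2, k):
    """Hypergeometric tail: chance that two random lists of sizes x1 and x2
    share at least k genes."""
    total = comb(n_genes, x2)
    upper = min(x1, x2)
    favourable = sum(
        comb(x1, i) * comb(n_genes - x1, x2 - i) for i in range(k, upper + 1)
    )
    return favourable / total

# Hypothetical sizes, for illustration only.
print(expected_overlap(10000, 200, 300))
print(prob_overlap_at_least(10000, 200, 300, 15))
```

With many studies on different platforms a closed form becomes awkward, which is one reason to estimate the chance overlap by simulation instead.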

Meta-analysis Method
Developed a vote-counting strategy to rank differentially expressed genes based on the following criteria, in order of importance:
– Number of studies reporting a gene as differentially expressed
– Number of tissue samples showing this differential expression
– Fold Change of differential expression
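The three ranking criteria above can be sketched as a lexicographic sort. This is only an illustrative Python sketch, not the scripts used in the study: the input layout (one tuple per gene per study), the function name `rank_genes`, and the use of the mean absolute fold change are assumptions made for the example.

```python
from collections import defaultdict

def rank_genes(reports):
    """Rank genes by vote counting.

    reports: iterable of (gene, study_id, sample_size, fold_change) tuples,
    one per gene per study (hypothetical layout).
    Returns (gene, n_studies, total_samples, mean_fold) tuples, best first.
    """
    by_gene = defaultdict(list)
    for gene, study_id, sample_size, fold_change in reports:
        by_gene[gene].append((study_id, sample_size, fold_change))

    rows = []
    for gene, entries in by_gene.items():
        n_studies = len({study for study, _, _ in entries})
        total_samples = sum(size for _, size, _ in entries)
        # Assumption: magnitude of fold change, averaged over reports.
        mean_fold = sum(abs(fold) for _, _, fold in entries) / len(entries)
        rows.append((gene, n_studies, total_samples, mean_fold))

    # Criteria in order of importance: studies, then samples, then fold change.
    rows.sort(key=lambda r: (-r[1], -r[2], -r[3], r[0]))
    return rows

# Hypothetical reports for three placeholder genes.
example = [
    ("A", 1, 10, 2.0), ("A", 2, 20, 3.0),
    ("B", 1, 10, 5.0), ("B", 3, 5, 4.0),
    ("C", 2, 20, 9.0),
]
for row in rank_genes(example):
    print(row)
```

In this example gene C has the largest fold change but ranks last, because the number of studies takes priority over both sample size and fold change.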

Published Gene Expression Studies
Collected 25 published gene expression studies
– 23 studies compared Cancer versus Normal
– 7 studies compared Adenoma versus Normal
– 5 studies compared Cancer versus Adenoma

Platform | Count (Total: 25)
Commercial cDNA microarrays | 12
Custom cDNA microarrays | 7
Affymetrix oligonucleotide microarrays | 3
Oligonucleotide microarrays | 2
SAGE | 1

Study 1: Differentially expressed gene list 1, Platform gene list 1
Study 2: Differentially expressed gene list 2, Platform gene list 2
Study 25: Differentially expressed gene list 25, Platform gene list 25

Example
Croner RS et al, 2005
– Compared Cancer versus Normal
– Utilized Affymetrix HG-U133A GeneChip
Obtained platform annotation file for HG-U133A from Affymetrix website
– Mapped Affy probe IDs to Entrez Gene IDs (platform gene list)
Mapped differentially expressed genes to Entrez Gene IDs (differentially expressed gene list)
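The two mapping steps in this example can be sketched in Python as follows. The probe-to-gene dictionary stands in for the parsed platform annotation file. The probe names are hypothetical placeholders, and the rule of dropping genes whose probes disagree on direction is an assumption, not something stated in the talk.

```python
from collections import defaultdict

def platform_gene_list(probe_to_entrez):
    """All Entrez Gene IDs represented by at least one probe on the platform."""
    return sorted({gene for genes in probe_to_entrez.values() for gene in genes})

def differential_gene_list(reported_probes, probe_to_entrez):
    """Map reported (probe_id, direction) pairs to {entrez_id: direction}.

    Probes with no Entrez mapping are skipped. Genes whose probes disagree
    on direction are dropped (an assumption made for this sketch).
    """
    directions = defaultdict(set)
    for probe_id, direction in reported_probes:
        for gene in probe_to_entrez.get(probe_id, ()):
            directions[gene].add(direction)
    return {
        gene: next(iter(seen))
        for gene, seen in directions.items()
        if len(seen) == 1
    }

# Hypothetical annotation: probe names are placeholders, not real probe IDs.
annotation = {
    "probe_a": [759],
    "probe_b": [1434],
    "probe_c": [1112, 759],
    "probe_d": [],
}
reported = [("probe_a", "UP"), ("probe_b", "DOWN"), ("probe_x", "UP")]

print(platform_gene_list(annotation))
print(differential_gene_list(reported, annotation))
```

Keeping both lists per study matters for the later simulation, because random draws have to come from the genes that each platform could actually have measured.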

Therefore, for each study, two files would be produced:
– File 1: All genes (represented by Entrez Gene IDs) covered on the platform: etc
– File 2: Differentially expressed Entrez Gene IDs
  759 UP
  1434 DOWN
  1112 UP
  etc

Simulations
– Developed custom Perl scripts to perform Monte Carlo simulation.
– For 10,000 iterations:
  For each study:
  – Determine number of up-regulated (X) and down-regulated (Y) genes reported in the study
  – Randomly choose X genes from the platform gene list and label as up-regulated
  – Randomly choose Y genes from the platform gene list and label as down-regulated
  Determine number of overlapping genes across the studies in this simulation
– Calculate the average number of genes with overlap of 2, 3, 4, etc. and associated P-values
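The simulation steps above can be sketched in Python. The original work used custom Perl scripts, which are not shown, so this is an illustrative re-expression under stated assumptions rather than the actual implementation. Assumptions: a gene counts as overlapping only when studies agree on direction, a gene is never drawn as both up and down within one study, and the P-value is the fraction of iterations whose multi-study gene count reaches the observed count. All function and parameter names are hypothetical.

```python
import random
from collections import Counter

def multi_study_counts(study_calls):
    """study_calls: list of {gene: "UP" | "DOWN"} dicts, one per study.
    Returns Counter {k: number of (gene, direction) pairs reported by
    exactly k studies}, for k >= 2."""
    votes = Counter()
    for calls in study_calls:
        for gene, direction in calls.items():
            votes[(gene, direction)] += 1
    return Counter(k for k in votes.values() if k >= 2)

def simulate_once(studies, rng):
    """studies: list of (platform_genes, n_up, n_down); platform_genes is a list."""
    calls = []
    for platform_genes, n_up, n_down in studies:
        # Draw X + Y distinct genes, then label the first X as up-regulated.
        picked = rng.sample(platform_genes, n_up + n_down)
        study = {gene: "UP" for gene in picked[:n_up]}
        study.update({gene: "DOWN" for gene in picked[n_up:]})
        calls.append(study)
    return multi_study_counts(calls)

def monte_carlo(studies, observed_multi, iterations=10000, seed=0):
    """Returns (mean number of genes at each overlap level, empirical P-value)."""
    rng = random.Random(seed)
    totals = Counter()
    hits = 0
    for _ in range(iterations):
        counts = simulate_once(studies, rng)
        totals.update(counts)
        if sum(counts.values()) >= observed_multi:
            hits += 1
    mean_by_k = {k: total / iterations for k, total in sorted(totals.items())}
    return mean_by_k, hits / iterations
```

With 10,000 iterations the smallest non-zero empirical P-value is 1/10,000, so a comparison with no simulated iteration reaching the observed count would be reported as P < .0001.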

Cancer versus Normal

Summary of Comparisons Analyzed for Overlap

Comparison | Total Num of Studies | Total Num of Differentially Expressed Genes Reported (mapped) | Total Num of Differentially Expressed Genes with Multi-study Confirmation | P-value
Cancer versus Normal | 23 | (5886) | 573 | <.0001
Adenoma versus Normal | 7 | (986) | 39 | <.0001
Cancer versus Adenoma | 5 | 538 (415) | 5 | .08

Gene Name, Description, Studies Reporting this Gene, Total Sample Sizes, Mean Fold Change, Validation:
– TGFβI: Transforming growth factor, beta induced, 68kDa; validation: RT-PCR
– IFITM1: Interferon induced Transmembrane Protein 1 (9-27); validation: RT-PCR
– MYC: V-myc Myelocytomatosis Viral Oncogene Homolog (avian); validation: RT-PCR
– SPARC: Secreted protein, acidic, cysteine-rich (osteonectin); validation: IHC
– GDF15: Growth differentiation factor 15; validation: RT-PCR

Future Studies
Purchased antibodies for certain high-ranking candidates
Validate protein expression level on colorectal tissue microarrays
Correlate to certain prognostic outcomes

Conclusions
Low overlap of results between many colorectal cancer high throughput gene expression studies
Meta-analysis method identified consistently reported differentially expressed genes
Cancer versus Normal and Adenoma versus Normal, but not Cancer versus Adenoma, studies resulted in genes consistently reported at a statistically significant frequency

Acknowledgements:
Dr. Steven Jones
Dr. Isabella Tai
Obi Griffith
Chan SK, Griffith OL, Tai IT, Jones SJM. Meta-analysis of Colorectal Cancer Gene Expression Profiling Studies Identifies Consistently Reported Candidate Biomarkers. Manuscript in review with Cancer Epidemiology, Biomarkers & Prevention.