A Bioinformatics Meta-analysis of Differentially Expressed Genes in Colorectal Cancer Simon Chan, Thursday Trainee Seminar – October 11.


1 A Bioinformatics Meta-analysis of Differentially Expressed Genes in Colorectal Cancer
Simon Chan, sichan@bcgsc.ca
Thursday Trainee Seminar – October 11th, 2007

2 Introduction to Colorectal Cancer (CRC)
Cancerous growths in the colon, rectum or appendix
In 2007, an estimated 20,800 Canadians will be diagnosed with CRC and approximately 8,700 will die of it (Source: Canadian Cancer Society)
Stages of CRC (Image Source: Cardoso J, et al. 2007)

3 High throughput gene expression analysis
Many high throughput gene expression analyses have been performed and published:
  – Cancer versus Normal
  – Cancer versus Adenoma
  – Adenoma versus Normal
Various technologies used:
  – Serial Analysis of Gene Expression (SAGE)
  – Oligo-nucleotide microarrays
  – cDNA microarrays
Goal: To determine candidate diagnostic and prognostic molecular biomarkers in CRC

4 Problems
Unfortunately, there is low overlap between expression profiling studies
Why?
  – Different methods of obtaining tissues (e.g. Laser Capture Microdissection vs Microdissection)
  – Tissue heterogeneity
  – Inadequate sample numbers
  – Use of different gene expression platforms (SAGE, microarray, etc.)
  – Different statistical methods, fold change thresholds, etc. applied
Questions:
  – Which genes are actually differentially expressed in CRC?
  – Which genes would make good CRC biomarkers?

5 One Solution
Determine the intersection of a comprehensive collection of high throughput gene expression studies.
Expect that genes biologically relevant to CRC will be reported the most often.
System-specific spurious genes should be under-represented.

6 However, the statistical significance of this overlap is often not considered
A certain level of overlap among studies can be expected due to chance alone
Table source: Cardoso J et al, 2007

7 Meta-analysis Method
Developed a vote-counting strategy to rank differentially expressed genes based on the following criteria, in order of importance:
  – Number of studies reporting a gene as differentially expressed
  – Number of tissue samples showing this differential expression
  – Fold change of differential expression
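The three-level ranking on this slide can be sketched as a sort on a composite key. This is an illustrative Python sketch, not the scripts used in the study: the `GeneRecord` type, its field names, and the gene values are made up, and taking the absolute value of the fold change is an assumption the slide does not state.

```python
from typing import NamedTuple


class GeneRecord(NamedTuple):
    """Hypothetical per-gene summary pooled across studies."""
    symbol: str
    n_studies: int           # studies reporting the gene as differentially expressed
    total_samples: int       # tissue samples summed over those studies
    mean_fold_change: float  # mean fold change over those studies


def rank_genes(records):
    """Sort by study count, then sample count, then fold change magnitude."""
    return sorted(
        records,
        # Tuples compare left to right, so earlier criteria dominate later ones.
        key=lambda r: (r.n_studies, r.total_samples, abs(r.mean_fold_change)),
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up values, for illustration only.
    toy = [
        GeneRecord("GENE_A", 7, 230, 7.4),
        GeneRecord("GENE_B", 9, 351, 7.5),
        GeneRecord("GENE_C", 9, 369, 8.9),
        GeneRecord("GENE_D", 7, 244, 6.3),
    ]
    for rank, rec in enumerate(rank_genes(toy), start=1):
        print(rank, rec.symbol, rec.n_studies, rec.total_samples, rec.mean_fold_change)
```

Because the key is a tuple, fold change only breaks ties between genes that already agree on both study count and sample count, which matches the stated order of importance.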

8 Published Gene Expression Studies
Collected 25 published gene expression studies
  – 23 studies compared Cancer versus Normal
  – 7 studies compared Adenoma versus Normal
  – 5 studies compared Cancer versus Adenoma
Platform | Count (Total: 25)
Commercial cDNA microarrays | 12
Custom cDNA microarrays | 7
Affymetrix oligo-nucleotide microarrays | 3
Oligo-nucleotide microarrays | 2
SAGE | 1

9 [Diagram: each of Study 1, Study 2, ... Study 25 yields a differentially expressed gene list and a platform gene list]

10 Example
Croner RS et al, 2005
  – Compared Cancer versus Normal
  – Utilized Affymetrix HG-U133A GeneChip
Obtained platform annotation file for HG-U133A from Affymetrix website
  – Mapped Affy probe IDs to Entrez Gene IDs (platform gene list)
Mapped differentially expressed genes to Entrez Gene IDs (differentially expressed gene list)
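The probe-to-gene mapping step might look like the Python sketch below. It is not the code used in the study. It assumes a CSV annotation file with `#` comment lines, a `Probe Set ID` column, and an `Entrez Gene` column in which multiple IDs are separated by `///`; check the column names and separator against the actual annotation file. The probe names in the sample are hypothetical.

```python
import csv
import io


def load_probe_to_entrez(handle, probe_col="Probe Set ID", gene_col="Entrez Gene"):
    """Read an annotation CSV into {probe_id: set of Entrez Gene IDs}."""
    # Assumed layout: leading '#' comment lines, then a header row.
    rows = (line for line in handle if not line.startswith("#"))
    mapping = {}
    for row in csv.DictReader(rows):
        # Keep numeric tokens only; placeholders such as '---' are dropped.
        ids = {int(tok.strip()) for tok in row[gene_col].split("///")
               if tok.strip().isdigit()}
        if ids:
            mapping[row[probe_col]] = ids
    return mapping


def platform_gene_list(mapping):
    """All Entrez Gene IDs covered by the platform."""
    return set().union(*mapping.values())


def map_reported_probes(probe_ids, mapping):
    """Entrez Gene IDs for the probes a study reported; unmapped probes are skipped."""
    genes = set()
    for probe in probe_ids:
        genes |= mapping.get(probe, set())
    return genes


if __name__ == "__main__":
    # Tiny in-memory stand-in for an annotation file (hypothetical probe names).
    sample = (
        "#comment line\n"
        '"Probe Set ID","Entrez Gene"\n'
        '"probe_a","759"\n'
        '"probe_b","1434 /// 1112"\n'
        '"probe_c","---"\n'
    )
    mapping = load_probe_to_entrez(io.StringIO(sample))
    print(sorted(platform_gene_list(mapping)))
    print(sorted(map_reported_probes(["probe_b", "probe_c"], mapping)))
```

Probes that map to several genes contribute all of them here; a stricter analysis might drop ambiguous probes instead.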

11 Therefore, for each study, two files would be produced:
  – File 1: All genes (represented by Entrez Gene IDs) covered on the platform:
      759
      10581
      11234
      76013
      etc.
  – File 2: Differentially expressed Entrez Gene IDs:
      759 UP
      1434 DOWN
      1112 UP
      etc.
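Reading the two per-study files could be sketched as follows in Python. The exact file layout is an assumption: one Entrez Gene ID per line in File 1, and an ID followed by whitespace and `UP` or `DOWN` on each line of File 2. The function names are hypothetical.

```python
def parse_platform_genes(text):
    """File 1: whitespace-separated Entrez Gene IDs covered by the platform."""
    return {int(tok) for tok in text.split()}


def parse_de_genes(text):
    """File 2: one '<Entrez Gene ID> <UP|DOWN>' pair per line."""
    calls = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 2 or parts[1].upper() not in ("UP", "DOWN"):
            raise ValueError(f"unexpected line: {line!r}")
        calls[int(parts[0])] = parts[1].upper()
    return calls


if __name__ == "__main__":
    platform = parse_platform_genes("759\n10581\n11234\n76013\n")
    calls = parse_de_genes("759\tUP\n1434\tDOWN\n1112\tUP\n")
    # Reported genes missing from the platform list deserve a second look.
    print(sorted(set(calls) - platform))
```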

12 Simulations
  – Developed custom Perl scripts to perform Monte Carlo simulation.
  – For 10,000 iterations:
      For each study:
        – Determine the number of up-regulated (X) and down-regulated (Y) genes reported in the study
        – Randomly choose X genes from the platform gene list and label them as up-regulated
        – Randomly choose Y genes from the platform gene list and label them as down-regulated
      Determine the number of overlapping genes across the studies in this simulation
  – Calculate the average number of genes with an overlap of 2, 3, 4, etc. and the associated P-values
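The simulation loop above can be sketched in Python as follows. This is not the Perl code used in the study. Several details are assumptions the slide leaves open: up- and down-regulated picks within a study are drawn without overlap, a gene counts as overlapping only when studies agree on its direction, "overlap of k" means reported by exactly k studies, and the empirical P-value is the fraction of iterations in which the simulated count reaches the observed count. All function names are hypothetical.

```python
import random
from collections import Counter


def overlap_histogram(study_calls):
    """Map k -> number of (gene, direction) pairs reported by exactly k studies."""
    votes = Counter()
    for calls in study_calls:
        for gene, direction in calls.items():
            votes[(gene, direction)] += 1
    return Counter(votes.values())


def simulate_once(studies, rng):
    """studies: list of (platform_gene_list, n_up, n_down), one tuple per study."""
    simulated = []
    for platform, n_up, n_down in studies:
        # Draw X + Y distinct genes; the first X are labelled UP, the rest DOWN.
        picked = rng.sample(platform, n_up + n_down)
        calls = {gene: "UP" for gene in picked[:n_up]}
        calls.update({gene: "DOWN" for gene in picked[n_up:]})
        simulated.append(calls)
    return overlap_histogram(simulated)


def monte_carlo(studies, observed, iterations=10_000, seed=0):
    """Return {k: (mean simulated count, empirical P-value)} for each k in observed."""
    rng = random.Random(seed)
    totals, hits = Counter(), Counter()
    for _ in range(iterations):
        hist = simulate_once(studies, rng)
        for k, obs in observed.items():
            totals[k] += hist[k]
            if hist[k] >= obs:
                hits[k] += 1
    return {k: (totals[k] / iterations, hits[k] / iterations) for k in observed}


if __name__ == "__main__":
    # Toy input: three studies on the same 1,000-gene platform (made-up numbers).
    platform = list(range(1000))
    studies = [(platform, 30, 20), (platform, 25, 25), (platform, 40, 10)]
    observed = {2: 12, 3: 2}  # hypothetical observed overlap counts
    for k, (mean, p) in sorted(monte_carlo(studies, observed, iterations=1000).items()):
        print(f"overlap {k}: mean simulated = {mean:.2f}, P = {p:.4f}")
```

Sampling each study from its own platform gene list is what lets the null model account for genes that some platforms could never have reported.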

13 Cancer versus Normal

14 Summary of Comparisons Analyzed for Overlap
Comparison | Total Num of Studies | Total Num of Differentially Expressed Genes Reported (mapped) | Total Num of Differentially Expressed Genes with Multi-study Confirmation | P-value
Cancer versus Normal | 23 | 6537 (5886) | 573 | <.0001
Adenoma versus Normal | 7 | 1101 (986) | 39 | <.0001
Cancer versus Adenoma | 5 | 538 (415) | 5 | .08

15 Gene Name | Description | Studies Reporting this Gene | Total Sample Sizes | Mean Fold Change | Validation
TGFβI | Transforming growth factor, beta induced, 68kDa | 9 | 369 | 8.94 | RT-PCR
IFITM1 | Interferon induced transmembrane protein 1 (9-27) | 9 | 351 | 7.52 | RT-PCR
MYC | V-myc myelocytomatosis viral oncogene homolog (avian) | 7 | 329 | 5.02 | RT-PCR
SPARC | Secreted protein, acidic, cysteine-rich (osteonectin) | 7 | 244 | 6.30 | IHC
GDF15 | Growth differentiation factor 15 | 7 | 230 | 7.42 | RT-PCR

16 Future Studies
Purchased antibodies for certain high ranking candidates
Validate protein expression level on colorectal tissue microarrays
Correlate to certain prognostic outcomes

17 Conclusions
Low overlap of results between many colorectal cancer high throughput gene expression studies
Meta-analysis method identified consistently reported differentially expressed genes
Cancer versus Normal and Adenoma versus Normal, but not Cancer versus Adenoma, studies resulted in genes consistently reported at a statistically significant frequency

18 Acknowledgements:
Dr. Steven Jones
Dr. Isabella Tai
Obi Griffith
Chan SK, Griffith OL, Tai IT, Jones SJM. Meta-analysis of Colorectal Cancer Gene Expression Profiling Studies Identifies Consistently Reported Candidate Biomarkers. Manuscript in review with Cancer Epidemiology, Biomarkers & Prevention.

