CodeLink compatible Microarray Analysis of Gene Expression in Huntington's Disease Peripheral Blood - a Platform Comparison.


1 CodeLink compatible Microarray Analysis of Gene Expression in Huntington's Disease Peripheral Blood - a Platform Comparison

2 Microarray Analysis of Gene Expression in Huntington's Disease Peripheral Blood - a Platform Comparison
– General microarray data analysis workflow
– From raw data to biological significance
– Comparison statistics and correction for multiple testing
– GeneSifter Overview
– Gene Expression in Huntington's Disease Peripheral Blood
– Identification of biological themes
– Platform comparison

3 Analysis Workflow: Raw data -> Normalized, scaled data -> Differentially expressed genes -> Identify and partition expression patterns -> Gene Summaries -> Biological themes (Pathways, molecular function, etc.)

4 Analysis Workflow
– Raw data: data upload
– Normalized, scaled data
– Differentially expressed genes: comparison statistics, correction for multiple testing
– Identify and partition expression patterns: up and down regulated, magnitude, clustering
– Gene Summaries: annotation (UniGene, Entrez Gene, Gene Ontologies, etc.)
– Biological themes (Pathways, molecular function, etc.): ontology report, pathway report, z-score

5 Experiment Design
Experimental design determines what can be inferred from the data as well as the confidence that can be assigned to those inferences. Careful experimental design and the presence of biological replicates are essential to the successful use of microarrays.
Type of experiment
– Two groups
– Three or more groups: time series, dose response, multiple treatment
The type of experiment and number of groups will affect the statistical methods used to detect differential expression.
Replicates
– The more the better, but at least 3
– Biological better than technical
Rigorous statistical inferences cannot be made with a sample size of one. The more replicates, the stronger the inference.
Supporting material: Kathleen Kerr, Experimental Design and Other Issues in Microarray Studies, http://ra.microslu.washington.edu/presentation/documents/KerrNAS.pdf
microarraysuccess.com

6 Differential Expression
The fundamental goal of microarray experiments is to identify genes that are differentially expressed in the conditions being studied. Comparison statistics can be used to help identify differentially expressed genes, and cluster analysis can be used to identify patterns of gene expression and to segregate a subset of genes based on these patterns.
Statistical Significance
– Fold change: does not address the reproducibility of the observed difference and cannot be used to determine statistical significance.
– Comparison statistics
  2 groups: t-test, Welch’s t-test, Wilcoxon Rank Sum
  3 or more groups: ANOVA, Kruskal-Wallis
Comparison tests require replicates and use the variability within the replicates to assign a confidence level as to whether the gene is differentially expressed.
Supporting material: Draghici S. (2002) Statistical intelligence: effective analysis of high-density microarray data. Drug Discov Today, 7(11 Suppl): S55-63.
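
The point that fold change says nothing about reproducibility can be illustrated with a small sketch. The Python below computes a log2 fold change and a Welch's t statistic (with Welch-Satterthwaite degrees of freedom) for two hypothetical genes. The intensity values and function names are invented for illustration; they are not taken from the Huntington's data set or from GeneSifter.

```python
import math
from statistics import mean, variance

def log2_fold_change(group1, group2):
    """log2 of the ratio of group means (group2 relative to group1)."""
    return math.log2(mean(group2) / mean(group1))

def welch_t(group1, group2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = variance(group1), variance(group2)  # sample variances
    se2 = v1 / n1 + v2 / n2                      # squared standard error
    t = (mean(group2) - mean(group1)) / math.sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

# Hypothetical intensities: both genes double on average,
# but only the first does so consistently across replicates.
tight_control = [100, 102, 98, 101]
tight_treated = [200, 204, 196, 202]
noisy_control = [40, 160, 60, 140]
noisy_treated = [80, 320, 120, 280]

for name, c, t in [("tight", tight_control, tight_treated),
                   ("noisy", noisy_control, noisy_treated)]:
    stat, df = welch_t(c, t)
    print(name, round(log2_fold_change(c, t), 3), round(stat, 2), round(df, 2))
```

Both genes show the same fold change, yet the t statistic is far larger for the gene whose replicates agree. Converting the statistic to a p-value needs the t distribution, which a statistics package would supply.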

7 Differential Expression: Correction for multiple testing
Methods for adjusting the p-value from a comparison test based on the number of tests performed. These adjustments help to reduce the number of false positives in an experiment.
– FWER: Family Wise Error Rate corrections adjust the p-value so that it reflects the chance of at least 1 false positive being found in the list. Methods: Bonferroni, Holm, Westfall and Young maxT.
– FDR: False Discovery Rate corrections adjust the p-value so that it reflects the frequency of false positives in the list. Methods: Benjamini and Hochberg, SAM.
The FWER is more conservative, but the FDR is usually acceptable for “discovery” experiments, i.e. where a small number of false positives is acceptable.
Dudoit, S., et al. (2003) Multiple hypothesis testing in microarray experiments. Statistical Science 18(1): 71-103.
Reiner, A., et al. (2003) Identifying differentially expressed genes using false discovery rate controlling procedures. Bioinformatics 19(3): 368-375.
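
As a rough illustration of the difference between the two families, the sketch below computes Bonferroni (FWER) and Benjamini and Hochberg (FDR) adjusted p-values in plain Python. The raw p-values are hypothetical and the function names are illustrative; this is not GeneSifter's implementation.

```python
def bonferroni(pvalues):
    """FWER control: multiply each p-value by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

def benjamini_hochberg(pvalues):
    """FDR control: step-up adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping adjusted values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.001, 0.008, 0.020, 0.030, 0.20]  # hypothetical raw p-values
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
print(sum(p <= 0.05 for p in bonf), "genes pass Bonferroni at 0.05")
print(sum(p <= 0.05 for p in bh), "genes pass Benjamini and Hochberg at 0.05")
```

With these made-up inputs the FDR adjustment retains more genes than the FWER adjustment at the same cutoff, which is the sense in which the FWER is more conservative.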

8 GeneSifter – Microarray Data Analysis
Accessibility
– Web-based
– Secure
– Data management
– Data Annotation (MIAME)
– Multiple upload tools: CodeLink, Affymetrix, Illumina, Agilent, Custom
Differential Expression - Powerful, accessible tools for determining Statistical Significance
– R based statistics: Bioconductor
– Comparison Tests: t-test, Welch’s t-test, Wilcoxon Rank sum test, ANOVA
– Correction for Multiple Testing: Bonferroni, Holm, Westfall and Young maxT, Benjamini and Hochberg
– Unsupervised Clustering: PAM, CLARA, Hierarchical clustering
– Silhouettes

9 GeneSifter – Microarray Data Analysis Integrated tools for determining Biological Significance One Click Gene Summary™ Ontology Report Pathway Report Search by ontology terms Search by KEGG terms or Chromosome

10 The GeneSifter Data Center Free resource Training Research Publishing 5 areas Cardiovascular Cancer Neuroscience Immunology Oral Biology Access to : Data Analysis summary Tutorials WebEx

11 The GeneSifter Data Center www.genesifter.net/dc

12 GeneSifter - Analysis Examples
– Data Upload: CodeLink
– 2 groups (Huntington's Blood vs Healthy Blood): Differential expression (Fold change, Quality, t-test, False discovery rate)
– 3+ groups (Time series, dose response, etc.): Differential expression (Fold change, Quality, ANOVA, False discovery rate)
– Visualization: Hierarchical clustering, PCA
– Partitioning: PAM, Silhouettes
– Biological significance: Gene Annotation, Ontology report, Pathway report

13 Microarray Analysis of Gene Expression in Huntington's Disease Peripheral Blood - a Platform Comparison
– General microarray data analysis workflow
– From raw data to biological significance
– Comparison statistics and correction for multiple testing
– GeneSifter Overview
– Gene Expression in Huntington's Disease Peripheral Blood
– Identification of biological themes
– Platform comparison

14 Background - Huntington’s Disease
Huntington’s Disease (HD)
– Autosomal dominant neurodegenerative disease
– Motor impairment
– Cognitive decline
– Various psychiatric symptoms
– Onset 30-50 years
– Mutant Huntingtin protein (polyglutamine)
– Affects transcriptional regulation
– Transcription effects may occur outside of CNS

15 Pairwise Analysis CodeLink Human 20K Bioarray Human blood expression for Huntington’s disease versus control, CodeLink Borovecki F, Lovrecic L, Zhou J, Jeong H, Then F, Rosas HD, Hersch SM, Hogarth P, Bouzou B, Jensen RV, Krainc D. Genome-wide expression profiling of human blood reveals biomarkers for Huntington's disease. Proc Natl Acad Sci U S A. 2005 Aug 2;102(31):11023-8.

16 Background - Data
Genome-wide expression profiling of human blood reveals biomarkers for Huntington's disease. Borovecki F, Lovrecic L, Zhou J, Jeong H, Then F, Rosas HD, Hersch SM, Hogarth P, Bouzou B, Jensen RV, Krainc D. Proc Natl Acad Sci U S A. 2005 Aug 2;102(31):11023-8.
– Collected peripheral blood samples: 14 controls, 12 symptomatic HD patients, 5 presymptomatic HD patients
– Identified the 322 most differentially expressed genes (control vs. symptomatic HD) using the U133A array
– Used CodeLink 20K to confirm genes identified using the Affymetrix platform
– Focused on 12 genes that showed the most significant difference between control and HD
– Data available from GEO

17 Pairwise Analysis CodeLink Human 20K Bioarray Human blood expression for Huntington’s disease versus control, CodeLink Borovecki F, Lovrecic L, Zhou J, Jeong H, Then F, Rosas HD, Hersch SM, Hogarth P, Bouzou B, Jensen RV, Krainc D. Genome-wide expression profiling of human blood reveals biomarkers for Huntington's disease. Proc Natl Acad Sci U S A. 2005 Aug 2;102(31):11023-8.

18 Pairwise Analysis: select group 1 (14 normal), select group 2 (12 Huntington's)

19 Pairwise Analysis
– Already normalized (median)
– t-test
– Quality filter: 0.75 (filters out genes with signal less than 0.75)
– Benjamini and Hochberg (FDR)
– Log transform data
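
The quality filter and log transform settings can be sketched as follows. This is a minimal illustration under stated assumptions: the slide does not say whether the threshold applies to every sample or to a summary value, so the sketch assumes a gene is kept only when all of its samples reach the threshold, and it assumes a base-2 logarithm. The gene names and signals are hypothetical, and GeneSifter's actual rule may differ.

```python
import math

def filter_and_log(expression, threshold=0.75):
    """Drop genes with any signal below threshold, then log2-transform the rest.

    expression: dict mapping gene name -> list of normalized signals.
    The all-samples rule and base-2 log are assumptions of this sketch.
    """
    kept = {}
    for gene, signals in expression.items():
        if min(signals) >= threshold:
            kept[gene] = [math.log2(s) for s in signals]
    return kept

# Hypothetical median-normalized signals
data = {
    "geneA": [1.0, 2.0, 4.0],
    "geneB": [0.5, 0.9, 1.1],   # one sample below 0.75, so it is removed
    "geneC": [0.75, 1.5, 3.0],
}
result = filter_and_log(data)
print(sorted(result))
```

For the Affymetrix MAS 5 data the same function could be called with `threshold=50`, matching the setting used for that platform.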

20 Pairwise Analysis – Gene List

21 Biological Significance: Gene Annotation Sources
– UniGene: organizes GenBank sequences into a non-redundant set of gene-oriented clusters. Gene titles are assigned to the clusters and these titles are commonly used by researchers to refer to that particular gene.
– LocusLink (Entrez Gene): provides a single query interface to curated sequence and descriptive information, including function, about genes.
– Gene Ontologies: the Gene Ontology™ Consortium provides controlled vocabularies for the description of the molecular function, biological process and cellular component of gene products.
– KEGG: the Kyoto Encyclopedia of Genes and Genomes provides information about both regulatory and metabolic pathways for genes.
– Reference Sequences: the NCBI Reference Sequence project (RefSeq) provides reference sequences for both the mRNA and protein products of included genes.
GeneSifter maintains its own copies of these databases and updates them automatically.

22 One-Click Gene Summary

23 Pairwise Analysis – Gene List

24 Ontology Report

25 Ontology Report: z-score
R = total number of genes meeting selection criteria
N = total number of genes measured
r = number of genes meeting selection criteria with the specified GO term
n = total number of genes measured with the specified GO term
Reference: Doniger SW, Salomonis N, Dahlquist KD, Vranizan K, Lawlor SC, Conklin BR. MAPPFinder: using Gene Ontology and GenMAPP to create a global gene-expression profile from microarray data. Genome Biology 2003, 4:R7.
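
The formula itself does not survive in this transcript. As a sketch, the function below uses the z-score as it is usually stated for MAPPFinder in terms of the four quantities defined above: the observed count r minus its expected value n·R/N, divided by the standard deviation under sampling without replacement. Treat the exact form as an assumption to check against the cited paper; the example numbers are hypothetical.

```python
import math

def go_z_score(r, n, R, N):
    """z-score for over- or under-representation of a GO term.

    r: genes meeting the criteria that carry the GO term
    n: genes measured that carry the GO term
    R: genes meeting the criteria
    N: genes measured
    """
    expected = n * R / N
    # Variance of the count, with a finite-population correction
    var = n * (R / N) * (1 - R / N) * (1 - (n - 1) / (N - 1))
    return (r - expected) / math.sqrt(var)

# Hypothetical example: 1000 genes measured, 100 selected,
# 50 genes carry the term, 15 of those are in the selected list.
z = go_z_score(r=15, n=50, R=100, N=1000)
print(round(z, 2))
```

A positive z-score means the term appears in the gene list more often than expected by chance, a negative one less often. The function does not guard against degenerate inputs such as n = 0 or R = N, where the variance is zero.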

26 Z-score Report

28 KEGG Report

29 Pairwise Analysis - Summary
Human blood expression for Huntington’s disease versus control, CodeLink (12 HD, 14 Control)
~20,000 genes -> 5684 genes
Steps: t-test, Benjamini and Hochberg (FDR); pattern selection; z-scores
2606 increased in HD
– Biological processes: Protein biosynthesis (104), Ubiquitin cycle (123), RNA splicing (53)
– KEGG: Oxidative phosphorylation (35), Apoptosis (22)
3078 decreased in HD
– Biological processes: Neurogenesis (90), Cell adhesion (120), Sodium ion transport (29), G-protein coupled receptor signaling (114)
– KEGG: Neuroactive ligand-receptor interaction (56)

30 Microarray Analysis of Gene Expression in Huntington's Disease Peripheral Blood - a Platform Comparison
– General microarray data analysis workflow
– From raw data to biological significance
– Comparison statistics and correction for multiple testing
– GeneSifter Overview
– Gene Expression in Huntington's Disease Peripheral Blood
– Identification of biological themes
– Platform comparison

31 Pairwise Analysis U133A Human Genome Array MAS 5 signal Human blood expression for Huntington’s disease versus control, Affymetrix Borovecki F, Lovrecic L, Zhou J, Jeong H, Then F, Rosas HD, Hersch SM, Hogarth P, Bouzou B, Jensen RV, Krainc D. Genome-wide expression profiling of human blood reveals biomarkers for Huntington's disease. Proc Natl Acad Sci U S A. 2005 Aug 2;102(31):11023-8.

32 Pairwise Analysis - Affymetrix
– Already normalized (median)
– t-test
– Quality filter: 50 (filters out genes with signal less than 50)
– Benjamini and Hochberg (FDR)
– Log transform data

33 Pairwise Analysis – Gene List Human blood expression for Huntington’s disease versus control, Affymetrix

34 Gene Lists – Common and Unique Genes

35 Platform comparison – Biological themes Affymetrix

36 Platform comparison – Biological themes CodeLink

37 GeneSifter - Analysis Examples
– Data Upload: CodeLink
– 2 groups (Huntington's Blood vs Healthy Blood): Differential expression (Fold change, Quality, t-test, False discovery rate)
– 3+ groups (Time series, dose response, etc.): Differential expression (Fold change, Quality, ANOVA, False discovery rate)
– Visualization: Hierarchical clustering, PCA
– Partitioning: PAM, Silhouettes
– Biological significance: Gene Annotation, Ontology report, Pathway report

38 Project Analysis - Clustering

39 Cluster by Samples – All Genes CodeLink Affymetrix

40 Cluster by Samples – ? CodeLink Affymetrix

41 Cluster by Samples – Y Chrom. Genes CodeLink Affymetrix

42 Platform Comparison - Summary
                      CodeLink    Affymetrix
Transcripts total     19729       22283
Increased in HD       2606        1976
Overlap (LL genes)    41%         65%
Top BP Ontologies: Ubiquitin cycle, RNA splicing, Regulation of translation, Apoptosis
Clustering of samples

43 Platform Comparison - Summary
                      CodeLink           Affymetrix
Increased in HD       2606               1976
Decreased in HD       3078               986
Unique ontology       Oxidative Phos.    IL-6 Biosynthesis

44 Seven Keys to Successful Microarray Data Analysis
– Experiment Design: Type of experiment (Two groups, Time series, Dose Response, Multiple treatments); Replicates (The more the better, Technical vs. biological)
– Platform Selection: Platforms (cDNA, Oligo, One color, Two color); Feature Extraction Software; File formats
– Data Management: Databases; Raw Data (Storing, Retrieving); Experiment Annotation (Samples, Protocols)
– System Access: Usability (Intuitive, Special training); System Access (Single user desktop, Single user server, Web-based); Sharing data (In the lab, Collaboration)
– Differential Expression: Normalization; Differential Expression (Fold change, Comparison statistics, FWER/FDR); Pattern Identification (Clustering, Visualization, Partitioning)
– Biological Significance: Gene Annotation (UniGene, LocusLink, Gene Ontology, KEGG, OMIM); Single Genes (Gene Summaries); Gene Lists (Ontology Report, Pathway Report)
– Data Publication: MIAME (What is it?, Publication); Public databases (GEO, ArrayExpress, SMD); Using public data (Meta analysis)
MicroarraySuccess.com. Academic partner: University of Washington

45 The GeneSifter Data Center www.genesifter.net/dc

46 Thank You
Eric Olson, eric@genesifter.net, 206.283.4363
www.genesifter.net: trial account, tutorials, sample data and Data Center

