Affymetrix case study
Jesper Jørgensen, NsGene A/S

Overview
Affymetrix GeneChip technology
Data processing
–Expression level
–Normalisation
–Fold change
–Statistics
Parkinson's disease
Ventral versus dorsal midbrain (case study)
Verification of array data
–Q-PCR
–In situ hybridization
–Immunohistochemistry

Expression profiling
–Investigate the mRNA expression profile.
–Compare gene expression between two or more situations.
–Case versus control.
Profiling methods
–Differential display.
–SAGE (Serial Analysis of Gene Expression).
–Microarray (custom spotted arrays / Affymetrix GeneChip).

Affymetrix GeneChip technology
[Figure: a gene, 5' to 3', covered by multiple oligo probes in perfect match (PM) / mismatch (MM) pairs. Adapted from David Givol, Weizmann Institute of Science.]

Probe synthesis on the array

Probe set design
A probe set = PM, MM pairs
(Probe design is not optimized)

Preparation of samples for GeneChip
[Figure labels: Amplification (T7 RNA polymerase); U133A; U133B. Modified from Knudsen (2002), "A Biologist's Guide to Analysis of DNA Microarray Data", Wiley.]

The hardware

Expression level (probe signal)
Li-Wong model
[Model equation not recoverable from the transcript; the surviving legend defines a term indexed by n as a scaling factor obtained by fitting.]
Several other models exist. Irizarry et al. (2002) use log-transformed PM values after carrying out a global background adjustment and across-array normalisation.
Irizarry et al. (2002) Biostatistics
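
The slide's own equation is not preserved, so the following is only a rough sketch under a stated assumption: it uses the commonly cited form of the Li-Wong model, PM_ij - MM_ij = theta_i * phi_j + error, where theta_i is the expression index of array i and phi_j the sensitivity of probe j, with sum(phi_j^2) fixed to the number of probes. The function name `fit_li_wong`, the alternating least-squares fit and the input values are illustrative, not taken from the presentation or from dChip.

```python
import math


def fit_li_wong(y, n_iter=20):
    """Fit y[i][j] ~ theta[i] * phi[j] by alternating least squares.

    y[i][j] is PM - MM for array i and probe j of one probe set.
    phi is rescaled so that sum(phi_j^2) equals the number of probes
    (assumed identifiability constraint).
    """
    n_arrays, n_probes = len(y), len(y[0])
    phi = [1.0] * n_probes  # start from equal probe sensitivities

    def update_theta(current_phi):
        # Least-squares estimate of each array's expression index given phi.
        denom = sum(p * p for p in current_phi)
        return [
            sum(y[i][j] * current_phi[j] for j in range(n_probes)) / denom
            for i in range(n_arrays)
        ]

    for _ in range(n_iter):
        theta = update_theta(phi)
        # Least-squares estimate of each probe's sensitivity given theta.
        denom = sum(t * t for t in theta)
        phi = [
            sum(y[i][j] * theta[i] for i in range(n_arrays)) / denom
            for j in range(n_probes)
        ]
        # Enforce the constraint sum(phi_j^2) == n_probes.
        scale = math.sqrt(n_probes / sum(p * p for p in phi))
        phi = [p * scale for p in phi]

    theta = update_theta(phi)  # final theta consistent with the rescaled phi
    return theta, phi


# Hypothetical PM - MM values: 2 arrays x 3 probes, second array twice as bright.
y = [[50.0, 100.0, 150.0],
     [100.0, 200.0, 300.0]]
theta, phi = fit_li_wong(y)
```

With noise-free input like this, the fitted theta of the second array is twice that of the first, which is the sense in which theta acts as the expression level.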

qspline normalisation (M/A plot)
Assumption: most genes are unchanged.
M/A plot: raw chip data are used to plot, for each probe, the logarithm of the ratio between two chips (M) versus the logarithm of the mean expression for the two chips (A).
[Figure: M/A plots before and after normalisation.]
Workman et al. (2002) Genome Biology, vol. 3, no. 9.
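
As a small illustration of the M/A definition on this slide, the sketch below computes M and A for two chips. The intensities are made up, `ma_values` is a hypothetical helper, and the final median-centring step is only a crude stand-in that uses the "most genes are unchanged" assumption; it is not the qspline method of Workman et al.

```python
import math
import statistics


def ma_values(chip1, chip2):
    """Return (M, A) per probe, following the slide's wording.

    M = log2 of the ratio between the two chips.
    A = log2 of the mean expression of the two chips.
    (Many texts instead define A as the mean of the two log2 values.)
    """
    m = [math.log2(a / b) for a, b in zip(chip1, chip2)]
    a_vals = [math.log2((a + b) / 2.0) for a, b in zip(chip1, chip2)]
    return m, a_vals


# Hypothetical raw intensities for three probes on two chips.
chip1 = [8.0, 100.0, 400.0]
chip2 = [2.0, 100.0, 200.0]
m, a = ma_values(chip1, chip2)

# Crude stand-in for normalisation: if most genes are unchanged,
# the typical M should be 0, so subtract the median M.
m_centred = [value - statistics.median(m) for value in m]
```

A real normalisation would fit a smooth curve of M against A and subtract it, so that the correction can vary with intensity.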

Variation
Two different amplifications of the same RNA applied to GeneChips
[Figure panels: A/A, B/B]

Fold change (log fold)
Fold change = sample/control
Log transformation makes the scale symmetric around 0
All data are log2 transformed
[Figure axes: fold change; log fold (base 2)]
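
A short numeric check of the symmetry point, using made-up expression values and a hypothetical helper name: a doubling and a halving are 2 and 0.5 on the fold-change scale, but +1 and -1 after log2 transformation.

```python
import math


def log2_fold_change(sample, control):
    """Log2 of fold change, where fold change = sample / control."""
    return math.log2(sample / control)


up = log2_fold_change(200.0, 100.0)    # fold change 2.0
down = log2_fold_change(100.0, 200.0)  # fold change 0.5
same = log2_fold_change(150.0, 150.0)  # fold change 1.0
```

Here `up` and `down` have equal magnitude and opposite sign, and an unchanged gene sits at 0.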

Statistical testing
Is the regulation significant?
Student's and Welch's t-test
ANOVA
SAM
Wilcoxon
Kruskal-Wallis
Westfall-Young
...
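
To make one of the listed tests concrete, here is a minimal sketch of the Welch t statistic and its Welch-Satterthwaite degrees of freedom for a single gene. The log2 expression values and the function name `welch_t` are hypothetical; turning t and df into a P-value needs a t-distribution table or a statistics library, which is left out here.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch's t statistic and approximate degrees of freedom.

    Unlike Student's t-test, the two groups are not assumed to have
    equal variances.
    """
    # Squared standard error contributed by each group.
    se2_a = statistics.variance(group_a) / len(group_a)
    se2_b = statistics.variance(group_b) / len(group_b)
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical log2 expression of one gene in three case and three control arrays.
case = [2.0, 3.0, 4.0]
control = [1.0, 2.0, 3.0]
t_stat, dof = welch_t(case, control)
```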

Bonferroni correction
Increased likelihood of getting a significant result by chance alone
At a P-value of 0.05 you expect:
–5 false positives if you look at 100 genes
–1200 false positives if you look at 24,000 genes
If you want at most a 25% chance of having even one false positive in the list of regulated genes, you should only consider P-values more significant than the Bonferroni-corrected cutoff:
–2.5 x 10^-3 (0.25/100) if you look at 100 genes
–1.0 x 10^-5 (0.25/24,000) if you look at 24,000 genes
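
The arithmetic on this slide can be reproduced in a few lines. The function names are illustrative; the gene counts, the P-value of 0.05 and the 0.25 family-wise error level are the slide's own numbers.

```python
def expected_false_positives(p_cutoff, n_genes):
    """Expected number of genes passing the cutoff by chance alone."""
    return p_cutoff * n_genes


def bonferroni_cutoff(family_error, n_genes):
    """Per-gene P-value cutoff that bounds the chance of any false positive."""
    return family_error / n_genes


fp_100 = expected_false_positives(0.05, 100)       # about 5
fp_24000 = expected_false_positives(0.05, 24000)   # about 1200
cut_100 = bonferroni_cutoff(0.25, 100)             # about 2.5e-3
cut_24000 = bonferroni_cutoff(0.25, 24000)         # about 1.0e-5
```

The cutoff shrinks in proportion to the number of genes tested, which is why the correction is very conservative on a whole-genome chip.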

Parkinson's Disease (PD)
A fairly common neurodegenerative disorder (approx. 2 million in USA/Europe)
Due to loss of the dopamine-producing neurons in the substantia nigra
Cardinal motor symptoms: tremor, rigidity and bradykinesia
Conventional treatment does not halt the progression of nerve cell loss

Fetal Transplantation for PD
Cells from the developing midbrain (A)
–are collected and dissociated (B)
–and transplanted into the striatum (C)
The cells will integrate with the host brain and produce dopamine.

Stem cells in Parkinson disease
Langston JW., J Clin Invest Jan;115(1):23-5.

Aim
In the human fetus, DA neurons can be found in the ventral part of the tegmentum (VT) from approximately 6 weeks. In contrast, no DA neurons can be found in the neighboring dorsal part (DT).
We aim to find genes associated with DA differentiation by using GeneChips to compare the expression profiles of VT and DT.
[Figure: TH IHC]

High quality RNA from 8w GA human ventral midbrain
[Figure labels: 8wVT (B), 8wDT (A), 8wDT (B), 8wVT (A)]

Experimental setup
Compare VT against DT (3x3)
Affymetrix Human Genome U133 Chip Set
–HG-U133A: Well substantiated genes
–HG-U133B: Mostly ESTs
–Total: 45,000 probes (genome)
Samples: A VENTRAL, B VENTRAL, C VENTRAL; A DORSAL, B DORSAL, C DORSAL

U133A data permutations and filter
Red: VM versus DM
–VM (A1 VENTRAL, A2 VENTRAL, B VENTRAL)
–DM (A1 DORSAL, A2 DORSAL, B DORSAL)
Other colors: permutations
Low-stringency filter (dotted line):
–Average expression > 50
–P-value < 0.04
–SLR > 0.5 (42% up-regulation in VM)
Arrange with descending fold change.
[Figure axis: SLR]
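
The low-stringency filter can be written as a simple predicate followed by a sort. This is a sketch on invented records: the probe ids, field names and values are hypothetical, and only the three thresholds come from the slide. It also checks that an SLR of 0.5 (log2 scale) corresponds to roughly 42% up-regulation.

```python
# Hypothetical per-probe summaries (not data from the study).
probes = [
    {"id": "probe_a", "mean_expr": 120.0, "p": 0.010, "slr": 1.8},
    {"id": "probe_b", "mean_expr": 30.0, "p": 0.001, "slr": 2.5},   # expression too low
    {"id": "probe_c", "mean_expr": 500.0, "p": 0.030, "slr": 0.7},
    {"id": "probe_d", "mean_expr": 80.0, "p": 0.200, "slr": 1.2},   # P-value too high
    {"id": "probe_e", "mean_expr": 90.0, "p": 0.020, "slr": 0.3},   # SLR too low
]


def low_stringency_filter(records, min_expr=50.0, max_p=0.04, min_slr=0.5):
    """Keep probes passing all three thresholds, sorted by descending SLR."""
    kept = [
        r for r in records
        if r["mean_expr"] > min_expr and r["p"] < max_p and r["slr"] > min_slr
    ]
    return sorted(kept, key=lambda r: r["slr"], reverse=True)


selected_ids = [r["id"] for r in low_stringency_filter(probes)]

# SLR is on a log2 scale, so SLR = 0.5 is a fold change of 2**0.5.
percent_up = (2 ** 0.5 - 1) * 100
```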

Genes up-regulated in VM on U133A
Low-stringency filter: Average expression > 50, P-value < 0.04, SLR > 0.5, arranged with descending fold change.
Total list 107 probes. Only SLR > 1 displayed.

Literature verification
ALDH1A DAT1 VMAT2 TH Calbindin, 28kDa HNF3a 3x Nurr1 2x IGF 4x SNCA 4x DRD2 KCNJ6 (Girk2) Ret PITX3 BDNF DLK1 (FA1) SLC17A6 (VGLUT2) EPHA5 ERBB4

Verification of array data
Array Data (100 candidate genes)
Validation on array material (confirmation)
Validation on new samples (universality)
Desk work: Statistics, Literature, Bioinformatics
RNA: Q-PCR, ISH, Northerns
Protein: IHC, ELISA, Westerns

ALDH1A1 RT-PCR
[Gel figure labels: cDNA#257 (DM), cDNA#256 (VM), cDNA#245 (DM), cDNA#244 (VM), cDNA#254 (DM), cDNA#253 (VM); 299bp product shown at 35x and at 30x.]

Q-PCR
Why Q-PCR
–Conservative (10-50 ng template)
–Sensitive
–Broad dynamic range
–Rapid (1-2 hrs)
–Gel-free
–Multiple samples can be processed simultaneously (1-96)
How to do it
–Serial dilution of 'known' standards (standard curve)
–From the standard curve, expression levels are calculated
–Then, data are 'normalized' to housekeeping genes.
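
The three "How to do it" steps can be sketched as follows. The slide does not say which readout is used, so this assumes the usual threshold-cycle (Ct) readout, with Ct roughly linear in log10 of the starting quantity. All numbers, and the names `fit_standard_curve` and `quantity_from_ct`, are hypothetical.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares line Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Read a sample's quantity off the standard curve."""
    return 10 ** ((ct - intercept) / slope)


# Step 1: serial dilution of 'known' standards (hypothetical 10-fold series).
standard_quantities = [1000.0, 100.0, 10.0, 1.0]
standard_cts = [20.0, 23.3, 26.6, 29.9]
slope, intercept = fit_standard_curve(standard_quantities, standard_cts)

# Step 2: expression levels are calculated from the standard curve.
target_qty = quantity_from_ct(23.3, slope, intercept)
housekeeping_qty = quantity_from_ct(20.0, slope, intercept)

# Step 3: normalise the target gene to the housekeeping gene.
normalised = target_qty / housekeeping_qty
```

In practice the target and the housekeeping gene would each have their own standard curve; a single curve is used here only to keep the sketch short.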

Q-PCR verification of genes regulated on U133A

TH Q-PCR on a developmental series of subdissected human embryonic and fetal brain material
OD 260/280 were measured to / for all RNA samples

Q-PCR analysis and clustering
OD 260/280 were measured to / for all RNA samples

Fold change in a mixed population
[Figure panels: 1.5 fold up-regulation from no expression; 1.5 fold up-regulation from some expression]

Organization of ISH procedure

GeneChip verification with ISH
ISH from: Vernay et al., J Neurosci May 11;25(19):

GeneChip verification with IHC
Courtesy of Josephine Jensen

Conclusions
Using arrays one will get a snapshot of the expression profile under the conditions investigated.
–Careful experimental design
–RNA quantity and quality are important
Since a single array experiment generates thousands of data points, the primary challenge of the technique is to make sense of the data.
–Calculations/Statistics (back and forth)
–Literature mining
Independent methods are needed for verification
–Q-PCR
–In situ hybridization (ISH)
–Immunohistochemistry (IHC)

Acknowledgements
NsGene, Ballerup, Denmark: Lars Wahlberg, Bengt Juliusson, Teit Johansen
Neurotech, Huddinge University Hospital, Sweden: Åke Seiger
Department of Medical Genetics, IMBG, Panum Institute, Denmark: Claus Hansen, Karen Friis
Wallenberg Neuroscience Center, Sweden: Anders Björklund, Josephine Jensen, Elin Andersson
CBS, DTU, Denmark: Søren Brunak, Steen Knudsen, Nikolaj Blom, Thomas Nordahl Petersen