Using Web-Based Tools for Microarray Analysis


Using Web-Based Tools for Microarray Analysis Michael Elgart

Outline
Introduction to microarrays: why use them and what to expect from their results
  What are they?
  Why use them?
  What types are there?
Low-level analysis
  Background correction
  Normalization
  Quality control
Significance analysis
Annotations
Functional analysis: Gene Ontology
Promoter analysis


What is a microarray? A tool for analyzing gene expression that consists of a small membrane or glass slide containing samples of thousands of genes arranged in a regular pattern.

The Boom of Microarray Technology: Number of Publications with Affymetrix Chips
[Figure: number of publications per year, 1991 to 2003; vertical axis from 200 to 1200]

What’s the Point?
  Large-scale (genome-wide) screening
  Eliminate the bias of pre-selecting candidate genes
  Test multiple hypotheses simultaneously
  Generate new hypotheses by identifying novel genes associated with the experiment
  Identify novel relationships/patterns among genes

GEO: Public Database Example


What are DNA microarrays? Microarrays are a method of scanning the genome based on a well-known property of nucleic acids (hybridization): complementary strands of DNA/RNA will find each other in solution.

Types of DNA Microarray Experiments
Some types of experiments that can be done:
  Measure changes in gene expression: RNA hybridizes to DNA
  Identify genomic gains and losses: genomic DNA hybridizes to DNA
  Identify mutations in DNA: PCR product hybridizes to DNA

Expression Microarray Basics
Two parts:
  Probes: the single-stranded DNA molecules on the solid surface
  Targets: the single-stranded labeled population from your experimental source

Microarray Overview Probe

Probe deposition on array Contact printing Ink jet spraying On chip synthesis

Pin Spotting of DNA Arrays
  Can be automated or manual
  Relatively cheap, but may result in QC issues with spots
  ~$10 per 100-probe array

Under the microscope

Ink jet spraying

Ink jet sprayed spots on a chip

Affymetrix: We will be dealing mainly with this type today, so here is a little more detail.

On chip synthesis Lithography

Set of probes that identifies a transcript = ProbeSet

Affymetrix: Gene Expression Arrays
Array                   Transcripts/Genes
Arabidopsis Genome      24,000
C. elegans Genome       22,500
Drosophila Genome       18,500
E. coli Genome          20,366
Human Genome U133 Plus  47,000
Mouse Genome            39,000
Yeast Genome            5,841 (S. cerevisiae) & 5,031 (S. pombe)
Rat Genome              30,000
Zebrafish               14,900
Plasmodium/Anopheles    4,300 (P. falciparum) & 14,900 (A. gambiae)
Barley                  25,500
Soybean                 37,500 + 23,300 pathogen
Grape                   15,700
Canine                  21,700
Bovine                  23,000
B. subtilis             5,000
S. aureus               3,300 ORFs
Xenopus                 14,400

Spots on an Affymetrix chip printed using photolithography

DNA Deposition on Array [Figure label: 2 µm] Taken from Duggan et al., Nature Genetics 21:10

RNA Quality and Quantity [Figure labels: 28S rRNA, 18S rRNA, degraded sample]

Hybridization = expression level
The amount of hybridization of RNA to a fragment of DNA representing any gene can be measured if the RNA is labeled with a dye. The intensity of hybridization is a surrogate for the level of expression of the gene represented by that DNA fragment.

Hybridization and Washing of DNA Microarrays
  Remains one of the most poorly controlled steps in the process
  Long oligonucleotide probes were designed to standardize the Tms across the slide
  However, there will still be variable efficiency and variable specificity

Slide Scanning
  Selectable lasers
  Emission filters with a range of 500-700 nm
  5 micron resolution
  The goal is to generate images of the arrays that are used as input for quantitation algorithms


Usually the 75th percentile

Do not use MM data! MAS (3, 4, 5…) is NOT GOOD. Use RMA!

Fortunately (?) you don’t do this. The result:
[INTENSITY]
NumberCells=4691556
X   Y   MEAN      STDV     NPIXELS
0   0   30022.0   4025.9   9
1   0   507.0     48.5     9
2   0   30116.0   4500.7   9
3   0   602.0     97.3     9
4   0   339.0     36.3     9
5   0   491.0     59.1     9
6   0   29208.0   3090.8   9
7   0   877.0     126.0    9
8   0   28683.0   4069.2   9
9   0   645.0     63.6     9
10  0   28536.0   3462.7   9
11  0   473.0     100.5    9
12  0   29509.0   4287.0   9
13  0   667.0     83.2     9
[CEL]
Version=3
[HEADER]
Cols=2166
Rows=2166
TotalX=2166
TotalY=2166
OffsetX=0
OffsetY=0
GridCornerUL=623 408
GridCornerUR=16090 586
GridCornerLR=15932 15984
GridCornerLL=464 15807
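To make the file listing above more concrete, here is a minimal Python sketch of how the per-cell values in the [INTENSITY] section could be read. The function name parse_cel_intensity and the embedded sample text are illustrative assumptions, not part of any real package; in practice, dedicated microarray software reads these files for you.

```python
def parse_cel_intensity(text):
    """Return {(x, y): (mean, stdv, npixels)} from the [INTENSITY] section.

    Hypothetical helper for illustration; it assumes a text-format file laid
    out like the listing on the slide.
    """
    cells = {}
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new section starts; only [INTENSITY] holds per-cell rows.
            in_section = (line == "[INTENSITY]")
            continue
        if not in_section or not line or "=" in line:
            continue  # skip other sections, blanks, and key=value lines
        parts = line.split()
        if len(parts) != 5 or not parts[0].isdigit():
            continue  # skip the column header row
        x, y, mean, stdv, npix = parts
        cells[(int(x), int(y))] = (float(mean), float(stdv), int(npix))
    return cells


# Sample text copied from the first rows of the listing on the slide.
SAMPLE = """[INTENSITY]
NumberCells=4691556
X Y MEAN STDV NPIXELS
0 0 30022.0 4025.9 9
1 0 507.0 48.5 9
2 0 30116.0 4500.7 9
[CEL]
Version=3
"""

cells = parse_cel_intensity(SAMPLE)
```

Each cell is one probe feature on the chip, so the 2166 x 2166 grid in the header accounts for the 4,691,556 cells reported in NumberCells.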

So can we just use the data now? Not quite…

Sources of Microarray Data Variability
Biological variability in the population: no good solution here…
At an experimental level, there is variability between preparations and labelling of the sample, variability between hybridisations of the same sample to different arrays, and variability between the signal on replicate features on the same array.
[Diagram labels: true gene expression of individual; variability between individuals; variability between sample preparations; variability between arrays and hybridisations; variability between replicate features; measured gene expression]
Expression values in 2 replicates will be different! Can we handle it?

Normalization
Normalization deals with the fact that the results from identical experiments on two identical microarrays will never be exactly the same. In addition to unavoidable random errors, there are also systematic differences caused by:
  Different incorporation efficiencies of dyes. For example, green markers give a stronger signal than red ones (measured as stronger illumination), creating a bias between experiments done with green and red markers.
  Different amounts of mRNA in the tested sample, causing different measured expression levels.
  Differences in experimenter or protocol.
  Different scanning parameters.
  Differences between chips created in different production batches.

Quantile Normalization
  Intensity distributions are adjusted to be equivalent
  Scaling to a target intensity sets the mean signal intensity to the defined value
[Figure: two panels plotting Number of Probes against Probe Intensity; the value 500 is labeled]
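The two ideas on this slide can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the code used by any particular tool: the function names are hypothetical, ties are not given special treatment, and the target of 500 is assumed from the value labeled on the slide.

```python
import numpy as np


def quantile_normalize(x):
    """Give every array (column) the same intensity distribution.

    x has one row per probe and one column per array. Each value is replaced
    by the mean, across arrays, of the values that share its rank.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)                  # rank order within each array
    sorted_x = np.take_along_axis(x, order, axis=0)
    reference = sorted_x.mean(axis=1)              # mean of each quantile
    out = np.empty_like(x)
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out


def scale_to_target(x, target=500.0):
    """Rescale each array so that its mean intensity equals the target."""
    x = np.asarray(x, dtype=float)
    return x * (target / x.mean(axis=0))


# Toy data: 4 probes (rows) measured on 3 arrays (columns).
raw = np.array([[5.0, 4.0, 3.0],
                [2.0, 1.0, 4.0],
                [3.0, 7.0, 6.0],
                [4.0, 2.0, 8.0]])

normalized = quantile_normalize(raw)
scaled = scale_to_target(raw)
```

After quantile normalization, every column contains exactly the same set of values, and only their order across probes differs. Scaling is the weaker adjustment: it matches the means but leaves the shape of each distribution untouched.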

Background Correction
Different GC content of probes, location-on-chip effects, etc. All of this needs to be compensated for. The algorithm to do it is RMA.

Correct Experimental Design
Tree representation of replicate experiments: the first level is the level of biological replicates. This is followed by two independent mRNA extractions. In each microarray experiment, each gene (each probe or probe set) is really a separate experiment in its own right.
[Diagram: Experiment; biological replicates (Replicate 1, Replicate 2); technical replicates (Extract 1, Extract 2)]
“We need normalization to be able to look at the biological differences between samples and not technical ones” (Elgart M.)

Reproducibility
How big is the difference when the same sample is hybridized twice on the same type of array? If we look at technical replicates, what do we expect to see?
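One way to put numbers on this question is to compare two technical replicates probe by probe. The sketch below is an assumption-laden illustration: the data are synthetic (simulated here, not taken from any experiment), and replicate_agreement is a hypothetical helper. It reports the correlation of log2 intensities and the fraction of probes that differ by more than 2-fold between replicates.

```python
import numpy as np


def replicate_agreement(rep1, rep2, fold=2.0):
    """Compare two replicate arrays on the log2 scale.

    Returns (Pearson correlation, fraction of probes whose intensities differ
    by more than `fold` between the two replicates).
    """
    a = np.log2(np.asarray(rep1, dtype=float))
    b = np.log2(np.asarray(rep2, dtype=float))
    r = float(np.corrcoef(a, b)[0, 1])
    frac_discordant = float(np.mean(np.abs(a - b) > np.log2(fold)))
    return r, frac_discordant


# Synthetic example: one "true" expression profile measured twice with
# independent technical noise (noise level chosen arbitrarily).
rng = np.random.default_rng(0)
true_log2 = rng.normal(8.0, 2.0, size=5000)
rep1 = 2.0 ** (true_log2 + rng.normal(0.0, 0.2, size=5000))
rep2 = 2.0 ** (true_log2 + rng.normal(0.0, 0.2, size=5000))

r, frac = replicate_agreement(rep1, rep2)
print(f"correlation = {r:.3f}, fraction differing > 2x = {frac:.4f}")
```

For good technical replicates one would hope for a correlation close to 1 and very few probes beyond the 2-fold line. Any probe that does cross it is a false "change", which is why replicate noise sets the floor for calling differential expression.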

Summary Statistics
[Figure/table residue: correlation (>2x differential only); % agreement on all probes; % agreement on 2x differential; computed using only the top 10,000 brightest probes; red = in replicates]

Set of probes that identifies a transcript = ProbeSet If all 10 probes give high signal in Treatment and low in Control then all’s well. But what if only 6 of 10 are “positive”? How do we decide whether this gene is expressed?
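RMA, the method recommended earlier in this talk, handles disagreeing probes by summarizing each probe set robustly with Tukey's median polish, so a few odd probes do not dominate the result. The sketch below is a minimal illustration under stated assumptions: the function name and toy values are hypothetical, and the input is taken to be background-corrected, normalized log2 intensities for a single probe set.

```python
import numpy as np


def median_polish(log_pm, n_iter=10, tol=1e-6):
    """Summarize one probe set (rows: probes, columns: arrays).

    Fits log intensity = overall + probe effect + array effect + residual
    using medians, and returns (expression value per array, residuals).
    """
    resid = np.array(log_pm, dtype=float)
    overall = 0.0
    row_eff = np.zeros(resid.shape[0])   # probe affinities
    col_eff = np.zeros(resid.shape[1])   # array (sample) effects
    for _ in range(n_iter):
        # Sweep out row (probe) medians.
        row_med = np.median(resid, axis=1)
        resid -= row_med[:, None]
        row_eff += row_med
        delta = np.median(col_eff)
        col_eff -= delta
        overall += delta
        # Sweep out column (array) medians.
        col_med = np.median(resid, axis=0)
        resid -= col_med[None, :]
        col_eff += col_med
        delta = np.median(row_eff)
        row_eff -= delta
        overall += delta
        if max(np.abs(row_med).max(), np.abs(col_med).max()) < tol:
            break
    return overall + col_eff, resid


# Toy probe set: 3 probes on 2 arrays (control, treatment), log2 scale.
clean = np.array([[8.0, 10.0],
                  [9.0, 11.0],
                  [7.0, 9.0]])
# Same probe set, but one probe misbehaves on the treatment array.
noisy = clean.copy()
noisy[1, 1] = 20.0

expr_clean, _ = median_polish(clean)
expr_noisy, resid_noisy = median_polish(noisy)
```

Because medians are used instead of means, the single misbehaving probe ends up in the residuals instead of shifting the expression estimate for the treatment array.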


Is this a “hands-on” thing? Yes. Example:

