Microarray Pitfalls Stem Cell Network Microarray Course, Unit 3 October 2006.

Goals
To provide some guidelines on Affymetrix microarrays:
– How to use them
– How not to use them
– Things to keep in mind when designing experiments and analyzing data
This is a general discussion of issues and is by no means exhaustive.

Inconsistent Annotations
Affymetrix-provided probeset annotations change over time.
The gene symbol associated with a given probeset is not necessarily stable.
This is due to changes in gene prediction as new information becomes available.

Inconsistent Annotations (2)
Perez-Iratxeta, C. and M.A. Andrade. Inconsistencies over time in 5% of NetAffx probe-to-gene annotations. BMC Bioinformatics. 6, 183.
– 5% of probesets have gene identifiers that change over the two-year time span covered by this analysis
[Figure: an inconsistently annotated probeset]

Inconsistent Annotations (3)
How do we deal with this?
– Always note the annotation version used in an analysis, especially when it is for publication
– Report the probeset name as well as the gene symbol
– Remember that re-analysis with later annotations may yield different results
– Keep your annotation files up to date
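
One way to act on these guidelines is to compare two annotation releases and list the probesets whose gene symbol has changed. The following is a minimal sketch only: the probeset IDs, gene symbols, and the function name `changed_annotations` are hypothetical placeholders, and real annotation files would first need to be parsed into dictionaries of this shape.

```python
def changed_annotations(old, new):
    """Return probesets whose gene symbol differs between two annotation versions.

    `old` and `new` map probeset ID -> gene symbol (or None if unannotated).
    """
    changed = {}
    for probeset, old_symbol in old.items():
        new_symbol = new.get(probeset)
        if new_symbol != old_symbol:
            changed[probeset] = (old_symbol, new_symbol)
    return changed


# Hypothetical annotation snapshots (not real NetAffx data).
old_release = {"ps_0001_at": "GENE_A", "ps_0002_at": "GENE_B", "ps_0003_at": "GENE_C"}
new_release = {"ps_0001_at": "GENE_A", "ps_0002_at": "GENE_X", "ps_0003_at": None}

for probeset, (before, after) in changed_annotations(old_release, new_release).items():
    # Report the probeset ID together with both symbols, as the slide recommends.
    print(f"{probeset}: {before} -> {after}")
```

Reporting the probeset ID alongside the symbol keeps a result traceable even after the symbol changes.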

Old chips, new data
Expression microarrays are designed based on the best available model of the genome of interest.
The model for the HG-U133 microarrays was a human genome assembly that was only 25% complete!
The human assembly is >99% complete now.

Old chips, new data (2)
How do we deal with this?
– A number of groups provide re-mappings of probes to probesets based upon the latest data available, for example: Dai M, et al. Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data. Nucleic Acids Res. 2005;33:e175

Multiple Testing Corrections
A single expression microarray experiment actually consists of hundreds of thousands of simultaneous parallel experiments.
This means you can test many hypotheses simultaneously.
This is not free: the significance of any given result decreases as a function of the number of hypotheses tested.

Multiple Testing Corrections (2)
How do we deal with this?
– Limit the number of hypotheses you are testing instead of just ‘fishing’ in the whole data set.
– Do this by selecting a set of candidate genes ahead of time based on your knowledge of the biology of the system.
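
The cost of testing many hypotheses can be made concrete with a little arithmetic. The sketch below assumes a per-test significance level of 0.05 and uses the Bonferroni correction as one simple example of a correction; the slides do not name a specific method, and the test counts (50,000 and 50) are hypothetical.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of false positives if every null hypothesis is true."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


whole_array = 50_000   # hypothetical: test every probeset on the chip
candidate_set = 50     # hypothetical: test only pre-selected candidate genes

for label, n in [("whole array", whole_array), ("candidate set", candidate_set)]:
    print(
        f"{label}: ~{expected_false_positives(n):.1f} false positives expected "
        f"uncorrected, Bonferroni cutoff p <= {bonferroni_threshold(n):.2e}"
    )
```

With fewer hypotheses, the corrected cutoff is far less strict, which is why choosing candidate genes in advance preserves statistical power.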

Multiple Testing (3)
Sandrine Dudoit, Juliet Popper Shaffer and Jennifer C. Boldrick. Multiple Hypothesis Testing in Microarray Experiments. Statistical Science 2003, Vol. 18, No. 1, 71–103
– “The biological question of differential expression can be restated as a problem in multiple hypothesis testing: the simultaneous test for each gene of the null hypothesis of no association between the expression levels and the responses”
Talk to a statistician if you have doubts.
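
As an illustration of testing one null hypothesis per gene, here is a minimal sketch of the Benjamini-Hochberg step-up procedure, a widely used way to control the false discovery rate. It is offered as an example only: the slides do not prescribe a particular procedure, the function name is an assumption, and the p-values are made up.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Made-up p-values, one per gene.
p_values = [0.041, 0.001, 0.216, 0.008, 0.039, 0.06, 0.074, 0.205, 0.212, 0.042]
print(benjamini_hochberg(p_values))
```

In practice a vetted statistical package would normally be used instead of hand-written code; the sketch is only meant to show the logic.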

Not everything is in the array
Probesets are designed with a bias towards the 3’ end of the gene.
– They won’t distinguish splice variants
– They won’t pick up alternative 3’ endings

Not everything is in the array (2)
What can we do about this?
– You should be aware of this, but not much can be done.
– Use other technologies to complement your microarray results (PCR, sequencing)

What are you measuring?
Remember that you are detecting the average mRNA over a population of cells.
Is your sample homogeneous?
If it’s not homogeneous, then what are you measuring? How many types of cells, in what state?
Time series of differentiating cells are particularly problematic.
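
The averaging problem can be shown with a toy calculation. The sketch below assumes that the measured signal is simply the fraction-weighted mean of per-cell expression levels; the function name, cell-type fractions, and expression levels are all hypothetical.

```python
def bulk_expression(fractions, per_cell_levels):
    """Population-average signal: per-cell level weighted by cell-type fraction."""
    return sum(f * level for f, level in zip(fractions, per_cell_levels))


# Sample 1: half the cells express the gene strongly, half not at all.
mixed_sample = bulk_expression([0.5, 0.5], [100, 0])

# Sample 2: a homogeneous population expressing the gene at a moderate level.
uniform_sample = bulk_expression([1.0, 0.0], [50, 0])

# Both samples give the same bulk measurement despite different biology.
print(mixed_sample, uniform_sample)
```

A change in cell-type composition and a change in per-cell expression are therefore indistinguishable from the array signal alone.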

Inhomogeneous Samples?
Many sources of inhomogeneity:
– Source organism gender
– Cell cycle
– Tissue source
– Diet
Some can be eliminated.
All should be documented where possible.

Chips don’t detect protein
Central assumption of microarray analysis: the level of mRNA is positively correlated with protein expression levels.
– Higher mRNA levels mean higher protein expression, lower mRNA means lower protein expression
Other factors:
– Protein degradation, mRNA degradation, polyadenylation, codon preference, translation rates, ...
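
To see how the other factors can break the central assumption, consider a deliberately simplified toy model that is not taken from the slides: protein is produced in proportion to mRNA and degraded at a constant rate, so at steady state protein = translation_rate × mRNA / degradation_rate. All names and numbers below are hypothetical.

```python
def steady_state_protein(mrna, translation_rate, degradation_rate):
    """Steady state of dP/dt = translation_rate * mrna - degradation_rate * P."""
    return translation_rate * mrna / degradation_rate


# Gene A: less mRNA, but a stable protein (slow degradation).
protein_a = steady_state_protein(mrna=100, translation_rate=1.0, degradation_rate=0.5)

# Gene B: twice the mRNA, but a rapidly degraded protein.
protein_b = steady_state_protein(mrna=200, translation_rate=1.0, degradation_rate=2.0)

# Gene B has more mRNA yet less protein in this toy model.
print(protein_a, protein_b)
```

Under these assumptions, mRNA rank order and protein rank order disagree, so array results are best confirmed at the protein level when protein abundance is the real question.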

Conclusion
This is a general discussion of issues; it doesn’t cover all pitfalls.
Please contact [...] if you have any comments, corrections or [...].
See associated bibliography for references from this presentation and further reading.
Thanks for your attention!