Use of Mixture Model in a genome-wide DNA microarray-based genetic screen for components of the NHEJ Pathway in Yeast
Rafael A. Irizarry
Department of Biostatistics, JHU

[Figure: NHEJ pathway diagram. Labels: Damaged DNA; Rad50p/Mre11p/Xrs2p; Yku70p/Yku80p (DNA-PK); DNA end binding; Lig4p/Lif1p; Ligation; Nucleolytic processing; Repaired DNA]

[Figure: design of the screen, panels A and B. Labels: kanR; UPTAG; DOWNTAG; circular pRS416; EcoRI-linearized pRS416; MCS; CEN/ARS; URA3; ttaa; aatt; transformation into deletion pool; select for Ura+ transformants; genomic DNA preparation; PCR; Cy5-labeled PCR products; Cy3-labeled PCR products; oligonucleotide array hybridization; NHEJ defective]

Data
5718 mutants
3 replicates on each slide
5 haploid slides, 4 diploid slides
Haploids are divided into 2 downtags and 3 uptags (2 of which are replicate uptags)
Diploids are divided into 3 uptags (2 of which are replicates) and 2 uptags

Which mutants are NHEJ defective?
Find mutants defective for transformation with linear DNA
Dead in linear transformation (green)
Alive in circular transformation (red)
Look for spots with large log(R/G)
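As an illustration of the log(R/G) screen described on this slide, here is a minimal Python sketch. The mutant names, the intensities, the base-2 logarithm, and the cutoff of 2 are all assumptions made for the example; none of them come from the slides.

```python
import math

def log_ratio(red, green, base=2.0):
    """Return log(R/G) for one spot; both intensities must be positive."""
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    return math.log(red / green, base)

def flag_candidates(spots, threshold=2.0):
    """Return the names of mutants whose log(R/G) exceeds the threshold."""
    return [name for name, (red, green) in spots.items()
            if log_ratio(red, green) > threshold]

# Hypothetical intensities: red = circular plasmid, green = linear plasmid.
spots = {
    "mutA": (4000.0, 250.0),   # alive in circular, dead in linear
    "mutB": (3000.0, 2800.0),  # alive in both
    "mutC": (150.0, 140.0),    # dead in both
}
candidates = flag_candidates(spots)
```

In this toy data, mutB and mutC have nearly the same ratio even though one is bright in both channels and the other is dim in both, which is the kind of information a ratio alone cannot show.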

Improvement to usual approach
Take into account that some mutants are dead and some are alive
Use a statistical model to represent this
Mixture model?
With ratios we lose information about R and G separately
Look at them separately (absolute analysis)

Warning
Absolute analyses can be dangerous for competitive hybridization slides
We must be careful about the “spot effect”
A big R or G may only mean that the spot it was on had a large amount of cDNA
Look at some facts that make us feel safer

Correlation between replicates
[Figure: panels labeled R1, R2, R3, G1, G2, G3]

Correlation between red, green, haploid, diploid, uptag, downtag
[Figure: both axes labeled RHD, RHU, RDD, RDU, GHD, GHU, GDD, GDU]

BTW
The mean squared error across slides is about 3 times bigger than the mean squared error within slides

Mixture Model
We use a mixture model that assumes:
There are three classes:
–Dead
–Marginal
–Alive
Measurements are normally distributed, with the same correlation structure from gene to gene
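To make the three-class idea concrete, here is a deliberately simplified sketch: a one-dimensional Gaussian mixture with a shared standard deviation, fitted by EM on simulated data. The model on the slide is multivariate with a correlation structure, so this is not the author's method. The class means, the spread, the sample sizes, and the function names are assumptions chosen for illustration.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd) distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def fit_mixture(xs, means, sd=1.0, n_iter=100):
    """EM for a 1-D Gaussian mixture whose classes share one sd."""
    k = len(means)
    means = list(means)
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each value.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, sd) for w, m in zip(weights, means)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update class weights, class means, and the shared sd.
        nk = [sum(r[j] for r in resp) for j in range(k)]
        weights = [n / len(xs) for n in nk]
        means = [sum(r[j] * x for r, x in zip(resp, xs)) / nk[j]
                 for j in range(k)]
        var = sum(r[j] * (x - means[j]) ** 2
                  for r, x in zip(resp, xs) for j in range(k)) / len(xs)
        sd = math.sqrt(var)
    return weights, means, sd, resp

# Simulated log intensities (assumed values): dead, marginal, alive.
random.seed(0)
xs = ([random.gauss(6.0, 0.5) for _ in range(100)]
      + [random.gauss(9.0, 0.5) for _ in range(100)]
      + [random.gauss(12.0, 0.5) for _ in range(100)])
weights, means, sd, resp = fit_mixture(xs, means=[5.0, 9.0, 13.0])

# Call each value by its most probable class.
labels = ["dead", "marginal", "alive"]
calls = [labels[max(range(3), key=lambda j: r[j])] for r in resp]
```

The posterior probabilities in `resp` are what allow a per-mutant call of dead, marginal, or alive instead of a single hard cutoff on intensity.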

Random effect justification
Each x = (r1,…,r5,g1,…,g5) will have the following effects:
Individual effect: same mutant, same expression (replicates are alike)
Genetic effect: same genetics, same expression
PCR effect: expect a difference between uptag and downtag
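One way to read the random effects above is as additive variance components that induce a covariance between any two measurements of the same mutant. The sketch below assumes that reading. It also assumes that "genetics" means haploid versus diploid, and it uses made-up variance values and a hypothetical four-measurement layout; none of this is stated on the slides.

```python
def covariance(labels, var_individual, var_genetic, var_pcr, var_noise):
    """Covariance matrix implied by additive random effects.

    labels: one (genetics, tag) pair per measurement of a mutant.
    Every pair of measurements shares the individual (mutant) effect;
    pairs with the same genetics share the genetic effect; pairs with
    the same tag share the PCR effect; noise is on the diagonal only.
    """
    n = len(labels)
    cov = [[0.0] * n for _ in range(n)]
    for i, (gen_i, tag_i) in enumerate(labels):
        for j, (gen_j, tag_j) in enumerate(labels):
            c = var_individual
            if gen_i == gen_j:
                c += var_genetic
            if tag_i == tag_j:
                c += var_pcr
            if i == j:
                c += var_noise
            cov[i][j] = c
    return cov

# Hypothetical layout: four measurements of one mutant.
labels = [("haploid", "up"), ("haploid", "down"),
          ("diploid", "up"), ("diploid", "down")]
cov = covariance(labels, var_individual=1.0, var_genetic=0.5,
                 var_pcr=0.25, var_noise=0.1)
```

Because the matrix depends only on the labels and the variance components, it is the same for every mutant, which matches the "same correlation structure from gene to gene" assumption.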

Does it fit?

What can we do now that we couldn’t do before?
Define a t-test that takes into account whether mutants are dead or not when computing the variance
For each gene, compute likelihood ratios comparing two hypotheses: alive/dead vs. dead/dead or alive/alive
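The likelihood-ratio comparison can be sketched as follows, under strong simplifying assumptions: independent normal measurements, known class means and a known common standard deviation, and the null likelihood taken as the larger of dead/dead and alive/alive. The slides do not give the actual formula, so the function names, parameter values, and data below are illustrative only.

```python
import math

def log_normal_pdf(x, mean, sd):
    """Log density of a Normal(mean, sd) distribution at x."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))

def log_lik(values, mean, sd):
    """Log likelihood of independent normal measurements."""
    return sum(log_normal_pdf(v, mean, sd) for v in values)

def log_likelihood_ratio(red, green, dead_mean, alive_mean, sd):
    """log LR of alive/dead (red alive, green dead) against the better
    of dead/dead and alive/alive."""
    alive_dead = log_lik(red, alive_mean, sd) + log_lik(green, dead_mean, sd)
    dead_dead = log_lik(red, dead_mean, sd) + log_lik(green, dead_mean, sd)
    alive_alive = log_lik(red, alive_mean, sd) + log_lik(green, alive_mean, sd)
    return alive_dead - max(dead_dead, alive_alive)

# Assumed class parameters on a log-intensity scale.
params = dict(dead_mean=6.0, alive_mean=12.0, sd=0.5)

# Hypothetical replicate measurements for two mutants.
defective = log_likelihood_ratio([12.1, 11.8, 12.3], [6.2, 5.9, 6.1], **params)
normal = log_likelihood_ratio([12.0, 12.2, 11.9], [11.8, 12.1, 12.0], **params)
```

A large positive value favors alive/dead, the pattern expected of an NHEJ-defective mutant, while a negative value favors one of the same-class hypotheses.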

QQ-plot for new t-test

Better looking than others

1
YMR106C a a
YOR005C a d
YLR265C a m
YDL041W a m
YIL012W a a
YIL093C a a
YIL009W a a
YDL042C a d
YIL154C m m
YNL149C m d
YBR085W a a
YBR234C m d
YLR442C a a
100

Acknowledgements
Siew Loon Ooi
Jef Boeke
Forrest Spencer
Jean Yang