Affymetrix GeneChips and Analysis Methods Neil Lawrence.



Schedule
18th April: Introduction and Background
25th April: cDNA Microarrays
2nd May: No Lecture
9th May: Affymetrix GeneChips
16th May: Guest Lecturer – Dr Pen Rashbass
23rd May: Analysis methods and some of this

Photolithography
Photolithography (Affymetrix)
–Based on the same technique used to make microprocessors.
–Oligonucleotides are generated in situ on a silicon surface.
–Oligonucleotides up to 30bp in length.
–Array density of 10^6 probes per cm^2.

Affymetrix Stock Price

Affymetrix
Only one biological sample per chip.
Oligonucleotides represent a portion of a gene’s sequence.
Twenty sub-sequences present for each gene.

Perfect vs Mismatch
For each oligonucleotide there is
–A perfect match
–A mismatch
The perfect match is a sub-sequence of the true sequence.
The mismatch is a sub-sequence with a ‘central’ base-pair replaced.

Affymetrix Analysis
Mismatch is designed to measure ‘background’.
Signal from each sub-sequence is $I_{\text{perfect match}} - I_{\text{mismatch}}$.
Twenty of these sub-sequences are present.
Average of all these signals is taken.
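The averaging described on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `average_difference` and the intensity values are hypothetical, and only four probe pairs are used where a real probe set has about twenty.

```python
from statistics import mean


def average_difference(pm, mm):
    """Mean of (perfect match - mismatch) over the probe pairs of one gene."""
    if len(pm) != len(mm):
        raise ValueError("pm and mm must have the same number of probes")
    return mean(p - m for p, m in zip(pm, mm))


# Hypothetical intensities for four probe pairs of a single gene.
pm = [120.0, 200.0, 150.0, 90.0]
mm = [40.0, 60.0, 170.0, 30.0]

signal = average_difference(pm, mm)
print(signal)
```

Note that the third probe pair has a mismatch brighter than its perfect match, so it contributes a negative term to the average.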

Problems
Sometimes $I_{\text{mismatch}} > I_{\text{perfect match}}$
–Solution: set it to 20??!!!
Other issues
–Present/Absent call: based on the number of signals > 0.
Proprietary Technology
–You don’t know what the subsequences are. Apparently this is changing!
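A minimal sketch of the two fixes mentioned above, under stated assumptions. The slide gives only the floor value of 20 and says the call depends on the number of positive signals; the function names and the `min_fraction` threshold of 0.5 are hypothetical choices for illustration, not the vendor's actual rule.

```python
def floored_differences(pm, mm, floor=20.0):
    """PM - MM per probe pair; pairs where MM exceeds PM are set to `floor`."""
    return [floor if m > p else p - m for p, m in zip(pm, mm)]


def present_call(pm, mm, min_fraction=0.5):
    """Call a gene present when enough probe pairs have PM - MM > 0.

    The threshold `min_fraction` is an assumption made for this sketch.
    """
    if not pm or len(pm) != len(mm):
        raise ValueError("pm and mm must be non-empty and of equal length")
    positive = sum(1 for p, m in zip(pm, mm) if p - m > 0)
    return positive / len(pm) >= min_fraction


# Hypothetical intensities; the third pair has MM > PM.
pm = [120.0, 200.0, 150.0, 90.0]
mm = [40.0, 60.0, 170.0, 30.0]

floored = floored_differences(pm, mm)
call = present_call(pm, mm)
print(floored, call)
```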

Scaling Factors – Maximum likelihood estimation
The data produced is still affected by undesirable variations that we need to remove.
We can assume that the variations are primarily multiplicative (no intensity-dependent or print-tip effect):
observed expression level = true expression level × error (chip variations) × random noise (biological noise)

Model Assumption
Organise the twelve values from three exogenous control species in a matrix: $X$ = [NControls × NChips]
Error model:
Here $m_i$ is associated with each control and $r_j$ is associated with each chip or experiment.
Taking logs we have:
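One way to write a multiplicative error model of this kind is sketched below. The symbols $x_{ij}$ (the observed value for control $i$ on chip $j$) and $\epsilon_{ij}$ (a random noise term) are assumptions introduced here for illustration; only $m_i$ and $r_j$ come from the slide, so this should be read as a plausible form rather than the slide's own equation.

```latex
x_{ij} = m_i \, r_j \, \epsilon_{ij},
\qquad i = 1, \dots, N_{\text{Controls}}, \quad j = 1, \dots, N_{\text{Chips}}

\log x_{ij} = \log m_i + \log r_j + \log \epsilon_{ij}
```

Taking logs turns the product into a sum, so the control effect and the chip effect can be estimated with an additive model.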

Scaling Factors
Calculating scaling factors using maximum likelihood estimation of the model parameters.
Likelihood:
Estimates are calculated solving
Scaling factors are thus:
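The following sketch shows one way such scaling factors could be computed. It rests on assumptions the slide does not state: the log noise is taken to be independent Gaussian with equal variance (so maximum likelihood reduces to least squares on the log scale), and the chip effects are constrained to have a geometric mean of 1 for identifiability. The function name and the data are hypothetical.

```python
import math


def chip_scaling_factors(x):
    """Estimate one scaling factor per chip from a controls-by-chips matrix.

    x[i][j] is the (positive) intensity of control i on chip j.
    Assumes log x[i][j] = log m_i + log r_j + Gaussian noise, with the
    log r_j constrained to average zero. Under those assumptions the
    maximum likelihood estimate of log r_j is the column mean of the logs
    minus the grand mean, and the scaling factor is 1 / r_j.
    """
    logs = [[math.log(v) for v in row] for row in x]
    n_controls, n_chips = len(logs), len(logs[0])
    grand_mean = sum(sum(row) for row in logs) / (n_controls * n_chips)
    col_means = [sum(row[j] for row in logs) / n_controls
                 for j in range(n_chips)]
    log_r = [c - grand_mean for c in col_means]
    return [math.exp(-b) for b in log_r]


# Hypothetical noise-free data: three controls on four chips (twelve values).
m = [100.0, 400.0, 1600.0]   # control effects
r = [1.0, 2.0, 0.5, 1.0]     # chip effects, geometric mean 1
x = [[mi * rj for rj in r] for mi in m]

factors = chip_scaling_factors(x)
print([round(f, 6) for f in factors])
```

Multiplying every value on chip j by `factors[j]` removes the estimated chip effect, which is what makes the chips comparable.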

You Should Know The Central Dogma (Gene Expression). cDNA chip overview. Noise in cDNA chips. Affymetrix GeneChip overview.

Analysis of Microarray Data
Vanilla-flavour analysis:
–Obtain temporal profiles (e.g. from last week’s mouse experiment).
–‘Cluster’ profiles.
–Assume genes in the same cluster are functionally related.

Temporal Profiles
Lack of statistical independence between time points.
Take temporal differences to recover independence.
Justified by assuming an underlying Markov process.
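Taking temporal differences amounts to subtracting each measurement from the one that follows it. A minimal sketch, with a hypothetical function name and made-up expression levels:

```python
def temporal_differences(profile):
    """Change in expression level between consecutive time points."""
    return [later - earlier for earlier, later in zip(profile, profile[1:])]


# Hypothetical expression levels of one gene over five days.
profile = [1.0, 1.5, 3.0, 2.5, 2.0]

diffs = temporal_differences(profile)
print(diffs)
```

A profile with n time points yields n - 1 differences.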

Analysis of Microarray Data
[Figure: an original temporal profile (gene expression level by day) and the result of taking temporal differences (change in expression level by day).]

Consider Clustering via MSE
These two similar profiles won’t cluster.
[Figure: two temporal profiles, each plotted as gene expression level by day.]

The Temporal Differences Will
[Figure: the temporal differences of the two profiles, each plotted as change in expression level by day.]
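The point of these slides can be checked numerically. The sketch below assumes the two similar profiles have the same shape but different baselines; the figures themselves are not available, so the data are hypothetical. The mean squared error (MSE) between the raw profiles is large, while the MSE between their temporal differences is zero.

```python
def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def temporal_differences(profile):
    """Change in expression level between consecutive time points."""
    return [later - earlier for earlier, later in zip(profile, profile[1:])]


# Hypothetical profiles: gene_b follows gene_a with a constant offset of 5.
gene_a = [1.0, 2.0, 4.0, 3.0, 2.0]
gene_b = [v + 5.0 for v in gene_a]

mse_raw = mse(gene_a, gene_b)
mse_diff = mse(temporal_differences(gene_a), temporal_differences(gene_b))
print(mse_raw, mse_diff)
```

Differencing removes the constant offset, so a distance-based clustering method would place the two genes together.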

Many Other Different Techniques
Hierarchical Clustering
Self-Organising Maps
ML-Group
–Generative Topographic Mappings (GTM)
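As an illustration of the first technique in this list, here is a small sketch of agglomerative hierarchical clustering in plain Python. The choices of single linkage and MSE as the distance are assumptions made for this example, as are the function names and the data; in practice a library implementation would normally be used.

```python
def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def single_linkage(profiles, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain.

    The distance between two clusters is the smallest MSE between any
    member of one and any member of the other (single linkage).
    Returns lists of indices into `profiles`.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(mse(profiles[i], profiles[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical temporal differences for four genes.
diffs = [
    [1.0, 2.0, -1.0, -1.0],
    [1.1, 1.9, -0.9, -1.0],
    [-2.0, 0.0, 1.0, 1.0],
    [-1.9, 0.1, 1.0, 0.9],
]

clusters = single_linkage(diffs, 2)
print(clusters)
```

Genes 0 and 1 rise and then fall, while genes 2 and 3 fall and then rise, so the two pairs end up in separate clusters.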

GTM Data lies in high dimensional space (>2). Model it with a lower embedded dimensionality (2). MATLAB Demo of embedded dimensions.

GTM on Gene Data MATLAB Demo.

Conclusions Take Temporal differences of Profiles. Attempt to Cluster. Test Hypothesis that clustered Genes are functionally related. Good luck in the Exam!