EE150a – Genomic Signal and Information Processing On DNA Microarrays Technology October 12, 2004.

Recall the information flow in cells
Replication of DNA
  – {A,C,G,T} to {A,C,G,T}
Transcription of DNA to mRNA
  – {A,C,G,T} to {A,C,G,U}
Translation of mRNA to proteins
  – {A,C,G,U} to {20 amino acids}
Interrupt the information flow and measure gene expression levels!
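The alphabet mappings above can be made concrete with a minimal sketch. The helper names and the codon table are illustrative assumptions: `CODON_TABLE` is only a five-entry subset of the standard genetic code, chosen so the example stays short.

```python
# Illustrative subset of the standard genetic code ('*' marks a stop codon).
# A real table has 64 entries; this one is an assumption made for brevity.
CODON_TABLE = {"AUG": "M", "UUU": "F", "GGC": "G", "GCU": "A", "UAA": "*"}


def transcribe(coding_strand: str) -> str:
    """DNA coding strand -> mRNA: same sequence with T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """mRNA -> amino-acid string, reading codons until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "?")  # '?': codon not in subset
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGTTTGGCTAA")
print(mrna, translate(mrna))
```

Microarrays tap into this flow at the mRNA stage: the abundance of each transcript is what the array measures.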

Gene Microarrays
A medium for matching known and unknown sequences of nucleotides based on hybridization (base-pairing: A-T, C-G)
Applications
  – identification of a sequence (gene or gene mutation)
  – determination of expression level (abundance) of genes
  – verification of computationally determined genes
Enables massively parallel gene expression studies
Two types of molecules take part in the experiments:
  – probes, arranged in order on an array
  – targets, the unknown samples to be detected
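As a rough illustration of matching by base-pairing, the sketch below treats a probe as "hybridizing" when the target contains its exact reverse complement. The probe names and sequences are made up for the example, and real hybridization is governed by chemistry rather than exact string matching.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Strand that base-pairs with `seq` (A-T, C-G), read in the usual direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def hybridizes(probe: str, target: str) -> bool:
    """True if the target contains the probe's perfect complement."""
    return reverse_complement(probe) in target.upper()


# Hypothetical probes laid out on an array, and one unknown target sample.
probes = {"gene_a": "ACGGTTCA", "gene_b": "GGATCCAA"}
target = "TTTGAACCGTAAA"

detected = [name for name, probe in probes.items() if hybridizes(probe, target)]
print(detected)
```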

Microarray Technologies
Oligonucleotide arrays (Affymetrix GeneChips)
  – probes are photo-etched on a chip (20-80 nucleotides)
  – dye-labeled mRNA is hybridized to the chip
  – laser scanning is used to detect gene expression levels (i.e., amount of mRNA)
cDNA arrays
  – complementary DNA (cDNA) sequences “spotted” on arrays ( nucleotides)
  – dye-labeled mRNA is hybridized to the chip (2 types!)
  – laser scanning is used to detect gene expression levels
There are various hybrids of the two technologies above

Oligonucleotide arrays Source: Affymetrix website

GeneChip Architecture Source: Affymetrix website

Hybridization Source: Affymetrix website

Laser Scanning Source: Affymetrix website

Sample Image Source: The Paterson Institute for Cancer Research

Competing Microarray Technologies
So far we have considered oligonucleotide arrays:
  – automated, on-chip design
  – light dispersion may cause problems
  – short probes
cDNA microarrays are another technology:
  – longer probes obtained via PCR (polymerase chain reaction)
  – [sidenote: what is the optimal length?]
  – probes grown in a lab, robot printing
  – two types of targets: control and test

cDNA Microarrays

Sample cDNA Microarray Image

Some Design Issues
Photo-etching based design: unwanted light exposure
  – border minimization
  – the probes are long
Hybridization: binding of a target to its perfect complement
However, when a probe differs from a target by a small number of bases, it may still bind
This non-specific binding (cross-hybridization) is a source of measurement noise
In special cases (e.g., arrays for gene detection), the designer has a lot of control over the landscape of the probes on the array
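The idea that a near-complement may still bind can be sketched by counting mismatches against the probe's perfect complement. The `max_mismatches` threshold is a hypothetical stand-in for the real binding chemistry, not something stated on the slides.

```python
def reverse_complement(seq: str) -> str:
    """Perfect complement of `seq` under A-T, C-G pairing."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def mismatches(probe: str, site: str) -> int:
    """Number of positions where `site` differs from the probe's perfect complement."""
    perfect = reverse_complement(probe)
    if len(perfect) != len(site):
        raise ValueError("probe and site must have the same length")
    return sum(a != b for a, b in zip(perfect, site.upper()))


def may_bind(probe: str, site: str, max_mismatches: int = 1) -> bool:
    """Crude cross-hybridization test: allow a small number of mismatched bases."""
    return mismatches(probe, site) <= max_mismatches


probe = "ACGGTTCA"
print(mismatches(probe, "TGAACCGT"))  # perfect complement
print(may_bind(probe, "TGAACCGA"))    # one base off: may still bind
print(may_bind(probe, "AGTTCCGA"))    # several bases off
```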

Dealing with Measurement Noise
Recent models of microarray noise
  – measurements reveal signal-dependent noise (i.e., shot noise) as the major component
  – additional Gaussian-like noise due to sample preparation, image scanning, etc.
Image processing assumes image background noise
  – attempts to subtract it
  – sets up thresholds
Lack of models of processes on microarrays

Probabilistic DNA Microarray Model
Consider an $m \times m$ DNA microarray with $m^2$ unique types of nucleotide probes
A total of $N$ molecules of $n$ different types of cDNA targets, with concentrations $c_1, \ldots, c_n$, is applied to the microarray
Measurement is taken after the system has reached chemical equilibrium
Our goal: from the scanned image, estimate the concentrations

DNA Microarray Model Cont’d
Each target may hybridize to only one type of probe
There are $k$ non-specific bindings
Model diffusion of unbound molecules by a random walk; the distribution of unbound molecules is uniform on the array
  – justified by reported experimental results
Assume known probabilities of hybridization and cross-hybridization
  – Theoretically: from melting temperature
  – Experimentally: measurements (e.g., from control target samples)

Markov Chain Model
Modeling transitions between the possible states of a target: one specific binding state and $k=2$ non-specific bindings
$p_n = 1 - k p_c - p_h$ is the probability that an unbound molecule remains free, where $p_h$ and $p_c$ are the hybridization and cross-hybridization probabilities
Measurement is taken after the system has reached chemical equilibrium, so we need to find the steady state

Markov Chain Model Cont’d
Let $\alpha_i = [\alpha_{i,1}\ \alpha_{i,2}\ \ldots\ \alpha_{i,k+2}]^T$ be a vector whose components are the numbers of type-$i$ targets in each of the $k+2$ states of the Markov chain
$\alpha_{i,1}$ is the number of hybridized molecules
$\alpha_{i,j}$, $2 < j \le k+2$, is the number of cross-hybridized molecules
Note that $\sum_{j=1}^{k+2} \alpha_{i,j} = c_i$ for every $i$.

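Stationary State of the Markov Chain
In equilibrium, we want to find $\alpha_i$ such that [equation not preserved in this transcript], where the transition matrix $P_i$ is given by [equation not preserved in this transcript]
Clearly, in the stationary state we have [equation not preserved in this transcript]
Finally, the ratio $\alpha_i / c_i$ gives the stationary state probabilities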
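Because the transition matrix itself did not survive in this transcript, the following is only a sketch under an assumed structure: a free molecule hybridizes with probability p_h, cross-hybridizes to each of k sites with probability p_c, or stays free with probability p_n = 1 - k p_c - p_h; a bound molecule is released with probability p_r and otherwise stays bound. The state ordering (hybridized, free, then the k cross-hybridized states) and the numeric values are illustrative choices, not the presenter's.

```python
def transition_matrix(p_h: float, p_c: float, p_r: float, k: int):
    """Column-stochastic matrix P[to][from] under the assumed structure.

    States: 0 = hybridized, 1 = free, 2..k+1 = cross-hybridized.
    """
    n = k + 2
    p_n = 1.0 - k * p_c - p_h  # probability that a free molecule stays free
    if p_n < 0:
        raise ValueError("p_h + k * p_c must not exceed 1")
    P = [[0.0] * n for _ in range(n)]
    P[0][1] = p_h                      # free -> hybridized
    P[1][1] = p_n                      # free -> free
    for j in range(2, n):
        P[j][1] = p_c                  # free -> cross-hybridized site j
    for j in [0] + list(range(2, n)):  # every bound state
        P[1][j] = p_r                  # bound -> free (release)
        P[j][j] = 1.0 - p_r            # bound -> stays bound
    return P


def stationary(P, iterations: int = 5000):
    """Power iteration: repeatedly apply P to a probability vector."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(P[r][c] * pi[c] for c in range(n)) for r in range(n)]
    return pi


pi = stationary(transition_matrix(p_h=0.5, p_c=0.1, p_r=0.02, k=2))
print([round(x, 4) for x in pi])
```

Under these assumptions, multiplying the stationary probabilities by the concentration c_i gives the expected number of type-i molecules in each state, which is the quantity the ratio on the slide refers to.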

Linear Microarray Model
Let matrix $Q$ collect the previously obtained probabilities
The microarray measurement model can be written as [equation not preserved in this transcript]
Vector $w$ describes inherent fluctuations in the measured signal due to hybridization (shot noise)
Binding of the $j$-type target to the $i$-type probe is a Bernoulli random variable with variance $q_{i,j}(1 - q_{i,j})$
  – hence the variance of $w_i$ is given by [equation not preserved in this transcript]
Vector $v$ consists of i.i.d. Gaussian entries
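The measurement equation is missing from this transcript, so the sketch below assumes the natural linear form s = Qc + w + v. If each of the c_j molecules of type j binds to probe i independently with probability q_ij, the shot-noise variance is Var(w_i) = sum_j c_j q_ij (1 - q_ij). The Gaussian approximation of w, the noise level `sigma_v`, and the numbers are assumptions made for illustration.

```python
import math
import random


def measurement_stats(Q, c):
    """Mean (Qc) and shot-noise variance of each probe readout."""
    means = [sum(q * cj for q, cj in zip(row, c)) for row in Q]
    variances = [sum(cj * q * (1.0 - q) for q, cj in zip(row, c)) for row in Q]
    return means, variances


def simulate_readout(Q, c, sigma_v, rng):
    """One noisy readout s = Qc + w + v, with w approximated as Gaussian."""
    means, variances = measurement_stats(Q, c)
    return [
        mean + rng.gauss(0.0, math.sqrt(var)) + rng.gauss(0.0, sigma_v)
        for mean, var in zip(means, variances)
    ]


# Hypothetical 2-probe, 2-target example.
Q = [[0.8, 0.1],
     [0.1, 0.8]]
c = [100, 200]
means, variances = measurement_stats(Q, c)
print(means, variances)
print(simulate_readout(Q, c, sigma_v=1.0, rng=random.Random(0)))
```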

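Detection of Gene Expression Levels
A simple estimate is obtained via the pseudo-inverse, [equation not preserved in this transcript]
Maximize the a posteriori probability $p(s|c)$, which is equivalent to [equation not preserved in this transcript], where the matrix $\Sigma$ is given by [equation not preserved in this transcript]
The optimization above readily simplifies to [equation not preserved in this transcript]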
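A minimal sketch of the pseudo-inverse estimate, assuming a measurement model of the form s = Qc plus noise and a matrix Q with full column rank. In that case the pseudo-inverse solution equals the least-squares solution of the normal equations (Q^T Q) c = Q^T s, which is what the code solves. It does not implement the probabilistic estimator from the slide, whose equations are missing here.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for cc in range(col, n + 1):
                M[r][cc] -= factor * M[col][cc]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][cc] * x[cc] for cc in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def pinv_estimate(Q, s):
    """Least-squares estimate of c from s, via the normal equations."""
    rows, cols = len(Q), len(Q[0])
    QtQ = [[sum(Q[r][i] * Q[r][j] for r in range(rows)) for j in range(cols)]
           for i in range(cols)]
    Qts = [sum(Q[r][i] * s[r] for r in range(rows)) for i in range(cols)]
    return solve(QtQ, Qts)


# Hypothetical example: 3 probes, 2 target types, noiseless readout s = Qc.
Q = [[0.8, 0.1],
     [0.1, 0.8],
     [0.05, 0.05]]
s = [100.0, 170.0, 15.0]
print(pinv_estimate(Q, s))
```

Forming Q^T Q squares the condition number of the problem, so this approach can become numerically fragile for large or nearly rank-deficient Q.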

Simulation Results
Consider an $8 \times 8$ array ($m=8$)
Apply $n=6$ types of targets
Concentrations: [1e5 2e5 2e5 2e5 1e5 2e5] ($N$=1e6)
Assume the following probabilities:
  – hybridization: 0.8
  – cross-hybridization: 0.1
  – release: 0.02
Let $k=3$ (number of non-specific bindings)
Free molecules perform a random walk on the array

Simulation Results: Readout Data

Simulation Results: Estimate

Some Comments
Adopt mean-square error (MSE) as the measure of performance
As expected, we observe significant improvement over raw measurements (in terms of MSE)
Things to do:
  – investigate how to incorporate control sample measurements
  – modify the technique for very large microarrays (matrix inversion may be unstable)
Experimental verification!
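For reference, the mean-square error used as the performance measure can be computed as below. The vectors are made-up numbers that only show the calculation; they are not the simulation results from the slides.

```python
def mse(estimate, truth):
    """Mean-square error between an estimate and the true concentrations."""
    if len(estimate) != len(truth):
        raise ValueError("vectors must have the same length")
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)


# Hypothetical values for illustration only.
true_c = [1.0, 2.0, 2.0]
raw_readout = [1.4, 1.5, 2.6]
estimate = [1.1, 1.9, 2.1]

print(mse(raw_readout, true_c), mse(estimate, true_c))
```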

Why is this Estimation Problem Important?
Microarrays measure expression levels of thousands of genes simultaneously
Assume that we are taking samples at different times during a biological process
Cluster data in the expression level space
  – relatedness in biological function often implies similarity in expression behavior (and vice versa)
  – similar expression behavior indicates co-expression
Clustering of expression level data depends heavily on the measurements
  – better estimation may lead to different functionality conclusions
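One common way to quantify "similar expression behavior" across time samples is the Pearson correlation between two genes' expression profiles. The slides do not say which similarity measure or clustering method is intended, so this choice, the gene names, and the values are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical expression levels at four time points.
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.1, 3.9, 6.2, 8.0],  # rises with gene_a
    "gene_c": [4.0, 3.0, 2.0, 1.0],  # falls as gene_a rises
}

print(pearson(profiles["gene_a"], profiles["gene_b"]))
print(pearson(profiles["gene_a"], profiles["gene_c"]))
```

Noise in the readouts perturbs these correlations directly, which is why the quality of the concentration estimates can change which genes end up grouped together.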

Summary
Microarray technologies are becoming increasingly important for medicine and biology
  – understanding how the cell functions, effects on the organism
  – towards diagnostics, personalized medicine
Plenty of interesting problems
  – combinatorial design techniques
  – statistical analysis of the data
  – signal processing / estimation