Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.

Presentation transcript:

Obstacles that thwart a successful analysis of microarray data

Starting point: Biology experiment complete
Goal: Thorough mining of the data for useful information

Obstacles between the two:
Application of available statistical tools
Development of specific, more appropriate statistical tools for use with microarrays
Functional annotation of results
Inadequate computer skills to handle large datasets
Intimacy with the nature (strengths and deficiencies) of the raw data
Facile use of the computer operating system is absent
Biological interpretation

Benefits of the Gene Array Approach
1. Interrogates thousands of genes (12,000 55,000 28,869).
2. Versatile with respect to tissues.
3. Recently expanded beyond major biomedical research models.
4. Asks which genes are affected by a treatment.
5. Equivalent to 35,000 northern blots overnight.
6. Time course experiments gain immense value.

GeneChip

What is covered in this course?
Total RNA
Hyb. cRNA cocktail
Hybridize to Affy arrays
Generate Affy .dat file
Output as Affy .chp file
Export as text file
Data mining, pattern mining, pathway analysis
Facilities: Illumina platform at CCF facility; Case/UH Core Facility

WT Expression Array Design
Perfect match (PM) probe: 25-mer DNA oligo
Probe set: <= 26 probes; only PM probes are used
Figure labels: 5', 3', Perfect Match, Mismatch, PM probe cell
Probes are validated using BLAST and Tm
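The slide does not say which Tm model is used to validate probes. As a minimal sketch, assuming a simple GC-content approximation for oligos longer than 13 nt (Tm = 64.9 + 41 * (G + C - 16.4) / N), a 25-mer probe could be screened like this. The function name and the choice of formula are illustrative assumptions, not the array vendor's method.

```python
def approximate_tm(probe: str) -> float:
    """Rough melting temperature (deg C) from GC content.

    Assumption: the basic GC-content formula for oligos longer than 13 nt.
    Real probe design typically uses nearest-neighbor thermodynamics.
    """
    seq = probe.upper()
    if len(seq) < 14:
        raise ValueError("formula is intended for oligos longer than 13 nt")
    if set(seq) - set("ACGT"):
        raise ValueError("probe must contain only A, C, G, T")
    gc_count = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc_count - 16.4) / len(seq)


# Hypothetical 25-mer used only for illustration.
example_probe = "A" * 13 + "G" * 12
example_tm = approximate_tm(example_probe)
```

A screen like this only flags probes whose Tm falls outside a chosen window; specificity still has to be checked separately, for example with BLAST as the slide notes.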

cRNA preparation
Starting material: total RNA (1-5 µg), polyadenylated (AAAAAAAAA)
cDNA strand 1 synthesis: SS II reverse transcriptase; T7 RNA pol. promoter on the oligo(dT) primer
cDNA strand 2 synthesis: E. coli DNA pol. I
IVT cRNA synthesis (T7 RNA pol.): amplifies and labels transcripts with biotin
Fragmented cRNA: cRNA is now ready for hybridization to test chip
Summary: 1. Conversion to cRNA 2. Amplification (linear) 3. Labelling (biotin)

Sample is added to a hybridization cocktail along with spiked control transcripts and is loaded onto an Affymetrix array chip.
Chip is placed in a hybridization oven and incubated overnight.
Chips are placed in the fluidics station, where they are washed, stained and washed again (2.5 hours).
After staining, the signal intensities are measured with a laser scanner (15 min).
Data is acquired by the computer as soon as the scan has been completed.

The first image is “sample1.dat”. Note the pixel-to-pixel variation within a probe cell.
A “*.cel” file is automatically generated when the “*.dat” image first appears on the screen. Note that this derivative file has homogeneous signal intensity within its probe cells.
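The step from a pixel-level image to one value per probe cell can be sketched as follows. This is a simplified illustration, not the vendor's file-handling code: it assumes a square grid of equally sized cells and summarizes each cell by a percentile of its pixels (the 75th percentile is used here as an assumption; border-pixel handling is ignored). All names are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return float(ordered[0])
    pos = (len(ordered) - 1) * q / 100.0
    low = int(pos)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (pos - low)


def summarize_cells(image, cell_size, q=75.0):
    """Collapse a pixel image (list of rows) into one intensity per probe cell."""
    n_cell_rows = len(image) // cell_size
    n_cell_cols = len(image[0]) // cell_size
    cells = []
    for r in range(n_cell_rows):
        row_out = []
        for c in range(n_cell_cols):
            # Gather every pixel that belongs to this probe cell.
            pixels = [
                image[r * cell_size + i][c * cell_size + j]
                for i in range(cell_size)
                for j in range(cell_size)
            ]
            row_out.append(percentile(pixels, q))
        cells.append(row_out)
    return cells


# Toy 2 x 4 pixel image holding two 2 x 2 probe cells.
toy_image = [[1, 2, 10, 10],
             [3, 4, 10, 10]]
toy_cells = summarize_cells(toy_image, cell_size=2)
```

Each probe cell ends up with a single number, which is why the derived file looks homogeneous within a cell while the raw image does not.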

How do we get the individual gene signals using RMA in EC?
[Figure: probe-level grid for Sample 1, Sample 2 and Sample 3. Gene 1 to Gene 3 are each represented by four probes (g1p1 to g3p4), shown first in gene order and then reordered within each sample; the final step is labelled "Average".]
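The reordering and averaging shown on this slide resemble two steps commonly described for RMA: quantile normalization across samples, then summarizing each gene's probes by median polish on log2 values. The sketch below illustrates those two steps under that assumption. It omits RMA's background correction, handles ties naively, and uses hypothetical function names; for real data an established implementation (for example an `rma` function in a Bioconductor package) is the safer choice.

```python
from math import log2
from statistics import mean, median


def quantile_normalize(columns):
    """columns: one list of probe intensities per sample, all the same length."""
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: average of the k-th smallest value across samples.
    reference = [mean(col[k] for col in sorted_cols) for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]  # replace each value by the reference at its rank
        normalized.append(out)
    return normalized


def median_polish(matrix, n_iter=10):
    """matrix: rows = probes of one gene, columns = samples (log2 scale).

    Returns one expression estimate per sample (overall + sample effect).
    """
    resid = [row[:] for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        for i in range(n_rows):  # sweep out probe (row) medians
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        for j in range(n_cols):  # sweep out sample (column) medians
            m = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    return [overall + c for c in col_eff]


def gene_signals(columns, probes_per_gene):
    """Normalize all probes, then summarize consecutive probe blocks per gene."""
    norm = quantile_normalize(columns)
    n_probes = len(norm[0])
    signals = []
    for start in range(0, n_probes, probes_per_gene):
        block = [[log2(norm[s][p]) for s in range(len(norm))]
                 for p in range(start, start + probes_per_gene)]
        signals.append(median_polish(block))
    return signals
```

With three samples and three genes of four probes each, as on the slide, `gene_signals(columns, probes_per_gene=4)` would return three lists of three log2 values, one value per gene and sample.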


SOMs
Self-organizing maps (SOMs) are a popular method for detecting patterns in large data sets.
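To make the idea concrete, here is a minimal sketch of a one-dimensional SOM in plain Python. It assumes each input is an expression profile (one number per sample or time point) and that map nodes sit on a line. The learning-rate and neighborhood schedules, and all names, are illustrative choices rather than the settings of any particular software package.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda k: sum((w - v) ** 2 for w, v in zip(weights[k], x)),
    )


def train_som(data, n_nodes, n_epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM; returns one weight vector (prototype profile) per node."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialize each node from a randomly chosen input profile.
    weights = [list(rng.choice(data)) for _ in range(n_nodes)]
    sigma0 = max(n_nodes / 2.0, 1.0)
    for epoch in range(n_epochs):
        frac = epoch / n_epochs
        lr = lr0 * (1.0 - frac)                   # learning rate shrinks over time
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighborhood shrinks over time
        for x in rng.sample(data, len(data)):     # visit profiles in random order
            bmu = best_matching_unit(weights, x)
            for k in range(n_nodes):
                # Nodes near the winner on the map are pulled more strongly.
                h = math.exp(-((k - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(dim):
                    weights[k][d] += lr * h * (x[d] - weights[k][d])
    return weights


# Toy profiles: two low-expression and two high-expression genes.
toy_profiles = [[0.0, 0.1], [0.2, 0.0], [9.8, 10.0], [10.0, 9.9]]
toy_weights = train_som(toy_profiles, n_nodes=2)
toy_assignments = [best_matching_unit(toy_weights, p) for p in toy_profiles]
```

After training, genes assigned to the same node have similar profiles, and neighboring nodes tend to hold related patterns, which is what makes the map useful for browsing large expression data sets.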


The information content of the human genome
ENCODE Consortium (Nature 2007 Vol 447: )
The Human Genome:
7% not transcribed
Protein-coding genes: 1% ORF, 1% UTR, 35-40% intron
Non-protein-coding RNAs: small RNAs, long ncRNAs
~10% functional

The increase in complexity among eukaryotes is concomitant with an increase in the ratio of non-coding to coding DNA (Mattick, 2007).

Obstacles that thwart a successful analysis of microarray data

Starting point: Biology experiment complete
Goal: Thorough mining of the data for useful information

Obstacles between the two:
Application of available statistical tools
Development of specific, more appropriate statistical tools for use with microarrays
Functional annotation of results
Inadequate computer skills to handle large datasets
Intimacy with the nature (strengths and deficiencies) of the raw data
Facile use of the computer operating system is absent
Biological interpretation