Published by Paul McDonald. Modified over 8 years ago.
1
Arrays: How do they work? What are they?
2
Arrays are inverted Northerns
Northern: Tissue sample (WT, Dwarf, Transgenic, Other species) -> Extract target RNA -> Label probe (YFG) + hybridise -> Quantify -> Next gene
3
Probe preparation: Acquire or generate probes ('all the genes you want'), then spot.
Target preparation: Extract RNA from your Control AND your Experimental plant. Label cDNA from sample 1 RNA …and sample 2 RNA.
4
Hybridise & Scan
- Identify 'spots', remove background, produce 'red/green' ratios.
- Link ratio to relative abundance.
- Link spot to gene.
- Link genes to each other.
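The step from spot intensities to 'red/green' ratios can be sketched as follows. This is a minimal illustration, not the method of any particular scanner software: the spot values, the gene names, and the `floor` parameter (which keeps background-subtracted signals positive) are all hypothetical.

```python
import math

def log2_ratio(red, green, red_bg=0.0, green_bg=0.0, floor=1.0):
    """Subtract local background from each channel, then return log2(red/green).

    `floor` is an assumed safeguard: it stops a signal at or below
    background from producing a zero or negative value inside the log.
    """
    r = max(red - red_bg, floor)
    g = max(green - green_bg, floor)
    return math.log2(r / g)

# Hypothetical spots: (gene, red signal, red background, green signal, green background)
spots = [
    ("geneA", 2100.0, 100.0, 600.0, 100.0),
    ("geneB", 550.0, 50.0, 550.0, 50.0),
    ("geneC", 300.0, 100.0, 900.0, 100.0),
]

# Link each spot to its gene and each ratio to relative abundance.
ratios = {gene: log2_ratio(r, g, rb, gb) for gene, r, rb, g, gb in spots}
for gene, value in ratios.items():
    print(f"{gene}: log2(red/green) = {value:+.2f}")
```

A log2 ratio of 0 means equal abundance in both samples; +1 and -1 mean a two-fold difference in either direction, which makes up- and down-regulation symmetric.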
5
Arrays: How do (did) you make them?
6
Arrayers
7
Before processing, we have a LOT of spots
8
Example hybridisation: after processing, we have a LOT of objective data
9
What biological questions can you answer with arrays?
10
Sorting out gene families
5 hormone response gene family members on a microarray, in different experiments. Biopsy type:
1. +hormone vs ctrl hyb
2. Normal vs pathology hyb
3. Testes vs brain hyb
11
However... duplication in genomes is a real problem (animal, plant, fungal).
12
Gene families (plant): number of members as a proportion of the genome (apart from wholesale duplication)

Family size      Unique   2       3     4      5      >5
Share of genes   35%      12.5%   7%    4.4%   3.6%   37.4%

Conservation between genes:
- 37% of genes are highly conserved (TBLASTX E < 10^-30)
- 10% more are partially conserved (TBLASTX E < 10^-5)
13
What goes on the slide? One choice would be: amplifications of cDNAs chosen by partial sequence (ESTs).
14
ESTs have inherent problems
Example: for one gene of interest, the slide carries three spots:
1. EST sequence 1
2. Homologous EST sequence 2
3. Dissimilar EST sequence 3
Labelled target may hybridise similarly to each.
15
A more precise, reliable solution: Affymetrix genechips
[Figure labels: Chip; Sequence (5' to 3'); Probes; Probe; Background control]
16
Chip: 1.28 cm. Feature: 5 - 50 µm.
- Millions of identical oligonucleotide probes per feature
- Millions of features/chip
- 49 - 400 chips/wafer
18
Synthesis of Ordered Oligonucleotide Arrays: one nucleotide at a time.
19
RNA Quality control
20
RNA (poly-A: AAAA) -> cDNA -> IVT (Biotin-UTP, Biotin-CTP) -> Biotin-labeled transcripts -> Fragment (heat, Mg2+) -> Fragmented cRNA -> Hybridise (16 hours) -> Wash & Stain -> Scan
21
Array + cRNA Target -> Hybridized Array -> Ab detection
22
Affymetrix software derives the intensity for each probe from the 75% quantile of the pixel values in each box.
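To illustrate why an upper quantile is a useful summary of a feature's pixels, here is a small sketch. The slide does not say which quantile definition the Affymetrix software uses, so the nearest-rank rule below is an assumption, and the pixel values are invented.

```python
import math

def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct% of the data lies at or below it. (Assumed definition; other
    interpolation rules give slightly different answers.)"""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical pixel values for one feature box, with one bright artefact.
pixels = [120, 135, 128, 900, 131, 126, 133, 129]

probe_intensity = percentile_nearest_rank(pixels, 75)
mean_intensity = sum(pixels) / len(pixels)
print(probe_intensity, mean_intensity)
```

The single outlier pixel pulls the mean far above the typical pixel value, while the 75% quantile stays within the range of the ordinary pixels.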
23
Expression Measure: the intensities of the multiple probes within a probeset are combined into ONE measure of expression.
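One simple way to combine probe intensities into a single value is sketched below. It takes the median of the log2 intensities, which is a stand-in chosen for brevity and not the Affymetrix or RMA summarisation algorithm; the function name and the probe values are hypothetical.

```python
import math
import statistics

def summarise_probeset(probe_intensities):
    """Return one expression value for a probeset: the median of the
    log2-transformed probe intensities (an assumed, simplified summary)."""
    return statistics.median(math.log2(x) for x in probe_intensities)

# Hypothetical probeset: five probes, one reading much higher than the rest.
probeset = [256.0, 512.0, 1024.0, 512.0, 4096.0]
expression = summarise_probeset(probeset)
print(expression)
```

Working on the log scale and using a median keeps a single aberrant probe from dominating the probeset value.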
24
A recent solution: RNAseq
3 'simple' steps. Take an RNA sample, then:
1. Sequence it and align it to the genome.
2. Count how many times each transcript appears.
3. Work out the frequency of each transcript.
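Steps 2 and 3 can be sketched in a few lines once alignment is done. The input here is a made-up list with one transcript ID per aligned read; real pipelines also correct for transcript length and sequencing depth, which this sketch deliberately ignores.

```python
from collections import Counter

# Hypothetical output of step 1: the transcript each read aligned to.
aligned_reads = ["tx1", "tx2", "tx1", "tx3", "tx1",
                 "tx2", "tx1", "tx1", "tx3", "tx1"]

# Step 2: count how many times each transcript appears.
counts = Counter(aligned_reads)

# Step 3: work out the frequency of each transcript (fraction of all reads).
total = sum(counts.values())
frequency = {tx: n / total for tx, n in counts.items()}

for tx in sorted(frequency):
    print(f"{tx}: {counts[tx]} reads, frequency {frequency[tx]:.2f}")
```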
25
RNAseq software – Tuxedo suite (2012)
Cufflinks (includes cuffmerge, cuffcompare, cuffdiff)
- Uses TopHat alignments to assemble the 'best' transcriptome.
- Estimates relative abundance based on how many reads support each transcript.
CummeRbund: Visualization of RNA-Seq analysis
- An R package for analysing Cufflinks RNA-Seq output.
Bowtie (2009)*: Ultrafast short read alignment
- Aligns short DNA reads at a rate of 25 million 35-bp reads per hour.
TopHat: Alignment of short RNA-Seq reads
- Aligns RNA-Seq reads to genomes using Bowtie.
- Identifies splice junctions between exons.
* Named after the Burrows-Wheeler transform algorithm (BWT)
26
Controlling Biological Variability
Biological variability contributes more to experimental variability than technical variability does. To mitigate biological variability:
- Consider all potential variables as part of the experiment design
- Increase the number of biological replicates until the Coefficient of Variation (CV) stabilizes
27
Percentage CV as Estimate of Variability
CV% is a measure of variation amongst replicates of a single condition, defined as the standard deviation divided by the mean, multiplied by 100.
Example: 5 signal values representing 5 replicates
- 230.4, 241.7, 252.9, 338.8, 178.9
- Mean = 248.54; SD = 57.9; CV% = 23.28%
CV% helps you assess pilot studies:
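The CV% calculation for the five replicate values can be reproduced with a short sketch. It assumes the sample standard deviation (n - 1 denominator), which is the choice that matches the SD quoted on the slide; the function name is hypothetical.

```python
import statistics

def cv_percent(values):
    """Coefficient of variation as a percentage:
    sample standard deviation / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# The five replicate signal values from the example.
signals = [230.4, 241.7, 252.9, 338.8, 178.9]

mean = statistics.mean(signals)
sd = statistics.stdev(signals)   # sample SD, n - 1 denominator (assumed)
cv = cv_percent(signals)
print(f"mean = {mean:.2f}, SD = {sd:.2f}, CV% = {cv:.2f}")
```

Recomputing `cv_percent` as replicates are added one at a time is one way to see when the CV stops changing, which is the stopping rule suggested for pilot studies.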
30
Biological Variability – same species sampling
Individual phenotypes do not reflect overall genetic variability (G vs E). Sources of variation: age, gender.
31
Chips need to be normalised against each other. Each chip is a different colour in this graph. Their intensity distributions do not coincide, so before chips can be compared they must be made comparable.
32
Data Normalization Methods
Scaling Factor (linear) normalization
- Works well when metrics are consistent
- Weakness: assumes error is uniform; assumes total mRNA is the same for all cells
Non-linear
- Can provide higher precision
- Requires invariant set
- Weakness: false confidence in poor data
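Scaling factor normalization can be sketched as multiplying every intensity on a chip by one constant so that all chips share the same average. The plain mean, the target value of 100, and the chip data are assumptions for illustration; production software may use a trimmed mean or a different target.

```python
import statistics

def scale_chip(intensities, target=100.0):
    """Linear scaling: multiply every intensity by target / chip mean,
    so the scaled chip has mean equal to `target` (assumed value)."""
    factor = target / statistics.mean(intensities)
    return [x * factor for x in intensities]

# Hypothetical chips: chip2 is uniformly twice as bright as chip1.
chips = {
    "chip1": [50.0, 100.0, 150.0],
    "chip2": [100.0, 200.0, 300.0],
}

scaled = {name: scale_chip(values) for name, values in chips.items()}
for name, values in scaled.items():
    print(name, values)
```

Because one factor is applied to the whole chip, this only corrects an overall brightness difference, which is the uniform-error assumption listed as the method's weakness.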
33
RMA uses Quantile normalisation at the probe level

Original intensities, by probe:
         PA     PB     PC     PD     PE
Chip 1   1      2      4      3      5
Chip 2   7      2      5      3      1
Chip 3   5      3      4      2      9

1. Order by ranks (sort each chip):
Chip 1   1      2      3      4      5
Chip 2   1      2      3      5      7
Chip 3   2      3      4      5      9

2. Average the intensities at each rank:
         1.33   2.33   3.33   4.67   7

3. Reorder by probe:
         PA     PB     PC     PD     PE
Chip 1   1.33   2.33   4.67   3.33   7
Chip 2   7      2.33   4.67   3.33   1.33
Chip 3   4.67   2.33   3.33   1.33   7
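The three steps of the worked example (order by ranks, average at each rank, reorder by probe) can be written as a short sketch. It uses the three-chip, five-probe data from the example and assumes no tied intensities within a chip; the function name is hypothetical and this is not the RMA implementation itself.

```python
def quantile_normalise(chips):
    """Quantile-normalise a list of chips (each a list of probe
    intensities in the same probe order). Assumes no ties within a chip."""
    n_probes = len(chips[0])

    # Step 1: order each chip's intensities by rank.
    sorted_chips = [sorted(chip) for chip in chips]

    # Step 2: average the intensities at each rank across chips.
    rank_means = [
        sum(chip[rank] for chip in sorted_chips) / len(chips)
        for rank in range(n_probes)
    ]

    # Step 3: reorder by probe, giving each probe the mean for its rank.
    normalised = []
    for chip in chips:
        order = sorted(range(n_probes), key=lambda i: chip[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalised.append(out)
    return normalised

# Probes PA..PE on Chip 1, Chip 2, Chip 3 (data from the worked example).
chips = [
    [1, 2, 4, 3, 5],
    [7, 2, 5, 3, 1],
    [5, 3, 4, 2, 9],
]
result = quantile_normalise(chips)
for name, chip in zip(["Chip 1", "Chip 2", "Chip 3"], result):
    print(name, [round(x, 2) for x in chip])
```

After normalisation every chip contains exactly the same set of values, only assigned to different probes, so the intensity distributions coincide.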
34
Seeing patterns in arrays
35
Showing patterns in arrays: scatter plot of array data, e.g. log Cy5 vs. log Cy3, or separate Affy chips.
36
The Yeast array
37
Yeast cell cycle array data set, organised by cycle expression
38
Differential expression patterns: Expression Roadmap for Wood formation
[Figure: heat map with a fold-change colour scale running from 15-fold in one direction, through 1:1, to 15-fold in the other (ticks at 15, 4, 3, 2, 1.5, 1:1); samples Phl, A, B, C, D, E; gene clusters labelled with Roman numerals]