Arrays: How do they work? What are they?


1 Arrays: How do they work? What are they?

2 Arrays are inverted Northerns. Tissue samples: WT, Dwarf, Transgenic, Other species. Extract target RNA; label probe (YFG) + hybridise; quantify; next gene.

3 Probe preparation: Acquire or generate probes (‘All the genes you want’). Spot. Target preparation: Extract RNA from your Control AND your Experimental plant. Label cDNA from sample 1 RNA …and sample 2 RNA.

4 Hybridise & Scan. Identify ‘spots’, remove background, produce ‘red/green’ ratios. Link ratio to relative abundance. Link spot to gene. Link genes to each other.
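The background and ratio steps on slide 4 can be sketched in a few lines. This is a minimal illustration, not the method of any particular scanner software: the function name, the `floor` clamp and the example intensities are all assumptions made for the sketch.

```python
import math

def log2_ratio(red_fg, red_bg, green_fg, green_bg, floor=1.0):
    """Background-corrected log2(red/green) ratio for one spot."""
    # Subtract the local background from each channel; clamp to `floor`
    # (an assumed safeguard) so a dim spot never yields log of a value <= 0.
    red = max(red_fg - red_bg, floor)
    green = max(green_fg - green_bg, floor)
    return math.log2(red / green)

# Hypothetical spot: red 900 over background 100, green 300 over background 100.
spot_ratio = log2_ratio(900, 100, 300, 100)
print(spot_ratio)  # 2.0, i.e. four-fold more red than green signal
```

Working in log2 makes a two-fold increase (+1) and a two-fold decrease (-1) symmetric around zero.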

5 Arrays: How do (did) you make them?

6 Arrayers

7 Before processing, we have a LOT of spots

8 Example hybridisation: after processing, we have a LOT of objective data

9 What biological questions can you answer with arrays?

10 Sorting out gene families (microarray): 5 hormone response gene family members in different experiments. Biopsy type: hyb 1. +hormone vs ctrl; hyb 2. normal vs pathology; hyb 3. testes vs brain.

11 However... duplication in genomes is a real problem: animal, plant, fungal.

12 Apart from wholesale duplication: gene families (plant), # of members as a proportion of the genome
Members:     Unique  2      3    4     5     >5
Proportion:  35%     12.5%  7%   4.4%  3.6%  37.4%
Conservation between genes: 37% of genes are highly conserved (TBLASTX E < 10^-30); 10% more are partially conserved (TBLASTX E < 10^-5)

13 What goes on the slide? One choice would be: amplifications of cDNAs chosen by partial sequence (ESTs).

14 ESTs have inherent problems. Example: gene of interest; EST sequence 1; homologous EST sequence 2; dissimilar EST sequence 3. On the slide: 1, 2, 3. Labelled target may hybridise similarly to each.

15 A more precise, reliable solution: Affymetrix genechips. [Figure labels: Probes, Sequence, Probe, Background control, Chip, 5’, 3’]

16 Chip: 1.28 cm. Feature: 5 - 50 µm. Millions of identical oligonucleotide probes per feature; millions of features/chip; 49 - 400 chips/wafer.

17

18 Synthesis of Ordered Oligonucleotide Arrays One nucleotide at a time.

19 RNA Quality control

20 RNA (AAAA) -> cDNA -> IVT (Biotin-UTP, Biotin-CTP) -> Biotin-labeled transcripts (cRNA) -> Fragment (heat, Mg2+) -> Fragmented cRNA -> Hybridise (16 hours) -> Wash & Stain -> Scan

21 Array + cRNA target -> Hybridized array -> Ab detection

22 Affymetrix software derives the intensity for each probe from the 75% quantile of the pixel values in each box.
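The 75% quantile rule on slide 22 can be illustrated with NumPy. This is only a sketch of the statistic itself: the function name and the 3 x 3 pixel box are made up for the example, and real software also has to decide which pixels belong to each box.

```python
import numpy as np

def probe_intensity(box_pixels):
    """Return the 75% quantile of the pixel values in one probe's box."""
    # np.percentile flattens the 2-D box and interpolates linearly by default.
    return float(np.percentile(box_pixels, 75))

# Hypothetical 3 x 3 box of pixel values 1..9.
box = np.arange(1, 10).reshape(3, 3)
cell_value = probe_intensity(box)
print(cell_value)  # 7.0
```

A quantile is less sensitive than the mean to a few very bright or very dark pixels inside the box.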

23 The intensities of the multiple probes within a probeset are combined into ONE measure of expression Expression Measure
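Slide 23 does not say which summary is used to combine the probes, so the following is only one simple possibility, assumed for illustration: the median of the log2 probe intensities. Published methods use more elaborate robust summaries (RMA, for example, fits a median polish across chips). The function name and intensities are hypothetical.

```python
import math
import statistics

def summarise_probeset(intensities):
    """Combine the probe intensities of one probeset into a single log2 value."""
    # The median is an assumed, deliberately simple robust summary:
    # a single outlying probe cannot drag the expression measure with it.
    return statistics.median(math.log2(v) for v in intensities)

# Hypothetical probeset of five probes.
expression = summarise_probeset([256, 4096, 1024, 512, 2048])
print(expression)  # 10.0
```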

24 A recent solution: RNAseq. 3 ‘simple’ steps. Take an RNA sample, then: 1. sequence it and align it to the genome; 2. count how many times each transcript appears; 3. work out the frequency of each transcript.
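Steps 2 and 3 on slide 24 amount to turning read counts into proportions. The sketch below assumes the alignment step has already produced a count per transcript; the transcript names and counts are invented. It ignores transcript length, which real tools correct for because longer transcripts yield more reads.

```python
def transcript_frequencies(counts):
    """Convert per-transcript read counts into fractions of all counted reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads were counted")
    return {name: n / total for name, n in counts.items()}

# Hypothetical counts after alignment (step 1) and counting (step 2).
read_counts = {"tx_a": 50, "tx_b": 30, "tx_c": 20}
frequencies = transcript_frequencies(read_counts)
print(frequencies["tx_a"])  # 0.5
```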

25 RNAseq software – Tuxedo suite (2012)
Cufflinks (includes cuffmerge, cuffcompare, cuffdiff)
- Uses TopHat alignments to assemble the ‘best’ transcriptome.
- Estimates relative abundance based on how many reads support each transcript.
CummeRbund: Visualization of RNA-Seq analysis
- An R package for analysing Cufflinks RNA-Seq output.
Bowtie (2009)*: Ultrafast short read alignment
- Aligns short DNA reads at a rate of 25 million 35-bp reads per hour.
TopHat: Alignment of short RNA-Seq reads
- Aligns RNA-Seq reads to genomes using Bowtie.
- Identifies splice junctions between exons.
* Named after the Burrows-Wheeler transform algorithm (BWT)

26 Controlling Biological Variability
Biological variability contributes more to experimental variability than technical variability.
To mitigate biological variability:
- Consider all potential variables as part of the experiment design
- Increase the number of biological replicates until the Coefficient of Variation (CV) stabilizes

27 Percentage CV as Estimate of Variability
CV% is a measure of variability amongst replicates of a single condition.
Defined as the standard deviation divided by the mean, multiplied by 100.
Example: 5 signal values representing 5 replicates
- 230.4, 241.7, 252.9, 338.8, 178.9
- Mean = 248.54; σ = 57.9 (sample standard deviation); CV% = 23.3%
CV% helps you assess pilot studies:
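The CV% example on slide 27 can be checked with the standard library. This sketch assumes the sample standard deviation (n - 1 in the denominator), which is what reproduces the 57.9 quoted on the slide; the function name is made up.

```python
import statistics

def cv_percent(values):
    """Coefficient of variation: sample standard deviation / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# The five replicate signal values from the slide.
signals = [230.4, 241.7, 252.9, 338.8, 178.9]
cv = cv_percent(signals)
print(round(statistics.mean(signals), 2), round(statistics.stdev(signals), 1), round(cv, 1))
# 248.54 57.9 23.3
```

Recomputing `cv_percent` each time a replicate is added is one way to see whether the CV has stabilised.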

28

29

30 Biological Variability – same species sampling: Age, Gender, G vs E. Individual phenotypes do not reflect overall genetic variability.

31 Chips need to be normalised against each other. Each chip is a different colour in this graph. They are not co-incident for intensities. To compare they need to be comparable.

32 Data Normalization Methods
Scaling Factor (linear) normalization
- Works well when metrics are consistent
- Weakness: assumes error is uniform; assumes total mRNA is the same for all cells
Non-linear
- Can provide higher precision
- Requires invariant set
- Weakness: false confidence in poor data
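Scaling factor normalization from slide 32 can be sketched as multiplying each chip by one constant so that all chips share the same average intensity. The choice of the plain mean as the average and of the mean of chip means as the default target are assumptions for this example (a trimmed mean is a common alternative); the data are invented.

```python
import numpy as np

def scale_normalise(chips, target=None):
    """Linearly scale each chip (column) so its mean intensity equals `target`."""
    chips = np.asarray(chips, dtype=float)
    chip_means = chips.mean(axis=0)
    if target is None:
        # Assumed default: aim for the average of the per-chip means.
        target = chip_means.mean()
    factors = target / chip_means  # one scaling factor per chip
    return chips * factors, factors

# Hypothetical data: rows are probes, columns are two chips.
raw = [[100, 200], [300, 600]]
scaled, factors = scale_normalise(raw)
print(factors)  # [1.5  0.75]
```

Because every probe on a chip gets the same factor, this cannot correct intensity-dependent differences between chips, which is the weakness the slide points out.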

33 RMA uses Quantile normalisation at the probe level

Original intensities
Probe  Chip 1  Chip 2  Chip 3
PA     1       7       5
PB     2       2       3
PC     4       5       4
PD     3       3       2
PE     5       1       9

Order by ranks
Rank   Chip 1  Chip 2  Chip 3
1      1       1       2
2      2       2       3
3      3       3       4
4      4       5       5
5      5       7       9

Average the intensities at each rank
Rank   Chip 1  Chip 2  Chip 3
1      1.33    1.33    1.33
2      2.33    2.33    2.33
3      3.33    3.33    3.33
4      4.67    4.67    4.67
5      7       7       7

Reorder by probe
Probe  Chip 1  Chip 2  Chip 3
PA     1.33    7       4.67
PB     2.33    2.33    2.33
PC     4.67    4.67    3.33
PD     3.33    3.33    1.33
PE     7       1.33    7
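The three steps on slide 33 (order by ranks, average at each rank, reorder by probe) can be reproduced with NumPy. This is a minimal sketch using the slide's own numbers. It assumes there are no tied values within a chip, and the function name is made up; it is not the RMA implementation itself.

```python
import numpy as np

def quantile_normalise(x):
    """Quantile-normalise columns (chips) of a probes x chips matrix."""
    x = np.asarray(x, dtype=float)
    # Rank of each probe within its own chip (0 = lowest); assumes no ties.
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    # Order by ranks, then average the intensities at each rank across chips.
    rank_means = np.sort(x, axis=0).mean(axis=1)
    # Reorder by probe: each value is replaced by the mean for its rank.
    return rank_means[ranks]

# Rows are probes PA..PE, columns are Chip 1..Chip 3 (values from the slide).
chips = [[1, 7, 5],
         [2, 2, 3],
         [4, 5, 4],
         [3, 3, 2],
         [5, 1, 9]]
normalised = quantile_normalise(chips)
print(np.round(normalised, 2))
```

After normalisation every chip contains exactly the same set of values, only assigned to different probes.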

34 Seeing patterns in arrays

35 Showing patterns in arrays: scatter plot of array data, e.g. log Cy5 vs. log Cy3; separate Affy chips

36 The Yeast array

37 Yeast cell cycle array data set, organised by cycle expression

38 Differential expression patterns: Expression Roadmap for Wood formation. [Figure: fold change scale 15, 4, 3, 2, 1.5, 1:1, 1.5, 2, 3, 4, 15; samples Phl, A, B, C, D, E; numbered expression clusters]

