1
Transcriptomics History and practice
2
Early RNA analysis used Northerns: one gene at a time
Workflow: extract target RNA from each sample (tissue sample, transgenic, other species, dwarf, WT); label the probe for YFG and hybridise; quantify RNA levels; then repeat for the next gene.
3
Northerns are too slow for Systems Biology where we want to assay ALL transcripts simultaneously
Massive datasets for thousands of genes.
Genes, proteins and metabolites link together into biological SYSTEMS.
4
Arabidopsis Merged Network 19392 nodes and 72715 edges
Proteins (red), metabolites (blue) and genes (green).
EXAMPLE: Cytoscape software allows the visualisation of all transcript levels for an organism.
This one is based on ARRAY data: Arabidopsis transcriptome network (Ma et al., Genome Research 2007).
5
Mass transcript profiling: Transcriptomics
Historically (pre-2000): sequencing ESTs and ranking representation; differential display (random 5’ primers, fixed polyA primers).
Post 2000: microarrays and RNAseq.
6
Microarrays
Probe preparation: acquire or generate probes (‘All the genes you want’), then spot them.
Target preparation: extract RNA from your Control AND your Experimental plant, then label cDNA from sample 1 RNA and from sample 2 RNA.
7
Microarrays: Hybridise & Scan
Identify ‘spots’, remove background, produce ‘red/green’ ratios.
Link ratio to relative abundance. Link spot to gene. Link genes to each other: networks / systems.
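The ratio step can be sketched as below. This is a minimal illustration, not the scanner software's method: the function name, the `floor` parameter, the spot intensities, and the choice of red for sample 1 are all assumptions made for the example.

```python
import math

def log2_ratio(red, green, red_bg=0.0, green_bg=0.0, floor=1.0):
    """Return log2(red / green) after subtracting local background.

    `floor` is an assumed lower bound that keeps the corrected
    intensities positive so the logarithm is always defined.
    """
    r = max(red - red_bg, floor)
    g = max(green - green_bg, floor)
    return math.log2(r / g)

# Hypothetical spots: (red, green, red background, green background)
spots = {
    "geneA": (2100.0, 600.0, 100.0, 100.0),   # more red than green
    "geneB": (600.0, 600.0, 100.0, 100.0),    # equal signal in both channels
    "geneC": (350.0, 1100.0, 100.0, 100.0),   # more green than red
}

ratios = {gene: log2_ratio(*values) for gene, values in spots.items()}
print(ratios)
```

On the log2 scale a value of 0 means equal abundance in both samples, and +1 or -1 means a two-fold difference in either direction.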
8
Before processing, we have a LOT of spots
‘Landing lights’; xyz normalisation.
After processing, we have a LOT of objective data.
9
Learning outcome: What biological questions can be explored with transcriptomics?
10
Arrays can separate similar genes
Pretend specialist microarray with only 5 genes, ALL responding to a hormone. Plus hormone vs control (i.e. a known / expected challenge): all ‘on’.
The classic types of array experiments:
1. Normal vs challenge (e.g. pathology, induction)
2. Tissue A vs Tissue B (e.g. muscle vs liver)
11
Remember: Genomes are not tidy – duplication is common
Examples: plant (Arabidopsis), fungal (yeast), animal (human).
This is a big problem for arrays: cross-hybridisation.
12
Apart from gross syntenic duplication
Gene families (recycling of function) are common, e.g. in Arabidopsis:

Gene family size           Unique   2       3     4      5      >5
Proportion of the genome   35%      12.5%   7%    4.4%   3.6%   37.4%

Conservation at the base-pair level within genes:
37% of genes highly conserved (TBLASTX E < 10^-30)
10% partially conserved (TBLASTX E < 10^-5)
13
Pioneer arrays were cDNAs
Derived from mRNA copied into cDNA by reverse transcriptase and cloned. Selected based on partial sequence primed from vector cloning sites (e.g. SP6, T7, T3). Commonly called ESTs (Expressed Sequence Tags).
14
ESTs can be misleading
Gene of interest:
1. Example EST sequence
2. Homologous EST sequence
3. Dissimilar EST sequence
On the slide (spots 1, 2, 3): labelled target cross-hybridises.
15
Genechips have better specificity
Known gene sequences (5’ to 3’) go through algorithmic selection to give multiple short probes (25-mers) for hybridisation.
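One way to picture the algorithmic selection is as a filter that keeps only short probes unique to the target gene. The sketch below is a deliberate simplification: it rejects exact matches only, whereas real probe design also weighs near-matches and hybridisation behaviour. The function names, the `max_probes` default, and the toy sequences (with k = 5 instead of 25, for readability) are assumptions.

```python
def kmers(sequence, k=25):
    """Return every k-length substring (candidate probe) of a sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

def specific_probes(target, others, k=25, max_probes=11):
    """Pick candidate probes from `target` that occur in no other gene.

    `max_probes` is an assumed cap on probes per gene.
    """
    seen_elsewhere = set()
    for seq in others:
        seen_elsewhere.update(kmers(seq, k))
    chosen = [p for p in kmers(target, k) if p not in seen_elsewhere]
    return chosen[:max_probes]

# Toy family: the two genes share their first seven bases.
gene_of_interest = "ACGTACGGTTCA"
family_member = "ACGTACGAAAAA"

probes = specific_probes(gene_of_interest, [family_member], k=5)
print(probes)
```

Probes drawn from the shared region are dropped, which is how short probes avoid the cross-hybridisation seen with whole ESTs.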
16
Example single colour target labelling - 3’ IVT
RNA target preparation:
1. RNA isolation is the first step. 1-2 hours
2. The messenger RNA is then reverse transcribed into cDNA (we then go on to make the second strand of cDNA). 4 hours
3. An in vitro transcription reaction using biotinylated nucleotides (biotin-UTP, biotin-CTP) is then done to both amplify and label the transcripts. … hours
4. These biotin-labelled transcripts are then fragmented (heat, Mg2+) to get a more efficient hybridisation (… bases is the goal). 1.0 hours
5. The fragmented cRNA target is then hybridised overnight to a GeneChip expression array. 16 hours
6. When washing and staining of the array is complete it can then be scanned. … hours
Diagram labels: RNA (AAAA), cDNA, IVT or WT, biotin-labelled transcripts, fragmented cRNA, hybridise (16 hours), wash & stain, scan.
17
Detection: Hybridisation and staining
Diagram labels: array; biotin-labelled cRNA target; hybridisation; antibody detection.
18
Each probe call is derived from the 75% quantile of the pixel values (sweet spot).
All the probes of a probeset (gene) are combined into ONE measure of expression
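Both steps can be sketched in a few lines. The 75% quantile comes from the slide; the way probes are combined is not stated, so the median of log2 probe calls is used here purely as a stand-in. Function names and the numbers are hypothetical.

```python
import math
import statistics

def probe_call(pixels):
    """Return the 75% quantile of one probe cell's pixel intensities."""
    return statistics.quantiles(pixels, n=4, method="inclusive")[2]

def probeset_expression(probe_calls):
    """Combine the probes of one probeset into a single log2 measure.

    Assumption: the median of log2 probe calls stands in for the
    summarisation step, which the slides do not specify.
    """
    return statistics.median(math.log2(call) for call in probe_calls)

# Hypothetical pixel values for one probe cell
call_example = probe_call([10, 20, 30, 40, 50])

# Hypothetical probe calls for the probes of one gene
expression_example = probeset_expression([64.0, 128.0, 256.0])

print(call_example, expression_example)
```

Using a median rather than a mean keeps one badly behaved probe from dominating the gene-level value.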
19
Data handling: Chips need to be normalised against each other.
Each different colour line maps all the intensities of a single chip. They are NOT co-incident lines (e.g. yellow and black are outliers). To compare chips, their intensity distributions need to be comparable.
20
Normalisation
RMA is a very powerful but simple process that works at the probe level:
1. Start with the intensity of each probe (PA, PB, PC, PD, PE) on each chip (Chip 1, Chip 2, Chip 3).
2. Order by ranks within each chip.
3. Average the intensities at each rank.
4. Reorder by probe.
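The rank, average, reorder procedure is the quantile normalisation step of RMA, and can be sketched as follows. This is a minimal version that breaks ties by probe order and omits RMA's background correction and summarisation; the function name and the intensities are made up for illustration.

```python
def quantile_normalise(chips):
    """chips: dict of chip name -> list of probe intensities.

    Every list must hold the same probes in the same order.
    """
    names = list(chips)
    n_probes = len(chips[names[0]])

    # Order by ranks: sort the intensities within each chip.
    ranked = {name: sorted(chips[name]) for name in names}

    # Average the intensities at each rank across chips.
    rank_means = [
        sum(ranked[name][rank] for name in names) / len(names)
        for rank in range(n_probes)
    ]

    # Reorder by probe: each probe receives the mean for its rank.
    normalised = {}
    for name in names:
        order = sorted(range(n_probes), key=lambda i: chips[name][i])
        values = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            values[probe_index] = rank_means[rank]
        normalised[name] = values
    return normalised

# Hypothetical intensities for probes PA, PB, PC, PD, PE
raw = {
    "Chip 1": [5.0, 2.0, 3.0, 4.0, 1.0],
    "Chip 2": [8.0, 2.0, 6.0, 4.0, 10.0],
    "Chip 3": [3.0, 9.0, 6.0, 12.0, 15.0],
}
normalised = quantile_normalise(raw)
print(normalised)
```

After this step every chip has exactly the same set of intensity values, only assigned to different probes, so the per-chip distribution lines coincide.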
21
RMA Normalisation makes data more comparable
So we can derive / display differentially expressed genes as candidates for further research, e.g. with a volcano plot or a trend graph.
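A volcano plot places each gene at x = log2 fold change and y = -log10(p-value). The sketch below computes those coordinates, assuming expression values are already on a log2 scale (as RMA reports them) and that p-values come from a statistical test run elsewhere. Gene names, values, and both cutoffs are hypothetical.

```python
import math
import statistics

def volcano_point(control_log2, treated_log2, p_value):
    """Return (x, y) for one gene on a volcano plot."""
    log2_fold_change = statistics.mean(treated_log2) - statistics.mean(control_log2)
    return log2_fold_change, -math.log10(p_value)

def is_candidate(x, y, min_abs_log2fc=1.0, max_p=0.05):
    """Flag genes beyond both the fold-change and p-value cutoffs (assumed values)."""
    return abs(x) >= min_abs_log2fc and y >= -math.log10(max_p)

# Hypothetical replicate data: (control log2 values, treated log2 values, p-value)
genes = {
    "geneA": ([7.0, 7.5, 6.5], [9.0, 9.5, 8.5], 0.001),
    "geneB": ([8.0, 8.5, 7.5], [8.0, 8.25, 7.75], 0.5),
    "geneC": ([10.0, 10.5, 9.5], [8.0, 8.5, 7.5], 0.01),
}

points = {gene: volcano_point(*values) for gene, values in genes.items()}
candidates = [gene for gene, (x, y) in points.items() if is_candidate(x, y)]
print(candidates)
```

Genes in the upper left and upper right corners of the plot are the ones that change a lot and do so consistently across replicates.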
22
RNAseq, a complementary solution: 3 ‘simple’ steps
Take an RNA sample, then:
1. Sequence it and align it to the genome.
2. Count how many times each transcript appears.
3. Work out the frequency of each transcript.
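Steps 2 and 3 reduce to counting and dividing. The sketch below assumes step 1 has already been done and that each aligned read has been assigned to exactly one transcript; the transcript names and read counts are invented.

```python
from collections import Counter

def transcript_frequencies(assigned_reads):
    """Count reads per transcript, then convert counts to frequencies."""
    counts = Counter(assigned_reads)
    total = sum(counts.values())
    return {transcript: n / total for transcript, n in counts.items()}

# Hypothetical output of the alignment step: one transcript ID per read
assigned_reads = ["T1"] * 6 + ["T2"] * 3 + ["T3"] * 1

frequencies = transcript_frequencies(assigned_reads)
print(frequencies)
```

Raw frequencies like these ignore transcript length, so a long transcript collects more reads than a short one at the same abundance.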
23
RNAseq software – Tuxedo suite (2012)
Bowtie (2009)*: ultrafast short read alignment. Aligns short DNA reads at 25 million 35-bp reads per hour. Named after the Burrows-Wheeler transform algorithm (BWT).
TopHat: alignment of short RNA-Seq reads. Aligns RNA-Seq reads to genomes using Bowtie. Identifies splice junctions between exons.
Cufflinks (includes cuffmerge, cuffcompare, cuffdiff): uses TopHat alignments to assemble the ‘best’ transcriptome. Estimates relative abundance based on how many reads support each transcript.
CummeRbund: visualisation of RNA-Seq analysis. R package for Cufflinks RNA-Seq output.
24
FPKM: Fragments Per Kilobase of transcript per Million mapped reads
FDR-adjusted p-values (q-values) - replication
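Both quantities on this slide are simple to compute by hand. The sketch below gives FPKM from its definition and, as one common way to obtain FDR-adjusted p-values, the Benjamini-Hochberg procedure; the slide does not name the adjustment method, so that choice is an assumption, as are the function names and example numbers.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)

def benjamini_hochberg(p_values):
    """Return FDR-adjusted p-values (q-values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

# Hypothetical transcript: 500 fragments, 2000 bp long, 10 million mapped fragments
expression = fpkm(500, 2000, 10_000_000)

# Hypothetical p-values for four genes
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(expression, q_values)
```

Replication matters here because the p-values being adjusted can only be estimated when each condition has more than one sample.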
25
Figure panels: Mutant, Control
26
Once we have candidates - we can discover their function...
GO annotations of genes higher in muscle.
GO annotations of genes higher in liver.
Graham et al. (2011) Animal
27
...and use these to allocate those differential genes to pathways and biological systems