Published by Barbra Berry. Modified over 9 years ago.
1
modENCODE, August 20-21, 2007
Drosophila Transcriptome: Aim 2.2
2
Aim 2.2: Experimental Validation of Transcript Models
1. Experimental verification of selected splice sites in transcript models (short RT-PCR)
2. Mapping transcript ends using RACE
3. Screening cDNA libraries for transcripts
4. Recovering cDNA clones using long RT-PCR
5. High-throughput sequencing of small RNAs
6. Submitting sequence data to databases
7. Reviewing the transcriptome annotation
3
Experiments at LBNL
Transcript Ends
- TSSs: 20,000 targeted 5’ RACE experiments
- poly-A: 1,000 targeted 3’ RACE experiments
Full-Length Transcript Structures
- 6,000 cDNA screens and full-insert sequencing
- 3,000 long RT-PCRs and full-insert sequencing
Small RNA Sequencing
- 15 runs on 454 Life Sciences device
- Size fractionate < 500 nt (larger range than Eric Lai)
4
Mapping TSSs
- 5’ RLM-RACE is a simple, scalable method
- RLM primer replaces the 5’ cap structure
- Gene-specific primers are nested and near the 5’ end
- Sequence 8 clones
- Direct sequencing is also proposed but is difficult
- We are prioritizing transcripts and tissues using our 5’ EST data
5
TSSs: Slippery vs Discrete
[Figure panels: head RACE products, larval RACE products, cDNAs]
6
Cap-Trapped 5’ ESTs Define Discrete…
7
…and Slippery Transcription Start Sites
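The slides contrast discrete and slippery start sites but do not define a measure for the difference. One possible way to quantify it, offered only as an illustrative sketch and not as the presenters' method, is the Shannon entropy of the mapped 5’ end positions for a transcript: all ends at one position gives 0 bits, and ends spread evenly over many positions gives a higher value. The function name and the input format (a list of integer genomic coordinates) are assumptions.

```python
import math
from collections import Counter


def tss_entropy(five_prime_ends):
    """Shannon entropy, in bits, of the distribution of 5' end positions.

    five_prime_ends: iterable of integer coordinates, one per EST or RACE
    clone (hypothetical input format, not taken from the slides).
    """
    counts = Counter(five_prime_ends)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one mapped 5' end")
    entropy = 0.0
    for n in counts.values():
        p = n / total
        entropy -= p * math.log2(p)
    return entropy


# Eight clones starting at the same base: a discrete start site.
discrete = tss_entropy([1000] * 8)
# Eight clones spread over four positions: a more "slippery" start site.
slippery = tss_entropy([1000, 1000, 1003, 1003, 1007, 1007, 1012, 1012])
```

Entropy ignores how far apart the positions are, so a real description of slipperiness would probably also need a measure of spread along the genome.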
8
How Many TSSs Does bowl Have?
9
5’ RACE Plans
Identify TSSs that are well mapped by 5’ EST data
Test RLM-RACE production protocol on 96 well-mapped TSSs to measure experimental success rate
Prioritize 5’ RACE experiments:
1. Transcripts with < 8 RE ESTs, using mixed embryo RNA
2. Transcripts with ESTs from other embryo-derived libraries
3. Transcripts with < 8 RH/TA ESTs
4. Transcripts with larval/pupal ESTs
5. Transcripts without ESTs. Use appropriate RNA samples.
Develop statistical description of “slipperiness”
Biological validation with microarrays & P elements
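The priority order above could be sketched as a simple tiering function. This is one reading of the list, not the group's actual pipeline. The assumptions are that tiers are checked in order and the first match wins, that "< 8" means at least one but fewer than eight ESTs, and that EST counts arrive already grouped under the hypothetical keys "RE", "other_embryo", "RH_TA" and "larval_pupal".

```python
from typing import Dict, Optional

MIN_ESTS = 8  # threshold from the slide ("< 8 ... ESTs")


def race_priority(est_counts: Dict[str, int]) -> Optional[int]:
    """Return a 5' RACE priority tier (1 = first), or None if no tier applies.

    est_counts maps hypothetical category keys to EST counts; missing keys
    are treated as zero. First matching tier wins (an assumption).
    """
    re_count = est_counts.get("RE", 0)
    other_embryo = est_counts.get("other_embryo", 0)
    rh_ta = est_counts.get("RH_TA", 0)
    larval_pupal = est_counts.get("larval_pupal", 0)

    if 0 < re_count < MIN_ESTS:
        return 1  # some RE ESTs, but fewer than 8
    if other_embryo > 0:
        return 2  # ESTs from other embryo-derived libraries
    if 0 < rh_ta < MIN_ESTS:
        return 3  # some RH/TA ESTs, but fewer than 8
    if larval_pupal > 0:
        return 4  # larval/pupal ESTs
    if sum(est_counts.values()) == 0:
        return 5  # no EST support at all
    return None  # e.g. 8 or more RE ESTs: treated here as not prioritized


tiers = [race_priority(c) for c in ({"RE": 3}, {"larval_pupal": 1}, {})]
```

Transcripts that match no tier, such as those with eight or more RE ESTs, return None here. Whether these count as the "well mapped" set is not stated on the slide.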
10
Computationally predicted conserved exons validated by cDNA screening and sequencing
I. Gene modifications
II. Identification of New Genes
11
cDNA and Long RT-PCR Plans
Identify all transcripts that are well defined by cDNA sequence
- complete & spliced ORF, poly-A tail (not necessarily a defined TSS)
Identify targets for cDNA screening (DGC goals in parentheses)
(Transcripts with a community cDNA but no BDGP cDNA)
(Transcripts with truncated ORFs)
(Alternative transcripts that encode alternative coding sequences)
1. Conserved ORFs that failed on the first SLIP attempt: choose best RNA
2. Transfrags & RACEfrags that are not captured in sequenced transcripts
Identify targets for long RT-PCR
- targets that fail in SLIP screening on the best RNA sample
- RT-PCR is probably more sensitive than SLIP but seems limited to ~2 kb
cDNA and RT-PCR design depends on Aim 1 & Aim 2.1 and should be an iterative process.
Biological validation using integrated description of all data
12
An Unannotated Transfrag
13
A Relatively Rare Transcript
CG31036: chordotonal neurons, lateral and head sensory neurons
14
High Throughput Sequencing Plan
Pyrosequence RNA samples on 454 Life Sciences device
- consider alternative platforms, e.g. Solexa
Select 15 target tissues for analysis
Define a transcript size range to target
- avoid redundancy with Eric Lai: < 50 bases vs 50-500 bases
- consider avoiding tRNAs
Align transcript sequences and integrate with models
Biological validation:
- Compare to microarray data
- Conservation in other species, including structure for ncRNAs
- Functional genomics in Aim 3
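The size ranges in the plan above can be written as a small binning helper, for example to check which fraction a sequenced read falls into. This is a minimal sketch. The bin labels, the function name, and the choice to treat exactly 50 and exactly 500 bases as part of the 50-500 bin are assumptions, since the slides do not specify boundary handling.

```python
def size_fraction(length_nt: int) -> str:
    """Assign a read length (in bases) to one of the size ranges on the slide.

    Boundary handling is an assumption: 50 and 500 both fall in "50-500".
    """
    if length_nt < 0:
        raise ValueError("length must be non-negative")
    if length_nt < 50:
        return "<50"      # range the plan leaves to avoid redundant work
    if length_nt <= 500:
        return "50-500"   # range the plan proposes to target
    return ">500"         # outside the small RNA size fraction


bins = [size_fraction(n) for n in (22, 75, 500, 1200)]
```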
15
Some Questions for Discussion
How many genes & transcripts in Drosophila?
How many genes with multiple transcripts? CDSs?
Are these expressed in different cell types?
Can we segregate them in different RNA samples to avoid mixed RACE, cDNA and RT-PCR products?
How do we prioritize screening?
What will we miss?
How do we know when we’re done?
16
Future Directions
Do different promoter motifs correlate with “slipperiness”, tissue, stage?
Confidence scores associated with exons, transcripts and gene models:
- How do we measure confidence?
- How confident can we be?
- How much data do we need per gene?