Gene Expression in Loblolly Pine Early Development
  Keithanne Mockaitis, Indiana University Center for Genomics and Bioinformatics
  Carol Loopstra, Texas A & M University
Progressive Transcript Profiling
  Build a useful transcriptome reference early in project:
    generate long reads for ease of assembly, scaffolding of existing shorter data
    integrate community data into assemblies
  Vegetative Organs: vegetative buds, candles, stems, needles, roots
  Early Stress Signaling Responses: cold, heat, elevated UV, compression
  Reproductive Development: megastrobili, microstrobili
  Early Development: seeds, young seedlings
Sequencing of Early Development Collections, Stage 1
  embryos dissected from germinating seeds
  seeds immediately after stratification
  megagametophytes dissected from germinating seeds
  Lib 1, Lib 2: cDNA libraries optimized for 454 sequencing, partially normalized
  GS – XLR Plus
Sequence read length distribution of libraries
  [Figure: read length distributions for the seed/embryo pool and the megagametophyte pool]
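A read length distribution of this kind can be tabulated directly from a FASTA file of reads. The following is a minimal sketch, not the procedure used for these libraries; the function names, the 100-base bin width, and the toy sequences are assumptions for illustration only.

```python
from collections import Counter
from io import StringIO


def read_lengths(fasta_handle):
    """Yield the length of each sequence in a FASTA stream."""
    length = None
    for line in fasta_handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def length_histogram(lengths, bin_width=100):
    """Count reads per length bin; keys are the lower bound of each bin."""
    return Counter((n // bin_width) * bin_width for n in lengths)


# Toy input standing in for a real reads file (hypothetical data).
toy_fasta = StringIO(
    ">r1\nACGT\nACGT\n>r2\n" + "A" * 250 + "\n>r3\n" + "C" * 120 + "\n"
)
bin_width = 100
hist = length_histogram(read_lengths(toy_fasta), bin_width)
for lower in sorted(hist):
    print(f"{lower}-{lower + bin_width - 1} bp: {hist[lower]} reads")
```

Running the same two functions once per library (for example, once on the seed/embryo pool and once on the megagametophyte pool) would give the two distributions to compare.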
Data Assembled
Coverage of Assembled Transcripts > 1 kb
  [Figure axes: average coverage, length]
Estimated Gene Discovery
  Transcripts with no blastx hit to NCBI dbEST: 2,173
  Transcripts with blastx hit to NCBI dbEST: 49,386
    Hits not to Pinus genus: 6,322
    Hits not to gymnosperm: 653
    Hits to Pinus transcripts in dbEST: 43,064
  Most transcripts from new assembly contribute substantial length to older data
  [Figure: ~2000 selected Pinus transcripts; axis: length]
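The counts on the Estimated Gene Discovery slide can be cross-checked with a few lines of arithmetic. This is only a sketch: the variable names are invented here, and it assumes that the "not to gymnosperm" hits are a subset of the "not to Pinus genus" hits, which the slide does not state explicitly.

```python
# Counts copied from the slide.
no_hit = 2_173               # transcripts with no blastx hit to dbEST
with_hit = 49_386            # transcripts with a blastx hit to dbEST
hits_not_pinus = 6_322       # hits outside the Pinus genus
hits_not_gymnosperm = 653    # hits outside gymnosperms (assumed subset of the above)

total = no_hit + with_hit
hits_pinus = with_hit - hits_not_pinus   # should reproduce the slide's Pinus count


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


summary = {
    "no dbEST hit (% of all transcripts)": pct(no_hit, total),
    "Pinus hits (% of transcripts with a hit)": pct(hits_pinus, with_hit),
    "non-Pinus hits (% of transcripts with a hit)": pct(hits_not_pinus, with_hit),
    "non-gymnosperm hits (% of transcripts with a hit)": pct(hits_not_gymnosperm, with_hit),
}
for label, value in summary.items():
    print(f"{label}: {value}")
```

The derived `hits_pinus` value equals the 43,064 Pinus hits given on the slide, so the three hit categories are internally consistent.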
Estimated Maternal Expression
  Full Assembly Isogroups: 24,688
  Megagametophyte Isogroups Mapped (>80% length, 98% id): 12,478 (51%)
Homology Estimation
  Full Assembly Transcripts (Isotigs): 51,513
  Transcripts with significant blastx hit to TAIR10: 41,187 (80%)
    Unique: 12,233
  Transcripts with significant blastx hit to Populus trichocarpa v2: 41,291
    Unique: 12,768
  Unique OrthoMCL groups represented: 7,075
  Paralog Groups: 5,362
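The mapping criterion quoted under Estimated Maternal Expression (>80% length, 98% id) amounts to a two-threshold filter on alignments. A minimal sketch follows, assuming each mapping has already been summarized as one record per isogroup; the `Alignment` record, its field names, the choice of strict versus inclusive comparison, and the toy values are all hypothetical and not taken from the original analysis.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    isogroup: str            # hypothetical isogroup identifier
    query_length: int        # length of the full-assembly transcript
    aligned_length: int      # bases of that transcript covered by the mapping
    percent_identity: float  # identity over the aligned region


def passes(aln, min_fraction=0.80, min_identity=98.0):
    """True if the mapping covers >80% of the length at >=98% identity."""
    fraction = aln.aligned_length / aln.query_length
    return fraction > min_fraction and aln.percent_identity >= min_identity


def mapped_isogroups(alignments):
    """Set of isogroups with at least one mapping that passes both thresholds."""
    return {a.isogroup for a in alignments if passes(a)}


# Toy records (hypothetical data).
toy = [
    Alignment("ig1", 1000, 900, 99.2),   # passes both thresholds
    Alignment("ig2", 1000, 700, 99.9),   # covers too little of the length
    Alignment("ig3", 1000, 950, 96.0),   # identity too low
]
mapped = mapped_isogroups(toy)

# Fraction reported on the slide: mapped isogroups over all isogroups.
reported_fraction = 12_478 / 24_688
print(sorted(mapped), f"{reported_fraction:.0%}")
```

The last two lines recompute the slide's 51% from its own counts, which is a quick way to confirm that the percentage is relative to the 24,688 full-assembly isogroups.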
Most Highly Represented Gene Families
  Columns: orthologous group, protein family/superfamily, members
  histone
  PPR or TPR containing
  heat shock
  LRR transmembrane protein kinase
  ABC transporter
  transducin family, WD40 repeat containing
  plasma membrane intrinsic (18 members)
  OrthoMCL: Li et al., 2003 Genome Res. 13, 2178
Many expected transcripts are well covered (Vuosku et al., 2009 J Exp Bot 60, 1375)
  RAD51: 98.5%
  KU80: 99.4%
  DNA ligase IV: 67.3%
  TatD DNAse: 63.9%
  MCA: 100%
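If the percentages on this slide are read as the share of each reference transcript's length that is covered by assembled sequence (an assumption; the slide does not define them), a value of that kind could be computed by merging the aligned intervals and dividing by the reference length. The sketch below is illustrative only; the function name, the 0-based half-open interval convention, and the example coordinates are hypothetical.

```python
def percent_covered(ref_length, intervals):
    """Percent of a reference covered by the union of aligned intervals.

    `intervals` holds (start, end) pairs, 0-based and half-open, so that
    overlapping or adjacent alignments are not counted twice.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap before this interval: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return 100.0 * covered / ref_length


# Hypothetical example: three alignments on a 1,000-base reference,
# two of which overlap.
example = percent_covered(1000, [(0, 400), (300, 700), (900, 1000)])
print(f"{example:.1f}%")
```

Merging before summing matters when several assembled transcripts align to the same stretch of a reference, since a plain sum of aligned lengths could exceed 100%.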
Progressive Transcript Profiling
  Early Development, Stage 2
    seeds embryos from seedlings young tissues, stages from
  Build a useful transcriptome reference early in project:
    generate long reads for ease of assembly, scaffolding of existing shorter data
    integrate community data into assemblies
    generate deeper stage-specific sequencing of samples within original pools, additional collections
    attribute source specificities through comparative mapping
    refine assemblies of alternatively spliced transcripts
Progressive Transcript Profiling
  Reproductive Development
    megastrobili: 4 stages
    microstrobili: 4 stages
Thanks
  IU CGB: James Ford, Zach Smith, Aaron Buechlein
  Texas A & M: Jeff Puryear