The Transcriptional Landscape of the Mammalian Genome RIKEN Genome Exploration Research Group and Genome Science Group (Genome Network Project Core Group) and the FANTOM Consortium
The FANTOM project: solving biological questions using bioinformatics and technology. FANTOM1: production of full-length mouse cDNA sequences; design of functional annotation for cDNA. FANTOM2: assignment of annotations and sequence determination of 60770 full-length mouse cDNAs. FANTOM3: use of CAGE to uncover more information on the transcriptome. Functional annotation refers to attributes such as molecular function, biological process and cellular component, i.e. GO (Gene Ontology) terms.
Background: At this point, whole-genome sequencing was impractical and expensive (NGS arrived ~2008). Transcriptional landscape: the pattern of transcriptional control signals and the transcripts they generate. Transcriptional units (TU). Transcriptional framework (TK). GSC (gene signature cloning) / GIS (gene identification signature) ditags: full cDNA cloning with cleavage of internal sequences, leaving the 5’ and 3’ ends of the cDNA, which are then amplified and cloned.
Background: Expressed sequence tags (EST): short (200–800 nucleotides), unedited, randomly selected single-pass sequence reads derived from cDNA libraries. *Sampling bias: are rare transcripts captured? The 5’ and 3’ UTRs of eukaryotic mRNA have been experimentally shown to contain sequence elements essential for gene regulation, expression and translation [16]. In this context, EST data have proven important for mining UTRs, as both 5’ and 3’ ESTs contain significant sections of the UTRs along with protein-coding regions.
Methods: CAGE (cap analysis gene expression): capture of the 5’ cap using a linker; cleavage; amplification; concatenation with linkers; cloning into a vector.
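The CAGE steps can be pictured with a toy model: keep a short tag from the 5’ end of each full-length cDNA, join the tags with linkers, and split them again after sequencing. This is only a sketch. The 20-nucleotide tag length, the placeholder linker string, the example sequences and all function names are assumptions made for illustration, not details taken from these slides.

```python
# Toy model of CAGE tag derivation; all names and values are illustrative.
TAG_LENGTH = 20   # assumed tag length in nucleotides
LINKER = "xxxx"   # placeholder linker, not a real linker sequence


def cage_tag(cdna, tag_length=TAG_LENGTH):
    """Return the 5'-most bases of a full-length cDNA (the cleavage step)."""
    return cdna[:tag_length]


def concatenate(tags, linker=LINKER):
    """Join tags with linkers so that many tags fit in one clone."""
    return linker.join(tags)


def split_concatemer(concatemer, linker=LINKER):
    """Recover the individual tags from a sequenced concatemer."""
    return concatemer.split(linker)


# Made-up cDNA sequences, written 5' to 3'.
cdnas = [
    "ATGGCGTACGTTAGCCTAGGATCCGATTACA",
    "GGCTTAACGTACCGATAGCTAGCTTAGGCAT",
]
tags = [cage_tag(c) for c in cdnas]
concatemer = concatenate(tags)
recovered = split_concatemer(concatemer)
```

Because each tag comes from the capped 5’ end, mapping it back to the genome marks where that transcript started.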
36166 protein isoforms identified between FANTOM 1-3 and external sources. FANTOM 3 added 16274 proteins, of which 5154 mapped to new transcriptional frameworks.
CAGE tags align to transcriptional start sites
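One way to turn mapped tags into candidate start sites is to group nearby tag positions on the same strand into clusters. The sketch below does this, assuming tags arrive as (chromosome, strand, 5’ position) tuples and that a gap of at most 20 bp keeps two tags in the same cluster. Both the input layout and the threshold are illustrative assumptions, not the clustering rule used in the paper.

```python
from collections import defaultdict


def cluster_tags(tags, max_gap=20):
    """Group tag 5' positions into clusters per chromosome and strand.

    tags: iterable of (chrom, strand, position) tuples (assumed layout).
    Returns a list of (chrom, strand, start, end, tag_count) tuples.
    """
    by_strand = defaultdict(list)
    for chrom, strand, pos in tags:
        by_strand[(chrom, strand)].append(pos)

    clusters = []
    for (chrom, strand), positions in sorted(by_strand.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev <= max_gap:
                # Close enough to the previous tag: extend the cluster.
                prev = pos
                count += 1
            else:
                # Gap too large: close the cluster and start a new one.
                clusters.append((chrom, strand, start, prev, count))
                start = prev = pos
                count = 1
        clusters.append((chrom, strand, start, prev, count))
    return clusters


# Made-up mapped tags for illustration.
example_tags = [
    ("chr1", "+", 100),
    ("chr1", "+", 105),
    ("chr1", "+", 118),
    ("chr1", "+", 500),
    ("chr1", "-", 102),
]
example_clusters = cluster_tags(example_tags)
```

Clusters supported by many tags are stronger candidates for real transcription start sites than isolated single tags.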
Prediction of transcript length distribution in the genome. Transcriptional forests are predicted to collapse as depth of coverage increases.
CAGE tags in the first exon show a preference for the 5’ end. CAGE tags in the last exon show a preference for the 3’ end: antisense transcription.
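The exon observations rest on two measurements per tag: whether it lies on the same strand as the annotated gene, and how far along the exon it falls. A minimal sketch of that bookkeeping follows. The function name, the coordinate convention (inclusive positions, `exon_end` greater than `exon_start`) and the example numbers are assumptions made for illustration.

```python
def classify_tag(tag_strand, tag_pos, exon_start, exon_end, gene_strand):
    """Classify a tag against one exon of an annotated gene.

    Returns None if the tag lies outside the exon. Otherwise returns
    (orientation, relative_position), where relative_position runs from
    0.0 at the exon's 5' end to 1.0 at its 3' end in the gene's orientation.
    Assumes exon_end > exon_start.
    """
    if not exon_start <= tag_pos <= exon_end:
        return None
    orientation = "sense" if tag_strand == gene_strand else "antisense"
    length = exon_end - exon_start
    if gene_strand == "+":
        relative = (tag_pos - exon_start) / length
    else:
        # On the minus strand the 5' end is at the higher coordinate.
        relative = (exon_end - tag_pos) / length
    return orientation, relative


# Made-up example: an exon at 1000-2000 of a plus-strand gene.
first_exon_tag = classify_tag("+", 1010, 1000, 2000, "+")
last_exon_tag = classify_tag("-", 1900, 1000, 2000, "+")
```

In this example the first tag is a sense tag near the exon’s 5’ end, and the second is an antisense tag near its 3’ end.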
Promoter sequences are highly conserved between mouse and human, more so in the promoters of non-coding (nc) RNAs. * Also compared to chicken, but with much less alignment, as expected since chicken belongs to a different class.
Significance: The CAGE method allowed identification of promoter regions and gave insight into the transcriptome. Tags mapping to the 3’ region indicate antisense transcription. Non-coding RNAs appear to be highly conserved in their promoter regions; only 40% of known ncRNAs were isolated and sequenced. Databases were established for cDNA annotation, expression, and analysis. Antisense transcripts seem to participate in regulatory mechanisms that influence genes.
Conclusions: CAGE revealed transcriptional initiation and termination sites. 63% of the genome is transcribed as RNA, in regions called transcriptional forests. Transcriptional deserts collapse as more TUs are discovered. The mammalian transcriptome is highly complex; transcripts overlap.
Future Studies: Mega transcripts (~2.2 Mb): new cloning technology needed. Look at nuclear RNA. Laid groundwork for comparative transcriptional studies between mammals. Function of ncRNA? FANTOM4, FANTOM5. The identification of mega transcripts gave insight into the future direction of cloning technology. ncRNAs are highly conserved across species, even evolutionarily distant ones, and their promoters are especially conserved compared with the promoters of coding RNAs. What is their function?
References: What/Who is FANTOM? http://fantom.gsc.riken.jp/ ; EST: https://www.researchgate.net/profile/Robin_Gasser/publication/7011834_A_Hitchhiker's_Guide_to_Expressed_Sequence_Tag_EST_Analysis/links/00b7d52fa880e27818000000.pdf ; Tag methods: http://www.nature.com/nmeth/journal/v2/n7/full/nmeth768.html
Further Reading: Antisense RNA: http://science.sciencemag.org/content/309/5740/1564.full ; Long noncoding RNAs: functional surprises from the RNA world: http://genesdev.cshlp.org/content/23/13/1494.long ; Tiny RNAs associated with transcription start sites in animals (FANTOM 4): https://www.ncbi.nlm.nih.gov/pubmed/19377478