RNA-seq Replicate 1 RNA-seq Replicate 2 DNA


Supplementary Figure 1: False calls in the Csl gene at chr10:99,221,220 in 129S1/SvImJ due to a paralog with 92% identity. The screenshot is from the IGV genome browser. Tracks: RNA-seq Replicate 1, RNA-seq Replicate 2, DNA.

Supplementary Figure 2: False call at chr10:59,759,667 in CBA/J, right at the end of a low-complexity region. The mismatches appear only in reads aligned to the forward strand, running outwards from the region. Tracks: RNA-seq Replicate 1, RNA-seq Replicate 2, DNA.
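A strand-restricted artifact of this kind is what a strand bias test is meant to catch. The slides do not say which statistic was used, so the following is only a minimal sketch that assumes a two-sided Fisher's exact test on a 2x2 table of read counts (forward/reverse strand by variant/reference base). The function name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(fwd_var, fwd_ref, rev_var, rev_ref):
    """Two-sided Fisher's exact p-value for the table
    [[fwd_var, fwd_ref], [rev_var, rev_ref]] (assumed test, not from the slides)."""
    row_fwd = fwd_var + fwd_ref          # reads on the forward strand
    row_rev = rev_var + rev_ref          # reads on the reverse strand
    col_var = fwd_var + rev_var          # reads carrying the variant base
    denom = comb(row_fwd + row_rev, col_var)

    def prob(x):
        # Hypergeometric probability of x variant reads on the forward strand
        return comb(row_fwd, x) * comb(row_rev, col_var - x) / denom

    p_obs = prob(fwd_var)
    lo = max(0, col_var - row_rev)
    hi = min(row_fwd, col_var)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: all 12 variant reads sit on the forward strand
p_biased = fisher_exact_two_sided(12, 8, 0, 20)
# Hypothetical counts: variant reads split evenly between strands
p_balanced = fisher_exact_two_sided(6, 14, 6, 14)
```

A small p-value for the first table would flag the site as strand-biased; the balanced table gives no evidence of bias.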

Supplementary Figure 3: False calls at chr6:124,712,670 in CBA/J caused by sequencing spliced transcripts and aligning them to the genomic reference. The mismatches start directly adjacent to the splice junction. Tracks: RNA-seq Replicate 1, RNA-seq Replicate 2, DNA.

Supplementary Figure 4: The Variant Distance Bias filter assumes that variant bases are randomly distributed within the aligned portion of the reads (a-b). The random variable used to assess randomness is the mean pairwise distance of the variants. Its density function depends on depth (c) and allows any bias in variant positions to be detected accurately (d). Panels: a, b, c, d.
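The idea behind the filter can be illustrated with a small simulation. The caption refers to a density function that depends on depth; the sketch below does not reproduce that density. It assumes a Monte Carlo null instead, in which each variant position is drawn uniformly along the read. The function names, the read length and the example positions are hypothetical.

```python
import random
from itertools import combinations


def mean_pairwise_distance(positions):
    """Mean absolute distance over all pairs of within-read variant positions."""
    pairs = list(combinations(positions, 2))
    return sum(abs(i - j) for i, j in pairs) / len(pairs)


def variant_distance_bias_p(positions, read_length, n_sim=2000, seed=0):
    """Empirical p-value that the variant positions are at least this clustered.

    positions: position of the variant base inside each supporting read
               (at least two reads are needed).
    The null (uniform positions) is simulated, which is an assumption of
    this sketch rather than the method shown in the figure.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(positions)
    depth = len(positions)  # the null distribution depends on depth
    hits = 0
    for _ in range(n_sim):
        simulated = [rng.randrange(read_length) for _ in range(depth)]
        if mean_pairwise_distance(simulated) <= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)


# Hypothetical site: the variant always falls at nearly the same read position
p_clustered = variant_distance_bias_p([3, 4, 3, 4, 3, 4, 3, 4], read_length=100)
# Hypothetical site: the variant is spread along the reads
p_spread = variant_distance_bias_p([5, 20, 35, 50, 65, 80, 95, 10], read_length=100)
```

A low p-value means the variant sits at suspiciously similar positions across reads, as happens next to a splice junction or at read ends.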

Supplementary Figure 5: Flowchart and the effect of filters on the substitution pattern and the numbers of editing candidate sites. Flowchart labels: gDNA SNVs; cDNA SNVs; Splice-aware realignment; Minimum Depth 10x; 31,923 sites; 304,817 candidate sites; 98,061 unambiguous sites; Filtering; Replicate Consistency; 62,889 sites; Estimated FDR 2.9%; No assumptions about the nature of editing made; Assumed editing by ADARs, which usually occurs in clusters; 5,579 filtered sites; End Distance Bias; 59,775 sites; One-type mismatch clusters added; Cluster extension; Strand Bias; 42,238 sites; Variant Distance Bias; 36,213 sites; 7,389 final sites; 7,133 sites.

Supplementary Figure 6: Although this editing cluster in 129S1/SvImJ overlaps the Zscan30 gene, which is on the reverse strand, the editing occurred on the forward strand. These edits could be mistaken for a novel T-to-C type of editing, but the presence of another gene, Zfp397, and the Rpl19-ps7 pseudogene on the forward strand just upstream of Zscan30 suggests that antisense transcription or missing annotation is a more likely explanation. Tracks: RNA-seq Replicate 1, RNA-seq Replicate 2, DNA.

Supplementary Figure 7: A C-to-U edit in the Mfn1 gene at chr3:32,460,397 in PWK/PhJ causes a non-synonymous change (S>L) and is surrounded by a cluster of A-to-I edits. Tracks: RNA-seq Replicate 1, RNA-seq Replicate 2, DNA.

Supplementary Figure 8: Traces from three transcripts confirming C-to-U editing in the Mfn1 gene from the previous figure.

Supplementary Figure 9: Traces of two RNA editing clusters validated by PCR and Sanger sequencing: (a) multiple edits in the 3'UTR of the Ebna1bp2 gene, whose function is unknown, and (b) a cluster of edits in the Flnb gene on chr14 that affect the splice site and include two non-synonymous coding edits (S>G and Q>R).

Supplementary Figure 10: (a) The weak TAG motif typical of editing by ADARs. The dashed line shows the AT/GC content of a random transcribed sequence. (b) In agreement with (Lehmann 2000, 11041852), the editing level is highest at TAT and TAG sites. However, the editing level is low at the AAG triplet.
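A per-triplet editing level like the one in panel (b) can be tallied from read counts. The sketch below assumes one way of doing so: pooling edited and total reads over all sites that share the same triplet centred on the edited adenosine. The slides do not state how the levels were computed, and the function name and the counts are made up for illustration.

```python
from collections import defaultdict


def editing_level_by_triplet(sites):
    """Pooled editing level per sequence triplet.

    sites: iterable of (triplet, edited_reads, total_reads), where the
           triplet is centred on the edited A (for example "TAG").
    Pooling reads across sites is an assumption of this sketch.
    """
    edited = defaultdict(int)
    total = defaultdict(int)
    for triplet, n_edited, n_total in sites:
        if len(triplet) != 3 or triplet[1] != "A":
            raise ValueError(f"triplet must be centred on A: {triplet!r}")
        edited[triplet] += n_edited
        total[triplet] += n_total
    # Skip triplets without coverage to avoid dividing by zero
    return {t: edited[t] / total[t] for t in total if total[t] > 0}


# Hypothetical counts, not taken from the figure
toy_sites = [("TAG", 30, 100), ("TAG", 10, 100), ("TAT", 25, 100), ("AAG", 2, 100)]
levels = editing_level_by_triplet(toy_sites)
```

Averaging per-site levels instead of pooling reads would weight low-coverage sites more heavily; which choice is appropriate depends on the data.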

Supplementary Figure 11: Two non-synonymous coding edits in the Cacna1d gene (I>M and Y>C) were validated by PCR and Sanger sequencing. We observed four distinct transcripts involving these edits: both sites edited (a), a single site edited (b, c), or neither site edited (not shown).

Supplementary Figure 12: Two edits in the 3'UTR of Cds2. Three wild-derived strains have an A>G genomic SNP at chr2:132,135,391, whereas the other strains are significantly edited to a G at this site (a). Two further edits lie nearby. Another edited site 225 bp downstream (b) has an SDP that is the exact inverse of the first. Both sites have been experimentally confirmed.

Supplementary Figure 13: The editing site at chr14:52,913,904 in the 3'UTR of Tox4 is private to C57BL/6NJ; the other strains have a SNP at this position. It has been confirmed by Sequenom.

Supplementary Figure 14: The level of editing at chr10:86,300,032 in the 3'UTR of Nt5dc3 is significantly enhanced in three wild-derived strains (CAST, PWK, SPRET) (a). All of these strains have a T>A SNP 36 bases downstream. Interestingly, the SNP itself is occasionally edited, although its editing level is very low (b). Tracks: RNA-seq Replicate 1, RNA-seq Replicate 2, DNA.

Supplementary Figure 15: Enrichment of RNA-edits and repetitive element classes in the mouse genome