TSS Annotation Workflow

Slides:



Advertisements
Similar presentations
GS 540 week 5. What discussion topics would you like? Past topics: General programming tips C/C++ tips and standard library BLAST Frequentist vs. Bayesian.
Advertisements

Peter Tsai, Bioinformatics Institute.  University of California, Santa Cruz (UCSC)  A rapid and reliable display of any requested portion of genomes.
InterPro/prosite UCSC Genome Browser Exercise 3. Turning information into knowledge  The outcome of a sequencing project is masses of raw data  The.
Copyright OpenHelix. No use or reproduction without express written consent1 Organization of genomic data… Genome backbone: base position number sequence.
Visualization of genomic data Genome browsers. UCSC browser Ensembl browser Others ? Survey.
Promoter Analysis using Bioinformatics, Putting the Predictions to the Test Amy Creekmore Ansci 490M November 19, 2002.
Whole genome alignments Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas
NGS Analysis Using Galaxy
1 1 - Lectures.GersteinLab.org Overview of ENCODE Elements Mark Gerstein for the "ENCODE TEAM"
MicroRNA Targets Prediction and Analysis. Small RNAs play important roles The Nobel Prize in Physiology or Medicine for 2006 Andrew Z. Fire and Craig.
The UCSC Genome Browser Introduction
Annotation of Drosophila GEP Workshop – August 2015 Wilson Leung and Chris Shaffer.
is accessible at: The following pages are a schematic representation of how to navigate through ALE-HSA21.
Use cases for Tools at the Bovine Genome Database Apollo and Bovine QTL viewer.
UCSC Genome Browser 1. The Progress 2 Database and Tool Explosion : 230 databases and tools 1996 : first annual compilation of databases and tools.
Regulatory Genomics Lab Saurabh Sinha Regulatory Genomics Lab v1 | Saurabh Sinha1 Powerpoint by Casey Hanson.
Common Errors in Student Annotation Submissions contributions from Paul Lee, David Xiong, Thomas Quisenberry Annotating multiple genes at the same locus.
COURSE OF BIOINFORMATICS Exam_31/01/2014 A.
Web Databases for Drosophila Introduction to FlyBase and Ensembl Database Wilson Leung6/06.
 GEP Implementation at Mt. San Jacinto Community College Nick Reeves, Ph.D.
Searching for Transcription Start Sites in Drosophila
Web Databases for Drosophila An introduction to web tools, databases and NCBI BLAST Wilson Leung08/2015.
Annotation of Drosophila primer
RNA-Seq Primer Understanding the RNA-Seq evidence tracks on the GEP UCSC Genome Browser Wilson Leung08/2014.
Starting Monday M Oct 29 –Back to BLAST and Orthology (readings posted) will focus on the BLAST algorithm, different types and applications of BLAST; in.
数据库使用 杨建华 2010/9/28. Outline of the Topics UCSC and Ensembl Genome Browser (Blat vs Blast vs Blastz vs Multiz) 挖掘数据用 Table Browser 或 BioMart 用户友好化你的数据.
Regulatory Genomics Lab Saurabh Sinha Regulatory Genomics | Saurabh Sinha | PowerPoint by Casey Hanson.
How do we represent the position specific preference ? BID_MOUSE I A R H L A Q I G D E M BAD_MOUSE Y G R E L R R M S D E F BAK_MOUSE V G R Q L A L I G.
Annotation of Drosophila virilis Chris Shaffer GEP workshop, 2006.
Thoughts on ENCODE Annotations Mark Gerstein. Simplified Comprehensive (published annotation, mostly in '12 & '14 rollouts)
Primer on Annotation of Drosophila Genes GEP Workshop – January 2016 Wilson Leung and Chris Shaffer.
Overview of ENCODE Elements
Copyright OpenHelix. No use or reproduction without express written consent1.
Tools in Bioinformatics Genome Browsers. Retrieving genomic information Previous lesson(s): annotation-based perspective of search/data Today: genomic-based.
Accessing and visualizing genomics data
Genomes at NCBI. Database and Tool Explosion : 230 databases and tools 1996 : first annual compilation of databases and tools lists 57 databases.
HOMER – a one stop shop for ChIP-Seq analysis
BLAST: Basic Local Alignment Search Tool Robert (R.J.) Sperazza BLAST is a software used to analyze genetic information It can identify existing genes.
Searching for Transcription Start Sites in Drosophila
Web Databases for Drosophila
RNA-Seq Primer Understanding the RNA-Seq evidence tracks on
Regulation of Gene Expression
Annotation of Drosophila
Annotation for D. virilis
Functional Elements in the Human Genome
Regulatory Genomics Lab
Figure 1. Annotation and characterization of genomic target of p63 in mouse keratinocytes (MK) based on ChIP-Seq. (A) Scatterplot representing high degree.
Gene expression from RNA-Seq
Pairwise and NGS read alignment
EPConDB: Endocrine Pancreas Consortium Database
GEP Annotation Workflow
Introduction to G-OnRamp
From: TopHat: discovering splice junctions with RNA-Seq
Ensembl Genome Repository.
Identify D. melanogaster ortholog
Volume 7, Issue 5, Pages (June 2014)
lincRNAs: Genomics, Evolution, and Mechanisms
Yating Liu July 2018 G-OnRamp workshop
Volume 64, Issue 2, Pages (October 2016)
Follow-up from last night: XSEDE credits
Regulatory Genomics Lab
Volume 132, Issue 2, Pages (January 2008)
Problems from last section
Basic Local Alignment Search Tool
Stephen Taylor, CBRG GMOD Europe 2010
Determine CDS Coordinates
Universal Alternative Splicing of Noncoding Exons
Regulatory Genomics Lab
Common Errors in Student Annotation Submissions contributions from Paul Lee, David Xiong, Thomas Quisenberry Annotating multiple genes at the same locus.
Presentation transcript:

TSS Annotation Workflow Identify ortholog D. melanogaster Target species Characterize TSS in the D. melanogaster ortholog Identify isoforms with unique TSS FlyBase GBrowse 2 Gene Record Finder For each unique TSS GEP UCSC Genome Browser (D. melanogaster) Classify core promoter (peaked / intermediate / broad / insufficient evidence) 1. Transcription start sites (Celniker > Embryonic) 2. DNase I hypersensitive sites 3. 9-state model Annotate TSS in the target species (e.g., D. biarmipes) 1. blastn (sensitive parameters) 2. RNA-Seq coverage 3. TopHat junctions 4. Conservation track 5. Placements of all 5’ UTR exons 6. Core promoter motifs Annotate the initial transcribed exon Identify core promoter motifs* in the target species and in D. melanogaster (±300 bps from TSS and/or in narrow search region) GEP UCSC Genome Browser (target species) Short Match track (e.g., Inr) Define TSS positions or TSS narrow and wide search regions Synthesize the evidence: blastn > RNA-Seq > motifs * Drosophila core promoter motifs: http://gander.wustl.edu/~wilson/core_promoter_motifs.html

TSS Annotation Workflow (Recommended Parameters) Characterize TSS in the D. melanogaster ortholog GEP UCSC Genome Browser (D. melanogaster) Transcription start sites TSS (Celniker) (R5): pack DNase I hypersensitive sites Detected DHS Positions (Cell Lines) (R5): pack DHS Read Density (Cell Lines) (R5): full 9-state model BG3 9-state (R5): dense S2 9-state (R5): dense Types of core promoters Classification # TSS #DHS Peaked 1 Intermediate ≤ 1 > 1 Broad Annotate TSS in the target species Should we add more TSS search regions Annotate the initial transcribed exon Define TSS position(s) blastn search parameters Similarity to the initial transcribed exon in D. melanogaster (blastn) RNA-Seq and Conservation tracks Position(s) of core promoter motifs Word size: 7 Match / Mismatch scores: 1 / -1 Gap costs: Existence: 2, Extension: 1 Low complexity filter: off RNA-Seq tracks RNA-Seq Alignment Summary: show RNA-Seq TopHat: squish Comparative genomics tracks Additional resources Conservation: full Genome alignments of 26 Drosophila species RNA PolII ChIP-Seq tracks (D. biarmipes) N-SCAN and Augustus TSS predictions Core promoter motifs search 1. Click on the “Short Match” link 2. Enter the motif consensus sequence 3. Change the display mode to “pack” 4. Click “Submit”