A web-based platform for structural and functional annotation of model and non-model organisms www.gensas.org Jodi Humann, Taein Lee, Stephen Ficklin,


A web-based platform for structural and functional annotation of model and non-model organisms www.gensas.org Jodi Humann, Taein Lee, Stephen Ficklin, Chun-Huai Cheng, Heidi Hough, Sook Jung, Jill Wegrzyn, David Neale, Dorrie Main jhumann@wsu.edu

What is genome annotation?
Annotation turns an uncharacterized genome sequence ("????") into predicted gene models to use in lab experiments.

What is GenSAS?
Web-based platform, no software installation by user
Just need a user account, internet browser, and an internet connection
User accounts keep data private and secure and allow for collaborative annotation projects
Easy-to-use interfaces and detailed user manual

Account Limits
User accounts remain active as long as there is an active project
Projects expire after 60 days unless the user resets the expiration date
250 GB of storage space on the server
Assembly files must be high quality:
  Fewer than 25,000 sequences
  Over 50% of sequences longer than 2,500 bases
Up to seven jobs can run at one time; additional jobs wait in the queue
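The assembly quality limits above can be checked locally before uploading. The following is a minimal sketch, not GenSAS code: the function names are hypothetical, and it assumes that "sequences" means FASTA records and that both thresholds are strict, as the slide wording suggests.

```python
from io import StringIO

MAX_SEQUENCES = 25_000     # slide: fewer than 25,000 sequences
MIN_LENGTH = 2_500         # slide: "longer than 2,500 bases"
MIN_LONG_FRACTION = 0.5    # slide: over 50% of sequences


def fasta_lengths(handle):
    """Yield the length of each record in a FASTA file-like object."""
    length = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def meets_assembly_limits(handle):
    """Return True if the assembly satisfies both limits (hypothetical helper)."""
    lengths = list(fasta_lengths(handle))
    if not lengths:
        return False
    long_fraction = sum(1 for n in lengths if n > MIN_LENGTH) / len(lengths)
    return len(lengths) < MAX_SEQUENCES and long_fraction > MIN_LONG_FRACTION


# Toy assembly: two long scaffolds and one short contig.
demo = StringIO(">s1\n" + "A" * 3000 + "\n>s2\n" + "C" * 3000 + "\n>s3\nACGT\n")
print(meets_assembly_limits(demo))
```

For a real assembly, pass an open file handle instead of the `StringIO` demo.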

Eukaryote annotation workflow
1. Upload Sequences: PRINSEQ-lite, BUSCO
2. Create Project
3. Upload Evidence
4. Identify Repeats: RepeatMasker, RepeatModeler
5. Mask Sequences
6. Align Evidence: BLAST, BLAT, DIAMOND, HISAT2, PASA, TopHat
7. Structural Annotation: Augustus, BRAKER2, GeneMark-ES, Genscan, GlimmerM, SNAP, RNAmmer, tRNAscan-SE
8. Choose Official Gene Set: EvidenceModeler (optional)
9. Refine Gene Models: PASA (optional)
10. Functional Annotation: BLAST, DIAMOND, InterProScan, Pfam, SignalP, TargetP
11. Manual Curation: Apollo, JBrowse
12. Generate Files for Publication: BUSCO

Prokaryote annotation workflow
1. Upload Sequences: PRINSEQ-lite, BUSCO
2. Create Project
3. Upload Evidence
4. Align Evidence: BLAST, BLAT, DIAMOND
5. Structural Annotation: GeneMarkS, Glimmer3, RNAmmer, tRNAscan-SE
6. Choose Official Gene Set
7. Functional Annotation: BLAST, DIAMOND, InterProScan, Pfam, SignalP
8. Manual Curation: Apollo, JBrowse
9. Generate Files for Publication: BUSCO

User-provided files
Required:
  Genome assembly
Optional:
  Assembled transcripts or ESTs
  Species-specific repeats or proteins
  Species-specific GenBank gene structures
  Filtered Illumina RNA-seq reads
  Aligned RNA-seq reads in the BAM file format
  Previous annotations in the GFF3 format

GenSAS-provided information
RepeatMasker: Repbase repeat libraries
Transcript and protein alignment tools:
  NCBI RefSeq transcripts and proteins (archaea, bacteria, fungi, invertebrate, mitochondrion, plant, plasmid, plastid, protozoa, vertebrate-mammalian, vertebrate-other, viral)
  SwissProt
  TrEMBL

GenSAS Homepage
Request free account
Log in to GenSAS
Access User’s Guide and contact us
Learn about tools and libraries
Access the GenSAS interface

GenSAS Interface
Once jobs are in the queue, users can log out of GenSAS

Sequences Step
Once uploaded, assembly metrics are calculated using PRINSEQ
Users can run BUSCO on the assembly

Project Step
Fillable web form
Select previously uploaded assembly
Email options

GFF3 Step

Evidence Step

Repeats and Masking Steps
The masking step produces a consensus of the repeat results; masking can also be skipped

Align Step

Structural Step

Consensus Step
Optional step using EVM (EvidenceModeler)
Can adjust and remove weights for:
  Gene Predictions
  Protein Alignments
  Transcript Alignments
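For readers unfamiliar with how EVM uses weights: standalone EvidenceModeler reads a tab-delimited weights file with three columns (evidence class, source name, numeric weight). The sketch below renders such a file from a list of entries. How GenSAS passes weights to EVM internally is not stated in the slides, so this is only an illustration; the function name, source names and weight values are hypothetical.

```python
# Evidence classes used in EvidenceModeler weights files.
EVIDENCE_CLASSES = {
    "ABINITIO_PREDICTION",  # gene predictions
    "PROTEIN",              # protein alignments
    "TRANSCRIPT",           # transcript alignments
    "OTHER_PREDICTION",
}


def render_weights(entries):
    """Render (class, source, weight) entries as EVM weights-file text.

    A weight of None stands for an evidence source removed from the consensus.
    """
    lines = []
    for evidence_class, source, weight in entries:
        if evidence_class not in EVIDENCE_CLASSES:
            raise ValueError(f"unknown evidence class: {evidence_class}")
        if weight is None:
            continue
        lines.append(f"{evidence_class}\t{source}\t{weight}\n")
    return "".join(lines)


# Hypothetical source names and weights, for illustration only.
example = [
    ("ABINITIO_PREDICTION", "augustus", 1),
    ("PROTEIN", "blastx", 5),
    ("TRANSCRIPT", "blat", None),  # removed from the consensus
]
print(render_weights(example), end="")
```

A higher weight gives that evidence source more influence on the consensus gene models.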

OGS Step
Select the “Official Gene Set”

Refine and Functional Steps
Optional step to further refine the OGS using PASA prior to functional annotation

Annotate Step
Edits added to “User-created Annotations” will be merged into the final results

Publish Step
OGS and repeat consensus are automatically prepared in FASTA and GFF formats
User can select other jobs

Final Annotation Results
Summary table of annotation project
Project Summary file with details about tool settings
Option to create merged GFF3 file:
  Add repeats, tRNA, rRNA
  Add functional job annotation to column 9
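Column 9 of a GFF3 file holds semicolon-separated `tag=value` attributes, so adding functional annotation there means appending another attribute to each feature line. The sketch below shows the idea. It is not the GenSAS implementation: the function name is hypothetical, and the use of the `Note` attribute is an assumption, since the slides do not say which tag GenSAS writes.

```python
from urllib.parse import quote


def add_functional_note(gff3_line, descriptions):
    """Append a Note attribute to column 9 if the feature ID has a description."""
    if gff3_line.startswith("#") or not gff3_line.strip():
        return gff3_line  # directive, comment, or blank line
    newline = "\n" if gff3_line.endswith("\n") else ""
    columns = gff3_line.rstrip("\n").split("\t")
    if len(columns) != 9:
        return gff3_line  # not a feature line
    attributes = dict(
        pair.split("=", 1) for pair in columns[8].split(";") if "=" in pair
    )
    description = descriptions.get(attributes.get("ID"))
    if description is None:
        return gff3_line
    # Percent-encode characters that are reserved in GFF3 attributes (; = & ,).
    note = quote(description, safe=" /:()")
    columns[8] = columns[8].rstrip(";") + ";Note=" + note
    return "\t".join(columns) + newline


# Hypothetical feature line and description, for illustration only.
line = "scaffold1\tGenSAS\tmRNA\t100\t900\t.\t+\t.\tID=mRNA0001;Parent=gene0001\n"
print(add_functional_note(line, {"mRNA0001": "protein kinase; ATP-binding"}), end="")
```

Lines without a matching ID, along with comment and directive lines, pass through unchanged.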

Final Annotation Results All results files are listed and can be downloaded individually or….

Final Annotation Results
Use the “Download all” option to get all the files at once
Option to run BUSCO on proteins from the final annotation

Funding GenSAS Poster – PO0085 www.gensas.org