1. C. briggsae sequence curation 2. SNP data handling


C. briggsae sequence curation
What’s involved:
- ACeDB database (brigace) with gene models and alignments
- Curator to make changes and be the point of contact for user submissions
- Upload all gene data to Sanger each release
- Scripts that can be generalized to any genome
- Sanger generates various flat files (brigpep) and integrates them into the build
SAB 2008

C. briggsae sequence curation
Current curation:
- 175 changes so far
- Orthologues (personal communication)
- Protein families (chemoreceptors)
- Submit to EMBL every frozen release
- Few systematic problems with original gene set:
  - 2324 Start_not_found
  - 60 don’t start in frame=0
- Sequence changes: 1 waiting
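
The systematic problems listed above (missing starts, CDSs not beginning in frame 0) are the kind of thing a simple consistency check can flag before a curator looks at a gene model. The following is only a minimal sketch of such a check, not the scripts used for brigace: the function name `cds_problems` and the problem labels are hypothetical, and it assumes the CDS is supplied as a spliced nucleotide string using the standard genetic code.

```python
# Hypothetical sanity check for a spliced CDS sequence (illustration only;
# not the WormBase/Sanger curation scripts).
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def cds_problems(cds):
    """Return a list of human-readable problems found in a CDS string."""
    seq = cds.upper()
    problems = []
    if not seq.startswith("ATG"):
        problems.append("no ATG start codon")
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of 3")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("internal stop codon")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no terminal stop codon")
    return problems
```

A model that passes returns an empty list; anything else would be queued for manual inspection rather than changed automatically.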

Curation tool add-on for transferring new CDS structure

SNP curation
What’s involved:
- ACeDB database (snpace) contains all SNPs for all species
- Curator to make changes and be the point of contact for user submissions
- Scripts to upload ace files to Sanger to be integrated in the build process

SNP curation
Current curation:
- C. elegans:
  - Large datasets in last year:
    - 50906 pas* (CB4858)
    - 112101 hw* (CB4856)
  - Individually entered: 225
    - Personal communication
    - Papers
- C. briggsae: Currently 58000

SNP curation
Future plans:
- New web form for submission
- More robust error checking
- Web interface improvement

Current Variation report page

SNP track visible on genome browser

Old WashU SNP display

nGASP gene predictions are good, but still not perfect
Out of 100 Jigsaw (Twinscan) predictions checked:
- 81 (55) were predicted correctly
- 1 (0) correctly indicated a required change
- 10 (25) differed from the curated CDS
- 3 (7) merged/split genes incorrectly
- 3 (1) CDS where there was a pseudogene
- 1 (2) missed a gene entirely
- 1 (6) gene predicted where there was none
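
As a quick arithmetic check on the tallies above, the sketch below sums each column and computes the share of predictions that were either correct or correctly pointed to a needed change. The counts are copied from the slide; the names `outcomes`, `totals` and `agreement_rate` are made up for this illustration. Note that the Twinscan figures as transcribed add up to 96 rather than 100, so the Twinscan rate is computed over the counts actually listed.

```python
# Counts copied from the slide: (Jigsaw, Twinscan) per outcome category.
outcomes = {
    "predicted correctly": (81, 55),
    "correctly indicated a required change": (1, 0),
    "differed from the curated CDS": (10, 25),
    "merged/split genes incorrectly": (3, 7),
    "CDS where there was a pseudogene": (3, 1),
    "missed a gene entirely": (1, 2),
    "gene predicted where there was none": (1, 6),
}

JIGSAW, TWINSCAN = 0, 1  # column indices into the tuples above


def totals(table):
    """Sum each predictor's column across all outcome categories."""
    jigsaw = sum(counts[JIGSAW] for counts in table.values())
    twinscan = sum(counts[TWINSCAN] for counts in table.values())
    return jigsaw, twinscan


def agreement_rate(table, column):
    """Fraction of listed predictions that were right or flagged a real change."""
    good = (table["predicted correctly"][column]
            + table["correctly indicated a required change"][column])
    return good / sum(counts[column] for counts in table.values())


print(totals(outcomes))  # (100, 96)
print(f"Jigsaw: {agreement_rate(outcomes, JIGSAW):.2f}")
print(f"Twinscan: {agreement_rate(outcomes, TWINSCAN):.2f}")
```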

Jigsaw genes for C. elegans

Jigsaw merges two curated CDSs - transfer gene IDs

Jigsaw correctly makes the same change as the curator to a chemoreceptor (tracks shown: curated, Jigsaw, history)