Tools For Vertebrate Gene Naming


Tools For Vertebrate Gene Naming Bethan Yates – HGNC SAB 2015

Key aims of the VGNC project
To coordinate the naming of genes across vertebrate species. Initial work has focused on identifying a consensus set of 1:1 orthologs between chimpanzee and human that could be named in a semi-automated manner, and on creating a system to allow this.
To assign gene names within complex gene families across multiple vertebrate species. We are developing a curatorial interface for gene superfamily annotation by expert collaborators, focusing initially on cytochrome P450s and olfactory receptors.

1:1 consensus orthologs in chimp
[Figure: Venn diagram of 1:1 ortholog predictions from Ensembl, NCBI, Panther and OMA; 10,834 genes are agreed on by all four sources.]
http://www.genenames.org

Curation Database
A MySQL database schema has been designed to store vertebrate gene symbols and their associated data. This database has been populated with a set of chimp genes obtained by identifying consensus 1:1 orthologs with human genes using our HCOP tool. In this case, consensus was based on agreement between OMA, Panther, Ensembl Compara and NCBI’s “ortholog gene group data”. This seed set of 10,834 chimp genes is being used to test the database schema and curation tools. So far, our curators have used the system to assign approved gene names and symbols to 6000 chimpanzee genes.
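The consensus step can be illustrated with a minimal Python sketch. This is not the HCOP implementation: the function names, the input format (lists of chimp/human gene pairs per source) and the gene identifiers are all hypothetical, and "consensus" is assumed here to mean that every source independently calls the same pair as a 1:1 ortholog.

```python
from collections import Counter


def one_to_one(pairs):
    """Keep pairs whose chimp gene and human gene each occur exactly once."""
    chimp_counts = Counter(chimp for chimp, _ in pairs)
    human_counts = Counter(human for _, human in pairs)
    return {
        (chimp, human)
        for chimp, human in pairs
        if chimp_counts[chimp] == 1 and human_counts[human] == 1
    }


def consensus_orthologs(predictions):
    """Return the pairs that every source reports as a 1:1 ortholog.

    predictions maps a source name to an iterable of
    (chimp_gene, human_gene) pairs (hypothetical input format).
    """
    per_source = [one_to_one(set(pairs)) for pairs in predictions.values()]
    return set.intersection(*per_source) if per_source else set()


# Hypothetical identifiers, for illustration only.
predictions = {
    "Ensembl Compara": [("chimpA", "HUMAN1"), ("chimpB", "HUMAN2"), ("chimpC", "HUMAN3")],
    "NCBI": [("chimpA", "HUMAN1"), ("chimpB", "HUMAN2")],
    "Panther": [("chimpA", "HUMAN1"), ("chimpB", "HUMAN2"), ("chimpB", "HUMAN4")],
    "OMA": [("chimpA", "HUMAN1"), ("chimpB", "HUMAN2")],
}
consensus = consensus_orthologs(predictions)
```

In this toy input, Panther maps chimpB to two human genes, so that gene is not 1:1 in every source and drops out of the consensus, while chimpC is missing from three of the four sources.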

Database schema

Curation Tools Website
A new website has been developed at https://herd.genenames.org. It provides tools for VGNC curators to input, access and edit vertebrate gene nomenclature data. Access is restricted to people with user accounts: the site is for internal use only and will not be made accessible to the general public. New curation tools will be added to the site as needed.

https://herd.genenames.org

Quick curate tool

Quick curate user interface

Tool to restrict human nomenclature data to human genes only

Symbol list

Preview Symbol Report

Family upload tool – work in progress!

VGNC database
Database schema is identical to the curation database.
Contains vertebrate genes that have had their nomenclature data approved by our curators.
Updated daily.
Provides the data that will be displayed at http://vgnc.genenames.org.
Used to generate download files for the FTP site.
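One way to picture the relationship between the two databases is a daily job that copies only approved records from the curation database into the public one. The sketch below is an assumption, not the project's actual process: it uses SQLite in memory as a stand-in for MySQL, and the `gene` table, its columns, the `status` values and the gene records are all hypothetical.

```python
import sqlite3

# Hypothetical schema, shared by both databases (the slides only say the
# two schemas are identical; the real tables are not described).
SCHEMA = """
CREATE TABLE gene (
    gene_id INTEGER PRIMARY KEY,
    taxon   TEXT NOT NULL,
    symbol  TEXT,
    name    TEXT,
    status  TEXT NOT NULL
)
"""


def refresh_public(curation, public):
    """Replace the public gene table with the approved curation records."""
    rows = curation.execute(
        "SELECT gene_id, taxon, symbol, name, status "
        "FROM gene WHERE status = 'approved'"
    ).fetchall()
    with public:  # commits on success, rolls back on error
        public.execute("DELETE FROM gene")
        public.executemany("INSERT INTO gene VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)


curation = sqlite3.connect(":memory:")
public = sqlite3.connect(":memory:")
curation.execute(SCHEMA)
public.execute(SCHEMA)

# Hypothetical records: one approved gene, one still awaiting curation.
curation.executemany(
    "INSERT INTO gene VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Pan troglodytes", "GENE1", "hypothetical gene 1", "approved"),
        (2, "Pan troglodytes", None, None, "pending"),
    ],
)
copied = refresh_public(curation, public)
```

Because both databases share one schema, the copy needs no column mapping, and unapproved records never reach the public side.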

http://vgnc.genenames.org
This will be the public-facing website and will mirror the website for human data, http://genenames.org. The site is currently in development; we hope to have a beta site out early in 2016. Initially the site will be very simple, hosting symbol reports and a basic search facility for chimp genes, as well as download files for the curated chimp data.

Approved symbol list/search

Symbol report

Statistics and Downloads

Future Plans: New Species
We need to identify which species we should be working on next and are happy to take input from the SAB members.

Species      Protein-coding genes we can name using HCOP    Total protein-coding genes (from Ensembl)
Chimp        10,834                                         18,749
Cow          11,574                                         19,994
Dog          10,748                                         19,856
Horse         9,909                                         20,449
Macaque       9,745                                         21,905
Chicken       7,649                                         15,508
Opossum       6,605                                         21,327
Zebrafish     4,974                                         25,642
Platypus      1,415                                         21,698
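To compare candidate species, the counts above can be turned into the share of each species' protein-coding genes that HCOP could name. The following is a small illustrative calculation, not part of the VGNC tooling; the dog figure is read as 10,748, and the variable and function names are made up for the example.

```python
# (genes nameable via HCOP, total protein-coding genes from Ensembl),
# copied from the slide; the dog count is read as 10,748.
counts = {
    "Chimp": (10834, 18749),
    "Cow": (11574, 19994),
    "Dog": (10748, 19856),
    "Horse": (9909, 20449),
    "Macaque": (9745, 21905),
    "Chicken": (7649, 15508),
    "Opossum": (6605, 21327),
    "Zebrafish": (4974, 25642),
    "Platypus": (1415, 21698),
}


def coverage(nameable, total):
    """Percentage of protein-coding genes that could be named."""
    return 100.0 * nameable / total


# Species ordered from highest to lowest coverage.
ranked = sorted(counts, key=lambda species: coverage(*counts[species]), reverse=True)

for species in ranked:
    print(f"{species:<10}{coverage(*counts[species]):5.1f}%")
```

Ranking by percentage rather than by raw count matters because the totals differ widely: chicken has fewer nameable genes than macaque but a smaller gene set overall, so its coverage is higher.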

Future Plans: New Tools/Features
For herd.genenames.org:
    Gene family curation tool
    Comprehensive gene curation tool
    Gene mapping tool
For vgnc.genenames.org:
    SOLR powered search and REST service
    BioMart server/Custom downloads tool?
    Data Submission tools
    Sequence Alignment tool

Future Plans: Gene family information
Complete work on the family upload tool, so that data that has already been curated by our gene family experts can be entered into the VGNC database and website.
Enable display of vertebrate gene family data on the VGNC website, replicating the displays we have for human gene family data on the HGNC site.
Continue working with our family experts to provide tools that make it easier for them to curate gene family data.