Future Plans for the HGNC
Elspeth Bruford
Funding Sources
Applied to NHGRI for renewal of current U41 funding
  Submitted in cycle III ( )
  expect score Feb/March, advisory council May
  current end
  will apply for no-cost extension
Will be applying to Wellcome Trust Biomedical Resources fund (current end )
  preliminary application due
  full application due
Should we consider applying anywhere else?
Future Funded Aims ( )
continue naming of human protein-coding genes, pseudogenes & RNA genes
  – largely maintenance for protein coding genes, more focus on RNAs
continue reassignment of uninformative symbols based on functional data
  – bearing in mind clinical aspect
coordinate gene naming across vertebrates
  – increase in automation and species
assign gene names within complex families across vertebrate species (olfactory receptors, cytochrome P450s)
  – including new families: GSTs, UGTs, and ? ? zinc fingers, histones, immunoglobulins… ?
Resource Project
Aim 1: Naming novel protein coding loci
Focus on novel protein coding genes reported in the literature, annotated by GENCODE, and novel genes annotated on new alternative haplotypes.
Aim 2: Naming pseudogenes
Focus on transcribed and unprocessed pseudogenes, as well as segregating/polymorphic pseudogenes and unitary pseudogenes.
Aim 3: Naming long non-coding RNA genes
Name long non-coding RNA genes based on genomic location, or published (or prepublication) functional data. Prioritize published loci, and those annotated by GENCODE and RefSeq.
Aim 4: Naming small non-coding RNA genes
Name microRNAs, transfer RNAs, small nucleolar RNAs and ribosomal RNAs; investigate naming piRNA genes; create a “miscellaneous non-coding RNA” category for non-specific bioinformatically predicted genomic loci.
Resource Project
Aim 5: Reassigning placeholder symbols based on novel data
Seek new functional data to enable updates to placeholder symbols by collaborating with EuropePMC, using bioinformatics tools and identifying new GO annotations.
Aim 6: Improving human gene names for transferral to other species
Update human gene names to remove superfluous information and punctuation, aim to unify gene and protein names, and avoid using human phenotypes where possible, following community consultation.
Aim 7: Naming genes in other vertebrate species
Further automate naming of orthologs using a subset of HCOP data and the conversion rules formulated for chimp, initially for dog, cow and rhesus macaque, and improve tools for manual curation.
Aim 8: Examining complex homology in chimp
Manually curate chimp gene naming for cases where two or fewer of the orthology resources agree.
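As an illustration of the Aim 8 triage, the sketch below flags a gene for manual curation when two or fewer orthology resources support the same human ortholog. The function name, the resource labels, the input format and the reading of "agree" (the largest group of resources predicting the same ortholog) are all assumptions made for this example; it is not the HGNC pipeline.

```python
from collections import Counter


def needs_manual_curation(predictions, max_agreeing=2):
    """Return True when at most `max_agreeing` resources back the same ortholog.

    `predictions` maps a resource label to the human ortholog symbol it
    predicts, or None when that resource makes no call (hypothetical format).
    """
    # Count how many resources predict each ortholog, ignoring missing calls.
    votes = Counter(sym for sym in predictions.values() if sym is not None)
    if not votes:
        # No resource made a call at all, so a curator has to look.
        return True
    best_support = votes.most_common(1)[0][1]
    return best_support <= max_agreeing


# Hypothetical calls from four placeholder resources for one chimp gene.
weak_case = {"resource_a": "GENE1", "resource_b": "GENE1",
             "resource_c": "GENE2", "resource_d": None}
strong_case = {"resource_a": "GENE1", "resource_b": "GENE1",
               "resource_c": "GENE1", "resource_d": "GENE2"}

print(needs_manual_curation(weak_case))    # only two resources agree
print(needs_manual_curation(strong_case))  # three resources agree
```

The threshold is a parameter so the same check could be tightened or relaxed for other species.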
Resource Project
Aim 9: Naming CYP genes across vertebrates
Continue to name CYP genes in multiple vertebrate species and investigate novel mammalian CYP subfamilies.
Aim 10: Naming OR genes across vertebrates
Expand naming to non-mammalian vertebrate OR repertoires, initially looking at Xenopus, Anolis, zebrafish, chicken and zebra finch.
Aim 11: Increasing gene family resources
Curate more human genes into family sets based on shared characteristics, in consultation with specialist advisors when appropriate; continue to collaborate with FlyBase about their ‘Gene Groups’.
Aim 12: Naming in other complex gene families
Manually curate gene families with complicated orthology relationships across vertebrate species; develop new synteny and BLAST filtering tools; begin with UGT and GST families.
Resource Informatics
Aim 1: Updating internal HGNC curation tools
Reimplement internal tools as AngularJS web applications and migrate to a virtual machine.
Aim 2: Updating internal HGNC QC tools
Expand tools, including “end of day” sanity check; rewrite internal sequence search and alignment tool using EMBL-EBI RESTful web services.
Aim 3: Collaborating with EuropePMC
To notify us of publications relating to placeholder symbols, and journals to target.
Aim 4: Maintaining and updating HCOP
Expand with addition of new species, initially sheep, gorilla and S. pombe; investigate further orthology sources.
Resource Informatics
Aim 5: Maintaining and updating the VGNC database and pipeline
Expand to include data from other species, beginning with cow, dog & macaque; increase utility by incorporating more external cross references; expand the set of tools and views available on the website.
Aim 6: Updating internal VGNC curation and QC tools
Create AngularJS web applications for curating individual gene symbols & gene families, and a synteny tool for curating orthologs in multiple vertebrate species in a single process.
Aim 7: Updating the HGNC database and release pipeline
Move from PostgreSQL schema to fully normalised MySQL schema; reimplement update pipeline to streamline the processes and utilise extensive compute farm at EMBL-EBI.
Aim 8: Soliciting user input
Encourage feedback via our websites; utilise data from annual survey, “contact us” form, web statistics and from panel of users; continue to attend and participate in a range of conferences and workshops.
Management, Dissemination & Training
Aim 1: Organizational structure and staff responsibilities
Elspeth will continue managing 4 FTE curators at EMBL-EBI and University of Cambridge, supported by 2 informatics staff at EMBL-EBI, augmented remotely by 4 complex gene family experts and a programmer.
Aim 2: Scientific Advisory Board
Continue to receive key advice from the SAB, with a yearly face-to-face meeting.
Aim 3: HGNC website backend and frontend redesign
HGNC website backend replaced with a single server; frontend rewritten using AngularJS, Jekyll and HTML5.
Aim 4: Maintaining and updating searches & download facilities
Continue to support existing facilities; expand BioMart to include gene family data, and both BioMart & REST to include VGNC data.
Aim 5: Maintaining and updating the VGNC website
Initial efforts will focus on methods and tools for downloading VGNC data, along with gene family data displays.
Aim 6: Training
Continue attending major genomics conferences; plan to produce more online tutorials and start an HGNC blog.
1. Transposable elements
2. Pseudogenes
What classes: transcribed? unprocessed? unitary? published? by parent gene?
3. Symbols converted to dates
DEC1, deleted in esophageal cancer 1 > ??
MARC1-2, mitochondrial amidoxime reducing component 1-2 > MTARC1-2?
MARCH1-10, membrane associated ring-CH-type finger 1-10 > MARF1-10?
SEPT1-14, septin 1-14 > SEPTIN1-14?
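A rough way to screen for symbols at risk of being read as dates is to test whether a symbol looks like a month name or abbreviation followed by a day number. The sketch below is only a heuristic written for this example: the function name and the matching rule (a prefix of at least three letters of an English month name, or "SEPT", followed by 1 to 31) are assumptions, and it does not reproduce the parsing rules of any particular spreadsheet program.

```python
import re

MONTHS = [
    "JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE", "JULY",
    "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER",
]

# Letters, an optional hyphen, then one or two digits, e.g. SEPT2 or MARCH10.
_SYMBOL_RE = re.compile(r"^([A-Za-z]+)-?(\d{1,2})$")


def may_convert_to_date(symbol):
    """Heuristic: True if `symbol` resembles '<month><day>' (assumed rule)."""
    match = _SYMBOL_RE.match(symbol)
    if match is None:
        return False
    letters = match.group(1).upper()
    day = int(match.group(2))
    if not 1 <= day <= 31:
        return False
    if letters == "SEPT":
        # Common four-letter abbreviation that is not a prefix-length rule case.
        return True
    # Any prefix of three or more letters of a month name, e.g. MARC or DEC.
    return len(letters) >= 3 and any(m.startswith(letters) for m in MONTHS)


for sym in ["DEC1", "MARC1", "MARCH10", "SEPT14", "SEPTIN1", "MTARC1"]:
    print(sym, may_convert_to_date(sym))
```

Under this heuristic the current symbols on the slide are flagged, while the proposed replacements MTARC1, MARF1 and SEPTIN1 are not.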
4. Create tool for simplified queries on recent updates to dataset
5. Alliance of Genomic Resources
6. HGNCmine
7. HCOP - other species
8. HCOP - more export options
9. Classifying Gene Families
Different ways to classify:
  Homology
  Domain/motif
  Complex
  Shared function/pathway/phenotype
  Combinations of these…
10. Blogging and other social media
Computing
Complex Gene Families
Olfactory receptors – Doron
Cytochrome P450s – Jed & David
Bidirectional promoters
Grzechnik et al 2014, TiBS
lncRNA “head to head” (TSSs < 1 kb) with coding gene on antisense strand
implications for transcription of both loci
currently denoted in gene name, e.g. FOXG1-AS1, “FOXG1 antisense RNA 1 (head to head)”
Mattick & Rinn, 2015 suggested using “BI” prefix to coding gene symbol
recent community consultation suggests referencing head-to-head orientation, e.g. SYMBOL-HTH, or naming as lincRNAs
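The "head to head" criterion (opposite strands, TSSs < 1 kb apart) can be expressed as a small check. This is a minimal sketch under stated assumptions: the function name and arguments are hypothetical, coordinates are taken to be on one chromosome, and only non-overlapping divergent pairs (the minus-strand TSS lying upstream of the plus-strand TSS) are counted as head to head.

```python
def is_head_to_head(tss_a, strand_a, tss_b, strand_b, max_gap=1000):
    """Return True if two loci are divergently transcribed with TSSs < max_gap bp apart.

    Strands are "+" or "-"; TSS positions are coordinates on the same
    chromosome (hypothetical input format for this example).
    """
    # The two loci must be on opposite strands.
    if {strand_a, strand_b} != {"+", "-"}:
        return False
    plus_tss, minus_tss = (tss_a, tss_b) if strand_a == "+" else (tss_b, tss_a)
    # Divergent: the minus-strand locus starts to the left of the plus-strand
    # locus, so the two are transcribed away from each other.
    gap = plus_tss - minus_tss
    return 0 < gap < max_gap


# Made-up coordinates: coding gene on "+", lncRNA on "-" starting 400 bp upstream.
print(is_head_to_head(10_400, "+", 10_000, "-"))
# Same strands are never head to head.
print(is_head_to_head(10_400, "+", 10_000, "+"))
```

A pair that passes a check of this kind would be a candidate for the "(head to head)" annotation or a SYMBOL-HTH style name discussed on the slide.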