Barcode sequences at GenBank
  Scott Federhen
  NCBI/NLM/NIH
  4 Jun. 2005
Archiving and Distributing Core Barcode Data Elements
  1. Barcode sequences versioned and archived in INSD.
  2. Core specimen annotation associated with sequence entries.
  3. Links to specimen data, literature and ancillary resources.
  4. Archive underlying raw data (traces & quality scores).
  5. Bulk submission tool for barcode sequences.
Core specimen annotation
  organism - species name
  isolate - field isolation name
  specimen-voucher - voucher identifier
  country - locality information
  lat-lon - GPS coordinates
  collection-date - minimal data about collection event
  collected-by
  identified-by
  forward-primer - PCR primer data (for reproducibility)
  reverse-primer
  forward-primer-name
  reverse-primer-name
  note - free-text note
  other GenBank qualifiers - cell-type; cell-line; sex; dev-stage; host ...
  db-xref - explicit links to outside database entry
BARCODE requirements
  valid species name
  5' region of mtCOI (until other loci are approved by CBoL)
  at least 500 bp; <1% ambiguous bases; bi-directional reads
  (structured) specimen voucher (e-voucher if appropriate)
  locality information (/country or /lat_lon)
  primers
  traces
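The mechanical parts of these requirements can be sketched as a small checker. This is only an illustration, not the validation NCBI or CBoL actually runs: the record is assumed to be a plain dictionary keyed by the qualifier names from the annotation slide, and the `sequence` and `traces` keys are hypothetical. It checks only that a species name is present, not that it is valid, and it does not verify the mtCOI region or bi-directional reads.

```python
UNAMBIGUOUS = set("ACGT")


def barcode_problems(record, min_length=500, max_ambiguous=0.01):
    """Return a list of problems; an empty list means every check passed.

    `record` is a hypothetical dict; the keys are assumptions for this sketch.
    """
    problems = []
    seq = record.get("sequence", "").upper()

    # at least 500 bp
    if len(seq) < min_length:
        problems.append(f"sequence is {len(seq)} bp, need at least {min_length}")

    # <1% ambiguous bases (anything other than A, C, G, T counts as ambiguous)
    ambiguous = sum(base not in UNAMBIGUOUS for base in seq)
    if seq and ambiguous / len(seq) >= max_ambiguous:
        problems.append(f"{ambiguous} ambiguous bases is not under 1% of {len(seq)} bp")

    # presence checks for the annotation-level requirements
    if not record.get("organism"):
        problems.append("missing species name")
    if not record.get("specimen-voucher"):
        problems.append("missing specimen voucher")
    if not (record.get("country") or record.get("lat-lon")):
        problems.append("missing locality information (country or lat-lon)")
    if not (record.get("forward-primer") and record.get("reverse-primer")):
        problems.append("missing primers")
    if not record.get("traces"):
        problems.append("missing traces")
    return problems
```

Returning a list of messages rather than a single true/false value lets a submitter see every failed requirement at once.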
new NCBI initiatives
  museum/herbarium/collections database
  structured vouchers: <inst-code> <coll-code> <spec-id>
  explicit checklists: ITIS, Species2000, uBio, ZooRecord ...
  PubMed/PubMed Central expansion to systematic literature
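A structured voucher of this shape could be split into its parts roughly as follows. The slide does not show a separator; this sketch assumes the colon-separated form used by the INSDC specimen_voucher qualifier, where the institution and collection codes are optional. The `Voucher` type and `parse_voucher` function are hypothetical names.

```python
from typing import NamedTuple, Optional


class Voucher(NamedTuple):
    inst_code: Optional[str]   # <inst-code>, e.g. a museum or herbarium acronym
    coll_code: Optional[str]   # <coll-code>, a collection within the institution
    spec_id: str               # <spec-id>, the specimen identifier itself


def parse_voucher(text: str) -> Voucher:
    """Split 'inst:coll:id', 'inst:id' or a bare 'id' (assumed colon-separated)."""
    parts = [p.strip() for p in text.split(":", 2)]
    if len(parts) == 3:
        inst, coll, spec = parts
    elif len(parts) == 2:
        inst, coll, spec = parts[0], None, parts[1]
    else:
        inst, coll, spec = None, None, parts[0]
    if not spec:
        raise ValueError(f"no specimen id in voucher {text!r}")
    # Normalise empty codes to None so callers can test for presence.
    return Voucher(inst or None, coll or None, spec)
```

A bare identifier with no codes still parses, which mirrors unstructured vouchers that carry only a field or catalogue number.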
barcode [keyword] cbol
-------------------------------------------------------------------
linkid: 0
query: AF187808 [accn]
query: AF186989 [accn]
query: AF186917 [accn]
rule:
name: Image of voucher specimen for DNA Sample 27-6
barcode[keyword]
Formicidae[orgn]
Diptera[subtree] AND species[rank]
(Craniata[subtree] NOT Tetrapoda[subtree]) AND species[rank] NOT unspecified[prop] NOT loprovfishbase[filter]
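Queries like these can also be run programmatically. The sketch below only builds a request URL for the NCBI E-utilities `esearch` endpoint and does not contact the network. Which database each query belongs to is an assumption here: the `[keyword]` and `[orgn]` fields are taken to target the nucleotide database, and `[subtree]` and `[rank]` the taxonomy database. `esearch_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an esearch URL; brackets and spaces in the term are percent-encoded."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


# Assumed database assignments for two of the queries shown on the slide.
barcode_query = esearch_url("nucleotide", "barcode[keyword]")
diptera_query = esearch_url("taxonomy", "Diptera[subtree] AND species[rank]")
```

Fetching either URL would return a list of matching record identifiers, which can then be passed to other E-utilities calls.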
Barcode Submission Tool
  Contact Information
  Authors & Citations
  Global Elements
    mt COI [genetic code] [primers]
  Source Modifiers
  Nucleotide FASTA
  Protein FASTA
  Quality Scores
Source Modifiers Table
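The layout of the table is not reproduced in this text, so the following is only a guess at its general shape: one row per sequence, one column per source modifier, written as tab-delimited text. The column names reuse qualifier names from the core specimen annotation list; the `Sequence_ID` column, the choice of columns, and the function name are all assumptions rather than the submission tool's actual format.

```python
import csv
import io

# Hypothetical column set: an identifier plus some of the core qualifiers.
COLUMNS = [
    "Sequence_ID", "organism", "specimen-voucher", "country",
    "lat-lon", "collection-date", "collected-by", "identified-by",
]


def source_modifiers_table(rows):
    """Render a list of dicts as tab-delimited text; missing values stay empty."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n", restval=""
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)  # raises ValueError on keys outside COLUMNS
    return buf.getvalue()
```

Keeping one row per sequence means the table can be matched to the nucleotide FASTA file by sequence identifier.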