Identifiers: what are they and why are they useful? Anita Bandrowski, UCSD.

Presentation transcript:

WHEN PEOPLE RAN THE WORLD …
People identify things based on location and group membership
Roman (individual) Kowalski (family), Nowa Gora (location)
Graph theory points to social networks of roughly 100 relationships

WHEN PEOPLE RAN THE WORLD …
New York phone system: as complexity grows, people need to be replaced

A UNIVERSAL TURING MACHINE
Performing tasks that retrieve thousands or millions of data items is relatively unencumbered
Storing digits is ‘native’ to the machine
Can we use this to help humans make calls?
(directional) (location) (specific)

WHY NOT A NAME?
What is a cell? What is a nucleus? What is a Curie?
Who is Alexander? Who is Olek?
One to many, many to one

AN ID
Points uniquely to a single entity
Resolves information about the entity without ambiguity
In relational databases, a unique key
Can we have IDs that are unique across multiple systems?
URI (like a URL, but resolves a single entity uniquely between systems)
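
The contrast between a name and an ID can be sketched in a few lines of Python. This is only an illustration: the `ex:` identifiers, the definitions, and the `https://example.org/id/` base are invented placeholders, not identifiers from any real registry.

```python
# Hypothetical records: the IDs and meanings below are invented for illustration.
ENTITIES = {
    "ex:0001": {"label": "nucleus", "meaning": "organelle that holds a cell's DNA"},
    "ex:0002": {"label": "nucleus", "meaning": "cluster of neuron cell bodies in the brain"},
    "ex:0003": {"label": "nucleus", "meaning": "core of an atom"},
}


def lookup_by_name(name):
    """A name can match many entities (one to many)."""
    return sorted(eid for eid, rec in ENTITIES.items() if rec["label"] == name)


def lookup_by_id(entity_id):
    """An ID acts like a unique key: it points to one entity, or to nothing."""
    return ENTITIES.get(entity_id)


def to_uri(entity_id, base="https://example.org/id/"):
    """Expand a local key into a URI so it stays unique across systems.

    The base address is a placeholder, not a real resolver.
    """
    prefix, local = entity_id.split(":", 1)
    return f"{base}{prefix}/{local}"


print(lookup_by_name("nucleus"))   # three candidates for one name
print(lookup_by_id("ex:0002"))     # exactly one record for one ID
print(to_uri("ex:0002"))
```

Looking up "nucleus" by name returns three candidates, while each ID returns a single record; the URI form simply adds a namespace so two systems cannot mint the same key by accident.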

CAN WE PUSH ENTITIES FURTHER?
Can identifiers be generated for everything?
Can there be relationships between everything?
Can there be a path that a computational system traverses between unambiguous terms?
Diagram nodes: Cell, Neuron, Pyramidal cell, Neocortex, Brain, Glutamate, Molecule. Edge labels: is_a, part_of, neurotransmitter_of.
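
One way to picture a computational system traversing a path between unambiguous terms is a breadth-first search over subject/relation/object triples. The sketch below uses the terms from the slide; the is_a and part_of triples follow standard usage, while the endpoints of the neurotransmitter_of edge are an assumption, since the transcript preserves only the relation name.

```python
from collections import deque

TRIPLES = [
    ("Neuron", "is_a", "Cell"),
    ("Pyramidal cell", "is_a", "Neuron"),
    ("Neocortex", "part_of", "Brain"),
    ("Glutamate", "is_a", "Molecule"),
    # Assumption: the slide names this relation, but not which terms it connects.
    ("Glutamate", "neurotransmitter_of", "Pyramidal cell"),
]


def find_path(start, goal, triples=TRIPLES):
    """Breadth-first search along directed edges.

    Returns the list of triples on a shortest path, [] if start == goal,
    or None if the goal cannot be reached.
    """
    edges = {}
    for subj, rel, obj in triples:
        edges.setdefault(subj, []).append((rel, obj))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, obj in edges.get(node, []):
            if obj not in seen:
                seen.add(obj)
                queue.append((obj, path + [(node, rel, obj)]))
    return None


for step in find_path("Glutamate", "Cell"):
    print(" ".join(step))
```

Because every node is an unambiguous term, the search never has to guess which "cell" or "nucleus" is meant; it only follows edges.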

ONTOLOGY
Philosophical study of the nature of being
Categories of being and their relations
What are entities, and how can they be grouped and related?
Computer science: formal representation of knowledge in a domain
…but what do you code into an ontology?

A RESOURCE CATALOG:
Must have a reasonable account of what is out there
Diagram labels: Image repository, Database, Atlas; NITRC, Harvard, Nencki Inst.; Institution, Resource Type; is_a, has_role

A RESOURCE CATALOG IS IMPORTANT, BUT
Software tools appear and disappear
Portals are created and change frequently
Databases update data annually, monthly, weekly
So how can you catalog the ephemeral?

HOW DO YOU KEEP A REGISTRY CURRENT?

A SHARED RESOURCE REGISTRY: NIF

SOCIAL NETWORK OF RESOURCES?
3DVC: 182; Force11: 88; Monarch: 88; OneMind: 609; GeneOntology Tools
Which resources are shared by multiple communities?
Structured data allows us to answer questions easily
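
As a rough illustration of why structured data makes the "shared by multiple communities" question easy, the sketch below counts how many community lists each resource ID appears in. The community names come from the slide, but the membership sets and the `res:` identifiers are invented placeholders, not the real registry contents.

```python
from collections import Counter

# Hypothetical membership lists; the resource IDs are invented placeholders.
COMMUNITIES = {
    "3DVC": {"res:A", "res:B", "res:C"},
    "Force11": {"res:B", "res:D"},
    "Monarch": {"res:B", "res:C", "res:E"},
}


def shared_resources(communities, min_communities=2):
    """Map each resource ID to the number of communities listing it,
    keeping only those listed by at least `min_communities`."""
    counts = Counter(rid for members in communities.values() for rid in members)
    return {rid: n for rid, n in counts.items() if n >= min_communities}


print(shared_resources(COMMUNITIES))
```

The same question asked of free-text resource names would first require deciding which names refer to the same thing; with shared IDs it reduces to counting.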

CAN WE MINE RELATIONSHIPS BETWEEN RESOURCES?
Human annotations give a different graph of relationships
Text mining gives a picture of the most used resources (figure label: PDB)

… BUT DATABASES CAN CONTAIN A LOT OF DATA NOT EASILY FOUND BY SEARCHING KEYWORDS
Databases continue to be opaque to search engines
They defy cataloguing efforts
They can update daily
There are over 2500 of them
Where is data relevant to me?
The DISCO tool suite was built to incorporate data directly from databases into a unified index in NIF.

>200 data sources; >850M data records; >6M links to articles (neuinfo.org)

DATA ARE DIVIDED INTO TYPES

UNIFORM SEARCH BASED ON ONTOLOGIES

DATA ABOUT THE SUBTHALAMUS

CONNECTOME DATABASES
Each resource implements a different model, which works well for the resource

UNIFORM RESOURCE LAYER ALSO MEANS UNIFORM DATA ACCESS
disco.neuinfo.org
Luis Marenco, Rixin Wang; Yale

LET’S PLAY A GAME

WHAT IS THIS?

HOMUNCULUS

HOMUNCULUS
Careful mapping of the entire somatosensory cortex yields a representation of the amount of area devoted to sensing each body region.
From the homunculus we learn that humans pay attention to the lips, hands and genitals.

EACH ANIMAL HAS A SET OF BODY REGIONS THAT IT IS PARTICULARLY CONCERNED WITH

Is there a data homunculus? If so, how can we know it?

THE BRAIN AND ITS DATA
Ontologies provide a semantic framework for understanding the data/resource landscape
Data sources included in NIF
-Complete list:
-Services:
Figure labels: Striatum, Hypothalamus, Olfactory bulb, Cerebral cortex, Brain; Brain region; Data source
Vadim Astakhov, Kepler Workflow Engine

Brain region popularity (scale: most popular to least popular)
Regions shown: Brain, Cerebral cortex, Striatum, Amygdala, Thalamus, Cerebellar cortex, Cerebellum, Hypothalamus, Olfactory bulb, Forebrain, Nucleus accumbens, Third ventricle, Substantia nigra, Midbrain, Medulla oblongata, Ventral tegmental area, Pons, Stria terminalis, Subbrachial nucleus, Commissural nucleus of vagus nerve, Dorsal longitudinal fasciculus of medulla, Medullary raphe nuclear complex, Abducens nerve root, Central tegmental tract of midbrain, Spinothalamic tract of midbrain, Superior cerebellar peduncle of midbrain, Medial longitudinal fasciculus of midbrain, White matter of the cerebellar cortex, Accessory nerve root, Vagus nerve root, Oculomotor nerve root, Trochlear nerve root, Optic nerve root, Olfactory nerve root

WHICH BRAIN REGIONS HAVE MOST ANNOTATIONS?
Sum per Level
Sum for Major Brain Region

SO HOW CAN WE DO BETTER AT ANNOTATING DATA?
Can identifiers help?

A SYSTEM TO IDENTIFY NOT JUST WHO PRODUCED A FINDING, BUT WHAT PRODUCED IT
Faulty Antibodies Continue to Enter US and European Markets, Warns Top Clinical Chemistry Researcher - Genome Web Daily, October 11, 2013
“…of the findings in the literature about neuronal NF-κB are based on data garnered with antibodies that are not selective for the NF-κB …” --Herkenham et al.

WHAT STUDIES USED MY MONOCLONAL MOUSE ANTIBODY AGAINST ACTIN IN HUMANS?
The following antibodies were used for immunoblotting: -actin mAb (1:10,000 dilution, Sigma-Aldrich); - tubulin mAb (1:10,000, Abcam); T46 mAb (specific to tau 404–441, 1:1000, Invitrogen); Tau-5 mAb (human tau 218–225, 1:1000, BD Biosciences) (Porzig et al., 2007); AT8 mAb (phospho-tau Ser199, Ser202, and Thr205, 1:500, Innogenetics); PHF-1 mAb (phospho-tau Ser396 and Ser404, 1:250, gift from P. Davies); 12E8 mAb (phospho-tau Ser262 and Ser356, 1:1000, gift from P. Seubert); NMDA receptors 2A, 2B and 2D goat pAbs (C terminus, 1:1000, Santa Cruz Biotechnology)…
mAb = monoclonal antibody

… SURELY YOU HAVE FOUND A TERRIBLE PAPER, THIS CAN’T BE THE NORM

Hypothesis: Resources in the published literature are not uniquely identifiable
Gather journal articles
5 domains: Immunology, Cell biology, Neuroscience, Developmental biology, General biology
3 impact factors: High, Medium, Low
84 journals, 238 papers
707 antibodies, 104 cell lines, 258 constructs, 210 knockdown reagents, 437 model organisms
Vasilevsky et al., PeerJ, 2013

The problem is general across multiple resource types and disciplines (Vasilevsky et al., PeerJ, 2013)

RESOURCE IDENTIFICATION INITIATIVE
Two pre-meetings with editors and publishers
Society for Neuroscience, 2012
NIH: June, 2013
Society for Neuroscience, 2013
Designed pilot project
Entities
Procedure
Infrastructure
Established working group through FORCE11
Signed up partners
Led by: Matt Brush, Nicole Vasilevsky, Anita Bandrowski
And more

PILOT PROJECT
Authors to identify 3 types of research resources: software/databases, antibodies, model organisms
Include RRID in methods section
Voluntary for authors
Journals did not have to modify their submission system
Journals have flexibility in implementation. Send request to author at: submission, during review, after acceptance
Launched February 2014: 3 month commitment
and more…
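
Placing RRIDs in the methods section is what makes them machine-findable. A minimal sketch of pulling them out of methods text is shown below. The pattern is an assumption (the literal `RRID:` followed by a registry prefix, an underscore, and an accession), and the identifiers and the tool name in the sample text are invented placeholders rather than real registry entries.

```python
import re

# Assumed syntax: "RRID:" + registry prefix + "_" + accession, e.g. RRID:AB_0000001.
RRID_PATTERN = re.compile(r"\bRRID:\s?([A-Za-z]+_[A-Za-z0-9]+)")


def extract_rrids(text):
    """Return the distinct RRIDs found in `text`, in order of first appearance."""
    return list(dict.fromkeys(RRID_PATTERN.findall(text)))


# Invented methods paragraph; the IDs and "ExampleTool" are placeholders.
METHODS = (
    "Blots were probed with an actin mAb (1:10,000, RRID:AB_0000001) and "
    "a tubulin mAb (1:10,000, RRID:AB_0000002). Images were analyzed with "
    "ExampleTool (RRID:SCR_0000003); the actin mAb (RRID:AB_0000001) was reused."
)

print(extract_rrids(METHODS))
```

Compare this with the antibody list quoted earlier, where vendor names and dilutions are the only handles available: a plain text search can recover an RRID, but not which catalog item "actin mAb, Sigma-Aldrich" refers to.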

RII PORTAL
A single portal for authors
>10 databases
One search interface
Simple directions
Big “Cite This” button
Uniform format for citation
Help desk for authors

WHAT STUDIES USED …
>100 articles have appeared to date
15 journals
630 RRIDs
3 removed by typesetting
95% correct
14% false negative rate
>200 antibodies were added
>75 software tools/databases were added
Database available at:

An update of Vasilevsky et al.

WHAT CAN WE DO WITH AN RRID?
A resolver service has been created
3rd party tools are being created to provide linkage between resources and papers
Utopia, prototype ScienceDirect

WHAT HAVE WE LEARNED?
Authors are willing to adopt new types of citations
Authors were fairly accurate at performing the task
RRIDs resolved by search engines without requiring specialized citation services
Citation drives registration
Clear role for repositories as authorities

HOW CAN YOU HELP?
Authors: Use IDs in YOUR next paper. At least 100 of your friends already have. scicrun.ch/resources
Tool Makers: Register your tools! Make the authors' job easy. Display the proper citation ID format proudly.
Reviewers: Ask authors to put identifiers in their methods; you know they will do almost anything to get you off their back.
Editors: Still time to join the RII; go to Force11 to download a sample letter to authors.
Publishers: Central instructions to authors have been updated at Springer and Elsevier; where are yours?