Data Landscapes neuinfo.org Anita Bandrowski, Ph. D. University of California, San Diego.

Presentation transcript:

Data Landscapes neuinfo.org Anita Bandrowski, Ph. D. University of California, San Diego

Overview
- Brief overview of NIF philosophy
- Examples of data about addiction
- Why you should never use Google to answer any scientific question
- How can we make Google better?

Power!
- How many subjects/patients do we need to be relatively certain that we are correct?
- More than you can afford?
- If YFGM gave each of you 1B dollars, would that solve the problem?
- But, what if: big data from small data?
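The "how many subjects" question can be made concrete with a textbook power calculation. The sketch below is an illustration only and is not taken from the slides: it assumes a two-group comparison of means, a normal approximation, and a standardized effect size (Cohen's d). The function name `subjects_per_group` and the default alpha and power values are assumptions chosen for the example.

```python
import math
from statistics import NormalDist


def subjects_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate subjects needed per group for a two-sided, two-sample
    comparison of means (normal approximation; illustrative assumption).

    effect_size is the standardized difference between groups (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    # n per group = 2 * ((z_alpha + z_beta) / d)^2, rounded up
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    # Smaller effects need many more subjects: halving d roughly quadruples n.
    for d in (0.5, 0.2, 0.1):
        print(d, subjects_per_group(d))
```

Because the required sample grows with the inverse square of the effect size, small effects quickly exceed what a single lab can recruit, which is the motivation for pooling small data sets.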

Addiction is a large problem

Solving the large problems of science?
- Observation
- Experimentation
- Modeling
- Cooperative data-intensive science

A SHARED UNDERSTANDING OF THE GENETICS OF ADDICTION, HOW CAN EVERYONE PLAY?

NIF is an initiative of the NIH Blueprint consortium of institutes
- What types of resources (data, tools, materials, services) are available to the neuroscience community?
- How many are there?
- What domains do they cover? What domains do they not cover?
- Where are they? Web sites, databases, literature, supplementary material, PDF files, desk drawers
- Who uses them?
- Who creates them?
- How can we find them?
- How can we make them better in the future?

NIF: A New Type of Entity for New Modes of Scientific Dissemination
NIF’s mission is to maximize the awareness of, access to and utility of digital resources produced worldwide to enable better science and promote efficient use
- NIF unites neuroscience information without respect to domain, funding agency, institute or community
- NIF is a library for scholarly output that is a web-enabled resource and not a paper
- Aggregates all the different databases, tools and resources now produced by the scientific community
- Makes them searchable from a single interface
- A practical approach to the data deluge
- Educates neuroscientists and students about effective data sharing

Surveying the resource landscape
NIF resource registry: listing of > 6000 databases, tools, materials, services, websites (> 2500 databases)

NIF data federation: PubMed Central for data
NIF was designed to accommodate the multiplicity of heterogeneous and distributed data resources, providing deep query of the contents and unified views
- 200 sources
- > 360 M records

NIF Semantic Framework: NIFSTD ontology
- NIF covers multiple structural scales and domains of relevance to neuroscience
- Aggregate of community ontologies with some extensions for neuroscience, e.g., Gene Ontology, ChEBI, Protein Ontology
- Ontologies provide the universals for integrating across disparate data by linking them to human knowledge models
Diagram labels (NIFSTD): Organism, NS Function, Molecule, Investigation, Subcellular structure, Macromolecule, Gene, Molecule Descriptors, Techniques, Reagent, Protocols, Cell, Resource, Instrument, Dysfunction, Quality, Anatomical Structure

Neurolex: Machine-processable concepts for neuroscience
- Machine-processable lexical units
- Connected via relationships
- Identified by a unique identifier (URL)
- Computable index for neuroscience
- Framework for linking knowledge, claims and data
- Built using a semantic wiki
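The idea of lexical units that carry a unique URL and are connected by relationships can be sketched as a small data structure. This is a minimal illustration, not Neurolex's actual schema: the `Concept` class, the `part_of` predicate name, and the `http://example.org/concept/` identifiers are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

BASE = "http://example.org/concept/"  # hypothetical identifier prefix


@dataclass
class Concept:
    uri: str                                   # unique identifier (URL)
    label: str                                 # preferred name
    synonyms: List[str] = field(default_factory=list)
    # Outgoing relationships as (predicate, target URI) pairs.
    relations: List[Tuple[str, str]] = field(default_factory=list)


def build_index(concepts: List[Concept]):
    """Return (by_uri, by_name): lookup by identifier and by any known name."""
    by_uri: Dict[str, Concept] = {c.uri: c for c in concepts}
    by_name: Dict[str, str] = {}
    for c in concepts:
        for name in [c.label, *c.synonyms]:
            by_name[name.lower()] = c.uri
    return by_uri, by_name


def related(by_uri: Dict[str, Concept], uri: str, predicate: str) -> List[str]:
    """Labels of concepts reachable from `uri` via `predicate`."""
    return [by_uri[target].label
            for pred, target in by_uri[uri].relations
            if pred == predicate and target in by_uri]


concepts = [
    Concept(BASE + "1", "Brain"),
    Concept(BASE + "2", "Striatum", [], [("part_of", BASE + "1")]),
    Concept(BASE + "3", "Caudate nucleus", ["Caudate"], [("part_of", BASE + "2")]),
]
by_uri, by_name = build_index(concepts)

if __name__ == "__main__":
    uri = by_name["caudate"]                 # a synonym resolves to one identifier
    print(uri, related(by_uri, uri, "part_of"))
```

Resolving every name and synonym to a single identifier is what makes the lexicon usable as a computable index: data annotated with different terms for the same structure end up attached to the same URL.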

NIF Analytics: The Neuroscience Landscape
Ontologies provide a semantic framework for understanding data/resource landscape
Where are the data?
Figure labels: Striatum, Hypothalamus, Olfactory bulb, Cerebral cortex, Brain, Brain region, Data source
Vadim Astakhov, Kepler Workflow Engine

A data homunculus?

Genetics of addiction?
Gene, Protein, Subcellular components, Cells, Cell microcircuits, Cell macrocircuits, Networks, Brain regions, PNS, Whole organism, Behaving organism (environment), Networks of organisms, Populations

Genetics of addiction?
- Addiction is a disease of subpopulations of humans who take sociologically undesirable drugs or sociologically desirable drugs at undesirable concentrations
- Drug is a molecule that does not exist in the body, an environmental factor
- Drugs are metabolized by the digestive system and act after crossing the BBB
- Drugs modify the activity of existing proteins on vastly different time scales
- Drugs modify behaviors that depend on the actions of an orchestra of neurons acting within circuits that all have a purpose that is not to take drugs

The ecosystem is diverse and messy (and that’s OK)
NIF favors a hybrid, tiered, federated system
- Domain knowledge: ontologies
- Claims and observations: Virtuoso RDF triples
- Data: data federation, spatial data, workflows
- Narrative: full text access
Diagram labels: Neuron, Brain part, Disease, Organism, Gene, Data, Knowledge
Example claims: Caudate projects to SNpc; Grm1 is upregulated in chronic cocaine; Betz cells degenerate in ALS
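The three example claims on this slide fit the subject-predicate-object shape of an RDF triple. The sketch below stores them as plain Python tuples and answers simple pattern queries. It is an illustration under stated assumptions: it does not use Virtuoso or any RDF library, and the predicate spellings (`projects_to`, `is_upregulated_in`, `degenerates_in`) and the `match` helper are hypothetical names made up for the example.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# The slide's example claims, with hypothetical predicate names.
CLAIMS: List[Triple] = [
    ("Caudate", "projects_to", "SNpc"),
    ("Grm1", "is_upregulated_in", "chronic cocaine"),
    ("Betz cell", "degenerates_in", "ALS"),
]


def match(triples: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj)
    return [t for t in triples
            if all(want is None or want == got
                   for want, got in zip(pattern, t))]


if __name__ == "__main__":
    print(match(CLAIMS, predicate="projects_to"))  # what projects where?
    print(match(CLAIMS, obj="ALS"))                # what is claimed about ALS?
```

Keeping claims in this uniform shape is what lets statements about genes, cells, and brain regions sit in one store and be queried with the same pattern, regardless of which source they came from.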

Wish list: Cooperative science
- A mission that will engage the entire neuroscience community and beyond
- An active community contribution model where everyone is expected to contribute their outputs, not just a selected few
  - Diverse contributions are tracked and recognized
  - Spatial-semantic-genetic-temporal frameworks make data discoverable-usable-integratable and help fill in the gaps
- A platform that moves neuroscience into the web
  - Networking data, knowledge, tools, models, efforts, people, compute resources, simulation
  - Supports digital research objects as first-order contributions, not just narrative
  - Works through and with existing platforms to improve them where possible
Cooperative system: “...individual components that appear to be “selfish” and independent work together to create a highly complex, greater-than-the-sum-of-its-parts system.”

neurolex.org INCF Community encyclopedia
- Standardize vocabulary
- Define all vocabulary, terms, protocols, brain structures, diseases, etc.
- Living review articles
- Build and maintain working ontologies
- Links to data, models and literature
- Semantic organization, search, analysis and integration
- Global directory of all shared vocabularies, CDEs, etc.
Slide courtesy of Sean Hill

Community Platforms: Researchers-tools-data-computing