Published by Maximillian Bruce. Modified over 9 years ago.
1
Study Discovery in Support of the Data Without Boundaries Initiative, the NIH Data Documentation Index and Infonomics
Jay Greenfield, Booz Allen Hamilton
DDI 2014 IASSIST Sprint, Toronto, ON
2
Agenda
Introduce three initiatives that a DDI 4 Discovery functional view needs to support:
– Data Without Boundaries (DwB)
– NIH Data Discovery Index (NIH DDI)
– The Infonomics use case
In this context, consider some SDMX-based, GSIM, and DDI Dublin Core-based information objects with which a DDI 4 Discovery view may need to be aligned
In view of these information objects, consider the completeness of DISCO
3
Use Cases
4
The DwB and NIH DDI Use Cases
In both DwB and NIH DDI, aggregate datasets are a subject for discovery together with micro datasets:
– The DwB Metadata Model includes elements from both DDI 3 and SDMX, with the idea of using aggregate data to “provide context for searches for microdata”
– Likewise, NIH DDI seeks to spawn a pilot project that “would work with interested journals (such as PLoS, BMC, or Nature Genetics) to require that every table and figure links out to original data and software”
5
Infonomics, Citation and the NIH DDI
[Figure from GSIM 1.1: Represented and Instance Variables]
6
Infonomics, Citation and the NIH DDI
GSIM has introduced the represented variable
It is akin to constructs and common data elements, whereas instance variables are actual measures
NIH DDI has suggested that we attach citations to constructs and datasets because “citations are a metric that can be used by NIH and the academic communities to assess scholarly activity”
Such “assessments” are central to infonomics, which seeks to find and define metrics that can be used in the valuation of information
7
Meet the Information Objects
8
The RDF Data Cube Vocabulary
[Diagram; labels: Dimension, Measure]
9
The RDF Data Cube Vocabulary
[Diagram; label: Slice]
10
Represented variables and infonomics
[Diagram; labels: Citation, has]
Citations, when associated with represented variables (CDEs), enable resource valuation or, again, infonomics
11
Represented variables and infonomics
[Diagram; label: Citation]
12
Represented variables and infonomics
A represented variable can have many citations
Citations conform to Dublin Core and cover the 15 Dublin Core elements as well as keywords from thesauri like MeSH
Using MeSH enables programmatic search for articles in PubMed
By comparing and compiling the citations, evaluations of represented variables and datasets can be undertaken in support of reviews by governance groups including NIH and OMB
[Diagram; label: Citation]
13
Represented variables and infonomics
In DDI, Dublin Core (DC) is expressed in XML
Natively, DC is specified in DC UML and DC RDF/XML
Using DC RDF/XML and a standard RDF query engine, it is possible to observe and analyze relationships between citations both within and between represented variables
Possible Partner: Metadata Technology
[Diagram; label: Citation]
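The kind of analysis this slide describes, relating citations within and between represented variables, can be sketched without a real RDF engine. The plain-Python example below stands in for one: it stores triples as tuples and finds citations shared by two represented variables. The `var:` and `cite:` identifiers and the helper names are hypothetical, and the use of `dcterms:isReferencedBy` to link a variable to a citation is an assumption for illustration, not something the slides specify.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object). In practice these would
# be parsed from DC RDF/XML and queried with an RDF engine.
TRIPLES = [
    ("var:systolic_bp", "dcterms:isReferencedBy", "cite:A"),
    ("var:systolic_bp", "dcterms:isReferencedBy", "cite:B"),
    ("var:bmi", "dcterms:isReferencedBy", "cite:B"),
    ("var:bmi", "dcterms:isReferencedBy", "cite:C"),
    ("cite:A", "dcterms:subject", "mesh:Hypertension"),
    ("cite:B", "dcterms:subject", "mesh:Obesity"),
]


def citations_by_variable(triples):
    """Group citation identifiers under the represented variable they cite."""
    grouped = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "dcterms:isReferencedBy":
            grouped[subject].add(obj)
    return dict(grouped)


def shared_citations(triples, var_a, var_b):
    """Return the citations attached to both represented variables."""
    grouped = citations_by_variable(triples)
    return grouped.get(var_a, set()) & grouped.get(var_b, set())


def citation_counts(triples):
    """Count citations per represented variable, a crude valuation metric."""
    return {var: len(cites) for var, cites in citations_by_variable(triples).items()}
```

With a real RDF store, the same questions would be expressed as queries over the graph; the counting function hints at how citation tallies could feed the infonomics-style valuation discussed earlier.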
14
Represented variables and infonomics
[Diagram; label: Citation]
15
Represented variables and infonomics
The MeSH vocabulary is used for indexing journal article citations hosted by PubMed
PubMed hosts more than 23 million citations for biomedical literature from MEDLINE, life science journals, and online books
PubMed supports both human searchers at its portal and software agents by way of Entrez
PubMed indexes citations using both MeSH (Medical Subject Headings) headings and MeSH subheadings
[Diagram; label: Citation]
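As a rough illustration of how a software agent might use Entrez, the sketch below builds a search URL for the E-utilities `esearch` endpoint from a MeSH heading and an optional subheading. It only constructs the URL and makes no network call. The function name and the example heading are hypothetical, and the `heading/subheading` term syntax with the `[MeSH Terms]` field tag is assumed from PubMed's search conventions and should be checked against the current Entrez documentation.

```python
from urllib.parse import urlencode

# Base URL of the Entrez E-utilities search service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(mesh_heading, subheading=None, retmax=20):
    """Build an esearch URL that looks up PubMed citations by MeSH heading.

    mesh_heading and subheading are plain strings; retmax caps the number
    of returned PubMed identifiers.
    """
    topic = f"{mesh_heading}/{subheading}" if subheading else mesh_heading
    term = f'"{topic}"[MeSH Terms]'
    params = {
        "db": "pubmed",      # search the PubMed database
        "term": term,        # the MeSH-qualified query
        "retmax": retmax,    # maximum number of identifiers to return
        "retmode": "json",   # ask for a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_pubmed_search_url("Hypertension", "therapy", retmax=5))
```

An agent would fetch this URL, read the returned identifiers, and attach any new citations to the relevant represented variable or dataset.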
16
DISCO Completeness
17
In DDI 4 might we want to revisit the DISCO discovery view?
18
In DDI 4 might we want to revisit the DISCO discovery view?
Including more elements from the RDF Data Cube Vocabulary (the qb namespace in DISCO) can lend additional specificity to search:
– In which studies was a specific analysis undertaken and reported?
– How comparable were the micro data that went into these analyses?
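To make the first question concrete, here is a minimal sketch of a search over cube descriptions: given a measure and a set of dimensions, it returns the studies that reported that analysis. The `CubeDescription` class, the study names, and the dimension and measure labels are all hypothetical; a DISCO-based implementation would read the corresponding `qb:` dimension and measure properties from RDF rather than from Python objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CubeDescription:
    """A simplified stand-in for a Data Cube structure attached to a study."""
    study: str
    dimensions: frozenset
    measures: frozenset


# Hypothetical aggregate datasets described by their dimensions and measures.
CUBES = [
    CubeDescription("Study A", frozenset({"age_group", "sex"}), frozenset({"mean_bmi"})),
    CubeDescription("Study B", frozenset({"age_group", "region"}), frozenset({"mean_bmi"})),
    CubeDescription("Study C", frozenset({"age_group", "sex"}), frozenset({"smoking_rate"})),
]


def find_studies(cubes, measure, required_dimensions):
    """Return the studies reporting `measure` broken down by all required dimensions."""
    required = frozenset(required_dimensions)
    return sorted(
        {cube.study for cube in cubes
         if measure in cube.measures and required <= cube.dimensions}
    )
```

For example, `find_studies(CUBES, "mean_bmi", ["age_group"])` selects the studies that tabulated mean BMI by age group, which could then serve as context for locating the underlying micro data.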
19
In DDI 4 might we want to revisit the DISCO discovery view?
Including GSIM represented variables and connecting elements from the Dublin Core RDF Citation Vocabulary to represented variables and datasets opens the way to an ecosystem of crawlers:
– Software agents can search citation databases for new publications
– Other data resources might be linked in. They might include “existing domain-specific repositories, institutional data repositories, or other resources including commercial clouds”
20
Could there be more than one DISCO?
Dublin Core motivates its Dublin Core Application Profiles (DCAP) with this introduction:
– When it comes to metadata, one size does not fit all. In fact, one size often does not even fit many. The metadata needs of particular communities and applications are very diverse. The result is a great proliferation of metadata formats, even across applications that have metadata needs in common.
21
Could there be more than one DISCO?
– The Dublin Core Metadata Initiative has addressed this by providing a framework for designing a Dublin Core Application Profile (DCAP). A DCAP defines metadata records which meet specific application needs while providing semantic interoperability with other applications on the basis of globally defined vocabularies and models.
In line with this vision, Dublin Core introduces the Singapore Framework in its DCAP guidelines document
22
Could there be more than one DISCO?
[Figure: The Singapore Framework]
23
Could there be more than one DISCO?
The Singapore Framework is a standard, not an information model
Perhaps the middle layer, “Domain standards”, might be analogous to a DDI 4 Discovery package
Then, in place of DISCO, there might be multiple application profiles or, again, views
In this context, imagine that DDI 4 might publish at least two such “official” ones
If you had your druthers, what would these two profiles be?
[Figure: The Singapore Framework]