Connect UNAVCO, a VIVO for a Scientific Community. M. Benjamin Gross (1), Linda R. Rowan (1), Matthew Mayernik (2), Michael D. Daniels (2), Huda Khan (3), and Dean B. Krafft (3). (1) UNAVCO, Boulder, CO; (2) National Center for Atmospheric Research, Boulder, CO; (3) Cornell University, Ithaca, NY
About UNAVCO: UNAVCO is a non-profit, university-governed consortium that facilitates geoscience research and education using geodesy. Geodesy: the study of Earth’s shape, gravity field, and rotation. Could also show video if there’s audio: https://www.youtube.com/watch?v=yxLMk120vMU
About EarthCollab: A National Science Foundation EarthCube building block. Partnership between UNAVCO, NCAR, and Cornell University. EarthCollab goals: support scientific collaboration, and increase the discoverability and usability of scientific resources, via semantic and linked data technologies.
Connect UNAVCO (connect.unavco.org): A VIVO for a distributed community in a specific domain, which as such can be, and should be, customized for that domain. Customizations: ontology, controlled research vocabulary, geospatial info, faceted search, PIDs, data facility. Data facility: datasets, stations, external collaborators (consortium).
User Engagement: EarthCollab survey conducted in 2014 and 2015 about how researchers find and share research.
Requirements and Challenges: Site must be easily searchable (survey takers indicated they use search for most tasks); connect data with people, publications, and tools in a discoverable way and point the user toward the data source; use unique IDs whenever possible, e.g. DOIs; minimize duplication of data by crosslinking VIVO instances; extend the VIVO ontology to capture UNAVCO concepts.
Requirements and Challenges, ontology requirements: Describe Earth observations (ships, networks, platforms, temporal and spatial aspects); capture relationships between the UNAVCO facility and member universities and their representatives.
Local ontology extensions
Vocabulary comparison
Ingest Process. Challenges: distributed and variable data stores; no institutional subscription to a publication indexing service; publications authored by external collaborators, not employees.
Connect UNAVCO stats (http://connect.unavco.org): Events: mostly scientific conferences. Locations: includes GPS/GNSS stations. ~555,000 asserted triples, running v1.9.
Connect UNAVCO Research Terms: Community members and employees can select from a list of 120 research and expertise terms. Screenshot labels: Software Engineering; Expertise; Research area.
Connect UNAVCO Research Terms: Limited vocabulary → longer lists of people per term
Geospatial info
Facets in Connect UNAVCO Find member reps Find people with expertise
Facets in Connect UNAVCO Filter publications by publication year, sort by Altmetric score
Facets in Connect UNAVCO: Elasticsearch 1: https://www.elastic.co/ Facetview2: https://github.com/CottageLabs/facetview2 Ingest scripts and themes: https://github.com/gneissone/connect-unavco-elasticsearch https://github.com/tetherless-world/dco-elasticsearch https://github.com/cu-boulder/facetview2 Workflow: Query VIVO → Map to JSON → load to Elasticsearch
Facets in Connect UNAVCO: Query VIVO → Map to JSON → load to Elasticsearch. Use the VIVO SPARQL API to pull out the necessary info: station name, location, PIs, retirement date, related datasets, image thumbnail. CONSTRUCT queries: better performance than DESCRIBE, but more complicated to write.
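The "Query VIVO" step might be sketched roughly as follows. This is only an illustration, assuming a VIVO instance with the SPARQL query API enabled at `/api/sparqlQuery` and an account authorized to use it. The `ex:` namespace, the class and property names, and the helper function names are hypothetical placeholders, not the actual Connect UNAVCO ontology terms or ingest scripts.

```python
import urllib.parse
import urllib.request

# Hypothetical CONSTRUCT query: the ex: terms are placeholders, not the
# real Connect UNAVCO ontology. CONSTRUCT returns only the listed triples.
STATION_QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.org/ontology#>
CONSTRUCT {
  ?station rdfs:label ?name .
  ?station ex:hasPI ?pi .
}
WHERE {
  ?station a ex:Station ;
           rdfs:label ?name .
  OPTIONAL { ?station ex:hasPI ?pi . }
}
"""


def build_sparql_request(base_url, email, password, query):
    """Build (but do not send) a POST request for VIVO's SPARQL query API."""
    body = urllib.parse.urlencode(
        {"email": email, "password": password, "query": query}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/sparqlQuery",
        data=body,
        headers={"Accept": "text/turtle"},  # ask for the graph as Turtle
        method="POST",
    )


def run_construct(request):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Building the request separately from sending it keeps the query logic checkable without a running VIVO instance.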
Facets in Connect UNAVCO: Query VIVO → Map to JSON → load to Elasticsearch. Create a JSON file (data.json). Optionally, create a schema file for Elasticsearch that defines each field: its data type and the type of tokenizing that Elasticsearch’s analyzers should do on it.
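A minimal sketch of the "Map to JSON" step, assuming the query results have already been turned into one Python dict per station. The field names, the `station` document type, and the `uri` key are hypothetical. The mapping uses Elasticsearch 1.x conventions (`string` fields, `not_analyzed` to skip tokenizing), which differ in later versions.

```python
import json

# Hypothetical schema (mapping) for an Elasticsearch 1.x index.
# "not_analyzed" keeps a value as a single token, which suits facets.
STATION_MAPPING = {
    "mappings": {
        "station": {
            "properties": {
                "name": {"type": "string"},
                "pi": {"type": "string", "index": "not_analyzed"},
                "retired": {"type": "date"},
                "location": {"type": "geo_point"},
            }
        }
    }
}


def to_bulk_lines(records, index="unavco", doc_type="station"):
    """Serialize records in the bulk format: an action line, then the document."""
    lines = []
    for record in records:
        action = {
            "index": {"_index": index, "_type": doc_type, "_id": record["uri"]}
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(record))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    sample = [{"uri": "http://example.org/station/EXMP", "name": "EXMP"}]
    with open("data.json", "w", encoding="utf-8") as handle:
        handle.write(to_bulk_lines(sample))
```

Using the VIVO URI as the document `_id` means that re-running the ingest overwrites existing documents instead of duplicating them.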
Facets in Connect UNAVCO: Query VIVO → Map to JSON → load to Elasticsearch. Data is loaded to Elasticsearch via the load API: $ curl -XPOST 'http://localhost:9200/unavco/_bulk' --data-binary @data.json For more on Elasticsearch and facetview2 in VIVO…
Altmetric scores: VIVO displays Altmetric badges on demand, but we need the Altmetric score in the database for sorting. Get the score by DOI using the API.
Altmetric scores: Fetch scores for 5,400 publications daily. Can buy a commercial license or get a free license for academic research projects.
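Looking up a score by DOI could look something like the sketch below. It assumes the public Altmetric details endpoint (`https://api.altmetric.com/v1/doi/<doi>`), a `score` field in the JSON response, and a 404 response for DOIs Altmetric has no data for. The function names and the optional `api_key` handling are illustrative, not the project's actual script.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

ALTMETRIC_BASE = "https://api.altmetric.com/v1/doi/"


def altmetric_url(doi, api_key=None):
    """Build the lookup URL for one DOI, optionally with a license key."""
    url = ALTMETRIC_BASE + urllib.parse.quote(doi, safe="/")
    if api_key:
        url += "?" + urllib.parse.urlencode({"key": api_key})
    return url


def parse_score(payload):
    """Extract the score from a JSON response body, or None if absent."""
    return json.loads(payload).get("score")


def fetch_score(doi, api_key=None):
    """Return the Altmetric score for a DOI, or None if none is on record."""
    try:
        with urllib.request.urlopen(altmetric_url(doi, api_key)) as response:
            return parse_score(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 404:  # assumed to mean "no data for this DOI"
            return None
        raise
```

A daily job would loop over the DOIs stored in VIVO, call `fetch_score` for each while respecting the API rate limits, and write the result back so the search index can sort on it.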
Future Work: Integrate crosslinking work; Elasticsearch/facetview geospatial capabilities; refine and enhance faceted browsing; survey community for dataset and publication connections.
Other EarthCollab presentations at VIVO 2016: Thursday, 5pm, Colorado Ballroom A-D: EOL Arctic Data Connects - Don Stott, John Allison, and C. Brooks Snyder; Friday, 11am, Colorado Ballroom G: Using VIVO for Scientific Applications - Matthew Mayernik, Anne Wilson, and John Furfey; Friday, 3:30pm, Colorado Ballroom E-F: Extending VIVO Infrastructure to Support Linking Information between EarthCollab VIVO Instances - Huda Khan et al.
Thank you! connect.unavco.org git.io/vG9AJ earthcube.org/group/earthcollab Contact: Benjamin Gross mbgross@unavco.org orcid.org/0000-0002-7908-1987