<Panel: The Art & Science of Data Visualization>

Presentation transcript:

<Panel: The Art & Science of Data Visualization>
First they have to find it: Getting Government Data Discovered and Used
Adapted from: John S. Erickson, Ph.D., Tetherless World Constellation, Rensselaer Polytechnic Institute, Troy, New York, USA
Twitter: @olyerickson #TWCRPI

Open Government Data Around the World
Starting with efforts in the US and UK, governments around the world have recognized the need to publish their critical data.
[Chart: percent of total collection (from 1M+ datasets)]

Diverse Approaches to Open Gov't Data
Government data initiatives have taken many forms.
GovData portals vary widely in how they help users discover and use relevant datasets.
[Chart: percent of total catalogs (from 192 catalogs)]

Federated Discovery of Government Data
Stakeholders have seen the need for federated discovery across catalogs, especially from within major search engines including Bing, Google, Yahoo! and Yandex.

Government Data in the Linked Open Data Cloud
Government data currently makes up over half of the cloud by size (~17B triples), with tens of thousands of links to other data (within and without).
http://linkeddata.org/

Linked Data is Not Enough...
Publishing open government data as Linked Data is not enough.
For OGD to be useful, datasets must be published using metadata, markup standards and presentation that aid discovery and use.

Dataset Metadata for Discovery and Use
Recent work at TWC RPI demonstrates the value of applying emerging standards for uniformly describing government datasets and catalogs.

International Open Government Dataset Search
TWC's IOGDS application is an aggregated catalog of more than 1M datasets from over 192 dataset catalogs from governments at every level around the world.
See: http://logd.tw.rpi.edu

International Open Government Dataset Search
Anticipates the W3C DCAT RDF vocabulary.
Demonstrates what a comprehensive federated catalog based on DCAT and an aggregation API might look like.

International Open Government Dataset Search
IOGDS is a multi-year effort based on downloading, scraping or accessing APIs, converting metadata to a proto-DCAT model, and publishing via endpoint and download.
[Figure: IOGDS Workflow. Labels: Catalogs; API (ad hoc code); Download; Web (per-site scraper code); IOGDS CSV; csv2rdf4lod automation]
See: http://logd.tw.rpi.edu
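
The "convert metadata to a proto-DCAT model" step can be illustrated with a minimal sketch. This is not the csv2rdf4lod tooling named on the slide; it is a standard-library stand-in, and the CSV column names (`id`, `title`, `homepage`) and the base URI are assumptions made only for the example.

```python
import csv
import io
from urllib.parse import quote

# Vocabulary namespaces (W3C DCAT and Dublin Core terms).
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(text):
    """Escape characters that are not allowed raw in an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def row_to_ntriples(row, base_uri):
    """Turn one catalog row into N-Triples lines (hypothetical columns)."""
    subject = "<" + base_uri + quote(row["id"], safe="") + ">"
    title = escape_literal(row["title"])
    lines = [
        f"{subject} <{RDF_TYPE}> <{DCAT}Dataset> .",
        f'{subject} <{DCT}title> "{title}" .',
    ]
    # The homepage column is optional in this sketch.
    if row.get("homepage"):
        lines.append(f"{subject} <{DCAT}landingPage> <{row['homepage']}> .")
    return lines


def csv_to_ntriples(csv_text, base_uri):
    """Convert a whole CSV catalog export into a flat list of triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        triples.extend(row_to_ntriples(row, base_uri))
    return triples
```

Calling `csv_to_ntriples(text, "http://example.org/dataset/")` on a downloaded or scraped CSV yields lines that could be loaded into a triple store; a real pipeline would also validate URIs and carry over more fields.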

Schema.org: Semantic Markup for Discovery
TWC RPI has published dataset listings based on IOGDS using emerging microdata standards, especially the schema.org model endorsed by Bing, Google, Yahoo!, Yandex...

Schema.org datasets extension
TWC RPI's schema.org dataset extension will enable government dataset catalogs to more easily be parsed and indexed by the major search engines...
...which will help users find relevant datasets!
TWC's dataset extension entered public discussion in June 2012.
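
As a rough illustration of what such markup might look like, the sketch below renders one dataset listing as schema.org microdata. It uses only the generic `name`, `description` and `url` properties; the function name and the exact HTML layout are assumptions for the example, not the markup TWC actually published.

```python
from html import escape


def dataset_microdata(name, description, url):
    """Render a dataset listing as an HTML fragment with schema.org microdata."""
    return (
        '<div itemscope itemtype="http://schema.org/Dataset">\n'
        f'  <a itemprop="url" href="{escape(url, quote=True)}">'
        f'<span itemprop="name">{escape(name)}</span></a>\n'
        f'  <p itemprop="description">{escape(description)}</p>\n'
        "</div>"
    )


if __name__ == "__main__":
    # Hypothetical dataset values, used only to show the output shape.
    print(dataset_microdata(
        "Air Quality Measurements",
        "Hourly readings from monitoring stations.",
        "http://example.org/dataset/air-quality",
    ))
```

A crawler that understands microdata can read the `itemtype` and `itemprop` attributes directly from the catalog page, so no separate metadata feed is needed for discovery.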

Schema.org datasets extension
The schema.org datasets extension enables relevant datasets to be more easily discovered by a range of stakeholders including researchers, data journalists, bloggers and developers.

Schema.org datasets extension
“...we've reviewed the current datasets schema proposal in draft, and we are comfortable with the current state of things...”
“...At this point, if the group would solidify on the dataset proposal, then Data.gov would support and use it.”
---Chris Musialek

CKAN Data Catalog Scheme & Protocol
API-based catalog federation is also possible.
CKAN announced a DCAT-based query/federation API, which enables OAI-PMH-like harvesting and more.

Dataset extension to schema.org

Demo/links
http://www.w3.org/wiki/WebSchemas/Datasets
http://www.w3.org/wiki/WebSchemas/SchemaDotOrgProposals
Good introduction (longer, with more context): http://www.slideshare.net/joshsh/semantic-markup-using-schemaorg

Examples of current schema.org results
http://schema-creator.org/event.php
http://schema-creator.org/product.php

To do…
Get Google, Bing, Yahoo, … to crawl these pages.
It might look like this: http://www.google.com/publicdata/directory

From Jim Hendler:
Google is now building custom search engines that will pull down schema.org.
Dan Brickley is working on one from the Dataset schema; it is not yet public.
There's also an open govt data search (not much in it, but it looks nice) at http://www.google.com/publicdata/directory

Retrieve all the LOGD datasets:
PREFIX dgtwc: <http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#>
PREFIX conv: <http://purl.org/twc/vocab/conversion/>
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT DISTINCT ?dataset ?catalog ?catalog_id ?title ?desc ?country ?homepage ?agency_id ?contributor_id
WHERE {
  ?dataset a conv:CatalogedDataset .
  ?dataset void:inDataset ?catalog .
  ?catalog dcterms:identifier ?catalog_id .
  ?dataset <http://purl.org/dc/terms/title> ?title .
  ?dataset dcterms:description ?desc .
  OPTIONAL { ?dataset dgtwc:catalog_country ?country . }
  ?dataset <http://xmlns.com/foaf/0.1/homepage> ?homepage .
  ?dataset dgtwc:agency ?agency .
  ?agency dcterms:identifier ?agency_id .
  ?dataset <http://purl.org/dc/terms/contributor> ?contributor .
  ?contributor dcterms:identifier ?contributor_id .
  #?dataset dgtwc:catalog_country <http://dbpedia.org/resource/United_States> .
}
Courtesy: Josh Shinavier (RPI/TWC)
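
A query like this one is typically sent to a SPARQL endpoint over HTTP. The sketch below shows one way that might be done with the Python standard library, assuming the endpoint follows the SPARQL Protocol and can return JSON results. The slides do not give the endpoint address, so any URL passed in is the caller's assumption, and the helper names are made up for the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_sparql_request(endpoint, query):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    # Assumes the endpoint URL has no query string of its own.
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(results):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def run_query(endpoint, query):
    """Send the query and return the rows; needs network access."""
    with urlopen(build_sparql_request(endpoint, query)) as response:
        return bindings_to_rows(json.load(response))
```

Variables bound only inside an OPTIONAL block, such as `?country` above, are simply absent from rows where they did not match, so callers should read them with `row.get("country")`.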

A large number of datasets:
http://logd.tw.rpi.edu/schemaorg_dataset_extension
http://www.google.com/webmasters/tools/richsnippets?url=http://logd.tw.rpi.edu/schemaorg_dataset_extension&view=

http://logd.tw.rpi.edu/page/international_dataset_catalog_search

Latest from Josh: Datasets-as-Linked-Data demo. The RDFa in the pages is not only correct w.r.t. schema.org but is also presented in such a way that an RDFa-aware Linked Data crawler can hop from datasets to catalogs, back again, into DBpedia, etc. while gathering the RDFa as linked RDF. Since we now have Datasets-ish RDFa markup in the main IOGDS dataset pages (i.e. the pages which the URIs of the datasets redirect to), we're pretty close to a completely integrated demo. What remains: (1) the current markup has some problems. We need to fix those; (2) we need markup for catalogs as well as datasets…
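
The "hop from datasets to catalogs" behavior can be sketched as follows. This is not a real RDFa parser and not the crawler used in the demo: it only collects elements that carry both a predicate attribute (`property` or `rel`) and a target (`resource` or `href`), which is enough to show how a crawler could find the next page to visit. The class and function names, and the `catalog` property in the usage note, are illustrative assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RdfaLinkCollector(HTMLParser):
    """Collect (predicate, absolute target URL) pairs from RDFa-style attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        predicate = attributes.get("property") or attributes.get("rel")
        target = attributes.get("resource") or attributes.get("href")
        # Literal-valued properties have no target and are skipped here.
        if predicate and target:
            self.links.append((predicate, urljoin(self.base_url, target)))


def extract_links(html_text, base_url):
    """Return the outgoing RDFa-style links found in one page."""
    collector = RdfaLinkCollector(base_url)
    collector.feed(html_text)
    collector.close()
    return collector.links
```

Given a dataset page containing `<a property="catalog" href="/catalog/us">`, `extract_links` returns the absolute catalog URL, which a crawler could fetch next and process the same way. Plain `rel` links such as stylesheets are also picked up, so a real crawler would filter by vocabulary.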

Needed (1) and (2): To fix (1), we need to change the LODSPeaKr templates that automatically generate those pages, to make them compliant with the model Josh developed. To fix (2), we'll work with Alvaro (Graves) to create LODSPeaKr-based automation that generates catalog pages efficiently. (2) presents more of a challenge than (1), since the IOGDS implementation of dataset details pages is already mostly correct. Still need Dan B. to assist with getting them found…

What we need:
Willingness to adopt the dataset schema extension: we need lots of datasets to start showing up.
We (TWC) will be pushing out some tools, more demos and how-tos, very soon.
Wanna play? http://wiki.esipfed.org/index.php/DatasetSchema