<Panel: The Art & Science of Data Visualization>

1 <Panel: The Art & Science of Data Visualization>
First they have to find it: Getting Government Data Discovered and Used. Adapted from: John S. Erickson, Ph.D., Tetherless World Constellation, Rensselaer Polytechnic Institute, Troy, New York, USA. #TWCRPI

2 Open Government Data Around the World
Starting with efforts in the US and UK, governments around the world have recognized the need to publish their critical data. Chart: percent of total collection (from 1M+ datasets).

3 Diverse Approaches to Open Gov't Data
Government data initiatives have taken many forms. GovData portals vary widely in how they help users discover and use relevant datasets. Chart: percent of total catalogs (from 192 catalogs).

4 Federated Discovery of Government Data
Stakeholders have seen the need for federated discovery across catalogs, especially from within major search engines including Bing, Google, Yahoo! and Yandex.

5 Government Data in the linked open data cloud
Government data currently makes up over half of the cloud by size (~17B triples), with tens of thousands of links to other data (within and without).

6 Linked Data is Not Enough...
Publishing open government data (OGD) as Linked Data is not enough. For OGD to be useful, datasets must be published using metadata, markup standards and presentation that aid discovery and use.

8 Dataset Metadata for Discovery and Use
Recent work at TWC RPI demonstrates the value of applying emerging standards for uniformly describing government datasets and catalogs.

9 International Open Government Dataset Search
TWC's IOGDS application is an aggregated catalog of more than 1M datasets from over 192 dataset catalogs from governments at every level around the world. See:

10 International Open Government Dataset Search
IOGDS anticipates the W3C DCAT RDF vocabulary. It demonstrates what a comprehensive federated catalog based on DCAT and an aggregation API might look like.

11 International Open Government Dataset Search
IOGDS is a multi-year effort based on downloading, scraping or accessing APIs, converting metadata to a proto-DCAT model, and publishing via endpoint and download.
Diagram: IOGDS Workflow. Labels: Catalogs, API, ad hoc code, Download, IOGDS CSV, csv2rdf4lod automation, Web, per-site scraper code.
See:
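
The conversion step in this workflow (catalog metadata in CSV form turned into a DCAT-style RDF model) can be illustrated with a small sketch. This is not the csv2rdf4lod automation named on the slide; it is a stand-in using only the Python standard library. The column names (`id`, `title`, `homepage`), the `example.org` URIs and the choice of `dcat:landingPage` are assumptions made for illustration.

```python
import csv
import io

# Namespaces of the W3C DCAT and Dublin Core Terms vocabularies.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(text):
    """Escape a string as an N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def csv_to_ntriples(csv_text, base_uri):
    """Turn catalog rows (id, title, homepage) into DCAT-style N-Triples lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{base_uri}{row['id']}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{DCAT}Dataset> .")
        lines.append(f"{subject} <{DCT}title> {literal(row['title'])} .")
        if row.get("homepage"):  # homepage is optional in this hypothetical layout
            lines.append(f"{subject} <{DCAT}landingPage> <{row['homepage']}> .")
    return lines


# Hypothetical sample input; real catalog exports will have different columns.
SAMPLE_CSV = '''id,title,homepage
42,Air Quality 2011,http://example.org/air
43,"Budget ""Draft""",
'''

if __name__ == "__main__":
    for line in csv_to_ntriples(SAMPLE_CSV, "http://example.org/dataset/"):
        print(line)
```

Emitting N-Triples keeps the sketch dependency-free; a production pipeline would more likely use an RDF library or a dedicated converter.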

12 Schema.org: Semantic Markup for Discovery
TWC RPI has published dataset listings based on IOGDS using emerging microdata standards, especially the schema.org model endorsed by Bing, Google, Yahoo!, Yandex...

13 Schema.org datasets extension
TWC RPI's schema.org dataset extension will enable government dataset catalogs to be parsed and indexed more easily by the major search engines, which will help users find relevant datasets! TWC's dataset extension entered public discussion in June 2012.
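
As a rough illustration of what such markup looks like, the sketch below renders one dataset listing as HTML with schema.org microdata. It uses the type and property names found in schema.org as published today (`Dataset`, `DataCatalog`, `includedInDataCatalog`), which may differ from the 2012 draft discussed in these slides. The function name and the sample values are hypothetical.

```python
from html import escape


def dataset_microdata(name, description, url, catalog_name):
    """Render one dataset listing as HTML annotated with schema.org microdata."""
    return (
        '<div itemscope itemtype="http://schema.org/Dataset">\n'
        f'  <a itemprop="url" href="{escape(url, quote=True)}">'
        f'<span itemprop="name">{escape(name)}</span></a>\n'
        f'  <p itemprop="description">{escape(description)}</p>\n'
        # The nested item states which catalog the dataset belongs to.
        '  <div itemprop="includedInDataCatalog" itemscope '
        'itemtype="http://schema.org/DataCatalog">\n'
        f'    <span itemprop="name">{escape(catalog_name)}</span>\n'
        '  </div>\n'
        '</div>'
    )


if __name__ == "__main__":
    print(dataset_microdata(
        "Air Quality 2011",
        "Hourly readings from monitoring stations.",
        "http://example.org/dataset/42",
        "Example City Open Data",
    ))
```

A crawler that understands microdata can read the dataset name, description, URL and parent catalog from this fragment without any site-specific scraping.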

14 Schema.org datasets extension
The schema.org datasets extension enables relevant datasets to be more easily discovered by a range of stakeholders including researchers, data journalists, bloggers and developers.

15 Schema.org datasets extension
“...we've reviewed the current datasets schema proposal in draft, and we are comfortable with the current state of things... At this point, if the group would solidify on the dataset proposal, then Data.gov would support and use it.” -- Chris Musialek

16 CKAN Data Catalog Scheme & Protocol
API-based catalog federation is also possible. CKAN announced a DCAT-based query/federation API, which enables OAI-PMH-like harvesting and more.

17 Dataset extension to schema.org

18 Demo/ links OrgProposals Good introduction (longer/ with more context): markup-using-schemaorg

19 Examples of current schema.org results

20 To do… Get Google, Bing, Yahoo, … to crawl these pages. It might look like this: om/publicdata/directory

21 From Jim Hendler: Google is now building custom search engines that will pull down schema.org. Dan Brickley is working on one from the Dataset schema, not yet public. There's also an open govt data search – not much in it, but looks nice – it's at

22 Retrieve all the logd datasets:
PREFIX dgtwc: <
PREFIX conv: <
PREFIX void: <
PREFIX dcterms: <
SELECT DISTINCT ?dataset ?catalog ?catalog_id ?title ?desc ?country ?homepage ?agency_id ?contributor_id
WHERE {
  ?dataset a conv:CatalogedDataset .
  ?dataset void:inDataset ?catalog .
  ?catalog dcterms:identifier ?catalog_id .
  ?dataset < ?title .
  ?dataset dcterms:description ?desc .
  OPTIONAL { ?dataset dgtwc:catalog_country ?country . }
  ?dataset < ?homepage .
  ?dataset dgtwc:agency ?agency .
  ?agency dcterms:identifier ?agency_id .
  ?dataset < ?contributor .
  ?contributor dcterms:identifier ?contributor_id .
  #?dataset dgtwc:catalog_country < .
}
Courtesy: Josh Shinavier (RPI/TWC)
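
A query like this is normally sent to a SPARQL endpoint over HTTP. The sketch below builds such a request with the Python standard library, following the SPARQL protocol's GET binding (the query goes in a `query` parameter). The endpoint URL is a placeholder, and the query is a reduced version that uses only the two vocabularies whose namespaces are well known (VoID and Dublin Core Terms); the `conv:` and `dgtwc:` prefixes are left out because their namespace URIs are not given in the transcript.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Reduced, illustrative query: datasets and the identifier of their catalog.
QUERY = """
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT DISTINCT ?dataset ?catalog_id WHERE {
  ?dataset void:inDataset ?catalog .
  ?catalog dcterms:identifier ?catalog_id .
} LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Build (but do not send) a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    # Placeholder endpoint; substitute the real one before sending.
    request = build_sparql_request("http://example.org/sparql", QUERY)
    print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; with the JSON results format, the rows are found under `results` and then `bindings` in the decoded response.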

23 A large number of datasets:
n pets?url= et_extension&view=

24

25 Latest from Josh: Datasets-as-Linked-Data demo. The RDFa in the pages is not only correct w.r.t. schema.org but is also presented in such a way that an RDFa-aware Linked Data crawler can hop from datasets to catalogs, back again, into DBpedia, etc. while gathering the RDFa as linked RDF. Since we now have Datasets-ish RDFa markup in the main IOGDS dataset pages (i.e. the pages which the URIs of the datasets redirect to), we're pretty close to a completely integrated demo. What remains: (1) the current markup has some problems. We need to fix those; (2) we need markup for catalogs as well as datasets…
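
The "hop" an RDFa-aware crawler makes can be sketched as follows: read a page, collect the links that carry an RDFa `property` or `rel` attribute, and treat their targets as the next pages to visit. This is a deliberately simplified stand-in, not a conforming RDFa processor and not the code behind the demo; the sample page and its URIs are invented for illustration.

```python
from html.parser import HTMLParser


class RDFaLinkCollector(HTMLParser):
    """Collect (predicate, target) pairs from elements carrying RDFa link attributes."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        predicate = attr.get("rel") or attr.get("property")
        target = attr.get("resource") or attr.get("href")
        # Keep only elements that name both a predicate and a link target.
        if predicate and target:
            self.links.append((predicate, target))


def next_hops(page_html):
    """Return the annotated outgoing links a crawler could follow from this page."""
    collector = RDFaLinkCollector()
    collector.feed(page_html)
    return collector.links


# Hypothetical dataset page: links to its catalog and to a DBpedia resource.
SAMPLE_PAGE = '''
<div vocab="http://schema.org/" typeof="Dataset" resource="http://example.org/dataset/42">
  <span property="name">Air Quality 2011</span>
  <a property="includedInDataCatalog" href="http://example.org/catalog/7">Catalog</a>
  <a property="about" href="http://dbpedia.org/resource/Air_pollution">Topic</a>
</div>
'''

if __name__ == "__main__":
    for predicate, target in next_hops(SAMPLE_PAGE):
        print(predicate, target)
```

Note that this simplification would also pick up ordinary `rel` links such as stylesheets; a real crawler would resolve prefixes and vocabularies and filter on the resulting predicates.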

26 Needed (1) and (2): To fix (1), we need to make changes to the LODSPeaKr templates that automatically generate those pages, to make them compliant with the model Josh developed. To fix (2), we'll work with Alvaro (Graves) to create LODSPeaKr-based automation to generate catalog pages in an efficient way. (2) presents more of a challenge than (1), since the IOGDS implementation of dataset details pages is already mostly correct. Still need Dan B. to assist with getting them found…

27 What we need: Willingness to adopt the dataset schema extension – we need lots of datasets to start showing up. We (TWC) will be pushing out some tools, more demos and how-tos, very soon. Wanna play?

