Linked Open Data for INSPIRE: From 3 to 5 star geospatial data
Francisco J. Lopez-Pellicer, Aneta J. Florczyk, Javier Nogueras-Iso, Pedro R. Muro-Medrano and F. Javier Zarazaga-Soria
INSPIRE Conference 2011, Edinburgh, July 1, 2011
5 star Linked Data?
Sir Tim Berners-Lee (2010): “This year, in order to encourage people - especially government data owners - along the road to good linked data, I have developed this star rating system”
Purpose of this talk
  Introduce this rating system in our context
  We’ve come part way!
  Exemplified with datasets from Spain
  Present two 5 star Linked Data sites in Spain related to SDI nodes
  Present a 5 star Linked Data recipe for INSPIRE data
1 Star
Make your stuff available on the Web (whatever format) under an open license.
Example: National Reference Geographic Equipment
  Public data, attribution required
  Map of MTN50 cartographic grid
  Gazetteer
  MTN50 cartographic grid
  Boundaries
2 Star
Make it available as structured data (e.g., ESRI Shapefile instead of image portrayal of data).
  Map of MTN50 cartographic grid in PNG
3 Star
Use non-proprietary formats (e.g., WKT instead of ESRI Shapefile, CSV instead of Access).
  Gazetteer available only in Access format
  Boundaries available only in ESRI Shapefile format
  MTN50 cartographic grid available in WKT and ESRI Shapefile
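As an illustration of what a WKT encoding looks like, the following minimal sketch turns a rectangular cell into a WKT polygon. The function name `bbox_to_wkt` and the coordinates are assumptions made for this example; they are not taken from the MTN50 grid itself.

```python
def bbox_to_wkt(min_x, min_y, max_x, max_y):
    """Return a WKT polygon for a rectangular cell (hypothetical helper)."""
    # A WKT ring must be closed, so the first corner is repeated at the end.
    corners = [
        (min_x, min_y),
        (max_x, min_y),
        (max_x, max_y),
        (min_x, max_y),
        (min_x, min_y),
    ]
    ring = ", ".join(f"{x:g} {y:g}" for x, y in corners)
    return f"POLYGON (({ring}))"


# Made-up coordinates, for illustration only.
print(bbox_to_wkt(-1.0, 41.5, -0.5, 42.0))
```

The result is plain text, so it can be read without any proprietary software.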
4 Star
Use URIs to identify things, so that people can point at your stuff (native use of RDF is not required) (no way?)
  GeoLinkedData.es initiative makes available this data as RDF
  MTN50 grid not published as RDF
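A minimal sketch of the 4 star idea, assuming a made-up namespace: each thing gets a stable URI, and a couple of RDF statements about it are written as N-Triples. The base URI, the `GridSheet` class and the sheet identifier are hypothetical; this is not how GeoLinkedData.es models its data.

```python
from urllib.parse import quote

# Hypothetical namespace; a real publisher would use a domain it controls.
BASE = "http://example.org/resource/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def mint_uri(kind, local_id):
    """Build a URI from a resource kind and its local identifier."""
    return f"{BASE}{quote(kind, safe='')}/{quote(str(local_id), safe='')}"


def literal(text):
    """Quote a single-line string as an N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe_sheet(sheet_id, label):
    """Return N-Triples lines describing one grid sheet."""
    subject = mint_uri("grid", sheet_id)
    sheet_class = BASE + "def/GridSheet"  # hypothetical class URI
    return [
        f"<{subject}> <{RDF_TYPE}> <{sheet_class}> .",
        f"<{subject}> <{RDFS_LABEL}> {literal(label)} .",
    ]


# Made-up identifier and label, for illustration only.
for line in describe_sheet("sheet-0001", "Example sheet"):
    print(line)
```

Keeping the URI pattern this simple follows the "keep things simple" advice: the identifier scheme can be decided long before a full ontology exists.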
5 Star
Link your data to other data to provide context.
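One way to picture the fifth star is a set of `owl:sameAs` statements that connect local URIs to URIs in another dataset. The sketch below matches resources by exact label, which is a deliberately naive assumption; real link discovery usually needs fuzzier matching and manual review. The local URI is hypothetical and the external URI is used only as an illustration.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalise(label):
    """Lower-case a label and collapse whitespace before comparing."""
    return " ".join(label.casefold().split())


def link_by_label(local, external):
    """Return N-Triples owl:sameAs links for resources with equal labels.

    Both arguments map a URI to its label.
    """
    index = {normalise(label): uri for uri, label in external.items()}
    links = []
    for uri, label in local.items():
        target = index.get(normalise(label))
        if target is not None:
            links.append(f"<{uri}> <{OWL_SAME_AS}> <{target}> .")
    return links


# Hypothetical local gazetteer entry and an illustrative external resource.
local = {"http://example.org/resource/place/1": "Zaragoza"}
external = {"http://dbpedia.org/resource/Zaragoza": " zaragoza "}
for line in link_by_label(local, external):
    print(line)
```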
5 Star + Metadata
For government data, there should be metadata about the data itself (e.g. provenance, rights).
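A minimal sketch of such metadata, written as Turtle with the dcat and Dublin Core Terms vocabularies. The W3C dcat namespace is assumed here, every URI passed in is hypothetical, and attaching the license directly to the dataset is a simplification chosen for brevity.

```python
def dcat_dataset(uri, title, license_uri, publisher_uri):
    """Return a small Turtle description of one dataset."""
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .",
        "@prefix dct: <http://purl.org/dc/terms/> .",
        "",
        f"<{uri}> a dcat:Dataset ;",
        f'    dct:title "{safe_title}" ;',
        f"    dct:license <{license_uri}> ;",      # rights
        f"    dct:publisher <{publisher_uri}> .",  # a simple provenance hint
    ])


# All URIs below are made up for the example.
print(dcat_dataset(
    "http://example.org/dataset/gazetteer",
    "Example gazetteer",
    "http://example.org/license/open",
    "http://example.org/org/mapping-agency",
))
```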
5 Star + Metadata + Data registry
For government data, the metadata should be available from an official registry (e.g. catalogue).
5 Star + Metadata + Data registry + Infrastructure
For government data, the previous steps require a coordinated series of agreements on technology standards, institutional arrangements, and policies.
Summary
Level | Topics
On the Web | Datasets
★ | Open license (few or no limitations)
★★ | Structured data (easy to use)
★★★ | Non-proprietary format (no additional license restrictions)
★★★★ | URIs for things (work with the Web); RDF data model (open standard that works with the Web); co-existence with other approaches (don’t replace); keep things simple (don’t over-engineer RDF)
★★★★★ | RDF links (work with the Web); link to other approaches (don’t replace)
Metadata | Simple vocabularies such as dcat (keeping things simple)
Data registry | Official registry (transparency); SPARQL (open standard that works with the Web)
Infrastructure | Agreements, norms, funds, political support, …
5 Star Linked Data examples from Spain
Spanish NGI
Level | Topics
On the Web | Hydro topics of BCN200, BTN25, National Gazetteer
★ | Public data (Norm FOM/956/2008)
★★ | Vector and alphanumeric data
★★★ | Varies, including geospatial proprietary formats
★★★★ | Developed and maintained by academia; simple vocabulary
★★★★★ | Developed by academia; does not link to SDI resources
Metadata | Provenance vocabulary
Data registry | CKAN (not official)
Infrastructure | National Geographic Institute (IGN) support
Saragossa City Council
Level | Topics
On the Web | Assorted content, including data from local SDI node
★ | Law: local statute & Law 37/2007; Web: ColorIURIS License
★★ | Points of interest and events
★★★ | Varies, including geospatial proprietary formats
★★★★ | Developed by industry, maintained by city council; encoded as GeoRSS / RSS 1.0 = RDF; clashes with existing Web admin practices (!)
★★★★★ | Developed by industry, maintained by city council; does not link to SDI resources
Metadata | Dcat vocabulary
Data registry | Maintained by city council; SPARQL
Infrastructure | City council, normative and funding support
Conclusions
  Do’s and don’ts about Linked Data
  5 star recipe for INSPIRE data
Conclusions: do’s and don’ts about Linked Data
Do
  Publish valuable data
  Pick persistent URIs for naming things
  Dereference URIs to representations’ URLs
  Provide metadata stating license and provenance
  Additionally use RDF formats for data transmission
  Use SPARQL for data and metadata access
  Keep it simple
  Integrate with existing systems
Don’t
  Publish all your data
  Publish outdated data
  Publish without explicit license
  Hide data behind forms or applications
  Publish data only in proprietary formats
  Wait until you have a complete ontology
  Seek to replace existing systems
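The "dereference URIs to representations' URLs" item can be sketched as a small content negotiation routine: the URI of a thing is answered with a 303 redirect to the URL of a document in the format the client prefers. The suffix scheme, the media type table and the always-redirect fallback to HTML are assumptions made for this example, not a description of any of the sites presented.

```python
# Hypothetical mapping from media type to document suffix.
REPRESENTATIONS = {
    "text/html": ".html",
    "text/turtle": ".ttl",
    "application/rdf+xml": ".rdf",
}
DEFAULT = "text/html"


def parse_accept(header):
    """Return the media types of an Accept header, most preferred first."""
    prefs = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:  # q=0 means "not acceptable"
            prefs.append((-q, position, media_type))
    return [media_type for _, _, media_type in sorted(prefs)]


def dereference(resource_uri, accept_header="*/*"):
    """Return (status, location) redirecting to a representation URL."""
    for media_type in parse_accept(accept_header):
        if media_type in REPRESENTATIONS:
            return 303, resource_uri + REPRESENTATIONS[media_type]
    return 303, resource_uri + REPRESENTATIONS[DEFAULT]


print(dereference("http://example.org/resource/place/1", "text/turtle;q=0.9, text/html;q=0.5"))
```

A browser asking for HTML and a Linked Data client asking for Turtle then share the same persistent URI, which is what lets this approach sit alongside existing systems instead of replacing them.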
Conclusion: 5 star recipe for INSPIRE data
Levels | Topics
On the Web | INSPIRE datasets
★ | INSPIRE norms transposed to Web licenses
★★ | Vector data
★★★ | Representation in an open format [WE ARE ALREADY HERE!]
★★★★ | A simplified representation in RDF; integrate with existing SDI geoportal/technologies
★★★★★ | Link to SDI resources
Metadata | Dcat vocabulary (crosswalk from existing metadata)
Data registry | Enable the use of SPARQL to query existing SDI catalogue
Infrastructure | Linked Data as one of the agreements of the SDI [PREREQUISITE]
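To give an idea of what querying a catalogue with SPARQL could look like, the sketch below builds a query for datasets whose title contains a keyword and encodes it as a GET request URL in the style of the SPARQL protocol. The endpoint address is hypothetical, the catalogue is assumed to expose dcat descriptions, and no request is actually sent.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; an SDI catalogue would publish its own address.
ENDPOINT = "http://example.org/catalogue/sparql"


def build_query(keyword, limit=10):
    """Return a SPARQL query for datasets whose title contains a keyword."""
    escaped = keyword.lower().replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX dcat: <http://www.w3.org/ns/dcat#>\n"
        "PREFIX dct: <http://purl.org/dc/terms/>\n"
        "SELECT ?dataset ?title WHERE {\n"
        "  ?dataset a dcat:Dataset ;\n"
        "           dct:title ?title .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?title)), "{escaped}"))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def request_url(query):
    """Encode the query as the `query` parameter of a GET request."""
    return f"{ENDPOINT}?{urlencode({'query': query})}"


query = build_query("Hydro")
print(query)
print(request_url(query))
```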
Francisco J. Lopez-Pellicer
IAAA is currently a partner in the EuroGeoSource project