AIP Data Sharing investigations for GEOSS Summary of AIP-3 Data Sharing Guidelines Working Group George Percivall AIP Task Leader Open Geospatial Consortium Steven F. Browdy AIP DSGWG Leader OMS Tech, Inc.
Background on GEOSS Data Sharing Data Sharing Principles call for a “full and open exchange of data, metadata, and products shared within GEOSS, recognizing relevant international instruments and national policies and legislation.” “Full and open exchange” is further defined by: “data and information made available through the GEOSS are made accessible with minimal time delay and with as few restrictions as possible, on a nondiscriminatory basis, at minimum cost for no more than the cost of reproduction and distribution.” Implementation guidelines were generated by the DSTF taking into account the possibility of license conditions, since they are a reality.
GEO Task AR-09-01b Architecture Implementation Pilot AIP supports SBAs by developing new process and infrastructure components for GCI and the broader GEOSS architecture AIP-3 –Completed in February 2011 –Data Sharing Guidelines Working Group (DSGWG) “fostering interoperability arrangements and common practices for GEOSS”
DSGWG Scope in AIP-3 Primary focus on handling licensing. –Licensing framework suggested by Harlan Onsrud, et al.: “Towards Voluntary Interoperable Open Access Licenses for GEOSS” –Framework focused on open access licenses, and used the Creative Commons framework as a working foundation Secondary focus on user registration and login. –Many data sets and repositories require user authentication prior to data access –DSGWG looked at single-sign-on (SSO) as a federated solution and as a centralized solution
Licensing Framework for AIP-3 testing Creative Commons used as the framework. GEOSS SBA license special to GEOSS (needs to be written) Legal interoperability shown by license text tokens –The restrictiveness of the license conditions increases going from CC0 to CC BY-NC-GEOSS_SBA –OTHER requires direct negotiation and understanding between the data provider and data user prior to data access
Type of License (License Symbol)
I. Dedication to the Public Option (CC0, i.e. Creative Commons Zero)
II. Creative Commons Attribution Required License
   a. Attribution Required (CC BY)
   b. Non-Commercial Use Only (CC BY-NC)
III. Specialized GEOSS Open Access Licenses
   a. GEOSS Societal Benefit Areas Only (CC BY-NC-GEOSS_SBA); symbol: SB
IV. Non-Standard Open Access License; symbol: OTHER
Licensing Provisioning Licensing should be carried with the data –Necessary to satisfy the persistent nature of licenses and to facilitate mining of license information by the GCI or client applications –Necessary to provide programmatic action based on licenses and the ability to discover data associated with specific licenses For search and discovery, metadata for licensing should include: –Text identifier for the license associated with the data; e.g. “CC0” –Link to logo symbol used for the license, and link to actual license –In the case of attribution, the actual attribution information For AIP-3, the ISO metadata standard was used to demonstrate how metadata of the data provider’s data could be populated. –ISO used by the GEOSS Clearinghouse –ISO widely used in the geospatial community
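The licensing metadata fields listed on this slide can be sketched as a small record type. This is only an illustration: the class name, the field names, the example.org links, and the set of licenses treated as needing attribution are assumptions, not part of the AIP-3 results.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption for this sketch: every license in the framework except CC0
# carries an attribution requirement.
ATTRIBUTION_REQUIRED = {"CC BY", "CC BY-NC", "CC BY-NC-GEOSS_SBA"}


@dataclass(frozen=True)
class LicenseMetadata:
    """Licensing information carried with a data set (hypothetical layout)."""
    identifier: str                    # text identifier, e.g. "CC0"
    license_url: str                   # link to the actual license
    logo_url: str                      # link to the logo symbol
    attribution: Optional[str] = None  # attribution text, when required

    def __post_init__(self):
        # Reject records that need attribution but do not carry it.
        if self.identifier in ATTRIBUTION_REQUIRED and not self.attribution:
            raise ValueError(f"{self.identifier} needs attribution information")


# Hypothetical record; the links are placeholders, not real license locations.
record = LicenseMetadata(
    identifier="CC BY",
    license_url="https://example.org/licenses/cc-by",
    logo_url="https://example.org/logos/cc-by.png",
    attribution="Example Data Provider",
)
```

Carrying the identifier as plain text is what lets the GCI or a client application search and act on licenses without parsing the license itself.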
ISO Use MD_Constraints –MD_LegalConstraints accessConstraints (code = “license” or “otherRestrictions”) useConstraints (code = “license” or “otherRestrictions”) otherConstraints (free text) –Each attribute can occur with multiple instances CI_Citation –Richness to handle CC BY –Should be used for full citation information –MD_Constraints utilized to capture attribution as a CC BY license.
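As a rough illustration of the MD_LegalConstraints usage above, the sketch below builds such a fragment with Python's standard library and reads the free-text entries back. It assumes the ISO 19139 XML encoding (the `gmd` and `gco` namespaces); putting the license identifier and the attribution into `otherConstraints`, the function names, and the `codeList` value are assumptions of this sketch, not the layout used in AIP-3.

```python
import xml.etree.ElementTree as ET

# ISO 19139 namespaces (assumed encoding; the slides only say "ISO").
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)
NS = {"gmd": GMD, "gco": GCO}


def build_legal_constraints(license_id, attribution=None):
    """Build an MD_LegalConstraints element for one license (hypothetical helper)."""
    legal = ET.Element(f"{{{GMD}}}MD_LegalConstraints")
    # accessConstraints and useConstraints both carry the code "license".
    for tag in ("accessConstraints", "useConstraints"):
        holder = ET.SubElement(legal, f"{{{GMD}}}{tag}")
        code = ET.SubElement(holder, f"{{{GMD}}}MD_RestrictionCode")
        code.set("codeList", "#MD_RestrictionCode")  # placeholder code list reference
        code.set("codeListValue", "license")
        code.text = "license"
    # otherConstraints is free text and may occur multiple times.
    for text in (license_id, attribution):
        if text:
            other = ET.SubElement(legal, f"{{{GMD}}}otherConstraints")
            ET.SubElement(other, f"{{{GCO}}}CharacterString").text = text
    return legal


def read_other_constraints(legal):
    """Return the free-text otherConstraints values in document order."""
    path = "gmd:otherConstraints/gco:CharacterString"
    return [node.text for node in legal.findall(path, NS)]


fragment = build_legal_constraints("CC BY", "Example Data Provider")
xml_text = ET.tostring(fragment, encoding="unicode")
```

A client that harvests this fragment can recover the license token with `read_other_constraints(ET.fromstring(xml_text))`; full citation details would still belong in CI_Citation.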
ISO UML
Licensing Use Cases defined in AIP-3 (page 1 of 2) 1. Search and discovery by specifying license conditions (CC0, CC BY, etc.) at GEO Web Portal 2. Data access after search and discovery by observing license conditions at GEO Web Portal. –Displayed after non-license type search –Sort result set by license type 3. Programmatic (non-user interactive) data access using license conditions –Metadata carries license information –Program handles licensing logic automatically
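Use cases 1 to 3 can be sketched in a few lines, assuming each metadata record carries its license identifier as text. The record layout, the function names, and the mapping from each license to its conditions are assumptions made for illustration; they are inferred from the license names in the framework and are not a legal reading of the licenses.

```python
from typing import NamedTuple


class Conditions(NamedTuple):
    attribution: bool      # attribution must be given
    non_commercial: bool   # commercial use is excluded
    sba_only: bool         # use limited to GEOSS Societal Benefit Areas


# Listed in order of increasing restrictiveness, as in the framework table.
CONDITIONS = {
    "CC0": Conditions(False, False, False),
    "CC BY": Conditions(True, False, False),
    "CC BY-NC": Conditions(True, True, False),
    "CC BY-NC-GEOSS_SBA": Conditions(True, True, True),
}
ORDER = list(CONDITIONS)


def discover(records, accepted):
    """Use cases 1 and 2: keep records with an accepted license, sorted by license type."""
    hits = [r for r in records if r["license"] in accepted]
    rank = lambda r: ORDER.index(r["license"]) if r["license"] in ORDER else len(ORDER)
    return sorted(hits, key=rank)


def may_access(license_id, commercial_use, sba_use):
    """Use case 3: decide automatically whether a program may fetch the data."""
    cond = CONDITIONS.get(license_id)
    if cond is None:
        return False  # OTHER or unknown: direct negotiation is needed first
    if cond.non_commercial and commercial_use:
        return False
    if cond.sba_only and not sba_use:
        return False
    return True


# Hypothetical catalogue entries.
catalogue = [
    {"id": "sst-grid", "license": "CC BY-NC"},
    {"id": "landcover", "license": "CC0"},
    {"id": "rain-gauges", "license": "OTHER"},
    {"id": "dem", "license": "CC BY"},
]
open_hits = discover(catalogue, {"CC0", "CC BY", "CC BY-NC"})
```

Here `open_hits` lists `landcover`, `dem`, then `sst-grid`, and `may_access("CC BY-NC", commercial_use=True, sba_use=False)` is refused without any user interaction.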
Licensing Use Cases defined in AIP-3 (page 2 of 2) 4. Data access to multiple data sets after search and discovery by observing license conditions at the user-interactive GEO Web Portal. –Merging or layering multiple data sources –Legal processing must take place to reflect legal interoperability 5. Programmatic (non-user interactive) access to multiple data sets using license conditions. –Merging or layering multiple data sources –Program handles licensing logic and legal interoperability automatically
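For use cases 4 and 5, one simple way a program could handle legal interoperability is sketched below. It assumes that a merged or layered product takes the most restrictive license among its inputs and keeps every attribution; that rule is an assumption of this sketch, based only on the restrictiveness ordering of the framework, and is not a rule stated by the DSGWG. All names are hypothetical.

```python
# Increasing restrictiveness, from CC0 to CC BY-NC-GEOSS_SBA.
RESTRICTIVENESS = ("CC0", "CC BY", "CC BY-NC", "CC BY-NC-GEOSS_SBA")


class NegotiationRequired(Exception):
    """Raised when an input needs direct provider and user negotiation (OTHER)."""


def combine_licenses(datasets):
    """Return (license for the merged product, attributions to carry along)."""
    if not datasets:
        raise ValueError("at least one data set is needed")
    ranks = []
    attributions = []
    for entry in datasets:
        license_id = entry["license"]
        if license_id not in RESTRICTIVENESS:
            # OTHER cannot be resolved automatically.
            raise NegotiationRequired(license_id)
        ranks.append(RESTRICTIVENESS.index(license_id))
        credit = entry.get("attribution")
        if credit and credit not in attributions:
            attributions.append(credit)
    # Assumed rule: the most restrictive input governs the merged product.
    return RESTRICTIVENESS[max(ranks)], attributions


# Hypothetical layers being merged.
layers = [
    {"license": "CC0"},
    {"license": "CC BY", "attribution": "Agency A"},
    {"license": "CC BY-NC", "attribution": "Agency B"},
]
merged_license, credits = combine_licenses(layers)
```

With these inputs the merged product is treated as CC BY-NC and credits both agencies; adding a layer marked OTHER stops the process until the parties have negotiated.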
Licensing Use Cases Implemented in AIP-3 Use Cases 1 & 2 implemented in AIP-3 Searching for, and accessing, data with a CC BY license attached Video demonstration available online
User Registration in a GEOSS context Some GEOSS data providers require users to register and log in to access data –Many solutions for user registration and login already in place with data providers The DSGWG-recommended solution must have a very light impact on data providers. Single-sign-on (SSO) is desired for users –Many, if not most, data providers don’t support SSO outside their domains –SSO will minimize the impact on data users.
User Registration Approaches by DSGWG Two approaches for managing the information 1. Federated approach among GEOSS data providers 2. Centralized approach using a GCI component Two approaches for user identity protocols 1. OpenID 2. Shibboleth –Research into Shibboleth resulted in many examples showing a very heavy impact on deployment: SSO solution needs to be instantiated to integrate with Shibboleth; PKI deployment necessary with possible custom modifications; identity provider deployment necessary with possible custom modifications; policy rules needed for configuration of components; fairly steep learning curve
Federated Approach with OpenID Users register with an existing OpenID server: –OpenID, Google, Yahoo, etc. Users must log in to each domain –Each validating with OpenID No control over logout –Logging out at one domain logs a user out at all domains logged into –No control for data providers over the duration of login Data providers would have to implement OpenID interaction Data providers cannot provide use metrics to GEOSS
OpenID Federated Approach Google OpenID used as an example
GCI-Centralized Approach using OpenID Users register with an SSO component hosted by GEOSS, say in the GCI –Use OpenID-like authentication external to the GCI, so GEOSS is not storing user personal information User logs in once to use multiple GEOSS services –Logging out at one service does not log the user out everywhere –Allows data provider control over login duration Data providers –Implement interface to GCI SSO component, but would not affect an existing authentication scheme –Can provide metrics to GEOSS on services used and data accessed
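The behaviour described for the centralized approach can be modelled with a toy session registry. This is a sketch under stated assumptions, not the GCI design: the class and method names are hypothetical, the external OpenID check is assumed to have already succeeded before `login` is called, and time is passed in explicitly to keep the example deterministic.

```python
class CentralSSO:
    """Toy model of a central SSO component (hypothetical, in-memory only)."""

    def __init__(self):
        self._durations = {}     # service -> login duration in seconds
        self._sessions = {}      # (user, service) -> expiry time
        self.access_counts = {}  # service -> successful accesses (usage metrics)

    def register_service(self, service, login_duration):
        # Each data provider chooses its own login duration.
        self._durations[service] = login_duration
        self.access_counts[service] = 0

    def login(self, user, now):
        # Single login: open one session per registered service.
        # Only an opaque user id is kept, no personal information.
        for service, duration in self._durations.items():
            self._sessions[(user, service)] = now + duration

    def logout(self, user, service):
        # Ends the session at this service only; other sessions stay open.
        self._sessions.pop((user, service), None)

    def access(self, user, service, now):
        expiry = self._sessions.get((user, service))
        if expiry is None or now >= expiry:
            return False
        self.access_counts[service] += 1
        return True


sso = CentralSSO()
sso.register_service("provider-a", login_duration=60)
sso.register_service("provider-b", login_duration=600)
sso.login("user-1", now=0)
sso.logout("user-1", "provider-a")
still_on_b = sso.access("user-1", "provider-b", now=30)
```

After the logout at `provider-a`, the session at `provider-b` is still valid until its own provider-chosen duration runs out, and the counters give GEOSS a simple per-service usage metric.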
GCI-Centralized Approach with OpenID Includes post AIP-3 design to use external OpenID service
AIP DSGWG+ Recommendations Implementation of data licenses, including attribution, for further testing of how well licenses are handled. Use of the Creative Commons open standards-based licensing framework. Implementation of a central GCI component and remote OpenID for handling SSO in GEOSS Design of appropriate service interfaces to support the interactions between the central GCI component, the GEOSS users, and the GEOSS data providers. Continue to work with DSTF on use conditions and user management –e.g., DSTF is looking at CC and beyond
References GEO –earthobservations.org GEO Architecture Implementation Pilot – GEOSS registries and SIF –geossregistries.info George Percivall Steven F. Browdy