Responsible Citizenship of the Data World


1 Responsible Citizenship of the Data World
Wim Hugo Grenoble, 12 April 2017

2

3 Credibility of Science
Access to original and complete data sets for reproducibility
Re-usability declines with time
Availability declines with age

4

5 Linked Open Data
Network of referenced objects in the web
Dependent on permanent identifiers for the objects
References vocabularies, ontologies, registries, …
The Knowledge Network
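The dependence on permanent identifiers can be illustrated with a minimal sketch. It assumes a tiny in-memory registry in which every object is keyed by its identifier and lists the identifiers it references. All identifiers, the `objects` registry, and both helper functions are hypothetical placeholders for illustration, not real DOIs, ORCID iDs, or an existing API.

```python
from collections import deque

# Placeholder identifiers only; none of these resolve to real records.
objects = {
    "doi:10.0000/example-paper": {
        "type": "publication",
        "refs": ["doi:10.0000/example-dataset", "orcid:0000-0000-0000-0000"],
    },
    "doi:10.0000/example-dataset": {
        "type": "dataset",
        "refs": ["orcid:0000-0000-0000-0000", "vocab:example/temperature"],
    },
    "orcid:0000-0000-0000-0000": {"type": "person", "refs": []},
    "vocab:example/temperature": {"type": "vocabulary term", "refs": []},
}


def dangling_references(graph):
    """Return (source, target) pairs whose target identifier is not registered."""
    return [
        (source, ref)
        for source, record in graph.items()
        for ref in record["refs"]
        if ref not in graph
    ]


def reachable(graph, start):
    """Return every identifier reachable from `start` by following references."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        # Unregistered identifiers have no outgoing references to follow.
        for ref in graph.get(current, {"refs": []})["refs"]:
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return seen


if __name__ == "__main__":
    print(dangling_references(objects))
    print(sorted(reachable(objects, "doi:10.0000/example-paper")))
```

If an identifier stops resolving, every object that references it shows up in `dangling_references`, which is one way to picture why the network only holds together while identifiers stay permanent.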

6 ICSU-WDS Knowledge Network: the Fabric of Science
Scholarly Publications (CrossRef?)
TDRs (WDS, DSA, DataCite*)
Samples and Events
People (ORCID)
RDI Outputs/Online Resources
Coverage (Temporal, Spatial, Topic)
Data Citations (DataCite)
Institutions (?)
Projects, Initiatives
Use, Caveats, Lineage, Methods
Networks
Licenses (CODATA, Creative Commons)
Funders (?)
* Including re3data, DataBib
Legend: Exists, Started, Not Now, WDS
The Knowledge Network

7 Generalised Scientific Data Infrastructure Use Case
“Predictable Assembly from Reliable Components”
Access/Download; Data/Services; Analyse/Visualise; “Bind”; “Publish”; Process; Metadata; Discovery; “Find”
Generalised Scientific Data Use Case

8 Generalised Scientific Data Infrastructure Use Case
Curate; Cite; Access/Download; Data/Services; Analyse/Visualise; “Bind”; “Publish”; Process; Metadata; Discovery; Assess/Rate; “Find”
Generalised Scientific Data Use Case

9 Mediation and Brokering
Curate; Mediate; Cite; Access/Download; Data/Services; Analyse/Visualise; “Bind”; “Publish”; Process; Metadata; Discovery; Assess/Rate; “Find”
Mediation and Brokering

10 “Responsible Citizenship of the Data World”
Content: Best Practice; Persistent Identifiers and Registries; Vocabularies, Ontologies; Global Infrastructure Services; Federated Data Services and Implementations; Standards and Specifications; Licenses and Data Policy
Scope: Applies to all; Applies to a specific data family or format; Applies to a specific scientific discipline or domain
Actors: Individual Researchers, Institutions, Initiatives; Voluntary contributors and the Public; Systems developers, Data Centre Managers, Architects
Granularity: Individual Data Points (UncertML, …); Individual Data Sets (GEO Data Management Principles, …); Data Centres and Repositories (WDS, DSA, ISO, Nestor); Data Networks and Composite Services (WDS)
Maturity
Data Management Principles

11 Core Certification: Trusted Data Services
Data Records; Data Sets; Repositories
<UncertML>
Standards can play an important role in establishing this trust. There has long been a demand for some way to assess the trustworthiness of a digital repository, and over the last few years a number of evaluation guidelines have become available. These standards take into account not only the technical infrastructure and standards, but also organisational, financial, staffing, and legal aspects, workflows, risk management, etc.
Quality, Accreditation, and Trust

12 African GRDI Perspectives
Challenges are obvious …

13 Technology Footprint
It is clear that technology of the type expected in more developed nations remains a problem: bandwidth is not only at a premium but also expensive, and state-of-the-art equipment (both personal equipment and data centre equipment) is unlikely to be commonplace in Africa in the near future.
Design directives for Networked Data Centres:
Technology: use of mobile phone technology in a non-bandwidth-intensive manner will be a very good option. Simple data discovery is preferable to non-discovery due to technology hurdles.
Technology: Cloud-based services
Governance: data dissemination via satellite remains an affordable option for large data sets.

14 Open Access
Irrespective of its wide and growing acceptance and mandatory implementation in the developed world, open access remains problematic in the developing world. One can aim to address the misconceptions, which is undoubtedly a longer-term goal, but in the meantime discovery of and access to data that is embargoed in some way is preferable to non-discovery.
Design directives for Networked Data Centres:
Technology: data centres should allow multiple modes of access (free and open, acceptance of limiting conditions, paywall).
Policy: Licenses should allow a small number of valid restrictions. Divergent national policies need to be accommodated by matching them with a small number of standardised licenses.
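The policy directive of matching divergent national policies to a small number of standardised licenses could be sketched roughly as follows. The three access modes are the ones named on the slide. The `Policy` fields, the `standard_licence` function, the choice of Creative Commons licenses, and the pairing of each license with an access mode are assumptions made for illustration, not a mapping proposed in the presentation.

```python
from dataclasses import dataclass
from enum import Enum


class AccessMode(Enum):
    # The three modes of access listed in the design directive.
    OPEN = "free and open"
    CONDITIONS = "acceptance of limiting conditions"
    PAYWALL = "paywall"


@dataclass(frozen=True)
class Policy:
    """Hypothetical summary of the restrictions a national policy imposes."""
    attribution_required: bool = False
    non_commercial_only: bool = False
    fee_required: bool = False


def standard_licence(policy: Policy) -> tuple[str, AccessMode]:
    """Map a policy to one of a few standard licences (assumed mapping)."""
    if policy.fee_required:
        # No Creative Commons licence covers paid access; placeholder label.
        return ("custom paid licence", AccessMode.PAYWALL)
    if policy.non_commercial_only:
        return ("CC BY-NC 4.0", AccessMode.CONDITIONS)
    if policy.attribution_required:
        return ("CC BY 4.0", AccessMode.OPEN)
    return ("CC0 1.0", AccessMode.OPEN)


if __name__ == "__main__":
    for policy in (Policy(), Policy(attribution_required=True),
                   Policy(non_commercial_only=True), Policy(fee_required=True)):
        licence, mode = standard_licence(policy)
        print(f"{policy} -> {licence} ({mode.value})")
```

The point of the sketch is the shape of the mapping: many possible policy combinations collapse onto four outcomes, so a data centre only has to implement a handful of access paths.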

15 Growth of Creative Commons
Policies and Licensing

16 Creative Commons License Use
Policies and Licensing

17 Where CC Works are Published
Policies and Licensing

18 Funding
It is highly unlikely that funding for the establishment of data centres on a scale comparable to the developed world will emerge. African countries may have funds, but capacity is also a problem. At times, donors or multinational projects fund infrastructure, but one has to accept that these efforts are often ineffectual, or will not be able to serve the majority of scientists.
Design directives for Networked Data Centres:
Technology: we need to make use of free technology as far as possible: cloud-based data storage, networked data centres for meta-data that are funded by stakeholder institutions, and low-bandwidth options for data discovery, application, and use.
Governance: Use the crowd. Peer review, quality assurance, and some oversight functions can be crowd-sourced. It may be beneficial for experienced scientists, globally, to act voluntarily and without financial compensation as governance sources for Networked Data Centres. Such a framework, and its explicit roles, responsibilities, and benefits, may require endorsement by a suitable global institution such as ICSU.

19 Capacity
A large part of the problem with implementing Global Research Data Infrastructure (GRDI), and data centres as a component of it, is the technology focus. Even today, participating in and benefiting from the emerging infrastructure requires significant background knowledge of meta-data and its mainstream standards, data formats and their standards, and the general body of knowledge associated with GRDI. This has to change: end users should need to know no more of this than they need to know of the standards that enable Google Mail or an Android smartphone. The challenge is with the developers of the GRDI: listen to the customer.
Design directives for Networked Data Centres:
Technology: Data Centre use needs to be made intuitive, and must shield end users from technical complexity, standards, and specialist knowledge.
Technology: integration with well-established services (cloud-based services, social networks), both in terms of functionality and shared infrastructure, is needed.

20 Scientific Activity in Southern Africa

21 Ships in the Night …
First-world-funded initiatives (e.g. SciGaIA)
Unaware of one another
Not connected to National Initiatives
Why not a funded programme to make Zenodo and OpenAIRE immediately useful to developing country infrastructure providers?
Recognition of Effort
Often hidden in first-world networks or data centres
Coordination and landscape assessment

