PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Overview
– Aims
– Survey
– Roadmap

Alliance for Permanent Access
The Alliance aims to develop a shared vision and framework for a sustainable organisational infrastructure for permanent access to scientific information.
– The British Library
– European Organization for Nuclear Research [CERN]
– CSC – IT Center for Science
– Delegation of the Finnish Academies of Science and Letters
– Deutsche Nationalbibliothek
– Digital Preservation Coalition
– European Science Foundation [ESF]
– European Space Agency [ESA]
– Helmholtz-Gemeinschaft Deutscher Forschungszentren
– International Association of Scientific, Technical & Medical Publishers
– Joint Information Systems Committee [JISC]
– Koninklijke Bibliotheek
– Max-Planck-Gesellschaft
– NESTOR Kompetenznetzwerk
– Nationale Coalitie Digitale Duurzaamheid [NCDD]
– Portico
– Science & Technology Facilities Council [STFC]

PARSE.Insight aims to provide:
– Insight and understanding into the capabilities and practices within the various research communities
– An inventory of current and planned research and development relating to e-infrastructures and permanent access
– A roadmap for a support e-infrastructure for maintaining long-term accessibility and usability of scientific and other digital information in Europe
– Identification of gaps in the existing and planned infrastructure
– Progress towards a standard for evaluating the sustainability and trustworthiness of digital repositories

Motivation
Concern with data and documents
Need for supporting e-infrastructure
– What should this look like?
– How can it be developed?
– What timescale?
The role of the Alliance for Permanent Access

Infrastructures for preservation
Social / Legal / Financial / Organisational
Agreements / Trust / Standards
Costs / Benefits / Rewards
Technical components

Lessons from other Infrastructures
Need to “grow”, “encourage”, “foster” rather than “build”
– include organisational, financial, legal & marketing
Provide services rather than specific technologies
Tackle “choke points”
Various phases of development

Approach
Approach based on evidence from community insight, while taking full account of current work on digital preservation
Coverage of disciplines: wide and deep
Coverage of resources: data and documents

Approach (2)
Top-down:
– Desk research
– Targeted surveys to stakeholders in science
– Interviews
– Workshops and conferences
Bottom-up: case studies in 3 communities:
– Case 1: High Energy Physics (HEP)
– Case 2: Earth Observation (EO)
– Case 3: Social Sciences & Humanities (SSH)

Encouraging Organisational and Social change
Policies: mandates for depositing research data, and funding agencies' requirements.
Robust and reliable places of deposit, where researchers can be sure their data will not be lost, corrupted or misused, with the correct access-rights mechanisms.
Elements that increase comfort levels, so that new users will know how to use and interpret the available data.
Communication and awareness around these issues.
Make publication of data as valued and as referenceable as publication of a paper in a journal.

Benefits
No organisation can do everything that is required for digital preservation forever
Need to share the cost/effort
Need to identify commonalities
– None will be a perfect fit for all purposes

Insight: stakeholders
Research
– Research institutes (non-profit)
– Universities
– Academic libraries
Data management (preservation)
– Data centres (profit / non-profit)
– Libraries
– Archives
Funding/policy
– National funding organisations
– European funding
– Corporate funding
Publishing
– General (cross-community) publishers
– Specific (community) publishers

General Surveys to stakeholders
– Research: 1397 responses
– Data management (preservation): 273 responses
– Funding/policy: < responses
– Publishing: 186 responses
Plus a similar number from in-depth case studies

About researchers
Communities aggregated to:
– Agriculture & Nutrition
– Behavioural Sciences
– Humanities
– Life Sciences
– Medicine
– Social Sciences
– Physical Sciences
– Socio-Cultural Sciences
– Technology
Based on KNAW classification (Royal Netherlands Academy of Arts and Sciences)

Reasons for preservation (R)
– It is unique.
– It potentially has economic value.
– It may stimulate inter-disciplinary collaborations.
– It allows for re-analysis of existing data.
– It may serve validation purposes in the future.
– It will stimulate the advancement of science.
– If research is publicly funded, the results should become public property and therefore properly preserved.

What? Data spectrum (R)

Sharing of data (R) How open is your data?

Sharing of data (R) Which constraints do you see in making data open?

Threats to preservation
1. The ones we trust to look after the digital holdings may let us down.
2. The current custodian of the data, whether an organisation or project, may cease to exist at some point in the future.
3. Loss of ability to identify the location of data.
4. Access and use restrictions (e.g. Digital Rights Management) may not be respected in the future.
5. Evidence may be lost because the origin and authenticity of the data may be uncertain.
6. Lack of sustainable hardware, software or support of computer environment may make the information inaccessible.
7. Users may be unable to understand or use the data, e.g. the semantics, format or algorithms involved.

Threats to preservation (R)
– The ones we trust to look after the digital holdings may let us down
– The current custodian of the data may cease to exist
– Loss of ability to identify the location of data
– Access and use restrictions may not be respected in the future
– Evidence may be lost
– Lack of sustainable hardware/software
– Users may be unable to understand or use the data

Threats to preservation (R) Users may be unable to understand or use the data e.g. the semantics, format or algorithms involved.

Threat and requirement for solution

Threat: Users may be unable to understand or use the data, e.g. the semantics, format, processes or algorithms involved
Requirement: Ability to create and maintain adequate Representation Information

Threat: Non-maintainability of essential hardware, software or support environment may make the information inaccessible
Requirement: Ability to share information about the availability of hardware and software and their replacements/substitutes

Threat: The chain of evidence may be lost and there may be lack of certainty of provenance or authenticity
Requirement: Ability to bring together evidence from diverse sources about the Authenticity of a digital object

Threat: Access and use restrictions may make it difficult to reuse data, or alternatively may not be respected in future
Requirement: Ability to deal with Digital Rights correctly in a changing and evolving environment

Threat: Loss of ability to identify the location of data
Requirement: An ID resolver which is really persistent

Threat: The current custodian of the data, whether an organisation or project, may cease to exist at some point in the future
Requirement: Brokering of organisations to hold data and the ability to package together the information needed to transfer information between organisations ready for long term preservation

Threat: The ones we trust to look after the digital holdings may let us down
Requirement: Certification process so that one can have confidence about whom to trust to preserve data holdings over the long term
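
To make the "ID resolver which is really persistent" requirement concrete, here is a minimal sketch in Python. It is not taken from the slides or from any PARSE.Insight or CASPAR component; the class name `PersistentResolver`, its methods, and the example identifier and URLs are all hypothetical. The idea it illustrates is only that an identifier stays fixed while the location it resolves to may change, for example when a custodian hands data over to another organisation.

```python
class PersistentResolver:
    """Toy in-memory resolver (hypothetical): identifiers are stable, locations may change."""

    def __init__(self) -> None:
        # Maps each identifier to the ordered list of locations it has pointed to.
        self._locations: dict[str, list[str]] = {}

    def register(self, pid: str, location: str) -> None:
        """Bind a new identifier to its first location; identifiers are never reused."""
        if pid in self._locations:
            raise ValueError(f"{pid} is already registered")
        self._locations[pid] = [location]

    def relocate(self, pid: str, new_location: str) -> None:
        """Record a move; raises KeyError if the identifier is unknown."""
        self._locations[pid].append(new_location)

    def resolve(self, pid: str) -> str:
        """Return the current location; raises KeyError if the identifier is unknown."""
        return self._locations[pid][-1]

    def history(self, pid: str) -> list[str]:
        """Return a copy of every location the identifier has pointed to, oldest first."""
        return list(self._locations[pid])


if __name__ == "__main__":
    resolver = PersistentResolver()
    # Identifier and URLs below are made up for illustration.
    resolver.register("pid:example/0001", "https://archive-a.example/data/0001")
    resolver.relocate("pid:example/0001", "https://archive-b.example/holdings/0001")
    print(resolver.resolve("pid:example/0001"))
```

A real service would also need durable storage and organisational commitment behind it, which is the harder part of "really persistent"; the sketch covers only the lookup behaviour.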

FUTURE
– Users may be unable to understand or use the data, e.g. the semantics, format, processes or algorithms involved
– Non-maintainability of essential hardware, software or support environment may make the information inaccessible
– The chain of evidence may be lost and there may be lack of certainty of provenance or authenticity
– Access and use restrictions may not be respected in the future
– Loss of ability to identify the location of data
– The current custodian of the data, whether an organisation or project, may cease to exist at some point in the future
– The ones we trust to look after the digital holdings may let us down

Links
CASPAR:
– validation-evaluation-report/at_download/file - Validation report
– cartoon
PARSE.Insight:
Alliance for Permanent Access:
Digital Curation Centre:
Audit and certification: wiki.digitalrepositoryauditandcertification.org
OAIS:

END