Approaches and Challenges in Managing Persistent Identifiers

1 Approaches and Challenges in Managing Persistent Identifiers
Nordic Workshop on Data Citation Policies and Practices, Helsinki, 2016/11/23

2 Motivation and background: About DKRZ
A national service provider for the climate (modeling) community
DKRZ = German Climate Computing Center
Non-profit service company established 1987
Located in Hamburg, Germany
Balanced HPC / storage system:
  3 PFlop Bull system
  45 PByte Lustre parallel file system
  335 PByte HPSS tape backend
Data services:
  Long-term data archival
  World Data Center for Climate
  Core node in international climate data federation (ESGF, IS-ENES)
Approaches and Challenges in Managing PIDs 2016/11/23

3 Motivation and background: CMIP6

4 Motivation and background: Challenges
User-driven:
  Wider user audience
  Downstream usage of climate data – new processing and analysis services
Resource-driven:
  Same resources, but...
  More objects
  More diversity
  Not monolithic – graph structures
Still: Keep it simple

5 Addressing the management challenges
Support objects through their life cycle
Give a name to every object
Automate tasks – intelligent agents
Make transitions transparent
Enable users/agents to pull info to object at hand
Requirement: Understand PIDs not as a guarantee for object persistency

6 Achieving persistency is not primarily a technical challenge!
What is persistency?
Persistency of the object
  Not bound to use of a (specific) PID
Persistency of the PID
  Object can be gone
Persistency of the PID-Object link
  Object+PID+link = Citability
  Persistency statements
Persistency of essential metadata
  Object can be gone!
Achieving persistency is not primarily a technical challenge!
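The distinction between these aspects can be made concrete with a small data model. The following is only a minimal sketch, not the presenters' implementation: the class name PidRecord, its fields, and the retire_object method are all hypothetical, and the citability check simply encodes the slide's "Object+PID+link = Citability" rule.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PidRecord:
    """Hypothetical PID record that outlives the object it names."""
    pid: str
    # The PID-object link; None once the object is gone.
    location: Optional[str]
    # Essential metadata, kept even after the object is removed.
    metadata: Dict[str, str] = field(default_factory=dict)

    def object_available(self) -> bool:
        return self.location is not None

    def is_citable(self) -> bool:
        # Slide rule: object + PID + link = citability.
        return bool(self.pid) and self.object_available()

    def retire_object(self, reason: str) -> None:
        # The object goes away; the PID and its metadata persist.
        self.location = None
        self.metadata["status"] = "object removed"
        self.metadata["reason"] = reason


record = PidRecord(
    pid="hdl:PREFIX/example",  # placeholder, not a real Handle
    location="https://example.org/data/file.nc",
    metadata={"title": "Example dataset"},
)
record.retire_object("withdrawn by provider")
print(record.pid, record.is_citable(), record.metadata["title"])
```

After retire_object, the record is no longer citable in this model, yet the PID still identifies it and the essential metadata still explains what the object was.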

7 Infrastructure view: Automation and abstraction
No longer just management of files in file systems
Management of digital objects through dedicated services/chains
Focus on stable protocols and interfaces, modularity
Hide complexity of automation machinery from users

8 PIDs in the middle enable automated management
Object management scenarios bring new requirements to PIDs
Courtesy of Larry Lannom

9 What are the components of a PID federation?
Federation: Scalable, but needs to be organized well
Technical expertise (common interfaces, protocols)
Resources (staff, know-how, funding)
Support services (help desk, training)
Governance mechanisms
Operational schema (processes, QA, reporting, intelligence, innovation management)

10 Some details on the challenges for CMIP6
Requirement: Put a Handle in every file header, but files may not be changed after the production phase
  tracking_id = hdl: /<UUID>
A lot of time was spent on agreements that ensure the sanity of the PID record
Each object gets a PID, and no object with an embedded PID exists outside our control
PID not citable – required metadata not ready
  Still: some file headers are extracted and put in the PID record
PIDs are a new development – Handle registration is not allowed to interrupt the publication process
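Because files cannot be changed after production, the tracking_id has to be generated before the Handle is registered. The sketch below shows one way this could look, assuming the pattern hdl:<prefix>/<UUID>. The Handle prefix is not given in the slide, so "PREFIX" is a placeholder, and the function names are hypothetical rather than part of any CMIP6 or ESGF tool.

```python
import re
import uuid
from typing import Tuple

# Placeholder only: the real Handle prefix is not stated in the slide.
EXAMPLE_PREFIX = "PREFIX"

_TRACKING_ID = re.compile(
    r"^hdl:(?P<prefix>[^/\s]+)/"
    r"(?P<suffix>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$"
)


def make_tracking_id(prefix: str = EXAMPLE_PREFIX) -> str:
    """Build a tracking_id at file production time.

    A random UUID needs no contact with the Handle server, so the
    identifier can be written into the header before registration.
    """
    return f"hdl:{prefix}/{uuid.uuid4()}"


def parse_tracking_id(value: str) -> Tuple[str, uuid.UUID]:
    """Split a tracking_id into prefix and UUID, rejecting malformed values."""
    match = _TRACKING_ID.match(value)
    if match is None:
        raise ValueError(f"not a valid tracking_id: {value!r}")
    return match["prefix"], uuid.UUID(match["suffix"])


tracking_id = make_tracking_id()
prefix, suffix = parse_tracking_id(tracking_id)
print(tracking_id, prefix, suffix)
```

Validating the value at publication time, as parse_tracking_id does here, is one simple check that could support the sanity of the PID record without blocking the publication process.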

11 Making it scalable requires additional effort
Buurman, Weigel, Juckes, Lautenschlager, Kindermann: Persistent Identifiers for CMIP6 in the Earth System Grid Federation, EGU 2016

12 The user's reality...

13 Take-home messages
Use of PIDs for data management presents new requirements, but also new benefits
Automation and machine agent usage are key elements
Data citation is one use case among others and benefits from improved transparency
Multiple aspects of persistency can become relevant

14 Thank you for your attention.