Working Group Practical Policy
Based on slides and latest documents from the PP WG, chaired by Reagan Moore and Rainer Stotzka
Presented by Johannes Reetz
RDA Europe Workshop, Garching, 20 Feb 2015
2 Practical Policy Working Group
Practical Policy
  Assertion or assurance that is enforced about a (data) collection (data set, digital object, file)
Computer actionable policies are used to
  enforce data management
  automate administrative tasks
  validate compliance with assessment criteria
  automate scientific data processing and analyses
3 PP WG Policy introduction (based on the Policy Template document released on the PP WG wiki by 20 Feb 2015)
The purpose of a collection defines the properties to be maintained for each digital object within the collection. Example properties can be
  preservation assertions, such as authenticity, integrity, chain of custody, and original arrangement
  digital collection assertions, such as description and arrangement by subject
  systemic properties of the collection, such as completeness, correctness, and consistency
4 Policy Components - Conceptual Fundamentals
[Figure: Policy-based Data Management Concept Graph. Slides 4 to 10 show the same graph, built up step by step; only the node and edge labels survive.]
Core nodes: Collection, Purpose, Attribute, Property, Policy, Procedure, Workflow, Function, Operation, Persistent State Information, Digital Object, Client Action, Periodic Assessment Criteria, Policy Enforcement Point
Edge labels: Defines, Has, HasFeature, Controls, Updates, Invokes, Chains, Isa, SubType
Annotations: Community Consensus, Computer Actionable Implementation
Subtypes and instances shown in the graph:
  Purposes: Sharing, Publication, Preservation
  Features: Completeness, Correctness, Consensus, Consistency
  Properties: Integrity, Authenticity, Access control
  Policies: Replication Policy, Checksum Policy, Quota Policy, Data Type Policy
  Operations: GetUserACL, SetDataType, SetQuota, DataObjRepl, SysChksumDataObj
  Persistent state information: DATA_ID, DATA_REPL_NUM, DATA_CHECKSUM
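The chain suggested by the concept graph (a policy enforcement point invokes a policy, the policy controls a workflow that chains operations, and the operations update persistent state information) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the wiring between the nodes is one reading of the graph, the functions are plain Python stand-ins named after the operation labels on the slide and not calls into iRODS, and the "put" action and the helper names new_state and enforce are hypothetical.

```python
import hashlib


def new_state(data_id):
    """Persistent state information for one digital object, keyed by the
    attribute names shown in the concept graph."""
    return {"DATA_ID": data_id, "DATA_REPL_NUM": 0, "DATA_CHECKSUM": None}


def sys_chksum_data_obj(state, content):
    """Stand-in for the SysChksumDataObj operation: record a checksum."""
    state["DATA_CHECKSUM"] = hashlib.sha256(content).hexdigest()


def data_obj_repl(state, content):
    """Stand-in for the DataObjRepl operation: count one more replica.
    The content is not copied anywhere in this sketch."""
    state["DATA_REPL_NUM"] += 1


# A workflow chains operations; each policy controls one workflow.
POLICIES = {
    "checksum": [sys_chksum_data_obj],
    "replication": [sys_chksum_data_obj, data_obj_repl],
}

# Policy enforcement points: which policies a client action invokes.
# The action name "put" is a hypothetical example.
ENFORCEMENT_POINTS = {"put": ["replication"]}


def enforce(action, state, content):
    """Run every policy bound to the action and return the updated state."""
    for policy in ENFORCEMENT_POINTS.get(action, []):
        for operation in POLICIES[policy]:
            operation(state, content)
    return state
```

Calling enforce("put", new_state(1), b"abc") leaves one replica and a checksum recorded in the state, while an action with no enforcement point leaves the state untouched.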
11 Scale
Name spaces - 7 name spaces for managing distributed environment
  Users : Collections : Digital objects : Storage systems : Policies : Micro-services : Metadata
Operations
  iRODS – more than 300 basic operations
Persistent state information
  iRODS – 338 attributes on the 7 name spaces
Policies
  Data sharing – 11 default policies
  Data publication – 5 additional policies (LifeTime Library)
  Preservation – about 70 policies
12 Policy Template
Policy : Operation : Constraints : State Information

Policy type: Replication

Operation                | Constraints                         | State information
Set replica properties   | When?                               | Default policy enforcement points
                         | Number of replicas                  | Default number
                         | Where is the replica put?           | Default replica location
                         | Which files (collection/user/size)? | Default policy selection criteria; default criterion value
                         | Set replica access controls?        | Default access control
                         | Require checksum?                   | Replica checksum flag
                         | When to audit?                      | Default time period
Replicate                | Delayed or immediate                | Replica location; replica creation time; replica access control; replica name; replica owner; replica number
Verify replica numbers   | Periodic rule                       | Audit time stamp; log of problems and actions
Replace missing replicas |                                     | Replica location; replica creation time; replica access control; replica name; replica owner; replica number
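To make the template concrete, the "Verify replica numbers" row can be sketched as a periodic check that compares each file's replica count with the policy's default number and writes an audit time stamp plus a log of problems. This is a minimal sketch, not the working group's implementation: the class and function names, the default of 2 replicas, and the shape of the input (a mapping from file name to replica count) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ReplicationPolicy:
    """Constraints from the 'Set replica properties' rows (values hypothetical)."""
    number_of_replicas: int = 2   # "Default number"
    require_checksum: bool = True  # "Replica checksum flag"


@dataclass
class AuditRecord:
    """State information written by the 'Verify replica numbers' operation."""
    timestamp: str                                  # "Audit time stamp"
    problems: list = field(default_factory=list)    # "Log of problems and actions"


def verify_replica_numbers(policy, replica_counts):
    """Return an audit record listing (file name, missing replicas) pairs."""
    record = AuditRecord(timestamp=datetime.now(timezone.utc).isoformat())
    for name, count in sorted(replica_counts.items()):
        missing = policy.number_of_replicas - count
        if missing > 0:
            record.problems.append((name, missing))
    return record
```

The returned problem list is the natural input for a "Replace missing replicas" step, which would create the missing copies and record their location, creation time, and owner.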
13 Persistent Identifiers
Identifiers are defined by the operations that their resolvers support
  GUID – unique identifier
  Handle – add location information
  Ticket – add access controls
  Data grid logical name – add arrangement and metadata
  Workflow – add parsing and subset extraction
Digital objects
  File – may have associated structural, provenance, descriptive metadata
  Soft link – add method to retrieve the digital object from a remote system
  Workflow Structured Object – add provenance, versioning, and output
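One way to read the identifier list is as a stack of layers, where each identifier type supports what the previous one does and adds one more feature. The sketch below encodes that reading; the cumulative interpretation of "add" is an assumption, and the names IDENTIFIER_LAYERS and supported_features are hypothetical.

```python
# Identifier types in the order given on the slide, each paired with the
# feature it contributes. Treating the layers as cumulative is an assumption.
IDENTIFIER_LAYERS = [
    ("GUID", "unique identification"),
    ("Handle", "location information"),
    ("Ticket", "access controls"),
    ("Data grid logical name", "arrangement and metadata"),
    ("Workflow", "parsing and subset extraction"),
]


def supported_features(identifier_type):
    """Return the features available at the given identifier type."""
    features = []
    for name, feature in IDENTIFIER_LAYERS:
        features.append(feature)
        if name == identifier_type:
            return features
    raise ValueError(f"unknown identifier type: {identifier_type}")
```

Under this reading, a Ticket resolver supports unique identification, location information, and access controls, but not arrangement or subset extraction.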
14 Metadata
Associate metadata with the procedure that extracts the associated metadata value
Replace metadata with an executable procedure
Types of metadata
  Provenance
  Structural
  Description
  Internal features
Feature-based indexing
  Extract all words from text
  Extract all degrees of freedom from data set
Automate metadata extraction
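The idea of replacing a stored metadata value with the procedure that extracts it can be sketched as a registry that maps each metadata attribute to an extraction function, using "extract all words from text" as the example feature. This is an illustrative sketch only: the attribute names, the tokenization rule (lowercase runs of letters and digits), and the function names are assumptions.

```python
import re


def extract_words(text):
    """Feature-based indexing: the sorted set of distinct words in the text."""
    return sorted(set(re.findall(r"[a-z0-9]+", text.lower())))


# Each metadata attribute is associated with the procedure that extracts
# its value, instead of with a stored value. Attribute names are hypothetical.
EXTRACTORS = {
    "words": extract_words,
    "length": len,
}


def metadata_value(attribute, content):
    """Compute the metadata value on demand by running its procedure."""
    return EXTRACTORS[attribute](content)
```

Because the value is computed on demand, it cannot drift out of date when the digital object changes; the trade-off is that every lookup pays the cost of running the extraction.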