1 Authenticity Capture Prototype
Matt Dunckley, STFC
2 Why we need tools
Demonstrable authenticity
Trustworthiness (implies quality)
Need to capture all information deemed necessary to make an informed judgement as to authenticity
Provide a customisable and flexible mechanism to define which PDI is important and needs capturing
Design is based on the Authenticity Model, which provides a standard process and framework
3 Authenticity Model
Framework to record pertinent information using standard terminology
Events – Capture information about an event of importance
Protocols – Executed in response to an event occurring
Steps – Detail the information to capture for the event
AuthProtocolExecutionReport – Result of execution, used by a community member to make a judgement
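The relationship between events, protocols, steps and the execution report can be sketched in a few lines of Python. This is only an illustration of the terminology on this slide: the class names, field names and the `execute` function are assumptions made for the example, not the prototype's actual data model or the XML schema.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    step_id: str
    pdi_ids: list  # PDI items this step asks the capturer to record (hypothetical field)


@dataclass
class Protocol:
    protocol_id: str
    event: str  # name of the event that triggers this protocol
    steps: list = field(default_factory=list)


def execute(protocol, evidence):
    """Build a simple execution report listing captured and missing PDI items."""
    captured, missing = {}, []
    for step in protocol.steps:
        for pdi_id in step.pdi_ids:
            value = evidence.get(pdi_id, "")
            if value.strip():
                captured[pdi_id] = value
            else:
                missing.append(pdi_id)
    return {
        "protocol": protocol.protocol_id,
        "event": protocol.event,
        "captured": captured,
        "missing": missing,
        "complete": not missing,
    }


# Illustrative protocol with one step and two PDI items (names are made up).
ingestion = Protocol(
    protocol_id="ingestion",
    event="Ingestion of raw data files",
    steps=[Step("step1", ["source_of_dataset", "archivist_details"])],
)
report = execute(ingestion, {"source_of_dataset": "Ionosonde station"})
print(report["missing"])
```

The report dictionary stands in for the AuthProtocolExecutionReport: it gives a reader enough to see what was captured and what is still outstanding before making a judgement.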
4 Nature of the Tool
Keeping it simple to start with
GUI tool
Capture mainly textual information (evidence) when events occur
Consistent and standard user experience to perform capture
Where possible, allow automated capture through plug-ins and external tools
5 User Roles – Use Cases
Project Creator/Administrator
  –Sets up an Authenticity project, providing detailed project information: reasons, objectives, i.e. the business case
  –Imports the project's Authenticity model, an XML protocol document based on the Authenticity Model
Authenticity Information Capturer
  –Registers to a project
  –Edits/adds user profile information, i.e. credentials, details of their role, qualifications, affiliations, accreditations
  –Creates an instance of a project, i.e. for a particular digital object
  –Follows the authenticity protocols/steps, a procedure for capturing the specified information for each step
  –Signs off the capture as complete
Researcher/Consumer / All users
  –Can search for authenticity information by project
  –Can browse the holdings
  –Can export captured information in various formats
6 Digests
To verify that the collected Authenticity information is trustworthy, it is important to detect any forgery and whether the information has been tampered with
To allow a consumer to determine this, we use a digest when digitally signing the captured information
A cryptographic hash function is applied to the captured text information, returning a (cryptographic) hash value, such that an accidental or intentional change to the data will change the hash value
To investigate whether there has been some change, the hash value can be recalculated and compared with the original
A hash value is also known simply as a digest
The digest will also show whether digital corruption has occurred
The properties of a good digest should mean it would be
  –impracticable to find a message that has a given hash,
  –impracticable to modify a message without changing its hash,
  –impracticable to find two different messages with the same hash.
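The recalculate-and-compare check described above can be sketched with Python's standard `hashlib` module. The function names are hypothetical, and SHA-256 is an assumption chosen for the example. A digest only detects change if the recorded value is itself protected, which is presumably why it is paired with digital signing; signing is not shown here.

```python
import hashlib
import hmac


def compute_digest(text, algorithm="sha256"):
    """Return the hex digest of the captured text, encoded as UTF-8."""
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


def is_unchanged(text, recorded_digest, algorithm="sha256"):
    """Recalculate the digest and compare it with the recorded one."""
    return hmac.compare_digest(compute_digest(text, algorithm), recorded_digest)


original = "BUDAPEST - HUNGARY"
recorded = compute_digest(original)

print(is_unchanged(original, recorded))                # same text
print(is_unchanged("BUDAPEST - HUNGARY ", recorded))   # one trailing space added
```

Even a single added space produces a different digest, which is the behaviour the second and third properties above rely on.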
7 Digests
At each information capture point a digest of the information is recorded
At the point of sign-off a digest of all captured information is recorded
Example of information captured for Ionosonde station Location
<capture capturer="us01" confidence="85"
  dateTime=" :09:36"
  hashAlorithum="md5"
  hashValue="7a1e512cef9cfd93af cc2"
  pdiID="Ionosonde-01_step1_pdi2"
  pdiValue="BUDAPEST - HUNGARY Geographic Latitude (WGS-84) 47.00°N Geographic Longitude (WGS-84) 19.00°E Magnetic Latitude (IGRF-10(2005)) 45.93°N Magnetic Longitude (IGRF-10(2005)) °E"
  projectID="Ionosonde" stepID="Ionosonde-01_step1" timestamp=" "/>
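A minimal sketch of the two digest points, one per capture and one at sign-off, is given below. The attribute names are copied from the example element on this slide; everything else is an assumption: the function names, the use of SHA-256 instead of the MD5 shown on the slide, the way the sign-off digest is assembled from the captures, and the omission of the `timestamp` attribute, whose format is not recoverable from the slide.

```python
import hashlib
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ALGORITHM = "sha256"  # assumption; the slide's example records md5


def make_capture(project_id, step_id, pdi_id, pdi_value, capturer, confidence):
    """Build a <capture> element carrying a digest of the captured value."""
    digest = hashlib.new(ALGORITHM, pdi_value.encode("utf-8")).hexdigest()
    return ET.Element("capture", {
        "capturer": capturer,
        "confidence": str(confidence),
        "dateTime": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "hashAlorithum": ALGORITHM,  # attribute name spelled as on the slide
        "hashValue": digest,
        "pdiID": pdi_id,
        "pdiValue": pdi_value,
        "projectID": project_id,
        "stepID": step_id,
    })


def signoff_digest(captures):
    """Digest over every captured ID and value, in order (one possible scheme)."""
    h = hashlib.new(ALGORITHM)
    for capture in captures:
        for key in ("pdiID", "pdiValue"):
            h.update(capture.get(key).encode("utf-8"))
            h.update(b"\x00")  # separator so adjacent fields cannot run together
    return h.hexdigest()


captures = [
    make_capture("Ionosonde", "Ionosonde-01_step1", "Ionosonde-01_step1_pdi2",
                 "BUDAPEST - HUNGARY", "us01", 85),
]
final = signoff_digest(captures)
print(ET.tostring(captures[0], encoding="unicode"))
```

Recomputing `signoff_digest` later and comparing it with the value recorded at sign-off shows whether any captured item has changed since the capturer signed off.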
8 XML Protocol Document Defined by XML Schema – Based on Authenticity Model
9 User Experience / Flow diagram
10 Case Study
Applied Auth Model - Ionosonde WDC, STFC
To allow us to design the Auth Protocols:
1st Statement of policy (Authenticity Recommendation) by which we can measure whether we have captured enough evidence
  –Record all PDI necessary to verify the authenticity and quality of received data files for long-term archival within the WDC
2nd Identify Events
  –Ingestion of raw data files in varying formats
  –Transformation of received data files into IIWG format
  –Final validation and archival of IIWG file within WDC
3rd Design Protocols and steps
11 Ingestion Protocol
Ingestion of raw data files in varying formats
Recommendation (Policy)
  In order for this digital data to be accepted as Ionosonde data of sufficient quality, the reliability of its source must be verified and recorded by a WDC-accredited archivist
Steps will capture
  –Source of dataset
      Evidence that this is indeed the source
  –Archivist name and details
      Ideally some form of credentials would be attached
12 Transformation Protocol
Transformation of received data files into IIWG format
Recommendation (Policy)
  For the received data file to be deemed of sufficient quality to support data analysis, it must have been successfully transformed into the standard IIWG data format, and the use of processing software must be recorded
Steps will capture
  –Details of transformation used
      Software details including name, version, source
      Time/date of process
      System details
      Details of person responsible
      Reason for believing this software is reliable
  –Details of Transformational Information Properties checked
      Information Property descriptions
      Values checked
      Details of how checked
      Details of who was responsible for the check
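The second step, recording which Transformational Information Properties were checked, can be sketched as a comparison of property values before and after transformation. The function name, record fields, property names and values below are all illustrative assumptions; how properties are actually extracted from raw or IIWG files is not covered here.

```python
from datetime import datetime, timezone


def check_properties(before, after, checked_by, method):
    """Record, per property, the values checked, how, by whom, and the outcome."""
    records = []
    for name in sorted(before):
        records.append({
            "property": name,
            "expected": before[name],
            "observed": after.get(name),
            "preserved": after.get(name) == before[name],
            "how_checked": method,
            "checked_by": checked_by,
            "checked_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        })
    return records


# Made-up property values for a source file and its transformed counterpart.
source_props = {"station_name": "BUDAPEST", "record_count": 96}
transformed_props = {"station_name": "BUDAPEST", "record_count": 95}

records = check_properties(source_props, transformed_props,
                           checked_by="us01", method="automated comparison")
for r in records:
    print(r["property"], r["preserved"])
```

Each record covers the four details listed above: the property, the values checked, how they were checked and who was responsible.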
13 Archival Protocol
Final validation and archival of IIWG file within WDC
Recommendation (Criteria)
  Successful validation of the IIWG structure and syntax must be achieved and recorded before long-term archival can take place
Steps will capture
  –Details of validation process
      Details of checks performed
      Details of person responsible
  –Details of transfer to storage
      Details of person responsible
      Checks performed on transfer, e.g. fixity checks
      Details of archive and storage system
      Date/time of transfer
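A fixity check on transfer can be sketched as a checksum taken before and after the copy. This is a minimal illustration that assumes a local file copy and SHA-256; the function names, file names and record fields are hypothetical, and a real archive transfer would use the storage system's own mechanisms.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def transfer_with_fixity(src, dst, algorithm="sha256"):
    """Copy src to dst and record whether the digests match."""
    before = file_digest(src, algorithm)
    shutil.copyfile(src, dst)
    after = file_digest(dst, algorithm)
    return {
        "source": str(src),
        "destination": str(dst),
        "algorithm": algorithm,
        "digest": before,
        "fixity_ok": before == after,
    }


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "example.iiwg"        # placeholder file name
    dst = Path(tmp) / "archive_copy.iiwg"   # placeholder archive location
    src.write_bytes(b"illustrative file content\n")
    record = transfer_with_fixity(src, dst)

print(record["fixity_ok"])
```

The returned record could be stored alongside the person responsible and the date/time of transfer as evidence for this step.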