Designing Persistency. Delos NoE, Preservation Cluster Workshop: Persistency in Digital Libraries, 14 February 2006, Oxford Internet Institute
1 Digital Objects Digital objects are the core elements a digital library deals with. For long-term preservation they have to be kept persistent. We therefore need DL systems that are able to keep digital objects persistent.
2 Digital Objects: Structure Digital objects are composed entities with a logical structure. Example:
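As an illustration of such a composed entity (not the example shown on the original slide), the following minimal Python sketch models a digital object as a tree of components. The class name `Component`, its fields and the book/chapter/page layout are assumptions made for this sketch only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """A node in the logical structure of a digital object (hypothetical)."""
    label: str
    children: List["Component"] = field(default_factory=list)

    def count(self) -> int:
        """Number of components in this subtree, including this one."""
        return 1 + sum(child.count() for child in self.children)


# An assumed example object: a book made of chapters, which are made of pages.
book = Component("book", [
    Component("chapter 1", [Component("page 1"), Component("page 2")]),
    Component("chapter 2", [Component("page 3")]),
])
```

Keeping the structure explicit in this way is what allows a system to preserve the relations between parts, and not only the parts themselves.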
3 Life-cycle of Entities Persistency appears in the life-cycle of any entity within a digital library.
4 Life-cycle of Entities Instability likewise appears in the life-cycle of any entity within a digital library.
5 Requirements for the Design of a Persistent DL System The system should be able to respond to the transient and composite nature of a digital library's entities. Entities in digital libraries can differ in nature. For example, the system should be able to keep the structure of a collection, as well as that of a digital object, persistent.
6 Requirements for the Design of a Persistent DL System It should also be able to react to the common failures that can occur in a complex system. It should cope with different preservation strategies. Implementation: The design should be expressed in a way that makes implementing and maintaining the system as easy as possible.
7 This includes, for example:
– We need a flexible system that is able to deal with digital objects of any structure.
– The overall processing of changes to a digital object needs to be strictly controlled.
– The individual processing steps need to be grouped according to their functional similarity.
8
– All events that may have an impact on the longevity of digital objects need to be recorded.
– Modifying access to digital objects (external or internal) needs to be "announced" to the preservation system.
– The design of the system needs to be expressed in a flexible, extensible, standardised and widely accepted language.
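The recording and announcing requirements listed above can be sketched as a small event log. This is only a minimal illustration: the names `PreservationEvent`, `EventLog`, `announce` and `history`, and the example action strings, are assumptions and do not come from the Preservation Module design.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class PreservationEvent:
    """One recorded event that may affect an object's longevity (hypothetical)."""
    object_id: str
    action: str        # assumed free-text action name, e.g. "migrate"
    origin: str        # "external" or "internal"
    timestamp: datetime


class EventLog:
    """Collects every announced modifying access."""

    def __init__(self) -> None:
        self._events: List[PreservationEvent] = []

    def announce(self, object_id: str, action: str, origin: str) -> PreservationEvent:
        """Record a modifying access before it is carried out."""
        if origin not in ("external", "internal"):
            raise ValueError("origin must be 'external' or 'internal'")
        event = PreservationEvent(object_id, action, origin,
                                  datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def history(self, object_id: str) -> List[PreservationEvent]:
        """All recorded events for one object, in the order announced."""
        return [e for e in self._events if e.object_id == object_id]
```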
9 Prerequisites for this Approach We decided to use UML for modelling persistent digital library systems. Therefore, systems that adopt the Preservation Module must be expressed in UML notation. The approach suggested here focuses on the implementation of the system, that is, on a digital library as a software system.
10 Preservation Module
16 Registering Interface Core tasks: Receiving and forwarding messages that represent a potentially persistency-sensitive scenario.
18 Persistency Agent Core tasks: Interface between the processing units of the Preservation Module and the DL system ("Gate"); delegation of messages; controlling the message flow to the surrounding system components. Additional tasks could be: producing log files when the Preservation Module is activated, and collecting statistics.
19 Processing Controller Heart of the core unit. Unit at the top level of control. Coordinates the individual steps of the overall processing. Connected to the other functional units of the Preservation Module.
20 Persistency Memory Unit that encapsulates all preservation-sensitive parameters. Keeps and maintains a machine-readable list of preservation-sensitive parameters. Could be realised as a database.
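Since the slide suggests a database as one possible realisation, here is a minimal sketch using Python's standard `sqlite3` module. The table layout, the class name `PersistencyMemory` as a Python class, and the `set`/`get` methods are assumptions for illustration, not part of the original design.

```python
import sqlite3
from typing import Optional


class PersistencyMemory:
    """Stores preservation-sensitive parameters per object (hypothetical schema)."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS parameters ("
            "object_id TEXT, name TEXT, value TEXT, "
            "PRIMARY KEY (object_id, name))"
        )

    def set(self, object_id: str, name: str, value: str) -> None:
        """Insert a parameter or overwrite its previous value."""
        self._db.execute(
            "INSERT OR REPLACE INTO parameters VALUES (?, ?, ?)",
            (object_id, name, value),
        )
        self._db.commit()

    def get(self, object_id: str, name: str) -> Optional[str]:
        """Return the stored value, or None if the parameter is unknown."""
        row = self._db.execute(
            "SELECT value FROM parameters WHERE object_id = ? AND name = ?",
            (object_id, name),
        ).fetchone()
        return row[0] if row else None
```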
21 Persistency Guard Unit that controls the execution of basic, internal preservation actions ("routines"), such as checking whether a digital object is physically in good order. Initiates and controls modifications to the Persistency Memory.
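One way to read "physically in good order" is a fixity check: compare an object's current checksum with one recorded earlier. The slides do not say how the check is done, so the use of SHA-256 and the names `register` and `is_intact` below are assumptions of this sketch.

```python
import hashlib
from typing import Dict


def checksum(data: bytes) -> str:
    """SHA-256 digest of the object's bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()


class PersistencyGuard:
    """Runs a routine integrity check against recorded checksums (hypothetical)."""

    def __init__(self) -> None:
        self._expected: Dict[str, str] = {}

    def register(self, object_id: str, data: bytes) -> None:
        """Record the checksum of a known-good copy."""
        self._expected[object_id] = checksum(data)

    def is_intact(self, object_id: str, data: bytes) -> bool:
        """True only if the object was registered and its bytes are unchanged."""
        return self._expected.get(object_id) == checksum(data)
```

In a fuller design the recorded checksums would presumably live in the Persistency Memory; the in-memory dictionary here just keeps the sketch self-contained.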
22 Object Handler Interface between the core unit of the Preservation Module and the system unit that is responsible for the storage of the digital objects (repository). Controls all processing steps which are related to the repository.
23 Core Unit with exemplary functionality
24 MHU
25 Message Handling Unit
Message Handler:
– Pre-processing of the message: obtaining its semantic content.
– Deciding whether or not a message is preservation-sensitive.
External Message Handler: Exact semantic interpretation of messages that are forwarded from external units.
Internal Message Handler: Exact semantic interpretation of messages that are forwarded from the inside.
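The division of labour between the three handlers can be sketched as follows. The `Message` fields, the set `SENSITIVE_ACTIONS` and the returned strings are hypothetical; the slides do not define a message format or the criteria for preservation sensitivity.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed set of actions that count as preservation-sensitive.
SENSITIVE_ACTIONS = {"modify", "delete", "migrate"}


@dataclass
class Message:
    source: str      # "external" or "internal"
    action: str
    object_id: str


def is_preservation_sensitive(message: Message) -> bool:
    """Message Handler: decide whether the message matters for preservation."""
    return message.action in SENSITIVE_ACTIONS


def handle_external(message: Message) -> str:
    """External Message Handler: interpret a message from an external unit."""
    return f"external request to {message.action} {message.object_id}"


def handle_internal(message: Message) -> str:
    """Internal Message Handler: interpret a message from inside the module."""
    return f"internal routine wants to {message.action} {message.object_id}"


def handle(message: Message) -> Optional[str]:
    """Pre-process, filter, then dispatch to the matching handler."""
    if not is_preservation_sensitive(message):
        return None  # nothing for the Preservation Module to do
    if message.source == "external":
        return handle_external(message)
    return handle_internal(message)
```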
26 Message Handling Unit Message Handling Unit with exemplary functionality
27 PWPU
28 Persistency Workflow Processing Unit Unit of the Preservation Module that controls all actions which concern the modification of the object (workflow). A workflow is composed of a sequence of smaller workflow units that are encapsulated in the Persistency Workflow class.
29 Persistency Workflow Processing Unit The Persistency Workflow Handler initiates the workflow process. It also composes the overall workflow that has to be carried out, according to the messages previously interpreted and forwarded to it. The Persistency Workflow Controller is responsible for controlling the individual workflow steps.
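A minimal sketch of this split, assuming that a workflow unit can be modelled as a function from object state to object state. The step functions, the `STEPS_FOR_ACTION` table and the dictionary used as object state are all hypothetical and only illustrate composing a workflow and then running its steps in order.

```python
from typing import Callable, Dict, List

# A workflow unit: takes the object's state and returns the new state.
Step = Callable[[dict], dict]


def validate(obj: dict) -> dict:
    return {**obj, "validated": True}


def migrate(obj: dict) -> dict:
    return {**obj, "format": "PDF/A"}


def record(obj: dict) -> dict:
    return {**obj, "history": obj.get("history", []) + ["migrated"]}


# Assumed mapping from an interpreted message to the units it requires.
STEPS_FOR_ACTION: Dict[str, List[Step]] = {
    "check": [validate],
    "migrate": [validate, migrate, record],
}


class PersistencyWorkflowHandler:
    """Composes the overall workflow from previously interpreted messages."""

    def compose(self, actions: List[str]) -> List[Step]:
        workflow: List[Step] = []
        for action in actions:
            workflow.extend(STEPS_FOR_ACTION[action])
        return workflow


class PersistencyWorkflowController:
    """Controls execution of the individual workflow steps, in sequence."""

    def run(self, workflow: List[Step], obj: dict) -> dict:
        for step in workflow:
            obj = step(obj)
        return obj
```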
30 Persistency Workflow Processing Unit PWPU with exemplary functionality
31 Integration of the PM into Existing Systems The PM is conceived as a software module with only two interfaces that have to be connected to the DL system. The first interface, linking the PM via the Persistency Agent (PA) to the DL system, is called the Message Gate; the second, linking the PM to a repository, is called the Object Gate.
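In code, the two gates could be expressed as abstract interfaces that an existing DL system implements. The method names `send`, `load` and `save`, and the in-memory implementations, are assumptions for this sketch; the slides name the gates but do not specify their operations.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class MessageGate(ABC):
    """Hypothetical contract between the DL system and the Persistency Agent."""

    @abstractmethod
    def send(self, message: str) -> None:
        """Pass a message from the DL system into the Preservation Module."""


class ObjectGate(ABC):
    """Hypothetical contract between the Preservation Module and a repository."""

    @abstractmethod
    def load(self, object_id: str) -> bytes:
        """Fetch an object's bytes from the repository."""

    @abstractmethod
    def save(self, object_id: str, data: bytes) -> None:
        """Write an object's bytes to the repository."""


class ListMessageGate(MessageGate):
    """Toy implementation that simply keeps the messages it receives."""

    def __init__(self) -> None:
        self.received: List[str] = []

    def send(self, message: str) -> None:
        self.received.append(message)


class DictObjectGate(ObjectGate):
    """Toy repository backed by a dictionary."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def load(self, object_id: str) -> bytes:
        return self._objects[object_id]

    def save(self, object_id: str, data: bytes) -> None:
        self._objects[object_id] = data
```

Restricting the module to two narrow interfaces like these means an existing system only has to supply its own implementations of each gate in order to connect.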
33 Integration of the PM into Existing Systems Step 1: Specification of the interfaces
34 Depending on the analysis of the existing model, there are various possibilities:
35 Step 2: Adding the core unit
36 Step 3: Adding the Message Handling Unit
37 Step 4: Modelling the PWPU
38 Integration of the PM into the conceptual models: EF
39 Integration of the PM into the conceptual models: 5S