ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 1 Workflows in Digital Preservation.

Slides:



Advertisements
Similar presentations
Writing Good Use Cases - Instructor Notes
Advertisements

Current design issues for digital archives Robert Munro (presented by David Nathan) Endangered Languages Archive (ELAR), School of Oriental and African.
Chapter 1: The Database Environment
Chapter 7 System Models.
Requirements Engineering Process
Copyright © 2003 Pearson Education, Inc. Slide 1 Computer Systems Organization & Architecture Chapters 8-12 John D. Carpinelli.
Chapter 1 The Study of Body Function Image PowerPoint
Copyright © 2011, Elsevier Inc. All rights reserved. Chapter 6 Author: Julia Richards and R. Scott Hawley.
NATIONAL AERONAUTICS AND SPACE ADMINISTRATION ISO 14721:2003 OAIS - RM 6 July, 2004 Richard Ullman / 18th APAN, eScience Workshop, Cairns Austrailia 1.
1 The CASPAR Project Key Components - Overview Luigi Briguglio Engineering R&D Laboratory – Roma CASPAR FINAL WORKSHOP IN ROME SCUOLA SUPERIORE PUBBLICA.
By Rick Clements Software Testing 101 By Rick Clements
1 Hyades Command Routing Message flow and data translation.
Software Process Modeling with UML and SPEM
Business Transaction Management Software for Application Coordination 1 Business Processes and Coordination. Introduction to the Business.
18 Copyright © 2005, Oracle. All rights reserved. Distributing Modular Applications: Introduction to Web Services.
1 RA I Sub-Regional Training Seminar on CLIMAT&CLIMAT TEMP Reporting Casablanca, Morocco, 20 – 22 December 2005 Status of observing programmes in RA I.
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
FACTORING ax2 + bx + c Think “unfoil” Work down, Show all steps.
Year 6 mental test 10 second questions
Limitations of the relational model 1. 2 Overview application areas for which the relational model is inadequate - reasons drawbacks of relational DBMSs.
Making the System Operational
|epcc| NeSC Workshop Open Issues in Grid Scheduling Ali Anjomshoaa EPCC, University of Edinburgh Tuesday, 21 October 2003 Overview of a Grid Scheduling.
Programming Language Concepts
1 Implementing Internet Web Sites in Counseling and Career Development James P. Sampson, Jr. Florida State University Copyright 2003 by James P. Sampson,
Solve Multi-step Equations
REVIEW: Arthropod ID. 1. Name the subphylum. 2. Name the subphylum. 3. Name the order.
Week 2 The Object-Oriented Approach to Requirements
Electric Bus Management System
Configuration management
Fact-finding Techniques Transparencies
OOAD – Dr. A. Alghamdi Mastering Object-Oriented Analysis and Design with UML Module 3: Requirements Overview Module 3 - Requirements Overview.
ABC Technology Project
1 Web-Enabled Decision Support Systems Access Introduction: Touring Access Prof. Name Position (123) University Name.
1 Undirected Breadth First Search F A BCG DE H 2 F A BCG DE H Queue: A get Undiscovered Fringe Finished Active 0 distance from A visit(A)
CS 6143 COMPUTER ARCHITECTURE II SPRING 2014 ACM Principles and Practice of Parallel Programming, PPoPP, 2006 Panel Presentations Parallel Processing is.
VOORBLAD.
Software Requirements
1 RA III - Regional Training Seminar on CLIMAT&CLIMAT TEMP Reporting Buenos Aires, Argentina, 25 – 27 October 2006 Status of observing programmes in RA.
BIOLOGY AUGUST 2013 OPENING ASSIGNMENTS. AUGUST 7, 2013  Question goes here!
Factor P 16 8(8-5ab) 4(d² + 4) 3rs(2r – s) 15cd(1 + 2cd) 8(4a² + 3b²)
1..
© 2012 National Heart Foundation of Australia. Slide 2.
Lecture plan Outline of DB design process Entity-relationship model
Lecture 6: Software Design (Part I)
Understanding Generalist Practice, 5e, Kirst-Ashman/Hull
Global Analysis and Distributed Systems Software Architecture Lecture # 5-6.
25 seconds left…...
Subtraction: Adding UP
Copyright 2001 Advanced Strategies, Inc. 1 Data Bridging An Overview Prepared for DIGIT By Advanced Strategies, Inc.
H to shape fully developed personality to shape fully developed personality for successful application in life for successful.
Chapter 2 Entity-Relationship Data Modeling: Tools and Techniques
Januar MDMDFSSMDMDFSSS
REGISTRATION OF STUDENTS Master Settings STUDENT INFORMATION PRABANDHAK DEFINE FEE STRUCTURE FEE COLLECTION Attendance Management REPORTS Architecture.
Chapter 10: The Traditional Approach to Design
Systems Analysis and Design in a Changing World, Fifth Edition
We will resume in: 25 Minutes.
©Brooks/Cole, 2001 Chapter 12 Derived Types-- Enumerated, Structure and Union.
PSSA Preparation.
Chapter 13 The Data Warehouse
CpSc 3220 Designing a Database
© 2014 Fair Isaac Corporation. Confidential. This presentation is provided for the recipient only and cannot be reproduced or shared without Fair Isaac.
1 Distributed Agents for User-Friendly Access of Digital Libraries DAFFODIL Effective Support for Using Digital Libraries Norbert Fuhr University of Duisburg-Essen,
From Model-based to Model-driven Design of User Interfaces.
Lifecycle Metadata for Digital Objects November 15, 2004 Preservation Metadata.
Open Archival Information System
Presentation transcript:

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 1 Workflows in Digital Preservation

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 2 Overview  Workflows  Processes in Archives  Workflow Modeling for Digital Archives  Workflows in Digital Archives  Example: Open Archival Information System (OAIS) and “Producer-Archive Interface Methodology Abstract Standard”  The State of Workflow

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 3 Workflow Roots  Office Automation Systems (mid 1970s)  Document Management Systems (mid 1980s)  Applications  Database Management

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 4 Definition: Workflow  “A workflow is a specific representation of a process, which is designed in such a way that the formal coordination mechanisms between activities, applications, and process participants can be controlled by an information system, the so-called workflow management system.” Michael zur Muehlen (2004) Workflow-based Process Controlling

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 5 What does this definition mean? 1.A formal process definition exists 2.This process definition is the result of process design 3.A workflow management system coordinates the steps in the process and assigns the participants as defined by the process definition 4.The participants can be (IT-) applications or humans.

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 6 Processes in Archives  If defined, mainly for arrangement  High degree of implicit knowledge  Local flavours  No standards  Very adaptive  Ever changing

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 7 Why Workflows?  Massive amounts of similar data cannot be processed “by hand” (automation)  Quality management  Authentic data needs an audit trail, documenting its handling (retraceability)  Implicit knowledge becomes explicit (sharing knowledge)  Centralized implementation  Iterative development

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 8 Barriers for Workflows in Digital Archives  Small number of digital archives  Little interest  Little research  Implicit knowledge  Small market  No “out of the box” solutions  DSpace implements a simple workflow  Fixed  Only for Ingest (data and metadata checks)

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 9 Rectulance against Workflow Management  Some quotes:  “It’s just another hype”  “We’re low on resources already!”  “Will I still be able to do … ?”  “Another 'big brother' tool”  “This is too technical”  “We've always done it this way”

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 10 Workflows for Digital Preservation Disadvantages:  Initial costs  Complexity  Efforts for definition  Set-up  More control  Too many tools and standards Advantages:  Documentation  Automation  High throughput  Audit trail  More control  Sharing knowledge  Connecting Applications

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 11 Specific Features of Digital Preservation Workflow Modeling  Formal Semantics  Flexibility  Decomposition  Simplicity  Stability  Retraceability

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 12 Formal Semantics  Unambiguous design and enactment of workflows  Textual definitions of semantics will change their meaning in time  Mathematical backing needed  Additional advantages:  Automatic model checks for certain properties of a workflow (e.g. deadlocks)  Simplified implementation

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 13 Flexibility  Every workflow will change with time  New technology will bring new possibilities  New possibilities should not require a new modeling language  Eases long term data management (of the workflows)

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 14 Decomposition  Workflows should not be monolithic  Vertical decomposition in modules that can be composed into some new workflow  Horizontal decomposition to enable more implementational details at lower levels  Reuse of tested building blocks  Parallel development

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 15 Simplicity  The principles should be easy to understand by current and future users (e.g. submitters, personnel and researchers)  Better user acceptance because everybody can understand how it works  Simple principles are harder to misinterpret in the long term

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 16 Stability  The notion of workflows is relatively new  Every change in semantics must be preserved (exponential complexity in data management)  Workflow modeling standards have to show that they are here to stay  Active research, development and use keeps standards alive  Backwards compatibility is a must

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 17 Retraceability  The audit trail should contain enough data to “relive” a workflow (including all routing decisions and error-handling)  The workflow definition is still needed  Basic requirement for authentic digital objects

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 18 Workflows and the OAIS RM

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 19 What is the OAIS RM?  The Open Archival Information System Reference Model (OAIS RM) is a unique standard for digital archives (ISO 14721:2003)  Framework for  Terminology and concepts  Architecture of digital archives  Defines roles and responsibilities  Defines functional entities and logical data models

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 20 Implementation of OAIS Workflows  OAIS RM defines multiple high-level workflows  Scattered in the standard, as it only describes the functions and their interactions  Mainly data flows  Act as guidelines  Implementation by the user  No definition of audit data

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 21 Example Ingest Workflow 1/2  Defined by the OAIS RM 1.Producer delivers Submission Information Package (SIP) 2.Ingest checks SIP against contract (validation) 3.Ingest creates Archival Information Package AIP(s) and Package Description(s) 4.Ingest stores Package Description(s) (metadata) in Data Management 5.Ingest stores AIP(s) in Archival Storage 6.Ingest notifies the producer, that the data has been stored and that the may delete his data

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 22  High level  Simple steps  Circles and arrows OAIS Ingest Workflow

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 23 Example Ingest Workflow 2/2  Refined step 2 (validation) by “Producer- Archive Interface Methodology Abstract Standard” (CCSDS B-1)  Steps:  Apply the validations  Systematic validation (after each transfer session) or  In-depth validation (only on a coherent set of data)  If the validation has failed, the producer is notified.

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 24  Same here Validation Sub-Workflow Validation

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 25 What is missing?  Data flow  Interfaces  Audit Data  Roles and resources  Semantics

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 26 The State of Workflow in Preservation 1/2 Technology Hype Curve by Tom Baeyens (2004)

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 27 The State of Workflow in Preservation 2/2  No common “workflow basis” (à la relational algebra for relational databases)  No single standard for workflow definition (à la structured query language (SQL) for relational databases)  Are we too early?

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 28 Thank You  Stephan Heuscher m t f  ikeep Ltd. Digital Archives Services Morgenstrasse 129 CH-3018 Bern Switzerland Contact:

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 29 Workflow Lifecycle 1/2  Planning  Define process, organization, and data models  Define goals  Implementation  Transform models for execution  Create interfaces to systems and users)

ERPANET Workshop on Workflow / 13. October 2004 / Stephan Heuscher / p. 30 Workflow Lifecycle 2/2  Enactment  Coordinate control and data flow, resource assignments and application invocation  Monitor workflow enactment  Notify users  Evaluation  Evaluate audit trail data