Presentation is loading. Please wait.

Presentation is loading. Please wait.

Softwaretechnologie für Fortgeschrittene Teil Eide Stunde IV: Media server operations and abstraction (with contributions from Christian-Emil Ore, Jon.

Similar presentations


Presentation on theme: "Softwaretechnologie für Fortgeschrittene Teil Eide Stunde IV: Media server operations and abstraction (with contributions from Christian-Emil Ore, Jon."— Presentation transcript:

1 Softwaretechnologie für Fortgeschrittene Teil Eide Stunde IV: Media server operations and abstraction (with contributions from Christian-Emil Ore, Jon Holmen, and other colleagues at the Unit for Digital Documentation, University of Oslo) Köln 17. Dezember 2015

2 File processing Parse processing XML Set up production line – any conversion path with possible conversions can be added Matrix of in and out formats and default scripts – can be overridden. Scripts run in background – queue handling – load balancing

3 Remember work flow Recording a (camera) media_unit a (original image raw format) media_unit b (large tiff) media_unit c (large jpeg) media_unit d (small jpeg) Recording b (processing software) Recording c (image processing server) Recording d (image processing server) One information object for each original recording (Work) (media_group)

4 File processing xml (example) <IMAGE type="digital_kopi" method="convert" format="jpg" subpath="jpeg" rec_process="3" software="Imagemagick 6.2.9" settings="fileFormat JFIF contentFormat 24BITRGB”> <IMAGE type="digital_kopi" method="convert" format="jpg" width="640" height="480" subpath="small" rec_process="3" software="Imagemagick 6.2.9" settings="maxScale 640x480 fileFormat JFIF contentFormat 24BITRGB" default="1">

5 File processing script Process status: 0: not converted 1: under conversion 2: converted 9: conversion error Outer structure producing conversion jobs: while (!kill_signal) fork out new process wait 10 seconds

6 File processing script Each forked our job: dba:connect int i= dba:get_current_conversion_count (status 1) if (i > 10) die string proc_str= dba:get_proc_string with status 0 dba:set_proc_string_status(proc_str) to 1 boolean OK= true while (!end(proc_str)) interpret(proc_str) string comm= pick_command(proc_str) int ret_val= system(comm) if (ret_val == 0) dba:record_new_file_name else OK= false if (OK = true) dba:set_proc_string_status(proc_str) to 2 else dba:set_proc_string_status(proc_str) to 9

7 Archiving script This script is run every night (cron job on unix) Files for archiving are files where – archive date is NULL – error_field is NULL

8 Remember long term preservation tapestation diskraid Digital Original Other copies Digital Original Tape duplicates

9 Archiving script data_list= dba:get_files_for_archiving foreach (data_line from data_list) if (data_line:file exist) dba:write(“Archive command with timestamp”) int ret_val= archive(file) if (ret_val = “Error”) dba:write(“Archive error”) else ret_val= query_archived_file int file_size= ret_val:filesize if (ret_val = “Archived > 1”) dba:write(“Archive warning: multi”) call delete_test(file_size) else if (ret_val = “Archived = 1”) dba:write(“Archiving OK”) call delete_test(file_size) else dba:write(“Archived but not found error”) else dba:write(“File not found error”)

10 Archiving scripts: delete files delete_test: file_size_archiving= arg file_size_limit= dba:ask_limit_for_schema if (file_size_archiving > file_size_limit) dba:write(delete: yes) delete: file_list= dba:files_where(delete=“yes”, deleted=“NULL”) foreach (file in file_list) int ret_val= system(delete file) if (ret_val = 0) dba:write(“Deleted date_time”) else dba:write(“Delete error”)

11 Monitoring Zombie jobs Server load Memory consumption – leakage – fine on a PC, but with uptime > 100 days… Database – error messages – instability – abnormal behaviour Must be monitored by humans (using tools) – email messages with control data

12 Extending metadata Some metadata must be kept – e.g., process records – format details Some metadata can be changed – classification – motive description But keep old versions – institution history – legal liability historical institutional/governmental racism land rights

13 Example system 1: Fedora Based on self promotion: http://www.fedora-commons.org An open source repository system – for the management and dissemination of digital content – robust – modular Especially suited for – digital libraries and archives – access as well as preservation Used to provide specialized access to – very large and complex digital collections – historic and cultural materials – scientific data

14 Example system 1: Fedora User base that includes – academic and cultural heritage organizations – universities – research institutions – university libraries – national libraries – government agencies

15 Example system 1: Fedora Long-term flexible access to digital resources – can be used to support all types of digital content – digital collections – e-research – digital libraries and archives – digital preservation – institutional repositories – open access publishing – document management – digital asset management

16 Example system 2: Omeka Based on self promotion: http://omeka.orghttp://omeka.org A free, open-source, digital publishing suite for – scholars – librarians – archivists – museum professionals – cultural enthusiasts to publish – archives – collections – exhibits – teaching materials Provide interaction for public audiences

17 Example system 2: Omeka Extensible, scalable, and flexible – can handle large collections ( >1 million items) – element sets for institution-specific metadata may be added – Zend framework for PHP allows for customization – accepts and stores all types of files images video audio multi-page documents and PDFs Power Point presentations individual items may contain multiple files etc. – extensible with dozens of available plugins

18 Example system 2: Omeka Standards-based – metadata (tied to Dublin Core) – web design Interoperable – Dublin Core – Fedora Connect Data sharing – feeds – RDF – migration Re-purpose content – enter or import item metadata once – use items and metadata in multiple instances across website – including exhibits

19 Towards abstract modelling How can this method be generalised? – Some preliminary notes Learning strategies The role of theory theory/modelling implementation

20 What is an image? An image can be found – in a data file – on a 35mm film – at a paper positive – at a glass plate –…–… We do not care, they are all images – modelled as media_units – connected to media_groups If something is – another media_unit to an existing media_group, or – a derived media_group is a scholarly (content based) choice

21 The transformation event Each transformation event happened in time – may or may not know when – actor(s) may be know or unknown Transfer events from – analogue to analogue – analogue to digital – digital to digital – (digital to analogue) are recorded in the same way

22 Chains of events Connected to a image there is a chain of events – like the passport of a person with stamps Can see where the image comes from Can step in at any point to re-do processing Some images can be seen as caches – but the distinction between cached and not is not central – rather: some processes can be re-done – but be aware of detail differences program versions libraries

23 The memory of events Storing events: writing history This is obviously important for old stuff – museums try to track provenience But all new will become old History is made by our scripts – we can record it or let it go


Download ppt "Softwaretechnologie für Fortgeschrittene Teil Eide Stunde IV: Media server operations and abstraction (with contributions from Christian-Emil Ore, Jon."

Similar presentations


Ads by Google