Download presentation
Presentation is loading. Please wait.
Published byMilo Chambers Modified over 9 years ago
1
Softwaretechnologie für Fortgeschrittene Teil Eide Stunde IV: Media server operations and abstraction (with contributions from Christian-Emil Ore, Jon Holmen, and other colleagues at the Unit for Digital Documentation, University of Oslo) Köln 17. Dezember 2015
2
File processing Parse processing XML Set up production line – any conversion path with possible conversions can be added Matrix of in and out formats and default scripts – can be overridden. Scripts run in background – queue handling – load balancing
3
Remember work flow Recording a (camera) media_unit a (original image raw format) media_unit b (large tiff) media_unit c (large jpeg) media_unit d (small jpeg) Recording b (processing software) Recording c (image processing server) Recording d (image processing server) One information object for each original recording (Work) (media_group)
4
File processing xml (example) <IMAGE type="digital_kopi" method="convert" format="jpg" subpath="jpeg" rec_process="3" software="Imagemagick 6.2.9" settings="fileFormat JFIF contentFormat 24BITRGB”> <IMAGE type="digital_kopi" method="convert" format="jpg" width="640" height="480" subpath="small" rec_process="3" software="Imagemagick 6.2.9" settings="maxScale 640x480 fileFormat JFIF contentFormat 24BITRGB" default="1">
5
File processing script Process status: 0: not converted 1: under conversion 2: converted 9: conversion error Outer structure producing conversion jobs: while (!kill_signal) fork out new process wait 10 seconds
6
File processing script Each forked our job: dba:connect int i= dba:get_current_conversion_count (status 1) if (i > 10) die string proc_str= dba:get_proc_string with status 0 dba:set_proc_string_status(proc_str) to 1 boolean OK= true while (!end(proc_str)) interpret(proc_str) string comm= pick_command(proc_str) int ret_val= system(comm) if (ret_val == 0) dba:record_new_file_name else OK= false if (OK = true) dba:set_proc_string_status(proc_str) to 2 else dba:set_proc_string_status(proc_str) to 9
7
Archiving script This script is run every night (cron job on unix) Files for archiving are files where – archive date is NULL – error_field is NULL
8
Remember long term preservation tapestation diskraid Digital Original Other copies Digital Original Tape duplicates
9
Archiving script data_list= dba:get_files_for_archiving foreach (data_line from data_list) if (data_line:file exist) dba:write(“Archive command with timestamp”) int ret_val= archive(file) if (ret_val = “Error”) dba:write(“Archive error”) else ret_val= query_archived_file int file_size= ret_val:filesize if (ret_val = “Archived > 1”) dba:write(“Archive warning: multi”) call delete_test(file_size) else if (ret_val = “Archived = 1”) dba:write(“Archiving OK”) call delete_test(file_size) else dba:write(“Archived but not found error”) else dba:write(“File not found error”)
10
Archiving scripts: delete files delete_test: file_size_archiving= arg file_size_limit= dba:ask_limit_for_schema if (file_size_archiving > file_size_limit) dba:write(delete: yes) delete: file_list= dba:files_where(delete=“yes”, deleted=“NULL”) foreach (file in file_list) int ret_val= system(delete file) if (ret_val = 0) dba:write(“Deleted date_time”) else dba:write(“Delete error”)
11
Monitoring Zombie jobs Server load Memory consumption – leakage – fine on a PC, but with uptime > 100 days… Database – error messages – instability – abnormal behaviour Must be monitored by humans (using tools) – email messages with control data
12
Extending metadata Some metadata must be kept – e.g., process records – format details Some metadata can be changed – classification – motive description But keep old versions – institution history – legal liability historical institutional/governmental racism land rights
13
Example system 1: Fedora Based on self promotion: http://www.fedora-commons.org An open source repository system – for the management and dissemination of digital content – robust – modular Especially suited for – digital libraries and archives – access as well as preservation Used to provide specialized access to – very large and complex digital collections – historic and cultural materials – scientific data
14
Example system 1: Fedora User base that includes – academic and cultural heritage organizations – universities – research institutions – university libraries – national libraries – government agencies
15
Example system 1: Fedora Long-term flexible access to digital resources – can be used to support all types of digital content – digital collections – e-research – digital libraries and archives – digital preservation – institutional repositories – open access publishing – document management – digital asset management
16
Example system 2: Omeka Based on self promotion: http://omeka.orghttp://omeka.org A free, open-source, digital publishing suite for – scholars – librarians – archivists – museum professionals – cultural enthusiasts to publish – archives – collections – exhibits – teaching materials Provide interaction for public audiences
17
Example system 2: Omeka Extensible, scalable, and flexible – can handle large collections ( >1 million items) – element sets for institution-specific metadata may be added – Zend framework for PHP allows for customization – accepts and stores all types of files images video audio multi-page documents and PDFs Power Point presentations individual items may contain multiple files etc. – extensible with dozens of available plugins
18
Example system 2: Omeka Standards-based – metadata (tied to Dublin Core) – web design Interoperable – Dublin Core – Fedora Connect Data sharing – feeds – RDF – migration Re-purpose content – enter or import item metadata once – use items and metadata in multiple instances across website – including exhibits
19
Towards abstract modelling How can this method be generalised? – Some preliminary notes Learning strategies The role of theory theory/modelling implementation
20
What is an image? An image can be found – in a data file – on a 35mm film – at a paper positive – at a glass plate –…–… We do not care, they are all images – modelled as media_units – connected to media_groups If something is – another media_unit to an existing media_group, or – a derived media_group is a scholarly (content based) choice
21
The transformation event Each transformation event happened in time – may or may not know when – actor(s) may be know or unknown Transfer events from – analogue to analogue – analogue to digital – digital to digital – (digital to analogue) are recorded in the same way
22
Chains of events Connected to a image there is a chain of events – like the passport of a person with stamps Can see where the image comes from Can step in at any point to re-do processing Some images can be seen as caches – but the distinction between cached and not is not central – rather: some processes can be re-done – but be aware of detail differences program versions libraries
23
The memory of events Storing events: writing history This is obviously important for old stuff – museums try to track provenience But all new will become old History is made by our scripts – we can record it or let it go
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.