PLANETS, OPF & SCAPE
A summary of the tools from these preservation projects, and where their development is heading
PLANETS A big project to build digital preservation tools...
OPF’s Challenge
The Open Planets Foundation was set up to sustain the PLANETS outputs into the future.
–But the tools are
  Numerous, often complex, & of mixed quality/maturity
  Dependent on complex technology stacks (JEE)
–So, how do we make the code sustainable?
  Selection, modularisation, simplification
  Aim for a flexible suite of modular tools, rather than a monolithic system
SCAPE
Many PLANETS partners
–Including OPF
Many new partners too
Driven by data
–Web archiving, science data, large-scale
Cluster computing for scale
–Based on the Hadoop platform
PLATO
The PLANETS Testbed
The PLANETS Testbed: Too Many Good Ideas In One Place
Designing experiments
–Web GUI for complex workflows
Running experiments
–All services hosted centrally, plus test corpora
Analysing the results
–Per-experiment automated & manual analysis
–Multi-experiment aggregation & data mining
Sharing all of the above
Re-imagining The PLANETS Testbed: A Modular Approach
Use separate tools in each role
–Experiment Design
–Execution
–Analysis
Publish results from each
–Loosely coupled instead of all-in-one
  i.e. sharing is built into the design
Experiment Design: SCAPE Workflows In Taverna As part of SCAPE
Experiment Design Support: SCAPE Service Registry
Experiment Design Support: OPF Shared Test Corpora
Simple collections accessed over HTTP
–No special browser software required
Publicly hosted by HATII
–May also be mirrored by OPF members
Stabilise corpora from PLANETS
–Absorb corpora from SCAPE & elsewhere
Look for Open Source CMS/Annotation tools
–Layer on top of HTTP collections
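As an illustration of why plain HTTP collections need no special software, here is a minimal sketch that extracts file URLs from a web-server directory index page. It assumes the corpora are exposed as ordinary HTML directory listings; the base URL and the sample listing are hypothetical, not the real HATII hosting layout.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def list_corpus_files(base_url, index_html):
    """Return absolute URLs of the files named in a directory index page.

    Links that start with '?' (column-sort links) or end with '/'
    (parent and sub-directories) are skipped.
    """
    parser = LinkCollector()
    parser.feed(index_html)
    return [
        urljoin(base_url, href)
        for href in parser.hrefs
        if not href.startswith("?") and not href.endswith("/")
    ]


# Hypothetical listing, in the style of a typical auto-generated index page.
SAMPLE_INDEX = """
<html><body>
<a href="?C=N;O=D">Name</a>
<a href="../">Parent Directory</a>
<a href="scan-001.tif">scan-001.tif</a>
<a href="subset/">subset/</a>
<a href="report%20v2.pdf">report v2.pdf</a>
</body></html>
"""

files = list_corpus_files("http://example.org/corpora/images/", SAMPLE_INDEX)
```

In practice the index page would be fetched with any HTTP client; a CMS or annotation layer could then sit on top of the same URLs.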
Experiment Design Support: Sharing & Publishing Via myExperiment
Experiment Execution Support: SCAPE’s Lightweight Tool Wrapping
PIT: Preservation-action Invocation Tool
–Uses XML ‘tool specification’ documents that describe preservation actions
  Command-line templates, Java classes, PLANETS/SCAPE web services, etc.
–Built to be shared
  Can be published via, e.g., myExperiment
  Should lead to more reproducible results
–Re-using PLANETS interoperability code
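To make the idea of a command-line template inside an XML tool specification concrete, here is a minimal sketch. The element names, attribute names and `${...}` placeholder syntax are illustrative assumptions and not the actual PIT schema; the `convert` command is only an example of a wrapped tool.

```python
import shlex
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical tool specification (NOT the real PIT schema).
SPEC = """
<toolspec>
  <tool id="convert-to-png" name="Example image converter">
    <command>convert ${input} ${output}</command>
  </tool>
</toolspec>
"""


def load_templates(spec_xml):
    """Map each tool id to its command-line template string."""
    root = ET.fromstring(spec_xml)
    return {
        tool.get("id"): tool.findtext("command").strip()
        for tool in root.iter("tool")
    }


def build_command(template, **params):
    """Fill a template and return an argument list.

    The template is split into tokens first and each token is substituted
    separately, so a parameter value containing spaces stays one argument.
    """
    return [Template(token).substitute(params) for token in shlex.split(template)]


templates = load_templates(SPEC)
argv = build_command(
    templates["convert-to-png"], input="my scan.tif", output="out.png"
)
```

The resulting list could be passed to `subprocess.run(argv)`; because the specification is just a document, the same file could be shared and reused by other people running the same action.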
Experiment Execution: Multi-platform Tool & Workflow Invocation
Shared tool specifications make multi-platform execution easier
–From the command line
–From within Taverna
–From the SCAPE cluster platform
–From a simplified web interface
Run local-first, remote/service as needed
Collect results in a standard form, using Testbed code
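The "local-first, remote as needed" policy can be sketched as a small dispatch function. This is an assumption about how such a choice might be made, not a description of the SCAPE implementation; the function name and the service URL are hypothetical.

```python
import shutil


def choose_backend(tool_name, service_url=None, which=shutil.which):
    """Prefer a locally installed tool; fall back to a remote service.

    `which` is injectable so the lookup can be replaced in tests.
    Returns a ("local", path) or ("remote", url) pair.
    """
    local_path = which(tool_name)
    if local_path is not None:
        return ("local", local_path)
    if service_url is not None:
        return ("remote", service_url)
    raise LookupError(f"no local install or remote service for {tool_name!r}")
```

For example, `choose_backend("convert", "http://example.org/services/convert")` would return the local executable path when the tool is on the PATH, and the service URL otherwise.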
Experiment Execution: Publishing Experimental Results Via REF
OPF Results Evaluation Framework: REF
–Hard-coded experiments of common interest
  Can run the experiment automatically
–Publishes results as linked data
  Built by Dave Tarrant, based on P2 format registry
–Will come up again in the Identification session
–SCAPE aims to publish much more data
Analysing Results: Linked Data & Future Plans
REF allows data to be inspected
–Concentrating on collecting data at present
Will expose a SPARQL endpoint for data queries
–Analysis and visualisation can be built upon that
Please add analysis issues for your datasets and preservation processes to the wiki!
–e.g. what graphs and statistics would be useful?
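Once a SPARQL endpoint exists, analysis tools could query it over plain HTTP. The sketch below only builds the request URL, with the query passed in the `query` parameter of a GET request as the SPARQL protocol defines. The endpoint address, the `ex:` vocabulary and the `producedBy` property are hypothetical; the real REF data model may look quite different.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and vocabulary, for illustration only.
ENDPOINT = "http://example.org/ref/sparql"

QUERY = """
PREFIX ex: <http://example.org/ref/vocab#>
SELECT ?tool (COUNT(?result) AS ?n)
WHERE { ?result ex:producedBy ?tool . }
GROUP BY ?tool
ORDER BY DESC(?n)
"""


def sparql_get_url(endpoint, query):
    """Build a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = sparql_get_url(ENDPOINT, QUERY)
```

A query like this one (results counted per tool) is the kind of aggregate that graphs and statistics could be built on.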
Summary
PLATO
–SCAPE will add Preservation Watch & more
The PLANETS Testbed
–Re-imagined as a gateway to a complementary suite of preservation tools and data services
–SCAPE leveraging work from Taverna, IMPACT
Development driven by user needs
–SCAPE Scenarios, AQuA/Hackathon Issues