PLANETS, OPF & SCAPE
A summary of the tools from these preservation projects, and where their development is heading.
www.openplanetsfoundation.org

PLANETS
A big project to build digital preservation tools...

OPF’s Challenge
The Open Planets Foundation was set up to sustain the PLANETS outputs into the future.
– But the tools are
  Numerous, often complex, & of mixed quality/maturity
  Require complex technology stacks (JEE)
– So, how do we make the code sustainable?
  Selection, modularisation, simplification
  Aim for a flexible suite of modular tools, rather than a monolithic system

SCAPE
Many PLANETS partners
– Including OPF
Many new partners too
Driven by data
– Web archiving, science data, large-scale
Cluster computing for scale
– Based on the Hadoop platform

PLATO

The PLANETS Testbed

The PLANETS Testbed: Too Many Good Ideas In One Place
Designing experiments
– Web GUI for complex workflows
Running experiments
– All services hosted centrally, plus test corpora
Analysing the results
– Per-experiment automated & manual analysis
– Multi-experiment aggregation & data mining
Sharing all of the above

Re-imagining The PLANETS Testbed: A Modular Approach
Use separate tools in each role
– Experiment Design
– Execution
– Analysis
Publish results from each
– Loosely coupled instead of all-in-one, i.e. sharing is built into the design

Experiment Design: SCAPE Workflows In Taverna
As part of SCAPE

Experiment Design Support: SCAPE Service Registry

Experiment Design Support: OPF Shared Test Corpora
Simple collections accessed over HTTP
– No special browser software required
Publicly hosted by HATII
– May also be mirrored by OPF members
Stabilise corpora from PLANETS
– Absorb corpora from SCAPE & elsewhere
Look for Open Source CMS/Annotation tools
– Layer on top of HTTP collections

Experiment Design Support: Sharing & Publishing Via myExperiment

Experiment Execution Support: SCAPE’s Lightweight Tool Wrapping
PIT: Preservation-action Invocation Tool
– Uses XML ‘tool specification’ documents that describe preservation actions
  Command-line templates, Java classes, PLANETS/SCAPE web services, etc.
– Built to be shared
  Can be published via, e.g., myExperiment
  Should lead to more reproducible results
– Re-using PLANETS interoperability code
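
To make the idea of a command-line template inside an XML tool specification concrete, here is a minimal Python sketch. The element names (`toolspec`, `action`, `command`), the `${input}`/`${output}` placeholders and the `convert` example are assumptions made for illustration; they do not reproduce the actual PIT schema.

```python
import shlex
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical tool specification: element and attribute names are
# illustrative only and do not reproduce the actual PIT schema.
TOOL_SPEC = """
<toolspec id="example-convert">
  <tool name="convert"/>
  <action type="migrate">
    <command>convert ${input} ${output}</command>
  </action>
</toolspec>
"""


def build_command(spec_xml: str, **params: str) -> list[str]:
    """Fill the command-line template from a tool spec and split it into argv."""
    root = ET.fromstring(spec_xml)
    template = root.findtext("./action/command")
    if template is None:
        raise ValueError("tool spec has no <command> template")
    # Quote each parameter so paths with spaces survive shlex.split().
    quoted = {key: shlex.quote(value) for key, value in params.items()}
    return shlex.split(Template(template).substitute(quoted))


argv = build_command(TOOL_SPEC, input="scan 01.tiff", output="scan01.png")
print(argv)
```

Because the specification is plain data, the same document could in principle be shared and interpreted by different front ends, which is the property the slide emphasises.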

Experiment Execution: Multi-platform Tool & Workflow Invocation
Shared tool specifications make multi-platform execution easier
– From the command line
– From within Taverna
– From the SCAPE cluster platform
– From a simplified web interface
Run local-first, remote/service as needed
Collect results in a standard form, using Testbed code
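
The "local-first, remote as needed" policy can be sketched in a few lines of Python. This is only one possible reading of the slide: the function name, the tool name and the service URL below are hypothetical, and the real invocation code may decide differently.

```python
import shutil


def choose_invocation(tool: str, service_url: str) -> tuple[str, str]:
    """Return ("local", path) if the tool is on PATH, else ("remote", url)."""
    local_path = shutil.which(tool)
    if local_path is not None:
        return ("local", local_path)
    return ("remote", service_url)


# Hypothetical tool name and service URL, for illustration only.
mode, target = choose_invocation(
    "no-such-preservation-tool", "http://example.org/services/migrate"
)
print(mode, target)
```

Here the named tool is not installed, so the sketch falls back to the remote service.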

Experiment Execution: Publishing Experimental Results Via REF
OPF Results Evaluation Framework: REF
– Hard-coded experiments of common interest
  Can run the experiment automatically
– Publishes results as linked data
  Built by Dave Tarrant, based on P2 format registry
– Will come up again in the Identification session
– SCAPE aims to publish much more data
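
As a rough illustration of what "publishing results as linked data" can mean in practice, the Python sketch below serialises one experiment result as N-Triples. The result URI, the vocabulary namespace and the property names are all hypothetical; REF's actual data model is not described in these slides.

```python
def to_ntriples(subject: str, properties: dict[str, str], vocab: str) -> str:
    """Serialise string-valued properties of one resource as N-Triples lines."""
    lines = []
    for name, value in sorted(properties.items()):
        # Escape backslashes first, then double quotes and newlines.
        escaped = (
            value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        lines.append(f'<{subject}> <{vocab}{name}> "{escaped}" .')
    return "\n".join(lines)


# Hypothetical experiment URI, vocabulary and values, for illustration only.
triples = to_ntriples(
    "http://example.org/ref/result/1",
    {"tool": "example-identifier", "outcome": "identified"},
    "http://example.org/ref/terms#",
)
print(triples)
```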

Analysing Results: Linked Data & Future Plans
REF allows data to be inspected
– Concentrating on collecting data at present
Will expose SPARQL endpoint for data queries
– Analysis, visualisation can be built upon that
Please add analysis issues for your datasets and preservation processes to the wiki!
– e.g. what graphs and statistics would be useful?
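
Once a SPARQL endpoint is available, analysis tools could query it over HTTP. The Python sketch below only builds the request URL, passing the query text in the `query` parameter as the SPARQL Protocol specifies for GET requests. The endpoint address and the `ex:` vocabulary are assumptions, since the slides say the endpoint is still planned.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and vocabulary; REF's real URLs and terms may differ.
ENDPOINT = "http://example.org/ref/sparql"

QUERY = """
PREFIX ex: <http://example.org/ref/terms#>
SELECT ?format (COUNT(?result) AS ?n)
WHERE { ?result ex:identifiedFormat ?format . }
GROUP BY ?format
ORDER BY DESC(?n)
"""


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET request URL (query passed as ?query=...)."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)
```

A query of this shape, counting results per identified format, is one example of the statistics the slide asks users to suggest.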

Summary
PLATO
– SCAPE will add Preservation Watch & more
The PLANETS Testbed
– Re-imagined as a gateway to a complementary suite of preservation tools and data services
– SCAPE leveraging work from Taverna, IMPACT
Development driven by user needs
– SCAPE Scenarios, AQuA/Hackathon Issues