SCAPE. Rainer Schmidt, SCAPE Training Event, September 16th-17th, 2013, The British Library. Building Scalable Environments: Technologies and SCAPE Platform.

Presentation transcript:

SCAlable Preservation Environments (SCAPE)

Slide 2: Motivation
The amount of data in data centers and memory institutions is increasing. It cannot be handled using traditional environments such as databases or server facilities. Institutions require the ability to process large and complex data sets in preservation scenarios. Examples are data migration, information extraction, and quality assurance. The goal is to take advantage of data-intensive computing technologies for digital preservation.

Slide 3: What we will show you
Example scenarios from the SCAPE Testbed and how they are formalized using workflow technology. An introduction and hands-on exercise using the involved preservation tools. An overview of the SCAPE Platform, its underlying technologies, preservation services, and how to set it up. Creating scalable workflows and deploying them on the platform. Executing SCAPE workflows using a virtual machine environment as well as on a demonstration cluster.

Slide 4: Workflows in this Context
Workflows are formalized (and repeatable) processes or experiments consisting of one or more activities interpreted by a workflow engine. They are usually modeled as directed acyclic graphs (DAGs) based on control-flow and/or data-flow logic. The workflow engine functions as a coordinator/scheduler that triggers the execution of the involved activities. This may be performed on a desktop, by a server-side component, or both. Example workflow engines are Taverna Workbench, Taverna Server, and Apache Oozie. They are used for experimentation and research, SOA support, and Hadoop integration.
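
To make the idea of a workflow engine acting as a coordinator over a DAG concrete, here is a minimal sketch in Python. It is not how Taverna or Apache Oozie are implemented; the function `run_workflow` and the three activity names are hypothetical and chosen only for illustration. The engine orders activities so that each one runs after the activities it depends on, and passes upstream results along as data flow.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(activities, dependencies):
    """Run every activity after all of the activities it depends on.

    activities:   dict mapping an activity name to a callable that
                  receives a dict of its upstream results
    dependencies: dict mapping an activity name to the set of names
                  it depends on (the edges of the DAG)
    """
    results = {}
    # static_order() yields dependencies first and raises
    # graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(dependencies).static_order():
        inputs = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = activities[name](inputs)
    return results


# Hypothetical three-step preservation workflow (illustrative only).
activities = {
    "identify": lambda inputs: "image/tiff",
    "migrate": lambda inputs: f"converted from {inputs['identify']}",
    "quality_check": lambda inputs: inputs["migrate"].startswith("converted"),
}
dependencies = {
    "identify": set(),
    "migrate": {"identify"},
    "quality_check": {"migrate"},
}

results = run_workflow(activities, dependencies)
```

A real engine would additionally schedule independent activities in parallel and dispatch them to remote services or a cluster; this sketch runs everything sequentially in one process.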

Slide 5: Challenges in SCAPE
Providing means that aid workflow developers in parallelizing different scenarios; this depends a lot on the nature of the data and the workflow. Handling the interaction between external tools and MapReduce programs. Handling the interaction of the execution environment with data sources and sinks, in particular with repositories. Interfacing with preservation planning and watch tools, including semantic search and reporting. Maintaining a central infrastructure and providing guidance for deploying local instances in different institutional settings.
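
One of the challenges above, the interaction between external tools and MapReduce programs, can be illustrated with a small single-process sketch in Python. It is an assumption-laden illustration, not the SCAPE implementation and not Hadoop code: `TOOL` is a hypothetical stand-in for a command-line preservation tool (here a one-line Python script that reports a file extension), and `map_record` and `reduce_counts` are made-up names. The map step launches the external tool once per input record and emits key/value pairs; the reduce step aggregates them.

```python
import subprocess
import sys
from collections import defaultdict

# Hypothetical stand-in for an external command-line tool: it prints the
# upper-cased file extension of its argument, or UNKNOWN if there is none.
TOOL = [
    sys.executable,
    "-c",
    "import os, sys; "
    "print(os.path.splitext(sys.argv[1])[1].lstrip('.').upper() or 'UNKNOWN')",
]


def map_record(path):
    """Map step: run the external tool on one record, emit (key, 1)."""
    completed = subprocess.run(
        TOOL + [path], capture_output=True, text=True, check=True
    )
    yield completed.stdout.strip(), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


paths = ["a.tif", "b.tif", "c.jp2", "README"]
pairs = [pair for path in paths for pair in map_record(path)]
counts = reduce_counts(pairs)
```

Starting one process per record is simple but costly at scale, which hints at why this interaction is listed as a challenge: on a cluster, the tool's startup overhead, its file-based input and output, and its error handling all have to be reconciled with the MapReduce execution model.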