Data Management in Cloud Workflow Systems Dong Yuan Faculty of Information and Communication Technology Swinburne University of Technology.



Outline
>Cloud Computing & Cloud Workflow Systems
  –Introduction to cloud workflow systems. A brief overview of grid workflow systems.
>Data Management in Cloud Workflow Systems
  –New features and research issues
>Cloud Computing Environment and SwinDeW-C
  –Our simulation environment and cloud workflow system

>Cloud Computing & Cloud Workflow Systems

Cloud Computing
>Some new features of cloud computing
  –Large data centres with cheap hardware
  –Virtualisation
  –Internet based and SOA: SaaS, PaaS, IaaS
  –Market driven and cost model
>Research on cloud computing has emerged in many areas
  –Data mining, databases, parallel computing & scientific applications, content delivery

Cloud Workflow Systems
>Grid workflow systems
  –Kepler, Pegasus, Taverna, MOTEUR, Triana, ASKALON
  –Gridbus, GridFlow
>Build-time: focus on data modelling
  –Kepler: actor-oriented data modelling; Taverna: Scufl; ASKALON: AGWL
>Runtime: adopt Data Grid systems
  –Grid DataFarm, GDMP, GridDB, SRB, RLS (P-RLS), GSB, DaltOn

Cloud Workflow Systems
>Architecture
  –Based on Internet
  –Platform as a Service
  –More distributed

>Data Management in Cloud Workflow Systems

Data Management in Cloud Workflow Systems
>New features and challenges
  –Independent of users and automatic
  –Cost driven: computation cost, storage cost, data transfer cost
  –Data dependency: task-data, data-data, derivation
>Some research issues
  –Data partition, placement, replication, synchronisation, provenance, catalogue, meta-data, consistency, reduction, storage, movement, etc.
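
The three cost components named on this slide can be combined into a single figure. The sketch below is only an illustration under assumed, made-up unit prices; the `Prices` class, the `workflow_cost` function and all parameter names are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass
class Prices:
    """Hypothetical unit prices; a real provider publishes its own rates."""
    cpu_per_hour: float          # price of one instance-hour of computation
    storage_per_gb_month: float  # price of keeping 1 GB for one month
    transfer_per_gb: float       # price of moving 1 GB between data centres


def workflow_cost(prices, cpu_hours, stored_gb, months, transferred_gb):
    """Total cost = computation cost + storage cost + data transfer cost."""
    computation = prices.cpu_per_hour * cpu_hours
    storage = prices.storage_per_gb_month * stored_gb * months
    transfer = prices.transfer_per_gb * transferred_gb
    return computation + storage + transfer


if __name__ == "__main__":
    prices = Prices(cpu_per_hour=1.0, storage_per_gb_month=0.5, transfer_per_gb=0.25)
    print(workflow_cost(prices, cpu_hours=10, stored_gb=4, months=2, transferred_gb=8))
```

Keeping the three terms separate makes the trade-offs visible: storing a dataset raises the storage term but can lower the computation and transfer terms.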

Data Placement in Cloud Workflow Systems
>Data placement: decide where to store the application data among the distributed data centres
>Aims:
  –Reduce data movement
  –Reduce task waiting time
>Strategy:
  –Data dependency: dataset-dataset
  –Build-time: existing data; runtime: generated data (including intermediate data)
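
One way to picture dependency-driven placement is a small greedy heuristic. This is a sketch under stated assumptions, not the strategy used in the presentation: it assumes the dependency between two datasets is the number of tasks that use both, and that each data centre holds a fixed number of datasets. The names `dependency`, `place` and `capacity` are hypothetical.

```python
def dependency(tasks_a, tasks_b):
    """Assumed measure: number of tasks that use both datasets."""
    return len(tasks_a & tasks_b)


def place(datasets, centres, capacity):
    """Greedily assign each dataset to the centre holding its most dependent data.

    datasets: dict mapping dataset name -> set of task ids that use it
    centres:  list of data centre names
    capacity: maximum number of datasets per centre (assumed uniform)
    """
    placement = {c: [] for c in centres}
    for name, tasks in datasets.items():
        candidates = [c for c in centres if len(placement[c]) < capacity]
        if not candidates:
            raise ValueError("no data centre has free capacity")

        def score(centre):
            dep = sum(dependency(tasks, datasets[o]) for o in placement[centre])
            # Prefer high dependency; break ties in favour of the emptier centre.
            return (dep, -len(placement[centre]))

        placement[max(candidates, key=score)].append(name)
    return placement


if __name__ == "__main__":
    data = {"d1": {"t1", "t2"}, "d2": {"t1", "t2"}, "d3": {"t3"}, "d4": {"t3"}}
    print(place(data, ["dc1", "dc2"], capacity=3))
```

Datasets used by the same tasks end up in the same data centre, so those tasks can run there without moving data, which serves both aims on the slide.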

Data Replication in Cloud Workflow Systems
>Data replication: for one dataset, store several copies in different places (data centres)
>Aims:
  –Increase data security
  –Fast data access
  –Reduce data movement
>Strategy:
  –Dynamic replication
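
The slide does not say how dynamic replication is triggered. A minimal sketch, assuming a simple threshold rule: once a data centre has read a dataset remotely a given number of times, a copy is created there. The `ReplicaManager` class and its `threshold` parameter are hypothetical illustrations.

```python
from collections import defaultdict


class ReplicaManager:
    """Toy dynamic replication: replicate after `threshold` remote reads."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.replicas = defaultdict(set)      # dataset -> centres holding a copy
        self.remote_reads = defaultdict(int)  # (dataset, centre) -> remote read count

    def store(self, dataset, centre):
        """Record that `centre` holds a copy of `dataset`."""
        self.replicas[dataset].add(centre)

    def read(self, dataset, centre):
        """Return 'local' or 'remote'; may create a replica as a side effect."""
        holders = self.replicas.get(dataset)
        if not holders:
            raise KeyError(dataset)
        if centre in holders:
            return "local"
        key = (dataset, centre)
        self.remote_reads[key] += 1
        if self.remote_reads[key] >= self.threshold:
            holders.add(centre)  # later reads from this centre are local
        return "remote"


if __name__ == "__main__":
    manager = ReplicaManager(threshold=2)
    manager.store("d1", "dc1")
    print([manager.read("d1", "dc2") for _ in range(3)])
```

A higher threshold saves storage at the price of more data movement; a lower one does the opposite.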

Intermediate Data Storage in Cloud Workflow Systems
>Intermediate data storage is especially important in scientific workflows
>Aim:
  –Reduce system cost
>Strategy:
  –Intermediate data can be regenerated with data provenance information
  –Selectively store some key intermediate datasets
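
The store-or-regenerate trade-off can be sketched as a per-dataset cost comparison. The sketch assumes a linear chain of datasets in which each one is generated from its direct predecessor, and it stores a dataset when keeping it costs no more per day than regenerating it on demand. This is a simplified greedy rule, not the algorithm from the presentation, and the `Dataset` fields and `choose_stored` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    gen_cost: float         # cost to compute it from its direct predecessor
    storage_per_day: float  # cost of keeping it stored for one day
    uses_per_day: float     # expected number of accesses per day


def choose_stored(chain):
    """Walk a linear provenance chain and pick the datasets worth storing.

    Regenerating a deleted dataset also requires regenerating every deleted
    dataset directly upstream of it, back to the nearest stored one.
    """
    stored = []
    upstream = 0.0  # regeneration cost of the run of deleted predecessors
    for d in chain:
        regen = d.gen_cost + upstream
        if d.storage_per_day <= regen * d.uses_per_day:
            stored.append(d.name)
            upstream = 0.0  # a stored dataset cuts the regeneration chain
        else:
            upstream = regen
    return stored


if __name__ == "__main__":
    chain = [
        Dataset("d1", gen_cost=10, storage_per_day=1, uses_per_day=0.5),
        Dataset("d2", gen_cost=2, storage_per_day=4, uses_per_day=0.5),
        Dataset("d3", gen_cost=4, storage_per_day=2.5, uses_per_day=0.5),
    ]
    print(choose_stored(chain))
```

In the example, `d3` is stored only because its predecessor `d2` was deleted: regenerating `d3` would also mean regenerating `d2`, which is what the provenance information records.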

>Cloud computing environment and SwinDeW-C

Simulation Cloud

Web Portal

Related key system components of SwinDeW-C

End
>Questions?