Data Management in Cloud Workflow Systems Dong Yuan Faculty of Information and Communication Technology Swinburne University of Technology.


1 Data Management in Cloud Workflow Systems Dong Yuan Faculty of Information and Communication Technology Swinburne University of Technology

2 Outline >Cloud Computing & Cloud Workflow Systems –Introduction to cloud workflow systems. A brief overview of grid workflow systems. >Data Management in Cloud Workflow Systems –New features and research issues >Cloud Computing Environment and SwinDeW-C –Our simulation environment and cloud workflow system

3 >Cloud Computing & Cloud Workflow Systems

4 Cloud Computing >Some new features of cloud computing –Large data centres with cheap hardware –Virtualisation –Internet based and SOA: SaaS, PaaS, IaaS –Market driven, with a cost model >Research on cloud computing has emerged in many areas –Data mining, databases, parallel computing and scientific applications, content delivery

5 Cloud Workflow Systems >Grid workflow systems –Kepler, Pegasus, Taverna, MOTEUR, Triana, ASKALON –Gridbus, GridFlow >Build-time: focus on data modelling –Kepler: actor-oriented data modelling. Taverna: Scufl. ASKALON: AGWL >Runtime: adopt a Data Grid system –Grid DataFarm, GDMP, GridDB, SRB, RLS (P-RLS), GSB, DaltOn

6 Cloud Workflow Systems >Architecture –Based on Internet –Platform as a Service –More distributed

7 >Data Management in Cloud Workflow Systems

8 Data Management in Cloud Workflow Systems >New features and challenges –Independent of users and automatic –Cost driven: computation cost, storage cost, data transfer cost –Data dependency: task – data, data – data, derivation >Some research issues –Data partition, placement, replication, synchronisation, provenance, catalogue, meta-data, consistency, reduction, storage, movement, etc.
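
The three cost components named on this slide can be combined into a single figure. The following is a minimal sketch only: the `Prices` values and the `workflow_cost` function are hypothetical illustrations, not figures or an API from the presentation or from any real provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """Hypothetical unit prices, chosen only for illustration."""
    cpu_hour: float = 0.10     # $ per CPU hour
    gb_month: float = 0.15     # $ per GB stored for one month
    gb_transfer: float = 0.10  # $ per GB moved between data centres


def workflow_cost(cpu_hours, gb_stored, months, gb_moved, prices=Prices()):
    """Split the cost of a workflow into the three components on the slide."""
    computation = cpu_hours * prices.cpu_hour
    storage = gb_stored * months * prices.gb_month
    transfer = gb_moved * prices.gb_transfer
    return {
        "computation": computation,
        "storage": storage,
        "transfer": transfer,
        "total": computation + storage + transfer,
    }


# Example: 100 CPU hours, 50 GB kept for 2 months, 20 GB moved.
print(workflow_cost(cpu_hours=100, gb_stored=50, months=2, gb_moved=20))
```

Keeping the components separate makes the trade-offs visible: storing less data lowers the storage term but may raise the computation or transfer terms.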

9 Data Placement in Cloud Workflow Systems >Data Placement: to decide where to store the application data in the distributed data centres >Aims: –Reduce data movement –Reduce task waiting time >Strategy: –Data dependency: dataset – dataset –Build-time: existing data, runtime: generated data (also intermediate data)
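
One way to picture dependency-based placement is a greedy heuristic: treat two datasets as dependent when the same task uses both, and put each dataset in the data centre that already holds the datasets it depends on most. This is a simplified sketch under stated assumptions, not the strategy used in SwinDeW-C; the function names, the uniform `capacity`, and the largest-first ordering are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def dependency(tasks):
    """For each pair of datasets, count the tasks that use both."""
    dep = defaultdict(int)
    for inputs in tasks.values():
        for a, b in combinations(sorted(inputs), 2):
            dep[(a, b)] += 1
    return dict(dep)


def place(tasks, sizes, capacity, centres):
    """Greedily assign each dataset to the centre holding its closest partners."""
    dep = dependency(tasks)
    placement = {}
    free = {c: capacity for c in centres}
    # Assumption: placing large datasets first leaves room to co-locate small ones.
    for d in sorted(sizes, key=lambda name: (-sizes[name], name)):
        candidates = [c for c in centres if free[c] >= sizes[d]]
        if not candidates:
            raise ValueError(f"no data centre has room for {d}")

        def score(centre):
            # Total dependency between d and the datasets already in this centre.
            return sum(
                dep.get(tuple(sorted((d, other))), 0)
                for other, where in placement.items()
                if where == centre
            )

        best = max(candidates, key=lambda c: (score(c), free[c]))
        placement[d] = best
        free[best] -= sizes[d]
    return placement


def cross_centre_tasks(tasks, placement):
    """Tasks whose inputs span more than one centre, so data must be moved."""
    return [t for t, ins in tasks.items() if len({placement[d] for d in ins}) > 1]


tasks = {"t1": {"d1", "d2"}, "t2": {"d1", "d2"}, "t3": {"d3", "d4"}, "t4": {"d2", "d3"}}
sizes = {"d1": 4, "d2": 3, "d3": 3, "d4": 2}
placement = place(tasks, sizes, capacity=7, centres=["dc1", "dc2"])
print(placement)
print(cross_centre_tasks(tasks, placement))
```

In this example d1 and d2 share two tasks and end up together, as do d3 and d4, so only task t4 needs data from both centres.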

10 Data Replication in Cloud Workflow Systems >Data replication: for one dataset, store several copies in different places (data centres) >Aims: –Increase data security –Fast data access –Reduce data movement >Strategy: –Dynamic replication.
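
Dynamic replication can be illustrated with a simple access-count rule: when a data centre has fetched a dataset remotely often enough, a copy is created there. The slide does not specify a policy, so the sketch below is an assumption; `ReplicaManager` and its `threshold` parameter are hypothetical names.

```python
from collections import Counter


class ReplicaManager:
    """Creates a replica in a centre after `threshold` remote accesses from it."""

    def __init__(self, home, threshold=3):
        # home maps each dataset to the centre that holds its original copy.
        self.replicas = {d: {c} for d, c in home.items()}
        self.remote_hits = Counter()
        self.threshold = threshold

    def access(self, dataset, centre):
        """Record an access; return True if it was served from a local copy."""
        if centre in self.replicas[dataset]:
            return True
        self.remote_hits[(dataset, centre)] += 1
        if self.remote_hits[(dataset, centre)] >= self.threshold:
            # Enough remote traffic: keep a copy here from now on.
            self.replicas[dataset].add(centre)
        return False


mgr = ReplicaManager({"d1": "dc1"}, threshold=2)
for _ in range(3):
    print(mgr.access("d1", "dc2"))
print(sorted(mgr.replicas["d1"]))
```

A real policy would also need to delete replicas that are no longer used and to account for the storage cost of each copy, which this sketch leaves out.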

11 Intermediate Data Storage in Cloud Workflow Systems >Intermediate data storage is especially important in scientific workflows >Aim: –Reduce system cost >Strategy: –Intermediate data can be regenerated with data provenance information –Selectively store some key intermediate datasets
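
The store-or-regenerate trade-off can be sketched for a linear chain of intermediate datasets. A dataset is kept when the expected monthly cost of regenerating it exceeds the monthly cost of storing it, and regenerating a dataset also means recomputing any deleted datasets immediately before it. This is a simplified illustration, not the selection algorithm from the presentation; the `Dataset` fields, the `choose_stored` function, and all numbers are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    gen_cost: float        # $ to compute it from its direct predecessor
    storage_rate: float    # $ per month to keep it stored
    uses_per_month: float  # expected number of times it is needed per month


def choose_stored(chain):
    """Walk a linear chain in generation order and pick the datasets to store."""
    stored = []
    upstream = 0.0  # cost of recomputing the deleted datasets just before this one
    for ds in chain:
        regen = upstream + ds.gen_cost
        if regen * ds.uses_per_month > ds.storage_rate:
            stored.append(ds.name)
            upstream = 0.0    # a stored dataset is a fresh starting point
        else:
            upstream = regen  # deleted: successors inherit its regeneration cost
    return stored


chain = [
    Dataset("d1", gen_cost=10.0, storage_rate=5.0, uses_per_month=0.1),
    Dataset("d2", gen_cost=20.0, storage_rate=2.0, uses_per_month=0.2),
    Dataset("d3", gen_cost=1.0, storage_rate=0.5, uses_per_month=0.3),
]
print(choose_stored(chain))
```

Here only d2 is stored: d1 is rarely used and cheap to rebuild relative to its storage cost, but deleting it makes d2 more expensive to regenerate, which tips d2 towards storage.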

12 >Cloud computing environment and SwinDeW-C

13 Simulation Cloud

14 Web Portal

15 Related key system components of SwinDeW-C

16 End >Questions?

