1 Design and experimentations of an efficient data management service for NES architectures
B. Del-Fabbro, D. Laiymani, J.M. Nicod and L. Philippe
Laboratoire d’Informatique de l’Université de Franche-Comté
VLDB - Data Management in Grids, Seoul, Korea, 11 September 2006

2 Outline
- Introduction: the NES context
- Related work
- Motivations and issues
- The data management service
- Experimental results
- Conclusion and future work
VLDB - DMG Workshop - Seoul, Korea - 11 September 2006

3 Introduction: the NES context
Example: antenna positioning

4 Introduction: the NES context
RPC-based model
[Diagram labels: Client; Agent (broker); Servers (provide services); request Exec(EXTRACTION, img1, img2)]

5 Introduction: the NES context
[Diagram labels: Agent; request Exec(ANTENNA, img3)]
Data can be reused for further computations

6 Introduction: the NES context
[Diagram labels: Agent; request Exec(EXTRACTION, img1, img2)]
It is necessary to allow the storage of some data => data persistency

7 Introduction: the NES context
[Diagram labels: Agent; request Exec(ANTENNA, &img3)]
It is necessary to allow the storage of some data => data persistency

8 Introduction: the NES context
[Diagram labels: Agent; requests Exec(ANTENNA, &img3), Exec(RENDU, &img3), Exec(ANTENNA, &img3)]
It is necessary to take advantage of the parallelism offered by independent tasks => data replication
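
The call sequence on slides 4 to 8 can be sketched as a toy, in-process model. This is only an illustration under stated assumptions: `ToyServer`, `execute`, the `SERVICES` table and the `&dataN` handle format are hypothetical names, not part of any NES middleware, and the two services are placeholder computations. The point is that a result kept on the server can be referenced by handle in a later request, so the client ships no payload the second time.

```python
import itertools

# Placeholder services (assumption): the real EXTRACTION and ANTENNA
# computations are not described in the slides.
SERVICES = {
    "EXTRACTION": lambda a, b: a + b,
    "ANTENNA": lambda img: img[::-1],
}


class ToyServer:
    """Hypothetical stand-in for a computational server that keeps results."""

    def __init__(self):
        self._store = {}                 # handle -> data kept on the server
        self._ids = itertools.count(1)
        self.bytes_received = 0          # payload shipped by the client

    def execute(self, service, *args):
        values = []
        for arg in args:
            if isinstance(arg, str) and arg.startswith("&"):
                # A handle: the data is already stored here, nothing is sent.
                values.append(self._store[arg])
            else:
                # Raw data: the client has to transfer it.
                self.bytes_received += len(arg)
                values.append(arg)
        result = SERVICES[service](*values)
        handle = f"&data{next(self._ids)}"
        self._store[handle] = result     # data persistency
        return handle


server = ToyServer()
img3 = server.execute("EXTRACTION", b"img1-bytes", b"img2-bytes")
sent_after_first_call = server.bytes_received
result = server.execute("ANTENNA", img3)   # reuses img3 by handle
```

After the second call, `server.bytes_received` is unchanged, which is the saving that data persistency is meant to provide.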

9 Goal
To propose a data management service for NES architectures that implements data persistency and data replication as transparently as possible for end-users

10 Outline
- Introduction: the NES context
- Related work
- Motivations and issues
- The data management service
- Experimental results
- Conclusion and future work

11 Related work: non-NES architectures (Concepts)
- Data Grid context
  - Separating the physical and logical views of data
  - European Data Grid…
- Grid Computing context
  - Large number of widely distributed nodes
  - GASS, LegionFS…
- Stork
  - Pre-placement tool
  - Generally coupled with a meta-scheduler

12 Related work: non-NES architectures (Drawbacks)
- Mainly storage and system oriented
- Difficult to adapt to NES environments
- Data transfers are explicitly performed at the client level
- Lack of transparency

13 Related work: NES architectures (Concepts)
- Decreasing network traffic
  - Between clients and servers
  - Ensuring that no unnecessary data are transmitted
- NetSolve
  - Request Sequencing
  - Distributed Storage Infrastructure (DSI)
- Drawbacks
  - Data management is performed for only one computation sequence
  - Data transfers are explicit at the client level

14 Outline
- Introduction: the NES context
- Related work
- Motivations and issues
- The data management service
- Experimental results
- Conclusion and future work

15 Issues
An NES data management service must address the following issues:
- Replica consistency
  - For update operations
  - Do all the replicas have to be updated?
  - Or are all the replicas independent copies?
- Data storage
  - To store data as close as possible to servers
  - Physical limitations of storage resources
- Security
  - Secure access policy
  - Data can be shared => access rights

16 Issues
An NES data management service must address the following issues:
- Data localization
  - For data items stored inside the platform
  - To find where a data item is stored
- Data identification
  - A data item must be fully identified
  - A client does not have to know where its data are stored
  - Data handle = unique reference to a data item
- Data redistribution
  - Bandwidth is better between servers than between clients and servers
  - Move data between computational servers

17 Outline
- Introduction: the NES context
- Related work
- Motivations and issues
- The data management service
- Experimental results
- Conclusion and future work

18 The data management service: DTM (Basics)
- Data Tree Manager (DTM)
  - Distributed as a part of the DIET platform
  - Flexible enough to be implemented in other platforms
- Distributed Interactive Engineering Toolbox (DIET)
  - CORBA-based NES platform
  - Hierarchical architecture
  - Master and Local Agents
  - Performance forecasting tool (FAST)

19 The data management service: DTM (Architecture)

20 The data management service: DTM (Components)
- The Logical Data Manager
  - It manages a list of tuples (data handle, owners): the data present in its sub-tree
  - It provides the localization knowledge
- The Physical Data Manager
  - It manages a list of persistent data
  - It stores data and provides them to its server
  - It informs its parent when update operations (add, move, delete) occur
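
The division of labour between the two managers can be sketched as follows. This is a minimal model, not the DTM implementation: the class and method names (`notify_add`, `notify_delete`, `locate`) are hypothetical, and it assumes that each notification climbs the branch from the server up to the root, and that a lookup climbs the tree until some manager knows the handle.

```python
class LogicalDataManager:
    """Keeps (data handle, owners) tuples for the data present in its sub-tree."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.owners = {}     # handle -> names of children that lead to the data

    def notify_add(self, handle, child_name):
        self.owners.setdefault(handle, set()).add(child_name)
        if self.parent is not None:
            self.parent.notify_add(handle, self.name)

    def notify_delete(self, handle, child_name):
        holders = self.owners.get(handle, set())
        holders.discard(child_name)
        if not holders:
            # No copy is left in this sub-tree: forget the handle and tell the parent.
            self.owners.pop(handle, None)
            if self.parent is not None:
                self.parent.notify_delete(handle, self.name)

    def locate(self, handle):
        """Climb the tree until a manager knows the handle; None if nobody does."""
        node = self
        while node is not None:
            if handle in node.owners:
                return node.name, sorted(node.owners[handle])
            node = node.parent
        return None


class PhysicalDataManager:
    """Stores persistent data next to one server and reports updates upward."""

    def __init__(self, name, parent):
        self.name = name
        self.parent = parent
        self.data = {}

    def add(self, handle, value):
        self.data[handle] = value
        self.parent.notify_add(handle, self.name)

    def delete(self, handle):
        del self.data[handle]
        self.parent.notify_delete(handle, self.name)


# One Master Agent, two Local Agents, one server under each (hypothetical names).
ma = LogicalDataManager("MA")
la1 = LogicalDataManager("LA1", parent=ma)
la2 = LogicalDataManager("LA2", parent=ma)
server1 = PhysicalDataManager("server1", parent=la1)
server2 = PhysicalDataManager("server2", parent=la2)

server1.add("img3", b"pixels")
near = la1.locate("img3")   # answered inside the LA1 sub-tree
far = la2.locate("img3")    # LA2 does not know it, so the request climbs to MA
```

A request issued in the sub-tree that holds the data never leaves it, while a request from another branch only travels as far up as needed.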

21 The data management service: DTM (Components)
- The Data Mover
  - It provides mechanisms for data transfers between Data Managers
  - Data transfer management and data recording are separated
  - Integration of different transfer protocols: GridFTP, RFT…
- The Replica Manager
  - It sends replication orders to the Data Mover
  - It allows the choice of the best replica to be transferred (NWS tool)
  - It uses a distributed protocol => no distinction between the original data and its replicas
  - Replicas are read-only, but the architecture allows the implementation of any consistency technique
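
One plausible reading of "choice of the best replica" is to pick the holder with the highest forecast bandwidth towards the destination server. The sketch below assumes exactly that criterion; `choose_replica` and the `forecast_mbits` table are hypothetical, and the table merely stands in for whatever forecasts a tool such as NWS would supply.

```python
def choose_replica(holders, destination, forecast_mbits):
    """Return the holder from which the replica should be transferred.

    holders        -- names of the servers that currently hold a copy
    destination    -- name of the server that needs the data
    forecast_mbits -- dict mapping (source, destination) to forecast bandwidth
    """
    if destination in holders:
        return destination               # already local, no transfer needed
    candidates = [h for h in holders if (h, destination) in forecast_mbits]
    if not candidates:
        raise LookupError(f"no forecast available towards {destination}")
    return max(candidates, key=lambda h: forecast_mbits[(h, destination)])


# Illustrative forecasts in Mbit/s (assumed values).
forecast = {
    ("server1", "server3"): 16.0,
    ("server2", "server3"): 100.0,
}
best = choose_replica(["server1", "server2"], "server3", forecast)
```

Because the original and its replicas are not distinguished, any holder is a valid source, so the selection can rely on network forecasts alone.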

22 The data management service: DTM (Architecture advantages)
- Communication occurs between DIET and DTM components
  - Low bandwidth consumption for data management
- Update operations are limited to sub-trees
  - Again, low bandwidth consumption for data management
- DTM minimizes the number of data copy operations (CORBA)
  - Crucial for large data

23 The data management service: DTM (The end-user point of view)
- Only end-users have the knowledge of the application they submit
  - Only end-users have the knowledge of the data that must be managed
- The persistence mode
  - It lets end-users choose whether data must be persistent or not
- The data handle
  - End-users do not need to know where data are stored
- The API
  - Based on the profile concept
  - Problem name + data or data handle + persistence mode
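
The profile idea (problem name + data or data handle + persistence mode) can be modelled with a few data classes. This is a sketch of the concept only: `Profile`, `Argument` and the two-valued `Persistence` enum are hypothetical and do not reproduce the DIET API or its actual persistence modes.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Persistence(Enum):
    VOLATILE = "volatile"        # discarded once the computation is done
    PERSISTENT = "persistent"    # kept inside the platform for later requests


@dataclass
class Argument:
    value: Optional[bytes] = None     # the data itself ...
    handle: Optional[str] = None      # ... or a reference to data already stored
    mode: Persistence = Persistence.VOLATILE

    def __post_init__(self):
        # Exactly one of value or handle must be given.
        if (self.value is None) == (self.handle is None):
            raise ValueError("give exactly one of value or handle")


@dataclass
class Profile:
    problem: str
    arguments: List[Argument] = field(default_factory=list)

    def payload_size(self):
        """Bytes the client must ship: arguments passed by handle cost nothing."""
        return sum(len(a.value) for a in self.arguments if a.value is not None)


first = Profile("EXTRACTION", [
    Argument(value=b"img1", mode=Persistence.PERSISTENT),
    Argument(value=b"img2", mode=Persistence.PERSISTENT),
])
second = Profile("ANTENNA", [Argument(handle="img3")])
```

The end-user only decides which data should persist and passes handles afterwards; where the data lives stays hidden behind the handle.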

24 Outline
- Introduction: the NES context
- Related work
- Motivations and issues
- The data management service
- Experimental results
- Conclusion and future work

25 Experimental results (Description)
- Previous experiments show:
  - The good scalability and low overhead of DTM
- The following tests show:
  - The relevance of the data persistency approach
  - The performance of the data replication policy
- Platform: DTM deployed over two laboratories 100 km apart

26 Experimental results (Data persistence benefits)
- 1 MA, 2 LA and 2 servers locally interconnected (100 Mbit/s)
- 1 client at the remote site (16 Mbit/s)
- Linear algebra application
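
A back-of-the-envelope calculation shows why the two link speeds matter. The 100 Mbyte data size below is an assumed figure chosen for illustration, not a value from the experiments, and the model ignores latency and protocol overhead.

```python
def transfer_seconds(size_mbytes, link_mbits_per_s):
    """Ideal transfer time: size converted to megabits, divided by the link rate."""
    return size_mbytes * 8 / link_mbits_per_s


size = 100                                  # Mbytes (assumed, for illustration)
from_client = transfer_seconds(size, 16)    # remote client link   -> 50.0 s
between_servers = transfer_seconds(size, 100)   # local server link -> 8.0 s
```

Under these assumptions, every request that reuses data already stored in the platform avoids the slower client-side transfer entirely.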

27 Experimental results (Data persistence benefits)

28 Experimental results (Replication benefits)
- 1 MA, 6 servers
- Computing the number of occurrences of a letter in a file
- Synchronous requests are sent to the platform
- When data items are not present, they are replicated
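
The replicate-on-miss behaviour of this test can be mimicked in a few lines. The sketch is an in-memory toy under stated assumptions: `Server`, `count_letter` and the file content are hypothetical, and a dictionary copy stands in for a real server-to-server transfer.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.store = {}          # handle -> file content held locally


def count_letter(servers, target, handle, letter):
    """Run the counting service on `target`, replicating the file if needed.

    Returns (count, replicated); `replicated` tells whether a copy had to be
    fetched from another server before computing.
    """
    replicated = False
    if handle not in target.store:
        source = next((s for s in servers if handle in s.store), None)
        if source is None:
            raise KeyError(handle)
        target.store[handle] = source.store[handle]   # stands in for a transfer
        replicated = True
    return target.store[handle].count(letter), replicated


servers = [Server(f"server{i}") for i in range(1, 7)]   # 6 servers, as in the test
servers[0].store["file1"] = "data management in grids"

first = count_letter(servers, servers[3], "file1", "a")    # miss: replicate first
second = count_letter(servers, servers[3], "file1", "a")   # hit: local copy reused
```

Only the first request on a given server pays for the copy; later requests on that server run on the local replica.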

29 Experimental results (Replication benefits)

30 Experimental results (Use case: Dividing Cubes)
- Medical imaging application
- Input files (from 0.1 Mbytes up to 500 Mbytes)
- Several extraction parameters are applied
- Result = JPEG file

31 Experimental results (Use case: Dividing Cubes)

32 Conclusion
- Feasibility for NES environments
- Fully implemented and integrated in DIET since version 1.1
- Promising experimental results
- Standardization proposal (GGF)

33 Future work
- Finalization of the GGF proposal
- Tests on the Grid5000 platform
- Fault tolerance
- Integration of DTM in data grids

