Design and experimentations of an efficient data management service for NES architectures
  B. Del-Fabbro, D. Laiymani, J.M. Nicod and L. Philippe
  Laboratoire d’Informatique de l’Université de Franche-Comté
  VLDB - Data Management in Grids, Seoul, Korea, 11 September 2006
Outline
  Introduction: the NES context
  Related work
  Motivations and issues
  The data management service
  Experimental results
  Conclusion and future work
VLDB - DMG Workshop - Seoul, Korea - 11 September 2006
Introduction: the NES context
  Example: antenna positioning
Introduction: the NES context
  RPC-based model
  Client
  Agent (Broker)
  Servers (provide services)
  Exec(EXTRACTION, img1, img2)
Introduction: the NES context
  Exec(ANTENNA, img3)
  Agent
  Data can be reused for further computations
Introduction: the NES context
  Exec(EXTRACTION, img1, img2)
  Agent
  It is necessary to allow the storage of some data: data persistency
Introduction: the NES context
  Exec(ANTENNA, &img3)
  Agent
  It is necessary to allow the storage of some data: data persistency
Introduction: the NES context
  Exec(ANTENNA, &img3)
  Exec(RENDU, &img3)
  Exec(ANTENNA, &img3)
  Agent
  It is necessary to take advantage of the parallelism offered by independent tasks: data replication
Goal
  To propose a data management service for NES architectures that implements data persistency and data replication as transparently as possible for end-users
Related work: non-NES architectures - Concepts
  Data Grid context
    Separating the physical and logical views of data
    European Data Grid…
  Grid Computing context
    Large number of widely distributed nodes
    GASS, LegionFS…
  Stork
    Pre-placement tool
    Generally coupled with a meta-scheduler
Related work: non-NES architectures - Drawbacks
  Mainly storage and system oriented
    Difficult to adapt to NES environments
  Data transfers are explicitly performed at the client level
    Lack of transparency
Related work: NES architectures
  Concepts
    Decreasing network traffic between clients and servers
    Ensuring that no unnecessary data are transmitted
    NetSolve
      Request Sequencing
      Distributed Storage Infrastructure (DSI)
  Drawbacks
    Data management is performed for only one computation sequence
    Data transfers are explicit at the client level
Issues
  A NES data management service must address the following issues:
  Replica consistency
    For update operations
    Do all the replicas have to be updated, or are all the replicas independent copies?
  Data storage
    Store data as close as possible to servers
    Physical limitations of storage resources
  Security
    Secure access policy
    Data can be shared: access rights
Issues
  A NES data management service must address the following issues:
  Data localization
    For data items stored inside the platform
    Find where a data item is stored
  Data identification
    A data item must be fully identified
    A client does not have to know where its data are stored
    Data handle = unique reference to a data item
  Data redistribution
    Bandwidth is better between servers than between clients and servers
    Move data between computational servers
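The data handle idea can be illustrated with a minimal sketch. Everything here is hypothetical (the `HandleRegistry` class, its method names and the server names are invented for illustration and are not part of any real platform API): the client only ever holds an opaque handle, while the platform keeps track of where the data item lives.

```python
import uuid


class HandleRegistry:
    """Hypothetical registry mapping opaque handles to the servers storing the data."""

    def __init__(self):
        self._locations = {}  # handle -> set of server names

    def register(self, server):
        """Record a new data item stored on `server` and return its unique handle."""
        handle = uuid.uuid4().hex
        self._locations[handle] = {server}
        return handle

    def add_replica(self, handle, server):
        """Record that a copy of the data item now also lives on `server`."""
        self._locations[handle].add(server)

    def locate(self, handle):
        """Return the servers currently holding the data item, sorted by name."""
        return sorted(self._locations[handle])


registry = HandleRegistry()
img3 = registry.register("server-A")    # the client keeps only this handle
registry.add_replica(img3, "server-B")  # redistribution happens between servers
print(registry.locate(img3))            # ['server-A', 'server-B']
```

The client never sees `server-A` or `server-B` unless it asks; it simply passes `img3` along with its next request.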
The data management service: DTM - Basics
  Data Tree Manager (DTM)
    Distributed as part of the DIET platform
    Flexible enough to be implemented in other platforms
  Distributed Interactive Engineering Toolbox (DIET)
    NES CORBA-based platform
    Hierarchical architecture
    Master and Local Agents
    Performance forecasting tool (FAST)
The data management service: DTM - Architecture (figure not reproduced)
The data management service: DTM - Components
  The Logical Data Manager
    It manages a list of tuples (data handle, owners) for the data present in its sub-tree
    It provides the localization knowledge
  The Physical Data Manager
    It manages a list of persistent data
    It stores data and provides them to its server
    It informs its parent when update operations (add, move, delete) occur
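The interplay between the two managers can be sketched as a small tree. This is an illustrative model only, not the DTM implementation: the class and method names are assumptions, the `move` operation is left out, and the rule "propagate upward only when a sub-tree gains its first copy or loses its last one" is one plausible reading of how updates stay local.

```python
class LogicalDataManager:
    """Hypothetical agent-side index: handle -> children of this node that hold it."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.owners = {}

    def notify_add(self, handle, child):
        is_new = handle not in self.owners
        self.owners.setdefault(handle, set()).add(child)
        # Propagate upward only when this sub-tree did not hold the handle before.
        if is_new and self.parent is not None:
            self.parent.notify_add(handle, self.name)

    def notify_delete(self, handle, child):
        holders = self.owners.get(handle)
        if holders is None:
            return
        holders.discard(child)
        # Propagate upward only when the last copy in this sub-tree is gone.
        if not holders:
            del self.owners[handle]
            if self.parent is not None:
                self.parent.notify_delete(handle, self.name)

    def locate(self, handle):
        """Return the names of the children through which the handle can be reached."""
        return sorted(self.owners.get(handle, ()))


class PhysicalDataManager:
    """Hypothetical server-side store that reports every update to its parent."""

    def __init__(self, name, parent):
        self.name = name
        self.parent = parent
        self.store = {}

    def add(self, handle, value):
        self.store[handle] = value
        self.parent.notify_add(handle, self.name)

    def delete(self, handle):
        del self.store[handle]
        self.parent.notify_delete(handle, self.name)


ma = LogicalDataManager("MA")               # master agent
la1 = LogicalDataManager("LA1", parent=ma)  # local agent
s1 = PhysicalDataManager("S1", parent=la1)
s2 = PhysicalDataManager("S2", parent=la1)

s1.add("img3", b"...")
s2.add("img3", b"...")  # second copy: only LA1's index changes
print(ma.locate("img3"), la1.locate("img3"))  # ['LA1'] ['S1', 'S2']
```

In this model the master agent only knows which sub-tree holds a handle; the local agent knows which servers do.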
The data management service: DTM - Components
  The Data Mover
    It provides mechanisms for data transfers between Data Managers
    Data transfer management and data recording are separated
    Integration of different transfer protocols: GridFTP, RFT…
  The Replica Manager
    It sends replication orders to the Data Mover
    It allows the choice of the best replica to be transferred (NWS tool)
    It uses a distributed protocol: no distinction between the original data and its replicas
    Replicas are read-only, but the architecture allows the implementation of any consistency technique
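Choosing the "best" replica can be sketched as picking the source with the lowest estimated transfer time. The slides only say that the NWS tool is used; the cost model below (latency plus size divided by bandwidth), the function name and the forecast numbers are all assumptions made for illustration, and no real NWS call is shown.

```python
def pick_best_replica(size_bytes, forecasts):
    """Return the server whose replica is expected to arrive first.

    `forecasts` maps a server name to a (latency_seconds, bandwidth_bytes_per_second)
    pair, standing in for whatever a network forecasting tool would report.
    """
    def estimated_time(item):
        _, (latency, bandwidth) = item
        return latency + size_bytes / bandwidth

    best_server, _ = min(forecasts.items(), key=estimated_time)
    return best_server


# Hypothetical forecasts: S1 is close but slow, S2 is farther away but fast.
forecasts = {
    "S1": (0.05, 2_000_000),
    "S2": (0.20, 12_500_000),
}

print(pick_best_replica(100_000_000, forecasts))  # S2: bandwidth dominates for large data
print(pick_best_replica(1_000, forecasts))        # S1: latency dominates for small data
```

Because original and replicas are not distinguished, any holder of the data item is a valid candidate in `forecasts`.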
The data management service: DTM - Architecture advantages
  Communication occurs between DIET and DTM components
    Low bandwidth consumption for data management
  Update operations are limited to sub-trees
    Again, low bandwidth consumption for data management
  DTM minimizes the number of data copy operations (CORBA)
    Crucial for large data
The data management service: DTM - The end-user point of view
  Only end-users have the knowledge of the application they submit
    Only end-users have the knowledge of the data that must be managed
  The persistence mode
    It lets end-users choose whether data must be persistent or not
  The data handle
    End-users do not need to know where data are stored
  The API
    Based on the profile concept
    Problem name + data or data handle + persistence mode
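The profile concept (problem name + data or data handle + persistence mode) can be modelled as a small data structure. This is a sketch with invented names (`Profile`, `Argument`, `Persistence`, `DataHandle`, `payload_bytes`); it does not reproduce the actual DIET client API, and the two-valued persistence mode is a simplification of "persistent or not".

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Union


class Persistence(Enum):
    VOLATILE = "volatile"      # data is discarded after the computation
    PERSISTENT = "persistent"  # data stays on the platform for later requests


@dataclass(frozen=True)
class DataHandle:
    ident: str  # opaque reference; the end-user never sees a location


@dataclass
class Argument:
    value: Union[bytes, DataHandle]  # raw data, or a handle to data already stored
    mode: Persistence = Persistence.VOLATILE


@dataclass
class Profile:
    problem: str
    arguments: List[Argument] = field(default_factory=list)


def payload_bytes(profile):
    """Bytes the client must ship itself: handles cost nothing, raw data costs its size."""
    return sum(len(a.value) for a in profile.arguments if isinstance(a.value, bytes))


# First request sends raw images; a later request reuses a stored result by handle.
first = Profile("EXTRACTION", [Argument(b"img1"), Argument(b"img2")])
second = Profile("ANTENNA", [Argument(DataHandle("img3"), Persistence.PERSISTENT)])

print(payload_bytes(first), payload_bytes(second))  # 8 0
```

The second profile carries no data at all from the client, which is the benefit the persistence mode and the handle are meant to provide.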
Experimental results - Description
  Previous experiments show:
    The good scalability and low overhead of DTM
  The following tests show:
    The relevance of the data persistency approach
    The performance of the data replication policy
  Platform: DTM deployed over two laboratories 100 km apart
Experimental results - Data persistence benefits
  1 MA - 2 LA and 2 servers locally interconnected (100 Mbit/s)
  1 client at the remote site (16 Mbit/s)
  Linear algebra application
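A back-of-the-envelope calculation shows why keeping data on the platform should pay off with these two link speeds. The 100 Mbyte payload is a hypothetical figure chosen for illustration, not a measurement from the experiments, and the model ignores latency and protocol overhead.

```python
def transfer_seconds(size_mbytes, link_mbits_per_s):
    """Ideal transfer time: size in megabytes, link rate in megabits per second."""
    return size_mbytes * 8 / link_mbits_per_s


size = 100  # hypothetical payload in Mbytes

print(transfer_seconds(size, 16))   # 50.0 s from the remote client
print(transfer_seconds(size, 100))  # 8.0 s between local servers
```

Under these assumptions, every request that reuses data already stored on the server side avoids the slower client link entirely.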
Experimental results - Data persistence benefits (figure not reproduced)
Experimental results - Replication benefits
  1 MA - 6 servers
  Counting the occurrences of a letter in a file
  Synchronous requests are sent to the platform
  When data items are not present, they are replicated
Experimental results - Replication benefits (figure not reproduced)
Experimental results - Use case: Dividing Cubes
  Medical imaging application
  Input files (from 0.1 Mbytes up to 500 Mbytes)
  Several extraction parameters are applied
  Result = JPEG file
Experimental results - Use case: Dividing Cubes (figure not reproduced)
Conclusion
  Feasibility for NES environments
  Fully implemented and integrated in DIET since version 1.1
  Promising experimental results
  Standardization proposal (GGF)
Future work
  Finalization of the GGF proposal
  Tests on the Grid5000 platform
  Fault tolerance
  Integration of DTM in data grids