1
The EU DataGrid Data Management
The European DataGrid Project Team
2
EDG Tutorial Overview: Information Service, Workload Management Services, Data Management Services, Networking, Fabric Management
3
Overview: Data Management Issues; Main Components: Replica Manager, Replica Location Service, Replica Metadata Catalog, Replica Optimization Service. In this lecture you will learn about data management issues within a Data Grid and about the main components that address them. Subsequently, four data management tools currently deployed on the EDG testbed are presented.
4
File Management Motivation
(Diagram: Site A and Site B, each with a Storage Element, connected by a File Transfer service; files A, B, C, D, X and Y are distributed across the two sites.) File Management on the Grid: what file management functionalities are expected? The promise of the Grid is that user files will be managed by Grid services. The user's files are identified by a name known to the user, the Logical File Name (files A, B, C, D, X, Y in this slide). A file may be replicated to many Grid sites; in the example shown, files A and B are available on both sites, while C, D, X and Y are available on only one of the two. In the following we argue for the file management functionalities that DataGrid users (HEP, EO, BIO) need in a Grid environment. To make files available on more than one site, we need File Transfer services between sites. The Storage Element is defined (from the point of view of Data Management) as an abstract storage service that 'speaks' several file transfer protocols, for example FTP, HTTP, GridFTP and RFIO. The File Transfer service is offered by the Storage Element, the default for DataGrid being GridFTP.
5
File Management Motivation
Replica Catalog: Map Logical to Site files. (Diagram: Site A and Site B with their Storage Elements and the File Transfer service, as before.) If files are available from more than one storage site (because they have been transferred using the File Transfer service), we need a service that keeps track of them. This service is the Replica Catalog; it contains the mappings of Logical File Names to their Site File Names, i.e. the names by which a file can be accessed through the Storage Element at a given site.
6
File Management Motivation
Replica Catalog: Map Logical to Site files. Replica Selection: Get the 'best' file. (Diagram as before.) Next, we need a service that chooses between the available replicas based on information available from the Grid: network bandwidth, Storage Element access speed, etc.
7
File Management Motivation
Replica Catalog: Map Logical to Site files. Replica Selection: Get the 'best' file. Pre-/Post-processing: Prepare files for transfer, validate files after transfer. (Diagram as before.) Many file types need some kind of pre- and/or post-processing before and/or after transfer. These steps may involve data extraction, conversion or encryption for pre-processing, and import, conversion or decryption for post-processing. As a quality-of-service step, the file may also be validated after transfer (by checking its checksum, for example). This service should be customizable by the applications that use certain file types.
8
File Management Motivation
Replica Catalog: Map Logical to Site files. Replica Selection: Get the 'best' file. Pre-/Post-processing: Prepare files for transfer, validate files after transfer. Replication Automation: Data Source subscription. (Diagram as before.) Automated replication based on a subscription model is also desirable, especially for HEP, where the data coming from the LHC will have to be distributed automatically around the world. The subscription semantics may be defined through file name patterns, data locations, etc. This service should also be customizable by the Virtual Organizations setting it up.
9
File Management Motivation
Replica Catalog: Map Logical to Site files. Replica Selection: Get the 'best' file. Pre-/Post-processing: Prepare files for transfer, validate files after transfer. Replication Automation: Data Source subscription. Load balancing: Replicate based on usage. (Diagram as before.) Another desirable automatic replication mechanism, to be offered as a service, is replication that balances the access load on 'popular' files. This service needs to store and analyze user access patterns on the files available through the Grid. Replication is triggered by rules that are applied to the usage patterns of the existing files.
10
Replica Manager: 'atomic' replication operations, single client interface, orchestrator
File Management. Replica Catalog: Map Logical to Site files. Replica Selection: Get the 'best' file. Pre-/Post-processing: Prepare files for transfer, validate files after transfer. Replication Automation: Data Source subscription. Load balancing: Replicate based on usage. (Diagram as before.) The user does not want the burden of contacting each of these services directly, taking care of the correct order of operations and making sure that errors are caught and handled accordingly. This functionality is provided by the Replica Manager, which acts as the single interface to all replication services. As an additional QoS functionality it should provide transactional integrity for the steps involved in high-level use of the underlying services: it should be able to gracefully recover interrupted transfers, catalog updates, pre- and post-processing steps, etc.
11
Replica Manager: 'atomic' replication operations, single client interface, orchestrator
File Management. Replica Catalog: Map Logical to Site files. Replica Selection: Get the 'best' file. Pre-/Post-processing: Prepare files for transfer, validate files after transfer. Replication Automation: Data Source subscription. Load balancing: Replicate based on usage. Metadata: LFN metadata, transaction information, access patterns. (Diagram as before.) A helper service used by the File Management services described so far is the metadata service. It should be able to store all metadata associated with Data Management, such as attributes on LFNs, patterns for the load-balancing service, transaction locks for the Replica Manager, etc.
12
Security File Management
Replica Manager: 'atomic' replication operations, single client interface, orchestrator. File Management. Replica Catalog: Map Logical to Site files. Replica Selection: Get the 'best' file. Security. Pre-/Post-processing: Prepare files for transfer, validate files after transfer. Replication Automation: Data Source subscription. Load balancing: Replicate based on usage. Metadata: LFN metadata, transaction information, access patterns. (Diagram as before.) Across all of these services we need to impose Security, which means that the services must authenticate and authorize their clients. To provide a higher level of consistency between replicas, we require users to transfer all administrative rights on the files they register through the Replica Manager. To administer their files they therefore need to go through the Replica Manager again; administering the files directly through the Storage Element will be denied. This enables the Replica Manager to act on the user's behalf; automated replication would not be possible without this restriction.
13
Data Management Tools
Tools are needed for: locating data, copying data, managing and replicating data, and metadata management. On the EDG Testbed you have: the Replica Location Service (RLS), the Replica Metadata Catalog (RMC), the Replica Optimisation Service (ROS) and the Replica Manager (RM). To cope with these issues you need several tools that allow you to find data on the Grid, to copy data, to replicate data (i.e. to produce exact copies for better locality) and, ideally, high-level data management tools that hide most of the complexity. In addition, applications may store extra metadata with their files, which of course also needs to be managed. On the EDG Testbed we currently provide: a replica catalog that stores physical locations of files, i.e. Physical File Names, together with a logical identifier, the Logical File Name; Globus tools (globus-url-copy, which uses the GridFTP protocol) for secure file transfer; GDMP, a data mirroring package that automatically provides data at sites that are interested in it; the EDG Replica Manager, which replicates files while keeping the replica catalog consistent; and Spitfire, a front end to Grid-enabled databases for managing metadata (this tool is not covered in this tutorial).
14
What is a Replica?
(Figure: example files 'Stockie' and 'LollyPop', each shown with a Logical File Name (LFN) and a physical name on storage.)
18
Replicas are not synchronised
What is a Replica? (Figure: the same example files 'Stockie' and 'LollyPop' with their Logical File Names and physical names on storage.) Replicas are not synchronised.
19
Replication Services: Basic Functionality
Each file has a unique Grid ID (GUID). Locations corresponding to the GUID are kept in the Replica Location Service. Users may assign aliases to the GUIDs; these are kept in the Replica Metadata Catalog. Files have replicas stored on Storage Elements at many Grid sites. The Replica Manager provides atomicity for file operations, assuring consistency of Storage Element and catalog contents. (Diagram: the Replica Manager in front of the Replica Metadata Catalog, the Replica Location Service and two Storage Elements; WP2 services shown in blue.)
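To make the relationship between the two catalogs concrete, here is a minimal sketch (not the EDG API; all names and data structures are illustrative) of how an alias resolves through the Replica Metadata Catalog to a GUID, and through the Replica Location Service to the replica locations:

# Minimal sketch of the two catalog mappings (illustrative, not the EDG API).
import uuid

replica_metadata_catalog = {}   # LFN alias -> GUID
replica_location_service = {}   # GUID -> set of SURLs (replica locations)

def register_file(lfn: str, surl: str) -> str:
    """Assign a GUID to a new file, record its alias and its first replica."""
    guid = "guid:" + str(uuid.uuid4())
    replica_metadata_catalog[lfn] = guid
    replica_location_service[guid] = {surl}
    return guid

def add_replica(lfn: str, surl: str) -> None:
    """Record an additional replica location for an already registered file."""
    guid = replica_metadata_catalog[lfn]
    replica_location_service[guid].add(surl)

def list_replicas(lfn: str) -> set:
    """Resolve LFN -> GUID -> all known replica SURLs."""
    return replica_location_service[replica_metadata_catalog[lfn]]

register_file("lfn:higgs", "srm://se.lyon.example/data/higgs.1")
add_replica("lfn:higgs", "srm://se.nikhef.example/data/higgs.1")
print(list_replicas("lfn:higgs"))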
20
Higher Level Replication Services
Hooks for user-defined pre- and post-processing of replication operations are available. The Replica Manager may call on the Replica Optimization Service to find the best replica among many, based on network and Storage Element monitoring. (Diagram: the Replica Manager now also uses the Replica Optimization Service, which is fed by the SE Monitor and the Network Monitor; WP2 services shown in blue.)
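The optimization step can be pictured as choosing the replica with the lowest estimated access cost. The sketch below only illustrates that idea; the cost figures are made up and stand in for what the network and SE monitors would provide.

# Illustrative replica selection: pick the replica with the lowest estimated cost.
# The cost numbers stand in for network/SE monitoring data; they are not real metrics.

def estimated_cost(surl: str, requesting_site: str, network_cost, se_load) -> float:
    """Combine a network transfer estimate with a storage-element load penalty."""
    se_host = surl.split("/")[2]          # crude host extraction from srm://host/path
    return network_cost[(se_host, requesting_site)] + se_load[se_host]

def best_replica(surls, requesting_site, network_cost, se_load) -> str:
    return min(surls, key=lambda s: estimated_cost(s, requesting_site, network_cost, se_load))

replicas = ["srm://se.lyon.example/data/higgs.1", "srm://se.nikhef.example/data/higgs.1"]
network_cost = {("se.lyon.example", "CERN"): 12.0, ("se.nikhef.example", "CERN"): 7.5}
se_load = {"se.lyon.example": 0.4, "se.nikhef.example": 2.0}
print(best_replica(replicas, "CERN", network_cost, se_load))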
21
Interactions with other Grid components
(Diagram: the Virtual Organization Membership Service, a User Interface or Worker Node, the Resource Broker and the Information Service interacting with the WP2 services — Replica Manager, Replica Metadata Catalog, Replica Location Service, Replica Optimization Service — plus the Storage Elements, SE Monitor and Network Monitor; WP2 services shown in blue.) Applications and users interface to data through the Replica Manager, either directly or through the Resource Broker. Management calls should never go directly to the SRM.
22
Storage Resource Management (1)
Data are stored on disk pool servers or Mass Storage Systems. Storage resource management needs to take into account: transparent access to files (migration to/from the disk pool), file pinning, space reservation, file status notification, and lifetime management. The SRM (Storage Resource Manager) takes care of all these details. SRM is a Grid service that handles local storage interaction and provides a Grid interface to the outside world. In EDG we originally used the term Storage Element; now we use the term SRM to refer to the new service.
23
Storage Resource Management (2)
Original SRM design specification: LBL, JNL, FNAL, CERN. Support for local policy: each storage resource can be managed independently, and internal priorities are not sacrificed by data movement between Grid agents; disk and tape resources are presented as a single element. Temporary locking/pinning: files can be read from disk caches rather than from tape. Reservation on demand and advance reservation: space can be reserved for registering a new file, so storage system usage can be planned. File status and estimates for planning: provides information on file status and estimates on space availability/usage. EDG provides one implementation as part of the current software release.
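As a rough illustration of the capabilities listed above (not the actual SRM protocol or API; class and method names are simplified inventions), a storage manager could expose pinning, space reservation and status reporting along these lines:

# Rough illustration of SRM-style capabilities: pinning, space reservation, status.
# This is not the SRM interface; names and semantics are simplified for the sketch.
import time

class ToySRM:
    def __init__(self, capacity_bytes: int):
        self.capacity = capacity_bytes
        self.reserved = 0
        self.pins = {}          # surl -> pin expiry time (lifetime management)

    def reserve_space(self, nbytes: int) -> bool:
        """Advance reservation so a new file can be registered."""
        if self.reserved + nbytes > self.capacity:
            return False
        self.reserved += nbytes
        return True

    def pin(self, surl: str, lifetime_s: int) -> None:
        """Keep a file on the disk cache (rather than tape) until the pin expires."""
        self.pins[surl] = time.time() + lifetime_s

    def status(self, surl: str) -> str:
        """Report whether a file is currently pinned on disk."""
        expiry = self.pins.get(surl)
        return "pinned-on-disk" if expiry and expiry > time.time() else "released"

srm = ToySRM(capacity_bytes=10 * 1024**3)
print(srm.reserve_space(2 * 1024**3))     # True: space usage can be planned in advance
srm.pin("srm://se.example/data/higgs.1", lifetime_s=3600)
print(srm.status("srm://se.example/data/higgs.1"))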
24
Simplified Interaction Replica Manager - SRM
(Diagram: Replica Manager client, Catalog, SRM and Storage, with numbered arrows for the steps below.) 1. The client asks a catalog to provide the location of a file. 2. The catalog responds with the name of an SRM. 3. The client asks the SRM for the file. 4. The SRM asks the storage system to provide the file. 5. The storage system sends the file to the client, either through the SRM or directly.
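The numbered interaction can be read as the following pseudo-client sketch, where the catalog lookup and SRM call are placeholders for whatever protocol the real services speak (the function and host names are invented for illustration):

# Pseudo-client for the interaction above; catalog/SRM calls are placeholders.

def locate_file(catalog: dict, lfn: str) -> str:
    # Steps 1-2: ask the catalog, which answers with the SRM holding the file.
    return catalog[lfn]

def fetch_from_srm(srm_host: str, lfn: str) -> bytes:
    # Steps 3-5: ask the SRM, which stages the file from storage and returns it
    # (or hands back a transfer URL for a direct download to the client).
    print(f"contacting {srm_host} for {lfn}")
    return b"...file contents..."

catalog = {"lfn:higgs": "srm.lyon.example"}
srm_host = locate_file(catalog, "lfn:higgs")
data = fetch_from_srm(srm_host, "lfn:higgs")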
25
Naming Conventions
Logical File Name (LFN): an alias created by a user to refer to some item of data, e.g. “lfn:cms/ /run2/track1”. Site URL (SURL), or Physical File Name (PFN): the location of an actual piece of data on a storage system, e.g. “srm://pcrd24.cern.ch/flatfiles/cms/output10_1”. Globally Unique Identifier (GUID): a non-human-readable unique identifier for an item of data, e.g. “guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6”. (Diagram: Logical File Names 1..n map to one GUID, which maps to Physical File SURLs 1..n.)
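A small sketch of how the three identifier forms can be told apart by their scheme prefix; the classification function is an invention for illustration, and the example strings follow the slide:

# Classify a Grid data identifier by its scheme prefix: lfn:, srm: (SURL) or guid:.

def identifier_kind(name: str) -> str:
    if name.startswith("lfn:"):
        return "Logical File Name (user-assigned alias)"
    if name.startswith("srm:"):
        return "Site URL / Physical File Name (location on a storage system)"
    if name.startswith("guid:"):
        return "Globally Unique Identifier"
    raise ValueError(f"unknown identifier scheme: {name}")

for example in ("lfn:higgs",
                "srm://pcrd24.cern.ch/flatfiles/cms/output10_1",
                "guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"):
    print(example, "->", identifier_kind(example))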
26
Replica Metadata Catalog (RMC) vs. Replica Location Service (RLS)
RMC: stores LFN-GUID mappings. RLS: stores GUID-SURL mappings. (Diagram: Logical File Names 1..n map via the RMC to a GUID, which maps via the RLS to Physical File SURLs 1..n; the RM, RLS, RMC and ROS components are shown.)
27
Replica Location Service (RLS)
The Replica Location Service is a system that maintains and provides access to information about the physical location of copies of data files. It is a distributed service that stores mappings between globally unique identifiers of data files and the physical identifiers of all existing replicas of these files. The design is a joint collaboration between Globus and EDG-WP2.
28
Replica Location Service RLS
Local Catalogs hold the actual name mappings. Remote Indices redirect inquiries to the LRCs actually holding the file. LRCs are configured to send index updates to any number of RLIs. The indexes are Bloom filters. The Replica Location Service has two sub-service components. Local Replica Catalogs (LRC) reside at each site close to the actual data store, cataloging only the data that is at that site, so there will usually be one LRC per Storage Element. Replica Location Indices (RLI) are lightweight services that hold Bloom filter index bitmaps of LRCs. The LRCs can be configured dynamically to send indices to any number of RLIs: each LRC computes a Bloom filter bitmap that is compressed and sent to each subscribed RLI. Bloom filters are compact data structures for the probabilistic representation of a set, supporting membership queries (i.e. queries that ask: “Is element X in set Y?”). This compact representation is the payoff for allowing a small rate of false positives in membership queries; that is, a query might incorrectly recognize an element as a member of the set. The false positive rate can be optimized by tuning various parameters of the filter; smaller false positive rates require more computation time and memory, and acceptable rates are small. Hence RLIs answer lookups by responding with the names of the LRCs that might have a copy of the given file; the LRCs then need to be contacted to get a definitive answer. This method is used successfully in peer-to-peer networks, up to a certain network size. In the EU DataGrid we do not foresee more than a few dozen sites, at most O(100), for which this algorithm still scales very well.
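A minimal Bloom filter sketch (illustrative parameters, not the RLS implementation): an LRC would add the GUIDs it catalogs and push the bitmap to its RLIs, and a membership test can return false positives but never false negatives.

# Minimal Bloom filter: k hash functions set k bits per element in a bitmap.
# Membership tests may yield false positives, never false negatives.
import hashlib

class BloomFilter:
    def __init__(self, num_bits: int = 8192, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, item: str):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))

# An LRC would add its GUIDs and push the (compressed) bitmap to subscribed RLIs.
lrc_index = BloomFilter()
lrc_index.add("guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6")
print(lrc_index.might_contain("guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))  # True
print(lrc_index.might_contain("guid:0000-not-registered"))  # False (or, rarely, a false positive)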
29
RLS Components (1) Local Replica Catalog (LRC)
Stores GUID to SURL (PFN) mappings for a single SRM. Stores attributes on SURLs (PFNs), e.g. file size and creator. Maintains local state independently of other RLS components: a complete local record of all replicas on a single SRM. Many Local Replica Catalogs can be configured in a Grid, for instance one LRC per site or one LRC per SRM. The LRC is a fairly permanent fixture: it is coupled to its SRM, SRM removal is infrequent, and new LRCs are added as new SRMs are added to a site. In summary, the Local Replica Catalog (LRC) stores logical-identifier-to-physical-file mappings for a single Grid site or Storage Element (typically one LRC per Storage Element), giving a complete, locally consistent record of the replicas at a single site, while the Replica Location Index (RLI) indexes over all Local Replica Catalogs in a Grid.
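An LRC's content can be pictured as a per-SRM table of GUID-to-SURL mappings plus attributes on each SURL; the sketch below is only a data-structure illustration, with invented class, method and attribute names.

# Data-structure illustration of a Local Replica Catalog for one SRM (names invented).

class LocalReplicaCatalog:
    def __init__(self, srm_host: str):
        self.srm_host = srm_host
        self.mappings = {}        # guid -> list of SURLs on this SRM
        self.attributes = {}      # surl -> {"size": ..., "creator": ...}

    def register(self, guid: str, surl: str, size: int, creator: str) -> None:
        self.mappings.setdefault(guid, []).append(surl)
        self.attributes[surl] = {"size": size, "creator": creator}

    def lookup(self, guid: str) -> list:
        """Complete, locally consistent answer for replicas held by this SRM."""
        return self.mappings.get(guid, [])

lrc = LocalReplicaCatalog("se.example.org")
lrc.register("guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
             "srm://se.example.org/data/higgs.1", size=104857600, creator="bob")
print(lrc.lookup("guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))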
30
RLS Components (2) Replica Location Index (RLI)
Stores GUID to LRC mappings: a distributed index over the Local Replica Catalogs in a Grid. Receives periodic soft-state updates from the LRCs; the information has an associated expiration time, and LRCs are configured to push updates to RLIs. Maintains collective state, with inconsistencies possible due to the soft-state update mechanism. Can be installed anywhere (it is not inherently associated with anything). Uses Bloom filter indexes.
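The soft-state mechanism can be sketched as LRCs pushing timestamped summaries that the RLI ignores once they expire; the expiry interval and data structures here are illustrative only (a plain set stands in for the Bloom filter).

# Illustrative soft-state index: LRCs push timestamped summaries, the RLI expires stale ones.
import time

class ReplicaLocationIndex:
    def __init__(self, expiry_seconds: float = 3600.0):
        self.expiry = expiry_seconds
        self.summaries = {}       # lrc_name -> (summary, timestamp)

    def receive_update(self, lrc_name: str, summary) -> None:
        """Soft-state push from an LRC (a plain set stands in for a Bloom filter)."""
        self.summaries[lrc_name] = (summary, time.time())

    def candidate_lrcs(self, guid: str):
        """Return LRCs that *might* hold the GUID; expired summaries are ignored."""
        now = time.time()
        return [name for name, (summary, ts) in self.summaries.items()
                if now - ts < self.expiry and guid in summary]

rli = ReplicaLocationIndex(expiry_seconds=1800)
rli.receive_update("lrc.lyon.example", {"guid:aaa", "guid:bbb"})
rli.receive_update("lrc.nikhef.example", {"guid:bbb"})
print(rli.candidate_lrcs("guid:bbb"))   # both LRCs are candidates; ask them to confirm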
31
LRC Implementation
LRC data are stored in a relational database, running with either Oracle 9i or MySQL. The catalogs are implemented in Java and hosted in a J2EE application server (Tomcat 4 or Oracle 9iAS). Java and C++ APIs are exposed to clients through Apache Axis (Java) and gSOAP (C++), and the catalog APIs are described using WSDL. A vendor-neutral approach was taken to allow different deployment options: the HEP community is made up of large institutes with the finances to invest in Oracle as well as many that cannot afford to employ Oracle expertise, so we are required to demonstrate our Grid data management tools on open-source solutions too.
32
RLI Implementation
Updates are implemented as a push from the LRCs. RLIs are less permanent than LRCs. Only Bloom filter updates are implemented, since an LRC may hold O(10^6) (or more) entries. The index may contain false positives but no false negatives; the rate depends on the configuration of the Bloom filter. Bloom filters are stored to disk. Sending full LRC lists to multiple RLIs is impractical: fine for tests, but not scalable in a production environment. The RLI is implemented in Java as a web service, as are the LRCs. (A Bloom filter is a bit string, or any compact data structure, that represents some set and supports membership queries; the RLI uses it to determine the set of LRCs that might contain a given GUID.)
33
Replica Location Service
Usage here at NCP: only one Local Replica Catalog (LRC) and one Replica Metadata Catalog. (Diagram: the Replica Location Service with the LRC and RMC at pcncp27.ncp.edu.pk.)
34
User Interfaces for Data Management
Users are mainly directed to use the interface of the Replica Manager client: management commands, catalog commands, optimization commands and file transfer commands. The services RLS, RMC and ROS provide additional user interfaces, mainly for additional catalog operations (in the case of RLS and RMC) and for server administration commands. These should mainly be used by administrators, but can also be used to check the availability of a service.
35
The Replica Manager Interface – Management Commands
copyAndRegisterFile (args: source, dest, lfn, protocol, streams): copy a file into grid-aware storage and register the copy in the Replica Catalog as an atomic operation.
replicateFile (args: source/lfn, dest, protocol, streams): replicate a file between grid-aware stores and register the replica in the Replica Catalog as an atomic operation.
deleteFile (args: source/seHost, all): delete a file from storage and unregister it.
Example: edg-rm --vo=tutor copyAndRegisterFile file:/home/bob/analysis/data5.dat -d lxshare0384.cern.ch
This and the next slides show the API functions of the Replica Manager (RM), to be used by clients and applications. Main command-line options: -l: logical file name; -s: source file name; -d: destination file name; -c: config file for the replica catalog; -e: verbose error output (important for testing!).
36
The Replica Manager Interface – Catalog Commands (1)
registerFile (args: source, lfn): register a file in the Replica Catalog that is already stored on a Storage Element.
unregisterFile (args: source, guid): unregister a file from the Replica Catalog.
listReplicas (args: lfn/surl/guid): list all replicas of a file.
registerGUID (args: surl, guid): register an SURL with a known GUID in the Replica Catalog.
listGUID (args: lfn/surl): print the GUID associated with an LFN or SURL.
37
The Replica Manager Interface – Catalog Commands (2)
addAlias (args: guid, lfn): add a new alias-to-GUID mapping.
removeAlias (args: guid, lfn): remove an alias LFN from a known GUID.
printInfo(): print the information needed by the Replica Manager to screen or to a file.
getVersion(): get the version of the Replica Manager client.
38
The Replica Manager Interface – Optimization Commands
listBestFile (args: lfn/guid, seHost): return the 'best' replica for a given logical file identifier.
getBestFile (args: lfn/guid, seHost, protocol, streams): return the storage file name (SFN) of the best file in terms of network latencies.
getAccessCost (args: lfn/guid[], ce[], protocol[]): calculate the expected cost of accessing all the files specified by logicalName from each Computing Element host specified by ceHosts.
39
The Replica Manager Interface – File Transfer Commands
copyFile (args: source, dest): copy a file to a non-grid destination.
listDirectory (args: dir): list the directory contents on an SRM or a GridFTP server.
40
Replica Management Use Case
edg-rm copyAndRegisterFile -l lfn:higgs CERN LYON
edg-rm listReplicas -l lfn:higgs
edg-rm replicateFile -l lfn:higgs NIKHEF
edg-rm listBestFile -l lfn:higgs CERN
edg-rm getAccessCost -l lfn:higgs CERN NIKHEF LYON
edg-rm getBestFile -l lfn:higgs CERN
edg-rm deleteFile -l lfn:higgs LYON
edg-rm listBestFile -l lfn:higgs CERN
Replica Management Demo Scenario: A user produced a file at CERN that for some reason cannot be put into CERN Grid storage (say the site is down for maintenance), so the user registers the file at LYON instead; the first Grid instance of the file will therefore be at LYON. The user uploads the file into the Grid at LYON using the copyAndRegisterFile command. The listReplicas command then produces exactly one line of output: it lists the one file at LYON (the source instance at CERN is at a location not known to the Grid). Say the user wants to make this data also accessible at NIKHEF (because his collaborators work at NIKHEF); to achieve this, he issues the replicateFile command. Since the user is probably still at CERN, he is interested in which of the two replicas is the best one to access from CERN; the listBestFile command gives him the answer. This command takes the requestor's reference location as its argument, in this case CERN, but it could be any other site. The network costs can also be viewed simultaneously for many sites: getAccessCost is designed to be used by the job scheduling service. It takes a list of destination sites (in our case three: CERN, NIKHEF and LYON) and computes the access cost of the best replica with respect to each destination. It does the same as listBestFile for each site, but the output is more verbose: it also gives the (remote) access cost in seconds in addition to the name of the best file. If the file is local, the access cost is zero; so for NIKHEF and LYON this cost will be zero, and for CERN the best file will be either at LYON or at NIKHEF, depending on the current network metrics. Now the user realizes that the CERN site is available again for storage (the maintenance work was completed), so he wants a local copy of the file as well; by issuing getBestFile he copies the best replica to the given destination (CERN in this case). Now there are three replicas, one each at CERN, NIKHEF and LYON. Because the file at LYON is never actually accessed, the user deletes it again using deleteFile. Now listBestFile will of course return the name of the local CERN copy.
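A user could also script parts of this sequence. The sketch below simply shells out to edg-rm with the commands shown on this and the earlier slides; it assumes edg-rm is on the PATH and borrows the --vo=tutor option from the earlier management-commands example.

# Hypothetical helper that runs some of the demo commands by shelling out to edg-rm.
# Assumes edg-rm is installed and that --vo=tutor is valid, as in the earlier example.
import subprocess

def edg_rm(*args: str) -> str:
    """Run one edg-rm command and return its output (raises if the command fails)."""
    result = subprocess.run(["edg-rm", "--vo=tutor", *args],
                            capture_output=True, text=True, check=True)
    return result.stdout

# Mirrors part of the sequence on the slide (arguments as shown there).
print(edg_rm("listReplicas", "-l", "lfn:higgs"))
print(edg_rm("listBestFile", "-l", "lfn:higgs", "CERN"))
print(edg_rm("getAccessCost", "-l", "lfn:higgs", "CERN", "NIKHEF", "LYON"))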
41
Metadata Management and Security
Project Spitfire: 'simple' Grid persistency for Grid metadata and application metadata; a unified Grid-enabled front end to relational databases; metadata replication and consistency; publishing of information on the metadata service. Secure Grid services: Grid authentication, authorization and access control mechanisms are enabled in Spitfire, with a modular design that is reusable by other Grid services. The aim of Spitfire is to provide a uniform front end to Grid-enabled databases offering simple database functionality. We do not expect this service to be used for exhaustive queries but for short, high-frequency lookups and inserts of metadata. Security is an important component of Spitfire, since database access must be authorized for Grid users. The GSI-enabled front end to web services is architected so that it is reusable by other Grid services.
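Spitfire itself exposes a secured web-service front end, but the kind of short, high-frequency metadata inserts and lookups it is meant for can be illustrated against a plain relational table; the table and column names below are invented for the example, with SQLite standing in for the backing database.

# Illustration of short metadata inserts/lookups against a relational table.
# Spitfire fronts such a database as a secured web service; table/column names are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_metadata (lfn TEXT PRIMARY KEY, owner TEXT, run INTEGER)")

# Short, high-frequency style operations rather than exhaustive queries:
conn.execute("INSERT INTO file_metadata VALUES (?, ?, ?)", ("lfn:higgs", "bob", 2))
row = conn.execute("SELECT owner, run FROM file_metadata WHERE lfn = ?",
                   ("lfn:higgs",)).fetchone()
print(row)   # ('bob', 2)
conn.close()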
42
Conclusions and Further Information
The second-generation Data Management services have been designed and implemented based on the Web Service paradigm: a flexible, extensible service framework with deployment choices ranging from robust, highly available commercial products (e.g. Oracle) to open-source alternatives (MySQL, Tomcat). First experiences with these services show that their performance meets expectations. Further information / documentation: