Architecture of the gLite Data Management System


1 Architecture of the gLite Data Management System
Paola Arce, UTFSM, Chile
November 30th, 2010

2 Valparaíso, Porting Application school
Outline
  Challenges of data management in a Grid infrastructure
  Initial definitions
  Types of Storage Elements
  File naming conventions
  File catalogue
  Practical exercises (hands on)

3 Challenges
Heterogeneity
  Data are stored on different storage systems using different access technologies.
Distribution
  Data are stored in different locations (in most cases there is no shared file system or common namespace).
  Data need to be moved between different locations.
Data description
  Data are stored as files, which need to be described and located according to their content.
Services that address these challenges: Storage Resource Manager interface, File Catalogue, File Transfer Service, Metadata Service.

4 Getting started
The Data Management System (DMS) provides services for locating, accessing and transferring files.
Users do not need to know a file's location, just its logical name.
Files can be replicated or transferred to several locations (SEs) as needed.
Files are shared within a VO.

5 Getting started
A Storage Element (SE) is the service that allows users and applications to store and retrieve data. It:
  Provides storage space for files.
  Provides a transfer protocol (GSIFTP), that is, a GSI (Grid Security Infrastructure) based FTP server.
  Provides an interface for the management of disk and tape storage resources: the Storage Resource Manager (SRM).
Files located in Storage Elements:
  Are mostly write-once, read-many.
  Are accessible by users and applications from "anywhere" in the Grid.
  Can have several replicas stored at different sites.
  Cannot be changed unless they are removed or replaced.

6 Types of Storage Elements
dCache
  Consists of a server and one or more pool nodes.
  Centralized administration: single point of access to the SE.
  Files are presented in the disk pools under a single virtual filesystem tree.
  Uses the GSI dCache Access Protocol (gsidcap).
StoRM
  Best suited to large storage (> or >> 100 TB).
  Takes full advantage of parallel filesystems (GPFS, Lustre).
  SRM v2.2 interface.
CERN Advanced STORage manager (CASTOR)
  Files are migrated from a disk buffer frontend to tape mass storage.
  Uses the insecure Remote File I/O protocol (RFIO).
Disk Pool Manager (DPM)
  Used for fairly small SEs (max 10 TB of total space) with disk-based storage only.
  Uses the secure RFIO protocol.

7 Storage Resource Manager (SRM)
The SRM is a single interface that takes care of local storage interaction and provides a Grid interface to the outside world.
[Figure: a user in front of four Storage Elements (CASTOR, StoRM, DPM, dCache).
  Without the SRM: "You as a user need to know all the systems!!!"
  The SRM: "I talk to them on your behalf. I will even allocate space for your files. And I will use transfer protocols to send your files there."]

8 A practical example (1)
A user is working on a job which needs to:
  read Monte Carlo simulations from siteA
  read experiment data from siteB
  read environmental data from siteC
  write output to home siteD
[Figure: StoRM at SiteA, dCache at SiteB, DPM at SiteC, DPM at SiteD.]

9 File Naming conventions (1)
Grid Unique IDentifier (GUID)
  Every file has a GUID: a non-human-readable unique identifier, e.g.:
    guid:38ed3f60-c402-11d7-a6b0-f53ee5a37e1d
  Note: all replicas of a file share the same GUID.
Logical File Name (LFN)
  An alias that can be used to refer to a file, e.g.:
    lfn:/grid/gilda/users/mario/myfile.dat
[Figure: Logical File Name 1 ... Logical File Name N all point to one GUID.]
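
The GUID example on this slide has the shape of a standard UUID behind a `guid:` prefix. As a small sketch under that assumption (the slides do not state that every GUID is a UUID, and `parse_guid` is a hypothetical helper, not part of any gLite tool), the identifier can be checked with Python's standard library:

```python
import uuid

GUID_PREFIX = "guid:"

def parse_guid(guid):
    """Strip the 'guid:' prefix and parse the remainder as a UUID.

    Assumption: GUIDs look like the example on the slide, i.e. a
    'guid:' prefix followed by a UUID in its usual textual form.
    Raises ValueError if the string does not match that shape.
    """
    if not guid.startswith(GUID_PREFIX):
        raise ValueError("expected a 'guid:' prefix: %r" % guid)
    return uuid.UUID(guid[len(GUID_PREFIX):])

# GUID taken from the slide.
example = parse_guid("guid:38ed3f60-c402-11d7-a6b0-f53ee5a37e1d")
print(example)
```

A string without the prefix, or with a malformed identifier, raises `ValueError`, which makes the helper usable as a quick sanity check before passing a GUID to other tools.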

10 File Naming conventions (2)
Storage URL (SURL) or Physical File Name (PFN)
  The location of an actual file on a storage system, e.g.:
    srm://aliserv6.ct.infn.it/dpm/home/gilda/project1/test.dat
  Note: used by the system to find where the replica is physically stored.
Transport URL (TURL)
  Complete URL with the necessary information to access a file in an SE (including the access protocol), e.g.:
    rfio://lxshare0209.cern.ch//data/alice/ntuples.dat
[Figure: Logical File Names 1 ... N point to one GUID; the GUID points to Physical Files (SURL 1 ... SURL N), and each SURL has a corresponding TURL.]
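
Both SURLs and TURLs follow the usual `protocol://host/path` layout, so they can be taken apart with a generic URL parser. The sketch below uses the two example URLs from this slide; `describe_url` is a hypothetical helper introduced only for illustration.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split a SURL or TURL into its protocol, host and path."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,   # e.g. 'srm' for a SURL, 'rfio' for a TURL
        "host": parts.hostname,     # the Storage Element host
        "path": parts.path,         # the file path on that host
    }

# Example URLs taken from the slide.
surl = "srm://aliserv6.ct.infn.it/dpm/home/gilda/project1/test.dat"
turl = "rfio://lxshare0209.cern.ch//data/alice/ntuples.dat"

for url in (surl, turl):
    print(describe_url(url))
```

The protocol field is what distinguishes the two: the SURL names the SRM endpoint, while the TURL names the access protocol a client would actually use.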

11 SRM interactions
[Figure: Client, SRM and SE, with arrows numbered 1 to 5.]
1. The client asks the SRM for the file, providing an SURL.
2. The SRM asks the Storage Element to provide the file.
3. The Storage Element notifies the availability of the file and its location.
4. The SRM returns a TURL (Transfer URL), i.e. the location from where the file can be accessed.
5. The client interacts with the storage using the protocol specified in the TURL.
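
The SURL-to-TURL exchange described on this slide can be mimicked with a toy in-memory model. This is only an illustrative sketch: `ToyStorageElement`, `ToySRM`, the host `se.example.org` and the file paths are all hypothetical, and no real SRM client API is involved.

```python
from urllib.parse import urlsplit

class ToyStorageElement:
    """Hypothetical stand-in for an SE: knows where each file really lives."""

    def __init__(self, host, protocol, files):
        self.host = host
        self.protocol = protocol
        self.files = files  # maps SURL path -> physical path on this SE

    def locate(self, surl_path):
        # The SE reports the availability of the file and its location.
        return self.files[surl_path]

class ToySRM:
    """Hypothetical SRM front end sitting between the client and one SE."""

    def __init__(self, storage_element):
        self.se = storage_element

    def get_turl(self, surl):
        # The client has asked for a file by SURL; the SRM asks the SE for it.
        location = self.se.locate(urlsplit(surl).path)
        # The SRM answers with a TURL: access protocol + host + location.
        return "%s://%s/%s" % (self.se.protocol, self.se.host, location)

se = ToyStorageElement(
    host="se.example.org",
    protocol="rfio",
    files={"/dpm/home/gilda/project1/test.dat": "/data/gilda/test.dat"},
)
srm = ToySRM(se)
turl = srm.get_turl("srm://se.example.org/dpm/home/gilda/project1/test.dat")
print(turl)
```

The client would then open the returned TURL with the protocol it names, which is the last step of the exchange and is outside this sketch.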

12 Needles in a haystack
How do I keep track of all the files I have on the Grid?
Even if I remember all the LFNs of my files, what about someone else's files?
How does the Grid keep track of the mapping between LFN(s), GUID and SURL(s)?
Answer: the File Catalogue.
  LFC = LCG File Catalogue
  LCG = LHC Computing Grid
  LHC = Large Hadron Collider

13 File Catalogue
The File Catalogue is the service which maintains the mappings between LFN(s), GUID and SURL(s).
It keeps track of the location of copies (replicas) of files.
It consists of a single catalogue, where the LFN is the main key.
It looks like a "top-level" directory in the Grid.
For each supported VO, a separate subdirectory exists under the "/grid" directory.
All members of a given VO have read-write permissions in that directory.
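
To make the LFN, GUID and SURL mapping concrete, here is a toy in-memory catalogue. It is a sketch only: `ToyFileCatalogue` and its methods are hypothetical names, the SURLs use placeholder hosts, and generating GUIDs with `uuid.uuid4` is an assumption made for illustration rather than how the LFC does it.

```python
import uuid

class ToyFileCatalogue:
    """Hypothetical catalogue: LFN -> GUID -> list of replica SURLs."""

    def __init__(self):
        self._guid_by_lfn = {}    # the LFN is the main key
        self._surls_by_guid = {}  # one GUID, possibly many replicas

    def register(self, lfn, surl):
        """Register a new file under an LFN and return its new GUID."""
        guid = "guid:" + str(uuid.uuid4())
        self._guid_by_lfn[lfn] = guid
        self._surls_by_guid[guid] = [surl]
        return guid

    def add_alias(self, existing_lfn, new_lfn):
        """Give the same file a second logical name (same GUID)."""
        self._guid_by_lfn[new_lfn] = self._guid_by_lfn[existing_lfn]

    def add_replica(self, lfn, surl):
        """Record another physical copy of an already registered file."""
        self._surls_by_guid[self._guid_by_lfn[lfn]].append(surl)

    def guid_of(self, lfn):
        return self._guid_by_lfn[lfn]

    def list_replicas(self, lfn):
        return list(self._surls_by_guid[self._guid_by_lfn[lfn]])

catalogue = ToyFileCatalogue()
lfn = "lfn:/grid/gilda/tutorials/yourname/test.txt"
catalogue.register(lfn, "srm://se-a.example.org/dpm/home/gilda/test.txt")
catalogue.add_replica(lfn, "srm://se-b.example.org/dpm/home/gilda/test.txt")
catalogue.add_alias(lfn, "lfn:/grid/gilda/tutorials/yourname/alias.txt")
print(catalogue.list_replicas(lfn))
```

Looking up either logical name yields the same GUID and the same replica list, which is the property the catalogue exists to provide.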

14 The LFC Service
[Figure: the User Interface, the File Catalogue and Storage Elements SE A, SE B and SE C; example LFN: lfn:/grid/gilda/tcaland/mpi.txt]

15 Job submission – example 1
Small files: InputSandbox / OutputSandbox
[Figure: User Interface, WMS, CE and Worker Nodes.]

16 Data Management – example 2
[Figure: User Interface, WMS, CE and Worker Nodes, together with the LFC and two SEs.]

17 LFC commands
These commands interact with the catalogue only.
  lfc-chmod       Change access mode of the LFC file/directory
  lfc-chown       Change owner and group of the LFC file/directory
  lfc-delcomment  Delete the comment associated with the file/directory
  lfc-getacl      Get file/directory access control lists
  lfc-ln          Make a symbolic link to a file/directory
  lfc-ls          List file/directory entries in a directory
  lfc-mkdir       Create a directory
  lfc-rename      Rename a file/directory
  lfc-rm          Remove a file/directory
  lfc-setacl      Set file/directory access control lists
  lfc-setcomment  Add/replace a comment

18 lcg-utils commands
Copy files to/from/between SEs.
Keep the SEs and the Catalogue up to date.
The RPM containing these tools (lcg_util) is installed on the WNs and UIs.
  lcg-cp   Copies a Grid file to a local destination
  lcg-cr   Copies a file to an SE and registers the file in the catalogue
  lcg-del  Deletes one file
  lcg-rep  Replicates a file between SEs and registers the replica
  lcg-gt   Gets the TURL for a given SURL and transfer protocol
  lcg-sd   Sets the file status to "Done" for a given SURL in an SRM request

19 Let's practice!
Reference:

20 Environment Variables
Pointing to the right BDII:
    echo $LCG_GFAL_INFOSYS
    export LCG_GFAL_INFOSYS=gilda-bdii.ct.infn.it:2170
Pointing to the right LFC:
    echo $LFC_HOST
    export LFC_HOST=lfc-gilda.ct.infn.it
Note: the shell does not accept spaces around "=" in an export assignment.
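
A script that wraps the data management commands could check these two variables up front. The following is a minimal sketch; `missing_variables` is a hypothetical helper, and the only facts taken from the slide are the variable names and the example values.

```python
import os

# The two variables named on the slide.
REQUIRED_VARIABLES = ("LCG_GFAL_INFOSYS", "LFC_HOST")

def missing_variables(environ=None):
    """Return the required variables that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARIABLES if not environ.get(name)]

# Check the real environment of the current process.
missing = missing_variables()
if missing:
    print("Please export: " + ", ".join(missing))
else:
    print("BDII and LFC are configured.")
```

Passing a plain dictionary instead of the real environment makes the check easy to try out without touching the shell.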

21 Before starting…
Make sure you have a valid proxy.
Check the current proxy:
    voms-proxy-info -all
Create a new one if needed:
    voms-proxy-init --voms gilda

22 LFC: creating a directory
List directories:
    lfc-ls /grid/gilda/tutorials/
Create your own personal directory inside:
    lfc-mkdir /grid/gilda/tutorials/epikh
You can check the creation by typing:
    lfc-ls /grid/gilda/tutorials/

23 Copying and registering a file 1/2
lcg-cr copies a file to an SE and registers the file in the catalogue:
    lcg-cr --vo <vo name> -l <LFN destination> -d <SE> <local file>
Make sure you have a directory in the LFC (/grid/gilda/tutorials/yourname/).
Use the lcg-info or lcg-infosites commands to find the available SEs.
This command returns the GUID of your file.
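
When the upload is driven from a script, it helps to assemble the `lcg-cr` argument list in one place. The sketch below only builds and prints the command line and does not run it. `lcg_cr_command` is a hypothetical helper, the flags are exactly the ones shown in the usage line (`--vo`, `-l`, `-d`), and the LFN and local path are placeholders.

```python
import shlex

def lcg_cr_command(vo, lfn, storage_element, local_file):
    """Build the argument list for lcg-cr as shown in the usage line."""
    return [
        "lcg-cr",
        "--vo", vo,               # virtual organisation
        "-l", lfn,                # logical file name to register
        "-d", storage_element,    # destination SE
        local_file,               # local file to copy
    ]

command = lcg_cr_command(
    vo="gilda",
    lfn="lfn:/grid/gilda/tutorials/yourname/test.txt",
    storage_element="gilda-02.pd.infn.it",
    local_file="file:///home/yourname/test.txt",  # placeholder path
)
# Print the command for review instead of executing it.
print(shlex.join(command))
```

Keeping the arguments as a list, rather than one string, avoids quoting problems if the command is later passed to `subprocess.run` on a machine where the grid tools are installed.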

24 Copying and registering a file 2/2
List the available SEs:
    lcg-infosites --vo gilda se
    Avail Space(Kb)  Used Space(Kb)  Type  SEs
    n.a  se.hpc.iit.bme.hu
    n.a  se1-egee.srce.hr
    n.a  gilda-02.pd.infn.it
    n.a  sirius-se.ct.infn.it
    n.a  grisuse.scope.unina.it
Copy and register the file:
    lcg-cr --vo gilda -l lfn:/grid/gilda/tutorials/epikh/test.txt -d gilda-02.pd.infn.it file://$HOME/test.txt
    guid:0d8ef3b9-7f73-4c57-80c9-e827bace8597

25 Downloading a file
First of all, let's download a file from an SE to start "playing" with it.
Basic usage:
    lcg-cp --vo <vo name> <LFN origin> <local destination>
Try it:
    lcg-cp --vo gilda lfn:/grid/gilda/tutorials/epikh/test.txt file://$HOME/test1.txt

26 Replicate a file between SEs
Basic usage:
    lcg-rep --vo <vo name> -d <destination SE> <LFN of your file>
Try it:
    lcg-rep --vo gilda -d sirius-se.ct.infn.it lfn:/grid/gilda/tutorials/epikh/test.txt

27 Where was it downloaded from?
List the replicas of the file:
    lcg-lr --vo gilda lfn:/grid/gilda/tutorials/epikh/test.txt
This command returns the SURLs of all replicas.
A file can be stored on multiple SEs so that a job can download it from the closest SE while it is running.

28 Deleting a file
Basic usage:
    lcg-del -a --vo <vo name> <LFN>
When used with the '-a' switch, lcg-del deletes all replicas and removes the entry from the catalogue.
Try it:
    lcg-del -a --vo gilda lfn:/grid/gilda/tutorials/epikh/test.txt

29 Removing an LFC directory
Basic usage:
    lfc-rm -r <LFC file path>
Try it:
    lfc-rm -r /grid/gilda/tutorials/epikh

30 Thank you
Any questions?

