Data services in gLite “s” gLite and LCG.

Slides:



Advertisements
Similar presentations
Workflows over Grid-based Web services General framework and a practical case in structural biology gLite 3.0 Data Management Hands-on David García Aristegui.
Advertisements

Workflows over Grid-based Web services General framework and a practical case in structural biology gLite 3.0 Data Management David García Aristegui Grid.
FP7-INFRA Enabling Grids for E-sciencE EGEE Induction Grid training for users, Institute of Physics Belgrade, Serbia Sep. 19, 2008.
Grid Data Management Assaf Gottlieb - Israeli Grid NA3 Team EGEE is a project funded by the European Union under contract IST EGEE tutorial,
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Services Abderrahman El Kharrim
EGEE-II INFSO-RI Enabling Grids for E-sciencE gLite Data Management System Yaodong Cheng CC-IHEP, Chinese Academy.
INFSO-RI Enabling Grids for E-sciencE gLite Data Management Services - Overview Mike Mineter National e-Science Centre, Edinburgh.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Data Grid Services/SRB/SRM & Practical Hai-Ning Wu Academia Sinica Grid Computing.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE middleware Data Management in gLite.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Nov. 18, EGEE and gLite are registered trademarks gLite Middleware Usage Dusan.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE middleware: gLite Data Management EGEE Tutorial 23rd APAN Meeting, Manila Jan.
Enabling Grids for E-sciencE Introduction Data Management Jan Just Keijser Nikhef Grid Tutorial, November 2008.
June 24-25, 2008 Regional Grid Training, University of Belgrade, Serbia Introduction to gLite gLite Basic Services Antun Balaž SCL, Institute of Physics.
T3 analysis Facility V. Bucard, F.Furano, A.Maier, R.Santana, R. Santinelli T3 Analysis Facility The LHCb Computing Model divides collaboration affiliated.
E-science grid facility for Europe and Latin America Data Management Services E2GRIS1 Rafael Silva – UFCG (Brazil) Universidade Federal.
INFSO-RI Enabling Grids for E-sciencE Αthanasia Asiki Computing Systems Laboratory, National Technical.
INFSO-RI Enabling Grids for E-sciencE Αthanasia Asiki Computing Systems Laboratory, National Technical.
Managing Data DIRAC Project. Outline  Data management components  Storage Elements  File Catalogs  DIRAC conventions for user data  Data operation.
SEE-GRID-SCI Storage Element Installation and Configuration Branimir Ackovic Institute of Physics Serbia The SEE-GRID-SCI.
INFSO-RI Enabling Grids for E-sciencE Introduction Data Management Ron Trompert SARA Grid Tutorial, September 2007.
David Adams ATLAS ATLAS distributed data management David Adams BNL February 22, 2005 Database working group ATLAS software workshop.
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America LFC Server Installation and Configuration.
FP7-INFRA Enabling Grids for E-sciencE EGEE Induction Grid training for users, Institute of Physics Belgrade, Serbia Sep. 19, 2008.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Data management in LCG and EGEE David Smith.
Further aspects of EGEE middleware components INFN, Catania EGEE is funded by the European Union under contract IST
Data and storage services on the NGS.
Data Management The European DataGrid Project Team
EGEE-II INFSO-RI Enabling Grids for E-sciencE Data management in EGEE.
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America Data Management Hands-on Juan Eduardo Murrieta.
1 DIRAC Data Management Components A.Tsaregorodtsev, CPPM, Marseille DIRAC review panel meeting, 15 November 2005, CERN.
INFSO-RI Enabling Grids for E-sciencE University of Coimbra gLite 1.4 Data Management System Salvatore Scifo, Riccardo Bruno Test.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Architecture of LHC File Catalog Valeria Ardizzone INFN Catania – EGEE-II NA3/NA4.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) Algiers, EUMED/Epikh Application Porting Tutorial, 2010/07/04.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) LFC Installation and Configuration Dong Xu IHEP,
Grid Data Management Assaf Gottlieb Tel-Aviv University assafgot tau.ac.il EGEE is a project funded by the European Union under contract IST
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Data Management Maha Metawei
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America LFC Server Installation and Configuration.
Riccardo Zappi INFN-CNAF SRM Breakout session. February 28, 2012 Ingredients 1. Basic ingredients (Fabric & Conn. level) 2. (Grid) Middleware ingredients.
Martedi 8 novembre 2005 Consorzio COMETA “Progetto PI2S2” FESR Data Management System Annamaria Muoio -- INFN Catania PI2S2 First Tutorial -- Messina,
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI EGI solution for high throughput data analysis Peter Solagna EGI.eu Operations.
EGEE Data Management Services
Jean-Philippe Baud, IT-GD, CERN November 2007
Grid2Win Porting of gLite middleware to Windows XP platform
LFC Server Installation & Configuration
gLite Basic APIs Christos Filippidis
Java API del Logical File Catalog (LFC)
Data services on the NGS
The gLite Data Management System
Data Bridge Solving diverse data access in scientific applications
LFC Installation and Configuration
Practical: The Information Systems
gLite Data management system overview
gLite Grid Services Salma Saber
An Introduction to gLite middleware
GGF OGSA-WG, Data Use Cases Peter Kunszt Middleware Activity, Data Management Cluster EGEE is a project funded by the European.
Comparison of LCG-2 and gLite v1.0
Data services on the NGS
Grid2Win: Porting of gLite middleware to Windows XP platform
Introduction to reading and writing files in Grid
Grid Services Ouafa Bentaleb CERIST, Algeria
Hands-On Session: Data Management
LFC Installation and configuration
Data Management in Release 2
Riccardo Bruno, Salvatore Scifo gLite - Tutorial Catania, dd.mm.yyyy
Enrico Fattibene INFN-CNAF
Data Management Ouafa Bentaleb CERIST, Algeria
EGEE Middleware: gLite Information Systems (IS)
Architecture of the gLite Data Management System
gLite Data and Metadata Management
Data Management system in gLite middleware
Presentation transcript:

Data services in gLite “s” gLite and LCG

Data services on Grids Simple data files on grid-specific storage Middleware supporting Replica files to be close to where you want computation For resilience Logical filenames Catalogue: maps logical name to physical storage device/file Virtual filesystems, POSIX-like I/O Services provided: storage, transfer, catalogue that maps logical filenames to replicas. Solutions include gLite data services Globus: Data Replication Service Storage Resource Broker Other data! e.g. …. Structured data: RDBMS, XML databases,… Files on project’s filesystems Data that may already have other user communities not using a Grid Require extendable middleware tools to support Computation near to data Controlled exposure of data without replication Basis for integration and federation GReIC See tomorrow OGSA –DAI In Globus 4 Not (yet...) in gLite A grid is a set of resources that share mechanisms of authorisation and authentication Current gatekeepers examine the DN in the proxy certificate Map DN to local accounts/privileges – most often to permit computation on compute elements OGSA-DAI extends these mechanisms to diverse data resources but permits exposure of data without its relocation and most vitally, moves computation to be close to the data.

Scope of data services in gLite Files that are write-once, read-many If users edit files then They manage the consequences! Maybe just create a new filename! No intention of providing a global file management system 3 service types for data Storage Catalogs Transfer (+ Metadata catalog – AMGA)

Data management example LCG FileCatalogue (LFC) “User interface” Input “sandbox” DataSets info Output “sandbox” Resource Broker Max. 10MByte Input “sandbox” + Broker Info Output “sandbox” Storage Element 2 Storage Element 1 Slide inherited from EDG – European Data Grid Computing Element 1st job writes and replicates output onto 2 SEs

Data management example cont. Max. 10MByte DataSets info LCG FileCatalogue (LFC) Keep computation close to storage data “User interface” Input “sandbox” Output “sandbox” Resource Broker Input “sandbox” + Broker Info Output “sandbox” Storage Element 2 Storage Element 1 Slide inherited from EDG – European Data Grid Computing Element 2nd job reads input from an SE

Resolving logical file name LCG FileCatalogue (LFC) “User interface” “Myfile.dat” Myfile.dat File_on_se1 File_on_se2 Storage Element 2 Storage Element 1 Slide inherited from EDG – European Data Grid Content is available on 2 SEs

Resolving logical file name LCG FileCatalogue (LFC) “User interface” “Myfile.dat” File_on_se1 (“SURL”: site URL) Myfile.dat “Logical filename” “GUID” Global Unique Identifier File_on_se2 (“SURL”: site URL) File content cannot change  No need to synchronize replicas Storage Element 2 Storage Element 1 Slide inherited from EDG – European Data Grid Content is available on 2 SEs

Name conventions Logical File Name (LFN) An alias created by a user to refer to some item of data, e.g. lfn:/grid/gilda/budapest23/run2/track1 Globally Unique Identifier (GUID) A non-human-readable unique identifier for an item of data, e.g. guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6 Site URL (SURL) (or Physical File Name (PFN) or Site FN) The location of an actual piece of data on a storage system, e.g. srm://pcrd24.cern.ch/flatfiles/cms/output10_1 (SRM) sfn://lxshare0209.cern.ch/data/alice/ntuples.dat (Classic SE) Transport URL (TURL) Temporary locator of a replica + access protocol: understood by a SE, e.g. rfio://lxshare0209.cern.ch//data/alice/ntuples.dat

Mapping by the “LFC” catalogue server Name conventions Users primarily access and manage files through “logical filenames” - LFC Defined by the user LFC Namespace LFC has a directory tree structure lfn:/grid/<VO_name>/ <you create it> Mapping by the “LFC” catalogue server

lfn:/grid/gilda/sipos/run2/ LFC directories LCG FileCatalogue (LFC) lfn:/grid/gilda/sipos/run2/ Storage Element 1 sfn://grid005.iucc.ac.il/storage/gilda/generated/2007-06-23/fileb233d43f-5bc6-4ede-a5fe-611d48be2ba5 input1 input2 Storage Element 2 srm://aliserv6.ct.infn.it/dpm/ct.infn.it/home/gilda/generated/2007-06-23/filea21ab3e2-8ff6-4a44-82a7-f2 input3 Storage Element 3 sfn://trigriden01.unime.it/flatfiles/SE00/gilda/generated/2007-06-23/filec79a9e3c-2485-4206-a2a5-235f Storage Element 4 sfn://grid005.iucc.ac.it/flatfiles/SE00/gilda/generated/2007-06-23/filec79a9e3c-2485-4206-a2a5-235f LFC directories = virtual directories Each entry in the directory may be stored on different SEs

Two sets of commands lfc-* lcg-* LFC = LCG File Catalogue LCG = LHC Compute Grid LHC = Large Hadron Collider Use LFC commands to interact with the catalogue only To create catalogue directory List files Used by you, your application and by lcg-utils (see below) lcg-* Couples catalogue operations with file management Keeps SEs and catalogue in step! Copy files to/from/between SEs Replicated

LFC has a directory tree structure LFC basics Defined by the user LFC Namespace LFC has a directory tree structure /grid/<VO_name>/ <you create it> All members of a given VO have read-write permissions in their directory Commands look like UNIX with “lfc-” in front (often)

File request and VOMS proxy Authentication, authorization Storage Element Provides Storage for files : massive storage system - disk or tape based Transfer protocol (gsiFTP) ~ GSI based FTP server Striped file transfer – cluster as back-end protocols server resources File request and VOMS proxy Storage Element Disk protocol protocol Tape Tape Tape protocol Authentication, authorization

Practical: LFC and LCG utils List directory Create a local file then upload it to an SE and register with a logical name (lfn) in the catalogue Create a duplicate in another SE List the replicas LCG File Catalogue (LFC) lfc-* Storage Element 2 User interface lcg-* Storage Element 1

Practical: LFC and LCG utils List directory Create a local file then upload it to an SE and register with a logical name (lfn) in the catalogue Create a duplicate in another SE List the replicas Create a second logical file name for a file Download a file from an SE to the UI LCG File Catalogue (LFC) lfc-* Storage Element 2 User interface ? lcg-* Storage Element 1

Open the practical from the agenda

LFC Catalog commands Summary of the LFC Catalog commands Add/replace a comment lfc-setcomment Set file/directory access control lists lfc-setacl Remove a file/directory lfc-rm Rename a file/directory lfc-rename Create a directory lfc-mkdir List file/directory entries in a directory lfc-ls Make a symbolic link to a file/directory lfc-ln Get file/directory access control lists lfc-getacl Delete the comment associated with the file/directory lfc-delcomment Change owner and group of the LFC file-directory lfc-chown Change access mode of the LFC file/directory lfc-chmod

Summary of lcg-utils commands Replica Management lcg-cp Copies a grid file to a local destination lcg-cr Copies a file to a SE and registers the file in the catalog lcg-del Delete one file lcg-rep Replication between SEs and registration of the replica lcg-gt Gets the TURL for a given SURL and transfer protocol lcg-sd Sets file status to “Done” for a given SURL in a SRM request

Summary of fts client commands glite-transfer-submit Submit a transfer job : needs at least source and destination SURL glite-transfer-status Given one or more job ID, query about their status glite-transfer-cancel Delete the transfer with the give Job ID glite-transfer-list Query about status of all user’s jobs; support options for query restrictions glite-transfer-channel-list Show all available channel; detailed info only if user has admin privileges