Data Management & Information Systems


Data Management & Information Systems Markus Schulz – SA3 - CERN OGF - EGEE-II User Forum Manchester - 9 May 2007

Disclaimer
- Material that went into this presentation has been provided by many developers from inside JRA1, SA3, NA4 and by external contributors. Many thanks.
- Ask questions!

Data Management
- LFC
- SRM
- DPM
- gfal and lcg-utils
- FTS
- AMGA
- Medical data management components: Hydra Keystore, Data Encryption, DPM-DICOM

EGEE Data Management
[Diagram: layered stack. VO Frameworks and User Tools sit on top of the Data Management tools (lcg_utils, FTS); below them, GFAL and Vendor Specific APIs cover Cataloging ((RLS), LFC), Storage (SRM, (Classic SE)) and Data transfer (gridftp, RFIO).]

LFC: LCG File Catalog = LHC Computing Grid File Catalog = Large Hadron Collider Computing Grid File Catalog

LCG “File” Catalog
- The LFC stores mappings between users' file names and file locations on the Grid.
- The LFC is accessible via CLI, C API, Python interface and Perl interface.
- Supports sessions and bulk operations.
- Data Location Interface (DLI): Web Service used for match making; given a GUID, it returns the physical file location.
- ORACLE backend for high-performance applications.
- Read-only replication support.
[Diagram: LFC file names 1 … n point to one GUID, which points to file replicas 1 … m.]
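The mapping on this slide (several logical file names, one GUID, several replicas) can be pictured as two dictionaries. The following is only a minimal in-memory sketch of that data structure; the class and method names are hypothetical and are not the LFC API.

```python
import uuid


class FileCatalogSketch:
    """Toy model of the LFN -> GUID -> replicas mapping (not the real LFC)."""

    def __init__(self):
        self._lfn_to_guid = {}        # logical file name -> GUID
        self._guid_to_replicas = {}   # GUID -> list of physical locations

    def register(self, lfn, replica_url):
        """Register a replica under a logical file name; returns the GUID."""
        guid = self._lfn_to_guid.get(lfn)
        if guid is None:
            guid = str(uuid.uuid4())
            self._lfn_to_guid[lfn] = guid
            self._guid_to_replicas[guid] = []
        if replica_url not in self._guid_to_replicas[guid]:
            self._guid_to_replicas[guid].append(replica_url)
        return guid

    def add_alias(self, new_lfn, existing_lfn):
        """Give an already registered file a second logical name."""
        self._lfn_to_guid[new_lfn] = self._lfn_to_guid[existing_lfn]

    def replicas_by_guid(self, guid):
        """What a DLI-style lookup does: GUID in, physical locations out."""
        return list(self._guid_to_replicas.get(guid, []))

    def replicas_by_lfn(self, lfn):
        return self.replicas_by_guid(self._lfn_to_guid[lfn])


catalog = FileCatalogSketch()
guid = catalog.register("/grid/vo/data/run1", "srm://se1.example.org/vo/run1")
catalog.register("/grid/vo/data/run1", "srm://se2.example.org/vo/run1")
catalog.add_alias("/grid/vo/data/latest", "/grid/vo/data/run1")
print(catalog.replicas_by_lfn("/grid/vo/data/latest"))
```

The host names are placeholders. The point is only that every logical name resolves through the GUID, so adding a replica makes it visible under all names at once.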

LFC features
- Hierarchical namespace
- GSI security
- Permissions and ownership
- ACLs (based on VOMS)
- Virtual ids: each user is mapped to (uid, gid)
- VOMS support: to each VOMS group/role corresponds a virtual gid
- Example commands: lfc-ls -l /grid/vo/ and lfc-getacl /grid/vo/data
[Diagram: namespace tree /grid/vo/data with a file, served by the LFC and its DLI.]

What's new?
- LFC bulk operations: the new method lfc_getreplicas greatly improves replica listing performance.
- Secondary groups support: since LFC version 1.6.3 (in production).

Before LFC version 1.6.3 (before secondary groups)
- User 1 from VO1 is mapped to (uid1, gid1).
- User 2 from VO1 with a VOMS Role is mapped to (uid2, gid2).
1. User 1 creates the LFC directory dir1 with mode 775, owned by (uid1, gid1).
2. User 2 tries to create a file in dir1.
- User 2 could not register a file in dir1, as (s)he did not belong to gid1.
- Solution: set ACLs on dir1.

Since LFC version 1.6.3 (with secondary groups)
- User 1 from VO1 is mapped to (uid1, gid1).
- User 2 from VO1 with a VOMS Role is mapped to (uid2, gid2, gid1), because (s)he also belongs to VO1.
1. User 1 creates the LFC directory dir1 with mode 775, owned by (uid1, gid1).
2. User 2 tries to create a file in dir1.
- User 2 can register a file in dir1, as (s)he belongs to gid2 and gid1.
- But: User 1 cannot register a file in a directory created by User 2 if (s)he does not have the same VOMS Role!
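The two scenarios can be replayed with a small Unix-style permission check. This is a sketch under stated assumptions: the numeric ids, the class names and the simplification to a single write bit are illustrative and do not reproduce the LFC implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    uid: int
    gid: int                                  # primary virtual gid
    secondary_gids: set = field(default_factory=set)


@dataclass
class Directory:
    owner_uid: int
    owner_gid: int
    mode: int                                 # e.g. 0o775


def can_create_file(user, directory, use_secondary_groups):
    """Simplified check: only the write bit of the matching class is tested."""
    if user.uid == directory.owner_uid:
        return bool(directory.mode & 0o200)
    gids = {user.gid}
    if use_secondary_groups:
        gids |= user.secondary_gids
    if directory.owner_gid in gids:
        return bool(directory.mode & 0o020)
    return bool(directory.mode & 0o002)


UID1, GID1, UID2, GID2 = 101, 201, 102, 202   # hypothetical virtual ids
user1 = Identity(UID1, GID1)
user2 = Identity(UID2, GID2, secondary_gids={GID1})
dir1 = Directory(UID1, GID1, 0o775)           # created by User 1
dir2 = Directory(UID2, GID2, 0o775)           # created by User 2

print(can_create_file(user2, dir1, use_secondary_groups=False))  # before 1.6.3
print(can_create_file(user2, dir1, use_secondary_groups=True))   # since 1.6.3
print(can_create_file(user1, dir2, use_secondary_groups=True))   # the "But" case
```

With mode 775 the "other" class has no write bit, so everything depends on whether the directory's group appears among the caller's gids. That is why User 2 gains access while User 1, who lacks gid2, does not.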

DPM (Disk Pool Manager)

Storage Element
- Storage Resource Manager (SRM)
  - hides the storage system implementation (disk or active tape)
  - handles authorization
  - translates SURLs (Storage URLs) to TURLs (Transfer URLs)
  - disk-based: DPM, dCache and others; tape-based: Castor, dCache
- File I/O: POSIX-like access from local nodes or the grid
  - GFAL (Grid File Access Layer)

What is a DPM?
- Disk Pool Manager: manages storage on disk servers
- SRM support
  - 1.1
  - 2.1 (for backward compatibility)
  - 2.2 (released in DPM version 1.6.3)
- GSI security
- ACLs
- VOMS support
- Secondary groups support (see LFC)

DPM strengths
- Easy to use
  - Hierarchical namespace: $ dpns-ls /dpm/cern.ch/home/vo/data
- Easy to administer
  - Easy to install and configure
  - Low maintenance effort
  - Easy to add/drain/remove disk servers
- Target: small to medium sites
  - From single disks to several disk servers

DPM: user's point of view
- DPM head node, running the DPM Name Server: namespace, authorization, physical files location
- Disk servers: physical files
- Direct data transfer from/to disk server (no bottleneck)
- External transfers via gridFTP
[Diagram: a client (CLI, C API, SRM-enabled client, etc.) with identity (uid, gid1, …) contacts the DPM head node for the namespace /domain/home/vo and transfers file data directly with the DPM disk servers.]

GFAL & lcg_util
- Data management access libraries
  - Shield users from complexity
  - Interact with the information system, catalogue and SRM-SEs
- GFAL
  - POSIX-like C API for file access
  - SRMv2.2 support
  - User space tokens correspond to a certain retention policy (custodial/replica) and a certain access latency (online/nearline)
- lcg_util (command line + C API)
  - Replication, catalogue interaction, etc.

LFC & DPM deployment status
- EGEE Catalog: 110 LFCs in production (37 central LFCs, 73 local LFCs)
- EGEE SRM Storage Elements: CASTOR, dCache, DPM
  - 96 DPMs in production, supporting 135 VOs
- LFC and DPM
  - Stable and reliable production-quality services
  - Well established services
  - Require low support effort from administrators and developers
[Chart: Storage Element instances published in EGEE's Top BDII.]

FTS overview
- The gLite File Transfer Service is a reliable data movement fabric service (a batch service for file transfers).
- FTS performs bulk file transfers between multiple sites.
- Transfers are made between any SRM-compliant storage elements (both SRM 1.1 and 2.2 are supported).
- It is a multi-VO service, used to balance usage of site resources according to the SLAs agreed between a site and the VOs it supports.
- VOMS aware.

FTS: why is it needed?
- For the user, it provides reliable point-to-point movement of Storage URLs (SURLs) and ensures you get your share of the sites' resources.
- For the site manager, it provides a reliable and manageable way of serving file movement requests from their VOs, and an easy way to discover problems with the overall service delivered to the users.
- For the VO production manager, it provides the ability to control requests coming from the VO's users (re-ordering, prioritization, …).
- The focus is on the “service” delivered to the user: FTS makes it easy to do these things well with minimal manpower.

FTS: key points
- Reliability
  - It handles the retries in case of storage / network failures
  - VO customizable retry logic
  - Service designed for high-availability deployment
- Security
  - All data is transferred securely using delegated credentials with SRM / gridFTP
  - Service audits all user / admin operations
- Service and performance
  - Service stability: it is designed to efficiently use the available storage and network resources without overloading them
  - Service recovery: integration of monitoring to detect service-level degradation
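The slide says FTS retries transfers after storage or network failures and lets each VO customize the retry logic, but it does not describe the policy itself. The sketch below is therefore only a generic illustration of retry with a growing delay; the function name, the error type and the linear back-off are assumptions, not FTS behaviour.

```python
import time


class TransientTransferError(Exception):
    """Hypothetical error type: a storage or network failure worth retrying."""


def transfer_with_retries(transfer, max_attempts=3, delay_seconds=1.0,
                          sleep=time.sleep):
    """Call transfer() until it succeeds or max_attempts is reached.

    max_attempts and delay_seconds stand in for what a VO might tune.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return transfer()
        except TransientTransferError as error:
            last_error = error
            if attempt < max_attempts:
                sleep(delay_seconds * attempt)   # linear back-off (assumed)
    raise last_error


# Usage: a fake transfer that fails twice before succeeding.
attempts = []

def flaky_transfer():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientTransferError("storage element busy")
    return "done"

result = transfer_with_retries(flaky_transfer, sleep=lambda seconds: None)
print(result, len(attempts))
```

Only the designated transient error is retried; any other exception propagates immediately, which keeps permanent failures (for example a missing source file) from being retried pointlessly.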

Service scale
- Designed to scale up to the transfer needs of very data-intensive applications
- Currently deployed in production at CERN
  - Running the production WLCG tier-0 data export
  - Target rate is ~1 Gbyte/sec, 24/7
  - Over 9 petabytes transferred in the last 6 months (> 10 million files)
- Also deployed at ~10 tier-1 sites, running a mesh of transfers across WLCG
  - Inter-tier-1 and tier-1 to tier-2 transfers
  - Each tier-1 has transferred around 0.2 – 0.5 petabytes of data
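As a rough cross-check of these figures, the average export rate implied by 9 petabytes in 6 months can be compared with the ~1 Gbyte/sec target. The unit conventions and the length of "6 months" below are assumptions, not values from the slides.

```python
# Assumptions (not from the slides): decimal units (1 PB = 1e15 bytes,
# 1 GB = 1e9 bytes) and "6 months" taken as 182.5 days.
PETABYTE = 1e15
GIGABYTE = 1e9
SECONDS_PER_DAY = 86_400

transferred_bytes = 9 * PETABYTE              # "over 9 petabytes": a lower bound
period_seconds = 182.5 * SECONDS_PER_DAY

average_rate_gb_s = transferred_bytes / period_seconds / GIGABYTE
print(f"average rate: {average_rate_gb_s:.2f} GB/s")
```

Under these assumptions the sustained average is a bit more than half of the target rate, which is consistent with the target being a design figure rather than the observed mean.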

The gLite AMGA Metadata Catalogue

Metadata in EGEE
- Metadata is information about data stored in files; it usually lives in relational databases
- AMGA is a joint JRA1-NA4 development
- Used by several application domains (BioMed, HEP, EarthObs, …)
- Implementation:
  - SOAP and Text front-ends
  - Streamed bulk operations, for performance
  - Supports single calls, sessions & connections
  - SSL security with grid certs (X509) and others: passwords, Kerberos
  - Own user & group management + VOMS
  - PostgreSQL, Oracle, MySQL, SQLite backends
  - Query parser supports a good fraction of SQL
  - Access permissions per directory/entry via ACLs
- AMGA integrates support for replication of metadata
  - Asynchronous replication: ideal for WAN

AMGA Clients & APIs
- AMGA clients (for setup, administration)
  - Shell-like client
  - Graphical browser (Python)
- Many programming APIs, requested/provided by a diverse user community
  - C/C++, Java, Python, Perl, PHP
- SOAP interface
  - Works with gSOAP, Axis, PySOAP

Performance
- Performance comparable to direct DB access
- C++, TCP streaming protocol, very fast SSL sessions
[Figure: throughput in entries/s (logarithmic scale) versus number of clients, with curves for AMGA 1000 rows, JDBC 1000 rows, AMGA 1 row and JDBC 1 row. Caption: Throughput comparison between AMGA and direct access via JDBC reading the same table on a LAN.]

Scale
- LHCb (HEP VO use case)
  - 100 million entries successfully tested! (150 GB of data)
  - 100 000 entries/day insert rate expected
  - 10 entries/second read rate
- Uses ORACLE RAC backend, for the most demanding use cases

Conclusions
- AMGA provides a Grid layer to relational databases:
  - Abstraction of different DB vendors
  - Efficient LAN/WAN access
  - Fast X509 Grid security, VOMS integration
  - Rich set of features: transactions, views, sequences, complex joins, …
- AMGA is a building block for distributed databases:
  - Asynchronous replication
  - Management tools for distributed AMGA servers
  - BUT: AMGA misses distributed query processing
- AMGA is in production, replacing conventional RDBMS-based systems:
  - Very large, high-performance installation for HEP
  - High-security solutions for MDM

Encrypted Data Storage

Motivation
- Medical community as the principal user
  - large amount of images
  - privacy concerns vs. processing needs
  - ease of use (image production and application)
- Strong security requirements
  - anonymity (patient data is separate)
  - fine-grained access control (only selected individuals)
  - privacy (even the storage administrator cannot read)
- Legacy service in use, based on gLite-1.5
- Described components are under development

Building Blocks
- Hospitals: DICOM = Digital Imaging and Communications in Medicine
- Grid: SE = SRM + gridftp + I/O, and a client (application processing an image)
- Goal: data access at any location
[Diagram: DICOM on the hospital side; an SE with SRM, gridftp and I/O on the grid side.]

Exporting Images
- “Wrapping” DICOM:
  - anonymity: patient data is separated and stored in AMGA
  - access control: ACL information on individual files in the SE (DPM), giving fine-grained access control
  - privacy: per-file keys distributed among several Hydra key servers
- The image is retrieved from DICOM and processed to be “exported” to the grid.
[Diagram: a trigger starts the export from DICOM to the DICOM-SE (SRMv2, gridftp, I/O); patient data goes to AMGA metadata, keys go to three Hydra KeyStores, and the image file carries an ACL.]
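The slide states only that per-file keys are distributed among several Hydra key servers; it does not say how. The sketch below illustrates the general idea with the simplest possible scheme, an XOR split in which every share is required to rebuild the key. This scheme and all names are assumptions for illustration and should not be read as Hydra's actual algorithm.

```python
import secrets
from functools import reduce


def xor_bytes(left, right):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(left, right))


def split_key(key, n_servers):
    """Split key into n_servers shares; all of them are needed to rebuild it."""
    if n_servers < 2:
        raise ValueError("need at least two key servers")
    random_shares = [secrets.token_bytes(len(key)) for _ in range(n_servers - 1)]
    last_share = reduce(xor_bytes, random_shares, key)
    return random_shares + [last_share]


def join_key(shares):
    """Rebuild the key by XOR-ing every share together."""
    return reduce(xor_bytes, shares)


file_key = secrets.token_bytes(32)       # hypothetical 256-bit per-file key
shares = split_key(file_key, 3)          # one share per key server
print(join_key(shares) == file_key)
```

Each individual share is uniformly random, so a single key server (or its administrator) learns nothing about the file key on its own, which is the privacy property the slide asks for.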

Accessing Images
1. Patient look-up: the image ID is located by AMGA.
2. Keys: the key is retrieved from the Hydra key servers.
3. Get TURL: the file is accessed by SRM (access control in DPM).
4. Read: data is read and decrypted block-by-block, in memory only (GFAL and hydra-cli), which is useful for all.
- Still to be solved: ACL synchronization among SEs
[Diagram: the client uses AMGA metadata, three Hydra KeyStores, and GFAL against the DICOM-SE (SRMv2, gridftp, I/O) backed by DICOM.]

Information Systems
- R-GMA
- BDII (LDAP-based information system)

Relational Grid Monitoring Architecture (R-GMA)
- For users, R-GMA appears similar to a single relational database.
- Implementation of GGF's Grid Monitoring Architecture (GMA)
- Rich set of APIs (web browsers, Java, C/C++, Python)
- Backbone of EGEE monitoring (almost every activity leaves traces)
  - See Dashboard, Realtime Monitor and about 20 more tools
- Used by EGEE accounting as transport
[Diagram: a producer application uses the Producer Service API to publish tuples (SQL “INSERT”) and registers with the Registry Service; a consumer application uses the Consumer Service API to send a query (SQL “SELECT”), locate producers through the registry and receive tuples; the Schema Service handles SQL “CREATE TABLE”.]
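The three SQL verbs in the diagram map onto the three roles: the schema defines a table, producers insert tuples, consumers select them. The sketch below uses an in-memory SQLite database purely as a stand-in to show that user view; it is not the R-GMA API, and the table and column names are hypothetical.

```python
import sqlite3

# SQLite stands in for the "single relational database" that users see.
db = sqlite3.connect(":memory:")

# Schema role: SQL "CREATE TABLE" (hypothetical table).
db.execute("CREATE TABLE ServiceStatus (site TEXT, service TEXT, status TEXT)")

# Producer role: publish tuples with SQL "INSERT".
db.executemany(
    "INSERT INTO ServiceStatus VALUES (?, ?, ?)",
    [
        ("site-a", "SRM", "up"),
        ("site-b", "SRM", "down"),
        ("site-b", "LFC", "up"),
    ],
)

# Consumer role: query tuples with SQL "SELECT".
down_services = db.execute(
    "SELECT site, service FROM ServiceStatus WHERE status = 'down' ORDER BY site"
).fetchall()
print(down_services)
```

In R-GMA itself the producers and consumers are distributed and matched through the registry, so there is no central table; the sketch captures only what the query interface looks like from the user's side.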

Service discovery (SD)
- provides simple methods for locating services
- hides the underlying information system (simplified use)
- plug-ins for R-GMA, BDII and XML files
- API available for Java, C/C++ and command line tools

The Information System: Berkeley Data Base Information Index (BDII)
- Based on LDAP
- Standardized information provider (GIP)
- GLUE-1.3 schema
- Used with 230+ sites
- Roughly 60 instances in EGEE
- Top-level BDII at CERN: 15 Hz query rate, > 20 MByte of data
[Diagram: three levels. Resource-level BDII or MDS GRIS with providers feed the site-level BDII, which feeds the top-level BDII (refresh interval 2 minutes). FCR applies a VO-specific filter based on live status. WMS, WN, UI and FTS send queries (15 Hz) to the top level.]

Grid foundation: Information Systems
- Generic Information Provider (GIP)
  - Provides LDIF information about a grid service in accordance with the GLUE Schema
- BDII: information system in gLite 3.0 (by LCG)
  - LDAP database that is updated by a process
  - More than one DB is used, to separate reads and writes
  - A port forwarder is used internally to select the correct DB
- Freedom of choice portal: VOs can white- or black-list resources so that BDII DBs are updated accordingly
  - Sites failing Site Functional Tests may also be excluded
[Diagram: GIP built from cache, provider, plugin, LDIF file and config file. BDII with LDAP databases on ports 2171, 2172 and 2173 behind a port forwarder on 2170; cycle of update DB & modify DB, then swap DBs.]
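The read/write separation described here is a form of buffer swapping: queries always hit one database while another is refreshed, and the forwarder then switches over. The sketch below models that idea with dictionaries. The port numbers come from the slide; the class, the method names and the round-robin choice of the next database are assumptions.

```python
class PortForwardedStore:
    """Toy model: readers use one public port, writers fill a standby DB."""

    def __init__(self, backend_ports=(2171, 2172, 2173), public_port=2170):
        self.public_port = public_port
        self.ports = list(backend_ports)
        self.databases = {port: {} for port in self.ports}
        self.active = 0                       # index of the DB serving queries

    @property
    def active_port(self):
        return self.ports[self.active]

    def query(self, key):
        """Queries arrive on public_port and are forwarded to the active DB."""
        return self.databases[self.active_port].get(key)

    def update(self, fresh_entries):
        """Write fresh data into a standby DB, then swap the forwarder to it."""
        standby = (self.active + 1) % len(self.ports)   # assumed rotation order
        self.databases[self.ports[standby]] = dict(fresh_entries)
        self.active = standby                 # the "Swap DBs" step


store = PortForwardedStore()
store.update({"se1.example.org": "Production"})
print(store.active_port, store.query("se1.example.org"))
```

Because the swap is a single assignment, readers see either the old complete data set or the new one, never a half-updated database.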

Inside a BDII
[Diagram: results of ldapsearch are written to cache; the update process updates and modifies one of the LDAP databases on ports 2171, 2172 and 2173, with FCR input; the databases are then swapped behind the port forwarder on 2170.]

Load Balanced BDII
[Diagram: queries go to a DNS round-robin alias, which spreads them over several BDIIs, each listening on port 2170.]

Queries
Analysis of a log file from one top-level BDII over 4 hours. Multiply by 8 to get the values for the load-balanced service (15 Hz).

No. of queries   Query
6075             Find the Close CE to an SE
5475             Find the VOs SA for an SE
5043             Find all SRMs
4791             Find an SE
2432             Find the Close SE to a CE
2117             Find all Services for a VO
664              Find all CEs for a VO
638              Find all SAs for a VO
479              Find all SubClusters
448              Find the GlueVOView for a CE
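The table and the "multiply by 8" remark can be checked against the quoted 15 Hz with a few lines of arithmetic. The sketch counts only the ten query types listed above, so it assumes that these account for most of the load on that BDII.

```python
# Query counts from the table: one top-level BDII, 4 hours of log.
query_counts = [6075, 5475, 5043, 4791, 2432, 2117, 664, 638, 479, 448]

total_queries = sum(query_counts)
log_seconds = 4 * 3600

rate_single_bdii_hz = total_queries / log_seconds
rate_whole_service_hz = rate_single_bdii_hz * 8   # factor 8 from the slide

print(total_queries)
print(f"{rate_single_bdii_hz:.2f} Hz per BDII")
print(f"{rate_whole_service_hz:.1f} Hz for the load-balanced service")
```

The listed queries give roughly 2 Hz on one instance and between 15 and 16 Hz across the service, in line with the 15 Hz figure on the slide.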

Queries To a Top-Level BDII

Next Steps
- Short term
  - Put the site-level BDIIs on a stand-alone node to improve scalability
  - Run the CE information provider on the site-level BDII
  - Introduce regional top-level BDIIs to spread the query load; must ensure all have the same quality of service
- Medium term
  - Improve the efficiency of the queries
  - Add some form of caching in the existing tools
  - Improve the query performance of the BDII service

Generic Information Provider
- Used by the GIN group
[Diagram: per-grid providers (EGEE, OSG, NDGF, Naregi, Teragrid, Pragma) plug into the Generic Information Provider and publish an EGEE site, OSG site, NDGF site, Naregi grid, Teragrid grid and Pragma grid into the GIN BDII; an ARC BDII is also shown.]

Information Systems: current problems
- slapd daemons on loaded systems can starve
  - CEs drop out of the system
  - Fix: move the info provider off the CE
- Site BDIIs co-hosted on busy systems time out
  - Loss of an entire site in the info system
  - Fix: on large sites, move to a low-load node
- Improve reliability with fail-over top-level BDIIs
  - Needs work in the clients
- Scalability tests indicate limits (in 1-2 years' time)
  - Cache static data more aggressively
  - Smarter schema (OGF GLUE)
  - Change the underlying technology
  - Simple insulation API needed, leading to standardization (OGF SAGA?)