Grid Database Management with the GRelC DAS
Sandro Fiore, Ph.D.
Salvatore Vadacca
SPACI Consortium and University of Salento (Lecce), Italy
Grid Tutorial for the University of Palermo, 11/12/2007

Grid Tutorial, Palermo - Ph.D. Sandro Fiore - GRelC Project

Outline
- Grid Data Access Service
- GRelC Project
- GRelC DAS
  - Architecture
  - Features
  - Queries
  - SDK
  - Experimental Results
  - Porting on gLite
  - XGRelC GUI
  - GRelC Portal
  - Deployment
  - On Line User Tutorial (GILDA)
- Conclusions

Grid Data Access Service
- In the last decade, much effort has gone into grid storage systems (coarse-grained approach)
- Data Grids should also provide a low-level framework for grid database access and management (Grid DAS, fine-grained approach)
- A Grid DAS is not a new DBMS and does not introduce a new query language
- It interacts with legacy systems/databases (SQL)
- It provides more complex and efficient “queries” in the grid
- It must be interoperable with current standards/middleware (gLite, Globus, …)
- Main requirements: security, transparency, interoperability, efficiency, robustness, etc.

Introducing the GRelC Project
- Grid Relational Catalog (GRelC) is a project that aims to design and develop a set of efficient, secure and transparent Data Grid services (started January 2001).
- The GRelC Data Access Service aims to provide a large set of functionalities to access both relational and non-relational databases in a grid environment.
[Figure labels: DB, XML]

GRelC Project: a bit of history

GRelC DAS Architecture (in the large)

GRelC DAS: Main Features
- Entirely written in the C programming language
- Multithreaded web service
- Web service interface, GSI enabled and WS-I compliant
- Mutual authentication based on GSI (X.509v3 digital certificates)
- GRelC DAS authorization based on ACLs, for local management
- VOMS support, for global management
- Information System support (BDII compliant)
- Wide set of data access control policies
- Full GSI support: data encryption, data integrity, protection against replay attacks and detection of out-of-sequence packets

GRelC DAS: Main Features (continued)
- XML data validation for recordsets
- SingleQuery, MultiQuery and MultiSingleQuery support
- Support for synchronous and asynchronous queries
- Dynamic binding to heterogeneous DBMSs
- Two-level logging (users, connections, queries, etc.)
- GSI-enabled remote administration tools and remote log
- Compression, chunking, prefetching and streaming to enhance performance on a WAN
- Wide SDK for developers (C, C++ and Java)
- No dependencies on other middleware (only GSI)
- GRelC Portal

GRelC DAS Architecture (in the small)
[Figure: GRelC Data Access Service]

Standard Database Access Interface
Features:
- Standard access to data sources
- Type uniformity
- Error uniformity
- Plug-in architecture based on dynamic libraries
Dynamic binding to:
- PostgreSQL
- MySQL
- SQLite
- IBM/DB2, Oracle9i, MS-SQL Server, UnixODBC, textual DBs, etc.
[Figure labels: Grid Database Access Service (front end), Other Applications, Database Access Library (Grid-DAS back end), PostgreSQL driver, MySQL driver, UnixODBC driver, SQLite driver, PostgreSQL, MySQL, UnixODBC, SQLite]

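The plug-in idea listed above (one uniform interface, drivers selected at run time) can be sketched in a few lines of C++. This is only an illustration, not the GRelC Database Access Library: the names `Driver`, `DriverRegistry`, `PostgresStub`, `SqliteStub` and `make_default_registry` are hypothetical, the "0 means success" return convention is an assumption, and the stubs return canned strings instead of talking to a DBMS. A real implementation would load each driver from a dynamic library rather than register it in-process.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical uniform interface: every driver exposes the same calls
// and reports errors through the same integer codes.
class Driver {
public:
    virtual ~Driver() {}
    virtual std::string name() const = 0;
    virtual int execute(const std::string &sql, std::string &result) = 0;
};

// Stub drivers standing in for real PostgreSQL / SQLite back ends.
class PostgresStub : public Driver {
public:
    std::string name() const override { return "postgresql"; }
    int execute(const std::string &sql, std::string &result) override {
        result = "[postgresql] " + sql;
        return 0;  // 0 = success (assumed convention)
    }
};

class SqliteStub : public Driver {
public:
    std::string name() const override { return "sqlite"; }
    int execute(const std::string &sql, std::string &result) override {
        result = "[sqlite] " + sql;
        return 0;
    }
};

// Registry mapping a DBMS name to a factory; the binding is resolved at run time.
class DriverRegistry {
public:
    typedef std::function<std::unique_ptr<Driver>()> Factory;

    void add(const std::string &dbms, Factory factory) {
        factories_[dbms] = factory;
    }

    // Returns a fresh driver instance, or throws if the DBMS is unknown.
    std::unique_ptr<Driver> bind(const std::string &dbms) const {
        std::map<std::string, Factory>::const_iterator it = factories_.find(dbms);
        if (it == factories_.end())
            throw std::runtime_error("no driver registered for " + dbms);
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};

// Convenience: a registry pre-loaded with the two stub drivers.
inline DriverRegistry make_default_registry() {
    DriverRegistry registry;
    registry.add("postgresql", [] { return std::unique_ptr<Driver>(new PostgresStub()); });
    registry.add("sqlite", [] { return std::unique_ptr<Driver>(new SqliteStub()); });
    return registry;
}
```

A caller would obtain a driver with `make_default_registry().bind("sqlite")` and then use only the `Driver` interface, so adding a new DBMS means registering one more factory without touching client code.
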
GRelC DAS Queries
The latest release of GRelC DAS supports the following queries:
- Single Query Memory (+ chunk management)
- Single Query File (+ chunk management)
- Single Query File + ZIP (+ chunk management)
- Single Query Prefetch (parallel chunk download/processing)
- Single Query Stream (resultset streaming)
- Asynchronous Query
- Web Single Query XHTML (+ chunk management / paging)
  - CSS v2.0, XHTML v1.0 Strict
Results can be displayed in the following formats:
- Tabular
- XML
- HTML
- RAW

Single Query File Approach (Zip)
[Figure labels: Client, SQS query, GRelC Data Access, SQL, DBMS, Recordset, GRelC Recordset in XML format, Recordset XML File, Data Delivery, GRelCLoad Recordset, GRelCRecordset APIs]
This kind of query is suitable for retrieving medium/large resultsets

Single Query File + chunk (Zip)
[Figure labels: Client, SQS query, GRelC Data Access, SQL, DBMS, Recordset, GRelC Recordset in XML format, Recordset XML files (Chunk 1, Chunk 2, Chunk 3), Data Delivery, GRelCLoad Recordset, GRelCRecordset APIs]
This kind of query is suitable for retrieving medium/large resultsets

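Chunked delivery, as named on this slide, amounts to splitting a resultset into fixed-size pieces that the client fetches one at a time. A minimal sketch follows, under stated assumptions: `Row`, `chunk_count` and `get_chunk` are hypothetical names, rows are plain strings rather than GRelC XML recordsets, and compression is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

typedef std::string Row;  // stand-in for one tuple of a recordset

// Number of chunks needed to deliver `total_rows` rows, `chunk_size` rows at a time.
inline std::size_t chunk_count(std::size_t total_rows, std::size_t chunk_size) {
    if (chunk_size == 0) return 0;                      // guard against division by zero
    return (total_rows + chunk_size - 1) / chunk_size;  // ceiling division
}

// Return chunk number `index` (0-based); the last chunk may be shorter,
// and an out-of-range index yields an empty chunk.
inline std::vector<Row> get_chunk(const std::vector<Row> &rows,
                                  std::size_t chunk_size, std::size_t index) {
    if (chunk_size == 0) return std::vector<Row>();
    std::size_t begin = index * chunk_size;
    if (begin >= rows.size()) return std::vector<Row>();
    std::size_t end = std::min(begin + chunk_size, rows.size());
    return std::vector<Row>(rows.begin() + begin, rows.begin() + end);
}
```

A client would loop `index` from 0 to `chunk_count(...) - 1`, processing each chunk as it arrives; the prefetch variant listed among the query types would request chunk `index + 1` while chunk `index` is still being processed.
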
Single Query Stream
[Figure labels: Client, SQS submission, GRelC-Data-Access, SQL, DBMS, Recordset, Result submission]
This kind of query is suitable for retrieving VERY LARGE resultsets

Single Query HTML
[Figure labels: Client, SQS submission, GRelC-Data-Access, SQL, DBMS, Recordset, URI, HTTP connection, Result]

Single Query HTML (Security Issues)
[Figure labels: Client, SQS submission, GRelC-Data-Access, SQL, DBMS, Recordset, URI, Web Page, GACL, DNs, HTTPS connection using X.509 certificates, Result]

Asynchronous Query
Asynchronous queries:
- Batch mode
- Users can define a lifetime for result availability on the GRelC DAS
- Decoupling of client and server (e.g. a gLite WN)
- New clients (submission, status, abort…)
- Additional thread to manage requests
- Preliminary internal tests were successful
- Added in the current release, v2.3.0

Asynchronous Query
[Figure labels: Client, SQS query, ID Query, GRelC Data Access, SQL, DBMS, Recordset, GRelC Recordset in XML format, Recordset XML File, Get File, Data Delivery, GRelCLoad Recordset, GRelCRecordset APIs]
1 - Asynchronous Query Submission
2 - Request Dispatching
3 - Data Delivery
4 - Data Manipulation

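The submission, status, abort and retrieval cycle of an asynchronous query can be sketched as a small in-memory job table. This is an illustration under assumptions, not the GRelC DAS implementation: `AsyncQueryTable`, `QueryStatus` and all method names are hypothetical, time is passed in explicitly as a plain number of seconds so the behaviour is deterministic, and the result lifetime is assumed to start when the result becomes available.

```cpp
#include <map>
#include <string>

enum class QueryStatus { Unknown, Pending, Ready, Expired, Aborted };

class AsyncQueryTable {
private:
    struct Job {
        std::string sql;
        std::string result;
        long lifetime_s;    // how long the result stays available
        long ready_at;      // time at which the result was stored
        QueryStatus state;
    };
    std::map<int, Job> jobs_;
    int next_id_;

public:
    AsyncQueryTable() : next_id_(1) {}

    // Step 1: submission returns immediately with a query ID.
    int submit(const std::string &sql, long lifetime_s) {
        Job job;
        job.sql = sql;
        job.lifetime_s = lifetime_s;
        job.ready_at = 0;
        job.state = QueryStatus::Pending;
        int id = next_id_++;
        jobs_[id] = job;
        return id;
    }

    // Step 2: a server-side worker stores the result once the DBMS answers.
    bool complete(int id, const std::string &result, long now) {
        std::map<int, Job>::iterator it = jobs_.find(id);
        if (it == jobs_.end() || it->second.state != QueryStatus::Pending) return false;
        it->second.result = result;
        it->second.ready_at = now;
        it->second.state = QueryStatus::Ready;
        return true;
    }

    // Only a pending query can be aborted.
    bool abort_query(int id) {
        std::map<int, Job>::iterator it = jobs_.find(id);
        if (it == jobs_.end() || it->second.state != QueryStatus::Pending) return false;
        it->second.state = QueryStatus::Aborted;
        return true;
    }

    QueryStatus status(int id, long now) const {
        std::map<int, Job>::const_iterator it = jobs_.find(id);
        if (it == jobs_.end()) return QueryStatus::Unknown;
        const Job &job = it->second;
        if (job.state == QueryStatus::Ready && now - job.ready_at > job.lifetime_s)
            return QueryStatus::Expired;
        return job.state;
    }

    // Step 3: data delivery, allowed only while the result is still available.
    bool fetch(int id, long now, std::string &out) const {
        if (status(id, now) != QueryStatus::Ready) return false;
        out = jobs_.find(id)->second.result;
        return true;
    }
};
```

Step 4 (data manipulation) happens on the client once `fetch` has delivered the recordset. Because the client only holds an ID between steps, it can disconnect after submission and come back later, which is the client/server decoupling the previous slide refers to.
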
GRelC Data Access: Clients

GRelC DAS
- SDK for developers
- Grid-enabled standalone GUI (XGRelC)
- Grid-enabled web apps (GRelC Portal)
- High-performance grid services and clients (CLI)

The GRelC Library: API Classification
- Database access and query services
  - bind
  - unbind
  - query submission
- Remote manipulation services
  - get_value
  - get_current_tuple
- Resultset store and retrieval services
  - store_result_disk
  - fetch_stored_recordset
- User management services
  - add_user
  - remove_user
  - set_user_policy
- Enterprise Grid management services
  - add_host
  - add_dbms
- Virtual space management services
  - create_virtual_database
  - register_database
- QoS services
  - relocate_database
Wide SDK for both C and C++ developers

GRelC-CppProxy: a C++ Module
- A C++ module was created to make it easy to develop new web service clients in C++.
- This module hides the communication layer with the web services.
[Figure labels: C++ Generic Client, C++ Module, SOAP, GRelC DAS]

CppProxy Class

    class CppProxy {
    public:
        // Database access and query services
        int bind(string grelc_db_name);
        int unbind();
        int query_submission(string query);
        // .... (further methods elided on the slide)

        // Database creation and removal
        int create_database(string grelc_db_name, string identity, string dbms,
                            string host, string istance, int log_type);
        int create_physical_database_and_register(string grelc_db_name, string dbms,
                                                  string host, string istance,
                                                  int log_type);
        int drop_database(string grelc_db_name);

        // Logging and host information
        int get_log(int num_lines, string &log);
        int get_log_database(string grelc_dbname, int num_lines, string &log);
        int get_host_position_info(HostInfoResponse &response);

        // Remote manipulation of the current resultset
        int get_value(int row, int column, string &value);

    private:
        struct soap *soap;              // SOAP context
        struct gsi_plugin_data *data;   // GSI plug-in state
        char *connection;
        bool connected;
        bool enable_credential;
        string dn;
    };

Low-level details concerning SOAP, GSI and connection status are concealed.

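A typical call sequence against this interface (bind, submit a query, read a value, unbind) might look like the sketch below. Since the real class needs a running GRelC DAS, the example uses `MockCppProxy`, a hypothetical stand-in that copies only four method signatures from the slide and serves a canned two-row table from memory. The return convention (0 on success, non-zero on failure) is an assumption, as the slide does not state it, and `first_cell` is an illustrative helper, not part of the SDK.

```cpp
#include <cstddef>
#include <string>
#include <vector>
using std::string;

// Mock with the same method signatures as on the slide; it keeps a tiny
// in-memory table instead of contacting a GRelC DAS over SOAP/GSI.
class MockCppProxy {
public:
    MockCppProxy() : connected(false) {}

    int bind(string grelc_db_name) {
        if (grelc_db_name.empty()) return -1;
        connected = true;
        return 0;  // 0 = success (assumed convention)
    }

    int unbind() {
        if (!connected) return -1;
        connected = false;
        rows.clear();
        return 0;
    }

    int query_submission(string query) {
        if (!connected || query.empty()) return -1;
        rows.clear();
        // Canned resultset: two rows, two columns.
        std::vector<string> r1; r1.push_back("1"); r1.push_back("alice");
        std::vector<string> r2; r2.push_back("2"); r2.push_back("bob");
        rows.push_back(r1);
        rows.push_back(r2);
        return 0;
    }

    int get_value(int row, int column, string &value) {
        if (!connected || row < 0 || column < 0) return -1;
        if (static_cast<std::size_t>(row) >= rows.size()) return -1;
        if (static_cast<std::size_t>(column) >= rows[row].size()) return -1;
        value = rows[row][column];
        return 0;
    }

private:
    bool connected;
    std::vector<std::vector<string> > rows;
};

// Bind, submit, read the first cell, and always release the binding.
inline int first_cell(MockCppProxy &proxy, const string &db,
                      const string &sql, string &value) {
    if (proxy.bind(db) != 0) return -1;
    int rc = proxy.query_submission(sql);
    if (rc == 0) rc = proxy.get_value(0, 0, value);
    proxy.unbind();
    return rc;
}
```

The point of the sketch is the ordering of calls and the check of every return code; client code written this way never touches the SOAP or GSI structures kept in the private section of the class.
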
XGRelC: A Console for Grid-DB Management
Functionalities:
- User management
- Web Service registration
- Host management
- Logging
- DBMS configuration
- Database creation
- Database import
- Database configuration
- Query submission
- Map deployment

International Testbed
- Lecce (Italy)
- Beijing (China)

Test Performance (II)

GRelC & gLite

GRelC on gLite: Porting
- Porting GRelC to gLite was straightforward
- The port works on both the client and the server side
- GRelC works on both LCG and the current gLite 3.x middleware
- GRelC DAS also runs on several platforms:
  - Linux
  - Mac OS X
  - FreeBSD
- Both IA64 and IA32 platforms are supported (the GRelC DAS is currently installed on SPACI-LECCE-IA64, an EGEE SA1 partner)

GRelC on gLite: A New Service
- Straightforward integration within the EGEE farm model
- GRelC DAS provides a fine-grained data management service
- This service can be used either as a farm service or as a VO service, depending on the context and on the database policies/constraints
[Figure: Extended EGEE Farm Model]

GRelC on gLite: VOMS
- We provide global authorization by means of VOMS extensions
- High level of scalability concerning DAPs related to VOs
- Two-level authorization framework: both local and global policy management can be provided (mixed mode)
[Figure: VOMS group hierarchy, from coarse grained to fine grained]
- gilda: root level (VO)
- grelc: GRelC level
- das: DAS level
- host1, host2, host3: host level
- grid-db1, grid-db2: Grid-DB level
- Mandatory groups
- Roles: insert, delete, update

Two-level authorization
- Global authorization (through VOMS extensions)
- Local authorization (by means of the local GRelC DAS authorization framework)
- The two masks obtained from global and local authorization are combined to infer the final User Privileges Mask (UPM)
- 3 scenarios
  - global mode, coarse-grained approach
  - local mode, fine-grained approach
  - combined mode

Combined Mode - An Example

GRelC on gLite: BDII
- GLUE schema extension providing information about VOs and databases (we plan to interact with the OGF GLUE-WG)
- The local admin can set the Information Provider Level parameter
  - Min: 0 publishes only basic info (the contact string)
  - Max: 7 publishes all info (contact string, VOs, DBs, tables, fields, etc.)
[Figure labels: Information System Extensions, Database-specific Information]

GRelC on gLite: Porting on SLC4.x
- Porting to SLC4.x is an ongoing activity
- Preliminary results are very good
- Porting will be completed before the EGEE Conference in Budapest
- A release based on SLC4.x will be available on the GRelC website in December
- Current tests cover both IA32 and IA64 (Itanium2 processors) platforms
- This activity is part of the SPACI-LECCE-IA64 SA1 activity within the EGEE Project

GRelC Portal
Functionalities:
- Login
- GRelC DAS registration
- Host management
- Instance management
- User management
- Query submission
- Deployment map
Features:
- Seamless and ubiquitous access to GRelC DAS enabled resources
- No additional software installation / configuration is required
- Complete and user-friendly Grid Data Portal interface (it entirely replaces the CLI)

GRelC Portal: Some Snapshots

GRelC WebSite
Main sections:
- GRelC Portal
- Download (rpms available)
- News
- Publications
- Events
- Deployment
- Documentation
- Components
- …..
GRelC Website URL:
Mailing list mail:

INFN GRID Deployment
Involved sites:
- INAF Trieste (IA32)
- INFN Bari (IA32)
- INFN Catania (IA32)
- INFN Padova (IA32)
- SPACI Lecce (IA32, IA64)
Testing activities:
- Sequential tests
- Concurrent tests
- Bug reports
- Bug fixing
- Optimization
[Figure labels: DAS Server, DAS Client, SPACI Lecce, INAF Trieste, INFN Padova, INFN Bari, INFN Catania]

SEPAC Grid Deployment
- gandalf.unile.it: Linux x86
- sara.unile.it: Mac OS X
- sigma2.unile.it: Linux IA64
- gridsurfer.unile.it: FreeBSD
- galileo.hpcc.unical.it: Linux IA64
- sepac00.projects.cscs.ch: Linux x86
- spacina.na.infn.it: Linux IA64
[Figure labels: GRelC Data Access, Data Sources (DB)]

GILDA Deployment

GRelC DAS User Tutorial on GILDA
Grid CT Wiki Website, info about:
- Logging in to the grid
- Query submission
For any information about the GILDA t-Infrastructure please contact &
GRelC DAS Tutorial link:
User tutorial:
Special thanks to the GILDA staff for their support

What’s Next… GRelC DAS 2.4.0
Christmas Release Info
- Tag: GRelC DAS v2.4.0
- Release date: 21/12/2007
- Support: SLC3.x & SLC4.x
Features:
- New driver for SQLite
- New GRelC DAS and Grid-DB log
- Remote log support
- XML support for
  - eXist
  - Xindice
  - libxml2-based documents
- Complete set of CLI for XML
- New GILDA tutorial for XML CLI
- XML Grid Information Provider

Conclusions
- GRelC DAS provides grid support for a wide range of DBMSs
- It is currently tested on several grid environments (SPACI, SEPAC, GILDA, INFNGRID)
- A wide SDK is available for developers
- XGRelC graphical user interface developed in Qt
- GRelC Portal to ease Grid-DB management via a web interface
- gLite compliant (porting on gLite 3.x and integration with the VOMS framework, BDII, etc.)
- Support for several platforms (IA32 and IA64)
- The software is currently a candidate for the EGEE Respect Program

For any information
- Supervisor: Prof. Giovanni Aloisio
- Project P.I.: Sandro Fiore, Ph.D.
- Team members: Massimo Cafaro, Ph.D.; Alessandro Negro; Salvatore Vadacca
GRelC WebSite:
Mailing lists:

Global Mode
- User credentials must be obtained through voms-proxy-init
- The UPM is inferred from the available VOMS extensions
- No additional authorization setting is required on the GRelC DAS
- Easy and fast setup procedure
- It scales well
- Feasible for a real production grid environment

[Figure: Global Mode]

Local Mode
- User credentials must be obtained through grid-proxy-init
- The UPM is derived from the GRelC DAS metadata catalogue
- No VOMS extensions are added to the user proxy
- The setup procedure must be carried out on each GRelC DAS
- Scalability is worse than in global mode

[Figure: Local Mode]

Combined Mode
- User credentials must be obtained through voms-proxy-init
- The UPM is inferred by joining the access policy information coming from the VOMS extensions and from the GRelC DAS metadata catalogue
  - VOMS level (grant or revoke)
  - GRelC DAS level (setting, undefining, unsetting)

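One way to picture how a global mask and a local policy could be joined into a User Privileges Mask is sketched below. The slides do not spell out the combination rule, so the rule used here is an assumption made for illustration: a privilege that is locally "set" is added, one that is locally "unset" is removed, and one left "undefined" falls back to whatever VOMS granted. The names `Mask`, `LocalPolicy`, `combine_upm`, `allowed` and the bit assignments are hypothetical; the three privileges are the roles named in the deck (insert, delete, update).

```cpp
#include <cstdint>

typedef std::uint8_t Mask;

// One bit per privilege (bit positions are arbitrary).
const Mask PRIV_INSERT = 1u << 0;
const Mask PRIV_DELETE = 1u << 1;
const Mask PRIV_UPDATE = 1u << 2;

// Local (GRelC DAS level) policy as a tri-state per privilege:
// bit in `set` -> explicitly set, bit in `unset` -> explicitly unset,
// bit in neither -> undefined, i.e. defer to the global (VOMS) mask.
struct LocalPolicy {
    Mask set;
    Mask unset;
};

// ASSUMED rule: local "set" adds, local "unset" removes, "undefined" inherits.
inline Mask combine_upm(Mask global, const LocalPolicy &local) {
    return static_cast<Mask>((global | local.set) & ~local.unset);
}

inline bool allowed(Mask upm, Mask privilege) {
    return (upm & privilege) == privilege;
}
```

For example, with VOMS granting insert and update, and a local policy that sets delete and unsets update, the resulting mask under this assumed rule allows insert and delete but not update. Giving "unset" precedence over "set" is a design choice of the sketch, picked so that a local administrator can always tighten access.
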
Roles and Groups on VOMS (I)
Case A (fine grained):
/gilda/grelc/das/host1/grid-db1/Role=grelc-db-insert

Roles and Groups on VOMS (II)
Case B (intermediate level):
/gilda/grelc/das/host1/Role=grelc-db-insert

Roles and Groups on VOMS (III)
Case C (coarse grained):
/gilda/grelc/das/Role=grelc-db-insert

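The three cases differ only in how deep the group path goes before the role, so the scope of a grant can be read off the attribute string itself. The sketch below parses strings of exactly the shape shown in cases A to C and classifies them; `Fqan`, `Scope`, `parse_fqan` and `scope_of` are hypothetical names, the fixed layout `/<vo>/grelc/das[/<host>[/<grid-db>]]` is an assumption taken from these three examples, and other suffixes that real VOMS attributes may carry (such as a `Capability=` component) are not handled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class Scope { Invalid, Das, Host, GridDb };

struct Fqan {
    std::vector<std::string> groups;  // e.g. gilda, grelc, das, host1, grid-db1
    std::string role;                 // e.g. grelc-db-insert (empty if absent)
};

// Split "/g1/g2/.../Role=r" into group components and an optional trailing role.
inline bool parse_fqan(const std::string &text, Fqan &out) {
    out.groups.clear();
    out.role.clear();
    if (text.empty() || text[0] != '/') return false;
    std::size_t pos = 1;
    while (pos <= text.size()) {
        std::size_t slash = text.find('/', pos);
        if (slash == std::string::npos) slash = text.size();
        std::string part = text.substr(pos, slash - pos);
        if (part.empty()) return false;               // "//" or trailing "/"
        if (part.compare(0, 5, "Role=") == 0) {
            if (slash != text.size()) return false;   // role must come last
            if (part.size() == 5) return false;       // empty role name
            out.role = part.substr(5);
        } else {
            out.groups.push_back(part);
        }
        pos = slash + 1;
    }
    return !out.groups.empty();
}

// Classify by depth, assuming the layout /<vo>/grelc/das[/<host>[/<grid-db>]].
inline Scope scope_of(const Fqan &fqan) {
    if (fqan.groups.size() < 3 || fqan.groups[1] != "grelc" || fqan.groups[2] != "das")
        return Scope::Invalid;
    switch (fqan.groups.size()) {
        case 3:  return Scope::Das;     // case C, coarse grained
        case 4:  return Scope::Host;    // case B, intermediate level
        case 5:  return Scope::GridDb;  // case A, fine grained
        default: return Scope::Invalid;
    }
}
```

With this classification, a role attached at the DAS level would apply to every host and grid database below it, while the same role attached at the grid-db level applies to that single database only.
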