CORAL: a software system for vendor-neutral access to relational databases
Ioannis Papadopoulos, Radovan Chytracek, Dirk Düllmann, Giacomo Govi, Yulia Shapiro, Zhen Xie
International Conference on Computing for High Energy and Nuclear Physics, February 2006, T.I.F.R. Mumbai, India
Project Scope ● CORAL is: – the relational software domain of the LCG persistency framework (POOL) – a one-year-old product, following one year of prototype development within POOL – a C++, SQL-free, technology-independent API for accessing relational databases – serving the access patterns of the major applications of the LHC experiments – addressing the special requirements of distributed deployment ● CORAL is not: – attempting to cover all possible use cases and access patterns with relational databases – yet another general-purpose RDBMS connectivity library
Deployment of Distributed Databases ● Involves multiple technologies – A typical scenario: Tier0,1 on Oracle, Tier2 on MySQL ● Requires service indirection and fail-over mechanisms – The conditions data from a user perspective: /my/conditions/data – The actual “physical” locations of the data: ● oracle://ora_host1/ora_schema ● oracle://ora_host2/ora_schema ● mysql://my_host1/my_db ● Requires safe authentication mechanisms – To avoid wide and open distribution of user names and passwords. ● Requires monitoring – Client-side monitoring to complement the server-side monitoring
C++, SQL-free API (I)
Example 1: Table creation
coral::ISchema& schema = session.nominalSchema();
coral::TableDescription tableDescription;
tableDescription.setName( "T_t" );
tableDescription.insertColumn( "I", "long long" );
tableDescription.insertColumn( "X", "double" );
schema.createTable( tableDescription );
SQL issued on Oracle: CREATE TABLE "T_t" ( I NUMBER(20), X BINARY_DOUBLE)
SQL issued on MySQL: CREATE TABLE T_t( I BIGINT, X DOUBLE PRECISION)
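To illustrate how one technology-independent description can yield the two statements of Example 1, here is a minimal sketch. It does not use CORAL: `TableSketch`, `Backend`, `sqlType` and `createTableSql` are hypothetical names, and the type mapping covers only the two C++ types shown on the slide.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a technology-independent table description.
// This is NOT coral::TableDescription; it only mirrors the idea.
struct TableSketch {
    std::string name;
    // (column name, C++ type name)
    std::vector<std::pair<std::string, std::string>> columns;
};

enum class Backend { Oracle, MySQL };

// Map a C++ type name to the SQL type shown on the slide for each backend.
// Only the two types used in Example 1 are covered (assumption of this sketch).
std::string sqlType(Backend backend, const std::string& cppType) {
    static const std::map<std::string, std::pair<std::string, std::string>> types = {
        // C++ type      {Oracle,          MySQL}
        {"long long", {"NUMBER(20)", "BIGINT"}},
        {"double", {"BINARY_DOUBLE", "DOUBLE PRECISION"}},
    };
    const auto it = types.find(cppType);
    if (it == types.end()) {
        throw std::invalid_argument("unmapped C++ type: " + cppType);
    }
    return backend == Backend::Oracle ? it->second.first : it->second.second;
}

// Render the CREATE TABLE statement for the chosen backend.
std::string createTableSql(Backend backend, const TableSketch& table) {
    // Oracle folds unquoted identifiers to upper case, so the mixed-case
    // table name is quoted there, as on the slide.
    const std::string tableName =
        backend == Backend::Oracle ? "\"" + table.name + "\"" : table.name;
    std::string sql = "CREATE TABLE " + tableName + " (";
    for (std::size_t i = 0; i < table.columns.size(); ++i) {
        if (i > 0) sql += ", ";
        sql += table.columns[i].first + " " + sqlType(backend, table.columns[i].second);
    }
    return sql + ")";
}
```

Calling `createTableSql` twice on the same `TableSketch`, once per `Backend`, reproduces the shape of the two statements above; the client code never contains SQL.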
C++, SQL-free API (II)
Example 2: Issuing a query
coral::ITable& table = schema.tableHandle( "T_t" );
coral::IQuery* query = table.newQuery();
query->addToOutputList( "X" );
query->addToOrderList( "I" );
query->limitReturnedRows( 5 );
coral::ICursor& cursor = query->execute();
SQL issued on Oracle: SELECT * FROM ( SELECT X FROM "T_t" ORDER BY I ) WHERE ROWNUM < 6
SQL issued on MySQL: SELECT X FROM T_t ORDER BY I LIMIT 5
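The row limit is the interesting part of Example 2, because the two backends express it differently. The sketch below shows one way a plug-in could render it; `QuerySketch`, `Dialect` and `selectSql` are hypothetical names and are not part of the CORAL API.

```cpp
#include <string>

enum class Dialect { Oracle, MySQL };

// Hypothetical, simplified query description: one output column,
// one ordering column and a maximum number of rows.
struct QuerySketch {
    std::string table;
    std::string outputColumn;
    std::string orderColumn;
    int maxRows;
};

// Render the SELECT statement in the syntax of the chosen backend.
std::string selectSql(Dialect dialect, const QuerySketch& q) {
    if (dialect == Dialect::MySQL) {
        // MySQL limits the ordered result directly with LIMIT.
        return "SELECT " + q.outputColumn + " FROM " + q.table +
               " ORDER BY " + q.orderColumn +
               " LIMIT " + std::to_string(q.maxRows);
    }
    // Oracle assigns ROWNUM before ORDER BY is applied, so the ordered
    // query is wrapped in a subquery and the limit is applied outside it.
    return "SELECT * FROM ( SELECT " + q.outputColumn + " FROM \"" + q.table +
           "\" ORDER BY " + q.orderColumn +
           " ) WHERE ROWNUM < " + std::to_string(q.maxRows + 1);
}
```

With `QuerySketch{"T_t", "X", "I", 5}` the function returns the two statements shown in Example 2.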
Component Architecture ● Client Software ● CORAL Interfaces (C++ abstract classes, user-level API) ● CORAL C++ types (Row buffers, Blob, Date, TimeStamp,...) ● Common Implementation (developer-level interfaces) ● Plug-in libraries, loaded at run-time, interacting only through the interfaces: – RDBMS Implementation (oracle, mysql, sqlite, frontier) – Authentication Service (xml, environment) – Lookup Service (xml, lfc) – Relational Service implementation – Monitoring Service implementation – Connection Service implementation
RDBMS Components (I) ● API highlights – DDL operations ● Creating, altering tables, views ● Multicolumn keys, indices and constraints ● BLOB, fixed or variable-size string variables – DML operations ● “Bulk” operations ● Insert-select statements – Queries ● May involve tables and views within a schema ● Nested queries ● Set operations ● Row pre-fetching
RDBMS Components (II) ● Technologies supported: – Oracle ● Fully implements the CORAL API ● Based on OCI version 10g – MySQL ● Best suited where low level of administration is required ● Based on native C API version 4.0 (currently migrating to 5.0) – SQLite ● File-based, embedded SQL engine ● No administration, very lightweight! ● A means of transferring small amounts of relational data? – Frontier ● squid caches between client and Oracle database server
Authentication Mechanisms ● The “physical” connection string does not contain authentication credentials – e.g. oracle://oradb/app_schema – It simply tells where the data is ● The loaded Authentication Service provides the credentials for a given connection string – Environment variables – XML file – Secure, certificate-based authentication is required for real-life, grid-based operation! ● Work in progress
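The separation between connection string and credentials can be sketched as a small interface with interchangeable implementations. Everything below is illustrative: `IAuthenticationSketch`, both implementations and the environment variable names `SKETCH_AUTH_USER` and `SKETCH_AUTH_PASSWORD` are hypothetical, and the in-memory table merely stands in for the XML file.

```cpp
#include <cstdlib>
#include <map>
#include <string>
#include <utility>

struct Credentials {
    std::string user;
    std::string password;
};

// Hypothetical interface: given a connection string, provide credentials.
class IAuthenticationSketch {
public:
    virtual ~IAuthenticationSketch() = default;
    // Returns true and fills 'out' if credentials are known.
    virtual bool lookup(const std::string& connectionString, Credentials& out) const = 0;
};

// Reads one user/password pair from the environment, whatever the
// connection string is. The variable names are made up for this sketch.
class EnvironmentAuthentication : public IAuthenticationSketch {
public:
    bool lookup(const std::string&, Credentials& out) const override {
        const char* user = std::getenv("SKETCH_AUTH_USER");
        const char* password = std::getenv("SKETCH_AUTH_PASSWORD");
        if (user == nullptr || password == nullptr) return false;
        out = Credentials{user, password};
        return true;
    }
};

// Keeps credentials per connection string; stands in for a file-based service.
class TableAuthentication : public IAuthenticationSketch {
public:
    explicit TableAuthentication(std::map<std::string, Credentials> entries)
        : entries_(std::move(entries)) {}

    bool lookup(const std::string& connectionString, Credentials& out) const override {
        const auto it = entries_.find(connectionString);
        if (it == entries_.end()) return false;
        out = it->second;
        return true;
    }

private:
    std::map<std::string, Credentials> entries_;
};
```

Because callers only see the interface, a certificate-based implementation could later replace either class without changes to client code.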
Database Service Indirection ● A File Catalog for databases ● 1-N mapping of logical to physical data locations – XML-based implementation – Immediate future: LFC-based implementation
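A minimal sketch of the 1-N mapping, assuming an in-memory table in place of the XML or LFC back end. `LookupSketch` and its methods are hypothetical names; the replica strings in the usage note are the ones from the deployment scenario earlier in the talk.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical lookup table: one logical name maps to N physical replicas.
class LookupSketch {
public:
    // Register a physical connection string under a logical name.
    void addReplica(const std::string& logicalName, const std::string& physicalConnection) {
        replicas_[logicalName].push_back(physicalConnection);
    }

    // Return the replicas in registration order; empty if the name is unknown.
    std::vector<std::string> lookup(const std::string& logicalName) const {
        const auto it = replicas_.find(logicalName);
        return it == replicas_.end() ? std::vector<std::string>{} : it->second;
    }

private:
    std::map<std::string, std::vector<std::string>> replicas_;
};
```

Registering oracle://ora_host1/ora_schema, oracle://ora_host2/ora_schema and mysql://my_host1/my_db under /my/conditions/data makes `lookup("/my/conditions/data")` return those three strings in that order, which a caller can then try one after another.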
Client-Side Monitoring ● Why client-side monitoring: – To complement the information from server-side monitoring (usually an administration task) – To assist application tuning and optimization ● Primary aim in CORAL: – provide the necessary hooks for the RDBMS plug-in components to push information such as: ● Time stamp of the session and transaction boundaries ● Time of client waiting for the server to execute a query ● The SQL statement itself ● Simple prototype exists as an implementation example – Experiment-specific implementations are expected to be used eventually in production
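The hooks described above can be pictured as a small sink interface that a plug-in calls around each server round trip. This is a sketch under assumptions, not the CORAL monitoring interface: `MonitoringEvent`, `IMonitoringSketch`, `InMemoryMonitor` and `executeMonitored` are hypothetical names.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <vector>

// One pushed piece of monitoring information.
struct MonitoringEvent {
    std::string kind;                  // e.g. "session-start", "query"
    std::string detail;                // e.g. the SQL statement itself
    std::chrono::microseconds waited;  // client time spent waiting for the server
};

// Hypothetical sink that plug-ins push events into.
class IMonitoringSketch {
public:
    virtual ~IMonitoringSketch() = default;
    virtual void record(const MonitoringEvent& event) = 0;
};

// Simplest possible implementation: keep the events in memory.
class InMemoryMonitor : public IMonitoringSketch {
public:
    void record(const MonitoringEvent& event) override { events_.push_back(event); }
    const std::vector<MonitoringEvent>& events() const { return events_; }

private:
    std::vector<MonitoringEvent> events_;
};

// What a plug-in might do around a round trip: time the call, then push
// the SQL text and the waiting time to the monitor.
void executeMonitored(IMonitoringSketch& monitor, const std::string& sql,
                      const std::function<void()>& roundTrip) {
    const auto start = std::chrono::steady_clock::now();
    roundTrip();
    const auto waited = std::chrono::duration_cast<std::chrono::microseconds>(
        std::chrono::steady_clock::now() - start);
    monitor.record(MonitoringEvent{"query", sql, waited});
}
```

An experiment-specific implementation would replace `InMemoryMonitor`, for example by forwarding the events to its own monitoring system.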
Connection Pooling & Fail-Over Mechanisms ● ConnectionService component: – Prototype implementation inherited from ATLAS (see #32) – Attempts to minimize the server resources consumed by an application's components: ● Pooling/re-using connections ● Sharing read-only connections – Attempts retry and fail-over when a connection is lost – Overall configuration management ● The user's entry point to a CORAL system
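Sharing of read-only connections can be sketched with reference counting: components asking for the same target get the same connection object while it is in use. `ConnectionSketch` and `ReadOnlyPoolSketch` are hypothetical; the sketch covers sharing only and does not keep idle connections alive, which a real pool would typically also do.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for an open physical connection.
struct ConnectionSketch {
    std::string target;
};

// Hands out one shared read-only connection per target.
class ReadOnlyPoolSketch {
public:
    std::shared_ptr<ConnectionSketch> acquire(const std::string& target) {
        std::weak_ptr<ConnectionSketch>& slot = pool_[target];
        std::shared_ptr<ConnectionSketch> existing = slot.lock();
        if (existing) return existing;  // still in use elsewhere: share it

        // Nobody holds a connection to this target: "open" a new one.
        auto fresh = std::make_shared<ConnectionSketch>(ConnectionSketch{target});
        ++opened_;
        slot = fresh;
        return fresh;
    }

    // Number of physical connections opened so far.
    int opened() const { return opened_; }

private:
    std::map<std::string, std::weak_ptr<ConnectionSketch>> pool_;
    int opened_ = 0;
};
```

Two components that acquire the same target while the first handle is alive cause a single physical connection to be opened, which is the server-side saving the slide refers to.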
A client requires a session handle... ● ConnectionService internal operations: – What are the available replicas for “/my/conditions/data”? – The Lookup Service will reply “oracle://host/schema”,... – The Relational Service will load the oracle plug-in – The Authentication Service will return the user name and password corresponding to “oracle://host/schema” – The oracle plug-in will connect and authenticate ● In case of failure the Connection Service tries the next replica – The Monitoring Service will register the start of the session ● The user has a valid handle for performing DDL, DML, queries.
coral::ISessionProxy* session = connectionService->connect( "/my/database/schema", coral::Update );
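The "try the next replica" step of this sequence can be sketched as a loop over the replicas returned by the lookup. `connectWithFailover` is a hypothetical helper, and the `tryConnect` callback stands in for loading the plug-in, authenticating and opening the session.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Try each replica in order and return the connection string of the first
// one that accepts the connection. Throws if none can be reached.
std::string connectWithFailover(
    const std::vector<std::string>& replicas,
    const std::function<bool(const std::string&)>& tryConnect) {
    for (const std::string& replica : replicas) {
        if (tryConnect(replica)) {
            return replica;  // first replica that works wins
        }
        // Connection failed: fall through to the next replica.
    }
    throw std::runtime_error("no replica could be reached");
}
```

From the caller's point of view only the logical name matters; which physical replica finally served the session is decided inside this loop.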
Why develop a CORAL-based application? ● to benefit from the introduction of new RDBMS implementations ● to be able to run – at any Tier level of the Grid – in TGV mode ● to avoid thinking in terms of database optimizations – Best practices are: ● enforced by design – e.g. use of bind variables ● implemented inside the CORAL components – e.g. efficient management of server-side cursors ● to assist database administrators – they have to deal with only a limited set of SQL patterns
CORAL in Use ● Relational components of POOL – File Catalog – Collections – ORA Relational Storage Manager (see #330) ● COOL (see #337) ● Experiment-specific software – CMS conditions – ATLAS detector geometry – ATLAS on-line applications –...
Outlook ● CORAL has already been integrated into the software of the LHC experiments – directly (e.g. on-line applications) – indirectly (through POOL and COOL) ● Immediate development items address deployment requirements ● Highest priority: focus on this year's service challenges
for further information...