
DataGrid is a project funded by the European Union CHEP 2003 – 24-28 March 2003 – Next Generation Data Mgmt... – n° 1 James Casey CERN


1 Next-Generation EU DataGrid Data Management Services
James Casey, CERN (James.Casey@cern.ch), on behalf of EU DataGrid WP2
CHEP 2003, 24-28 March 2003
DataGrid is a project funded by the European Union

2 Talk Outline
- Introduction to EU DataGrid workpackage 2
- WP2 Service Design and Interactions
  - Spitfire
  - Replication Services
  - Security
- Conclusions and outlook

Authors
- CERN: Diana Bosio, James Casey, Akos Frohner, Leanne Guy, Wolfgang Hoschek, Peter Kunszt, Erwin Laure, Levi Lucio, Heinz Stockinger, Kurt Stockinger
- INFN: Giuseppe Andronico, Federico DiCarlo, Andrea Domenici, Flavia Donno, Livio Salconi
- PPARC, University of Glasgow: William Bell, David Cameron, Gavin McCance, Paul Millar, Caitriona Nicholson
- Helsinki Institute of Physics: Joni Hahkala, Niklas Karlsson, Ville Nenonen, Mika Silander, Marko Niinimaki
- Swedish Research Council: Olle Mulmo, Gian Luca Volpato

3 Grid middleware architecture hourglass
Current Grid architectural functional blocks (figure, bottom to top):
- OS, Storage & Network services
- Basic Grid Services
- High Level Grid Services
- Common application layer: Grid Application Services (LCG)
- Specific application layer: CMS, ATLAS, CMS, LHCb
Middleware labels on the figure: GLOBUS 2.2, EU DataGrid middleware

4 EU DataGrid WP2: Data Management Work Package
Responsible for
- Transparent data location and secure access
- Wide-area replication
- Data access optimization
- Metadata access
NOT responsible for (but partially relying on other WPs for)
- Data storage
- Proper Relational Database bindings
- Remote I/O
- Security infrastructure

5 WP2 Service Paradigms
- Choice of technology:
  - Java-based servers using Web Services
    - Tomcat, Oracle 9iAS
  - Interface definitions in WSDL
  - Client stubs for many languages (Java, C, C++)
    - Axis, gSOAP
  - Persistent service data in Relational Databases
    - MySQL, Oracle
- Modularity
  - Modular service design for pluggability and extensibility
  - No vendor-specific lock-ins
- Evolvable
  - Easy adaptation to OGSA foreseen, based on the same technology
  - Largely independent of underlying OS, RDBMS

6 Spitfire: Grid-enabling RDBMS
- Capabilities:
  - Simple Grid-enabled front end to any type of local or remote RDBMS through secure SOAP-RPC
  - Sample generic RDBMS methods may easily be customized with little additional development, providing WSDL interfaces
  - Browser integration
  - GSI authentication
  - Hooks in place for local authorization
- Status: current version 2.1
  - Used by EU DataGrid Earth Observation and Biomedical applications
  - Not suitable for the retrieval of LARGE result sets

7 Replication Services: Basic Functionality
Components shown in the figure: Replica Manager, Replica Location Service, Replica Metadata Catalog, Storage Element
- Files have replicas stored at many Grid sites on Storage Elements (SEs).
- Each file has a unique Grid ID (GUID). Locations corresponding to the GUID are kept in the Replica Location Service.
- Users may assign aliases to the GUIDs. These are kept in the Replica Metadata Catalog.
- The Replica Manager provides atomicity for file operations, assuring consistency of SE and catalog contents.
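The division of labour on this slide can be illustrated with a minimal Python sketch. It is not the WP2 implementation or its API: all class names, method names, and the `se://` URLs are hypothetical, the Storage Element is modelled as a plain dictionary, and "atomicity" is reduced to undoing a copy when catalog registration fails.

```python
import uuid


class ReplicaLocationService:
    """Toy stand-in for the RLS: maps each GUID to its replica locations."""

    def __init__(self):
        self._locations = {}  # guid -> set of storage URLs

    def add(self, guid, url):
        self._locations.setdefault(guid, set()).add(url)

    def lookup(self, guid):
        return sorted(self._locations.get(guid, set()))


class ReplicaMetadataCatalog:
    """Toy stand-in for the RMC: maps user-assigned aliases to GUIDs."""

    def __init__(self):
        self._aliases = {}  # alias -> guid

    def add_alias(self, alias, guid):
        self._aliases[alias] = guid

    def resolve(self, alias):
        return self._aliases[alias]


class ReplicaManager:
    """Keeps storage and catalog consistent (hypothetical sketch)."""

    def __init__(self, rls, rmc, storage):
        # storage is a dict (url -> bytes) standing in for Storage Elements
        self.rls, self.rmc, self.storage = rls, rmc, storage

    def register_new_file(self, url, data, alias=None):
        guid = str(uuid.uuid4())  # unique Grid ID for the new file
        self.storage[url] = data
        self.rls.add(guid, url)
        if alias is not None:
            self.rmc.add_alias(alias, guid)
        return guid

    def replicate(self, guid, source_url, target_url):
        # Step 1: copy the data to the target Storage Element.
        self.storage[target_url] = self.storage[source_url]
        try:
            # Step 2: record the new location in the RLS.
            self.rls.add(guid, target_url)
        except Exception:
            # Registration failed: remove the copy so that the SE and
            # the catalog do not disagree, then re-raise the error.
            del self.storage[target_url]
            raise
```

A user would resolve an alias through the catalog first (`rmc.resolve(alias)`) and then ask the location service for all replicas of the resulting GUID (`rls.lookup(guid)`).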

8 Higher Level Replication Services
Components shown in the figure: Replica Manager, Replica Location Service, Replica Optimization Service, Replica Metadata Catalog, SE Monitor, Network Monitor, Replica Subscription Service, Storage Element
- The Replica Manager may call on the Replica Optimization Service to find the best replica among many, based on network and SE monitoring.
- The Replica Subscription Service issues replication commands automatically, based on a set of subscription rules defined by the user.
- Hooks for user-defined pre- and post-processing for replication operations are available.
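As a rough illustration of replica selection from monitoring data, the sketch below picks the replica with the lowest estimated access time. The cost model (SE latency plus file size divided by bandwidth), the function names, the URLs, and the numbers are all assumptions made for this example; the slides do not say how the Replica Optimization Service actually ranks replicas.

```python
def estimated_transfer_seconds(size_bytes, bandwidth_bytes_per_s, se_latency_s):
    """Assumed cost model: SE access latency plus network transfer time."""
    return se_latency_s + size_bytes / bandwidth_bytes_per_s


def best_replica(replicas, size_bytes, network_monitor, se_monitor):
    """Return the replica URL with the lowest estimated cost.

    network_monitor: url -> available bandwidth in bytes/s (hypothetical feed)
    se_monitor:      url -> expected SE access latency in seconds (hypothetical feed)
    """
    return min(
        replicas,
        key=lambda url: estimated_transfer_seconds(
            size_bytes, network_monitor[url], se_monitor[url]
        ),
    )


# Made-up monitoring snapshot for three replicas of the same file.
replicas = ["se://cern/f1", "se://infn/f1", "se://glasgow/f1"]
network = {"se://cern/f1": 10e6, "se://infn/f1": 50e6, "se://glasgow/f1": 5e6}
latency = {"se://cern/f1": 5.0, "se://infn/f1": 60.0, "se://glasgow/f1": 1.0}

large_choice = best_replica(replicas, 1_000_000_000, network, latency)
small_choice = best_replica(replicas, 1_000_000, network, latency)
```

With these invented figures the 1 GB file is best fetched from the high-bandwidth site despite its slow SE, while the 1 MB file is best fetched from the low-latency site, which is the kind of trade-off that combining network and SE monitoring makes visible.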

9 Interactions with other Grid components
Components shown in the figure: Replica Manager, Replica Location Service, Replica Optimization Service, Replica Metadata Catalog, SE Monitor, Network Monitor, Information Service, Resource Broker, User Interface or Worker Node, Replica Subscription Service, Storage Element, Virtual Organization Membership Service
- Applications and users interface to data through the Replica Manager, either directly or through the Resource Broker.
- Management calls should never go directly to the SE.

10 Replication Services Status
- Current status
  - All components are deployed right now
  - Initial tests show that expected performance can be met
  - Need proper testing in a 'real user environment': EDG2, LCG1
- Features for next release
  - Currently Worker Nodes need outbound connectivity, so a Replica Manager Proxy Service is needed. This needs a proper security delegation mechanism.
  - Logical collections support
  - Service-level authorization
  - Subscription Service does not handle individual users, due to missing delegation.

11 Security: Infrastructure for Java-based Web Services
- Trust Manager
  - Mutual client-server authentication using GSI (i.e. PKI X.509 certificates) for all WP2 services
  - Supports everything transported over SSL
- Authorization Manager
  - Supports coarse-grained authorization: mapping user -> role -> attribute
  - Fine-grained authorization through policies, role and attribute maps
  - Web-based admin interface for managing the authorization policies and tables
- Status:
  - Fully implemented; authentication is enabled on the service level
  - Delegation implementation needs to be finished
  - Authorization needs more integration, waiting for deployment of VOMS
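The coarse-grained mapping user -> role -> attribute can be pictured as two lookup tables chained together. The Python sketch below is only an illustration under assumed names: the certificate subjects, role names, attribute strings, and functions are invented for the example and do not come from the WP2 Authorization Manager.

```python
# Hypothetical tables: certificate subject -> roles, role -> attributes.
USER_ROLES = {
    "/O=Grid/O=Example/CN=Alice": {"production-manager"},
    "/O=Grid/O=Example/CN=Bob": {"analyst"},
}

ROLE_ATTRIBUTES = {
    "production-manager": {"replica:read", "replica:write"},
    "analyst": {"replica:read"},
}


def attributes_for(subject):
    """Follow user -> role -> attribute and collect every attribute granted."""
    granted = set()
    for role in USER_ROLES.get(subject, set()):
        granted |= ROLE_ATTRIBUTES.get(role, set())
    return granted


def is_authorized(subject, required_attribute):
    """Coarse-grained check: allow only if the attribute was granted via a role."""
    return required_attribute in attributes_for(subject)
```

In this sketch an unknown subject simply receives no attributes, so every request from it is denied; the subject string itself would come from the authenticated client certificate.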

12 Conclusions and outlook
- The second-generation Data Management services have been designed and implemented based on the Web Service paradigm
- Flexible, extensible service framework
- Deployment choices: robust, highly available commercial products are supported (e.g. Oracle) as well as open-source ones (MySQL, Tomcat)
- First experiences with these services show that their performance meets expectations
- Real-life usage on the LCG-1 and EDG2.0 testbeds during the rest of this year will show their strengths and weaknesses

Thanks to the EU and our national funding agencies for their support of this work.

