WP2: Data Management
Gavin McCance
University of Glasgow
November 5, 2001
Overview
  - Deliverables
  - Replication: GDMP
  - Meta-data: Spitfire
  - GridPP effort
  - Future work
  - Query Optimisation
Deliverables
  - EU DataGrid WP2: major M9 deliverables met
      - GDMP delivered
      - Spitfire delivered
      - Architecture Document
GDMP
  - Generic mirroring tool for any file type (read-only replicas)
  - Particular plug-ins for Objectivity database files
  - Subscription model for automatic synchronisation of files
  - Automatic update of replica catalogue
      - Currently uses Globus Replica Catalogue
…GDMP
  - BrokerInfo API from WP1
      - Allows users of GDMP to obtain information from the job scheduler
  - Mass Storage Interface from WP5
      - e.g. support for file staging
  - Security is provided via standard GSI (single sign-on)
      - Authorisation via grid mapfile
  - File transfer made using GridFTP
  - Installation: RPM and tarball
…GDMP usage
  1. (A, B) Start GDMP services (inetd)
  2. (B) Registers itself with site A
       gdmp_host_subscribe
  3. (A) New files: register them
       gdmp_register_local_file
       This updates the local (on A) catalogue
  4. (A) Tell the world (well... all subscribed sites)
       gdmp_publish_catalogue
       Will update the import catalogue on all subscribed sites
  [Diagram: Site A, Site B]
…GDMP usage
  5. (B) Get the new files from site A
       gdmp_replicate_get
       The new files will be transferred from site A to site B
       Globus replica catalogue updated
  - Filters so you only get the files you want
  - CRC checking of file transfer
  [Diagram: Site A, Site B]
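The subscribe / register / publish / replicate cycle on the usage slides can be sketched as a toy in-memory model. This is only an illustration of the flow, not GDMP code: the `Site` class and its method names are hypothetical, the in-memory copy stands in for a GridFTP transfer, and the use of CRC32 from `zlib` is an assumption about what "CRC checking" might look like.

```python
import zlib


class Site:
    """Toy stand-in for a GDMP site (hypothetical, not the GDMP data model)."""

    def __init__(self, name):
        self.name = name
        self.storage = {}           # filename -> file contents (bytes)
        self.local_catalogue = {}   # filename -> CRC32 of locally registered files
        self.import_catalogue = {}  # filename -> (source site, expected CRC32)
        self.subscribers = []       # sites that subscribed to this one

    def subscribe_to(self, producer):
        # Step 2: register with the producing site (cf. gdmp_host_subscribe).
        producer.subscribers.append(self)

    def register_local_file(self, filename, data):
        # Step 3: add a new file to the local catalogue
        # (cf. gdmp_register_local_file).
        self.storage[filename] = data
        self.local_catalogue[filename] = zlib.crc32(data)

    def publish_catalogue(self):
        # Step 4: update the import catalogue on every subscribed site
        # (cf. gdmp_publish_catalogue).
        for site in self.subscribers:
            for filename, crc in self.local_catalogue.items():
                if filename not in site.storage:
                    site.import_catalogue[filename] = (self, crc)

    def replicate_get(self, wanted=lambda filename: True):
        # Step 5: fetch the files that pass the filter, verifying each
        # transfer against the published checksum (cf. gdmp_replicate_get).
        fetched = []
        for filename, (source, crc) in list(self.import_catalogue.items()):
            if not wanted(filename):
                continue
            data = source.storage[filename]  # stands in for a GridFTP transfer
            if zlib.crc32(data) != crc:
                raise IOError("CRC mismatch for " + filename)
            self.storage[filename] = data
            del self.import_catalogue[filename]
            fetched.append(filename)
        return fetched


site_a, site_b = Site("A"), Site("B")
site_b.subscribe_to(site_a)
site_a.register_local_file("run1.db", b"event data")
site_a.register_local_file("run1.log", b"log text")
site_a.publish_catalogue()
# Site B only wants database files; run1.log stays in its import catalogue.
fetched = site_b.replicate_get(wanted=lambda f: f.endswith(".db"))
```

Here the filter leaves `run1.log` pending on site B, which mirrors the "filters so you only get the files you want" point.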
Spitfire
  - Provides grid-enabled access to any relational database
  - SQL Database Service
      - Storage of general meta-data
  - Service Index soon…
  - Secure access via GSI (single sign-on)
  - Installation: RPM and tarball
…Spitfire
  - Allows any HTTP-compliant system (e.g. web browsers, standard C++ HTTP libraries) to access any relational database across the grid…
  - SQL Database Service (Spitfire) = Oracle / PostgreSQL + Grid Security + standard communication protocols (XML over HTTPS)
  - Java Servlet based
…Spitfire security
  - Authentication is currently provided
      - Standard user & server grid certificates
      - For both application programs and web browsers
  - Authorisation matrix coming soon…
      - Will map grid identity to role(s)
          - Reader, info-update, manager
      - Roles will then map to a given database connection with given permissions on a database
          - e.g. query-only, insert, update, create new tables
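The planned identity-to-role-to-permission mapping could look roughly like the sketch below. It is not Spitfire's implementation: the certificate subject names are made up, and which permissions each role carries is an assumption chosen for illustration (the slide only lists the role names and example permissions).

```python
# Assumed role -> permission sets; the slide names the roles and the
# example permissions but does not say how they pair up.
ROLE_PERMISSIONS = {
    "reader": {"query"},
    "info-update": {"query", "insert", "update"},
    "manager": {"query", "insert", "update", "create-table"},
}

# Hypothetical grid identities (certificate subject names) -> roles.
IDENTITY_ROLES = {
    "/O=Grid/O=Example/CN=Alice Example": {"reader"},
    "/O=Grid/O=Example/CN=Bob Example": {"reader", "info-update"},
}


def permissions_for(subject_dn):
    """Union of the permissions of every role held by this identity."""
    permissions = set()
    for role in IDENTITY_ROLES.get(subject_dn, ()):
        permissions |= ROLE_PERMISSIONS[role]
    return permissions


def is_allowed(subject_dn, operation):
    """True if the identity holds a role that grants the operation."""
    return operation in permissions_for(subject_dn)


alice = "/O=Grid/O=Example/CN=Alice Example"
bob = "/O=Grid/O=Example/CN=Bob Example"
```

Unknown identities get no roles and therefore no permissions, so the default is to deny.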
…Spitfire
  - Easy to install
  - Good documentation
  - Ready-to-run examples
  - For grid-based meta-data catalogue needs…
  - … we need feedback!
WP2 GridPP Effort
  - Based at Glasgow
  - Effort will focus primarily on the query optimisation task of WP2
      - 1 PhD student, 1.5 RA
  - Continuing effort in development of Spitfire and related applications
      - 0.7 RA
Future Spitfire work
  - Look at common ground between WP2 and WP3
      - Spitfire and R-GMA?
  - Security
      - Authorisation mechanisms
  - Other Spitfire applications
      - Service Index, Replica Catalogue
  - Work on scalable architectures
      - Common with e.g. replica catalogue work
Query Optimisation work
  - Categorise possible areas for optimisation:
      - User oriented: high performance
          - Minimising cost for a specific job
      - Grid oriented: high throughput
          - Maximise efficient usage of resources
      - Site oriented: local policy
          - Respond to specific site policies / requirements
  - Much preliminary work done!
  - Workshop in December 2001…
…Query Optimisation
  - Short term: Data Access optimisation
  - Replica Optimiser component
      - How long will it take to get the data here?
  - Developing and evaluating appropriate algorithms for working this out and choosing the best replica…
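A minimal sketch of the "how long will it take to get the data here?" question, assuming a deliberately naive cost model (fixed latency plus file size divided by bandwidth). The `Replica` fields, site names, and numbers are hypothetical, and this is not one of the WP2 algorithms, which the slide describes as still under development.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical description of one replica location."""
    site: str
    bandwidth_mb_per_s: float  # assumed available bandwidth to the job's site
    latency_s: float           # assumed fixed start-up cost, e.g. file staging


def estimated_transfer_time(replica, file_size_mb):
    """Naive estimate: fixed latency plus size over bandwidth, in seconds."""
    return replica.latency_s + file_size_mb / replica.bandwidth_mb_per_s


def choose_best_replica(replicas, file_size_mb):
    """Pick the replica with the smallest estimated transfer time."""
    return min(replicas, key=lambda r: estimated_transfer_time(r, file_size_mb))


replicas = [
    Replica(site="site-x", bandwidth_mb_per_s=10.0, latency_s=1.0),
    Replica(site="site-y", bandwidth_mb_per_s=50.0, latency_s=30.0),
]

# Large file: the faster link wins despite its start-up cost.
best_large = choose_best_replica(replicas, file_size_mb=1000.0)
# Small file: the low-latency site wins.
best_small = choose_best_replica(replicas, file_size_mb=100.0)
```

Even this toy model shows that the best replica depends on the file being requested, not just on the link, which is one reason the choice has to be made per job.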
…Query Optimisation
  - Modelling and Simulation
      - Best not to test out the more crazy algorithms on the experiment testbed
  - Work underway with MONARC tool
      - Evaluating suitability as simulation tool for this particular work
  - Integrate into the QO work
Summary
  - Major deliverables for M9 met
      - GDMP and Spitfire
  - GridPP will concentrate effort on the Query Optimisation task of WP2
      - + continued Spitfire development
  - Work already underway