OGSA-DAI Lectures, Part 2
Tom Sugden, EPCC
2nd International Summer School on Grid Computing, Vico Equense, Italy
Outline
- Inside a Grid Data Service (15 mins)
- OGSA-DAI User Guide (30 mins)
- The Client Toolkit APIs (20 mins)
- Wrap-up (15 mins)

Status
- OGSA-DAI middleware
  - Release 4 of 7
  - Functional and flexible
  - Performance and scalability issues
- Depends on:
  - Globus Toolkit 3.2
  - Java 1.4+
  - Apache Ant
- Supports various databases
  - MySQL, Oracle, DB2, PostgreSQL, Xindice
Inside a Grid Data Service
(Diagram labels: Grid Data Service, Data Resource, Perform Document, Response Document, Result Data.)

Overview
- Low-level components of a Grid Data Service
  - Engine
  - Activities
  - Data Resource Implementation
  - Role Mapper
- Extensibility of OGSA-DAI architecture
  - Interfaces
  - Abstract classes
  - Implementations

GDS Internals
(Diagram labels: The Engine; Query Activity, Transform Activity, Delivery Activity; Data Resource Implementation; Role Mapper; perform document, response document, element, data, query, credentials, role, connection.)

Grid Data Service
- GDS has a document-based interface
  - Consumes perform documents
  - Produces response documents
  - Additional operations for third-party data delivery
- Motivation for using a document interface
  - A change in behaviour does not imply an interface change
  - Reduces the number of operation calls
  - Extensible
The GDS Engine
- The Engine is the central GDS component
- Dictates behaviour when perform documents are submitted
  - Parses and validates the perform document
  - Identifies the required activity implementations
  - Processes the activities
  - Composes the response document
  - Returns the response document to the GDS

Perform Documents
- Perform documents
  - Encapsulate multiple interactions with a service into a single interaction
  - Abstract each interaction into an "activity"
  - Data can flow from one activity to another
  - Not quite workflow
    - No control constructs present (conditionals, loops, variables)
(Diagram: Query, Transformation, Delivery.)

Activities
- An Activity dictates an action to be performed
  - Query a data resource
  - Transform data
  - Deliver results
- The Engine processes a sequence of activities
- A subset of activities is available to a GDS
  - Specified in a configuration file
- Data can flow between activities
(Diagram: SQL Query Statement -> WebRowSet data -> XSLT Transform -> HTML data -> Delivery ToURL.)

Activity Taxonomy
- Activities fall into three main functional groups
- Statement
  - Interact with the data resource
- Delivery
  - Deliver data to and from third parties
- Transform
  - Perform transformations on data
(Diagram: Activity with subtypes Statement, Delivery, Transform.)
Building Blocks: Predefined Activities
- sqlQueryStatement, sqlStoredProcedure, sqlUpdateStatement, sqlBulkLoadRowset
- xPathStatement, xUpdateStatement, xQueryStatement
- xmlResourceManagement, xmlCollectionManagement, relationalResourceManager
- gzipCompression, zipArchive, xslTransform
- inputStream, outputStream
- DeliverFromURL, DeliverToURL, DeliverToGFTP, DeliverFromGFTP, DeliverToStream, DeliverFromGDT, DeliverToGDT

The Activity Framework
- Extensibility point
- Users can develop additional activities
  - To support different query languages
    - XQuery
  - To perform different kinds of transformation
    - STX
  - To deliver results using a different mechanism
    - WebDAV
- An activity requires
  - XSD schema, e.g. sql_query_statement.xsd
  - Java implementation, e.g. SQLQueryStatementActivity

The Activity Class
- All Activity implementations extend the abstract Activity class
(Class diagram)
    Activity
    ~ mContext: ActivityContext
    + Activity( element: Element )
    ~ cleanUp()
    ~ initialise()
    ~ processBlock() : void
    ~ setCompleted()
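The lifecycle suggested by the class diagram (initialise, repeated processBlock calls until setCompleted, then cleanUp) can be sketched as below. This is a minimal sketch, not the OGSA-DAI source: SketchActivity, ListSourceActivity, ActivityLifecycleDemo and isCompleted() are hypothetical stand-ins, and the driving loop is an assumption about how an engine might call these methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the abstract Activity class on the slide.
abstract class SketchActivity {
    private boolean completed = false;

    void initialise() {}            // acquire resources before the first block
    abstract void processBlock();   // perform one unit of work
    void cleanUp() {}               // release resources after the last block

    protected void setCompleted() { completed = true; }
    boolean isCompleted() { return completed; }   // assumed helper, not on the slide
}

// Emits the entries of a fixed list, one per processBlock() call.
class ListSourceActivity extends SketchActivity {
    private final List<String> input;
    final List<String> emitted = new ArrayList<>();
    private int next = 0;

    ListSourceActivity(List<String> input) { this.input = input; }

    @Override
    void processBlock() {
        if (next < input.size()) {
            emitted.add(input.get(next++));
        }
        if (next >= input.size()) {
            setCompleted();   // nothing left to emit
        }
    }
}

class ActivityLifecycleDemo {
    // Drives one activity: initialise, loop until completed, always clean up.
    static int run(SketchActivity activity) {
        int calls = 0;
        activity.initialise();
        try {
            while (!activity.isCompleted()) {
                activity.processBlock();
                calls++;
            }
        } finally {
            activity.cleanUp();
        }
        return calls;
    }

    public static void main(String[] args) {
        ListSourceActivity source = new ListSourceActivity(List.of("row1", "row2", "row3"));
        int calls = run(source);
        System.out.println(calls + " processBlock calls, emitted " + source.emitted);
    }
}
```

Processing one block per call is what lets an engine interleave several activities instead of running each to completion in turn.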
Connected Activities
(Diagram: a Sql Query Statement activity, "select * from myTable where id=10", connected to a Deliver ToURL activity.)

Connected Activities cont.
(Diagram: Deliver ToURL; "select * from myTable where id=10"; Sql Query Statement.)

The Perform Document
    <gridDataServicePerform xmlns=" xmlns:xsi=" xsi:schemaLocation="
    This example performs a simple select statement to retrieve one row from the test database then delivers the results to an FTP location.
    select * from littleblackbook where id=10

Inputs and Outputs
- Activities read and write blocks of data
  - Allows efficient streaming between activities
  - Reduces memory overhead
- A block is a Java Object
  - Untyped but usually a String or byte array
- Interfaces for reading and writing
  - BlockReader and BlockWriter
(Diagram: SQL Query Statement Activity, XSL Transform Activity, Deliver To URL.)
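Block-at-a-time streaming can be illustrated with a pull-based chain of readers. This is a sketch under stated assumptions: SketchBlockReader, ListBlockReader and UpperCaseReader are hypothetical and are not the OGSA-DAI BlockReader and BlockWriter interfaces, whose method signatures the slides do not show.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Hypothetical reader interface: blocks are untyped Objects, as on the slide.
interface SketchBlockReader {
    boolean hasNext();
    Object next();
}

// Source: hands out the elements of a list one block at a time.
class ListBlockReader implements SketchBlockReader {
    private final Iterator<?> blocks;

    ListBlockReader(List<?> blocks) { this.blocks = blocks.iterator(); }

    public boolean hasNext() { return blocks.hasNext(); }
    public Object next() { return blocks.next(); }
}

// Transform: pulls one block from upstream only when asked for one,
// so the whole data set is never held in memory at once.
class UpperCaseReader implements SketchBlockReader {
    private final SketchBlockReader upstream;

    UpperCaseReader(SketchBlockReader upstream) { this.upstream = upstream; }

    public boolean hasNext() { return upstream.hasNext(); }

    public Object next() {
        Object block = upstream.next();
        // Blocks are untyped, so check the type before transforming.
        if (block instanceof String) {
            return ((String) block).toUpperCase(Locale.ROOT);
        }
        return block;   // pass other block types (e.g. byte[]) through unchanged
    }
}

class BlockStreamingDemo {
    // Sink: collects every block the chain produces.
    static List<Object> drain(SketchBlockReader reader) {
        List<Object> out = new ArrayList<>();
        while (reader.hasNext()) {
            out.add(reader.next());
        }
        return out;
    }

    public static void main(String[] args) {
        SketchBlockReader chain =
            new UpperCaseReader(new ListBlockReader(List.of("alice", "bob")));
        System.out.println(drain(chain));
    }
}
```

Because blocks are untyped, each consumer has to decide how to treat types it does not expect; the slide on issues with the activity model returns to this point.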
Data Resource Implementations
- Governs access to a data resource
  - Open/close connections
  - Validate user credentials using a RoleMapper
  - Facilitate connection pooling
- Provided for JDBC and XML:DB
(Diagram: the SQL Query Statement gets a connection from, and returns it to, the JDBC Data Resource, which opens and closes connections to the relational database.)

Accessing Data Resource: Sequence Diagram
Participants: :Activity, :Context, :DataResourceImplementation, :RoleMapper, :DatabaseRole
1. Get user credentials and data resource implementation
2. Get connection using user credentials
3. Get database role using user credentials
4. Get user ID and password
5. Open connection using user ID and password
6. Do exciting things with the connection
7. Return connection
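The credential-to-connection steps can be sketched as follows. All class names here (SketchDatabaseRole, SketchRoleMapper, SketchDataResource) are hypothetical stand-ins rather than OGSA-DAI classes, and the sketch returns a description of the connection instead of opening one, so it runs without a database. The guest user and JDBC URL are taken from the configuration slides later in the deck.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Database user ID and password that a grid identity maps to.
final class SketchDatabaseRole {
    final String userId;
    final String password;

    SketchDatabaseRole(String userId, String password) {
        this.userId = userId;
        this.password = password;
    }
}

// Hypothetical role mapper: looks up a database role by certificate DN.
class SketchRoleMapper {
    private final Map<String, SketchDatabaseRole> rolesByDn = new HashMap<>();

    void addUser(String dn, String userId, String password) {
        rolesByDn.put(dn, new SketchDatabaseRole(userId, password));
    }

    Optional<SketchDatabaseRole> map(String dn) {
        return Optional.ofNullable(rolesByDn.get(dn));
    }
}

// Hypothetical data resource: validates credentials before "connecting".
class SketchDataResource {
    private final String jdbcUrl;
    private final SketchRoleMapper roleMapper;

    SketchDataResource(String jdbcUrl, SketchRoleMapper roleMapper) {
        this.jdbcUrl = jdbcUrl;
        this.roleMapper = roleMapper;
    }

    // A real implementation would open a JDBC connection here, e.g. with
    // DriverManager.getConnection(jdbcUrl, role.userId, role.password).
    String describeConnectionFor(String dn) {
        SketchDatabaseRole role = roleMapper.map(dn)
            .orElseThrow(() -> new SecurityException("No database role for: " + dn));
        return role.userId + "@" + jdbcUrl;
    }
}

class RoleMappingDemo {
    public static void main(String[] args) {
        SketchRoleMapper mapper = new SketchRoleMapper();
        mapper.addUser("No Certificate Provided", "guest", "guest");
        SketchDataResource resource =
            new SketchDataResource("jdbc:mysql://localhost:3306/treefrogs", mapper);
        System.out.println(resource.describeConnectionFor("No Certificate Provided"));
    }
}
```

Keeping the mapping inside the data resource means an activity never sees database passwords; it only presents the caller's credentials.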
Advantages of the Activity Model
- Avoid multiple message exchanges
  - Multiple activities within a single request
- Extensible
  - Developers can add functionality
  - Could import trusted third-party activities
- Simplicity
  - Internal classes manage data flow, access to databases, etc.

Issues with Activity Model
- Incomplete syntax
  - No typing of inputs and outputs
    - How do you determine the data types that can be accepted?
- Keeping implementation and XML Schema fragment in sync
- Puts workload on the server
  - May need dynamic job placement
- DAIS has factored out the perform document from the draft specs

Summary
- The Engine is the central component of a GDS
- Activities perform actions
  - Querying, Updating
  - Transforming
  - Delivering
- Data Resource Implementations manage access to underlying data resources
- Architecture designed for extensibility
  - New Activities
  - New Role Mappers
  - New Data Resource Implementations
OGSA-DAI User Guide
OGSA-DAI in a Nutshell
- All you need to know to get started with OGSA-DAI in a handy pocket-sized book!
- Updated for Version 4

Overview
- Installing OGSA-DAI
- Configuring Grid Data Service Factories
- Registering Services
- Using Grid Data Services
- Writing perform documents
- Using the supplied client applications
- Using the client toolkit
- Learn by scenario

Scenario: Red Eyed Tree Frogs
- Alice is a molecular biologist
- Based at the University of Edinburgh
- Mapped the genetic sequence of the Red-Eyed Tree Frog

Background
- Alice wants to make her work available to the scientific community
- Publish an on-line database
- Use OGSA-DAI
(Diagram: Alice, Bob, Carroll.)

Alice's Database
- Tree Frogs: MySQL relational database
  - jdbc:mysql://localhost:3306/TreeFrogs
- Contains 1 table with 1,000,000 rows
  - GeneticSequence
- JDBC database driver
  - org.gjt.mm.mysql.Driver
Installing OGSA-DAI
- Download OGSA-DAI software
- Follow installation notes
  - Set up prerequisite software
    - Java (JDK1.3 or newer)
    - Web services container (Tomcat)
    - Grid Middleware (Globus Toolkit 3.2)
    - Build tool (Ant)
    - Additional libraries (Log4J, database drivers, etc.)
  - Deploy OGSA-DAI

Configuring Services
- Configure Grid Data Service Factories (GDSF)
  1. Allow specific users read/write access
  2. Allow anonymous users to search data
(Diagram: the Private Factory creates a GDS with read/write access to Tree Frogs; the Public Factory creates a GDS with read access.)

Part 1: Configuring Private Factory
- Allow specific users to perform
  - SQL query statements
  - SQL update statements
  - Bulk load of data
- To configure the factory:
  - Create data resource configuration file
  - Create activity configuration file
  - Create database roles file
  - Update server configuration
Data Resource Configuration
- Configuration file describes the data resource
  - Create TreeFrogsPrivate.xml
  - Base on examples\GDSFConfig\dataResourceConfig.xml
    <roleMap implementation="...rolemap.SimpleFileRoleMapper"
             configuration="path/PrivateDatabaseRoles.xml"/>
    <dataResource implementation="...SimpleJDBCDataResourceImplementation">
    jdbc:mysql://localhost:3306/treefrogs

Activity Configuration
- Describes the activities that are supported by the data resource
  - Create TreeFrogsPrivateActivities.xml
  - Base on examples\GDSFConfig\activityConfig.xml
    <activity name="sqlQueryStatement"
              implementation="package.SQLQueryStatementActivity"
              schemaFileName="path/sql_query_statement.xsd"/>
    <activity name="sqlUpdateStatement"
              implementation="package.SQLUpdateStatementActivity"
              schemaFileName="path/sql_update_statement.xsd"/>

Create Database Roles
- Enables access to TreeFrogs database
  - Create file PrivateDatabaseRoles.xml
  - Base on examples\RoleMap\ExampleDatabaseRoles.xml
(Users shown: alice / amph1b1an, bob / tadp0le.)

Edit Server Configuration
- Specifies the services for the container
- Loaded when Tomcat starts up
  - Edit file server-config.xml
    ...
    <parameter name="ogsadai.gdsf.config.xml.file"
               value="path/TreeFrogsPrivate.xml"/>
    <parameter name="ogsadai.gdsf.activity.xml.file"
               value="path/TreeFrogsPrivateActivities.xml"/>
    ...
Starting the Factory
- Start service container (Tomcat)
- View the factory using a web/service browser
  - Causes factory to start up
    ogsa/services/ogsadai/TreeFrogFactoryPrivate?wsdl

Milestone 1
- Configuration for Private Tree Frog Factory complete
- Specific users can
  - locate factory using known location
  - create GDS
  - query and update database
(Diagram: the Private Tree Frog Factory creates a GDS with read/write access to Tree Frogs.)

Use-case 1: Remote update
- Bob is a Professor of Biology
  - Based at the University of Sydney
  - Working in collaboration with Alice on the Red-Eyed Tree Frog genome
- Through Alice's OGSA-DAI services
  - Bob can contribute new sequences

Interactions
(Diagram participants: Client, Private Tree Frog Factory, Tree Frog Service, Tree Frogs.)
1. creation parameters
2. creates
3. new gene sequence
4. bulk upload of data
5. updated row count
6. updated row count
Perform Documents
- Perform documents are used to communicate with GDS
- Contain only supported activity types
  - sqlQueryStatement
  - sqlUpdateStatement
  - sqlBulkLoadRowSet
- Results delivered in the response document
- Many examples provided with OGSA-DAI
(Diagram: perform document, GDS, response document; callout: specified in data resource configuration.)

Simple Query
- Select a range of chromosomes from GeneSequence
- Use sqlQueryStatement activity
    SELECT Chromosome FROM GeneSequence WHERE Position > 1.1 AND Position < 1.2

Simple Query Response
- Response contains Web Row Set XML

OGSA-DAI Clients
- Send perform documents to a GDS using a client
- OGSA-DAI provides 3 simple clients
  - Command-Line Client
    > java uk.org.ogsadai.client.Client registryURL|factoryURL performDocPath
  - Graphical Demonstrator
    > ant demonstrator
  - Data Browser
    > ant databrowser
Performing Remote Update
- Bob stores his new gene sequence in a local file
- Use deliverFromURL and sqlBulkLoadRowSet activities to update remote database
    file://path/to/newSequence.xml

Interactions
(Diagram: the Client sends a perform document to the GDS; the GDS pulls the new gene sequence file, updates Tree Frogs, and returns a response document with the updated row count.)

Part 2: Configure Public Factory
- Publish to the UK National Biology Registry
- Allow anonymous users to search data
(Diagram: the Public Factory creates a GDS with read access to Tree Frogs; the factory registers its handle with the National Biology Registry, where clients find services.)
Public Factory Set-up
- Database changes
  - Alice defines findGene stored procedure
- Supported activities
  - SQL stored procedure
- To configure factory:
  - Create data resource configuration
  - Create activity configuration file
  - Create database roles file
  - Create service registration list
  - Update server configuration

Data Resource Configuration
- Configuration file describes the data resource
  - Create TreeFrogsPublic.xml
  - Base on examples\GDSFConfig\dataResourceConfig.xml
    <roleMap implementation="...rolemap.SimpleFileRoleMapper"
             configuration="path/PublicDatabaseRoles.xml"/>
    <dataResource implementation="...SimpleJDBCDataResourceImplementation">
    jdbc:mysql://localhost:3306/treefrogs

Activity Configuration
- Describes the activities that are supported by the data resource
  - Create TreeFrogsPublicActivities.xml
  - Base on examples\GDSFConfig\activityConfig.xml
    <!-- Only the sqlStoredProcedure activity is available to this GridDataService -->
    <activity name="sqlStoredProcedure"
              implementation="package.SQLStoredProcedureActivity"
              schemaFileName="path/sql_stored_procedure.xsd"/>

Create Database Roles
- Enables access to TreeFrogs database
  - Create file PublicDatabaseRoles.xml
  - Base on examples\RoleMap\ExampleDatabaseRoles.xml
    <User dn="No Certificate Provided" userid="guest" password="guest"/>
(User shown: guest / guest.)

Edit Server Configuration
- Specifies the services for the container
- Loaded when Tomcat starts up
  - Edit file server-config.xml
    ...
    <parameter name="ogsadai.gdsf.config.xml.file"
               value="path/TreeFrogsPublic.xml"/>
    <parameter name="ogsadai.gdsf.activity.xml.file"
               value="path/TreeFrogsPublicActivities.xml"/>
    <parameter name="ogsadai.gdsf.registrations.xml.file"
               value="path/TreeFrogsRegistrationList.xml"/>
    ...
Create Service Registration List
- Specifies a list of service group registries
- Factory is registered with each registry
  - Create file TreeFrogsRegistrationList.xml
  - Base on examples\GDSFConfig\registrationList.xml
    <gdsfRegistration... gsh=" ogsadai/NationalBiologyRegistry"/>
(Diagram labels: GDSF-Private, register, National Biology Registry.)

Starting the Factory
- Start service container (Tomcat)
- View the factory using a web/service browser
  - Causes factory to start up
  - Automatically registers with NationalBiologyRegistry
    ogsa/services/ogsadai/TreeFrogFactoryPublic?wsdl

Milestone 2
- Configuration for Public and Private Factories complete
  - Specific users have read/write access
  - Anonymous users can search data via stored procedure
(Diagram: GDSF-Private creates a GDS with read/write access to Tree Frogs; GDSF-Public creates a GDS with read access and registers with the National Biology Registry.)
Use-case: Query with transformations
- Carroll is a biochemist
  - Works for a small drugs company in Chicago
  - Investigating toxin in saliva of Fire Bellied Toad
  - Wants to compare proteins with Red Eyed Tree Frog

Transforming Sequences
- Carroll has a protein sequence
- Alice's data is encoded as a gene sequence
- There is a public Grid Data Transformation Service available at Newcastle University
(Diagram: protein sequence, Transform Service, gene sequence.)

Interactions
(Diagram participants: Client, Transform Service, Tree Frog Service.)
1. Transform protein sequence needed for query
  1.1 protein sequence
  1.2 gene sequence

Interactions
1. Transform protein sequence needed for query
  1.1 protein sequence
  1.2 gene sequence
2. Query tree frog gene sequence asynchronously
  2.1 asynchronous query using gene sequence

Interactions
1. Transform protein sequence needed for query
2. Query tree frog gene sequence asynchronously
  2.1 asynchronous query using gene sequence
3. Transform results back into protein sequence
  3.1 pull results
  3.2 results as gene sequence
  3.3 results as protein sequence
Client Toolkit
- Why? Writing XML is a pain!
- A programming API which makes writing applications easier
  - Now: Java
  - Next: Perl, C, C#?
    // Create a query
    SQLQuery query = new SQLQuery(SQLQueryString);
    // Perform the query
    Response response = gds.perform(query);
    // Display the result
    ResultSet rs = query.getResultSet();
    displayResultSet(rs, 1);

Conclusion
- OGSA-DAI provides middleware tools to grid-enable existing databases
(Diagram labels: access, discovery, integration, transformation, collaboration.)

The Client Toolkit
Amy Krause and Tom Sugden
Overview
- The Client Toolkit
- OGSA-DAI Service Types
- Locating and Creating Data Services
- Requests and Results
- Delivery
- Data Integration Example

Why use a Client Toolkit?
- Nobody wants to read or write XML!
- Protects developer from
  - Changes in activity schema
  - Changes in service interfaces
  - Low-level APIs
  - DOM manipulation

OGSA-DAI Services
- OGSA-DAI uses three main service types
  - DAISGR (registry) for discovery
  - GDSF (factory) to represent a data resource
  - GDS (data service) to access a data resource
(Diagram: the DAISGR locates a GDSF; the GDSF represents the Data Resource and creates a GDS; the GDS accesses the Data Resource.)
ServiceFetcher
- The ServiceFetcher class creates service objects from a URL
    ServiceGroupRegistry registry =
        ServiceFetcher.getRegistry( registryHandle );
    GridDataServiceFactory factory =
        ServiceFetcher.getFactory( factoryHandle );
    GridDataService service =
        ServiceFetcher.getGridDataService( handle );

Registry
- A registry holds a list of service handles and associated metadata
- Clients can query the registry for all Grid Data Factories
    GridServiceMetaData[] services =
        registry.listServices( OGSADAIConstants.GDSF_PORT_TYPE );
- The GridServiceMetaData object contains the handle and the port types that the factory implements
    String handle = services[0].getHandle();
    QName[] portTypes = services[0].getPortTypes();

Creating Data Services
- A factory object can create a new Grid Data Service.
    GridDataService service = factory.createGridDataService();
- Grid Data Services are transient (i.e. have a finite lifetime), so they can be destroyed by the user.
    service.destroy();
Interaction with a GDS
- Client sends a request to a data service
- A request contains a set of activities
(Diagram: the Client sends a Request containing Activities to the GDS.)

Interaction with a GDS
- The data service processes the request
- Returns a response document with a result for each activity
(Diagram: the GDS returns a Response containing Results to the Client.)

Activities and Requests
- A request contains a set of activities
- An activity dictates an action to be performed
  - Query a data resource
  - Transform data
  - Deliver results
- Data can flow between activities
(Diagram: SQL Query Statement -> WebRowSet data -> XSLT Transform -> HTML data -> Deliver ToURL.)

Predefined Activities
- sqlQueryStatement, sqlStoredProcedure, sqlUpdateStatement, sqlBulkLoadRowset
- xPathStatement, xUpdateStatement, xQueryStatement
- xmlResourceManagement, xmlCollectionManagement, relationalResourceManager
- gzipCompression, zipArchive, xslTransform
- inputStream, outputStream
- DeliverFromURL, DeliverToURL, DeliverToGFTP, DeliverFromGFTP, DeliverToStream, DeliverFromGDT, DeliverToGDT, DeliverToFile, DeliverFromFile
- fileWriting, directoryAccess, fileAccess, fileManipulation
Examples of Activities
- SQLQuery
    SQLQuery query = new SQLQuery(
        "select * from littleblackbook where id='3475'");
- XPathQuery
    XPathQuery query = new XPathQuery( );
- XSLTransform
    XSLTransform transform = new XSLTransform();
- DeliverToGFTP
    DeliverToGFTP deliver = new DeliverToGFTP(
        "ogsadai.org.uk", 8080, "myresults.txt" );

Simple Requests
- Simple requests consist of only one activity
- Send the activity directly to the perform method
    SQLQuery query = new SQLQuery(
        "select * from littleblackbook where id='3475'");
    Response response = service.perform( query );
Constructing a Request
(Diagram: SQL Query Statement, XSLT Transform and Delivery ToURL activities are each added to a Request.)

Constructing a Request cont.
(Diagram: an ActivityRequest containing SQL Query, XSL Transform and Delivery ToURL.)
    ActivityRequest request = new ActivityRequest();
    request.add( query );
    request.add( transform );
    request.add( delivery );

Data Flow
- Connecting activities
    SQLQuery query = new SQLQuery(
        "select * from littleblackbook where id<=1000");
    DeliverToURL deliver = new DeliverToURL( url );
    deliver.setInput( query.getOutput() );
(Diagram: SQL Query Statement connected to Deliver ToURL.)
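One thing to check when wiring activities with setInput and getOutput is that every connected input is produced by an activity in the same request. The sketch below illustrates that check with hypothetical stand-ins (SketchOutput, SketchNode, SketchRequest); it does not use the client toolkit classes, and whether the toolkit performs such a check itself is not stated in these slides.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical handle for an activity's output stream.
class SketchOutput {
    final String name;
    SketchOutput(String name) { this.name = name; }
}

// Hypothetical activity with one output and at most one input.
class SketchNode {
    final String name;
    private final SketchOutput output;
    private SketchOutput input;   // null when nothing is connected

    SketchNode(String name) {
        this.name = name;
        this.output = new SketchOutput(name + "Output");
    }

    SketchOutput getOutput() { return output; }
    SketchOutput getInput() { return input; }
    void setInput(SketchOutput upstream) { this.input = upstream; }
}

// Hypothetical request: an ordered collection of activities.
class SketchRequest {
    private final List<SketchNode> nodes = new ArrayList<>();

    void add(SketchNode node) { nodes.add(node); }

    // Names of activities whose input is not produced inside this request.
    List<String> danglingInputs() {
        Set<SketchOutput> produced = new HashSet<>();
        for (SketchNode node : nodes) {
            produced.add(node.getOutput());
        }
        List<String> dangling = new ArrayList<>();
        for (SketchNode node : nodes) {
            if (node.getInput() != null && !produced.contains(node.getInput())) {
                dangling.add(node.name);
            }
        }
        return dangling;
    }
}

class DataFlowDemo {
    public static void main(String[] args) {
        SketchNode query = new SketchNode("sqlQuery");
        SketchNode transform = new SketchNode("xslTransform");
        SketchNode deliver = new SketchNode("deliverToURL");
        transform.setInput(query.getOutput());
        deliver.setInput(transform.getOutput());

        SketchRequest request = new SketchRequest();
        request.add(query);
        request.add(deliver);   // transform has been forgotten
        System.out.println("Dangling before fix: " + request.danglingInputs());

        request.add(transform);
        System.out.println("Dangling after fix: " + request.danglingInputs());
    }
}
```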
Performing Requests
- Finally… perform the request!
    Response response = service.perform( request );
- The response contains the status and results of each activity in the request.
    System.out.println( response.getAsString() );

Processing Results
- Varying formats of output data
  - SQLQuery
    - JDBC ResultSet:
        ResultSet rs = query.getResultSet();
  - SQLUpdate
    - Integer:
        int rows = update.getModifiedRows();
  - XPathQuery
    - XML:DB ResourceSet:
        ResourceSet results = query.getResourceSet();
- Output can always be retrieved as a String
    String output = myactivity.getOutput().getData();

Delivery
- Data can be pulled from or pushed to a remote location.
- OGSA-DAI supports third-party transfer using FTP, HTTP, or GridFTP protocols.
    DeliverToURL deliver = new DeliverToURL( url );
    deliver.setInput( myactivity.getOutput() );

    DeliverToGFTP deliver = new DeliverToGFTP(
        "ogsadai.org.uk", 8080, "tmp/data.out" );
    deliver.setInput( myactivity.getOutput() );

Delivery Methods
(Diagram labels: GDS; GridFTP server, Local Filesystem, Web Server, FTP server; DeliverFromURL, DeliverTo/FromURL, DeliverTo/FromGFTP, DeliverTo/FromFile.)
Delivering data to another GDS
- The GDT port type allows data to be transferred from one data service to another.
- An InputStream activity of GDS1 connects to a DeliverToGDT activity of GDS2
- Alternatively, an OutputStream activity can be connected to a DeliverFromGDT activity
(Diagram: InputStream at GDS1, DeliverToGDT at GDS2.)

Delivering Data
- Transfer in blocks or in full
- InputStream activities wait for data to arrive at their input
- Therefore, the InputStream activity at the sink has to be started before the DeliverToGDT activity at the source
- The same applies to OutputStream and DeliverFromGDT
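The start-order constraint can be illustrated with a hand-off queue: a sender can only pass a block to a receiver that is already waiting. This is an analogy under stated assumptions, not the GDT protocol. GdtHandoffSketch, its END marker and its timeouts are hypothetical; only java.util.concurrent classes are used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;

class GdtHandoffSketch {
    // Marker telling the sink that no more blocks will arrive (assumed convention).
    static final String END = "<end-of-data>";

    // Source side: hands each block to a waiting sink. Returns false if no
    // sink accepts a block within timeoutMs.
    static boolean send(SynchronousQueue<String> channel, List<String> blocks, long timeoutMs) {
        try {
            for (String block : blocks) {
                if (!channel.offer(block, timeoutMs, TimeUnit.MILLISECONDS)) {
                    return false;
                }
            }
            return channel.offer(END, timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Sink side: a thread that waits for blocks until the END marker arrives.
    static Thread startSink(SynchronousQueue<String> channel, List<String> received) {
        Thread sink = new Thread(() -> {
            try {
                for (String b = channel.take(); !END.equals(b); b = channel.take()) {
                    received.add(b);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sink.setDaemon(true);   // do not keep the JVM alive if the source never sends
        sink.start();
        return sink;
    }

    // Starts the sink first, then the source, and returns what the sink received.
    static List<String> transfer(List<String> blocks) {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        List<String> received = new ArrayList<>();
        Thread sink = startSink(channel, received);
        boolean sent = send(channel, blocks, 2000);
        try {
            sink.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sent ? received : new ArrayList<>();
    }

    public static void main(String[] args) {
        List<String> blocks = List.of("block-1", "block-2");
        // No sink is waiting, so the hand-off times out.
        System.out.println("Without sink: " + send(new SynchronousQueue<>(), blocks, 50));
        // Sink started first, so every block is handed over.
        System.out.println("With sink: " + transfer(blocks));
    }
}
```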
Data Integration Scenario
(Diagram: a Client with GDS1, GDS2 and GDS3, each in front of a relational database. Labels: select + output stream (twice), deliver, deliver from GDT, bulk load, join tables.)

Conclusion
- Easy to use
  - No XML!
  - Fewer low-level APIs
  - Improves usability and shortens the learning curve for OGSA-DAI client development
- Protects developer
  - Shielded from schema changes, protocols, GT3
- Limitations
  - Metadata and service data not adequately addressed
  - Higher-level abstraction possible (no factory)
OGSA-DAI Wrap-up
Overview
- Future Developments
- The OGSA-DAI Webpage
- Support
- Information
- Tutorials
- Links

Future Developments
(Timeline: January 2004 to December 2005.)
- R3.1: Technical preview of parts of R4
- R4: Enhancements and additional DBMS, SQL, File, Client toolkit
- R5: Compliance with DAIS, distributed query and transactions, improved performance, scalability, dependability and security, installation wizard, coordinated contributor community
- R6: Features depend on user priorities, context and research
- R7: Maintainable release for the user community

R5 to R7
- R5, October 04
  - Compliance with DAIS standards proposal
  - Distributed Relational Query Processing
  - Improved dependability and security integration
  - Extended & integrated XML and relational facilities
  - Distributed transaction participation
  - Coordinated OGSA-DAI contributor community
- R6, April 05
  - Integrated with GT4
  - New facilities depend on user priorities, context and research
  - OGSA-DAI components from contributor community
- R7, October 05
  - Maintainable release for the user community
OGSA-DAI Project Webpage
- Background
- News & Events
- Software Releases
- Documentation
- Support
- Training Courses
- Links

Support
- Long-term support for OGSA-DAI provided by UK Grid Support Centre
- Web forms for submission of
  - General queries
  - Problems with installation and configuration
  - Problems with usage of software
- Submissions are tracked and logged

FAQ and Mailing List
- Frequently Asked Questions
  - Updated as common problems become clear
- Users mailing list
  - General discussion of OGSA-DAI, data and the Grid
  - Use support instead to report problems
- Suggestions for additions and improvements to support service welcome

Tutorials
- Graphical Demonstrator User Guide
- How to write an Activity Tutorial
- Using the Client Toolkit Tutorial

Links
- OGSA-DAI Webpage
- Globus Toolkit 3
- Database Access and Integration Services (DAIS-WG)
- Grid Technology Repository
- ELDAS - Enterprise-Level Data Access Services (Eldas)
- Web Services Choreography
Projects using OGSA-DAI
- DQP
  - Service Based Distributed Query Processor
- FirstDIG
  - Data mining analysis of OGSA-DAI service-enabled data sources
- BIOGRID
  - Construction of a Supercomputer Network to meet IT needs for biology and medical science in Japan
- OGSA-WebDB
  - Provides a uniform view of heterogeneous database resources in a grid environment
- BioSimGrid
  - A distributed database for biomolecular simulations
- More projects

ODD-Genes
- Data Analysis for genetics
  - Sites:
    - GTI (microarray data)
    - HGU (genex data)
    - EPCC (compute server)
  - Software:
    - OGSA-DAI (Data)
    - TOG (Computation)
    - Globus Toolkit 2 and 3

FirstDIG
- Data mining with the First Transport Group, UK
  - Example: "When buses are more than 10 minutes late there is an 82% chance that revenue drops by at least 10%"
(Diagram labels: OGSA-DAI, OGSA-DAI Client Application, Data Mining Application.)
EdSkyQuery-G
- Collaboration between OGSA-DAI & Eldas
- Based on SkyQuery project by Johns Hopkins University, Baltimore, USA
- Identify astronomical objects and dropouts amongst different distributed catalogues
- Large scale data transport
- Plug-in algorithms
- Platform and DBMS independence

EdSkyQuery-G
(Diagram: four Sky Data sources.)

EdSkyQuery-G Challenges
- Data formats
  - XML (WebRowSet)
  - CSV
  - Binary
  - Compressed CSV or XML
- Data transport
  - SOAP over HTTP/HTTPS
  - FTP, Secure-FTP, Grid-FTP
- Importing/Exporting data
  - Through services
  - Direct from stored procedures
  - Using native tools

SkyQuery.net
Conclusion
- Try out OGSA-DAI
  - It's free!
  - Supported
- Please send us feedback!
- Evolving and improving
  - Data integration
  - Performance and scalability
- Become involved
  - Write activities
  - Contribute to the DAIS working group

HPC-Europa
- EC-funded research visit programme
- Fully-funded, multi-disciplinary
- Visits between 3 and 13 weeks
  - EPCC in Edinburgh
  - CEPBA-CESCA in Barcelona/Catalonia
  - HLRS in Stuttgart
  - CINECA in Bologna
  - SARA in Amsterdam
  - IDRIS in Paris

OGSA-DAI Tutorial
- Introduction to data access and integration on the Grid using OGSA-DAI
  - Using the Data Browser
  - Writing Clients using the Client Toolkit APIs
- Start workstations in Windows mode
  - OGSA-DAI, Tomcat, MySQL and Xindice have already been configured