Distributed Heterogeneous Data Warehouse For Grid Analysis
Harvey B. Newman, Julian Bunn, Saima Iqbal
CALTECH (California Institute of Technology)
4/16/2017
OUTLINE
- Introduction
- What is a relational data warehouse?
- Distributed Heterogeneous Relational Data Warehouse Databases (DHRD) and Grid
- How DHRD could be integrated with the Grid
- Why web services?
- Building blocks of Web Services
- Vital parts of Web Services
- How DHRD could be integrated with the Grid as a Web Service
- Grid services
- Grid services architecture (GGF) [draft 16th Feb. 2003]
- Grid services client infrastructure (GGF) [draft 5th Jun. 2003]
- Proposed web services architecture based on Grid services to use DHRD in Grid environment
- Technologies employed
- UDDI compliant registry service
- Working of web services prototype (demo)
- Conclusion
- Future work
- Questions?
INTRODUCTION
- Can databases be integrated with the Grid?
- Most existing and proposed Grid applications are file based.
- Very little work has been done on how Distributed Heterogeneous Databases can be made available on the Grid.
- Web Services can help in accessing Distributed Heterogeneous Databases as a single “Virtual Database” across the Grid.
Distributed Data Warehouse
- A distributed database system allows applications to access data from local and remote databases.
- It helps move some of the data and some of the users to separate servers and databases.
- It allows the data used by a particular workgroup to be kept at Tier 2 and Tier 3, on a nearby server.
- It reduces the need for massive central computing and reduces network delays.
Distributed Heterogeneous Relational Data Warehouse (DHRD) Databases and Grid
- Is it possible to access DHRD databases across the Grid by adopting the existing Grid services that handle files?
- Relational databases offer a much richer set of operations than files, such as queries and transactions.
- Different DBMSs differ from one another much more than different file systems do.
- Even within one paradigm, different database products (Oracle, MS-SQL, DB2) vary in their functionality and interfaces.
How DHRD Could Be Integrated with The Grid
- The diversity of DHRD makes it difficult to design a single solution for integrating DHRD databases with the Grid.
- The Open Grid Services Architecture (OGSA) for distributed systems provides the concept of Grid Services (like Web Services) to access resources across distributed and heterogeneous environments.
- These Grid Services/Web Services can help in presenting the distributed databases across the Grid as a “Virtual Database System”.
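The “Virtual Database System” idea can be illustrated with a minimal Python sketch. It is not the prototype's implementation: it assumes two in-memory SQLite databases standing in for the heterogeneous servers, and the names `VirtualDatabase`, `make_site`, the `events` table and the site names are hypothetical.

```python
import sqlite3


def make_site(rows):
    """Create an in-memory SQLite database standing in for one remote server."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (run INTEGER, energy REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    conn.commit()
    return conn


class VirtualDatabase:
    """Hypothetical facade that sends one query to every member database."""

    def __init__(self, sites):
        self.sites = sites  # mapping: site name -> DB-API connection

    def query(self, sql, params=()):
        # Run the same statement on each site and tag every row with its origin.
        results = []
        for name, conn in self.sites.items():
            for row in conn.execute(sql, params):
                results.append((name,) + tuple(row))
        return results


sites = {
    "tier2_a": make_site([(1, 10.5), (2, 99.0)]),
    "tier2_b": make_site([(3, 120.0)]),
}
vdb = VirtualDatabase(sites)
high_energy = vdb.query(
    "SELECT run, energy FROM events WHERE energy > ? ORDER BY run", (50.0,)
)
```

The caller sees one query interface and does not need to know which server holds which rows. A real deployment would also have to reconcile SQL dialects and schemas, which this sketch ignores.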
Why Web Services?
- Web Services are centered on the service definition and messages.
- Web Services build on a set of well-established technologies and protocols:
  - XML is used for service description and data interchange
  - HTTP is used as the transport protocol; it is widely deployed, with trusted security features
- Web Services standards are structured and extensible:
  - interfaces can evolve without breaking what is already working
- They provide a solution for accessing heterogeneous, web-wide resources.
Building blocks Of Web Services
- Web Services are modular software components wrapped inside a specific set of Internet communication protocols, and they can be run over the Internet.
- At the heart of the web services architecture is the need for program-to-program communication.
- Key roles in the web services architecture are:
  - a service provider
  - a service registry
  - a service requestor
Building blocks Of Web Services (cont’d)
- Together they perform three operations on web services: Publish, Find and Bind
  - 1 Publish (SERVICE PROVIDER to SERVICE REGISTRY): make the service description publicly available
  - 2 Find (SERVICE REQUESTOR to SERVICE REGISTRY): discover the service
  - 3 Bind (SERVICE REQUESTOR to SERVICE PROVIDER): allows the service to be used by the requestor
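The Publish, Find and Bind operations can be sketched in a few lines of Python. This is only an in-memory illustration: `ServiceRegistry`, `run_query` and the service name `OracleQuery` are hypothetical, and a plain function stands in for a remote service endpoint.

```python
class ServiceRegistry:
    """Hypothetical in-memory registry playing the service registry role."""

    def __init__(self):
        self._entries = {}

    def publish(self, name, description, endpoint):
        # 1 Publish: the provider makes its service description available.
        self._entries[name] = {"description": description, "endpoint": endpoint}

    def find(self, keyword):
        # 2 Find: the requestor discovers services by keyword.
        keyword = keyword.lower()
        return sorted(
            name
            for name, entry in self._entries.items()
            if keyword in entry["description"].lower()
        )

    def bind(self, name):
        # 3 Bind: the requestor obtains something it can invoke.
        return self._entries[name]["endpoint"]


def run_query(sql):
    """Stand-in for the provider's implementation of the service."""
    return f"executed: {sql}"


registry = ServiceRegistry()
registry.publish("OracleQuery", "Query service for the Oracle master database", run_query)

found = registry.find("oracle")
service = registry.bind(found[0])
reply = service("SELECT 1")
```

In a real deployment the bind step would return the access point recorded in the registry, and the call would travel as a SOAP message rather than a local function call.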
Vital Parts of Web Services
- SOAP (Simple Object Access Protocol) is the protocol through which the service provider, service registry and service requestor communicate.
- WSDL (Web Services Description Language) is the language used to create the service description.
- UDDI (Universal Description, Discovery and Integration) is the directory technology used by service registries; a registry contains the descriptions of web services and allows the directory to be searched for a particular web service.
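To make the SOAP part concrete, here is a minimal sketch that builds and parses a SOAP 1.1 request envelope with Python's standard library. The operation name `executeQuery`, the namespace `urn:example:dhrd` and the `sql` parameter are assumptions made for illustration; they are not taken from the prototype's WSDL.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_request(operation, service_ns, params):
    """Wrap one operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_soap_request(xml_text):
    """Recover the operation name and parameters on the receiving side."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]
    operation = call.tag.split("}", 1)[1]
    return operation, {child.tag: child.text for child in call}


request_xml = build_soap_request(
    "executeQuery", "urn:example:dhrd", {"sql": "SELECT COUNT(*) FROM events"}
)
operation, params = parse_soap_request(request_xml)
```

A toolkit such as JAX-RPC generates this plumbing from the WSDL description, so application code normally never handles the envelope directly.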
How DHRD Could Be Integrated with The Grid As A Web Service
- The Distributed Heterogeneous Relational Databases can register themselves as web services in a UDDI registry.
- A client can access these web services through a web application by using their WSDL descriptions.
- The client is very important in this architecture because it can dynamically discover services and configure the remote calls on the basis of the inputs it receives from the HTTP call.
Grid Services
- The OGSA integrates key Grid technologies (including the Globus Toolkit) with Web Services mechanisms to create a distributed system framework around the OGSI (Open Grid Services Infrastructure).
- A Grid Service is a Web Service that conforms to a set of conventions (interfaces and behavior) that define how a client interacts with services available across the Grid.
Grid Services Architecture (cont’d) (Grid Database Service specification (GGF))
[Figure: Using Grid Data Service Ports. A Requester interacts with a GridDataService through three ports:]
- GridServicePort: FindServiceData, returning <ServiceData>
- GridDataServicePort: Perform, returning <Response>
- GridDataTransportPort: Put/get, returning <Response>
Grid Services Architecture (Grid Database Service specification (GGF))
[Figure: Creating a Grid Data Service. Elements shown:]
- Requester
- GridServiceRegistry: FindServiceData, GSH (Grid Service Handle)
- GridDataServiceFactory: CreateService, <ServiceInformation>
- The factory creates a GridDataService in front of the Database Servers
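The creation sequence in this figure follows a registry-plus-factory pattern, which can be sketched as below. This is an illustrative assumption about the flow, not the GGF specification's API: the Python class and method names only mirror the labels in the figure, and the handle format is invented.

```python
import itertools


class GridDataService:
    """Stand-in for a service instance bound to one database."""

    def __init__(self, handle, database):
        self.handle = handle
        self.database = database

    def perform(self, statement):
        # A real service would run the statement against its database.
        return f"{self.database}: {statement}"


class GridDataServiceFactory:
    """Creates service instances on request and hands out their handles."""

    def __init__(self, database):
        self.database = database
        self._counter = itertools.count(1)

    def create_service(self):
        # Hypothetical handle format: "<database>/<sequence number>".
        handle = f"{self.database}/{next(self._counter)}"
        return GridDataService(handle, self.database)


class GridServiceRegistry:
    """Lets a requester locate the factory for a given database."""

    def __init__(self):
        self._factories = {}

    def register(self, factory):
        self._factories[factory.database] = factory

    def find_service_data(self, database):
        return self._factories[database]


registry = GridServiceRegistry()
registry.register(GridDataServiceFactory("oracle_master"))

factory = registry.find_service_data("oracle_master")
service = factory.create_service()
answer = service.perform("SELECT 1")
```

The requester never contacts the database server directly; it only holds the handle of the service instance the factory created for it.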
Grid Services Client Infrastructure (Grid Database Service specification (GGF))
[Figure: A client-side runtime architecture. Elements shown:]
- Client Application, using a client-server interface
- Proxy, with binding selection
- Protocol 1 (binding) specific stub
- Protocol 2 (binding) specific stub
- Invocation of the Web Service
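One way to read this figure is that the client application talks to a single proxy, and the proxy selects a protocol-specific stub at call time. The sketch below illustrates that reading; the stub classes, the protocol names and the preference order are all hypothetical.

```python
class SoapStub:
    """Hypothetical stub for a SOAP-over-HTTP binding."""

    protocol = "soap"

    def invoke(self, operation, argument):
        return f"soap call {operation}({argument})"


class LocalStub:
    """Hypothetical stub for an in-process binding."""

    protocol = "local"

    def invoke(self, operation, argument):
        return f"local call {operation}({argument})"


class ServiceProxy:
    """Presents one interface to the client and picks a binding per call."""

    def __init__(self, stubs):
        self._stubs = {stub.protocol: stub for stub in stubs}

    def select_binding(self, preferred):
        # Return the first available stub in the caller's order of preference.
        for protocol in preferred:
            if protocol in self._stubs:
                return self._stubs[protocol]
        raise LookupError(f"no binding available among {preferred}")

    def invoke(self, operation, argument, preferred=("local", "soap")):
        return self.select_binding(preferred).invoke(operation, argument)


remote_only = ServiceProxy([SoapStub()])
both = ServiceProxy([SoapStub(), LocalStub()])

remote_result = remote_only.invoke("query", "x")
local_result = both.invoke("query", "x")
```

The client code is identical in both cases; only the set of stubs known to the proxy changes which binding carries the call.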
Proposed Web Services Architecture Based on Grid Services To Use DHRD In Grid Environment
[Figure: proposed architecture. Elements shown:]
- Service Registry: UDDI Registry Server; MonALISA
- Service Provider: Web Server (HTTP Server, SOAP Processor, WSDL file); Java XML API to connect with the Database Server
- Distributed database: Server with Master Database; Server with Materialized View Database; two ORACLE9i servers and one MS-SQL server, each holding data (metadata); data replication through SSL
- Service Requestor: Client Web Application to connect with the database
- Interactions: UDDI; SOAP request and response; SOAP bind with the provided service
Technologies Employed
- Java Web Services Developer Pack 1.0 (JWSDP)
- Apache Tomcat 4.1.2 for Java Web Services Developer Pack 1.0
  - Apache web server
  - Tomcat servlet engine
- Java API for XML Registries (JAXR) 1.0_02
- Java API for XML-based RPC (JAX-RPC) 1.0_01
- Web Application Deployment Tool for JWSDP
- xrpcc tool to generate WSDL
- JWSDP Registry Server 1.0_02
  - Xindice database, the repository for registry data
  - implements Version 2.00 of Universal Description, Discovery and Integration (UDDI)
UDDI Compliant Service Registry
- A standardized, transparent mechanism for describing the service
- A simple mechanism for invoking the service
- An accessible central registry of services
- Makes use of XML and SOAP
- Provides a service discovery platform on the WWW
- Suitable for a “Black Box” web environment
- Allows as much detail about a service and its implementation to be stored as desired
- The UDDI version 2.0 API defines approx. 40 messages to perform inquiry and publishing functions against any UDDI compliant service registry
- The schema defines 25 requests and 15 responses
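As an example of one of the inquiry messages, the sketch below builds a UDDI version 2 `find_business` request with Python's standard library. The business name `Caltech` is an arbitrary search value chosen for illustration. In practice the message would be placed in a SOAP body and posted to the registry's inquiry URL, which is not shown here.

```python
import xml.etree.ElementTree as ET

# Namespace of the UDDI version 2 API schema.
UDDI_V2 = "urn:uddi-org:api_v2"


def build_find_business(name):
    """Build a find_business inquiry that searches businesses by name."""
    find = ET.Element(f"{{{UDDI_V2}}}find_business", {"generic": "2.0"})
    ET.SubElement(find, f"{{{UDDI_V2}}}name").text = name
    return ET.tostring(find, encoding="unicode")


message = build_find_business("Caltech")
parsed = ET.fromstring(message)
```

Client libraries such as JAXR hide these messages behind query objects, so the XML is normally generated rather than written by hand.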
Working of Web Services Prototype
[Figure: message flow of the prototype, numbered steps 1 to 10. Elements shown:]
- Web Service Requester: program interface, stubs, JAX-RPC runtime
- JAXR Registry Server: find-service
- Web server: JAX-RPC runtime, ties, program implementation
- Database Server
- SOAP messages carried over HTTP
Working of Web Services Prototype
Working of Web Services Prototype DEMO
Conclusion
- It appears possible to make the Distributed Heterogeneous Relational Data Warehouse Databases available across the Grid in the form of Web Services/Grid Services.
Future Work
- Integrate MonALISA (a Grid monitoring tool) to locate the required web service with optimal network resources
- Exploit the full functionality of UDDI
- Provide an API to integrate this Grid Services based Web Services prototype into the Globus Toolkit
Questions?