San Diego Supercomputer Center, National Partnership for Advanced Computational Infrastructure: Data Grid Services and Pipelines

Presentation transcript:

San Diego Supercomputer Center, National Partnership for Advanced Computational Infrastructure, University of Florida. Data Grid Services and Pipelines. Arun Jagatheesan, Architect & Technical Lead, SDSC Matrix. NPACI Summer Computing Institute, August 18, 2003, San Diego.

2. Credit / Acknowledgements. Participants: Allen Ding, Lucas Gilbert, Reena Mathew, Erik Vandiekieft (IBM), Xi Cynthia Sheng. Well-wishers: Reagan Moore & SRB Team, Kim Baldridge, YOU! Sponsors: NSF GriPhyN, NSF SCEC, NPACI REU, NIH BIRN.

3. Lecture Outline. Concepts: Distributed Data Management; Process Flow Pipelines; Web Services; Grid Services. Theory: Data Grid Language (DGL). Practice (hands-on): SDSC Matrix Web Demo; Matrix Java API.

4. Grid as Utility Computing

5. Logical Layers (bits, data, information, ...). Storage Resource Transparency; Storage Location Transparency; E:\srbVault\image.jpg; /users/srbVault/image.jpg; Select … from srb.mdas.td where...; Data Identifier Transparency; image_0.jpg … image_100.jpg; Data Replica Transparency; image.sql, image.cgi, image.wsdl; Virtual Data Transparency; Semantic Data Organization (with behavior); patientRecordsCollection, myActiveNeuroCollection; Inter-organizational Information Storage Management.

6. Is that all? We need more. Hey, who is this guy?

7. Data Discovery. Digital entities; Meta-data; Services; State. New data updates the relationships among data in collections. Services are invoked to analyze the new relationships. DGMS applications get notified of state updates.

8. Distributed Data Management. Data collecting: sensor systems, object ring buffers. Data organization: collections, manage data context. Data sharing: data grids, manage heterogeneity. Data publication: digital libraries, support discovery. Data preservation: persistent archives, manage technology evolution. Data analysis: processing pipelines, choreograph data and knowledge extraction. Data mediation: semantic data, mappings between data, information, knowledge. Services, data flow pipeline management.

9. Data process-flow pipelines. Diagram labels: Compute, Archive, Digital Library, Research, Input. Coordinated execution amongst flows.

10. Web Services. Web page (HTML): searched and used by human beings; works on any computer; useful for dissemination of information on any topic. HTML describes data layout, HTTP transports data, Google discovers data. Web service: searched and used by computer programs; works with any programming language, OS, etc.; useful for dissemination of services for any topic. XML/WSDL describes the web service, SOAP (over HTTP/SMTP) provides transport and access, UDDI provides discovery.
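To make the "SOAP for transport/access" role on slide 10 concrete, here is a minimal sketch that wraps an operation call in a SOAP 1.1 envelope and reads it back, using only the Python standard library. The service namespace `urn:example:datagrid` and the operation name in the usage note are assumptions made for illustration; they are not taken from the talk or from any real service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:datagrid"  # hypothetical service namespace (assumption)


def build_soap_request(operation: str, params: dict) -> str:
    """Wrap one operation call in a SOAP 1.1 envelope and return it as XML text."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_soap_request(xml_text: str):
    """Recover the operation name and its parameters from the envelope."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_ENV}}}Body")[0]  # first child of Body

    def local(tag: str) -> str:
        # Drop the "{namespace}" prefix that ElementTree puts on tags.
        return tag.split("}", 1)[1]

    return local(call.tag), {local(child.tag): child.text for child in call}
```

For example, `build_soap_request("createCollection", {"name": "demo"})` produces an envelope that any program, in any language, can parse back into the operation name and its arguments. That language and OS independence is the point the slide makes about web services.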

11. Lecture Outline. Concepts: Distributed Data Management; Process Flow Pipelines; Web Services; Grid Services. Theory: Data Grid Language (DGL). Practice (hands-on): SDSC Matrix Web Demo; Matrix Java API.

12. Need for Standard DGL. Diagram: a database (DBMS) is accessed through SQL (DDL, DML, DQL); a DGMS is accessed through DGL (XML based, invoke operations, subset of XQuery, process flow). Other diagram labels: 121.Event, Hits.sql, University of Gators, 121.Event, Thit.xml, National Lab.

13. Data Grid Language. XML-based asynchronous protocol. Describes data sets, collections, datagrid operations, ... Accesses and manages data grids and data-flow pipelines. Queries on data resources (based on W3C XQuery). Facilitates Grid workflow: sharing of granular state information about the execution of each datagrid operation amongst different processes or services.

14. Data Grid Request (DReq). Asynchronous requests for data/process-flow in datagrids. A request is either a Transaction or a Status Query. Each Transaction consists of one or more Flows. Each Flow consists of one or more datagrid operations. A datagrid operation is a data transformation or a data query. A flow can be executed sequentially or in parallel.
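The containment rules on slide 14 can be sketched as a small data model. This is an illustration under assumptions: the class and field names below are invented for the sketch and are not the DGL schema or the Matrix API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Union


class FlowLogic(Enum):
    SEQUENTIAL = "sequential"
    PARALLEL = "parallel"


@dataclass
class Operation:
    """One datagrid operation: a data transformation or a data query."""
    name: str
    args: dict = field(default_factory=dict)


@dataclass
class Flow:
    """One or more operations, run sequentially or in parallel."""
    logic: FlowLogic
    operations: List[Operation]

    def __post_init__(self):
        if not self.operations:
            raise ValueError("a flow needs at least one datagrid operation")


@dataclass
class Transaction:
    """One or more flows."""
    flows: List[Flow]

    def __post_init__(self):
        if not self.flows:
            raise ValueError("a transaction needs at least one flow")


@dataclass
class StatusQuery:
    """Asks about a previously submitted transaction."""
    transaction_id: str


@dataclass
class DataGridRequest:
    """A request is either a Transaction or a StatusQuery, never both."""
    body: Union[Transaction, StatusQuery]

    def is_transaction(self) -> bool:
        return isinstance(self.body, Transaction)
```

A request holding one sequential flow with two operations would be `DataGridRequest(Transaction([Flow(FlowLogic.SEQUENTIAL, [Operation("createCollection"), Operation("createContainer")])]))`. The constructors reject empty flows and empty transactions, which encodes the "one or more" wording of the slide.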

15. Data Grid Request

16. Data Grid Response. A response is either a Transaction Acknowledgement or a Status Response. A Status Response contains the results of a Transaction. A response can be received at any granular level. The Status Response is used for coordination of flows and for inter-process notifications.
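One way to picture "a response can be received at any granular level" is a status tree in which the transaction, each flow, and each step can be queried separately. The sketch below is an assumption about how such a tree might look; the state names and the slash-separated path syntax are invented for illustration and are not part of DGL.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class StatusNode:
    """Status of a transaction, a flow, or a single step (hypothetical model)."""
    name: str
    state: str = "pending"  # one of: pending, running, done, failed
    children: Dict[str, "StatusNode"] = field(default_factory=dict)

    def add(self, child: "StatusNode") -> "StatusNode":
        self.children[child.name] = child
        return child

    def rollup(self) -> str:
        """Derive this node's state from its children, if it has any."""
        if not self.children:
            return self.state
        states = [child.rollup() for child in self.children.values()]
        if "failed" in states:
            return "failed"
        if all(s == "done" for s in states):
            return "done"
        if all(s == "pending" for s in states):
            return "pending"
        return "running"

    def lookup(self, path: str) -> Optional["StatusNode"]:
        """Find a node by a slash-separated path such as 'flow0/step1'."""
        node = self
        for part in filter(None, path.split("/")):
            node = node.children.get(part)
            if node is None:
                return None
        return node
```

A coordinating process that only cares about one step can call `lookup("flow0/step1")`, while a monitoring tool can call `rollup()` on the root to get one overall state for the whole transaction.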

17. Data Grid Response (DRes)

18. Lecture Outline. Concepts: Distributed Data Management; Process Flow Pipelines; Web Services; Grid Services. Theory: Data Grid Language (DGL). Practice (hands-on): SDSC Matrix Web Demo; Matrix Java API.

19. "Let's play Who Wants to Be a Coder". Now it's your turn to take the red pill from Matrix. It gets interesting from here; let us all do some coding.

20. SDSC Matrix Architecture. Components shown in the diagram: Matrix Agent Abstraction; In Memory Store; JDBC; SRB Agents; OGSA Agent; WSDL Agent; Persistence (Store) Abstraction; Termination Handler; Matrix Data Grid Request Processor; Transaction Handler; Status Query Handler; Data flow pipeline Meta data Manager; JMS Messaging System; JAXM Wrapper; OGSA; RPC-Style for SOAP; SOAP Service Wrapper Abstraction; Flow Handler and Execution Manager; Pipeline Query Processor; XQuery Processor; Event Publish Subscribe, Notification.

21. Lesson 1: Data Grid Request. Create a Data Grid Request and its components.

22. Learn it yourself, Task 1: Create Flow(0) in a Data Grid Request [DGREQ]. Create a simple Data Grid Request using the Web Demo. Click Add Flow and make it Sequential. Add Step: Create Collection, and fill in Collection Name. Click on Flow0 again to add one more step to Flow0: Create Container, and fill in Container Name. Click on the DGRequest link to see Flow0 with 2 steps.

23. Learn it yourself, Task 2: Create Flow(1) in a Data Grid Request [DGREQ]. Click on the DGRequest link to see Flow0 with 2 steps. Click on Add Flow and make it of type parallel. Add Step: Rename Collection, giving the old collection name and the new name. Click on the Flow1 link to add one more step to Flow1: Create Collection, and fill in Collection Name. Click on the DGRequest link to see 2 flows with 2 steps each.

24. Learn it yourself, Task 3: Add Doc Meta for [DGREQ]. Click on DOCMETA. Fill in your name (optional). Press >> to save the details. Doc Meta is just for reference. The Author is the process that created the request; the Author could have created the request on behalf of another user.

25. Learn it yourself, Task 4: Add USERINFO for [DGREQ]. Click on USERINFO. Fill in User ID, Organization, Challenge Response, Home Directory (</home/du22.npaci) and Storage Resource. Press >> to save.

26. Learn it yourself, Task 5: Add VOINFO for [DGREQ]. Click on VOINFO. Fill in Server and Port. Click >> to save this in our demo. VO Info stands for Virtual Organization Information.

27. Learn it yourself, Task 6: Send Data Grid Request. First check that all components are ready. We have just learnt the components of a DReq; each must show [Y] in the demo, indicating that it is ready. Click Send. If all the components are OK, the Data Grid Request is shown in XML. Click Send DGReq.
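The web demo ends by showing the assembled request as XML. The sketch below builds a document of roughly that shape from the components collected in the tasks (Doc Meta, User Info, VO Info, flows and steps). Every element and attribute name here is made up for illustration; the real DGL schema is not reproduced in this transcript, so treat this as a stand-in rather than valid DGL.

```python
import xml.etree.ElementTree as ET


def build_request_xml(author, user_id, server, port, flows):
    """Assemble an illustrative request document.

    `flows` is a list of (flow_type, [(step_name, {arg: value}), ...]).
    All element and attribute names are hypothetical, not the DGL schema.
    """
    root = ET.Element("DataGridRequest")
    ET.SubElement(root, "DocMeta", author=author)
    ET.SubElement(root, "UserInfo", userId=user_id)
    ET.SubElement(root, "VOInfo", server=server, port=str(port))
    transaction = ET.SubElement(root, "Transaction")
    for index, (flow_type, steps) in enumerate(flows):
        flow = ET.SubElement(transaction, "Flow", id=f"Flow{index}", type=flow_type)
        for step_name, args in steps:
            step = ET.SubElement(flow, "Step", operation=step_name)
            for key, value in args.items():
                ET.SubElement(step, "Arg", name=key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Mirrors the hands-on tasks: Flow0 is sequential, Flow1 is parallel,
# and each flow has two steps. Names and values are placeholders.
example = build_request_xml(
    author="demo-author",
    user_id="demo-user",
    server="localhost",
    port=8080,
    flows=[
        ("sequential", [("createCollection", {"name": "c1"}),
                        ("createContainer", {"name": "k1"})]),
        ("parallel", [("renameCollection", {"old": "c1", "new": "c2"}),
                      ("createCollection", {"name": "c3"})]),
    ],
)
```

Keeping the whole request in one document is what lets the demo check that every component is present before anything is sent.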

28. Lesson 2: Data Grid Acknowledgement, Status. Get the Data Grid Acknowledgement, send a Status Request, receive the Status Response.

29. Data Grid Acknowledgement. Data Grid Requests are answered asynchronously. The Data Grid Acknowledgement carries a Transaction ID used to get the status and result of the DGReq. All valid requests are answered with this acknowledgement before they are processed. Clients use this Acknowledgement Transaction ID. The ID may be passed to third parties, which can subscribe to these events (Grid Process Pipelines).

30. Data Grid Status Request and Response. The Transaction ID is used to find the status. Later versions can use publish/subscribe. Third-party subscription is also possible.
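The acknowledge-then-poll pattern from slides 29 and 30 can be sketched with an in-memory stand-in for the server. `FakeDataGridServer` and `wait_for` are hypothetical names introduced for this example; they are not the Matrix server or its Java API, and real steps would be datagrid operations rather than plain callables.

```python
import threading
import time
import uuid


class FakeDataGridServer:
    """In-memory stand-in for a request processor (not the Matrix server)."""

    def __init__(self):
        self._status = {}
        self._lock = threading.Lock()

    def submit(self, steps):
        """Accept a request, return a transaction ID at once, work in the background."""
        tx_id = str(uuid.uuid4())
        with self._lock:
            self._status[tx_id] = {"state": "running", "completed": 0, "total": len(steps)}
        threading.Thread(target=self._run, args=(tx_id, steps), daemon=True).start()
        return tx_id

    def _run(self, tx_id, steps):
        for step in steps:
            step()  # each step is a plain callable in this sketch
            with self._lock:
                self._status[tx_id]["completed"] += 1
        with self._lock:
            self._status[tx_id]["state"] = "done"

    def status(self, tx_id):
        """Answer a status query with a snapshot for the given transaction ID."""
        with self._lock:
            return dict(self._status[tx_id])


def wait_for(server, tx_id, poll_interval=0.01, timeout=5.0):
    """Poll with the transaction ID until the transaction finishes or times out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        snapshot = server.status(tx_id)
        if snapshot["state"] == "done":
            return snapshot
        time.sleep(poll_interval)
    raise TimeoutError(f"transaction {tx_id} did not finish in {timeout}s")
```

Because the transaction ID is just a string, the submitting client can hand it to a third party, which can then call `wait_for` itself. A publish/subscribe design, as the slide anticipates for later versions, would replace the polling loop with a callback.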

31. Lesson 3: Query Data. XQuery.

32. XQuery. W3C's long-awaited answer: the next SQL? As always, SDSC and our group lead the way. A subset of XQuery on the Data Grid has been implemented. We built our own XQuery parser. Demo: CDL (in-house project for the NPACI Chemistry Digital Library).

33. Lesson 4: Java API. Java API for Matrix.

34. Demo Java Program. Remember, it's for programmatic exchange of state information for coordinated execution of data flow pipelines. Java API sample program: just download this zip file, unzip the file, type rundemo.bat, then type rundemoquery.bat.

35. Summary. Coordinated execution of process-flow pipelines in a Grid environment is necessary. Data Grid Language is to a Data Grid what SQL is to databases. SDSC Matrix: process-flow pipelines; dynamic control of SRB and other services; discovery of process based on the data. Check out our latest release. Imagine what we can do for your project.