Overview of Grids, Services Oriented Architectures and Science Portals. Sriram Krishnan

Presentation transcript:

Overview of Grids, Services Oriented Architectures and Science Portals
Sriram Krishnan, PhD

Outline
What is Grid Computing?
What are Services Oriented Architectures?
What are Science Portals?
How are all the pieces tied together?
Some case studies

Cluster Computing
Independent computers combined into a unified system through software and networking
Typical Setup
  –Collection of commodity computers (PCs)
  –Using a commodity network (Ethernet)
  –Typically running an open-source operating system (Linux)
Interconnect
  –Gigabit Ethernet (commodity): high latency, cheap
  –Myrinet, InfiniBand, … (non-commodity): low latency, OS-bypass, expensive
  –Programming model is Message Passing
History
  –Network Of Workstations (NOW) pioneered the vision for clusters of commodity processors
  –Beowulf popularized the notion and made it very affordable
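The message-passing model named on this slide can be illustrated with a small Python sketch. It is only a toy under stated assumptions: threads and in-memory queues stand in for cluster nodes and the interconnect, and the names `worker` and `parallel_sum` are hypothetical. A real cluster would typically use an MPI implementation instead.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Hypothetical 'node': receive one chunk, send back (rank, partial sum)."""
    chunk = inbox.get()              # blocking receive
    outbox.put((rank, sum(chunk)))   # send the partial result


def parallel_sum(data, n_workers=4):
    """Scatter data to workers, then gather and combine their partial sums."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()
    # Scatter: each worker gets every n_workers-th element.
    for rank in range(n_workers):
        inboxes[rank].put(data[rank::n_workers])
    # Gather: one message per worker, in whatever order they arrive.
    partials = [results.get() for _ in range(n_workers)]
    for t in threads:
        t.join()
    return sum(value for _, value in partials)
```

The workers share no state; all coordination happens through explicit send and receive operations, which is the property that lets the same program structure span separate machines.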

[Figure: High Performance Computing Cluster. Labels: Front-end Node, Public Ethernet, Private Ethernet Network, Application Network (Optional), Node, Power Distribution (Net addressable units as option). © 2008 UC Regents]

Clusters now Dominate High-End Computing

Grid Computing
“Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.” [Foster, Kesselman, Tuecke]
  –Coordinated - multiple resources working in concert, e.g. disk & CPU, or instruments & database, etc.
  –Resources - compute cycles, databases, files, application services, instruments.
  –Problem solving - focus on solving scientific problems
  –Dynamic - environments that are changing in unpredictable ways
  –Virtual Organization - resources spanning multiple organizations and administrative domains, security domains, and technical domains

Other Terms
Cyberinfrastructures
  –Encompasses advanced scientific computing, as well as a more comprehensive infrastructure for research and education based upon distributed, federated networks of computers, information resources, on-line instruments, and human interfaces (Atkins Report, 2003)
eScience
  –Computationally intensive science that is carried out in highly distributed network environments (e.g. in the context of the U.K. eScience program)

Grids are not the same as Clusters!
Ian Foster’s 3 point checklist
  –Resources not subject to centralized control
  –Use of standard, open, general-purpose protocols and interfaces
  –Delivery of non-trivial qualities of service
Grids are typically made up of multiple clusters

Popular Misconception
Misconception: Grids are all about CPU cycles
  –CPU cycles are just one aspect, others are:
      Data: For publishing and accessing large collections of data, e.g. Geosciences Network (GEON) Grid
      Collaboration: For sharing access to instruments (e.g. UCSD TeleScience Grid), and collaboration tools (e.g. Global MMCS at IU)

How do you build a “Grid”? Start with raw hardware, Add storage and networks, Mix in scientific datasets, Build collaboratory and visualization tools How do you manage, provision, schedule, authenticate, monitor, program, and access these resources?

SETI@home
Uses 1000s of Internet-connected PCs to help in the search for extraterrestrial intelligence
When the computer is idle, the software downloads a chunk of about 1/2 MB of data for analysis.
Results of the analysis are sent back to the SETI team and combined with those of 1000s of other participants
Largest distributed computation project in existence
  –Total CPU time: years
  –Users:
Statistics from 2006
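The pattern this slide describes (split the data into small chunks, let participants analyze them independently, combine the results centrally) can be sketched in Python. Everything here is an assumption for illustration: the function names are hypothetical, a thread pool stands in for the volunteer PCs, and the "analysis" is a toy that reports the largest byte value in a chunk. It assumes non-empty input data.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split_into_work_units(data: bytes, unit_size: int) -> List[bytes]:
    """Cut the raw data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + unit_size] for i in range(0, len(data), unit_size)]


def analyze_unit(unit: bytes) -> int:
    """Toy stand-in for the real analysis: strongest 'signal' in the chunk."""
    return max(unit)


def run_project(data: bytes, unit_size: int, volunteers: int = 4) -> int:
    """Hand work units to volunteers and combine the returned results."""
    units = split_into_work_units(data, unit_size)
    with ThreadPoolExecutor(max_workers=volunteers) as pool:
        results = list(pool.map(analyze_unit, units))
    return max(results)
```

Because each work unit is analyzed independently, participants never need to communicate with one another, which is what makes loosely connected home PCs usable for this kind of computation.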

NCMIR TeleScience Grid
An ability to dynamically link resources together as an ensemble to support the execution of large-scale, resource-intensive, and distributed applications
[Figure: “Telescience Grid”. Labels: Imaging Instruments, Computational Resources, Large-Scale Databases, Data Acquisition, Analysis, Advanced Visualization]

TeraGrid TeraGrid is a “top-down”, planned Grid PSC Extensible Terascale Facility Members: IU, ORNL, NCSA, PSC, Purdue, SDSC, TACC, ANL, NCAR 280 Tflops of computing capability 30 PB of distributed storage High performance networking between partner sites Linux-based software environment, uniform administration Focus is a national, production Grid

PRAGMA Grid Member Institutions 31 institutions in 15 countries/regions (+ 7 in preparation) UZurich Switzerland NECTEC ThaiGrid Thailand UoHyd India MIMOS USM Malaysia CUHK HongKong ASGC NCHC Taiwan HCMUT IOIT-HCM Vietnam AIST OsakaU UTsukuba TITech Japan BII IHPC NGO NTU Singapore MU Australia APAC QUT Australia KISTI Korea JLU China SDSC USA CICESE Mexico UNAM Mexico UCN Chile UChile Chile UUtah USA NCSA USA BU USA ITCR Costa Rica BESTGrid New Zealand CNIC GUCAS China LZU China UPRM Puerto Rico

Usability Issues
Access to Grid resources is still very complicated
  –User account creation
  –Management of credentials (identities)
  –Installation and deployment of scientific software
  –Interaction with Grid schedulers
  –Data management

Technical Challenges
Security
  –Grids traverse organizational boundaries
      Different administration domains have different authentication mechanisms
      Resources have different use agreements and sharing priorities
  –Need to provide Single Sign-On (SSO), Authentication, Authorization
Resource Management
  –Resources loosely-coupled
      Higher network latencies
      Planned and unplanned disruptions
  –Requirements
      Seamless access to Grid resources
      QoS guarantees for jobs
      Scheduling/co-scheduling of resources
      Failure management

Technical Challenges
Data Management
  –Data Transfer
      GridFTP: High-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks
  –Managing large-scale scientific data across different sites
      Storage Resource Broker (SRB): Shared collections that can be distributed across multiple organizations and heterogeneous storage systems

Technical Challenges
Interoperability
  –In the past, different projects used different protocols and APIs
      Legion, Condor, Globus, SGE, etc.
  –Need to use standard, open mechanisms
      Current thrust towards the use of Service Oriented Architectures and Web service technologies for interoperability

Service Oriented Architectures (SOAs)
“SOA represents a model in which functionality is decomposed into small, distinct units (services), which can be distributed over a network and can be combined together and reused to create applications” - Erl, Thomas (2005). Service-Oriented Architecture: Concepts, Technology, and Design.

Benefits of SOAs
Reduce complexity by encapsulating the back-end implementation
  –Service interfaces can be published and used by a number of clients
Enable interoperability across systems through the use of open standards
  –Web services (WSDL, SOAP, XML Schemas) are de facto standards
  –Lend themselves well to the creation of workflows
Support a loosely-coupled model where clients can bind to services at run-time
  –Enables greater flexibility and fault tolerance

What are Web Services?
Many different definitions are available
IBM (Gottschalk, et al): A Web service is an interface that describes a collection of operations that are network accessible through standardized XML messaging.
Microsoft (on MSDN): A programmable application logic accessible using standard Internet protocols.
Simply put, a Web service is a network service that provides a programmatic interface to remote clients

Web Services: Features
Independent of programming language and OS
All information required to contact a service is captured by the Web Service Description
  –Web Services Description Language (WSDL) provides a way to encapsulate an interface definition, data types being used, and the protocol information
Web services provide programmatic access to remote clients using standard Internet protocols
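As a rough illustration of the "standardized XML messaging" behind Web services, the Python sketch below builds and parses a SOAP 1.1 request envelope using only the standard library. The envelope namespace is the one defined by SOAP 1.1; the service namespace `http://example.org/hypothetical/app`, the operation name and the parameter names are assumptions made up for this example and do not come from the presentation.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/hypothetical/app"      # hypothetical service namespace


def build_soap_request(operation, params):
    """Serialize an operation call as a SOAP envelope (returned as a string)."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    op = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(op, f"{{{SERVICE_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_soap_request(xml_text):
    """Recover (operation name, parameter dict) from a SOAP envelope."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    op = list(body)[0]
    name = op.tag.split("}", 1)[1]
    params = {child.tag.split("}", 1)[1]: child.text for child in op}
    return name, params
```

Since both sides only agree on the XML message format, the client and the service can be written in different languages and run on different operating systems. In practice the message layout would be derived from the service's WSDL rather than hand-written.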

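Web Services Lifecycle
[Figure: the Service Provider publishes its description to the Service Registry; the Service Requestor looks up the service in the registry and then interacts with the provider. Labels: Service Registry, Service Requestor, Service Provider, Lookup, Publish, Interact]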
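The publish, lookup and interact steps of this lifecycle can be sketched with an in-memory registry in Python. This is a simplified illustration, not a real registry protocol: the class `ServiceRegistry`, the service name `sequence-search` and the function `sequence_search_service` are all hypothetical, and a plain callable stands in for a remote endpoint.

```python
from typing import Callable, Dict


class ServiceRegistry:
    """Hypothetical in-memory registry mapping service names to endpoints."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[..., object]] = {}

    def publish(self, name: str, endpoint: Callable[..., object]) -> None:
        """Called by a service provider to advertise a service."""
        self._services[name] = endpoint

    def lookup(self, name: str) -> Callable[..., object]:
        """Called by a service requestor to find a service at run-time."""
        if name not in self._services:
            raise LookupError(f"no service published under {name!r}")
        return self._services[name]


def sequence_search_service(sequence: str) -> str:
    """Hypothetical provider-side implementation."""
    return f"searched {len(sequence)} residues"


registry = ServiceRegistry()
registry.publish("sequence-search", sequence_search_service)  # Publish
service = registry.lookup("sequence-search")                  # Lookup
result = service("ACGT")                                      # Interact
```

The requestor depends only on the published name, so a provider can be replaced or relocated without changing client code, which is the loose coupling that SOAs aim for.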

Open Grid Services Architecture A standards-based distributed service system that supports the creation of sophisticated distributed services required in inter-organizational computing environments The standards are described by a set of specifications called the Web Services Resource Framework (WSRF)

Open Grid Services Architecture
The evolution of the Grid to an architecture based on prior Grid and Web service technologies
  –Open: Extensibility, Vendor-neutrality, Committed to community standardization
Use of WSDL to achieve self-describing, discoverable services & interoperable protocols
Support for reliable & secure invocation, lifetime management, notification, policy & credential management, and virtualization

Open Grid Services Architecture

From Theory to Practice

SOAs in eScience
Grid and scientific communities have been adopting SOAs over the past few years
  –Open Grid Services Architecture (OGSA)
  –Web Services Resource Framework (WSRF)
However, in general, most past efforts have focused on middleware, and not science
  –For instance, the Globus Toolkit
  –More recently, there are several efforts to build infrastructures for Services Oriented Science *
* I. Foster. “Services Oriented Science”. In the Science Magazine, 2005

Application-level Services
Traditional model: Services for middleware tools, e.g. job launch, data transfer, etc.
Current trend: “Services Oriented Science”
  –Scientific applications as first class services
  –Delegation of middleware management to the services back-end
  –End-users are presented with science-oriented, and not middleware-oriented interfaces

Enabling Multiple User Interfaces Gemstone: ADT: GridSphere: Kepler:

What is a Web Portal?
Web portals aggregate information content from diverse sources, and present them in a unified way
Traditional Model
  –Monolithic websites, all information content co-located on a central server
Current Trend
  –Information content geographically distributed, and implemented as an SOA
  –Portals provide a single point of entry, by aggregating geographically distributed resources

What is a Web Portal?
“A portal is a web based application that commonly provides personalization, single sign on, content aggregation from different sources and hosts the presentation layer of Information Systems” (JSR 168)
Grid/Science Portals build upon the familiar Web portal model, such as Yahoo or Amazon, to deliver the benefits of Grid computing to virtual communities of users, providing a single access point to Grid services and resources.

Portals: Pros & Cons
Pros
  –Single point of entry to diverse information sources
  –Ubiquitous access to applications (browser based)
  –No need to install complex software
Cons
  –Limited interaction with local desktop tools
  –Interfaces may not be rich enough for complex tasks such as visualization
  –Not very easy to make highly interactive interfaces

Portal Technology
JSR 168 Portlet API
  –Similar to Servlet API in providing reusable Web applications
  –Ratified in August 2003 by vendors including BEA, Sun, IBM, Oracle, Plumtree, etc.
GridSphere:
  –JSR 168 Compliant
  –Used by several projects at UCSD such as GEON, NEES, NBCR, CAMERA

What is a Portlet?
Unit of composition for a portal - a portal is simply an aggregation of portlets
Standardized packaging model to share applications among portal vendors
Builds off Servlet API and specification so no major surprises for existing Java portal developers
API provides useful methods for storing per user data and configuration settings
Can be used as building blocks to aggregate content from disparate information sources
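The idea that a portal is an aggregation of portlets, each with access to per-user settings, can be shown with a short conceptual sketch. It is written in Python for brevity and is not the JSR 168 Java API: the classes `Portlet`, `Portal`, `JobStatusPortlet` and `DataSearchPortlet`, and all method names, are hypothetical.

```python
from typing import Dict, List


class Portlet:
    """Hypothetical unit of composition: renders one fragment of a page."""
    title = "Untitled"

    def render(self, user: str, prefs: Dict[str, str]) -> str:
        raise NotImplementedError


class JobStatusPortlet(Portlet):
    title = "Job Status"

    def render(self, user, prefs):
        return f"{user} has no running jobs"


class DataSearchPortlet(Portlet):
    title = "Data Search"

    def render(self, user, prefs):
        # Per-user configuration is supplied by the portal container.
        return f"Default collection: {prefs.get('collection', 'all')}"


class Portal:
    """Hypothetical container: aggregates portlet fragments into one page."""

    def __init__(self) -> None:
        self._portlets: List[Portlet] = []
        self._prefs: Dict[str, Dict[str, str]] = {}

    def add_portlet(self, portlet: Portlet) -> None:
        self._portlets.append(portlet)

    def set_pref(self, user: str, key: str, value: str) -> None:
        self._prefs.setdefault(user, {})[key] = value

    def render_page(self, user: str) -> str:
        prefs = self._prefs.get(user, {})
        sections = [f"[{p.title}] {p.render(user, prefs)}" for p in self._portlets]
        return "\n".join(sections)
```

Each portlet can draw on a different back-end information source, while the portal supplies the common concerns (layout, user identity, stored preferences).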

Putting it all together: NEES Architecture

Case Study: The NBCR SOA
Transparent access to distributed resources by grid-enabling biomedical codes and biological and biomedical databases
  –Researchers should be able to harness the computational and data resources without having to worry about the complexity of the back-end infrastructure
Enable integration of applications across different scales (e.g. atomic to macro-molecular, to cellular and tissue, and so on)
  –With the help of commodity workflow tools and Problem Solving Environments (PSEs)

Approach
Scientific applications wrapped as Web services
  –Provision of a SOAP API for programmatic access
Clients interact with application Web services, instead of Grid resources
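A minimal sketch of the wrapping approach, assuming a command-line application and a small job-oriented interface, might look as follows in Python. The class `ApplicationService` and its methods `launch_job`, `query_status` and `get_output` are hypothetical names chosen for this illustration; the sketch runs the application as a local subprocess, whereas a real deployment would expose the interface over SOAP and hand jobs to a scheduler.

```python
import subprocess
import uuid
from typing import Dict, List


class ApplicationService:
    """Hypothetical wrapper exposing a command-line application as a service."""

    def __init__(self, command: List[str]) -> None:
        self._command = command                      # installed once, server side
        self._jobs: Dict[str, subprocess.Popen] = {}
        self._outputs: Dict[str, str] = {}

    def launch_job(self, args: List[str]) -> str:
        """Start the application and return an opaque job identifier."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = subprocess.Popen(
            self._command + args, stdout=subprocess.PIPE, text=True
        )
        return job_id

    def query_status(self, job_id: str) -> str:
        """Report job state without exposing how or where the job runs."""
        code = self._jobs[job_id].poll()
        if code is None:
            return "RUNNING"
        return "DONE" if code == 0 else "FAILED"

    def get_output(self, job_id: str) -> str:
        """Wait for the job if needed and return its standard output."""
        if job_id not in self._outputs:
            out, _ = self._jobs[job_id].communicate()
            self._outputs[job_id] = out
        return self._outputs[job_id]
```

Clients see only job identifiers, statuses and outputs; accounts, schedulers and the application installation stay behind the service boundary.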

NBCR SOA: Big Picture
[Figure: NBCR SOA architecture. Labels: Web Portals, ADT, Kepler, Continuity; Application Services; State Mgmt; Security Services (GAMA); Globus, DRMAA, Globus; Condor pool, SGE Cluster, PBS Cluster]

Scientific SOA: Benefits
Applications are installed once, and used by all authorized users
  –No need to create accounts for all Grid users
  –Use of standards-based Grid security mechanisms
Users are shielded from the complexities of Grid schedulers
Data management for multiple concurrent job runs performed automatically by the Web service
State management and persistence for long running jobs
Accessibility via a multitude of clients

Web Portal based Access

Scientific Workflows
Need for automation of scientific processes
  –An end-to-end application is typically more than a single application run
Must be reproducible and maintainable
Should be easy to compose from individual components
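The requirements on this slide (composition from individual components, reproducibility) can be illustrated with a very small workflow runner in Python. It is a sketch under assumptions, not how tools such as Kepler or Vision work internally: the function `run_workflow`, the step names and the toy steps are all hypothetical.

```python
from typing import Any, Callable, List, Tuple

Step = Tuple[str, Callable[[Any], Any]]


def run_workflow(steps: List[Step], initial: Any):
    """Run named steps in order, feeding each output into the next step.

    Returns the final value and a provenance log of (step name, output),
    which is what makes a run easy to inspect and repeat.
    """
    value = initial
    provenance = []
    for name, func in steps:
        value = func(value)
        provenance.append((name, value))
    return value, provenance


# Hypothetical three-step pipeline composed from independent components.
steps = [
    ("parse", lambda text: [float(x) for x in text.split()]),
    ("normalize", lambda xs: [x / max(xs) for x in xs]),
    ("summarize", lambda xs: round(sum(xs) / len(xs), 3)),
]
result, provenance = run_workflow(steps, "2 4 8")
```

Because the workflow is plain data (an ordered list of named steps), it can be stored, shared and rerun, and individual steps can be swapped for calls to remote application services.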

Molecular Visualization Using the Vision Workflow Toolkit

Bioinformatics Workflows Using Kepler

Conclusions
Grid computing provides coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations
Service Oriented Architectures (SOAs) provide a model in which functionality is decomposed into small, distinct services, which can be distributed over a network and can be combined together and reused to create applications
  –Grid computing and eScience are moving towards SOAs
Web portals aggregate information content from diverse sources that are implemented as SOAs, and present them in a unified way
  –Services can also be accessed via a multitude of other clients

Questions?