The Gridbus Middleware: Building and Managing Utility Grids for Powering e-Science and e-Business Applications
Dr. Rajkumar Buyya
Grid Computing and Distributed Systems (GRIDS) Laboratory, Dept. of Computer Science and Software Engineering, The University of Melbourne, Australia
www.gridbus.org
2 Outline Introduction to E-Science Collaborative Science & Challenges Introduction to Grid Computing Defining Grids, Services, Challenges, Middleware Solutions Service-Oriented Grid Architecture and Gridbus Solutions Market-based Management, GMD, Grid Bank, Alchemi Grid Service Broker Architecture, Design and Implementation Performance Evaluation: Experiments in Creation and Deployment of Applications on Global Grids A Case Study in High Energy Physics Summary and Conclusion
3 Big Science Problems & Collaborative Research Next-generation experiments, simulations, sensors, satellites, and even people and businesses are creating a flood of data (~PBytes/sec). They all involve numerous experts and resources from multiple organizations in synthesis, modeling, simulation, analysis, and interpretation. Examples: Life Sciences; Digital Biology; Finance: portfolio analysis; Newswire & data mining: natural language engineering; Astronomy; Internet & Ecommerce; High Energy Physics; Brain Activity Analysis; Quantum Chemistry.
4 e-Science Environment: Supporting Collaborative Science Distributed instruments Distributed computation Distributed data Peers sharing ideas and collaborative interpretation of data/results Remote Visualization Data & Compute Service Cyberinfrastructure E-Scientist
5 Multi-institution Collaboration Challenges Security Resource Allocation & Scheduling Data locality Network Management System Management Resource Discovery Uniform Access Computational Economy Application Construction
6 Grid Computing Solution: (1) providing Cyberinfrastructure for e-Science; (2) delivering IT services as the 5th utility. E-Science, E-Business, E-Government, E-Health, E-Education, …
8 What is a Grid? [Buyya et al.] A type of parallel and distributed system that enables the sharing, exchange, selection, and aggregation of geographically distributed (wide-area) “autonomous” resources, depending on their availability, capability, cost, and user QoS requirements. Resources include: Computers: PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDAs, etc.; Software: e.g., ASPs renting expensive special-purpose applications on demand; Catalogued data and databases: e.g., transparent access to the human genome database; Special devices/instruments: e.g., a radio telescope searching for life in the galaxy; People/collaborators.
9 What do Grids look like? A Bird's Eye View of a Global Grid [Figure labels: Application, Grid Resource Broker, Grid Information Service, database, resources R1, R2, R3, R4, R5, R6, ..., RN]
10 How Are Grids Used? High-performance computing Collaborative data-sharing Collaborative design Drug discovery Financial modeling Data center automation High-energy physics Life sciences E-Business E-Science Natural language processing & Data Mining Utility computing
11 Classes of Grid Services / Types of Grids
Computational Services (CPU cycles): pooling computing power: TeraGrid, AusGrid, ChinaGrid, IndiaGrid, UK Grid, …
Data Services: collaborative sharing of data generated by instruments, sensors, and persons: LHC Grid, Napster
Application Services: access to remote software/libraries and license management: NetSolve
Interaction Services: eLearning, Virtual Tables, Group Communication (Access Grid), Gaming
Knowledge Services: the way knowledge is acquired, processed, and managed: data mining
Utility Computing Services: towards market-based Grid computing: leasing and delivering Grid services as ICT utilities
[Figure labels: Computational Grid, Data Grid, ASP Grid, Interaction Grid, Knowledge Grid, Utility Grid, infrastructure, Users]
12 Some Characteristics of Grids: Sources of Resource Management and Application Scheduling Challenges Numerous resources Different security requirements & policies Resources are heterogeneous Geographically distributed Different resource management policies Connected by heterogeneous, multi-level networks Owned by multiple organizations & individuals Unreliable resources and environments Slide by Hiro
13 Some Grid Initiatives Worldwide
Australia: Nimrod-G, Gridbus, GrangeNet, APACGrid, ARC eResearch
Brazil: OurGrid, EasyGrid, LNCC-Grid, and many others
China: ChinaGrid (education), CNGrid (application)
Europe: UK eScience, EU Grids, and many more
India: Garuda
Japan: NAREGI
Korea: N*Grid
Singapore: NGP
USA: Globus, GridSec, AccessGrid, TeraGrid, Cyberinfrastructure, and many more
Industry Initiatives: IBM On Demand Computing, HP Adaptive Computing, Sun N1, Microsoft .NET, Oracle 10g, Infosys (Enterprise Grid), Satyam (Business Grid), StorageTek (Grid), and many more
Public Forums: Open Grid Forum, Australian Grid Forum. Conferences: CCGrid, Grid, HPDC, E-Science
[Funding labels from the slide figure: billion – 3 yrs; 1 billion – 5 yrs; 450 million – 5 yrs; 486 million – 5 yrs; 1.3 billion (Rs); 27 million; 2? billion; 120 million – 5 yrs]
14 Layers of Grid Architecture Grid resources Desktops, servers, clusters, networks, applications, storage, devices + resource manager + monitor Security Services Authentication, Single sign-on, secure communication Job submission, info services, Storage access, Trading, Accounting, License Resource management and scheduling Grid programming environment and tools Languages, API, libraries, compilers, parallelization tools Grid applications Web Portals, Applications, System level User level Adaptive Management Core Middleware User-Level Middleware
15 Open-Source Grid Middleware Projects
16 The Gridbus Melbourne: Enable Leasing of ICT Services on Demand WWG Pushes Grid computing into mainstream computing Gridbus
19 What do Grid players want? Grid Consumers: execute jobs for solving problems of varying size and complexity; benefit by utilizing distributed resources wisely; trade off timeframe and cost; strategy: minimise expenses. Grid Providers: contribute resources for executing consumer jobs; benefit by maximizing resource utilisation; trade off local requirements and market opportunity; strategy: maximise return on investment.
20 What do Grid players require? They need tools and technologies that help them in value expression, value translation, and value enforcement. Grid Service Consumers (GSCs): How do I express QoS requirements? How do I trade off timeframe and cost? How do I map jobs to resources to meet my QoS needs? How do I manage Grid dynamics and get my work done? … Grid Service Providers (GSPs): How do I decide service pricing models? How do I specify them? How do I translate them into resource allocations? How do I enforce them? How do I advertise and attract consumers? How do I do accounting and handle payments? …
21 Solution 1: Service Oriented Architecture (SOA) An SOA is a contractual architecture for offering and consuming software as services. Four entities make up an SOA: the service provider, the service registry, the service consumer (also known as the service requestor), and the service contract. The functions or tasks that the service provider offers, along with other functional and technical information required for consumption, are defined in the service definition or contract. [Figure labels: provider, registry, consumer, contract]
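The relationship between the SOA entities named on this slide can be illustrated with a minimal Python sketch. All class names, method names, and the `price_per_call` field are hypothetical and chosen only for illustration; they are not part of Gridbus or of any SOA standard.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class ServiceContract:
    """Service definition: what is offered and the terms of use (hypothetical fields)."""
    name: str
    description: str
    price_per_call: float  # assumed unit: Grid dollars (G$)


@dataclass
class ServiceProvider:
    """Offers one function under a published contract."""
    contract: ServiceContract
    handler: Callable[[int], int]

    def invoke(self, payload: int) -> int:
        return self.handler(payload)


class ServiceRegistry:
    """Lets providers publish services and consumers look them up by name."""

    def __init__(self) -> None:
        self._providers: Dict[str, ServiceProvider] = {}

    def publish(self, provider: ServiceProvider) -> None:
        self._providers[provider.contract.name] = provider

    def find(self, name: str) -> Optional[ServiceProvider]:
        return self._providers.get(name)


class ServiceConsumer:
    """Discovers a service through the registry, then calls the provider."""

    def __init__(self, registry: ServiceRegistry) -> None:
        self.registry = registry

    def call(self, name: str, payload: int) -> int:
        provider = self.registry.find(name)
        if provider is None:
            raise LookupError(f"no service named {name!r}")
        return provider.invoke(payload)


registry = ServiceRegistry()
registry.publish(
    ServiceProvider(
        ServiceContract("square", "Squares an integer", price_per_call=0.5),
        handler=lambda x: x * x,
    )
)
consumer = ServiceConsumer(registry)
result = consumer.call("square", 7)
```

The consumer never holds a direct reference to the provider; it depends only on the registry and on the contract name, which is the decoupling the slide describes.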
22 Solution 2: Market-Oriented Grid Computing - (a) Sustained Resource Sharing and (b) Effective Management of Shared Resources Grid Economy
23 Service-Oriented Grid Architecture [Figure labels: Grid Service Consumer; Applications; Programming Environments; Grid Resource Broker (Grid Explorer, Schedule Advisor, Trade Manager, Job Control Agent, Deployment Agent); Grid Service Providers; Grid Node 1 … Grid Node N; Trade Server; Resource Allocation; Resource Reservation; Pricing Algorithms; Accounting; Misc. services; resources R1, R2, …, Rm; Core Middleware Services (JobExec, Info, Secure, Trading, QoS, Storage, Sign-on); Grid Market Services; Information Service; Health Monitor; Grid Bank; Data Catalogue]
24 Layers of Grid Architecture Grid resources Desktops, servers, clusters, networks, applications, storage, devices + resource manager + monitor Security Services Authentication, Single sign-on, secure communication Job submission, info services, Storage access, Trading, Accounting, License Resource management and scheduling Grid programming environment and tools Languages, API, libraries, compilers, parallelization tools Grid applications Web Portals, Applications, Adaptive Management Application Development and Deployment Environment Distributed Resources Coupling Services Core Middleware User-Level Middleware System level User level Autonomic/ Grid Economy
25 Gridbus and Complementary Technologies – realizing Utility Grid [Figure labels, by layer name as shown on the slide: Grid Applications; User-Level Middleware; Core Grid Middleware; Grid Fabric Software; Grid Fabric Hardware. Component labels: Portals, Science, Commerce, Engineering, Collaboratories, …; Grid Brokers: X-Parameter Sweep Lang., Gridbus Data Broker, MPI, Workflow, Workflow Engine, Nimrod-G, Gridscape, ExcellGrid; Grid Bank, Grid Exchange & Federation, Grid Market Directory, Grid Storage Economy, Grid Economy; Globus, Unicore, Alchemi, NorduGrid, XGrid, …; Condor, SGE, Tomcat, PBS, Libra, PDB, CDB; AIX, Solaris, Windows, Linux, IRIX, OSF1, Mac, .NET, JVM; Worldwide Grid]
26 On Demand Assembly of Services: Putting Them All Together [Figure labels: Visual Application Composer; Application Code; Grid Resource Broker; Explore data; Data Catalogue; ASP Catalogue; Grid Info Service; Grid Market Directory; GSP (Accounting Service): Gridbus GridBank; GSP (e.g., UofM): PE; GSP (e.g., VPAC): PE; GSP (e.g., IBM): CPU or PE; Grid Service (GS) (Globus); Alchemi GS; GTS; Cluster Scheduler; Job; Results; Results + Cost Info; Bill]
28 Alchemi: .NET-based Enterprise Grid Platform & Web Services [Figure labels: Internet, Alchemi Manager, Alchemi Worker Agents, Alchemi Users] Features: Web Services-like model; general purpose; dedicated/non-dedicated workers; role-based security; .NET and Web Services; C# implementation; GridThread and Job model programming; easy to set up and use; widely in use!
29 Some Users of Alchemi
Tier Technologies, USA: large-scale document processing using the Alchemi framework
CSIRO, Australia: natural resource modeling
The Friedrich Miescher Institute (FMI) for Biomedical Research, Switzerland: patterns of transcription factors in mammalian genes
Satyam Computers Applied Research Laboratory, India: micro-array data processing using the Alchemi framework
The University of Sao Paulo, Brazil: the Alchemi Executor as a Windows Service
stochastix GmbH, Germany: serving clients in the international banking/finance sector
Many users in universities: see the next slide for an example.
30 Students' project gives old computers new life - 1/25/2005
32 Grid Service Broker (GSB) A resource broker for scheduling task-farming data Grid applications with static or dynamic parameter sweeps on global Grids. It uses the computational economy paradigm for optimal selection of computational and data services depending on their quality, cost, and availability, and on users’ QoS requirements (deadline, budget, and time/cost optimisation). Key Features: a single window to manage and control experiments; programmable Task Farming Engine; resource discovery and resource trading; optimal data source discovery; scheduling and predictions; generic dispatcher and Grid agents; transportation of data and sharing of results; accounting.
33 Gridbus Broker Architecture [Figure labels: Gridbus Client; (Bag of Tasks Applications); App, T, $, Opt; Gridbus Farming Engine; Schedule Advisor; Trading Manager; Record Keeper; Grid Explorer; Grid Dispatcher; (Data Grid Scheduler); Grid Info Server; Data Catalog; Data Node; Grid Middleware. Arrow labels: GE; GIS, NWS; TM; TS; RM & TS. Legend: RM: Local Resource Manager; TS: Trade Server. Node types: Globus enabled node; Alchemi enabled node; Unicore enabled node.]
34 Gridbus Broker and Remote Service Access Enablers [Figure labels: Home Node/Portal; Portlets; Gridbus Broker; Credential Repository: MyProxy; Data Catalog; Data Store Access Technology: Grid FTP, SRB; Globus Job manager; SSH; Alchemi Gateway; Unicore; Gridbus agent; fork(), batch(); local schedulers: PBS, Condor, SGE, Alchemi, XGrid]
35 Gridbus Services for eScience applications Application Development Environment: XML-based language for composition of task farming (legacy) applications as parameter sweep applications; Task Farming APIs for new applications; Web APIs (e.g., Portlets) for Grid portal development; Threads-based programming interface; Workflow interface and Gridbus-enabled workflow engine. Resource Allocation and Scheduling: dynamic discovery of optimal computational and data nodes that meet user QoS requirements. Hide Low-Level Grid Middleware interfaces: Globus (v2, v4), SRB, Alchemi, Unicore, and ssh-based access to local/remote resources managed by XGrid, Condor, SGE.
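To make the idea of a parameter sweep application concrete, here is a small Python sketch that expands one command template into independent tasks, one per combination of parameter values. It does not use the Gridbus XML language or its Task Farming API; the function name `expand_sweep`, the command string, and the parameter names are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, List, Sequence


def expand_sweep(command_template: str,
                 parameters: Dict[str, Sequence]) -> List[str]:
    """Return one concrete command per point in the parameter space."""
    names = sorted(parameters)  # fixed order so the output is deterministic
    tasks = []
    for values in product(*(parameters[name] for name in names)):
        binding = dict(zip(names, values))
        tasks.append(command_template.format(**binding))
    return tasks


# Hypothetical legacy application swept over two parameters.
tasks = expand_sweep(
    "simulate --dose {dose} --temp {temp}",
    {"dose": [1, 2, 3], "temp": [20, 30]},
)
for task in tasks:
    print(task)
```

Each generated command is independent of the others, which is what lets a broker farm the tasks out to different Grid nodes.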
36 Adaptive Scheduling Steps [Flowchart labels: Discover Resources; Establish Rates; Compose & Schedule; Distribute Jobs; Evaluate & Reschedule; Meet requirements? Remaining Jobs, Deadline, & Budget?; Discover More Resources]
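The step labels on this slide can be read as a feedback loop. The Python sketch below is one possible reading, not the broker's actual implementation: the ordering of the steps, the round-based time model, and every name (`Resource`, `adaptive_schedule`, `jobs_per_round`, `price_per_job`) are assumptions, and prices are assumed to be positive.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Resource:
    name: str
    jobs_per_round: int    # established rate: jobs finished per scheduling round
    price_per_job: float   # assumed positive, in G$


def adaptive_schedule(total_jobs: int, deadline_rounds: int, budget: float,
                      known: List[Resource],
                      discoverable: List[Resource]) -> Tuple[int, int, float]:
    """Return (jobs completed, rounds used, G$ spent)."""
    remaining, spent, rounds = total_jobs, 0.0, 0
    pool = list(known)            # Discover Resources
    spare = list(discoverable)
    while remaining > 0 and rounds < deadline_rounds:
        rounds_left = deadline_rounds - rounds
        # Evaluate: can the current pool finish the remaining jobs in time?
        capacity = sum(r.jobs_per_round for r in pool) * rounds_left
        while capacity < remaining and spare:
            pool.append(spare.pop(0))          # Discover More Resources
            capacity = sum(r.jobs_per_round for r in pool) * rounds_left
        # Compose & Schedule, then Distribute Jobs: cheapest resources first.
        for r in sorted(pool, key=lambda res: res.price_per_job):
            affordable = int((budget - spent) // r.price_per_job)
            n = min(r.jobs_per_round, remaining, affordable)
            remaining -= n
            spent += n * r.price_per_job
        rounds += 1
    return total_jobs - remaining, rounds, spent


outcome = adaptive_schedule(
    total_jobs=20, deadline_rounds=4, budget=100.0,
    known=[Resource("a", 2, 1.0)],
    discoverable=[Resource("b", 3, 2.0), Resource("c", 5, 4.0)],
)
print(outcome)
```

In this run the single known resource cannot meet the deadline, so the loop pulls in one more resource and stops there, leaving the most expensive one unused.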
37 Deadline (D) and Budget (B) Constrained Scheduling Algorithms
Algorithm | Execution Time (D) | Execution Cost (B) | Compute Grid / Data Grid
Cost Opt | Limited by D | Minimize | Yes
Cost-Time Opt | Minimize if possible | Minimize | Yes
Time Opt | Minimize | Limited by B | Yes
Conservative-Time Opt | Minimize | Limited by B; jobs have guaranteed minimum budget | Yes
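Two of the strategies in this table, Cost Opt and Time Opt, can be sketched in Python under strong simplifying assumptions: all jobs are identical, each node runs one job at a time, and prices are fixed. The names `Node`, `cost_opt`, and `time_opt` are hypothetical, and the logic is a simplified illustration of the constraints in the table rather than the broker's actual algorithms.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Node:
    name: str
    secs_per_job: float
    price_per_sec: float   # G$ per CPU-second

    @property
    def cost_per_job(self) -> float:
        return self.secs_per_job * self.price_per_sec


def cost_opt(nodes: List[Node], jobs: int, deadline: float) -> Dict[str, int]:
    """Minimise cost: fill the cheapest nodes first, limited by the deadline D."""
    alloc = {n.name: 0 for n in nodes}
    left = jobs
    for n in sorted(nodes, key=lambda node: node.cost_per_job):
        take = min(left, int(deadline // n.secs_per_job))
        alloc[n.name] = take
        left -= take
    if left:
        raise ValueError("deadline cannot be met with these nodes")
    return alloc


def time_opt(nodes: List[Node], jobs: int, budget: float) -> Dict[str, int]:
    """Minimise time: give each job to the node that would finish it earliest,
    considering only nodes whose price fits the remaining per-job budget B."""
    alloc = {n.name: 0 for n in nodes}
    remaining_budget = budget
    for done in range(jobs):
        per_job_budget = remaining_budget / (jobs - done)
        affordable = [n for n in nodes if n.cost_per_job <= per_job_budget]
        if not affordable:
            raise ValueError("budget cannot be met with these nodes")
        best = min(affordable,
                   key=lambda node: (alloc[node.name] + 1) * node.secs_per_job)
        alloc[best.name] += 1
        remaining_budget -= best.cost_per_job
    return alloc


nodes = [Node("cheap", 10.0, 1.0), Node("fast", 5.0, 4.0)]
cost_plan = cost_opt(nodes, jobs=6, deadline=40.0)
time_plan = time_opt(nodes, jobs=6, budget=120.0)
print(cost_plan, time_plan)
```

With these made-up nodes, the cost-minimising plan loads the cheap node up to the deadline, while the time-minimising plan shifts work to the fast node as far as the budget allows.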
39 Drug Design Made Easy! Figure 3: Logging into the portal.
40 Excel Plugin to Access Gridbus Services [Figure labels: Excel, ExcelGrid Add-In, ExcelGrid Runner, ExcelGridJob, ExcelGrid Middleware, Gridbus Broker, Enterprise Grid]
41 Outline Introduction to the University of Melbourne, GRIDS Lab, and Opportunities Recap of the First Lecture What are Grids, Challenges, Middleware Solutions Service-Oriented Grid Architecture and Gridbus Solutions Market-based Management, GMD, Grid Bank, Alchemi Grid Service Broker Architecture, Design and Implementation Performance Evaluation: Experiments in Creation and Deployment of Applications on Global Grids A Case Study in High Energy Physics Summary and Conclusion
42 Case Study: High Energy Physics and Data Grid The Belle Experiment, KEK B-Factory, Japan. Investigating a fundamental violation of symmetry in nature (Charge Parity), which may help explain the imbalance of matter and antimatter in the universe: “why do we have more matter than antimatter?” Collaboration: 1000 people, 50 institutes. Data: hundreds of TB currently.
43 Case Study: Event Simulation and Analysis Decay channel: B0->D*+D*-Ks. Simulation and analysis package: Belle Analysis Software Framework (BASF). The experiment had two parts: generation of simulated data and analysis of the distributed data. It analyzed 100 data files (30 MB each) that were distributed among the five nodes of the Australian Belle DataGrid platform.
44 Australian Belle Data Grid Testbed VPAC Melbourne
45 Belle Data Grid (GSP CPU Service Price: G$/sec) NA G$4 Data node G$6 VPAC Melbourne G$2
46 Belle Data Grid (Bandwidth Price: G$/MB) NA G$4 Data node G$6 VPAC Melbourne G$
47 Deploying Application Scenario A data grid scenario with 100 jobs, each accessing remote data of ~30 MB. Deadline: 3 hrs. Budget: G$ 60K. Scheduling optimisation scenarios: Minimise Time; Minimise Cost. Results:
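A rough budget check for this scenario can be written as a short Python sketch. The job count, data size, and budget come from the slide; the cost model (CPU charge plus bandwidth charge), the per-job CPU time, and both prices are assumptions chosen only to show the arithmetic.

```python
def job_cost(cpu_secs: float, cpu_price: float,
             data_mb: float, bw_price: float) -> float:
    """Assumed cost model: CPU charge (G$/sec) plus data transfer charge (G$/MB)."""
    return cpu_secs * cpu_price + data_mb * bw_price


JOBS = 100          # from the slide
DATA_MB = 30        # from the slide (~30 MB per job)
BUDGET = 60_000     # from the slide (G$ 60K)

CPU_SECS = 60       # hypothetical CPU time per job
CPU_PRICE = 4       # hypothetical CPU price, G$/sec
BW_PRICE = 6        # hypothetical bandwidth price, G$/MB

total_data_mb = JOBS * DATA_MB
total_cost = JOBS * job_cost(CPU_SECS, CPU_PRICE, DATA_MB, BW_PRICE)
within_budget = total_cost <= BUDGET

print(f"data moved: {total_data_mb} MB, cost: G${total_cost}, "
      f"within budget: {within_budget}")
```

Under these assumed prices the data transfer charge is a large share of each job's cost, which is why a data grid scheduler has to weigh where the data lives as well as where CPU is cheap.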
48 Time Minimization in Data Grids [Plot: number of jobs completed vs. time (in mins.) for fleagle.ph.unimelb.edu.au, belle.anu.edu.au, belle.physics.usyd.edu.au, and brecca-2.vpac.org]
49 Results: Cost Minimization in Data Grids [Plot: number of jobs completed vs. time (in mins.) for fleagle.ph.unimelb.edu.au, belle.anu.edu.au, belle.physics.usyd.edu.au, and brecca-2.vpac.org]
50 Observation
Organization | Node details | Cost (in G$/CPU-sec) | Total Jobs Executed (Time) | Total Jobs Executed (Cost)
CS, UniMelb | belle.cs.mu.oz.au: 4 CPU, 2 GB RAM, 40 GB HD, Linux | N.A. (not used as a compute resource) | - | -
Physics, UniMelb | fleagle.ph.unimelb.edu.au: 1 CPU, 512 MB RAM, 40 GB HD, Linux | | |
CS, University of Adelaide | belle.cs.adelaide.edu.au: 4 CPU (only 1 available), 2 GB RAM, 40 GB HD, Linux | N.A. (not used as a compute resource) | - | -
ANU, Canberra | belle.anu.edu.au: 4 CPU, 2 GB RAM, 40 GB HD, Linux | 4 | 2 | 2
Dept of Physics, USyd | belle.physics.usyd.edu.au: 4 CPU (only 1 available), 2 GB RAM, 40 GB HD, Linux | | |
VPAC, Melbourne | brecca-2.vpac.org: 180 node cluster (only head node used), Linux | 6 | 23 | 2
52 Summary and Conclusion Grids exploit synergies that result from the cooperation of autonomous entities: resource sharing, dynamic provisioning, and aggregation at a global level. Great Science and Great Business! Grids have emerged as an enabler of the Cyberinfrastructure that powers e-Science and e-Business applications. SOA + Market-based Grid Management = Utility Grids. Grids allow users to dynamically lease Grid services at runtime based on their quality, cost, and availability and on users' QoS requirements, delivering ICT services as computing utilities. Grids offer enormous opportunities for realizing e-Science and e-Business at a global level. Use our Gridbus technology to realise this and make money!
53 Thanks for your attention! We Welcome Cooperation in Research and Development!