Seamless, Scalable Computing from Desktops to Global Computational Power Grids: Hype or Reality ? Rajkumar Buyya School of Computer Science and Software Engineering Monash University, Melbourne, Australia Email: rajkumar@csse.monash.edu.au | rajkumar@buyya.com Internet Home: www.csse.monash.edu.au/~rajkumar | www.buyya.com Thanks to: David Abramson Jonathan Giddy Jack Dongarra Wolfgang Gentzsch http://www.buyya.com/ecogrid/
Agenda Computing Platforms and How Grid is Different ? Cluster Computing Technologies and new challenges ? Clusters of Clusters (Federated/Hyper Clusters) Towards Global (Grid) Computing Grid Resource Management and Scheduling Challenges Grid Technologies (in development) Grid Substrates and Interoperability Grid Middleware Resource Brokers and High Level Tools Applications and Portals Conclusion
Computing Power (HPC) Drivers Solving grand challenge applications using computer modeling, simulation and analysis E-commerce/anything Life Sciences Aerospace Digital Biology CAD/CAM Military Applications
How to Run App. Faster ? There are 3 ways to improve performance: 1. Work Harder 2. Work Smarter 3. Get Help Computer Analogy 1. Use faster hardware: e.g. reduce the time per instruction (clock cycle). 2. Optimized algorithms and techniques 3. Multiple computers to solve problem: That is, increase no. of instructions executed per clock cycle.
Computing Platforms Evolution Breaking Administrative Barriers 2100 ? PERFORMANCE 2100 Administrative Barriers Individual Group Department Campus State National Globe Inter Planet Universe Desktop (Single Processor?) SMPs or SuperComputers Local Cluster Enterprise Cluster/Grid Global Cluster/Grid Inter Planet Cluster/Grid ??
Towards Inexpensive Supercomputing It is: Cluster Computing.. The Commodity Supercomputing!
Clustering of Computers for Collective Computing: Trends 1990 1995+ 1960
Clustering Today Clustering gained momentum when 4 technologies converged: 1. Killer Micros/CPUs: Very high-performance microprocessors; workstation performance = yesterday's supercomputers. 2. Killer Networks: High-speed communication; comm. between cluster nodes >= between processors in an SMP. 3. Killer Tools: Standard tools for parallel/distributed computing & their growing popularity. 4. Killer Applications: scientific, engineering, Database, and Internet/Web serving.
Cluster Computer Architecture
Major issues in cluster design Scalability (physical & application) Performance High Availability (fault-tolerance and failure management) Single System Image (look-and-feel of one system) Fast Communication (networks & protocols) Load Balancing (CPU, Memory, Disk, Net) Security and Encryption (clusters of clusters) Distributed Environment (Social issues) Manageability (administration and control) Programmability (simple API if required) Applicability (cluster-aware and non-aware apps.)
Killer Micro: Technology Trend (lead to the death of traditional supercomputers)
Killer Networks and Protocols Ethernet (10Mbps), Fast Ethernet (100Mbps), Gigabit Ethernet (1Gbps) SCI (Dolphin - MPI - 12 microsecond latency) ATM Myrinet (1.2Gbps) Digital Memory Channel Internet!!! Protocols: lightweight (user-level) protocols, TCP/IP? Active Messages (Berkeley) Basic Interface for Parallelism (Lyon, France) Fast Messages (Illinois) U-net (Cornell) VIA (Industry standard)…
Killer Tools/Middleware (e.g., HKU JESSICA) http://www.srg.csis.hku.hk/jessica.htm
Killer Applications Numerous Scientific & Engineering Apps. Parametric Simulations Business Applications E-commerce Applications (Amazon.com, eBay.com ….) Database Applications (Oracle on cluster) Decision Support Systems Internet Applications Web serving Infowares (yahoo.com, AOL.com) ASPs (application service providers) eChat, ePhone, eBook, eCommerce, eBank, eSociety, eAnything! Computing Portals Mission Critical Applications command control systems, banks, nuclear reactor control, star-war, and handling life threatening situations.
Science Portals - e.g., PAPIA system Pentiums Myrinet NetBSD/Linux PM Score-D MPC++ RWCP Japan: http://www.rwcp.or.jp/papia/ PAPIA PC Cluster
Cluster Computing Forum IEEE Task Force on Cluster Computing (TFCC) http://www.ieeetfcc.org
Adoption of the Approach
Clusters of Clusters (HyperClusters) Scheduler Master Daemon Execution Submit Graphical Control Clients Cluster 2 Cluster 3 Cluster 1 LAN/WAN
Towards Grid Computing…. For illustration, placed resources arbitrarily on the GUSTO test-bed!!
What is Grid ? An infrastructure that couples Computers (PCs, workstations, clusters, traditional supercomputers, and even laptops, notebooks, mobile computers, PDAs, and so on) … Software ? (e.g., ASPs renting expensive special purpose applications on demand) Databases (e.g., transparent access to human genome database) Special Instruments (e.g., radio telescope--SETI@Home Searching for Life in galaxy, Astrophysics@Swinburne for pulsars) People (maybe even animals, who knows ?) across the local/wide-area networks (enterprise, organisations, or Internet) and presents them as a unified, integrated (single) resource.
Conceptual view of the Grid Leading to Portal (Super)Computing http://www.sun.com/hpc/
Grid Application-Drivers Old and New applications getting enabled due to coupling of computers, databases, instruments, people, etc: (distributed) Supercomputing Collaborative engineering high-throughput computing large scale simulation & parameter studies Remote software access / Renting Software Data-intensive computing On-demand computing
The Grid Vision: To offer “Dependable, consistent, pervasive access to [high-end] resources” Dependable: Can provide performance and functionality guarantees Consistent: Uniform interfaces to a wide variety of resources Pervasive: Ability to “plug in” from anywhere Source: www.globus.org
The Grid Impact! “The global computational grid is expected to drive the economy of the 21st century similar to the electric power grid that drove the economy of the 20th century”
Sources of Complexity in Grid Resource Management No single administrative control No single policy: each resource owner has their own policies or scheduling mechanisms. Users must honor them (particularly external users of the grid). Heterogeneity of resources (static and dynamic) Unreliable: resources may appear or disappear (die) No uniform cost model (there cannot be one): cost varies from one user to another and from time to time. No single access mechanism
Resource Management Challenges introduced by the GRID Site Autonomy Heterogeneous Resources and Substrate: each site has systems with different architectures (SMPs, Clusters, Linux/NT/Unix, Intel); each resource owner has their own policies or scheduling mechanisms (Codine/Condor). Policy Extensibility (through resource brokers) Resource Allocation/Co-allocation Online Control (can apps (graphics) tolerate non-availability of a required resource and adapt?) Computational Economy
Grid Resource Management: Challenging Issues Authentication (once) Specify simulation (code, resources, etc.) Discover resources Negotiate authorization, acceptable use, Cost, etc. Acquire resources Schedule Jobs Initiate computation Steer computation Access remote data-sets Collaborate on results Account for usage Domain 1 Domain 2 Ack.: globus..
Grid Components … … … … Grid Apps. Scientific Engineering Applications and Portals Grid Apps. … Scientific Engineering Collaboration Prob. Solving Env. Web enabled Apps Development Environments and Tools Grid Tools … Languages Libraries Debuggers Monitoring Resource Brokers Web tools Distributed Resources Coupling Services Grid Middleware … Comm. Sign on & Security Information Process Data Access QoS Local Resource Managers Operating Systems Queuing Systems Libraries & App Kernels … TCP/IP & UDP Grid Fabric Networked Resources across Organisations … Computers Clusters Storage Systems Data Sources Scientific Instruments
Many GRID Projects and Initiatives PUBLIC FORUMS Computing Portals Grid Forum European Grid Forum IEEE TFCC! GRID’2000 and more. Australia Nimrod/G EcoGrid and GRACE DISCWorld Europe UNICORE MOL METODIS Globe Poznan Metacomputing CERN Data Grid MetaMPI DAS JaWS and many more... Public Grid Initiatives Distributed.net SETI@Home Compute Power Grid USA Globus Legion JAVELIN AppLes NASA IPG Condor Harness NetSolve NCSA Workbench WebFlow EveryWhere and many more... Japan Ninf Bricks http://www.gridcomputing.com/
Many GRID Testbeds... GUSTO Distributed ASCI Supercomputer NASA IPG
Scheduler Architecture Alternatives (columns: Scheduler Model, Scheduler Architecture, Example Systems) Centralized (Single Resource): one scheduler driving a single resource R; examples: Unix, Linux, Windows OS. Centralized (Multiple Resources; Single/Multiple Domains): one job queue feeding R1, R2 ... Rn; examples: cluster systems such as LSF, Codine, Nimrod, Condor. Hierarchical (Multiple Domains): a resource broker or super scheduler above the local schedulers of R1 ... Rn; examples: multi-clusters or Grid systems such as AppLes, Nimrod/G, HTB, Legion, Condor. Decentralized (self-coordinated or job pool; Single/Multiple Domains): R1 ... Rn drawing from a job pool; cluster systems like MOSIX follow a simple model. Grid systems can follow it, but it is complex to realize.
Grid Components and our Focus: Resource Brokers and Computational Economy Apps. Applications High-level Services and Tools GlobusView Testbed Status Grid Tools DUROC MPI MPI-IO CC++ Nimrod/G globusrun Core Services Grid Middleware Heartbeat Monitor Nexus GRAM GRACE Globus Security Interface Gloperf MDS GASS GARA Grid Fabric Local Services Condor MPI TCP UDP LSF Easy NQE AIX Irix Solaris Source: Globus
Nimrod/G Resource Broker Nimrod/G Approach to Resource Management and Scheduling
What is Nimrod/G ? A global scheduler for managing and steering task farming (parametric simulation) applications on computational grid based on deadline and computational economy. Key Features A single window to manage and control whole experiment Resource Discovery Trade for Resources Scheduling Steering & data management Leverages Globus Services Other parallel application models can be supported easily (MPI & DUROC co-allocation)
A Nimrod/G User Console Deadline Cost Available Machines
Nimrod/G Architecture Nimrod/G Client Nimrod/G Client Nimrod/G Client Nimrod Engine Schedule Advisor Persistent Store Trading Manager Dispatcher Grid Explorer TM TS Middleware Services GE GIS Grid Information Services RM & TS RM & TS RM & TS GUSTO Test Bed RM: Local Resource Manager, TS: Trade Server
Nimrod/G Interactions Resource location MDS server Additional services used implicitly: GSI (authentication & authorization) Nexus (communication) Scheduler Trade Server Resource allocation (local) Prmtc.. Engine Dispatcher GRAM server Queuing System Job Wrapper User process GASS server File access Root node Gatekeeper node Computational node
Resource Location/Discovery Need to locate suitable machines for an experiment Speed Number of processors Cost Availability User account Available resources will vary across experiment Supported through Directory Server (Globus MDS)
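The selection criteria on this slide can be read as a filter over directory records. The following is a minimal sketch only: the in-memory `directory` list stands in for the result of a directory (MDS) query, and the `Resource` fields, the `discover` function and all sample values are hypothetical names chosen for illustration, not the Globus MDS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical directory record for one machine."""
    name: str
    speed_mflops: float       # assumed per-processor speed figure
    processors: int
    cost_per_cpu_sec: float   # price in arbitrary grid units
    available: bool
    has_account: bool         # does the user hold an account here?


def discover(resources, min_speed, min_processors, max_cost):
    """Return the usable resources that satisfy the experiment, cheapest first."""
    matches = [
        r for r in resources
        if r.available and r.has_account
        and r.speed_mflops >= min_speed
        and r.processors >= min_processors
        and r.cost_per_cpu_sec <= max_cost
    ]
    return sorted(matches, key=lambda r: r.cost_per_cpu_sec)


# Sample records (made up for the example).
directory = [
    Resource("alpha", 400.0, 8, 2.0, True, True),
    Resource("beta", 900.0, 64, 5.0, True, True),
    Resource("gamma", 900.0, 64, 1.0, False, True),   # currently unavailable
    Resource("delta", 300.0, 4, 0.5, True, False),    # no user account
]

suitable = discover(directory, min_speed=350.0, min_processors=8, max_cost=6.0)
print([r.name for r in suitable])
```

Because available resources vary during an experiment, a broker would presumably repeat such a query periodically rather than once.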
Computational Economy Resource selection based on real money and market principles A large number of sellers and buyers (resources may be dedicated/shared) Negotiation: tenders/bids; select those offers that meet the requirements Trading and Advance Resource Reservation Schedule computations on those resources that meet all requirements
Cost Model [Figure: per-user, per-machine cost table, User 1 to User 5 by Machine 1 to Machine 5] Non-uniform costing: cost varies from time to time, from one user to another, and with usage duration. Encourages use of local resources first. A user can access remote resources, but pays a penalty in higher cost.
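A non-uniform cost model of this kind can be illustrated with a small rate function. This is a sketch under stated assumptions: the peak-hour window, the `remote_penalty` and `peak_factor` multipliers, the site names and the function names are all invented for the example; the slide only says that cost depends on time, user and usage duration, and that remote use costs more.

```python
PEAK_HOURS = range(9, 17)  # assumed peak window, 09:00 to 16:59


def rate_per_cpu_sec(base_rate, user_site, machine_site, hour,
                     remote_penalty=2.0, peak_factor=1.5):
    """Price of one CPU second for this user, on this machine, at this hour."""
    rate = base_rate
    if user_site != machine_site:   # remote users pay a penalty
        rate *= remote_penalty
    if hour in PEAK_HOURS:          # price varies from time to time
        rate *= peak_factor
    return rate


def job_cost(cpu_seconds, **pricing):
    """Total cost grows with usage duration."""
    return cpu_seconds * rate_per_cpu_sec(**pricing)


# Same job, same base rate: local off-peak versus remote at peak time.
local = job_cost(100, base_rate=1.0, user_site="monash",
                 machine_site="monash", hour=22)
remote = job_cost(100, base_rate=1.0, user_site="monash",
                  machine_site="anl", hour=10)
print(local, remote)
```

The gap between the two figures is what nudges a user toward local resources first.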
Nimrod/G Scheduling Algorithm Find a set of machines (MDS search) Distribute jobs from root to machines Establish job consumption rate for each machine For each machine Can we meet deadline? If not, then return some jobs to root If yes, distribute more jobs to resource If cannot meet deadline with current resource Find additional resources
Nimrod/G Scheduling algorithm ... Locate more Machines Locate Machines Establish Rates Re-distribute Jobs Distribute Jobs Meet Deadlines?
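The loop described on the two scheduling slides can be sketched as follows. This is an illustrative reading, not the Nimrod/G implementation: job consumption rates are given as fixed inputs instead of being measured, `spare` stands in for machines a further MDS search would find, and the function names are hypothetical.

```python
import math


def rebalance(assigned, rates, hours_left, root=0):
    """One pass over the machines: return excess jobs to the root,
    then hand root jobs to machines that still have spare capacity."""
    capacity = {m: math.floor(rates[m] * hours_left) for m in assigned}
    kept = {}
    for machine, jobs in assigned.items():
        keep = min(jobs, capacity[machine])   # cannot meet deadline: give back
        root += jobs - keep
        kept[machine] = keep
    for machine in kept:
        take = min(root, capacity[machine] - kept[machine])  # can meet: take more
        kept[machine] += take
        root -= take
    return kept, root


def schedule(jobs, in_use, spare, hours_left):
    """Distribute jobs, rebalance, and add machines while the deadline is at risk.
    in_use and spare map machine name -> jobs per hour."""
    in_use, spare = dict(in_use), dict(spare)
    share, extra = divmod(jobs, len(in_use))
    assigned = {m: share + (i < extra) for i, m in enumerate(in_use)}
    root = 0
    while True:
        assigned, root = rebalance(assigned, in_use, hours_left, root)
        if root == 0 or not spare:
            return assigned, root             # root > 0 means deadline not met
        name = next(iter(spare))              # "find additional resources"
        in_use[name] = spare.pop(name)
        assigned[name] = 0


plan, unmet = schedule(100, {"fast": 10.0, "slow": 2.0}, {"extra": 5.0},
                       hours_left=6)
print(plan, unmet)
```

In a running system the rates would be re-estimated as jobs complete, so the rebalancing step would be repeated throughout the experiment.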
Resource Usage (for various deadlines)
Scheduling Methods for Global Resource Allocation 1. Equal Assignment, but no Load Balancing 2. Equal Assignment and Load Balanced 3. (2) + deadline (no worry about cost) 4. (2) + deadline (minimize the cost of computation) 5. (4) + budget --> deadline + cost (up-front agreement): "I am willing to pay $$$, can you complete by deadline?" 6. (5) + Advance Resource Reservation 7. (6) + Grid Economy - dynamic pricing - use Trading/Tender/Bid process 8. (6) + Grid Economy - dynamic pricing - use auction technique 9. Genetic/Fuzzy Logic, etc. algorithms
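Methods 4 and 5 (deadline with cost minimization, then a budget check) can be illustrated with a simple greedy allocation. This is one possible sketch, assuming a fixed price per job and a fixed job rate per resource; the `cheapest_plan` function and the sample resources are hypothetical and are not taken from Nimrod/G.

```python
def cheapest_plan(jobs, resources, hours_left, budget=None):
    """Fill the cheapest resources first, each up to what it can finish
    before the deadline. resources: (name, jobs_per_hour, cost_per_job)."""
    plan, remaining, total = {}, jobs, 0.0
    for name, rate, price in sorted(resources, key=lambda r: r[2]):
        if remaining == 0:
            break
        n = min(remaining, int(rate * hours_left))  # capacity before deadline
        if n:
            plan[name] = n
            total += n * price
            remaining -= n
    # Feasible only if every job is placed and the budget (if any) holds.
    feasible = remaining == 0 and (budget is None or total <= budget)
    return plan, total, feasible


resources = [("local", 5, 1.0), ("remote_a", 10, 3.0), ("remote_b", 20, 2.0)]
plan, total, ok = cheapest_plan(100, resources, hours_left=4, budget=200.0)
print(plan, total, ok)
```

With a tighter budget the same plan is reported as infeasible, which corresponds to the up-front question "I am willing to pay $$$, can you complete by deadline?".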
Grid Architecture for Computational Economy (GRACE) GRACE aims to help Nimrod/G overcome its current limitations. GRACE middleware offers generic interfaces (APIs) that other developers of grid tools can use along with Globus services.
Grid Resource Management Architecture (Economic/Market-based) Grid Information Server Grid Explorer Application Job Control Agent Trading Schedule Advisor Trade Server Other components of local resource manager Resource Reservation Trade Manager Deployment Agent Resource Allocation R1 R2 … Rn User Resource Broker A Resource Domain
Why Computational Economy in Resource Management ? “Observe Grid characteristics and current resource management policies” Grid resources are not owned by the user or a single organisation. They have their own administrative policies. Mismatch in resource demand and supply: overall resource demand may exceed supply. Traditional system-centric (performance metric) approaches do not suit the grid environment. System-Centric --> User-Centric Like in real life, an economic-based approach is one of the best ways to regulate selection and scheduling on the grid, as it captures user intent. Markets are an effective institution for coordinating the activities of several entities.
Advantages of Economic-based RM System Centric --> User Centric Policy in RM Helps in regulating demand and supply: resource access cost can fluctuate (based on demand and supply) and the system can adapt Scalable Solution: No need for a central coordinator (during negotiation) Resources (sellers) and Users (buyers) can make their own decisions and try to maximize utility and profit. Uniform Treatment of all Resources: Everything can be traded, including CPU, Mem, Net, Storage/Disk, other devices/instruments Efficient allocation of resources
Grid Open Trading Protocols Trade Manager Trade Server Get Connected Pricing Rules Call for Bid(DT) Reply to Bid (DT) Negotiate Deal(DT) …. API Confirm Deal(DT, Y/N) DT - Deal Template - resource requirements (BM) - resource profile (BS) - price (any one can set) - status - change the above values - negotiation can continue - accept/decline - validity period Cancel Deal(DT) Change Deal(DT) Get Disconnected
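The Deal Template (DT) exchanged in this protocol can be pictured as a small record passed back and forth between Trade Manager and Trade Server. The sketch below is an assumption-laden illustration: the field names, the `Status` values, the pricing rule in `reply_to_bid` and the acceptance rule in `confirm_deal` are invented for the example and are not the GRACE API.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Status(Enum):
    OPEN = "negotiation can continue"
    ACCEPTED = "accept"
    DECLINED = "decline"


@dataclass(frozen=True)
class DealTemplate:
    requirements: dict          # resource requirements, filled in by the buyer
    profile: dict               # resource profile, filled in by the seller
    price: float                # either side may set this
    status: Status = Status.OPEN
    valid_for_sec: int = 60     # validity period of the offer (assumed unit)


def reply_to_bid(dt, asking_price, profile):
    """Trade Server side: attach the resource profile and counter with the
    server's asking price if it is higher than the price offered."""
    return replace(dt, profile=profile, price=max(dt.price, asking_price))


def confirm_deal(dt, max_price):
    """Trade Manager side: accept if the price is within the buyer's limit."""
    status = Status.ACCEPTED if dt.price <= max_price else Status.DECLINED
    return replace(dt, status=status)


call = DealTemplate(requirements={"cpu_hours": 10}, profile={}, price=2.0)
offer = reply_to_bid(call, asking_price=3.0, profile={"machine": "cluster-1"})
final = confirm_deal(offer, max_price=3.5)
print(final.price, final.status.name)
```

Keeping the template immutable and returning a modified copy at each step makes every message in the exchange a complete, self-describing offer.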
Open Trading Finite State Machine DT < TM, Request for Resource > < TM, Ask Price > << TS, Update >> << TM, Update >> DT <TS, Bid > < TS, Final Offer > < TM, Final Offer > Offer TS Offer TM <TM, Rej.> < TS, Reject > < TM, Accept > DT - Deal Template TM - Trade Manager TS - Trade Server DA - Deal Accepted DN - Deal Not accepted DA DN
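A negotiation of this kind can be driven by a table of (state, actor, message) transitions. The sketch below uses the state names from the slide (DT, Offer TS, Offer TM, DA, DN), but the transition table itself is an assumed, simplified reading; the exact arcs of the original diagram are not recoverable from the text, so treat every entry as hypothetical.

```python
# (current state, actor, message) -> next state. Assumed arcs, for illustration.
TRANSITIONS = {
    ("DT", "TM", "request for resource"): "DT",
    ("DT", "TS", "bid"): "OFFER_TS",
    ("OFFER_TS", "TS", "update"): "OFFER_TS",
    ("OFFER_TS", "TM", "ask price"): "OFFER_TM",
    ("OFFER_TS", "TM", "accept"): "DA",
    ("OFFER_TS", "TM", "reject"): "DN",
    ("OFFER_TM", "TM", "update"): "OFFER_TM",
    ("OFFER_TM", "TS", "bid"): "OFFER_TS",
    ("OFFER_TM", "TS", "accept"): "DA",
    ("OFFER_TM", "TS", "reject"): "DN",
}

FINAL_STATES = {"DA", "DN"}  # Deal Accepted, Deal Not accepted


def run(events, state="DT"):
    """Replay (actor, message) events, rejecting anything out of turn."""
    for actor, message in events:
        if state in FINAL_STATES:
            raise ValueError(f"negotiation already finished in state {state}")
        key = (state, actor, message)
        if key not in TRANSITIONS:
            raise ValueError(f"{actor} may not send {message!r} in state {state}")
        state = TRANSITIONS[key]
    return state


outcome = run([
    ("TM", "request for resource"),
    ("TS", "bid"),
    ("TM", "ask price"),
    ("TS", "bid"),
    ("TM", "accept"),
])
print(outcome)
```

A table-driven design keeps the protocol rules in one place, so changing the negotiation policy means editing data, not control flow.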
Conclusions The emergence of the Internet as a powerful connectivity medium is bridging the gap between a number of technologies, leading to what is known as “Everything on IP”. Cluster-based systems have become a platform of choice for mainstream computing. An economic-based approach to resource management is the way to go in the grid environment. The user can say “I am willing to pay $…, can you complete my job by this time…” Both sequential and parallel applications run seamlessly on desktops, SMPs, Clusters, and the Grid without any change. Grid: A Next Generation Internet ?
Further Information Cluster Computing Infoware: http://www.buyya.com/cluster/ Grid Computing Infoware: http://www.gridcomputing.com Millennium Compute Power Grid/Market Project http://www.ComputePower.com You are invited to join CPG project! Books: High Performance Cluster Computing, V1, V2, R.Buyya (Ed), Prentice Hall, 1999. The GRID, I. Foster and C. Kesselman (Eds), Morgan-Kaufmann, 1999. IEEE Task Force on Cluster Computing http://www.ieeetfcc.org GRID Forums http://www.gridforum.org | http://www.egrid.org GRID’2000 Meeting http://www.gridcomputing.org
Thank You… Any ??
Back up Slides Thanks to Jack Dongarra, who allowed me to use his survey slides, particularly those included here by cut and paste.