David Abramson, Rajkumar Buyya, and Jonathan Giddy

Computational Grids and Computational Economy: Nimrod/G Approach
Nimrod/G: Economic/Market-based Resource Management and Scheduling (for Parametric Modeling) on the Global Computational Grid
Project Team: David Abramson, Rajkumar Buyya, and Jonathan Giddy

A user has an application, say a simulation program, that calculates a single result for a given set of parameters. The user wishes to explore the effect of modifying some of those input parameters…

Parametric Modeling
- Study the behaviour of output variables against a range of different input scenarios
- Execute one application repeatedly for many combinations of input parameters
- Coarse-grained SPMD (single program, multiple data) model

    for i in (10, 20, 30, 40, 50, 60, 70, 80, 90, 100):
        for j in ('v', 'w', 'x', 'y', 'z'):
            myprog $i $j > output.$i.$j

- Each computation is totally independent of the others
- Run sequentially, this pseudocode takes 50 times as long as a single execution (10 values of i times 5 values of j)
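The sweep in the pseudocode above can be enumerated up front as a list of independent jobs. The following is a minimal Python sketch, not Nimrod's own mechanism: the function name `build_jobs` is hypothetical, and the jobs are only listed, not executed.

```python
from itertools import product

# Parameter ranges taken from the pseudocode above.
I_VALUES = range(10, 101, 10)          # 10, 20, ..., 100
J_VALUES = ("v", "w", "x", "y", "z")


def build_jobs(program="myprog"):
    """Return one (command, output_file) pair per parameter combination.

    `build_jobs` is a hypothetical helper; it only describes the jobs,
    it does not run them.
    """
    jobs = []
    for i, j in product(I_VALUES, J_VALUES):
        command = [program, str(i), j]
        output_file = f"output.{i}.{j}"
        jobs.append((command, output_file))
    return jobs


if __name__ == "__main__":
    jobs = build_jobs()
    print(len(jobs), "independent jobs")
    for command, output_file in jobs[:3]:
        print(" ".join(command), ">", output_file)
```

Because no job depends on another, the list can be handed to any number of machines in any order, which is what makes the model coarse-grained SPMD.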

Working with Small Clusters
- Nimrod
  - DSTC funded project
  - Designed for department level clusters
  - Proof of concept
- Clustor (Activetools)
  - Commercial version of Nimrod
  - Re-engineered
- Features
  - Workstation orientation
  - Access to idle workstations
  - Random allocation policy
  - Password security

Execution Architecture
[Figure: execution architecture. Labels: Input Files, Substitution, Output Files, Computational Nodes, Root Machine]

Clustor Tools

Dispatch cycle using Clustor...

Sample Applications of Clustor
- Bioinformatics: Protein Modeling
- Sensitivity experiments on smog formation
- Parametric study of Laser detuning
- Combinatorial Optimization: Simulated Annealing
- Ecological Modeling: Control Strategies for Cattle Tick
- Electronic CAD: Field Programmable Gate Arrays
- Computer Graphics: Ray Tracing
- High Energy Physics: Searching for Rare Events
- Physics: Laser-Atom Collisions
- VLSI Design: SPICE Simulations

Clustor limitations
- Manual resource location: static file of machine names
- No resource scheduling: first come first served
- No cost model: all machines cost alike
- Single access mechanism

Requirements
Users and system managers want to know
- where it will run
- when it will run
- how much it will cost
- that access is secure
- homogeneous access

Towards Grid Computing….
Source: & updated

Why “The Grid”?
New applications based on high-speed coupling of people, computers, databases, instruments, etc.
- Computer-enhanced instruments
- Collaborative engineering
- Browsing of remote datasets
- Use of remote software
- Data-intensive computing
- Very large-scale simulation
- Large-scale parameter studies
Source:

The Grid Vision
To offer “Dependable, consistent, pervasive access to [high-end] resources”
- Dependable: Can provide performance and functionality guarantees
- Consistent: Uniform interfaces to a wide variety of resources
- Pervasive: Ability to “plug in” from anywhere
Source:

Challenging Issues
- Authenticate once
- Specify simulation (code, resources, etc.)
- Locate resources
- Negotiate authorization, acceptable use, etc.
- Acquire resources
- Initiate computation
- Steer computation
- Access remote datasets
- Collaborate on results
- Account for usage
[Figure labels: Domain 1, Domain 2]
Source:

Standards & Commodity Tech
- Where appropriate, exploit standards and commodity technology in core infrastructure
  - LDAP, SSL, X.509, GSS-API, GAA-API, http, ftp, XML, etc.
  - Provides leverage
- Interface with other common standards
  - CORBA, Java/Jini, DCOM, Web, etc.
  - While our core infrastructure may not be built on one of these distributed architectures, we must cleanly interface with them
Source:

The Globus Project
- Basic research in grid-related technologies
  - Resource management, QoS, networking, storage, security, adaptation, policy, etc.
- Development of Globus toolkit
  - Core services for grid-enabled tools & applications
- Construction of large grid testbed: GUSTO
  - Largest grid testbed in terms of sites & apps
- Application experiments
  - Tele-immersion, distributed computing, etc.
Source:

Layered Architecture (Grid Components)
- Applications
- High-level Services and Tools: GlobusView, Testbed Status, DUROC, MPI, MPI-IO, CC++, Nimrod/G, globusrun
- Core Services: Nexus, GRAM, Metacomputing Directory Service, Globus Security Interface, Heartbeat Monitor, Gloperf, GASS
- Local Services: Condor, MPI, TCP, UDP, LSF, Easy, NQE, AIX, Irix, Solaris
Source:

Core Globus Services
- Communication infrastructure (Nexus, IO)
- Information services (MDS)
- Network performance monitoring (Gloperf)
- Process monitoring (HBM)
- Remote file and executable management (GASS and GEM)
- Resource management (GRAM)
- Security (GSI)
Source:

Nimrod/G Architecture
[Figure: Nimrod/G architecture. Labels: Nimrod/G Client (three instances), Parametric Engine, Schedule Advisor, Resource Discovery, Persistent Info., Dispatcher, Grid Directory Services, Grid Middleware Services, GUSTO Test Bed]

Nimrod/G Interactions
Additional services used implicitly:
- GSI (authentication & authorization)
- Nexus (communication)
[Figure: Nimrod/G interactions. Labels: Resource location, MDS server, Scheduler, Resource allocation (local), Parametric Engine, Dispatcher, GRAM server, Queuing System, Job Wrapper, User process, GASS server, File access, Root node, Gatekeeper node, Computational node]

Global resource allocation
- Global information is hard to get and out of date
  - Load balancing
  - Fairness to multiple users
- Global limits are easy to set and fairly stable
  - Load profiling
  - Cost-based resource allocation

Computational Economy
- Resource selection is market-based and uses real money
- A large number of sellers and buyers (resources may be dedicated/shared)
- Negotiation: issue tenders/bids and select the offers that meet the requirements
- Trading and Advance Resource Reservation
- Schedule computations on those resources that meet all requirements
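The negotiation step (collect bids, then keep only the offers that meet the requirements) might look roughly like the sketch below. It is an illustration only, not the Nimrod/G implementation: the `Offer` fields, the resource names, and the prices are all invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """A hypothetical bid from a resource owner (all fields are assumptions)."""
    resource: str           # invented resource name
    price_per_job: float    # price quoted by the seller
    hours_per_job: float    # time the seller estimates for one job


def select_offers(offers, max_price_per_job, max_hours_per_job):
    """Keep the offers that meet both requirements, cheapest first."""
    acceptable = [
        o for o in offers
        if o.price_per_job <= max_price_per_job
        and o.hours_per_job <= max_hours_per_job
    ]
    return sorted(acceptable, key=lambda o: (o.price_per_job, o.hours_per_job))


# Invented example bids.
offers = [
    Offer("cluster-a", price_per_job=4.0, hours_per_job=1.0),
    Offer("cluster-b", price_per_job=2.0, hours_per_job=3.0),
    Offer("cluster-c", price_per_job=3.0, hours_per_job=1.5),
]
chosen = select_offers(offers, max_price_per_job=3.5, max_hours_per_job=2.0)
print([o.resource for o in chosen])
```

Here `cluster-a` is too expensive and `cluster-b` is too slow, so only `cluster-c` survives the filter.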

Cost Model
[Figure: cost matrix. Labels: 1, 3, 2; User 1, User 5; Machine 1, Machine 5]
- Non-uniform costing, which varies
  - from time to time
  - from one user to another
  - with usage duration
- Encourages use of local resources first
- User can access remote resources, but pays a penalty in higher cost
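A non-uniform cost function of this kind could be sketched as follows. Every number and name here is an assumption made for illustration (the base rates, the peak-hour window, the remote penalty, and the user-to-home-machine mapping); the slides do not give the actual pricing rules.

```python
# All rates below are invented for illustration; they are not from the slides.
BASE_RATE = {"machine1": 1.0, "machine5": 2.0}       # cost units per hour
HOME_MACHINE = {"user1": "machine1", "user5": "machine5"}
REMOTE_PENALTY = 3.0    # multiplier when a user runs on a non-local machine
PEAK_MULTIPLIER = 1.5   # multiplier for hours starting 09:00 through 17:00


def cost(user, machine, start_hour, duration_hours):
    """Charge per whole hour; the rate depends on user, machine and time of day."""
    rate = BASE_RATE[machine]
    if HOME_MACHINE.get(user) != machine:
        rate *= REMOTE_PENALTY          # remote access pays a penalty
    total = 0.0
    for offset in range(duration_hours):
        hour = (start_hour + offset) % 24
        multiplier = PEAK_MULTIPLIER if 9 <= hour < 18 else 1.0
        total += rate * multiplier
    return total


print(cost("user1", "machine1", 20, 2))   # local machine, off-peak
print(cost("user1", "machine5", 20, 2))   # remote machine, off-peak
print(cost("user1", "machine1", 17, 2))   # local machine, one peak hour
```

With these made-up rates, the same two-hour job is six times dearer for user1 on the remote machine than on the local one, which is the incentive to use local resources first.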

A Nimrod/G User Console
[Figure: Nimrod/G user console. Labels: Deadline, Cost, Available Machines]

Some early results

Related Works
- AppLeS (UC San Diego): application level scheduling & case-by-case
- NetSolve (UTK/ORNL): API for creating farms
- DISCWorld (U. Adelaide): remote information access
- Millennium (UC Berkeley): remote execution environment on clusters; supports computational economy

Conclusions
- The Nimrod/G architecture offers a scalable model for resource management and scheduling on computational grids
- Supports Computational Economy
- The current model supporting Parametric Computing can be extended to support parallel jobs or any other computational model
- We plan to use Advance Resource Reservation to offer a feature whereby the user can say “I am willing to pay $…, can you complete my job by this time…”
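The user query quoted above ("I am willing to pay this much, can you finish by this time?") amounts to a feasibility check over the available machines. The sketch below is one simple way to answer it, assuming independent jobs of equal length, a fixed per-job price on each machine, and one job at a time per machine. The function name, machine names, and figures are hypothetical, and this is not the Nimrod/G scheduler.

```python
def cheapest_plan(num_jobs, deadline_hours, budget, machines):
    """Try to place all jobs before the deadline at minimum cost.

    machines: list of (name, hours_per_job, price_per_job) tuples.
    Returns (plan, total_cost), where plan maps machine name to job count,
    or None if the deadline or the budget cannot be met.
    """
    plan = {}
    remaining = num_jobs
    total_cost = 0.0
    # Fill the cheapest machines first; each machine runs its jobs in sequence.
    for name, hours_per_job, price_per_job in sorted(machines, key=lambda m: m[2]):
        if remaining == 0:
            break
        capacity = int(deadline_hours // hours_per_job)  # jobs that fit in time
        count = min(capacity, remaining)
        if count > 0:
            plan[name] = count
            total_cost += count * price_per_job
            remaining -= count
    if remaining > 0 or total_cost > budget:
        return None
    return plan, total_cost


# Invented machines: (name, hours per job, price per job).
MACHINES = [
    ("local", 1.0, 1.0),
    ("remote-a", 0.5, 2.0),
    ("remote-b", 0.25, 4.0),
]

print(cheapest_plan(50, 10, 150, MACHINES))
print(cheapest_plan(50, 10, 100, MACHINES))
```

Filling the cheapest machines first gives the lowest total cost under these assumptions, so if that plan exceeds the budget, no other assignment would fit it either.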

Further Information
- Nimrod/G:
- Active Tools (Clustor):

Closed systems
