Nov. 9, 2002 Chan-Hyun Youn Information and Communications University Grid Middleware Service.

Slides:



Advertisements
Similar presentations
Nimrod/G GRID Resource Broker and Computational Economy
Advertisements

Architectural Models for Resource Management in the Grid
Nimrod/G and Grid Market A Case for Economy Grid Architecture for Service Oriented Global Grid Computing Rajkumar Buyya, David Abramson, Jon Giddy Monash.
The Anatomy of the Grid: An Integrated View of Grid Architecture Carl Kesselman USC/Information Sciences Institute Ian Foster, Steve Tuecke Argonne National.
High Performance Computing Course Notes Grid Computing.
Condor-G: A Computation Management Agent for Multi-Institutional Grids James Frey, Todd Tannenbaum, Miron Livny, Ian Foster, Steven Tuecke Reporter: Fu-Jiun.
A Computation Management Agent for Multi-Institutional Grids
DataGrid is a project funded by the European Union 22 September 2003 – n° 1 EDG WP4 Fabric Management: Fabric Monitoring and Fault Tolerance
Resource Management of Grid Computing
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
USING THE GLOBUS TOOLKIT This summary by: Asad Samar / CALTECH/CMS Ben Segal / CERN-IT FULL INFO AT:
Universität Dortmund Robotics Research Institute Information Technology Section Grid Metaschedulers An Overview and Up-to-date Solutions Christian.
Workload Management Workpackage Massimo Sgaravatto INFN Padova.
Milos Kobliha Alejandro Cimadevilla Luis de Alba Parallel Computing Seminar GROUP 12.
Grids and Grid Technologies for Wide-Area Distributed Computing Mark Baker, Rajkumar Buyya and Domenico Laforenza.
Globus activities within INFN Massimo Sgaravatto INFN Padova for the INFN Globus group
Workload Management Massimo Sgaravatto INFN Padova.
1 GRID D. Royo, O. Ardaiz, L. Díaz de Cerio, R. Meseguer, A. Gallardo, K. Sanjeevan Computer Architecture Department Universitat Politècnica de Catalunya.
First steps implementing a High Throughput workload management system Massimo Sgaravatto INFN Padova
Evaluation of the Globus GRAM Service Massimo Sgaravatto INFN Padova.
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
Resource Management Reading: “A Resource Management Architecture for Metacomputing Systems”
Core Grid Functions: A Minimal Architecture for Grids William E. Johnston Lawrence Berkeley National Lab and NASA Ames Research Center (www-itg.lbl.gov/~wej)
Grid Computing 7700 Fall 2005 Lecture 17: Resource Management Gabrielle Allen
Grid Toolkits Globus, Condor, BOINC, Xgrid Young Suk Moon.
INFN-GRID Globus evaluation (WP 1) Massimo Sgaravatto INFN Padova for the INFN Globus group
Nimrod/G GRID Resource Broker and Computational Economy David Abramson, Rajkumar Buyya, Jon Giddy School of Computer Science and Software Engineering Monash.
DISTRIBUTED COMPUTING
ARGONNE  CHICAGO Ian Foster Discussion Points l Maintaining the right balance between research and development l Maintaining focus vs. accepting broader.
WP9 Resource Management Current status and plans for future Juliusz Pukacki Krzysztof Kurowski Poznan Supercomputing.
GT Components. Globus Toolkit A “toolkit” of services and packages for creating the basic grid computing infrastructure Higher level tools added to this.
1 School of Computer, National University of Defense Technology A Profile on the Grid Data Engine (GridDaEn) Xiao Nong
Grid Workload Management & Condor Massimo Sgaravatto INFN Padova.
Grid Technologies  Slide text. What is Grid?  The World Wide Web provides seamless access to information that is stored in many millions of different.
Chapter 4 Realtime Widely Distributed Instrumention System.
Grid Workload Management Massimo Sgaravatto INFN Padova.
The Globus Project: A Status Report Ian Foster Carl Kesselman
The Anatomy of the Grid Mahdi Hamzeh Fall 2005 Class Presentation for the Parallel Processing Course. All figures and data are copyrights of their respective.
Virtual Data Grid Architecture Ewa Deelman, Ian Foster, Carl Kesselman, Miron Livny.
1 4/23/2007 Introduction to Grid computing Sunil Avutu Graduate Student Dept.of Computer Science.
“ A Distributed Computational Economy and the Nimrod-G Grid Resource Broker ”
Grid Architecture William E. Johnston Lawrence Berkeley National Lab and NASA Ames Research Center (These slides are available at grid.lbl.gov/~wej/Grids)
Grid Computing Environments Grid: a system supporting the coordinated resource sharing and problem-solving in dynamic, multi-institutional virtual organizations.
Ames Research CenterDivision 1 Information Power Grid (IPG) Overview Anthony Lisotta Computer Sciences Corporation NASA Ames May 2,
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
Grid Security: Authentication Most Grids rely on a Public Key Infrastructure system for issuing credentials. Users are issued long term public and private.
What is SAM-Grid? Job Handling Data Handling Monitoring and Information.
Globus Toolkit Massimo Sgaravatto INFN Padova. Massimo Sgaravatto Introduction Grid Services: LHC regional centres need distributed computing Analyze.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
CEOS Working Group on Information Systems and Services - 1 Data Services Task Team Discussions on GRID and GRIDftp Stuart Doescher, USGS WGISS-15 May 2003.
Campus grids: e-Infrastructure within a University Mike Mineter National e-Science Centre 14 February 2006.
Introduction to Grids By: Fetahi Z. Wuhib [CSD2004-Team19]
Introduction to Grid Computing and its components.
Globus Grid Tutorial Part 2: Running Programs Across Multiple Resources.
Aneka Cloud ApplicationPlatform. Introduction Aneka consists of a scalable cloud middleware that can be deployed on top of heterogeneous computing resources.
Networking: Applications and Services Antonia Ghiselli, INFN Stu Loken, LBNL Chairs.
Securing the Grid & other Middleware Challenges Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.
GraDS MacroGrid Carl Kesselman USC/Information Sciences Institute.
GRID ANATOMY Advanced Computing Concepts – Dr. Emmanuel Pilli.
Super Computing 2000 DOE SCIENCE ON THE GRID Storage Resource Management For the Earth Science Grid Scientific Data Management Research Group NERSC, LBNL.
Globus: A Report. Introduction What is Globus? Need for Globus. Goal of Globus Approach used by Globus: –Develop High level tools and basic technologies.
Rights Management for Shared Collections Storage Resource Broker Reagan W. Moore
The Globus Toolkit The Globus project was started by Ian Foster and Carl Kesselman from Argonne National Labs and USC respectively. The Globus toolkit.
INTRODUCTION TO GRID & CLOUD COMPUTING U. Jhashuva 1 Asst. Professor Dept. of CSE.
Towards a High Performance Extensible Grid Architecture Klaus Krauter Muthucumaru Maheswaran {krauter,
Workload Management Workpackage
Peter Kacsuk – Sipos Gergely MTA SZTAKI
Grid Computing.
University of Technology
Wide Area Workload Management Work Package DATAGRID project
Presentation transcript:

Nov. 9, 2002 Chan-Hyun Youn Information and Communications University Grid Middleware Service

Int’l DataGrid Workshop 2 Contents Grid and Middleware Services Architectural Model for Resource Management  Hierarchical Resource Management  Abstract Owner  Market Model Scheduling Algorithms in Economy Grid Example of Application level Scheduler Concluding Remarks

Grid Information Service Uniform Resource Access BrokeringGlobal Queuing Global Event Services Co- Scheduling Data Cataloguing Uniform Data Access Collaboration and Remote Instrument Services Network Cache Communication Services Authentication Authorization Security Services AuditingFault Management Monitoring Grid Common Services: Standardized Services and Resources Interfaces Toolkits: Visualization, data publish/subscribe, etc. Applications: Simulations, Data Analysis, etc. Resources Discipline Specific Portals and Scientific Workflow Management Systems Condor pools network caches tertiary storage national user facilities clusters national supercomputer facility high-speed networks and communications services = Globus services Architecture of a Grid Source: IPG (Johnston)

Int’l DataGrid Workshop 4 Heterogeneous Computing: IPG Milestone Completed 10/2000 IPG managed compute and data management resources results study concept IPG Grid Common Services: Standardized services and uniform resource access study object 1) Condor Workstation Pool mgr. Molecular design application for nanotechnology devices and materials Uses 0.5 million otherwise idle CPU hours/year scavenged from a Sun and SGI workstations - a subset of the NAS Condor pool The Condor system is an IPG middleware service 2) Parameter Study Manager - Two problem solving environments use IPG services for uniform access to heterogeneous resources. ILab aerospace design parameter study manager uses IPG to access distributed computing and data resources

Int’l DataGrid Workshop 5 Online Instrumentation: Real-time Experiment Interaction computer simulations real-time collection multi-source data analysis, desktop & VR clients with shared controls Unitary Plan Wind Tunnel archival storage real-time experiment control

Int’l DataGrid Workshop 6 Grid from Services View : : E.g., Applications Resource-specific implementations of basic services E.g., Transport protocols, name servers, differentiated services, CPU schedulers, public key infrastructure, site accounting, directory service, OS bypass Resource-independent and application-independent services authentication, authorization, resource location, resource allocation, events, accounting, remote data access, information, policy, fault detection Distributed Computing Toolkit Grid Fabric (Resources) Grid Services (Middleware) Application Toolkits Data- Intensive Applications Toolkit Collaborative Applications Toolkit Remote Visualization Applications Toolkit Problem Solving Applications Toolkit Remote Instrumentation Applications Toolkit Applications Chemistry Biology Cosmology High Energy Physics Environment

Int’l DataGrid Workshop 7 Middleware Layered collection of middleware services that provide to applications uniform views of distributed resource components and the mechanisms for assembling them into systems –Grid Workload Management, Data Management, Monitoring services –Management of the Local Computing Fabric –Mass Storage Services extend both “up and down” through the various layers of the computing and communications infrastructure

Int’l DataGrid Workshop 8 Functions in Middleware Workload management –The workload is chaotic – unpredictable job arrival rates, data access patterns –The goal is maximising the global system throughput (events processed per second) Data management –Management of petabyte-scale data volumes, in an environment with limited network bandwidth and heavy use of mass storage (tape) –Caching, replication, synchronisation, object database model Application monitoring –Tens of thousands of components, thousands of jobs and individual users –End-user - tracking of the progress of jobs and aggregates of jobs –Understanding application and grid level performance –Administrator – understanding which global-level applications were affected by failures, and whether and how to recover

Int’l DataGrid Workshop 9 Middleware (in Local Fabric) Effective local site management of giant computing fabrics –Automated installation, configuration management, system maintenance –Automated monitoring and error recovery - resilience, self-healing –Performance monitoring –Characterisation, mapping, management of local Grid resources Mass storage management  multi-PetaByte data storage  “ real-time ” data recording requirement  active tape layer – 1,000s of users  uniform mass storage interface  exchange of data and meta-data between mass storage systems

Int’l DataGrid Workshop Technical Approach in Layered Network vBNS IDREN Campus Internet 2 GigaPop ESNet Internet LBNL Ames ANL Global Middleware Services Resource Scheduling Network Cache QoS Broker Monitoring & Management Access Control Cache Tertiary (mass) storage Super- Computer Wind Tunnel Tertiary storage Cluster NCAR Applications Applications need uniform views of resources, and middleware must deal with the fact that most “real” resources are “locally” owned Local Services Source: Grid’98 Workshop (Johnston)

Int’l DataGrid Workshop Operation Model (1) vBNS IDREN Campus Internet 2 GigaPop ESNet Internet LBNL Ames ANL Network Cache QoS Broker Access Control Cache Tertiary (mass) storage Super- Computer Wind Tunnel Tertiary storage Cluster NCAR Applications Some services are provided in the middleware Middleware must actually reach well ! Resource Characteristics Resource Scheduling Global Middleware Services Monitoring & Management Most services drill down to institutional resources Data Catalogues Some services drill down to the various network layers Local Services Source: Grid’98 Workshop (Johnston)

Int’l DataGrid Workshop Operation Model (2) vBNS IDREN Campus Internet 2 GigaPop ESNet Internet LBNL Ames ANL Network Cache QoS Broker Access Control Cache Tertiary (mass) storage Super- Computer Wind Tunnel Tertiary storage Cluster NCAR Applications Some services are provided in the middleware Middleware layer and infrastructure to provide the transparent access for applications ! Resource Characteristics Resource Scheduling Global Middleware Services Monitoring & Management Data Catalogues Local Services Proxy management for multi-site resources Configure Analyzer Monitor Re-configure Cache Re-configure Monitor Source: Grid’98 Workshop (Johnston)

Int’l DataGrid Workshop 13 Middleware Approach Toolkit and services addressing key technical problems –Modular “bag of services” model –Not a vertically integrated solution –can be applied to many application domains Inter-domain issues, rather than clustering –Integration of intra-domain solutions

Int’l DataGrid Workshop 14 GRID Workload Management Architecture and services for scheduling and resource management Challenging issues: –Optimal co-allocation of data, CPU and network for specific jobs –Distributed scheduling (data and/or code migration) of unscheduled/scheduled jobs –Uniform interface to various local resource managers –Usage policies on resource (CPU, data, network)

Int’l DataGrid Workshop 15 GRID Data Management Services and tools for data management Challenging issues: –Petabyte-scale information volumes –High speed data moving and replica –Replica synchronization –Data caching –Uniform interface to mass storage management systems

Int’l DataGrid Workshop 16 GRID Monitoring Services Tools and infrastructure for status and error monitoring Tasks and challenges: –Develop instrumentation APIs –Integration with information services –Real time and long term monitoring –Analysis of multivariable data –job performance optimisation –problem tracing

Int’l DataGrid Workshop 17 Fabric Management Tools for new automated system management techniques of large computing fabrics Tasks and challenges: –Management of very large computing fabrics –Reduced costs of administration and operations –Dynamic management of new resources –Scalability to thousands processors An innovative approach: self-healing –algorithms for fault detection and localization –automatic reconfiguration of the fabric –automatic task re-running

Int’l DataGrid Workshop 18 Mass Storage Management Integration of local mass storage management systems within the DataGrid Tasks and challenges: –Develop interface APIs –Develop data import/export interfaces –Publication of Information and metadata

Int’l DataGrid Workshop 19 Globus

Int’l DataGrid Workshop 20 Globus Approach A software toolkit addressing key technical problems –Offer a modular bag of technologies –Enable incremental development of grid-enabled tools and applications –Define and standardize grid protocols and APIs Focus is on inter-domain issues, not clustering –Supports collaborative resource use spanning multiple organizations –Integrates cleanly with intra-domain services –Creates a collective service layer

Int’l DataGrid Workshop 21 Globus Approach Focus on architecture issues –Provide implementations of grid protocols and APIs as basic infrastructure –Use to construct high-level, domain- specific solutions Design principles –Keep participation cost low –Enable local control –Support for adaptation Diverse global services Core Globus services Local OS A p p l i c a t i o n s

Int’l DataGrid Workshop 22 Four Key Protocols The Globus Toolkit centers around four key protocols –Connectivity layer: Security: Grid Security Infrastructure (GSI) –Resource layer: Resource Management: Grid Resource Allocation Management (GRAM) Information Services: Grid Resource Information Protocol (GRIP) Data Transfer: Grid File Transfer Protocol (GridFTP)

Int’l DataGrid Workshop 23 Site A (Kerberos) Site B (Unix) Site C (Kerberos) Computer User Single sign-on via “grid-id” & generation of proxy cred. Or: retrieval of proxy cred. from online repository User Proxy Proxy credential Computer Storage system Communication* GSI-enabled FTP server Authorize Map to local id Access file Remote file access request* GSI-enabled GRAM server GSI-enabled GRAM server Remote process creation requests* * With mutual authentication Process Kerberos ticket Restricted proxy Process Restricted proxy Local id Authorize Map to local id Create process Generate credentials Grid Security Infrastructure in Action

Int’l DataGrid Workshop 24 Resource Management The Grid Resource Allocation Management (GRAM) protocol and client API allows programs to be started on remote resources, despite local heterogeneity Resource Specification Language (RSL) is used to communicate requirements A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services –Integrated with Condor, MPICH-G2, …

Int’l DataGrid Workshop 25 Resource Management Issues for Grid Computing Site autonomy –Resources owned by different organizations, in different administrative domains –Local policies for use, scheduling, security Heterogeneous substrate –Different local resource management systems Policy extensibility –Local sites need ability to customize their resource management policies Co-allocation –May need resources at several sites –Mechanism for allocating multiple resources, initiating computation, monitoring and managing On-line control –Adapt application requirements to resource availability

Int’l DataGrid Workshop 26 GRAM LSFEASY-LLNQE Application RSL Simple ground RSL Information Service Local resource managers RSL specialization Broker Ground RSL Co-allocator Queries & Info Resource Management Architecture

Int’l DataGrid Workshop 27 Local Resource Managers Implemented with Globus Resource Allocation Manager (GRAM) –Processing RSL specifications representing resource requests Deny request Create one or more processes (jobs) that satisfy request –Enable remote monitoring and management of jobs –Periodically update MDS information service with current availability and capabilities of resources GRAM is responsible for –Parsing and processing RSL –Job monitoring –MDS update

Int’l DataGrid Workshop 28 Grid Information Services System information is critical to operation of the grid and construction of applications –What resources are available? Resource discovery –What is the “state” of the grid? Resource selection –How to optimize resource use Application configuration and adaptation? We need a general information infrastructure to answer these questions

Int’l DataGrid Workshop 29 GIS Architecture AA Customized Aggregate Directories RRRR Standard Resource Description Services Registration Protocol Users Enquiry Protocol

Int’l DataGrid Workshop 30 A Model Architecture for Data Grids Metadata Catalog Replica Catalog Tape Library Disk Cache Attribute Specification Logical Collection and Logical File Name Disk ArrayDisk Cache Application Replica Selection Multiple Locations NWS Selected Replica GridFTP Control Channel Performance Information & Predictions Replica Location 1Replica Location 2Replica Location 3 MDS GridFTP Data Channel

Int’l DataGrid Workshop 31 GridFTP: Basic Approach FTP protocol is defined by several IETF RFCs Start with most commonly used subset –Standard FTP: get/put etc., 3 rd -party transfer Implement standard but often unused features –GSS binding, extended directory listing, simple restart Extend in various ways, while preserving interoperability with existing servers –Striped/parallel data channels, partial file, automatic & manual TCP buffer setting, progress monitoring, extended restart

Int’l DataGrid Workshop 32 Striped GridFTP Server Parallel File System (e.g. PVFS, PFS, etc.) MPI-IO … Plug-in Control GridFTP Server Parallel Backend GridFTP server master mpirun GridFTP client Plug-in Control Plug-in Control Plug-in Control … MPI (Comm_World) MPI (Sub-Comm) To Client or Another Striped GridFTP Server Control socket GridFTP Control ChannelGridFTP Data Channels

Int’l DataGrid Workshop 33 Condor

Int’l DataGrid Workshop 34 What is Condor? Condor converts collections of distributively owned workstations and dedicated clusters into a distributed high-throughput computing facility.  Resource finder  Batch queue manager  Scheduler  Checkpoint/Restart  Process migration  Remote system calls All jobs Jobs linked with the Condor library

Int’l DataGrid Workshop 35 Layered Design Resource Access Control Match-Making Request Agent Application RM Application Condor Resource Owner System Administrator Customer/User

Int’l DataGrid Workshop 36 Unique Mechanisms Checkpointing –Enables Preemptive Resume Resource Allocation (essential in an opportunistic environment) Remote I/O –Enables computation across administrative domains (essential for HTC) ClassAds –Enables flexible resource matchmaking (essential in a distributively owned environment)

Int’l DataGrid Workshop 37 Condor System Structure Submit MachineExecution Machine Collector CA [...A] [...B] [...C] CN RA Negotiator Customer AgentResource Agent Central Manager

Int’l DataGrid Workshop 38

Int’l DataGrid Workshop TENT

Int’l DataGrid Workshop 40 TENT A distributed workflow management and integration system for engineering applications developed by –German Aerospace Center (DLR), Simulation and Software Technology (SISTEC) –German National Research Center for Information Technology (GMD), Institute for Algorithms and Scientific Computing (SCAI)

Int’l DataGrid Workshop 41 TENT - The Integration Framework visualization

Int’l DataGrid Workshop 42 TENT Packages

Int’l DataGrid Workshop 43 TENT - Software architecture

Int’l DataGrid Workshop 44 Architectural Models for Resource Management in the Grid

Int’l DataGrid Workshop 45 Typical Grid Computing Environment Grid Resource Broker Resource Broker Application Grid Information Service Grid Resource Broker database R2R2 R3R3 RNRN R1R1 R4R4 R5R5 R6R6 Grid Information Service

Int’l DataGrid Workshop 46 Sources of Complexity in Grid Resource Management No single administrative control. No single ownership policy: –Each resource owner has their own policies or scheduling mechanisms –Users must honour them (particularly external Grid users) Heterogeneity –resources : PCs, Workstations, clusters, supercomputers, instruments, databases, software … –fabric management systems and management policies –application requirements Dynamic availability – may appear and disappear…

Int’l DataGrid Workshop 47 Sources of Complexity in Grid Resource Management Unreliable resource – disappear from view No uniform cost model - varies from one user’s resource to another and from time of day. No single access mechanism – Web, custom interfaces, command line…

Int’l DataGrid Workshop 48 Grid Resource Management Issues Authentication (once). Specify (code, resources, etc.). Discover resources. Negotiate authorization, acceptable use, Cost, etc. Acquire resources. Schedule Jobs. Initiate computation. Steer computation. Access remote data-sets. Collaborate with results. Account for usage. Discover resources. Negotiate authorization, acceptable use, Cost, etc. Acquire resources. Schedule jobs. Initiate computation. Steer computation. Domain 2 Domain 1 Rajkumar Buyya (Monash Univ.)

Int’l DataGrid Workshop 49 Data Access for Resource Management

Int’l DataGrid Workshop 50 Architectural Models for RM MODELREMARKSSystems HierarchicalIt captures model followed in most contemporary systems. Globus, Legion, CCS, Apples, NetSolve, Ninf. Abstract Owner (AO)Order and delivery model and focuses on long term goals. Expected to emerge and most peer-2-peer computing systems likely to be based on this. Market ModelIt follows economic model for resource discover, sharing, & scheduling. GRACE, Nimrod/G, JavaMarket, Mariposa.

Int’l DataGrid Workshop 51 Hierarchical RM

Int’l DataGrid Workshop 52 Resource Management in Globus The Grid Resource Allocation Management (GRAM) protocol and client API allows programs to be started on remote resources, despite local heterogeneity Resource Specification Language (RSL) is used to communicate requirements A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services –Integrated with Condor, MPICH-G2, …

Int’l DataGrid Workshop 53 GRAM LSFEASY-LLNQE Application RSL Simple ground RSL Information Service Local resource managers RSL specialization Broker Ground RSL Co-allocator Queries & Info Resource Management Architecture in Globus

Int’l DataGrid Workshop 54 Local Resource Managers Implemented with Globus Resource Allocation Manager (GRAM) –Processing RSL specifications representing resource requests Deny request Create one or more processes (jobs) that satisfy request –Enable remote monitoring and management of jobs –Periodically update MDS information service with current availability and capabilities of resources GRAM is responsible for –Parsing and processing RSL –Job monitoring –MDS update

Int’l DataGrid Workshop 55 Globus/MPICH-G2 components Globus Security Infrastructure Globus-job-manager Client API calls to request resource allocation and process creation. MDS client API calls to locate resources Query current status of resource Launch RSL Library Parse Request Allocate & create processes Process Monitor & control Local site boundary MPI AppsMDS: Grid Index Info Server Globus Gatekeeper MDS: Grid Resource Info Server Globus Resource Manager MDS client API calls to get resource info Provide state change callbacks to client Process MPI messages MPICH-G2

Int’l DataGrid Workshop 56 High throughput workload management system architecture (simplified design) GRAM CONDOR GRAM LSF GRAM PBS globusrun Site1 Site2Site3 condor_submit (Globus Universe) Condor-G MasterGIS Submit jobs (using Class-Ads) Resource Discovery Information on characteristics and status of local resources

Condor Globus Universe

Int’l DataGrid Workshop 58 AO General Model

Int’l DataGrid Workshop 59 OrderPickup AO is owner or broker User negotiates with AO through “order window” That AO may own some resources, and/or it may broker with other AOs for those resources After negotiation, resources are delivered through “pickup window” Order Window Pickup Window Physical Resource User RequestsResources AO Order Pickup Resource Manager AO1 Manager DeliverySales AO2 AO3

Int’l DataGrid Workshop 60 AO Resources Resources are objects Classes are –Instrument Data source, sink, transform e.g. programs, people, files, data collection devices –Channel Moves data among instruments –Complexes of above Attributes define sizes, times, connections, etc. Instrument (File) Instrument (Program) Instrument (File) Instrument (Program) Channels Instrument (Telescope) Instrument (Person)

Int’l DataGrid Workshop 61 Negotiating with an AO Make dummy resource (with attributes set to constants, variables, or “don’t care”) + bid + delivery plan + variable constraints Resource candidates (values for variables/attributes + asking price for each) Pick one, Try again, Or give up Delivery Window Resource Order Window Assign tasks to resource, use, relinquish Perhaps later... USER AO

Int’l DataGrid Workshop 62 Economic Models for Trading Commodity Market Model Posted Prices Models Bargaining Model Tendering (Contract Net) Model Auction Model Proportional Resource Sharing Model Shareholder Model Partnership Model

Int’l DataGrid Workshop Economy Grid = Globus + GRACE Applications MDS GRAM Globus Security Interface Heartbeat Monitor Nexus Local Services LSF Condor GRDQBank PBS TCP SolarisIrixLinux UDP High-level Services and Tools DUROCglobusrunMPI-G Nimrod/G MPI-IOCC++ GlobusViewGrid Status GASS GRACE-TS GARA Grid Fabric Grid Apps. Grid Middleware Grid Tools GBank GMD eCash JVM DUROC Core Services ScienceEngineeringCommercePortalsActiveSheet … … Source: Rajkumar Buyya (Monash Univ.)

Int’l DataGrid Workshop 64 Grid Node N Grid Architecture for Computational Economy Grid User Application Grid Resource Broker Grid Service Providers Grid Explorer Schedule Advisor Trade Manager Job Control Agent Deployment Agent Trade Server Resource Allocation Resource Reservation R1R1 Misc. services Information Server(s) R2R2 RmRm … Pricing Algorithms Accounting Grid Node1 … Grid Middleware Services … … Health Monitor Grid Market Services JobExec Info ? Secure Trading QoS Storage Sign-on Source: Rajkumar Buyya (Monash Univ.)

Int’l DataGrid Workshop 65 GRACE components A resource broker (e.g., Nimrod/G) Various resource trading protocols for different economic models A mediator for negotiating between users and grid service providers (Grid Market Directory) A deal template for specifying resource requirements and services offers Grid Trading Server Pricing policy specification Accounting (e.g., QBank) and payment management (GridBank, not yet implemented)

Int’l DataGrid Workshop 66 Flow Diagram for Pricing, Accounting, Allocations and Job Scheduling QBank Resource Manager 4 IBM-LL/PBS/… Compute Resources clusters/SGI/SP/ Make Deposits, Transfers, Refunds, Queries/Reports 1. Clients negotiates for access cost. 2. Negotiation is performed per owner defined policies. 3. If client is happy, TS informs QB about access deal. 4. Job is Submitted 5. Check with QB for “go ahead” 6. Job Starts 7. Job Completes 8. Inform QB about resource resource utilization. Trade Server 3 1 Pricing Policy 2 Site GRID Bank (digital transactions) 0 Rajkumar Buyya (Monash Univ.)

Int’l DataGrid Workshop 67 A resource broker for managing, steering, and executing task farming (parametric sweep/SPMD model) applications on Grid based on deadline and computational economy. Based on users’ QoS requirements, our Broker dynamically leases services at runtime depending on their quality, cost, and availability. Key Features –A single window to manage & control experiment –Persistent and Programmable Task Farming Engine –Resource Discovery –Resource Trading –Scheduling & Predications –Generic Dispatcher & Grid Agents –Transportation of data & results –Steering & data management –Accounting Nimrod/G : A Grid Resource Broker Source: Rajkumar Buyya (Monash Univ.)

Int’l DataGrid Workshop 68 A Glance at Nimrod-G Broker Grid Middleware Nimrod/G Client Grid Information Server(s) Schedule Advisor Trading Manager Nimrod/G Engine Grid Store Grid Explorer GE GIS TM TS RM & TS Grid Dispatcher RM: Local Resource Manager, TS: Trade Server Globus, Legion, Condor, etc. G G C L Globus enabled node. Legion enabled node. G L Condor enabled node. RM & TS CL Source: Rajkumar Buyya (Monash Univ.)

Int’l DataGrid Workshop 69 Nimrod/G Interactions Grid Info Server Process Server User Process File access File Server Grid Node Nimrod Agent Compute Node User Node Grid Dispatcher Grid Trade Server Grid Scheduler Local Resource Manager Nimrod-G Grid Broker Task Farming Engine Grid Tools And Applications Do this in 30 min. for $10? Source: Rajkumar Buyya (Monash Univ.)

Int’l DataGrid Workshop 70 Discover Resources Distribute Jobs Establish Rates Meet requirements ? Remaining Jobs, Deadline, & Budget ? Evaluate & Reschedule Discover More Resources Adaptive Scheduling Steps Compose & Schedule Source: Rajkumar Buyya (Monash Univ.)

Int’l DataGrid Workshop 71 Concluding Remarks Restriction in Grid Middleware – Difficulties in distributed computing and resource management policy – Difficulties of middleware implementation required for heterogeneous systems in meta-computing infrastructure Globus, Condor, TENT, PARIS, Cactus, …. Difficulties of Resource Management in Grid Computing Models for Grid resource management architecture –Hierarchical, AO, and Market-model ….