Presentation is loading. Please wait.

Presentation is loading. Please wait.

Economic-based Resource Management and Scheduling for Global/Grid Computing Project Team: David Abramson Rajkumar Buyya Jonathan Giddy Rajkumar Buyya

Similar presentations


Presentation on theme: "Economic-based Resource Management and Scheduling for Global/Grid Computing Project Team: David Abramson Rajkumar Buyya Jonathan Giddy Rajkumar Buyya"— Presentation transcript:

1 Economic-based Resource Management and Scheduling for Global/Grid Computing Project Team: David Abramson Rajkumar Buyya Jonathan Giddy Rajkumar Buyya rajkumar@csse.monash.edu.au http://www.csse.monash.edu.au/~rajkumar

2 Agenda  Computing Platforms and How Grid is Different ?  Resource Management Issues  Resource Management Architectures (@GF)  Nimrod/G Broker  Computational Economy  Scheduling Algorithms  Limitations of current Nimrod/G  Proposal “GRACE - Grid Architecture of Computational Economy”  GRACE Negotiation Protocols and APIs  Conclusion

3 Computing Power (HPC) Drivers Solving grand challenging problem/applications using computer modeling, simulation and analysis Life Sciences Design & Analysis (CAD/CAM) Aerospace Geographic Information Systems Geographic Information Systems Military Applications

4 2100 Single Processor Shared Memory Local Cluster Global Cluster/Grid PERFORMANCEPERFORMANCE Computing Platforms Breaking Administrative Barriers Inter Planet Cluster/Grid ?? Individual Group Department Campus State National Globe Inter Planet Universe Administrative Barriers

5 Towards Grid Computing…. For illustration, placed resources arbitrarily on the GUSTO test-bed!!

6 What is Grid ?  An infrastructure that couples – Computers (PCs, workstations, clusters, traditional supercomputers, and even laptops, notebooks, mobile computers, PDA, and so on) … – Software ? (e.g., renting expensive special purpose applications on demand) – Databases (e.g., transparent access to human genome database) – Special Instruments (e.g., radio telescope--SETI@Home Searching for Life in galaxy, Austrophysics@Swinburne for pulsars) – People (may be even animals who knows ?)  across the Internet and presents them as an unified integrated (single) resource.

7 Grid Application-Drivers  Old and New applications getting enabled due to coupling of computers, databases, instruments, people, etc: – (distributed) Supercomputing – Collaborative engineering – high-throughput computing – large scale simulation & parameter studies – Remote software access / Renting Software – Data-intensive computing – On-demand computing

8 The Grid Vision: To offer “Dependable, consistent, pervasive access to [high-end] resources”  Dependable: Can provide performance and functionality guarantees  Consistent: Uniform interfaces to a wide variety of resources  Pervasive: Ability to “plug in” from anywhere Source: www.globus.org

9 The Grid Impact! “ “The global computational grid is expected to drive the economy of the 21st century similar to the electric power grid that drove the economy of the 20th century”

10 Sources of Complexity in Grid Resource Management  No single adminstrative control  No single policy – each resource owners have their own policies or scheduling mechanisms. – Users must honor them (particularly external users of the grid).  Heterogenity of resources (static and dynamic)  Unreliable - resource may come or disappear (die)  No single cost model (it cannot be) – varies from one user to another and time to time.  No Single access mechanism

11 Domain 2 Domain 1 Grid Resource Management: Challenging Issues Ack.: globus.. Authentication (once) Specify simulation (code, resources, etc.) Discover resources Negotiate authorization, acceptable use, Cost, etc. Acquire resources Schedule Jobs Initiate computation Steer computation Access remote data-sets Collaborate on results Account for usage

12 Grid Components and our Focus: Resource Brokers and Computational Economy Applications Core Services MDS GRAM Globus Security Interface Heartbeat Monitor Nexus Gloperf Local Services LSF CondorMPI NQEEasy TCP SolarisIrixAIX UDP High-level Services and Tools DUROCglobusrunMPI Nimrod/G MPI-IOCC++ GlobusViewTestbed Status GASS Source: Globus GRACE GARA Grid Fabric Grid Apps. Grid Middleware Grid Tools

13 Resource Management Strawman’s proposed at Grid Forum (GF-SCHED WG)*  Strawman 1: Too much explicitly mentions and tries to capture all current schedulers features. - Steve Chapin et. al  Strawman 2: too much abstract (you need to think yourself to understand it). - Jenny Schopf et. al  AO (Abstract Owner) emphasizes very much on order- delivery approach, looks too ideal and captures real world model, however, I doubt currently any systems support this grand image (hard to real at this moment). - Dave DiNucci  What we discuss here sounds like “the mixture of all the above through Nimrod/G and proposed GRACE middleware services.” - I plan to propose one soon !! * http://www.nas.nasa.gov/~nitzberg/sched-wg/ http://www.gridforum.org

14 Access ontrol Agent A Grid Resource Management Architecture (Strawman 1) Deployment Agent (state info. +reservation+bid+… services) Connection Cloud User Global Scheduler Control Domain Monitor Resource Grid Information Service Access/Admission Control Agent Domain Resource Manager or Control Agent Global Scheduler Global Scheduler Global Scheduler Global Scheduler Local Scheduler - Task Persistent Job Control Agent

15 Access ontrol Agent A Grid Resource Management Architecture (Strawman 2) Mapper Commit AgentDeploy Agent Job & Resource Info. Can I Map Task to Resource? Y/N Map T to R Mon, kill/restat Monitoring info. If required, talk to lower-level scheduler to answer query If required, talk to lower-level scheduler or resource manager to do the job. Lower Level Scheduler is expected to have similar architecture Higher Level Scheduler :

16 Access ontrol Agent A Grid Resource Management Architecture (AO, Abstract Owner) Abstract Owner Pickup Window Order Window Resource Manager Pickup Window Order Window Physical Resource Manager Pickup Window Order Window Delivery RepSales Rep AO1 Broker is Resource Owner: Broker is Not Resource Owner: Job Shop ResultJob AO for Grid GRID User

17 Nimrod/G Resource Broker Nimrod/G Approach to Resource Management and Scheduling

18  A global scheduler for managing and steering task farming (parametric simulation) applications on computational grid based on deadline and computational economy.  Key Features – A single window to manage and control whole experiment – Resource Discovery – Trade for Resources – Scheduling – Steering & data management  Leverages Globus Services  Other parallel application models can be supported easily (MPI & DUROC co-allocation) What is Nimrod/G ?

19 A Nimrod/G User Console CostDeadline AvailableMachines

20 Nimrod/G Architecture Grid Middleware Services Dispatcher Nimrod/G Client Grid Information Services Schedule Advisor Resource Discovery Parametric Engine GUSTO Test Bed Persistent Info. Grid Explorer

21 Nimrod/G Interactions MDS server Resource location Queuing System GRAM server Resource allocation (local) Additional services used implicitly: GSI (authentication & authorization) Nexus (communication) User process File access GASS server Gatekeeper node Job Wrapper Computational node Dispatcher Root node Scheduler Prmtc.. Engine

22 Computational Economy  Resource selection on based real money and market based  A large number of sellers and buyers (resources may be dedicated/shared)  Negotiation: tenders/bids and select those offers meet the requirement  Trading and Advance Resource Reservation  Schedule computations on those resources that meet all requirements

23 Global resource allocation: Meet Deadline, but minimize the cost  Global information is hard to get and out of date – Load balancing – Fairness to multiple users  Global limits are easy to set and fairly stable – Non equal number of jobs assigned to resources – Load profiling – Cost-based resource allocation

24 13 21 User 5 Machine 1 User 1 Machine 5 Cost Model  non-uniform costing – time to time – one user to another – usage duration  encourages use of local resources first  user can access remote resources, but pays a penalty in higher cost.

25 Current Scheduling Algorithm in Nimrod/G: Deadline based, and tries to meet using low cost resources.  M - Resources, N - Jobs, D - deadline  Note: Cost of any R i is less than any of R i+1 …. Or Rm – RL: Resource List need to be maintained in increasing order of cost  C t - Time when accessed (Time now)  T i - Job runtime (average) on Resource i (R i ) [updated periodically] – T i is acts as a load profiling parameter.  A i - number of jobs assigned to R i, where: – A i = Min (No.Unassigned Jobs, No. Jobs R i can complete by remaining deadline) – No.UnAssignedJobs i = Diff( N, (A 1 +…+A i-1 )) – JobsR i consume = RemainingTime (D- C t ) DIV T i  ALG: Invoke Job Assignment() periodically untile all jobs done. – Job Assignment(): – Establish ( RL, C t, T i, A i ) dynamically. – For all resources (I = 1 to M) { Assign A i Jobs to R i, if required}

26 Resource Usage (for various deadlines)

27

28

29

30 Scheduling Methods for Global Resource Allocation 1. Equal Assignment, but no Load Balancing 2. Equal Assignment and Load Balanced 3. (2) + deadline (no worry about cost) 4. (2) + deadline (minimize the cost of computation) 5. (4) + budget --> deadline + cost (up front agreement) – I am willing to pay $$$, can you complete by deadline. 6. (5) + Advance Resource Reservation 6. (6) + Grid Economy - dynamic pricing - use Trading/Tender/Bid process 7. (6) + Grid Economy - dynamic pricing - use auction technique 8. Genetic/Fuzzy Logic, etc algorithms

31 Scheduling Policies can be combined! Advaced Reservation Deadline Economic-based

32 Nimrod/G Limitations (current version)  Manual/Static Resource Cost Model – static file of machine cost  Partially support for computational economy – User cannot know the cost of computation until experiement completes. They just need to trust the broker.  No support dividing the budget among jobs.  Flat price model (limitation like Internet pricing)  No Advanced Resource Reservation  No resource trading

33 GRACE Gr id A rchitecture for C omputational E conomy  GRACE aims help Nimrod/G overcome the current limitations.  GRACE middleware offer generic interfaces (APIs) that other developers of grid tools can use along with Globus services.  GRACE may become part of Globus ?? – Then, APIs look like “ globus_grace_ …. (….) ”

34 Why Computational Economy in Resource Management ? “Observe Grid characteristics and current resource management policies”  Grid resources are not owned by user or single organisation.  They have their own administrative policy  Mismatch in resource demand and supply – overall resource demand may exceed supply.  Traditional System-centric (performance matrix approaches does not suit in grid environment. – System-Centric --> User Centric  Like in real life, economic-based approach is one of the best ways to regulate selection and scheduling on the grid as it captures user-intent.  Markets are an effective institution in coordinating the activities of several entities.

35 Advantages of Economic-based RM  System Centric --> User Centric Policy in RM  Helps in regulating demand and supply – resource access cost can fluctuate (based on demand and supply and system can adapt)  Scalable Solution – No need of central coordinator (during negotiation) – Resources(sellers) and also Users(buyers) can make their own decisions and try to maximize utility and profit.  Uniform Treatment of all Resources – Everything can can be traded including CPU, Mem, Net, Storage/Disk, other devices/instruments – Efficient allocation of resources

36 Resource Profile (that can be considered during resource selection)  Static profile – CPU (power, clock speed, arch.) – memory capacity – Disk (speed of access) – Network (bandwidth/latency) – OS  Dynamic Profile – Load – Available Capacity – Priority – Cost (trade off with all the above)

37 Grid Resource Management Architecture (Economic/Market-based) Client Application Super-scheduler or Broker Resource Domain Grid Explorer Scheduler Trade Manager Job Control Agent Deployment Agent Trade Server Process Server Resource Reservation R Other components of local resource manager Trading Grid Information Server

38 Key components involved in computation economy issues…  Client – accepts user job (application) – Budget (B) – Budget Handling Rules – Deadline (T) – Application requirements (storage etc.) – Normalised/Estimated runtime of tasks (optional) – Preferences  Super-scheduler or Broker – Persistent job control agent – Resource Discovery/Locator – Bid Manager (Negotiating with resource owners/bid servers) – Dispatcher

39 Key components involved in computation economy issues  Grid Information Directory/Server – Static profile – Dynamic profile – Cost/Price (yellow pages/posted) and conditions of use etc.  Bid Server (can be part of local scheduler/RM) – Negotiation Template – Trading/Auction  Local Resource Manager – Can support Bid Server – Advance Resource Reservation (e.g., GRAM+GARA) – Also handle Scheduling

40 Grid Open Trading Protocols Get Connected Call for Bid(DT) Reply to Bid(DT) Negotiate Deal(DT) Confirm Deal(DT, Y/N) …. Cancel Deal(DT) Change Deal(DT) Get Disconnected Bid ManagerBid Server API Pricing Rules DT - Deal Template - resource requirements (BM) - resource profile (BS) - price (any one can set) - status - change the above values - negotiation can continue - accept/decline - validity period

41 Open Trading Finite State Machine DT Offer BS ND NAD Offer BM > > ND - Negotiate Deal

42 GRACE APIs (for Trading between Bid Manager and Bid Server)  tid = grace_trade_connect(resource_id)  grace_request_quote/bid(tid, DT)  grace_deal_negotiate (tid, DT)  grace_deal_confirm(tid, DT)  grace_deal_cancel(tid, DT)  grace_deal_change( tid, DT)  grace_trade_reconnect(tid, resource_id)  grace_trade_disconnect(tid, resource_id) – tid = Trade Identification code – DT = Deal Template

43 GRACE Deal Template  Resource Requirement – CPU Time – Memory – Storage – Timeline (when and up to) DT - Deal Template - resource requirements (BM) - resource profile (BS) - price (any one can set) - status - change the above values - negotiation can continue - accept/decline - validity period  Resource Profile – Static – Dynamic – Cost Chart – Slots Chart

44 Scheduler Steps 1. Identify Accessible Resources 2. Call for Bids 3. Collect Bids 4. Evaluate Bids and Negotiate such that resources meets user requirements 5. Confirm all bidders (accept/reject) 6. Have contract/Agreement of Usage 7. Schedule Computations 8. Periodically Monitor progress and make sure that user requirements will be meet A. If “No”, Reschedule unfinished computations such that requirements are meet A. If “No”, Reschedule unfinished computations such that requirements are meet B. IF “yes” repeat Step (8) Until application processing completes. B. IF “yes” repeat Step (8) Until application processing completes.

45 Related Works  AppLeS (UC. San Diego) – application level scheduling templates & case-by- case for different Class of Apps.  NetSolve (UTK/ORNL) – API for creating farms  Millennium (UC. Berkeley) – remote execution environment on clusters and supports computational economy  CODINE/GRD (Genias/Gridware) – a RMS for clusters, but moving towards grid; meets deadline by dominating over others share.  Mariposa- Distributed Database system (UC, Berkeley) – query with budget, creates sub-query & divides budget, trades with (remote) servers

46 Scheduling Policies in GRD Policy - who gets how much Global Dynamic Scheduler 4tracks workload 4enforces policies 4manages global resource utilization Urgency-based Priority Initiate TimeDeadline User/Project share tree time based Share-based Usage Functional Priority Current load snapshot based Override System Boosts temporarily a project/job/team Policies can be combined

47 Deadline policy in GRD Urgency-based Priority Initiate TimeDeadline Deadline Job Other Jobs Share Utilization TimeRequired Completion

48 Scheduling Policies Policy - who gets how much Global Dynamic Scheduler 4tracks workload 4enforces policies 4manages global resource utilization Urgency-based Priority Initiate TimeDeadline User/Project share tree time based Share-based Usage Functional Priority Current load snapshot based Override System Boosts temporarily a project/job/team

49 Conclusions  Nimrod/G architecture offers a scalable model for resource management and scheduling on computational grids  Economic based approach to resource management is the way to go in the grid environment.  Resources can be traded for sequential and parallel applications.  Advance Resource Reservation and Computation Economy together helps in moving towards user- centric approach to resource management.  The user can say “I am willing to pay $…, can you complete my job by this time…”  Grid: A Next Generation Internet ?

50 Further Information  Nimrod/G Project: – http://www.dgs.monash.edu.au/~davida/nimrod.html  eGRID, Economy driven GRID resource management: – http://www.csse.monash.edu.au/~rajkumar/eGRID/  Grid Computing Info Centre: – http://www.gridcomputing.com  Millennium Compute Power Grid Project – http://www.ComputePower.com – You are invited to join CPG project!  GRID’2000 Meeting – http://www.gridcomputing.org

51 Thank You… Any ??


Download ppt "Economic-based Resource Management and Scheduling for Global/Grid Computing Project Team: David Abramson Rajkumar Buyya Jonathan Giddy Rajkumar Buyya"

Similar presentations


Ads by Google