Matchmaking in the Condor System Rajesh Raman Computer Sciences Department University of Wisconsin-Madison

Slides:



Advertisements
Similar presentations
1 Concepts of Condor and Condor-G Guy Warner. 2 Harvesting CPU time Teaching labs. + Researchers Often-idle processors!! Analyses constrained by CPU time!
Advertisements

More HTCondor 2014 OSG User School, Monday, Lecture 2 Greg Thain University of Wisconsin-Madison.
Condor-G: A Computation Management Agent for Multi-Institutional Grids James Frey, Todd Tannenbaum, Miron Livny, Ian Foster, Steven Tuecke Reporter: Fu-Jiun.
Dr. David Wallom Use of Condor in our Campus Grid and the University September 2004.
1 Draft of a Matchmaking Service Chuang liu. 2 Matchmaking Service Matchmaking Service is a service to help service providers to advertising their service.
Joseph A. Giampapa Octavio H. Juarez-Espinosa Katia P. Sycara The Robotics Institute Carnegie Mellon University 5000 Forbes Avenue Pittsburgh, PA
1 Spring Semester 2007, Dept. of Computer Science, Technion Internet Networking recitation #13 Web Caching Protocols ICP, CARP.
Design and Evaluation of a Resource Selection Framework for Grid Applications University of Chicago.
The Condor Data Access Framework GridFTP / NeST Day 31 July 2001 Douglas Thain.
Chapter 8: Network Operating Systems and Windows Server 2003-Based Networking Network+ Guide to Networks Third Edition.
Evaluation of the Globus GRAM Service Massimo Sgaravatto INFN Padova.
Condor Project Computer Sciences Department University of Wisconsin-Madison Asynchronous Notification in Condor By Vidhya Murali.
Jaeyoung Yoon Computer Sciences Department University of Wisconsin-Madison Virtual Machine Universe in.
Jim Basney Computer Sciences Department University of Wisconsin-Madison Managing Network Resources in.
Jaeyoung Yoon Computer Sciences Department University of Wisconsin-Madison Virtual Machines in Condor.
Resource Management Reading: “A Resource Management Architecture for Metacomputing Systems”
Zach Miller Computer Sciences Department University of Wisconsin-Madison What’s New in Condor.
Miron Livny Computer Sciences Department University of Wisconsin-Madison Harnessing the Capacity of Computational.
Alain Roy Computer Sciences Department University of Wisconsin-Madison An Introduction To Condor International.
Distributed Systems Early Examples. Projects NOW – a Network Of Workstations University of California, Berkely Terminated about 1997 after demonstrating.
An Introduction to High-Throughput Computing Monday morning, 9:15am Alain Roy OSG Software Coordinator University of Wisconsin-Madison.
Workload Management WP Status and next steps Massimo Sgaravatto INFN Padova.
1 HawkEye A Monitoring and Management Tool for Distributed Systems Todd Tannenbaum Department of Computer Sciences University of.
Hao Wang Computer Sciences Department University of Wisconsin-Madison Security in Condor.
1 1 Vulnerability Assessment of Grid Software Jim Kupsch Associate Researcher, Dept. of Computer Sciences University of Wisconsin-Madison Condor Week 2006.
Grid Computing I CONDOR.
1 The Roadmap to New Releases Todd Tannenbaum Department of Computer Sciences University of Wisconsin-Madison
Hunter of Idle Workstations Miron Livny Marvin Solomon University of Wisconsin-Madison URL:
Grid Workload Management Massimo Sgaravatto INFN Padova.
The Owner Share scheduler for a distributed system 2009 International Conference on Parallel Processing Workshops Reporter: 李長霖.
Condor: High-throughput Computing From Clusters to Grid Computing P. Kacsuk – M. Livny MTA SYTAKI – Univ. of Wisconsin-Madison
Alain Roy Computer Sciences Department University of Wisconsin-Madison ClassAds: Present and Future.
Douglas Thain, John Bent Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau, Miron Livny Computer Sciences Department, UW-Madison Gathering at the Well: Creating.
Todd Tannenbaum Computer Sciences Department University of Wisconsin-Madison Condor RoadMap.
The Roadmap to New Releases Derek Wright Computer Sciences Department University of Wisconsin-Madison
Derek Wright Computer Sciences Department University of Wisconsin-Madison MPI Scheduling in Condor: An.
Todd Tannenbaum Computer Sciences Department University of Wisconsin-Madison Quill / Quill++ Tutorial.
Computer Science Lecture 7, page 1 CS677: Distributed OS Multiprocessor Scheduling Will consider only shared memory multiprocessor Salient features: –One.
Dan Bradley University of Wisconsin-Madison Condor and DISUN Teams Condor Administrator’s How-to.
Derek Wright Computer Sciences Department University of Wisconsin-Madison Condor and MPI Paradyn/Condor.
Peter Couvares Associate Researcher, Condor Team Computer Sciences Department University of Wisconsin-Madison
Introduction.  Administration  Simple DBMS  CMPT 454 Topics John Edgar2.
Nick LeRoy Computer Sciences Department University of Wisconsin-Madison Hawkeye.
Greg Thain Computer Sciences Department University of Wisconsin-Madison Configuring Quill Condor Week.
Ian D. Alderman Computer Sciences Department University of Wisconsin-Madison Condor Week 2008 End-to-end.
A Constraint Language Approach to Grid Resource Selection Chuang Liu, Ian Foster Distributed System Lab University of Chicago
An Introduction to High-Throughput Computing With Condor Tuesday morning, 9am Zach Miller University of Wisconsin-Madison.
Scheduling & Resource Management in Distributed Systems Rajesh Rajamani, May 2001.
Nicholas Coleman Computer Sciences Department University of Wisconsin-Madison Distributed Policy Management.
Jaime Frey Computer Sciences Department University of Wisconsin-Madison What’s New in Condor-G.
Miron Livny Computer Sciences Department University of Wisconsin-Madison Condor and (the) Grid (one of.
Managing Network Resources in Condor Jim Basney Computer Sciences Department University of Wisconsin-Madison
Matthew Farrellee Computer Sciences Department University of Wisconsin-Madison Condor and Web Services.
Douglas Thain, John Bent Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau, Miron Livny Computer Sciences Department, UW-Madison Gathering at the Well: Creating.
Condor Tutorial NCSA Alliance ‘98 Presented by: The Condor Team University of Wisconsin-Madison
The OxGrid Resource Broker David Wallom. Overview OxGrid Resource Broking Why build our own Job Submission and other tools Future developments.
HTCondor’s Grid Universe Jaime Frey Center for High Throughput Computing Department of Computer Sciences University of Wisconsin-Madison.
Gabi Kliot Computer Sciences Department Technion – Israel Institute of Technology Adding High Availability to Condor Central Manager Adding High Availability.
Condor on Dedicated Clusters Peter Couvares and Derek Wright Computer Sciences Department University of Wisconsin-Madison
Enabling Grids for E-sciencE Agreement-based Workload and Resource Management Tiziana Ferrari, Elisabetta Ronchieri Mar 30-31, 2006.
Towards a High Performance Extensible Grid Architecture Klaus Krauter Muthucumaru Maheswaran {krauter,
Condor Week May 2012No user requirements1 Condor Week 2012 An argument for moving the requirements out of user hands - The CMS experience presented.
Quick Architecture Overview INFN HTCondor Workshop Oct 2016
A Distributed Policy Scenario
Negotiator Policy and Configuration
Accounting, Group Quotas, and User Priorities
Basic Grid Projects – Condor (Part I)
ACTIVE DIRECTORY An Overview.. By Karan Oberoi.
Chapter 5 Architectural Design.
GLOW A Campus Grid within OSG
Presentation transcript:

Matchmaking in the Condor System Rajesh Raman Computer Sciences Department University of Wisconsin-Madison

2 Outline › Introduction to Matchmaking  Architecture, Philosophy › Classified Advertisements (ClassAds)  Language for Matchmaking › Matchmaking in Condor › Future Directions in Matchmaking › Conclusion

3 Matchmaking › Matchmaking is a methodology for Distributed Resource Management › Conceptually simple:  Service providers and requesters advertise  Compatible advertisements are matched  Matched entities cooperate to perform service › Developed for opportunistic environments  Use resources as and when available

4 Matchmaking (Cont.) › Customers and Servers advertise to a Matchmaking Service › Advertisements describe advertising entities  Characteristics  Requirements and Constraints  Preferences › We call these descriptions classified advertisements (classads)

5 ClassAd Language Versions › New ClassAd implementation will be released with Condor Version 6.5 › Older implementation lacks:  Some data types: lists, nested classads, time stamps, time intervals  Some operators: conditional, array subscript, bit-wise operators  Others: function calls

6 Example ClassAds [ Type = “Tenant”; Name = “John Doe”; Age = 28; HavePets = false; RentOffer = 900; Requirements = other.Type==“Apartment” && other.Rent <= RentOffer && other.HeatIncluded; Rank = (other.Location==“Downtown” ? 2000 : other.OnBusLine ? 1500 : 1000) - other.Rent ]

7 Example ClassAds (Cont.) [ Type = “Apartment”; Location = “Downtown”; HeatIncluded = true; Rent = 850; Phone = “ ”; Requirements = other.Type == “Tenant” && !other.HavePets && other.RentOffer >= Rent; Rank = other.RentOffer ]

8 The Matchmaker › Matchmaker matches compatible classads  Matchmaker ensures that Server and Customer Constraints are satisfied  Server and Customer Preferences are honored (within bounds of framework) › Relevant parties notified when successfully matched › Matched parties claim each other

9 Matchmaking Tenant Agent Apartment Agent Customer AdsServer Ads Matchmaker

10 Matchmaking Tenant Agent Apartment Agent Step 1: Advertise Matchmaker

11 Matchmaking Tenant Agent Apartment Agent Step 2: Match Matchmaker

12 Matchmaking Matchmaker Tenant Agent Apartment Agent Step 3: Notify To matched Customers To matched Servers

13 Matchmaking Matchmaker Tenant Agent Apartment Agent Step 4: Claim

14 Matchmaking Components › Advertising Language › Matchmaker Protocol  Defines how ads are sent to the Matchmaker  Names “well-known” attributes  Defines how matched entities are notified › Matchmaking Algorithm  Algorithm used for making matches › Claiming Protocol  Defines how matched entities establish an allocation

15 Classified Advertisements › Set of (Attribute Name,Expression) pairs › Self-describing (no separate schema)  Combines schema, data and query  Entities classads may include arbitrary attributes, so workstations may advertise Physical location Locally cached data files Time windows when preemption is unlikely › Syntax: [ n0 = e0; n1 = e1; … ; ni = ei ]  XML is a good fit; not currently used

16 Condor Examples [ Type= "Job"; Owner= "raman"; Cmd= "run_sim"; Args= "-Q "; Cwd= "/u/raman"; ImageSize= 31M; Qdate= ‘Sat Jan 9... ’... Rank= other.Kflops... Requirements =.... ] [ Type= "Machine"; Name= "xxy.cs...."; Arch= ”ALPHA"; OpSys= ”OSF1"; Mips= 104; Kflops= 21893; Memory= 256M; LoadAvg= ;... Rank=...; Requirements =... ]

17 Attribute Expressions › Constants 104, , "iX86”, true, ‘15:45’, ‘Sat Jan 9 15: …’ › References attr, self.attr, other.attr › Operators>>, =, &&,... › Functions strcat, substr, regexp, member,... › Lists { expr, expr,... } › ClassAds [ name=expr; name=expr;... ]

18 Three-valued Logic other.Memory > 32 all other.Memory == 32 UNDEFINED other.Memory != 32 if other has no !(other.Memory == 32) "Memory" attribute Logical Operators (&& and ||) can squash UNDEFINED For example: other.Mips >= 100 || other.Kflops >= 1000 TRUE if either attribute exists and satisfies condition

19 Job Constraint Example Requirements = other.Type = "Machine" && other.Arch = ”ALPHA" && other.OpSys = ”OSF1" && other.Disk > 1G && other.Memory >= self.ImageSize;

20 Rank Examples // job's rank for machine Rank = other.Kflops/1k + other.Memory/32M; // machine's rank for job Rank = member(other.Owner,ResearchGrp) ? 10 : member(other.Owner, Friends) ? 5 : 0;

21 Machine Constraint Example IsIdle = LoadAvg ‘00:15’; Requirements = !member(other.Owner,Untrusted) && ( Rank >= 10 ? True : IsIdle && ( Rank >= 5 ? True : DayTime() ‘18:00’ ) )

22 Condor Matchmaking › To match two ads A and B  Set up environment such that in A self evaluates to A (MY in old classads) other evaluates to B (TARGET in old classads) › Check per classad policy  A.Requirements and B.Requirements  A.Rank and B.Rank for preferences › Condor Matchmaking is a 4-way balancing act!  User Priorities, Job Preferences, Machine Preferences, Administrator Policies

23 Condor Matchmaking: Priorities › Submitters are assigned priorities based on  Past resource usage  Administrator policy: boost and decay factors › Submitters negotiated in priority order  Users with better priority more likely to get what they want › Priorities used to determine “fair-share”  Share determined by ratio of priorities  May use more than fair-share if no one else wants resources

24 Condor Matchmaking: Preemption › Job may be preempted for different reasons  Priority: New job’s Owner has better priority  Rank: Machine likes new job more › Administrator may control preemption  Stall priority preemption PREEMPTION_REQUIREMENTS parameter e.g., prevent “thrashing”  Affect choice of machine to preempt PREEMPTION_RANK parameter e.g., reduce network load

25 Condor Matchmaking: Scheme › For submitter’s job J (upto fair-share)  Only consider vacant or preemptable machines  Order lexicographically by 1. Job rank of machine 2. Preemption Mode (None > Rank > Priority) 3. PREEMPTION_RANK (if applicable)  Check PREEMPTION_REQUIREMENTS if preempting for priority

26 Future Work › Efficient Matchmaking  Attribute and constraint indexing schemes  Initial results very promising! (8k,8k) takes 3 mins. instead of 2 hrs. 3 mins. › Multilateral Matchmaking: Gang-Matching  Matchmaking with more than two entities e.g., Job-Workstation-License  In addition to co-allocation, can implement Regulation: limit number of running instances Aggregation: abstract aggregate services

27 ClassAd Summary › Distributed resource management  Distributed clients, servers  Heterogeneous resources › Classified advertisements  Semi-structured data model  Schema, data, and query in one structure › Matchmaking  Condor’s matchmaking algorithm

28 ClassAd Summary (Cont.) › ClassAds are used throughout Condor  Represent and query system daemons startds, schedds, masters, collectors, checkpoint servers  Configuration and Job Control › C++ and Java implementations › Available as part of Condor and as stand- alone libraries