Slide 1: A Statistical Scheduling Technique for a Computational Market Economy
Neal Sample, Stanford University
Slide 2: Research Interests (UCSC 2002)
- Compositional computing (GRID): reliability and quality of service; value-based and model-based mediation; languages ("programming for the non-programmer expert")
- Database research: semistructured indexing and storage; massive table/stream compression; approximate algorithms for streaming data
Slide 3: Why We're Here
- [Timeline, 1970 to 1990 to 2010: the emphasis shifts from coding to integration/composition]
Slide 4: GRID: Commodity Computing
Slide 6: GRID: Commodity Computing
- Distributed supercomputing (chip design, cryptography)
- High throughput (FightAIDSAtHome, Nug30)
- On demand (computer-in-the-loop)
- Data intensive (Large Hadron Collider)
- Collaborative (data exploration, education)
Slide 7: Composition of Large Services
- Remote, autonomous services
- Services are not free: fee ($), execution time
- 2nd-order dependencies
- "Open Service Model"
  - Principles: GRID, CHAIMS
  - Protocols: UDDI, IETF SLP
  - Runtime: Globus, CPAM
Slide 8: Grid Life is Tough
- Increased complexity throughout: new tools and applications; diverse resources such as computers, storage media, networks, and sensors
- Programming: control-flow and data-flow separation; service mediation
- Infrastructure: resource discovery, brokering, monitoring; security/authorization; payment mechanisms
Slide 9: Our GRID Contributions
- Programming models and tools
- System architecture
- Resource management
- Instrumentation and performance analysis
- Network protocols and infrastructure
- Service mediation
Slide 10: Other GRID Research Areas
- The nature of applications
- Algorithms and problem-solving methods
- Security, payment/escrow, reputation
- End systems
- Programming models and tools
- System architecture
- Resource management
- Instrumentation and performance analysis
- Network protocols and infrastructure
- Service mediation
Slide 11: Roadmap
- Brief introduction to the CLAM language
- Some related scheduling methods
- Surety-based scheduling: sample program, monitoring, rescheduling
- Results
- A few future directions
Slide 12: CLAM Composition Language
- Decomposition of the CALL statement: parallelism by asynchrony in a sequential program; reduced complexity of invoke statements; control of new GRID requirements (estimation, trading, brokering, etc.)
- Data flow abstracted out: mediation for data-flow control and optimization; extraction-model mediation
- Purely compositional: no primitives for arithmetic; no primitives for input/output; targets the "non-programmer expert"
Slide 13: CLAM Primitives
- Pre-invocation:
  - SETUP: set up the connection to a service
  - SETPARAM, GETPARAM: set and get parameters of a service
  - ESTIMATE: service cost estimation
- Invocation and result gathering:
  - INVOKE: start a method
  - EXAMINE: test the progress of an invoked method
  - EXTRACT: extract results from an invoked method
- Termination:
  - TERMINATE: terminate a method invocation or the connection to a service
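The call sequence these primitives imply can be sketched in Python with a toy stand-in for a remote service. Everything below (the `ServiceStub` class, its method bodies, the budget check) is an illustrative stub under assumed semantics, not a CHAIMS API; only the primitive names come from the slide.

```python
# Toy stand-in for a remote service, illustrating the CLAM primitive
# lifecycle: SETUP, SETPARAM/GETPARAM, ESTIMATE, INVOKE, EXAMINE,
# EXTRACT, TERMINATE. All behavior here is a hypothetical stub.

class ServiceStub:
    def __init__(self, name):
        self.name, self.params, self.result = name, {}, None
        self.connected = False

    def setup(self):                 # SETUP: open a connection
        self.connected = True

    def setparam(self, key, value):  # SETPARAM: set a parameter
        self.params[key] = value

    def getparam(self, key):         # GETPARAM: read a parameter back
        return self.params[key]

    def estimate(self):              # ESTIMATE: cost/time quote (stubbed)
        return {"fee": 30, "hours": 5}

    def invoke(self):                # INVOKE: start the method
        self.result = sum(self.params.values())  # stand-in computation

    def examine(self):               # EXAMINE: poll progress
        return "DONE" if self.result is not None else "RUNNING"

    def extract(self):               # EXTRACT: pull results
        return self.result

    def terminate(self):             # TERMINATE: close the connection
        self.connected = False

svc = ServiceStub("A")
svc.setup()
svc.setparam("x", 2)
svc.setparam("y", 3)
if svc.estimate()["fee"] <= 50:      # check the quote against a budget
    svc.invoke()
while svc.examine() != "DONE":       # poll until the method completes
    pass
answer = svc.extract()               # 5
svc.terminate()
```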
Slide 14: Resources + Scheduling (scope: end system)
- Computational model: multithreading; automatic parallelization
- Resource management: process creation; OS signal delivery; OS scheduling
Slide 15: Resources + Scheduling (scope: end system, cluster)
- Computational model: synchronous communication; distributed shared memory
- Resource management: parallel process creation; gang scheduling; OS-level signal propagation
Slide 16: Resources + Scheduling (scope: end system, cluster, intranet)
- Computational model: client/server; loosely synchronous pipelines; IWIM
- Resource management: resource discovery; signal distribution networks
Slide 17: Resources + Scheduling (scope: end system, cluster, intranet, Internet)
- Computational model: collaborative systems; remote control; data mining
- Resource management: brokers; trading; mobile-code negotiation
Slide 18: Scheduling Difficulties (Adaptation: Repair and Reschedule)
- Schedules made at t0 are only guesses
- Estimates for multiple stages may become invalid, so schedules must be revised during runtime
- [Timeline: schedule at t0, work, hazard, reschedule, work, finish]
Slide 19: Scheduling Difficulties (Service Autonomy: No Resource Allocation)
- The scheduler does not handle resource allocation; users observe resources without controlling them
- Consequences: competing objectives have orthogonal scheduling techniques; changing goals for tasks or users vastly increase scheduling complexity
Slide 25: Some Related Work
Legend: R = rescheduling, A = autonomy of services, M = monitoring execution, Q = QoS / probabilistic execution
- PERT: Q, A, M
- CPM: M, R, A
- ePERT (AT&T): M, R, Q
- Condor (Wisconsin): M, R, Q
- Mariposa (UCB): R, Q, A
- SBS (Stanford): R, Q, A, M
Slide 26: Sample Program
- [Example workflow: a DAG of four services A, B, C, D]
Slide 27: Budgeting
- Time: maximum allowable execution time
- Expense: funding available to lease services
- Surety: goal probability of schedule success, plus an assessment technique
Slide 28: Program Schedule as a Template
- The abstract program (DAG of A, B, C, D) is instantiated at runtime: service-provider selection, etc.
- [Diagram: each abstract service maps to one of several candidate providers]
Slide 32: t0 Schedule Selection
- Guided by runtime "bids"; constrained by the budget
- Sample bids: 7±2h / $50; 6±1h / $40; 5±2h / $30; 3±1h / $30
Slide 33: t0 Schedule Constraints
- Budget: time upper bound (e.g. 22h); cost upper bound (e.g. $250); surety lower bound (e.g. 90%)
- {Time, Cost, Surety} = {22, 250, 90}
- Selection is steered by user preference weights (here 10, 1, 5):
  S1 est [20, 150, 90]: (22-20)*10 + (250-150)*1 + (90-90)*5 = 120
  S2 est [22, 175, 95]: (22-22)*10 + (250-175)*1 + (95-90)*5 = 100
  S3 est [18, 190, 96]: (22-18)*10 + (250-190)*1 + (96-90)*5 = 130
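The selection arithmetic on the slide can be reproduced directly. The budget, weights, and estimates are from the slide; the `score` helper and dictionary framing are mine. A higher score means more weighted slack against the budget, so S3 wins here.

```python
# Weighted slack scoring for candidate t0 schedules.
# Budget: time <= 22h, cost <= $250, surety >= 90%; user weights 10, 1, 5.

BUDGET = {"time": 22, "cost": 250, "surety": 90}
WEIGHTS = {"time": 10, "cost": 1, "surety": 5}

def score(est):
    """Score a schedule estimate [time, cost, surety]; higher is better."""
    t, c, s = est
    return ((BUDGET["time"] - t) * WEIGHTS["time"]
            + (BUDGET["cost"] - c) * WEIGHTS["cost"]
            + (s - BUDGET["surety"]) * WEIGHTS["surety"])

candidates = {"S1": [20, 150, 90], "S2": [22, 175, 95], "S3": [18, 190, 96]}
scores = {name: score(est) for name, est in candidates.items()}
best = max(scores, key=scores.get)   # "S3", with score 130
```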
Slide 34: Pareto Search Space
- [Plot: expected program cost vs. expected program execution time; candidate plans are points bounded by the budget time and budget cost, with user preferences setting the search direction]
Slide 35: Program Evaluation and Review Technique (PERT)
Service times: most likely (m), optimistic (a), and pessimistic (b); completion is tested against the standard normal N(0, 1).
(1) expected duration (service): e = (a + 4m + b) / 6
(2) standard deviation: sigma = (b - a) / 6
(3) expected duration (program): E = sum of e_i over the services on the path; sigma_E = sqrt(sum of sigma_i^2)
(4) test value: z = (deadline - E) / sigma_E
(5) expectation test: P(on time) = Phi(z), the standard normal CDF
(6) complement: P(late) = 1 - Phi(z)
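Under standard PERT assumptions (independent service times on a serial path, normal approximation for the sum), these quantities compute as follows; the helper names are mine.

```python
import math

def pert_expected(a, m, b):
    """PERT expected duration from optimistic a, most-likely m, pessimistic b."""
    return (a + 4 * m + b) / 6

def pert_sigma(a, b):
    """PERT standard deviation of a single service's duration."""
    return (b - a) / 6

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def surety(services, deadline):
    """P(a serial chain of services finishes by the deadline).

    services: list of (a, m, b) triples; durations assumed independent,
    so means add and variances add.
    """
    mean = sum(pert_expected(a, m, b) for a, m, b in services)
    var = sum(pert_sigma(a, b) ** 2 for a, m, b in services)
    return phi((deadline - mean) / math.sqrt(var))
```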
Slide 36: t0 Complete Schedule Properties
- [Plot: probability density of the probable program completion time, with the deadline and the user-specified surety marked; bank = $100]
Slide 37: Individual Service Properties
- [Plot: finish-time probability densities for services A (7±2h), B (6±1h), C (5±2h)]
Slide 38: t0 Combined Service Properties
- [Plot: combined probable finish-time density; deadline 22h; required surety 90%; current surety 99.6%]
Slide 39: Tracking Surety
- [Plot: surety % over time against the user-specified surety (90%)]
Slide 40: Runtime Hazards
- With control over resource allocation, or without runtime hazards, scheduling becomes much easier; the Open Service Model (OSM) offers neither, so runtime events invalidate the t0 schedule
- Sample hazards: delays and slowdowns; stoppages; inaccurate estimations; communication loss; competitive displacement...
Slide 41: Progressive Hazard (Definition + Detection)
- [Plot: surety % over execution time; after serviceA and serviceB start, serviceB runs slow and surety sags below the 90% minimum]
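The three hazard types on these slides can be sketched as a minimal classifier, assuming the monitor can distinguish a service that reports failure from one it cannot reach at all (exactly the ambiguity the pseudo-hazard slide illustrates). The threshold and function shape are illustrative, not the talk's detector.

```python
MIN_SURETY = 90.0  # user-specified lower bound, as on the slides

def classify(surety_pct, heard_from_service):
    """Classify one monitoring sample against the minimum surety.

    A 0% surety when the service cannot be reached may be a pseudo-hazard
    (communication failure) rather than a true catastrophic failure.
    """
    if surety_pct >= MIN_SURETY:
        return "ok"
    if surety_pct > 0:
        return "progressive"          # e.g., a slow service drags surety down
    if heard_from_service:
        return "catastrophic"         # the provider itself has failed
    return "possible pseudo-hazard"   # looks catastrophic, may be comms loss
```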
Slide 42: Catastrophic Hazard (Definition + Detection)
- [Plot: surety % over execution time; after serviceA and serviceB start, serviceB fails and surety drops to 0%]
Slide 43: Pseudo-Hazard (Definition + Detection)
- [Plot: surety % over execution time; a serviceB communication failure drives measured surety to 0%]
Slide 44: Monitoring + Repair
- Observe, not control
- Complete set of repairs: sufficient, not minimal
- Simple cost model: early termination yields linear cost recovery
- Greedy selection of a single repair: O(s*r) over s services and r repair strategies
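The simple cost model ("early termination = linear cost recovery") can be stated as a one-line refund rule. Treating the recovered amount as proportional to the unused fraction of the service's fee is my reading of "linear"; the function is a sketch, not the talk's exact model.

```python
def recovered_on_termination(fee, elapsed, expected_duration):
    """Refund for terminating a service early under linear cost recovery.

    The unused fraction of the expected duration is refunded; a service
    terminated at or past its expected duration recovers nothing.
    """
    unused = max(0.0, 1.0 - elapsed / expected_duration)
    return fee * unused
```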
Slide 45: Schedule Repair
- [Plot: surety % over execution time for the A, B, C, D schedule, falling below 90% at t_hazard and restored at t_repair]
Slide 46: Strategy 0: Baseline (No Repair)
- pro: no additional $ cost
- pro: ideal solution for partitioning hazards
- con: depends on self-recovery
Slide 47: Strategy 1: Service Replacement
- pro: reduces $ lost
- con: lost investment of $ and time
- con: concedes the chance of recovery
Slide 48: Strategy 2: Service Duplication
- pro: larger surety boost; leverages the chance of recovery
- con: large $ cost
Slide 49: Strategy 3: Pushdown Repair
- pro: cheap, no $ lost
- pro: no time lost
- con: cannot handle catastrophic hazards
- con: requires a chance of recovery
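The greedy O(s*r) selection of a single repair over the four strategies above can be sketched as follows. The applicability rule (catastrophic hazards rule out the strategies that depend on self-recovery) follows the pro/con lists; the gain/cost inputs and the value metric (surety gained per dollar) are illustrative assumptions.

```python
STRATEGIES = ("baseline", "replace", "duplicate", "pushdown")

def applicable(strategy, hazard):
    # Baseline and pushdown repair need a chance of self-recovery,
    # which a catastrophic hazard (failed provider) rules out.
    if hazard == "catastrophic":
        return strategy in ("replace", "duplicate")
    return True

def pick_repair(services):
    """Greedily pick the single best repair across all troubled services.

    services: list of (name, hazard, {strategy: (surety_gain, cost)}).
    Scans every (service, strategy) pair once: O(s * r).
    """
    best, best_value = None, float("-inf")
    for name, hazard, options in services:
        for strategy, (gain, cost) in options.items():
            if not applicable(strategy, hazard):
                continue
            value = gain / cost if cost else float("inf")
            if value > best_value:
                best, best_value = (name, strategy), value
    return best
```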
Slide 50: Experimental Results
- Rescheduling options: baseline (no repairs); single-strategy repairs (limits flexibility and effectiveness); all strategies
- Setup: 1000 random DAG schedules of 2-10 services; 1-3 hazards per execution; fixed service availability; all schedules repairable
Slide 51: "The Numbers"
- What is the value of a close finish (slightly late)?
- [Results chart]
Slide 53: Why the Differences?
- Catastrophic hazard: service-provider failure; "do nothing" is no solution
- Pseudo-hazard: communication failure or network partition; looks exactly like a catastrophic hazard, yet "do nothing" is the ideal solution
- Slowdown hazard: not a complete failure, with multiple solutions; "do nothing" may be ideal, futile, or merely acceptable
Slide 54: A Challenge
- Observations of progress are only secondary indicators of the current work rate
- [Plot: runs with the same observed progress but different projected finish times]
Slide 55: Open Questions
- Simultaneous rescheduling: use more than one strategy for a hazard
  - Finding the optimal combination is NP-hard, but NP here might not be that hard: approximations are acceptable, the set is small, and the constraints are strong; NP is the worst case, not the average case (e.g., DFBB search)
- Global impact of local schedule preferences: how do local preferences interact in, and reshape, the global market?
Slide 56: Open Questions
- Monitoring-resolution adjustments: networks are not free or zero-latency, so the cost of monitoring must be accounted for; frequent monitoring means more cost but greater accuracy; the effect of delayed status information is unstudied
- Accuracy of t0 service cost estimates: model a bad estimate as a hazard with delayed detection (a "1-way hazard"); penalty adjustments
Slide 57: Deeper Questions
- User preferences are only used in generating the initial (t0) schedule
  - Fixed least-cost repair (value = surety gained / repair cost) vs. best-cost repair: is success sensitive to preferences?
- Second-order cost effects: $ left over in the budget is purchasing power; what is the value of that purchasing power? Sampling for cost estimates during runtime
- surety = time + progress (+ budgetBalance/valuation)
Slide 58: Conclusions
- A novel statistical method for service scheduling
- Effective strategies for a varied hazard mix
- Achieves per-user-defined quality of service
- Should translate well "out of the sandbox"
- Clear directions for continued research
More information:
http://www.db.stanford.edu/~nsample/
http://www.db.stanford.edu/CHAIMS/
Slide 60: Steps in Scheduling
- Estimation
- Planning
- Invocation
- Monitoring
- Completion
- Rescheduling
Slide 61: CHAIMS Scheduler
- [Architecture diagram: the Program Analyzer takes the input program and feeds requirements to the Planner; the Estimator/Bidder haggles with providers and reports costs/times; the Dispatcher invokes services under the Planner's control; the Monitor observes execution and reports status; user requirements (e.g., budget) steer the Planner]
Slide 62: Simplified Cost Model (completing the cost model)
- [Timeline: target start/run and finish, on time]
- Plus data transportation costs
Slide 63: Full Cost Model (completing the cost model)
- [Timeline: reservation, hold fee, client ready to start, target start/run, finish early/on time/late, client ready for data, with + and - adjustments]
- Plus data transportation costs
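The components named on this slide suggest a cost function of roughly the following shape. The specific formulas (per-hour hold rate, linear late penalty, linear early credit, per-GB transport) are assumptions for illustration, not the model from the talk.

```python
def total_cost(base_fee, reservation, hold_hours, hold_rate,
               finish, target_finish, late_penalty, early_credit,
               data_gb, transport_rate):
    """Sketch of a full cost model: base fee, reservation, hold fee,
    a penalty or credit relative to the target finish, plus data
    transportation costs. All rates are illustrative."""
    cost = base_fee + reservation + hold_hours * hold_rate
    if finish > target_finish:                        # late finish
        cost += (finish - target_finish) * late_penalty
    else:                                             # early or on time
        cost -= (target_finish - finish) * early_credit
    return cost + data_gb * transport_rate
```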
Slide 64: The Eight Fallacies of Distributed Computing (Peter Deutsch)
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn't change
6. There is one administrator
7. Transport cost is zero
8. The network is homogeneous