SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory

Slides:



Advertisements
Similar presentations
Network Weather Service Sathish Vadhiyar Sources / Credits: NWS web site: NWS papers.
Advertisements

Cloud Computing Resource provisioning Keke Chen. Outline  For Web applications statistical Learning and automatic control for datacenters  For data.
All Hands Meeting, 2006 Title: Grid Workflow Scheduling in WOSE (Workflow Optimisation Services for e- Science Applications) Authors: Yash Patel, Andrew.
Approaches to Data Acquisition The LCA depends upon data acquisition Qualitative vs. Quantitative –While some quantitative analysis is appropriate, inappropriate.
GoldSim 2006 User Conference Slide 1 Vancouver, B.C. The Submodel Element.
Chapter 9 Uniprocessor Scheduling Operating Systems: Internals and Design Principles, 6/E William Stallings Dave Bremer Otago Polytechnic, N.Z. ©2008,
Chapter 9 Uniprocessor Scheduling Operating Systems: Internals and Design Principles, 6/E William Stallings Dave Bremer Otago Polytechnic, N.Z. ©2008,
The Network Weather Service A Distributed Resource Performance Forecasting Service for Metacomputing Rich Wolski, Neil T. Spring and Jim Hayes Presented.
Class notes for ISE 201 San Jose State University
Resampling techniques Why resampling? Jacknife Cross-validation Bootstrap Examples of application of bootstrap.
A Grid Resource Broker Supporting Advance Reservations and Benchmark- Based Resource Selection Erik Elmroth and Johan Tordsson Reporter : S.Y.Chen.
Distributed Application Management Using PLuSH Jeannie Albrecht, Christopher Tuttle, Alex C. Snoeren, and Amin Vahdat UC San Diego CSE {jalbrecht, ctuttle,
Nick Trebon, Alan Morris, Jaideep Ray, Sameer Shende, Allen Malony {ntrebon, amorris, Department of.
MCell Usage Scenario Project #7 CSE 260 UCSD Nadya Williams
Descriptive statistics Experiment  Data  Sample Statistics Experiment  Data  Sample Statistics Sample mean Sample mean Sample variance Sample variance.
Previously Optimization Probability Review Inventory Models Markov Decision Processes Queues.
The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing, Rich Wolski, Neil Spring, and Jim Hayes, Journal.
High Throughput Urgent Computing Jason Cope Condor Week 2008.
Overview of STAT 270 Ch 1-9 of Devore + Various Applications.
Chapter Topics Confidence Interval Estimation for the Mean (s Known)
GHS: A Performance Prediction and Task Scheduling System for Grid Computing Xian-He Sun Department of Computer Science Illinois Institute of Technology.
Lecture 10 Comparison and Evaluation of Alternative System Designs.
Copyright © 2014 by McGraw-Hill Higher Education. All rights reserved.
1 Confidence Intervals for Means. 2 When the sample size n< 30 case1-1. the underlying distribution is normal with known variance case1-2. the underlying.
Monté Carlo Simulation MGS 3100 – Chapter 9. Simulation Defined A computer-based model used to run experiments on a real system.  Typically done on a.
SPRUCE Special PRiority and Urgent Computing Environment Nick Trebon Argonne National Laboratory University of Chicago
Computer Simulation A Laboratory to Evaluate “What-if” Questions.
Analysis of Simulation Results Andy Wang CIS Computer Systems Performance Analysis.
Simulation Output Analysis
1 Reading Report 9 Yin Chen 29 Mar 2004 Reference: Multivariate Resource Performance Forecasting in the Network Weather Service, Martin Swany and Rich.
Verification & Validation
Gayathri Namala Center for Computation & Technology Louisiana State University Representing the SURA Coastal Ocean Observing and Prediction Program (SCOOP)
03/27/2003CHEP20031 Remote Operation of a Monte Carlo Production Farm Using Globus Dirk Hufnagel, Teela Pulliam, Thomas Allmendinger, Klaus Honscheid (Ohio.
Grid Resource Allocation and Management (GRAM) Execution management Execution management –Deployment, scheduling and monitoring Community Scheduler Framework.
Scientific Workflow Scheduling in Computational Grids Report: Wei-Cheng Lee 8th Grid Computing Conference IEEE 2007 – Planning, Reservation,
Deadline-based Grid Resource Selection for Urgent Computing Nick Trebon 6/18/08 University of Chicago.
A. Betâmio de Almeida Assessing Modelling Uncertainty A. Betâmio de Almeida Instituto Superior Técnico November 2004 Zaragoza, Spain 4th IMPACT Workshop.
Stochastic DAG Scheduling using Monte Carlo Approach Heterogeneous Computing Workshop (at IPDPS) 2012 Extended version: Elsevier JPDC (accepted July 2013,
Predicting Queue Waiting Time in Batch Controlled Systems Rich Wolski, Dan Nurmi, John Brevik, Graziano Obertelli Computer Science Department University.
October, 2000.A Self Organsing NN for Job Scheduling in Distributed Systems I.C. Legrand1 Iosif C. Legrand CALTECH.
SPRUCE Special PRiority and Urgent Computing Environment Pete Beckman Argonne National Laboratory University of Chicago
Simulation Tutorial By Bing Wang Assistant professor, CSE Department, University of Connecticut Web site.
Advisor: Resource Selection 11/15/2007 Nick Trebon University of Chicago.
TeraGrid Advanced Scheduling Tools Warren Smith Texas Advanced Computing Center wsmith at tacc.utexas.edu.
Superscheduling and Resource Brokering Sven Groot ( )
Limits to Statistical Theory Bootstrap analysis ESM April 2006.
The influence of local calibration on the quality of UV-VIS spectrometer measurements in urban stormwater monitoring N. Caradot, H. Sonnenberg, M. Riechel,
A Utility-based Approach to Scheduling Multimedia Streams in P2P Systems Fang Chen Computer Science Dept. University of California, Riverside
Network Simulation Motivation: r learn fundamentals of evaluating network performance via simulation Overview: r fundamentals of discrete event simulation.
MROrder: Flexible Job Ordering Optimization for Online MapReduce Workloads School of Computer Engineering Nanyang Technological University 30 th Aug 2013.
Automatic Statistical Evaluation of Resources for Condor Daniel Nurmi, John Brevik, Rich Wolski University of California, Santa Barbara.
Confidence Interval for a population mean Section 10.1.
Marcin Płóciennik Poznan Supercomputing and Networking Center OGF23, Barcelona, Spain, June 3rd, 2008 Use case of NMR spectrometry in Virtual Laboratory.
SPRUCE Special PRiority and Urgent Computing Environment User Demo Nick Trebon Argonne National Laboratory University of Chicago
BIOSTATISTICS Hypotheses testing and parameter estimation.
Parallel Tomography Shava Smallen SC99. Shava Smallen SC99AppLeS/NWS-UCSD/UTK What are the Computational Challenges? l Quick turnaround time u Resource.
Network Weather Service. Introduction “NWS provides accurate forecasts of dynamically changing performance characteristics from a distributed set of metacomputing.
Chapter 9 Uniprocessor Scheduling Operating Systems: Internals and Design Principles, 6/E William Stallings Dave Bremer Otago Polytechnic, N.Z. ©2008,
Probabilistic Upper Bounds for Urgent Applications Nick Trebon and Pete Beckman University of Chicago and Argonne National Lab, USA Case Study Elevated.
Resource Characterization Rich Wolski, Dan Nurmi, and John Brevik Computer Science Department University of California, Santa Barbara VGrADS Site Visit.
Interaction and Animation on Geolocalization Based Network Topology by Engin Arslan.
OpenPBS – Distributed Workload Management System
Sampling distribution
Risk Mgt and the use of derivatives
ACCURACY IN PERCENTILES
Tutorial 9 Suppose that a random sample of size 10 is drawn from a normal distribution with mean 10 and variance 4. Find the following probabilities:
Lecture 2 Part 3 CPU Scheduling
Uniprocessor scheduling
Probabilistic Upper Bounds
Presentation transcript:

SPRUCE Special PRiority and Urgent Computing Environment Advisor Demo Nick Trebon University of Chicago Argonne National Laboratory

2 University of ChicagoArgonne National Lab Urgent Computing - Urgent Computing Resource Selection  Given an urgent computation and a deadline how does one select the “best” resource?  “Best”: Resource that provides the configuration most likely to meet the deadline Configuration: a specification of the runtime parameters for an urgent computation on a given resource  Runtime parameters: # cpus, input/output repository, priority, etc.  Cost function (priority): Normal --> No additional cost SPRUCE --> Token + intrusiveness to other users

3 University of ChicagoArgonne National Lab Urgent Computing - Probabilistic Bounds on Total Turnaround Time Input Phase: (I Q,I B ) Resource Allocation Phase: (A Q,A B ) Execution Phase: (E Q,E B ) Output Phase: (O Q,O B ) If each phase is independent, then:  Overall bound = I B + A B + E B + O B  Overall quantile ≥ I Q * A Q * E Q * O Q

4 University of ChicagoArgonne National Lab Urgent Computing - I B +A B +E B + O B : File Staging Delay Methodology is the same for input/output file staging Utilize Network Weather Service to generate bandwidth predictions based upon probe  Wolski, et al.: Use MSE as sample variance of a normal distribution to generate upper confidence interval Problems?  Predicting bandwidth for output file transfers  NWS uses TCP-based probes, transfers utilize GridFTP

5 University of ChicagoArgonne National Lab Urgent Computing - I B + A B +E B +O B : Resource Allocation Delay Normal priority (no SPRUCE)  Utilize the existing Queue Bounds Estimation from Time Series (QBETS) SPRUCE policies  Next-to-run  Modified QBETS - Monte Carlo simulation on previously observed job history  Pre-emption  Utilize Binomial Method Batch Predictor (BMPB) to predict bounds based upon preemption history  Other possibilities: elevated priority, checkpoint/restart

6 University of ChicagoArgonne National Lab Urgent Computing - Resource Allocation Concerns Resource allocation bounds are predicted entirely on historical data and not current queue state  What if necessary nodes are immediately available?  What if it is clear that the bounds may be exceeded?  Example: Higher priority jobs already queued Possible solution: Query Monitoring and Discovery Services framework (Globus) to return current queue state

7 University of ChicagoArgonne National Lab Urgent Computing - I B +R B + E B +O B : Execution Delay Approach: Generate empirical bound for a given urgent application on each warm standby resource  Utilize BMBP methodology to generate probabilistic bounds  Methodology is general and non-parametric

8 University of ChicagoArgonne National Lab Urgent Computing - Calculating the Composite Probability Goal: to determine a probabilistic upper bound for a given configuration Approximate approach: query a small subset (e.g., 5) of quantiles for each individual phase and select the combination that results in highest composite quantile with bound < deadline

Advisor Demo Screenshots

10 University of ChicagoArgonne National Lab Urgent Computing -

11 University of ChicagoArgonne National Lab Urgent Computing -

12 University of ChicagoArgonne National Lab Urgent Computing -

13 University of ChicagoArgonne National Lab Urgent Computing -