
The NEOS Benchmarking Service
Robert Fourer, Industrial Engineering & Management Sciences, Northwestern University
Liz Dolan, Jorge Moré, Todd Munson, Mathematics & Computer Science Division, Argonne National Laboratory
INFORMS International, Puerto Rico, Session MC07, July 9, 2007

The NEOS Benchmarking Service
One of the "solvers" on NEOS 5 is actually a benchmarking service that runs a submitted problem on a selection of solvers on the same machine. In addition to output listings from the solvers, the service optionally returns an independent assessment of the quality of the results. It is particularly useful in choosing a solver for a particular application.

Benchmarking Fundamentals
- Collection of test problems
- Collection of solvers & solver parameter settings
Performance measures
- Number of iterations
- Number of function evaluations
- Number of conjugate gradient iterations
- Computing time
Issues
- Running the solvers
- Verifying the results
- Comparing performance

Running Solvers
NEOS benchmarking tools
- User submits one problem to the NEOS benchmark "solver"
- User selects the solvers to be compared
- NEOS tries all solvers, using the same computer
- NEOS verifies the reported solutions
- NEOS returns a listing of results
Other current benchmarking resources
- Hans Mittelmann's benchmark pages, plato.la.asu.edu/bench.html
- PAVER performance analysis tool
- Access to numerous solvers is essential
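The submission workflow listed above can be scripted. Today's NEOS server exposes an XML-RPC interface that accepts a job description and returns the solver listing; the sketch below assumes that present-day neos-server.org interface (which postdates the 2007 benchmark front end shown in these slides), and the email address, category, solver, and AMPL fragments in the job XML are placeholders rather than values taken from the talk.

import time
import xmlrpc.client

# Connect to the NEOS server's public XML-RPC endpoint.
neos = xmlrpc.client.ServerProxy("https://neos-server.org:3333")

# Illustrative job description; neos.getSolverTemplate(category, solver, "AMPL")
# returns the exact XML skeleton that a particular solver expects.
job_xml = """<document>
<email><![CDATA[you@example.org]]></email>
<category>nco</category>
<solver>Ipopt</solver>
<inputMethod>AMPL</inputMethod>
<model><![CDATA[
# ... AMPL model goes here ...
]]></model>
<data><![CDATA[]]></data>
<commands><![CDATA[solve; display _varname, _var;]]></commands>
</document>"""

job_number, password = neos.submitJob(job_xml)
print("submitted NEOS job", job_number)

# Poll until the job finishes, then print the solver's output listing.
while neos.getJobStatus(job_number, password) != "Done":
    time.sleep(5)

result = neos.getFinalResults(job_number, password)   # base64-wrapped bytes
print(result.data.decode())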

NEOS Tools: benchmarking web page (instructions)

NEOS Tools (cont'd): benchmarking web page (solver choice)

Verifying Results
Comparable running environments
- Same computer and operating system
- User's choice of solver parameters
- User's choice of tolerances for feasibility, optimality, complementarity
Independent assessment of solutions
- Based only on the solution returned
E.D. Dolan, J.J. Moré and T.S. Munson, "Optimality Measures for Performance Profiles" (available from …).
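A rough sketch of how such solution-based measures can be computed, assuming a problem of the form minimize f(x) subject to c(x) ≥ 0 with nonnegative multipliers returned by the solver; the exact scaled definitions used by the NEOS verifier are those of the Dolan, Moré & Munson paper cited above, and the scaling used here is only illustrative.

import numpy as np

def solution_quality(x, lam, grad_f, c, jac_c):
    """Feasibility, complementarity, optimality, and scaled optimality errors
    for min f(x) s.t. c(x) >= 0, given a candidate point x and multipliers lam."""
    cx = c(x)
    feas = max(0.0, float(-cx.min()))            # largest violation of c(x) >= 0
    comp = float(np.abs(lam * cx).max())         # size of lam_i * c_i(x)
    grad_lag = grad_f(x) - jac_c(x).T @ lam      # gradient of the Lagrangian
    opt = float(np.abs(grad_lag).max())
    scale = 1.0 + float(np.abs(lam).sum())       # illustrative scaling only
    return feas, comp, opt, opt / scale

Each measure is then compared with the user's chosen tolerance, as in the verification listing on the next slide.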

NEOS Tools (cont'd): benchmark verification results

Solver lbfgsb.
  feasibility error          …e+00
  complementarity error      …e+00
  optimality error           …e-07
  scaled optimality error    …e-06
  Solver solution optimality and complementarity found acceptable.

Solver loqo.
  feasibility error          …e+00
  complementarity error      …e-05
  optimality error           …e-06
  scaled optimality error    …e-04
  Solver solution not acceptable by this analysis because the scaled optimality error is greater than your limit of 1.0e-05 and the complementarity error is greater than your limit of 1.0e-05.

Comparing Performance
Average or cumulative totals of a metric
- Sensitive to results on a small number of problems
Medians or quartiles of a metric
- Information between quartiles is lost
Number of k-th place entries
- No information on the size of improvement
Number of wins by a fixed amount or percentage
- Dependent on the subjective choice of a parameter
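A small hypothetical example illustrates the first two drawbacks: suppose solver A needs 2, 2, 2, and 200 seconds on four test problems while solver B needs 4, 4, 4, and 8 seconds. A's average (51.5 s) looks far worse than B's (5 s) purely because of the single hard problem, even though A was twice as fast on the other three; the medians (2 s vs. 4 s) reverse the ranking and say nothing about that outlier at all.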

Performance Profiles
Quantities to compute
- For each solver s on each test problem p: the ratio r_{p,s} of that solver's metric to the best solver's metric
- For each solver s: the fraction ρ_s(τ) of test problems that have log2 r_{p,s} ≤ τ
Values to display
- Plot ρ_s(τ) vs. τ for each solver s
- ρ_s: R → [0,1] is a nondecreasing, piecewise constant function
- ρ_s(0) is the fraction of problems on which solver s was best
- The limit of ρ_s(τ) as τ → ∞ is the fraction of problems on which solver s did not fail
- Emphasis shifts from performance to reliability as you move from left to right in the plot
E.D. Dolan and J.J. Moré, "Benchmarking Optimization Software with Performance Profiles," Mathematical Programming 91 (2002) 201–213.
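A minimal sketch of this computation, using hypothetical timing data: times[p][s] holds the chosen metric (here, solve time) for problem p and solver s, a failure is recorded as infinity, and the function returns ρ_s(τ) sampled at the requested τ values.

import math

def performance_profiles(times, solvers, taus):
    """Return {solver: [rho_s(tau) for tau in taus]} from per-problem metrics."""
    # log2 of the performance ratio r_ps for every problem/solver pair
    log_ratios = []
    for row in times:
        best = min(row[s] for s in solvers)
        log_ratios.append({s: math.log2(row[s] / best) for s in solvers})
    n = len(times)
    return {s: [sum(1 for lr in log_ratios if lr[s] <= tau) / n for tau in taus]
            for s in solvers}

# Hypothetical data: three problems, two solvers; inf marks a failure.
times = [{"A": 1.0, "B": 2.0},
         {"A": 4.0, "B": 1.0},
         {"A": 3.0, "B": float("inf")}]
print(performance_profiles(times, ["A", "B"], taus=[0.0, 1.0, 2.0, 3.0]))

Plotting each solver's values against τ then gives curves like those on the next two slides.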

Performance Profiles (cont'd): COPS optimal control problems (profile plot)

Performance Profiles (cont'd): Mittelmann test problems (profile plot)

Performance Profiles (cont'd)
Advantages
- Not sensitive to the data on a small number of problems
- Not sensitive to small changes in the data
- Information on the size of improvement is provided
- Does not depend on the subjective choice of a parameter
- Can be used to compare more than two solvers
Further research interests
- Multi-problem NEOS benchmarking tool
- Automated benchmark runs
- Automated generation of result tables & performance profiles

Automated Benchmarking (Forthcoming)
NEOS benchmarker
- User submits one problem to the NEOS benchmark "solver"
- User selects the solvers to be compared
- NEOS tries all solvers, using the same computer
- NEOS verifies the reported solutions
- NEOS returns a listing of results
Other current benchmarking resources
- Hans Mittelmann's benchmark pages, plato.la.asu.edu/bench.html
- PAVER performance analysis tool
- Access to numerous solvers is essential

Benchmarking (cont'd): NEOS benchmarking web page (instructions)

Benchmarking (cont'd): NEOS benchmarking web page (solver choice)

Benchmarking (cont'd)
Features to come
- Run a specified set of test problems
- Make fair option choices for each solver
- Produce performance profiles
- ...

Benchmarking Interface: benchmarking web page (instructions)

Benchmarking Interface (cont'd): web page (solver choice)

Verifying Results
Comparable running environments
- Same computer and operating system
- User's choice of solver parameters
- User's choice of tolerances for feasibility, optimality, complementarity
Independent assessment of solutions
- Based only on the solution returned
E.D. Dolan, J.J. Moré and T.S. Munson, "Optimality Measures for Performance Profiles" (available from …).

Verifying Results (cont'd): benchmark verification results

Solver lbfgsb.
  feasibility error          …e+00
  complementarity error      …e+00
  optimality error           …e-07
  scaled optimality error    …e-06
  Solver solution optimality and complementarity found acceptable.

Solver loqo.
  feasibility error          …e+00
  complementarity error      …e-05
  optimality error           …e-06
  scaled optimality error    …e-04
  Solver solution not acceptable by this analysis because the scaled optimality error is greater than your limit of 1.0e-05 and the complementarity error is greater than your limit of 1.0e-05.