Wake Up and Smell the Coffee: Evaluation Methodology for the 21st Century
May 4th, 2017
Ben Lenard

Introduction
Methodology is the foundation for deciding whether an experiment yielded good or bad results.
Like anything else, methodology needs to stay in line with current technologies.
The article contrasts established C/C++ evaluation methods with what Java requires, and shows how outdated benchmarks can lead to wrong conclusions.
DaCapo is a benchmarking suite for Java.

Workload Design and Use
The DaCapo project began in 2003, after the authors pointed out to an NSF panel the need for realistic Java benchmarks.
Beyond the initial NSF funding, the group has continued to develop the suite, since the benchmarks in common use were dated.
Relevant and diverse workload: a wide range of current applications.
Suitable for research: controlled and easy to use.

Relevance and Diversity
The authors used 'real world' applications, such as Eclipse, a Java IDE.
The DaCapo suite supports repeatable runs with various parameters; each run takes about a minute.
In addition to standard metrics, the authors collected metrics about the Java heap, such as allocation rate, GC behavior, and heap growth.
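Heap metrics of this kind can be observed from inside a running JVM. A minimal sketch using the standard java.lang.management MXBeans (this is generic JDK instrumentation, not DaCapo's own, which is more detailed):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class HeapMetrics {
    // Sums collection counts across all garbage collectors in this JVM.
    public static long totalGcCount() {
        long count = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount();   // -1 if undefined for this collector
            if (c > 0) count += c;
        }
        return count;
    }

    // Currently used heap, in bytes.
    public static long usedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        System.out.println("GC count:  " + totalGcCount());
        System.out.println("Used heap: " + usedHeapBytes() + " bytes");
    }
}
```

Sampling these counters before and after a benchmark iteration gives a rough view of allocation pressure and GC activity without any bytecode instrumentation.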

Suitable for Research
Easy-to-control workloads.
Easy-to-use instrumentation and packaging, encouraging adoption and making multiple runs simple.
Runs on a single host rather than requiring a whole infrastructure.

The Researcher / Do Not Cherry-Pick!
Workloads need to be relevant to the experiment; if a suitable one does not exist, create one with a consortium.
A well-designed benchmark reflects a range of application behaviors, and all results should be reported so conclusions are not skewed.

Experimental Design / Gaming Your Results
In addition to selecting a baseline, one must also identify the parameters relevant to the experiment.
Make sure your results do not mislead.
For example, the authors cite work that compares Java garbage collectors without varying the heap size.
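Varying heap size means running the same workload in fresh JVMs at several -Xmx settings, typically as multiples of the smallest heap the workload survives in. A sketch that only builds those command lines (the jar name, workload name, and minimum heap are hypothetical; each command would be launched via ProcessBuilder in a fresh JVM):

```java
import java.util.ArrayList;
import java.util.List;

public class HeapSweep {
    // Builds one java command line pinning min and max heap to the same size.
    public static List<String> command(int heapMb) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Xms" + heapMb + "m");   // fix initial heap
        cmd.add("-Xmx" + heapMb + "m");   // fix maximum heap
        cmd.add("-jar");
        cmd.add("dacapo.jar");            // hypothetical benchmark jar
        cmd.add("eclipse");               // hypothetical workload name
        return cmd;
    }

    public static void main(String[] args) {
        int minHeapMb = 64;               // assumed smallest viable heap for the workload
        for (int mult : new int[]{1, 2, 3, 4, 6}) {
            System.out.println(String.join(" ", command(minHeapMb * mult)));
        }
    }
}
```

Reporting results across the whole sweep, rather than at one cherry-picked heap size, is what keeps a GC comparison honest.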

Control in a Changing World
In C/C++ and Fortran, the most important variables are the host, the compiler, and the runtime libraries.
In Java there are more variables:
- Heap size and its parameters
- Warm-up of the JVM or runtime environment
- Nondeterminism
- The Java/JIT compiler itself

A Case Study
The authors designed a study to evaluate garbage collection in a JVM, examining:
- The space-time tradeoff in the heap
- The relationship between the collector and the application itself
Variables to control:
- Meaningful baseline: needed to keep the comparison 'apples-to-apples'
- Host platform: architecture-dependent performance properties
- Language runtime: libraries and the JIT compiler behave differently and should be controlled

A Case Study (cont)
- Heap size: since the study targets GC, multiple heap sizes should be used, because collectors behave differently at different sizes
- Warm-up: as more iterations occur, less compilation and class loading takes place, yielding more stable results
- Controlling nondeterminism:
  - use deterministic replay of optimization plans
  - take multiple measurements in a single JVM invocation, after warm-up
  - generate sufficient data points and apply suitable statistical analysis
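The warm-up discipline above can be sketched as a small in-process harness: run a number of unmeasured warm-up iterations so the JIT and class loading settle, then collect several timed iterations in the same JVM invocation (the workload here is a stand-in; a real harness would run a benchmark iteration):

```java
import java.util.Arrays;
import java.util.Random;

public class Harness {
    // Times one iteration of a workload in nanoseconds.
    static long time(Runnable workload) {
        long start = System.nanoTime();
        workload.run();
        return System.nanoTime() - start;
    }

    // Runs warm-up iterations (discarded), then measured iterations,
    // all within one JVM invocation.
    public static long[] measure(Runnable workload, int warmup, int measured) {
        for (int i = 0; i < warmup; i++) time(workload);
        long[] times = new long[measured];
        for (int i = 0; i < measured; i++) times[i] = time(workload);
        return times;
    }

    public static void main(String[] args) {
        // Stand-in workload: sort a fixed pseudo-random array.
        Runnable workload = () -> Arrays.sort(new Random(42).ints(100_000).toArray());
        long[] times = measure(workload, 5, 10);
        System.out.println(Arrays.toString(times));
    }
}
```

Note this captures warmed-up, steady-state performance; start-up performance requires separate fresh-JVM runs.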

Analysis
Data analysis means:
- Looking at repeated experiments to defeat experimental noise
- Looking across diverse experiments to draw conclusions
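Defeating noise in repeated runs usually means reporting a mean with a confidence interval rather than a single number. A minimal sketch (normal approximation with z = 1.96; for small sample sizes a Student's t value would be more appropriate):

```java
public class Stats {
    public static double mean(double[] xs) {
        double s = 0;
        for (double x : xs) s += x;
        return s / xs.length;
    }

    // Half-width of an approximate 95% confidence interval for the mean.
    public static double halfWidth95(double[] xs) {
        double m = mean(xs);
        double ss = 0;
        for (double x : xs) ss += (x - m) * (x - m);
        double sd = Math.sqrt(ss / (xs.length - 1));  // sample standard deviation
        return 1.96 * sd / Math.sqrt(xs.length);      // normal approximation
    }

    public static void main(String[] args) {
        double[] runs = {10.1, 9.8, 10.3, 10.0, 9.9}; // e.g. seconds per iteration
        System.out.printf("%.2f +/- %.2f%n", mean(runs), halfWidth95(runs));
    }
}
```

Two configurations whose intervals overlap cannot honestly be called different; that is the kind of conclusion rigorous analysis protects against.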

Conclusion
Sound methodology relies on:
- relevant workloads
- principled experimental design
- rigorous analysis
The underlying point of the article is to control the variables within the experiment's environment.