1 Automating Auto Tuning Jeffrey K. Hollingsworth University of Maryland


2 Auto-tuning Motivation
Example: a dense matrix multiply kernel
Various options:
- Original program: 30.1 sec
- Hand tuned (by developer): 11.4 sec
- Auto-tuned of hand-tuned: 15.9 sec
- Auto-tuned original program: 8.5 sec
What happened?
- Hand tuning prevented analysis
- Auto-tuned transformations were then not possible

3 Automated Performance Tuning
Goal: maximize achieved performance
Problems:
- Large number of parameters to tune
- Shape of objective function unknown
- Multiple libraries and coupled applications
- Analytical model may not be available
Requirements:
- Runtime tuning for long-running programs
- Don't try too many configurations
- Avoid gradients
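The last two requirements (a limited number of configurations, no gradients) can be illustrated with a small derivative-free search. The sketch below is only an illustration, not Active Harmony's search strategy: `measure` is a hypothetical stand-in for a timed run, and `pattern_search` is a plain coordinate pattern search that compares measured values and stops when its evaluation budget is spent.

```c
#define NDIM 2

/* Hypothetical objective standing in for a measured run time (lower is
 * better). A real tuner would run and time the program here. */
static double measure(const int p[NDIM])
{
    double a = p[0] - 6, b = p[1] - 8;
    return 10.0 + a * a + b * b;
}

/* Budget-limited pattern search over an integer grid. It only compares
 * measured values, so no gradient is ever needed. Returns the number of
 * configurations that were evaluated. */
static int pattern_search(const int lo[NDIM], const int hi[NDIM],
                          const int step[NDIM], int budget,
                          int best[NDIM], double *best_perf)
{
    int evals = 0;
    int improved = 1;

    for (int d = 0; d < NDIM; d++)
        best[d] = lo[d];
    *best_perf = measure(best);
    evals++;

    while (improved && evals < budget) {
        improved = 0;
        for (int d = 0; d < NDIM && evals < budget; d++) {
            for (int dir = -1; dir <= 1 && evals < budget; dir += 2) {
                int cand[NDIM];
                for (int k = 0; k < NDIM; k++)
                    cand[k] = best[k];
                cand[d] += dir * step[d];
                if (cand[d] < lo[d] || cand[d] > hi[d])
                    continue;               /* outside the search space */
                double perf = measure(cand);
                evals++;
                if (perf < *best_perf) {    /* keep the better neighbor */
                    *best_perf = perf;
                    best[d] = cand[d];
                    improved = 1;
                }
            }
        }
    }
    return evals;
}
```

With ranges of 1 to 10 (step 1) and 2 to 12 (step 2) and a budget of 25 evaluations, this toy objective is minimized at (6, 8) without exhausting the budget, whereas an exhaustive sweep of the same grid would need 60 runs.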

4 Easier Auto-tuning: Improve User Interface
- Web-based UI written in HTML/JavaScript
- Reduced overhead on Harmony Server
- Allows isolated view of a single dimension
- HTTP deals with dropped connections better than X

5 Easier Auto-tuning: Lower Barriers to Use
Problem:
- Hard to get codes ready for auto-tuning
- Tools can be complex
Tuna: auto-tuning shell
- Hides details of auto-tuning
Use:
- Program prints performance as its final output line
- Auto-tuner invokes the program via command-line arguments
- Program can be a shell script, including compilation and config file generation

6 Easier Auto-tuning: Tuna

    > ./tuna -i=tile,1,10,1 -i=unroll,2,12,2 -n=25 matrix_mult -t % -u %

    -i=tile,1,10,1           Parameter 1: Low 1, High 10, Step 1
    -i=unroll,2,12,2         Parameter 2: Low 2, High 12, Step 2
    matrix_mult -t % -u %    Command line to run; each % is replaced by a parameter value
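The two mechanics on this slide, reading a `-i=name,low,high,step` range and replacing each `%` in the command line with a parameter value, can be sketched as follows. This is not Tuna's source code: `struct int_range`, `parse_int_range`, and `fill_command` are hypothetical names chosen for illustration.

```c
#include <stdio.h>
#include <string.h>

/* One integer tuning range, as in "-i=tile,1,10,1". */
struct int_range {
    char name[32];
    int low, high, step;
};

/* Parse "-i=name,low,high,step". Returns 0 on success, -1 if malformed. */
static int parse_int_range(const char *arg, struct int_range *r)
{
    if (strncmp(arg, "-i=", 3) != 0)
        return -1;
    if (sscanf(arg + 3, "%31[^,],%d,%d,%d",
               r->name, &r->low, &r->high, &r->step) != 4)
        return -1;
    if (r->step <= 0 || r->low > r->high)
        return -1;
    return 0;
}

/* Copy tmpl into out, replacing each '%' with the next entry of values.
 * Returns 0 on success, -1 if out is too small or values run out. */
static int fill_command(const char *tmpl, const int *values, int nvalues,
                        char *out, size_t outsz)
{
    size_t used = 0;
    int next = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (const char *p = tmpl; *p != '\0'; p++) {
        int n;
        if (*p == '%') {
            if (next >= nvalues)
                return -1;
            n = snprintf(out + used, outsz - used, "%d", values[next++]);
        } else {
            n = snprintf(out + used, outsz - used, "%c", *p);
        }
        if (n < 0 || (size_t)n >= outsz - used)
            return -1;          /* output would have been truncated */
        used += (size_t)n;
    }
    return 0;
}
```

For example, filling the template `matrix_mult -t % -u %` with the values 4 and 6 yields `matrix_mult -t 4 -u 6`, which a tuner could then run and time.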

7 Easier Auto-tuning: Language Help
Problem: how to express tunable options
Chapel already supports configuration variables
- Sort of like a constant
  - Defined in the program
  - Given a value by the programmer
- But the user can change the value at program launch
Proposed extensions to the language
- Add a range to configuration variables
- Include the range in --help output
- Include a machine-readable version of --help

8 Tuna + Chapel

    > ./quicksort --help | tail -5
    quicksort config vars:
      n: int(64)
      thresh: int(64)
      verbose: int(64)
      timing: bool

Auto-tuning parameters
- Threads, Tasks: defaults of Chapel runtime
- Thresh: configuration variable from program

9 LULESH in Chapel on Problem Size 32^3
Default: 28.3 (44, 256)
Speedup: 13.9%

10 LULESH in Chapel on Problem Size 48^3
Default: 102.5 (76, 256)
Speedup: 12.3%

11 Auto-Tuning Deconstructed
[Diagram: Active Harmony (Search Strategy) and the Client Application. FETCH: Candidate Points. REPORT: Evaluated Performance.]

12 Onion Model Workflow
- Allows for paired functionality, but either hook is optional.
- Fetch hooks executed in ascending order.
- Report hooks executed in descending order.
- Point values cannot be modified (directly).
[Diagram: Search Strategy surrounded by plug-in layers 1, 2, 3, with FETCH and REPORT flows]
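The ordering rules of the onion model can be sketched in a few lines of C. This is a minimal illustration, not Active Harmony's plug-in interface: `layer_t`, `run_fetch`, `run_report`, and the `trace` array are hypothetical names, and the hooks only record which layer ran.

```c
#include <stddef.h>

#define MAX_TRACE 16

/* One plug-in layer. Either hook may be NULL, since both are optional. */
typedef struct {
    void (*fetch)(int layer);
    void (*report)(int layer);
} layer_t;

/* Execution trace: +layer for a fetch hook, -layer for a report hook. */
static int trace[MAX_TRACE];
static int trace_len;

static void record_fetch(int layer)
{
    if (trace_len < MAX_TRACE)
        trace[trace_len++] = layer;
}

static void record_report(int layer)
{
    if (trace_len < MAX_TRACE)
        trace[trace_len++] = -layer;
}

/* Fetch workflow: layers run in ascending order (1, 2, ..., n). */
static void run_fetch(const layer_t *layers, int n)
{
    for (int i = 0; i < n; i++)
        if (layers[i].fetch)
            layers[i].fetch(i + 1);
}

/* Report workflow: layers run in descending order (n, ..., 2, 1). */
static void run_report(const layer_t *layers, int n)
{
    for (int i = n - 1; i >= 0; i--)
        if (layers[i].report)
            layers[i].report(i + 1);
}
```

With three layers where layer 2 has only a fetch hook and layer 3 has only a report hook, one fetch followed by one report visits fetch 1, fetch 2, report 3, report 1: a point passes outward through the layers and its performance passes back inward.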

13 Plug-in Workflow: ACCEPT
- Indicates successful processing at this level.
- Plug-in relinquishes control of the point to the entity below.
[Diagram: Search Strategy with plug-in layers 1, 2, 3]

14 Plug-in Workflow: RETURN
- Allows a short-circuit in the feedback loop.
- Plug-in must provide the performance value.
- Processing resumes at the same level from the report workflow.
[Diagram: Search Strategy with plug-in layers 1, 2, 3]

15 Plug-in Workflow: REJECT
- Allows modification of the search indirectly.
- Plug-in must provide an alternate valid point.
- Search strategy is not obligated to use it.
- Processing jumps directly back to search strategy.
[Diagram: Search Strategy with plug-in layers 1, 2, 3]
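The three workflow statuses can be read as the possible outcomes of a fetch hook. The sketch below shows one way a dispatcher might react to them. It is an illustration under assumptions, not Active Harmony's code: a point is reduced to a single `int`, the `flow_t` fields `perf` and `alt_point` are hypothetical, and `reject_odd` and `cached_eight` are toy hooks.

```c
typedef enum { ACCEPT, RETURN, REJECT } flow_type_t;

typedef struct {
    flow_type_t type;
    double perf;      /* filled in by a plug-in that sets RETURN */
    int alt_point;    /* filled in by a plug-in that sets REJECT */
} flow_t;

typedef void (*fetch_hook_t)(flow_t *flow, int point);

/* Toy hook: rejects odd points and suggests the next even one. */
static void reject_odd(flow_t *flow, int point)
{
    if (point % 2 != 0) {
        flow->type = REJECT;
        flow->alt_point = point + 1;
    }
}

/* Toy hook: already knows the performance of point 8, so no run is needed. */
static void cached_eight(flow_t *flow, int point)
{
    if (point == 8) {
        flow->type = RETURN;
        flow->perf = 1.5;
    }
}

/* Run fetch hooks in ascending order. Returns the 1-based level that
 * stopped the point (RETURN or REJECT), or n + 1 if every level accepted
 * it and the point goes on to the client. */
static int run_fetch(const fetch_hook_t *hooks, int n, int point, flow_t *flow)
{
    flow->type = ACCEPT;
    for (int i = 0; i < n; i++) {
        flow->type = ACCEPT;        /* default unless the hook says otherwise */
        hooks[i](flow, point);
        if (flow->type != ACCEPT)
            return i + 1;
    }
    return n + 1;
}
```

On RETURN, the caller would start the report workflow at the returned level using `flow->perf`. On REJECT, it would hand `flow->alt_point` to the search strategy as a suggestion, which the strategy is free to ignore.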

16 Plug-in Example: Point Logger

    #include <stdio.h>

    /* flow_t, point_t, string() and ACCEPT come from the plug-in interface. */
    static FILE *outfd;

    /* Open the log file when the plug-in starts. */
    void logger_init(void)
    {
        outfd = fopen("performance.log", "w");
    }

    /* Report hook: log the point and its performance, then pass it on. */
    void logger_report(flow_t *flow, point_t *pt, double perf)
    {
        if (outfd)
            fprintf(outfd, "%s -> %lf\n", string(pt), perf);
        flow->type = ACCEPT;
    }

    /* Close the log file when the plug-in shuts down. */
    void logger_fini(void)
    {
        if (outfd)
            fclose(outfd);
    }

17 Plug-in Example: Constraints
[Diagram: Search Strategy with plug-in layers 1, 2, 3]

18 Conclusions and Future Work
Ongoing work
- Getting features into Chapel
- Communication optimization
- Re-using data from prior runs
Conclusions
- Auto tuning can be done at many levels
  - Offline: using training runs (choices fixed for an entire run)
    - Compiler options
    - Programmer-supplied per-run tunable parameters
    - Compiler transformations
  - Online: training or production (choices change during execution)
    - Programmer-supplied per-timestep parameters
    - Compiler transformations
- It works! Real programs run faster