F LEX J AVA : Language Support for Safe and Modular Approximate Programming Jongse Park, Hadi Esmaeilzadeh, Xin Zhang, Mayur Naik, William Harris Alternative.

Slides:



Advertisements
Similar presentations
Adrian Sampson, Hadi Esmaelizadeh,1 Michael Ringenburg, Reneé St
Advertisements

CML Efficient & Effective Code Management for Software Managed Multicores CODES+ISSS 2013, Montreal, Canada Ke Bai, Jing Lu, Aviral Shrivastava, and Bryce.
Kinematic Synthesis of Robotic Manipulators from Task Descriptions June 2003 By: Tarek Sobh, Daniel Toundykov.
Programming Abstractions for Approximate Computing Michael Carbin with Sasa Misailovic, Hank Hoffmann, Deokhwan Kim, Stelios Sidiroglou, Martin Rinard.
Accuracy-Aware Program Transformations Sasa Misailovic MIT CSAIL.
5th International Conference, HiPEAC 2010 MEMORY-AWARE APPLICATION MAPPING ON COARSE-GRAINED RECONFIGURABLE ARRAYS Yongjoo Kim, Jongeun Lee *, Aviral Shrivastava.
From Yale-45 to Yale-90: Let Us Not Bother the Programmers Guri Sohi University of Wisconsin-Madison Celebrating September 19, 2014.
Randomized Accuracy Aware Program Transformations for Efficient Approximate Computations Sasa Misailovic Joint work with Zeyuan Allen ZhuJonathan KelnerMartin.
Architectures and Systems for Mobile-Cloud Computing: A Workload-Driven Perspective Prashant Nair Adviser: Moin Qureshi ECE Georgia Tech Xin Zhang Adviser:
1 Copyright © 2012, Elsevier Inc. All rights reserved. Chapter 1 Fundamentals of Quantitative Design and Analysis Computer Architecture A Quantitative.
© Copyright 1992–2004 by Deitel & Associates, Inc. and Pearson Education Inc. All Rights Reserved. Chapter 5 - Functions Outline 5.1Introduction 5.2Program.
Software Reliability Methods Sorin Lerner. Software reliability methods: issues What are the issues?
1 Patt and Patel Ch. 1 Abstraction and Computer Systems.
ASAC : Automatic Sensitivity Analysis for Approximate Computing Pooja ROY, Rajarshi RAY, Chundong WANG, Weng Fai WONG National University of Singapore.
CSE 1301 J Lecture 2 Intro to Java Programming Richard Gesick.
1 Copyright © 2012, Elsevier Inc. All rights reserved. Chapter 1 Fundamentals of Quantitative Design and Analysis Computer Architecture A Quantitative.
TASK ADAPTATION IN REAL-TIME & EMBEDDED SYSTEMS FOR ENERGY & RELIABILITY TRADEOFFS Sathish Gopalakrishnan Department of Electrical & Computer Engineering.
Operating Systems Should Manage Accelerators Sankaralingam Panneerselvam Michael M. Swift Computer Sciences Department University of Wisconsin, Madison,
SEC(R) 2008 Intel® Concurrent Collections for C++ - a model for parallel programming Nikolay Kurtov Software and Services.
- 1 - EE898-HW/SW co-design Hardware/Software Codesign “Finding right combination of HW/SW resulting in the most efficient product meeting the specification”
Topic #10: Optimization EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
An Introduction Chapter Chapter 1 Introduction2 Computer Systems  Programmable machines  Hardware + Software (program) HardwareProgram.
03/12/20101 Analysis of FPGA based Kalman Filter Architectures Arvind Sudarsanam Dissertation Defense 12 March 2010.
Alternative Computing Technologies
Orchestration by Approximation Mapping Stream Programs onto Multicore Architectures S. M. Farhad (University of Sydney) Joint work with Yousun Ko Bernd.
Types for Programs and Proofs Lecture 1. What are types? int, float, char, …, arrays types of procedures, functions, references, records, objects,...
Computing with C# and the.NET Framework Chapter 1 An Introduction to Computing with C# ©2003, 2011 Art Gittleman.
Assuring Application-level Correctness Against Soft Errors Jason Cong and Karthik Gururaj.
Automated Design of Custom Architecture Tulika Mitra
© Copyright 1992–2004 by Deitel & Associates, Inc. and Pearson Education Inc. All Rights Reserved. C How To Program - 4th edition Deitels Class 05 University.
Sogang University Advanced Computing System Chap 1. Computer Architecture Hyuk-Jun Lee, PhD Dept. of Computer Science and Engineering Sogang University.
1 COMPSCI 110 Operating Systems Who - Introductions How - Policies and Administrative Details Why - Objectives and Expectations What - Our Topic: Operating.
+ CUDA Antonyus Pyetro do Amaral Ferreira. + The problem The advent of multicore CPUs and manycore GPUs means that mainstream processor chips are now.
Evaluation and Validation Peter Marwedel TU Dortmund, Informatik 12 Germany 2013 年 12 月 02 日 These slides use Microsoft clip arts. Microsoft copyright.
GPU Architecture and Programming
© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE 498AL, University of Illinois, Urbana-Champaign 1 Basic Parallel Programming Concepts Computational.
© Copyright 1992–2004 by Deitel & Associates, Inc. and Pearson Education Inc. All Rights Reserved. Chapter 5 - Functions Outline 5.1Introduction 5.2Program.
Axilog: Language Support for Approximate Hardware Design DATE 2015 Georgia Institute of Technology Alternative Computing Technologies (ACT) Lab Georgia.
2.1 Functions. Functions in Mathematics f x y z f (x, y, z) Domain Range.
CS/EE 217 GPU Architecture and Parallel Programming Midterm Review
Image Processing A Study in Pixel Averaging Building a Resolution Pyramid With Parallel Computing Denise Runnels and Farnaz Zand.
Approximate Computing on FPGA using Neural Acceleration Presented By: Mikkel Nielsen, Nirvedh Meshram, Shashank Gupta, Kenneth Siu.
EnerJ: Approximate Data Types for Safe and General Low-Power Computation (PLDI’2011) Adrian Sampson, Werner Dietl, Emily Fortuna Danushen Gnanapragasam,
Rely: Verifying Quantitative Reliability for Programs that Execute on Unreliable Hardware Michael Carbin, Sasa Misailovic, and Martin Rinard MIT CSAIL.
Philipp Gysel ECE Department University of California, Davis
1 Chapter 2: Operating-System Structures Services Interface provided to users & programmers –System calls (programmer access) –User level access to system.
A User-Guided Approach to Program Analysis Ravi Mangal, Xin Zhang, Mayur Naik Georgia Tech Aditya Nori Microsoft Research.
1 The user’s view  A user is a person employing the computer to do useful work  Examples of useful work include spreadsheets word processing developing.
RFVP: Rollback-Free Value Prediction with Safe to Approximate Loads Amir Yazdanbakhsh, Bradley Thwaites, Hadi Esmaeilzadeh Gennady Pekhimenko, Onur Mutlu,
Chapter 3 Data Representation
Computer Engg, IIT(BHU)
COMPSCI 110 Operating Systems
CS427 Multicore Architecture and Parallel Computing
Types for Programs and Proofs
Enabling machine learning in embedded systems
Deitel- C:How to Program (5ed)
Chapter 5 - Functions Outline 5.1 Introduction
Functions.
RFVP: Rollback-Free Value Prediction with Safe to Approximate Loads
Safely Supporting Probabilistic Data: PL Techniques as Part of the Story Dan Grossman University of Washington.
Chapter 5 - Functions Outline 5.1 Introduction
Presented by: Huston Bokinsky Ying Zhang 25 April, 2013
COSC121: Computer Systems
Evaluation and Validation
Algorithm Discovery and Design
†UCSD, ‡UCSB, EHTZ*, UNIBO*
The University of Adelaide, School of Computer Science
Utsunomiya University
Sculptor: Flexible Approximation with
Srinivas Neginhal Anantharaman Kalyanaraman CprE 585: Survey Project
Presentation transcript:

F LEX J AVA : Language Support for Safe and Modular Approximate Programming Jongse Park, Hadi Esmaeilzadeh, Xin Zhang, Mayur Naik, William Harris Alternative Computing Technologies (ACT) Lab Georgia Institute of Technology ESEC/FSE 2015

Energy is a primary constraint Mobile Internet of Things Data Center 2

Data growth vs performance Performance growth trends: Esmaeilzadeh et al, “Dark Silicon and the End of Multicore Scaling,” ISCA 2011 Data growth trends: IDC's Digital Universe Study, December

Adding a third dimension Embracing Error Error IoT Mobile Desktop Data center 4

Navigating a three dimensional space IoT Mobile Desktop Data center Error 5

Approximate computing Embracing error Relax the abstraction of “near perfect” accuracy in Allows errors to happen to improve performance resource utilization efficiency ProcessingStorageCommunication 6

New landscape of computing Personalized and targeted c omputing 7

IoT Computing Mobile Computing Cloud Computing 8

Classes of approximate applications Programs with analog inputs Sensors, scene reconstruction Programs with analog outputs Multimedia Programs with multiple possible answers Web search, machine learning Convergent programs Gradient descent, big data analytics 9

Avoiding overkill design Approximate Computing Physical Device Circuit Architecutre Compiler Programming Language Application Microarchitecture Precision Reliability Efficiency Performance Cost 10

Cross-stack approach Application Programming Language Circuit Microarchitecture Architecutre Compiler Physical Device 11

WH 2 of Approximation What to approximate? How much to approximate? How to approximate? Language Compiler Runtime Hardware 12

✗ √ ✗ √ 13

Approximate program 14

Separation 15

Goal Design an automated and high-level programming language for approximate computing Criteria 1) Safety 2) Scalability 3) Modularity and reusability 4) Generality 16

F LEX J AVA Lightweight set of language extensions Reducing annotating effort Language-compiler co-design Safety Scalability Modularity and reusability Generality 17

F LEX J AVA Annotations 1. Approximation for individual data and operations loosen (variable) loosen_invasive (variable) tighten (variable) 2. Approximation for a code block begin_loose (“TECHNIQUE”, ARGUMENTS) end_loose (ARGUMENTS) 3. Expressing the quality requirement loosen (variable, QUALITY_REQUIREMENT); loosen_invasive (variable, QUALITY_REQUIREMENT); end_loose (ARGUMENTS, QUALITY_REQUIREMENT); 18

float computeLuminance (float r, float g, float b) { float luminance = r * 0.3f + g * 0.6f + b * 0.1f; loosen (luminance); return luminance; } Safe and scalable approximation 19

float computeLuminance (float r, float g, float b) { float luminance = r * 0.3f + g * 0.6f + b * 0.1f; loosen (luminance); return luminance; } Safe and scalable approximation 19

float computeLuminance (float r, float g, float b) { float luminance = r * 0.3f + g * 0.6f + b * 0.1f; loosen (luminance); return luminance; } Safe and scalable approximation 19

float computeLuminance (float r, float g, float b) { float luminance = r * 0.3f + g * 0.6f + b * 0.1f; loosen (luminance); return luminance; } Safe and scalable approximation 19

int fibonacci (int n) { int r; if (n <= 1) r = n; else r = fibonacci (n – 1) + fibonacci (n – 2); loosen (r); return r; } Safe and scalable approximation 20

int fibonacci (int n) { int r; if (n <= 1) r = n; else r = fibonacci (n – 1) + fibonacci (n – 2); loosen (r); return r; } Safe and scalable approximation 20

Modularity int p = 1; for (int i = 0; i < a.length; i++) { p *= a[i]; } for (int i = 0; i < b.length; i++) { p += b[i]; loosen(p); } 21

Reuse and library support static int square(int a) { int s = a * a; loosen(s); return s; } main () { int x = 2 * square(3); System.out.println(x); } 22

Reuse and library support static int square(int a) { int s = a * a; loosen(s); return s; } main () { int x = 2 * square(3); loosen(x); System.out.println(x); } 22

Reuse and library support static int square(int a) { int s = a * a; loosen(s); return s; } main () { int x = 2 * square(3); loosen(x); System.out.println(x); } 22

Reuse and library support static int square(int a) { int s = a * a; loosen(s); return s; } main () { int x = 2 * square(3); loosen(x); System.out.println(x); } 22

Reuse and library support static int square(int a) { int s = a * a; return s; } main () { int x = 2 * square(3); loosen(x); System.out.println(x); } 23

Reuse and library support static int square(int a) { int s = a * a; return s; } main () { int x = 2 * square(3); loosen(x); System.out.println(x); } 23

Reuse and library support static int square(int a) { int s = a * a; return s; } main () { int x = 2 * square(3); loosen_invasive(x); System.out.println(x); } 24

Reuse and library support static int square(int a) { int s = a * a; return s; } main () { int x = 2 * square(3); loosen_invasive(x); System.out.println(x); } 24

Reuse and library support static int square(int a) { int s = a * a; return s; } main () { int x = 2 * square(3); loosen_invasive(x); System.out.println(x); } 24

Reuse and library support static int square(int a) { int s = a * a; return s; } main () { int x = 2 * square(3); loosen_invasive(x); System.out.println(x); } 24

Reuse and library support static int square(int a) { int s = a * a; return s; } main () { int x = 2 * square(3); loosen_invasive(x); System.out.println(x); } 24

Truffle: Architecture support for approximate computing [ASPLOS’12] NPU: Neural acceleration [MICRO’12] Hardware Loop perforation and loop early-termination [ESEC/FSE’11, PLDI’10] Function substitution [PLDI’10] Software Flikker: Saving DRAM refresh power [ASPLOS’11] RFVP and load value prediction [PACT’14, MICRO’14] Memory Generality 25

Generality begin_loose(“PERFORATION”, 0.10); for (int i = 0; i < n; i++) { … } end_loose(); 26

Generality begin_loose(“NPU”); p = Math.sin(x) + Math.cos(y); q = 2 * Math.sin(x + y); end_loose(); 27

Approximation Safety Analysis Find the maximal set of safe-to-approximate data/operations 1.For each annotation, loosen(v), find the set of data and operations that affect v 2.Merge all the sets 3.Exclude the data and operations that affect control flow 4.Exclude the data and operations that affect address calculations The rest of the program data and operations are precise 28

Source code F LEX J AVA annotations Programming Approximation Safety Analysis Safe-to-approximate data/operations Source-code Highlighting Tool Compilation Highlighted Source Code Feedback Workflow 29

Benchmark smm # lines: 38 SciMark2 benchmark Matrix-vector multiplication mc # lines: 59 SciMark2 benchmark Mathematical approximation fft # lines: 168 SciMark2 benchmark Signal processing sor # lines: 36 SciMark2 benchmark Successive Over-relaxation lu # lines: 283 SciMark2 benchmark Matrix factorization raytracer # lines: 174 Graphics 3D image renderer jmeint # lines: 5,962 3D gaming Triangle intersection kernel hessian # lines: 10,174 Image processing Interest point detection sobel # lines: 163 Image processing Image edge detection zxing # lines: 26,171 Image processing Bar code decoder 30

Number of Annotations 2× to 17× reduction in the number of annotations 31 Lower is Better

User-study: Annotation Time 8× average reduction in annotation time 10 subjects (programmers) Time for annotating fft using EnerJ and F LEX J AVA 32 Lower is Better

Language-compiler co-design to Automate approximate programming Reduce programmer effort F LEX J AVA Safety Scalability Modularity and reusability Generality 33

Replication Package 1.Annotated benchmarks 2.Compiler with approximate safety analysis 3.Source code highlighting tool Open source git repository:

Truffle: Architecture support for approximate computing [ASPLOS’12] NPU: Neural acceleration [MICRO’12] Architecture Loop perforation and loop early-termination [ESEC/FSE’11, PLDI’10] Function substitution [PLDI’10] Software Flikker: Saving DRAM refresh power [ASPLOS’11] RFVP and load value prediction [PACT’14, MICRO’14] Memory

 Microarchitecture  Approximate processor [ASPLOS’12]  Accelerator  Neural acceleration (Neural Processing Unit) [MICRO’12]  Memory  DRAM with low refresh rate [ASPLOS’11]  Load value approximation [MICRO’14]  Software  Loop perforation [ESEC/FSE’11]  Function substitution [PLDI’10]  Many others…

Approximate programming: Language and compiler support for approximate computing

Safety is the first priority in approximate programming language

Programmer’s manual/explicit annotations [EnerJ PLDI’11, Rely OOPSLA’13] Automated approximate programming