GPU Power Model Nandhini Sudarsanan Nathan Vanderby Neeraj Mishra Usha Vinodh

GPU Power Model
Nandhini Sudarsanan sudar003@umn.edu
Nathan Vanderby vande501@umn.edu
Neeraj Mishra mish0088@umn.edu
Usha Vinodh kuma0253@un.edu
Chi Xu xuchi@umn.edu

Outline
Introduction and Motivation
Analytical Model Description
Experiment Setup
Results
Conclusion and Further Work

Introduction

Motivation

Outline
Introduction and Motivation
Analytical Model Description
  o Parser
  o Power Model
Experiment Setup
Results
Conclusion and Further Work

Parser

Power Model PTX Level

Power Model Assembly Level

Experiment Setup - Hardware
Measure Power Consumption and Temperature
  o Current Clamp for PCIE & GPU Power Cable
      o Data Acquisition Card @ 100Hz
  o GPU Performance Counter
  o Sample Temperature @ 10Hz, GPU sensor
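
The clamp readings have to be turned into power and energy before they can be compared with a model. A minimal sketch of that conversion, assuming a nominal 12 V rail and that the 100 Hz samples arrive as a plain list of amperes. The rail voltage and both function names are illustrative assumptions, not taken from the slides:

```python
SAMPLE_RATE_HZ = 100.0  # data acquisition rate stated on the slide
RAIL_VOLTAGE = 12.0     # assumption: nominal voltage of the measured cable


def average_power(currents_a, voltage=RAIL_VOLTAGE):
    """Mean power in watts over the sampled interval (P = V * I)."""
    if not currents_a:
        raise ValueError("need at least one current sample")
    return voltage * sum(currents_a) / len(currents_a)


def energy_joules(currents_a, voltage=RAIL_VOLTAGE, rate_hz=SAMPLE_RATE_HZ):
    """Energy in joules via the trapezoidal rule over evenly spaced samples."""
    dt = 1.0 / rate_hz
    power = [voltage * i for i in currents_a]
    # Each pair of neighbouring samples contributes one trapezoid of width dt.
    return sum((a + b) / 2.0 * dt for a, b in zip(power, power[1:]))
```

With one clamp per cable, the same functions would be applied to each trace and the results summed to estimate total board power.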

Experiment Setup - Software
Driver API
Generate and Modify PTX code
  o Minimize control loops
CUDA 4.0
  o Built in Binary -> Assembly Converter (cuobjdump)
MATLAB to build model
Remote login
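
A PTX-level model needs per-instruction counts from the generated PTX. The slides do not show the parser itself, so the following is only a rough sketch of how such counts might be gathered. The function name `count_ptx_opcodes` and the choice to group by base opcode (`add`, `ld`, ...) are assumptions made for illustration:

```python
import re
from collections import Counter

# Optional predicate guard (@%p1 or @!%p1), then the dotted opcode.
_INSTR = re.compile(r"^(?:@!?%?\w+\s+)?([a-z][a-z0-9]*(?:\.[a-z0-9_]+)*)\b")


def count_ptx_opcodes(ptx_text):
    """Count static PTX instructions, grouped by base opcode."""
    counts = Counter()
    for raw in ptx_text.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop trailing comments
        # Skip blanks, directives (.reg, .entry, ...), braces, and labels.
        if not line or line[0] in ".{}" or line.endswith(":"):
            continue
        match = _INSTR.match(line)
        if match:
            # "ld.global.f32" -> "ld": keep only the base opcode.
            counts[match.group(1).split(".", 1)[0]] += 1
    return counts
```

These are static counts. Instructions inside loops execute many times, which is one reason minimizing control loops in the test kernels simplifies the analysis.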

CUDA - Fermi Architecture
Third Generation Streaming Multiprocessor (SM)
  o 32 CUDA cores per SM, 4x over GT200
  o 1024 thread block size, 2x over GT200
  o Unified address space enables full C++ support
  o Improved Memory Subsystem

Benchmarks
Small number of overhead operations (loop counters, initialization, etc.).
Computationally intensive work, so that each experiment runs long enough for accurate current measurement.
High utilization of the CUDA cores, with as few data hazards as possible.
Grid and block sizes chosen so that all SMs are used, since idle SMs still leak power.
Accordingly, 7 benchmarks were selected from the CUDA SDK.
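
The grid-sizing criterion can be illustrated with a little arithmetic. This is a sketch under stated assumptions: `grid_size` is a hypothetical helper, and the SM count is passed in by the caller (in real CUDA code it could be read from `cudaDeviceProp.multiProcessorCount`). Actual block placement is decided by the hardware scheduler, so a block count that is a multiple of the SM count only makes full occupancy possible, not guaranteed:

```python
MAX_THREADS_PER_BLOCK = 1024  # Fermi block-size limit quoted in the slides


def grid_size(total_threads, threads_per_block, num_sms):
    """Number of blocks covering total_threads, rounded up to a multiple of num_sms."""
    if not 0 < threads_per_block <= MAX_THREADS_PER_BLOCK:
        raise ValueError("threads_per_block must be in 1..1024")
    if total_threads <= 0 or num_sms <= 0:
        raise ValueError("total_threads and num_sms must be positive")
    blocks = -(-total_threads // threads_per_block)  # ceiling division
    blocks = max(blocks, num_sms)                    # at least one block per SM
    return -(-blocks // num_sms) * num_sms           # round up to a multiple
```

Because rounding up can launch more threads than there are work items, a kernel launched this way would need a bounds check on its global index.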

Benchmarks
For this project we tested the following benchmarks:
2D Convolution
Matrix Multiplication
Vector Addition
Vector Reduction
Scalar Product
DCT 8x8
3DFD

Limitations of PTX
Higher level than assembly
  o Divide & Sqrt: 1 PTX line, library in assembly
Compiler optimizations from PTX -> assembly
Doesn’t reflect RAW dependencies
Performance counters use assembly

Results

Conclusion
Further Work
  o Take into account context switches
  o Consider multiple kernels running simultaneously

The End Thanks Q&A
