Hardware Support for On-Demand Software Analysis Joseph L. Greathouse Advanced Computer Architecture Laboratory University of Michigan December 8, 2011.

Slides:



Advertisements
Similar presentations
Concurrency: introduction1 ©Magee/Kramer 2 nd Edition Concurrency State Models and Java Programs Jeff Magee and Jeff Kramer.
Advertisements

1 Architectural Complexity: Opening the Black Box Methods for Exposing Internal Functionality of Complex Single and Multiple Processor Systems EECC-756.
Recording Inter-Thread Data Dependencies for Deterministic Replay Tarun GoyalKevin WaughArvind Gopalakrishnan.
1 U NIVERSITY OF M ICHIGAN 11 1 SODA: A Low-power Architecture For Software Radio Author: Yuan Lin, Hyunseok Lee, Mark Woh, Yoav Harel, Scott Mahlke, Trevor.
CS 345 Computer System Overview
Continuously Recording Program Execution for Deterministic Replay Debugging.
SYNAR Systems Networking and Architecture Group CMPT 886: Architecture of Niagara I Processor Dr. Alexandra Fedorova School of Computing Science SFU.
Chapter Hardwired vs Microprogrammed Control Multithreading
Presenter: Jyun-Yan Li Multiprocessor System-on-Chip Profiling Architecture: Design and Implementation Po-Hui Chen, Chung-Ta King, Yuan-Ying Chang, Shau-Yin.
Presenter : Shih-Tung Huang Tsung-Cheng Lin Kuan-Fu Kuo 2015/6/26 EICE team dIP: A Non-Intrusive Debugging IP for Dynamic Data Race Detection in Many-core.
12/1/2005Comp 120 Fall December Three Classes to Go! Questions? Multiprocessors and Parallel Computers –Slides stolen from Leonard McMillan.
1 Last Class: Introduction Operating system = interface between user & architecture Importance of OS OS history: Change is only constant User-level Applications.
Slide 1 Patterson Consulting Radical Proposal: Let’s help real problems Dave Patterson University of California at Berkeley April.
1 OS & Computer Architecture Modern OS Functionality (brief review) Architecture Basics Hardware Support for OS Features.
Thinking in Parallel Adopting the TCPP Core Curriculum in Computer Systems Principles Tim Richards University of Massachusetts Amherst.
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Operating Systems Overview: Using Hardware.
Advanced Operating Systems CIS 720 Lecture 1. Instructor Dr. Gurdip Singh – 234 Nichols Hall –
LOGO OPERATING SYSTEM Dalia AL-Dabbagh
Operating System Review September 10, 2012Introduction to Computer Security ©2004 Matt Bishop Slide #1-1.
15-740/ Oct. 17, 2012 Stefan Muller.  Problem: Software is buggy!  More specific problem: Want to make sure software doesn’t have bad property.
Introduction CSE 410, Spring 2008 Computer Systems
Uncovering the Multicore Processor Bottlenecks Server Design Summit Shay Gal-On Director of Technology, EEMBC.
[Tim Shattuck, 2006][1] Performance / Watt: The New Server Focus Improving Performance / Watt For Modern Processors Tim Shattuck April 19, 2006 From the.
Did we achieve the goals for this course? What did we learn? COMP251: Computer Architecture John MacCormick.
A Case for Unlimited Watchpoints Joseph L. Greathouse †, Hongyi Xin*, Yixin Luo †‡, Todd Austin † † University of Michigan ‡ Shanghai Jiao Tong University.
COMP 111 Threads and concurrency Sept 28, Tufts University Computer Science2 Who is this guy? I am not Prof. Couch Obvious? Sam Guyer New assistant.
Title of Selected Paper: IMPRES: Integrated Monitoring for Processor Reliability and Security Authors: Roshan G. Ragel and Sri Parameswaran Presented by:
A Closer Look At GPUs By Kayvon Fatahalian and Mike Houston Presented by Richard Stocker.
Chapter 10 System Monitoring Issues Performance Benchmarks NT Server Services Users and Server Access Information Task Manager for Applications Ram and.
OPERATING SYSTEM SUPPORT DISTRIBUTED SYSTEMS CHAPTER 6 Lawrence Heyman July 8, 2002.
© 2004, D. J. Foreman 1 Computer Organization. © 2004, D. J. Foreman 2 Basic Architecture Review  Von Neumann ■ Distinct single-ALU & single-Control.
CSC 7600 Lecture 28 : Final Exam Review Spring 2010 HIGH PERFORMANCE COMPUTING: MODELS, METHODS, & MEANS FINAL EXAM REVIEW Daniel Kogler, Chirag Dekate.
On-Demand Dynamic Software Analysis Joseph L. Greathouse Ph.D. Candidate Advanced Computer Architecture Laboratory University of Michigan December 12,
Accelerating Dynamic Software Analyses Joseph L. Greathouse Ph.D. Candidate Advanced Computer Architecture Laboratory University of Michigan December 1,
Highly Scalable Distributed Dataflow Analysis Joseph L. Greathouse Advanced Computer Architecture Laboratory University of Michigan Chelsea LeBlancTodd.
Hardware Mechanisms for Distributed Dynamic Software Analysis Joseph L. Greathouse Advisor: Prof. Todd Austin May 10, 2012.
Operating Systems ECE344 Ashvin Goel ECE University of Toronto Demand Paging.
Debugging Threaded Applications By Andrew Binstock CMPS Parallel.
Platform Abstraction Group 3. Question How to deal with different types hardware and software platforms? What detail to expose to the programmer? What.
Testudo: Heavyweight Security Analysis via Statistical Sampling Joseph L. Greathouse Advanced Computer Architecture Laboratory University of Michigan Ilya.
Sampling Dynamic Dataflow Analyses Joseph L. Greathouse Advanced Computer Architecture Laboratory University of Michigan University of British Columbia.
HARD: Hardware-Assisted lockset- based Race Detection P.Zhou, R.Teodorescu, Y.Zhou. HPCA’07 Shimin Chen LBA Reading Group Presentation.
Real-time aspects Bernhard Weirich Real-time Systems Real-time systems need to accomplish their task s before the deadline. – Hard real-time:
On-Demand Dynamic Software Analysis Joseph L. Greathouse Ph.D. Candidate Advanced Computer Architecture Laboratory University of Michigan November 29,
University of Michigan Electrical Engineering and Computer Science 1 Embracing Heterogeneity with Dynamic Core Boosting Hyoun Kyu Cho and Scott Mahlke.
The Potential of Sampling for Dynamic Analysis Joseph L. GreathouseTodd Austin Advanced Computer Architecture Laboratory University of Michigan PLAS, San.
Demand-Driven Software Race Detection using Hardware Performance Counters Joseph L. Greathouse †, Zhiqiang Ma ‡, Matthew I. Frank ‡ Ramesh Peri ‡, Todd.
GPGPU Performance and Power Estimation Using Machine Learning Gene Wu – UT Austin Joseph Greathouse – AMD Research Alexander Lyashevsky – AMD Research.
Advanced Pipelining 7.1 – 7.5. Peer Instruction Lecture Materials for Computer Architecture by Dr. Leo Porter is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike.
On-Demand Dynamic Software Analysis
Minh, Trautmann, Chung, McDonald, Bronson, Casper, Kozyrakis, Olukotun
MSWAT: Hardware Fault Detection and Diagnosis for Multicore Systems
Secure Software Development: Theory and Practice
Multi-Processing in High Performance Computer Architecture:
Architecture Background
Real-time Software Design
Hardware Mechanisms for Distributed Dynamic Software Analysis
Multi-Processing in High Performance Computer Architecture:
Precision Timed Machine (PRET)
Levels of Parallelism within a Single Processor
COSC121: Computer Systems
Coe818 Advanced Computer Architecture
Fine-grained vs Coarse-grained multithreading
On-Demand Dynamic Software Analysis
Levels of Parallelism within a Single Processor
CS 143A Principles of Operating Systems
Chip&Core Architecture
DMP: Deterministic Shared Memory Multiprocessing
Sampling Dynamic Dataflow Analyses
Presentation transcript:

Hardware Support for On-Demand Software Analysis Joseph L. Greathouse Advanced Computer Architecture Laboratory University of Michigan December 8, 2011

NIST: Software errors cost U.S. ~$60 billion/year FBI: Security Issues cost U.S. $67 billion/year  >⅓ from viruses, network intrusion, etc. Software Errors Abound 2

X = 0 Y=10 X = 5 Parallel Programming is Here – And it’s Hard! Hardware Plays a Role 3 x + y = 10 y + z = 20 Y = Z = 510 Y=10 10

Example of a Modern Bug 4 Thread 2 mylen=large Thread 1 mylen=small ptr ∅ Nov OpenSSL Security Flaw if(ptr == NULL) { len=thread_local->mylen; ptr=malloc(len); memcpy(ptr, data, len); }

Example of a Modern Bug 5 if(ptr==NULL) memcpy(ptr, data2, len2) ptr LEAKED TIME Thread 2 mylen=large Thread 1 mylen=small ∅ len2=thread_local->mylen; ptr=malloc(len2); len1=thread_local->mylen; ptr=malloc(len1); memcpy(ptr, data1, len1)

Thread 2 mylen=large Thread 1 mylen=small if(ptr==NULL) len1=thread_local->mylen; ptr=malloc(len1); memcpy(ptr, data1, len1) len2=thread_local->mylen; ptr=malloc(len2); memcpy(ptr, data2, len2) Data Race Detection 6 Shared? Synchronized? TIME

Data Race Detection is Slow 7 PhoenixPARSEC

Inter-thread Sharing is What’s Important 8 if(ptr==NULL) len1=thread_local->mylen; ptr=malloc(len1); memcpy(ptr, data1, len1) len2=thread_local->mylen; ptr=malloc(len2); memcpy(ptr, data2, len2) Thread-local data NO SHARING Shared data, but NO DYNAMIC SHARING TIME

Very Little Dynamic Sharing 9 PhoenixPARSEC

Little Sharing Means Wasted Work 10 Multi-threaded Application Multi-threaded Application Software Race Detector Local Access Inter-thread sharing

Do Analysis On-Demand! 11 Multi-threaded Application Multi-threaded Application Software Race Detector Local Access Inter-thread sharing Inter-thread Sharing Monitor

Hardware Sharing Detector HITM in Cache Memory: W→R Data Sharing Hardware Performance Counters 12 Read Y S S I I S S I I HITM Core 1Core 2 Pipeline Cache Perf. Ctrs FAULT Write Y=5 M M Y=5

On-Demand Analysis on Real HW 13 Execute Instruction SW Race Detection Enable Analysis Disable Analysis HITM Interrupt? HITM Interrupt? Sharing Recently? Analysis Enabled? NO YES > 97% < 3%

Performance Difference 14 PhoenixPARSEC

Performance Increases 15 PhoenixPARSEC 51x Accuracy vs. Continuous Analysis: 97%

In Summary 16 Hardware makes constructing software difficult. Tools make software better. Hardware can (and should!) help these tools.

BACKUP SLIDES 17

Width Test 18