Hardware Support for On-Demand Software Analysis Joseph L. Greathouse Advanced Computer Architecture Laboratory University of Michigan December 8, 2011
NIST: Software errors cost U.S. ~$60 billion/year FBI: Security Issues cost U.S. $67 billion/year >⅓ from viruses, network intrusion, etc. Software Errors Abound 2
X = 0 Y=10 X = 5 Parallel Programming is Here – And it’s Hard! Hardware Plays a Role 3 x + y = 10 y + z = 20 Y = Z = 510 Y=10 10
Example of a Modern Bug 4 Thread 2 mylen=large Thread 1 mylen=small ptr ∅ Nov OpenSSL Security Flaw if(ptr == NULL) { len=thread_local->mylen; ptr=malloc(len); memcpy(ptr, data, len); }
Example of a Modern Bug 5 if(ptr==NULL) memcpy(ptr, data2, len2) ptr LEAKED TIME Thread 2 mylen=large Thread 1 mylen=small ∅ len2=thread_local->mylen; ptr=malloc(len2); len1=thread_local->mylen; ptr=malloc(len1); memcpy(ptr, data1, len1)
Thread 2 mylen=large Thread 1 mylen=small if(ptr==NULL) len1=thread_local->mylen; ptr=malloc(len1); memcpy(ptr, data1, len1) len2=thread_local->mylen; ptr=malloc(len2); memcpy(ptr, data2, len2) Data Race Detection 6 Shared? Synchronized? TIME
Data Race Detection is Slow 7 PhoenixPARSEC
Inter-thread Sharing is What’s Important 8 if(ptr==NULL) len1=thread_local->mylen; ptr=malloc(len1); memcpy(ptr, data1, len1) len2=thread_local->mylen; ptr=malloc(len2); memcpy(ptr, data2, len2) Thread-local data NO SHARING Shared data, but NO DYNAMIC SHARING TIME
Very Little Dynamic Sharing 9 PhoenixPARSEC
Little Sharing Means Wasted Work 10 Multi-threaded Application Multi-threaded Application Software Race Detector Local Access Inter-thread sharing
Do Analysis On-Demand! 11 Multi-threaded Application Multi-threaded Application Software Race Detector Local Access Inter-thread sharing Inter-thread Sharing Monitor
Hardware Sharing Detector HITM in Cache Memory: W→R Data Sharing Hardware Performance Counters 12 Read Y S S I I S S I I HITM Core 1Core 2 Pipeline Cache Perf. Ctrs FAULT Write Y=5 M M Y=5
On-Demand Analysis on Real HW 13 Execute Instruction SW Race Detection Enable Analysis Disable Analysis HITM Interrupt? HITM Interrupt? Sharing Recently? Analysis Enabled? NO YES > 97% < 3%
Performance Difference 14 PhoenixPARSEC
Performance Increases 15 PhoenixPARSEC 51x Accuracy vs. Continuous Analysis: 97%
In Summary 16 Hardware makes constructing software difficult. Tools make software better. Hardware can (and should!) help these tools.
BACKUP SLIDES 17
Width Test 18