Hardware Trends CSE451 Andrew Whitaker
Motivation Hardware moves quickly OS code tends to stick around for a while “System building” extends way beyond the operating system
Famous Quotes “I think there is a world market for maybe five computers” Thomas Watson, IBM, 1943 “There is no reason for any individual to have a computer in their home” Ken Olson, Digital, 1977 “640K ought to be enough for anybody” Bill Gates, 1981
CPU Performance Trends Processor performance stopped doubling every 18 months in 2002 Source: David Patterson
Why Has CPU Performance Growth Slowed? Moore’s law continues to provide more transistors But: Chip designs have gotten too complex Communication times have gotten (relatively) longer Circuit noise becomes a factor at small scale Bottom line: automatic processor performance improvements are a thing of the past
Industry Response: Multi-Core Architectures
Sidebar: Educational Ramifications “Given this sea change, how much of the curriculum and what fraction of the CS faculty is oblivious to concurrency? How many algorithms, data structures, languages, compilers, debuggers, operating systems, books, and lectures must change to match the transformation in underlying technology if they are to remain relevant in the 21st century?” David Patterson -- Computer Science Education in the 21st Century
Grand Challenge of Parallel Computing How do we get a speedup of N on an N- way multi-processor? Software must be parallelizable Speedup can refer to: Latency: the length of time to complete a single task Throughput: the rate at which tasks are completed
The Human Baby Problem Problem: produce a human baby Q: Is this problem parallelizable? Latency: No 9 women cannot make a baby in a month Throughput: Yes 9 women can produce 1 baby/month
Speeding Up an Internet Service Which is easier for Amazon.com? Handling more customers Making individual customer transactions go faster?
Parallelization Theory Amdahl’s Law predicts speedup on a parallel machine N: number of processors F: Fraction of computation that is sequential Sequential == not parallelizable Embarrassingly parallel: F 0 F + (1 - F) / N 1 speedup =
Case Study: Merge Sort Given N processors, what speedup is possible?
Parallelization in Practice Predicting performance is difficult because there are many flavors of parallelism Multiple processors Symmetric multiprocessors (SMPs) Multi-core processors Multi-threaded processors (“Hyperthreading”) Clusters of machines Running the software is the only way to know for sure…
Disk Storage Trends Doubled every 3+ years 25% improvement each year Factor of 10 every decade Still exponential, but far less rapid than processor performance Since 1990 Doubling every 12 months 100% improvement each year Factor of 1000 every decade 10x as fast as processor performance!
Memory Capacity Continues to track Moore’s law Ed Lazowska: “I remember pulling all kinds of strings to get a special deal: 512K of VAX-11/780 memory for $30,000” Today (2006): roughly $300 / GB
© 2004 Jim Gray, Microsoft Corporation
A Quick Intro to Threads
public class Yo extends Thread { private static int x = 7; public void run () { x++; } public static void main (String [] args) { Thread t1 = new Yo(); Thread t2 = new Yo(); t1.start(); t2.start(); try { t1.join(); t2.join(); } catch (InterruptedException iex) { } System.out.println("Value of x: " + x); } What is wrong with this program?
Breaking Down x++ x++ in MIPS assembly: Key point: x++ is not atomic Requires multiple instructions to complete its work This is a race condition Program’s output depends on thread scheduling lw $t, offset($s) addi $t, $t, 1 sw $t, offset($s)