Parallel Computing Laxmikant Kale
2 Advent of parallel computing “Parallel computing is necessary to increase speeds” –cry of the ‘70s –processors kept pace with Moore’s law: Doubling speeds every 18 months Now, finally, the time is ripe –uniprocessors are commodities (and processor speeds show signs of slowing down) –Highly economical to build parallel machines
3 What is Parallel Computing? Use of multiple processors to solve a single computational problem faster. –Distinct from distributed computing Why parallel computing? –“Parallel computing is necessary to increase speeds” –cry of the ‘70s –processors kept pace with Moore’s law: Doubling speeds every 18 months Now, finally, the time is ripe –uniprocessors are commodities (and processor speeds show signs of slowing down) –Highly economical to build parallel machines
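A minimal sketch of "multiple processors solving a single problem": one sum is split into chunks, each worker sums its chunk, and the partial sums are combined. The names `chunked` and `parallel_sum` are illustrative and not from the slides. Threads are used only to keep the example short; in standard CPython they usually do not speed up CPU-bound work, so this shows the decomposition rather than a measured speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, n_chunks):
    """Split data into n_chunks contiguous pieces of nearly equal size."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sum(data, n_workers=4):
    """Sum data by giving each worker one chunk, then combining partial sums."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum, chunked(data, n_workers)))
    return sum(partial_sums)


numbers = list(range(1_000_000))
# The parallel decomposition must give the same answer as the serial sum.
assert parallel_sum(numbers) == sum(numbers)
```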
4 Why parallel computing It is the only way to increase speed beyond uniprocessors –Except, of course, waiting for uniprocessors to become faster! –Several applications require orders of magnitude higher performance than feasible on uniprocessors Cost effectiveness: –older argument –in 1985, a supercomputer cost 2000 times more than a desktop, yet performed only 400 times faster. –So: combine microcomputers to get speed at lower costs –Incremental scalability: can get in-between performance points with 20, 50, 100,… processors –But: You may get speedup lower than 400 on 2000 processors! Microcomputers became faster, effectively killing supercomputers
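The cost-effectiveness argument can be checked with a little arithmetic. The sketch below uses only the slide's 1985 figures (2000 times the cost, 400 times the speed). The function names, and the assumption that 2000 desktop-class processors cost about as much as the supercomputer, are illustrative and not from the slides.

```python
# Figures from the slide (1985): a supercomputer costs 2000x a desktop
# and runs 400x faster. Units are "one desktop" for both cost and speed.
SUPER_COST = 2000.0
SUPER_SPEED = 400.0


def cost_per_unit_speed(cost, speed):
    """Cost paid for each unit of desktop-equivalent performance."""
    return cost / speed


def break_even_efficiency(n_processors, target_speed):
    """Parallel efficiency (speedup / processors) needed to reach target_speed."""
    return target_speed / n_processors


# The supercomputer costs 5 desktop-prices per unit of desktop speed.
super_ratio = cost_per_unit_speed(SUPER_COST, SUPER_SPEED)

# Assumption: 2000 microprocessors cost about as much as the supercomputer.
# They match it only if the program reaches a speedup of at least 400,
# that is, a parallel efficiency of 400 / 2000 = 20%.
needed = break_even_efficiency(2000, SUPER_SPEED)

print(f"supercomputer cost per unit speed: {super_ratio:.1f}")
print(f"efficiency needed on 2000 processors: {needed:.0%}")
```

This is the slide's warning in numbers: the parallel machine wins on cost only when the application sustains better than 20% efficiency on 2000 processors.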
5 Technology Trends The natural building block for multiprocessors is now also about the fastest!
6 Architectural Trends Greatest trend in VLSI generation is increase in parallelism –Up to 1985: bit level parallelism: 4-bit -> 8-bit -> 16-bit; slows after 32-bit; adoption of 64-bit now under way, 128-bit far off (not a performance issue); great inflection point when 32-bit micro and cache fit on a chip –Mid 80s to mid 90s: instruction level parallelism: pipelining and simple instruction sets, + compiler advances (RISC); on-chip caches and functional units => superscalar execution; greater sophistication: out of order execution, speculation, prediction –to deal with control transfer and latency problems
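To illustrate bit-level parallelism, the sketch below adds two 32-bit unsigned integers using an adder of a given word width and counts how many word-sized additions are needed. The function `add_with_word_size` and its parameters are hypothetical names for this example; it models only the operation count, not real hardware timing.

```python
def add_with_word_size(a, b, word_bits, total_bits=32):
    """Add two total_bits-wide unsigned integers using word_bits-wide adds.

    Returns (result modulo 2**total_bits, number of word-sized add steps).
    """
    mask = (1 << word_bits) - 1
    result, carry, steps = 0, 0, 0
    for shift in range(0, total_bits, word_bits):
        # One ALU operation: add one word of each operand plus the carry in.
        s = ((a >> shift) & mask) + ((b >> shift) & mask) + carry
        result |= (s & mask) << shift
        carry = s >> word_bits
        steps += 1
    return result & ((1 << total_bits) - 1), steps


# Wider words handle more bits per operation, so fewer steps are needed.
for width in (4, 8, 16, 32):
    total, steps = add_with_word_size(123456789, 987654321, width)
    print(f"{width:2d}-bit adder: {steps} step(s), result {total}")
```

Once the word is as wide as the data being added, widening it further removes no steps, which is consistent with the slide's note that gains slow after 32 bits.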
7 Economics Commodity microprocessors not only fast but CHEAP Development cost is tens of millions of dollars (5-100 typical) BUT, many more are sold compared to supercomputers –Crucial to take advantage of the investment, and use the commodity building block –Exotic parallel architectures no more than special-purpose Multiprocessors being pushed by software vendors (e.g. database) as well as hardware vendors Standardization by Intel makes small, bus-based SMPs commodity Desktop: few smaller processors versus one larger one? –Multiprocessor on a chip
8 What to Expect? Parallel Machine classes: –Cost and usage defines a class! Architecture of a class may change. –Desktops, Engineering workstations, database/web servers, supercomputers Commodity (home/office) desktop: –less than $5,000 –possible to provide 5-25 processors for that price! –Driver applications: games, video/signal processing, possibly “peripheral” AI: speech recognition, natural language understanding (?), smart spaces and agents New applications?
9 Engineering workstations Price: less than $100,000 (used to be): –new acceptable price level may be $50,000 –100+ processors, large memory –Driver applications: CAD (computer-aided design) of various sorts, VLSI, structural and mechanical simulations… Etc. (many specialized applications)
10 Commercial Servers Price range: variable ($10,000 - several hundreds of thousands) –defining characteristic: usage –Database servers, decision support (MIS), web servers, e-commerce High availability, fault tolerance are main criteria Trends to watch out for: –Likely emergence of specialized architectures/systems E.g. Oracle’s “No Native OS” approach Currently dominated by database servers, and TPC benchmarks –TPC: Transaction Processing Performance Council; its benchmarks measure transaction throughput –But this may change to data mining and application servers, with corresponding impact on architecture.
11 Supercomputers “Definition”: expensive system?! –Used to be defined by architecture (vector processors,..) –More than a million US dollars? –Thousands of processors Driving applications –Grand challenges in science and engineering: –Global weather modeling and forecast –Rational Drug design / molecular simulations –Processing of genetic (genome) information –Rocket simulation –Airplane design (wings and fluid flow..) –Operations research?? Not recognized yet –Other non-traditional applications?
12 Scientific Computing Demand
13 Engineering Computing Demand Large parallel machines a mainstay in many industries –Petroleum (reservoir analysis) –Automotive (crash simulation, drag analysis, combustion efficiency) –Aeronautics (airflow analysis, engine efficiency, structural mechanics, electromagnetism) –Computer-aided design –Pharmaceuticals (molecular modeling) –Visualization: in all of the above, entertainment (films like Toy Story), architecture (walk-throughs and rendering) –Financial modeling (yield and derivative analysis) –etc.
14 What is Challenging? Writing parallel programs is difficult –Office worker analogy Issues of Coordination: “I thought you were going to get the pizza”; Asynchrony: what happens before what; Race conditions: can’t determine which will happen first; And finally: Performance!
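The race-condition and coordination issues can be sketched with a shared counter. In the example below (the names `run`, `N_THREADS`, and `INCREMENTS` are illustrative), several threads increment one variable. With a lock the updates are coordinated and the total is always correct; without it, the read-modify-write steps of different threads can interleave and updates may be lost. Whether a loss actually shows up on a given run depends on the interpreter and on timing, which is what makes such bugs hard to find.

```python
import threading

N_THREADS = 4
INCREMENTS = 50_000


def run(use_lock):
    """Have several threads increment one shared counter; return the final value."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(INCREMENTS):
            if use_lock:
                # Coordination: only one thread at a time updates the counter.
                with lock:
                    counter += 1
            else:
                # Read-modify-write with no coordination: two threads may
                # read the same old value, and one update can be lost.
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(N_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


expected = N_THREADS * INCREMENTS
print("with lock:   ", run(use_lock=True), "expected", expected)
# Without the lock the result may or may not equal `expected`;
# it depends on how the threads happen to interleave.
print("without lock:", run(use_lock=False), "expected", expected)
```

The lock also illustrates the performance concern: serializing every update keeps the result correct but removes the parallelism for that step.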