(Short) Introduction to Parallel Computing CS 6560: Operating Systems Design
2 Why Parallel Computing? Performance! Many applications require serious performance. Examples: structural biology, chemical dynamics, pharmaceutical design, weather forecasting, the human genome, ocean modeling.
3 Processor Performance: Need Parallelism! [Figure: processor performance plot; annotation: 2-3 GHz]
4 Case Study 1: Simulating Ocean Currents. Model as two-dimensional grids. Discretize in space and time: finer spatial and temporal resolution => greater accuracy. Many different computations per time step: set up and solve equations. Concurrency across and within grid computations. [Figure: (a) Cross sections; (b) Spatial discretization of a cross section]
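To make "concurrency within a grid computation" concrete, here is a minimal sketch of one relaxation sweep over a two-dimensional grid. The slide does not say which solver the ocean code uses, so the four-neighbor (Jacobi-style) average below is an assumption chosen for illustration, and `jacobi_step` is a hypothetical name.

```python
def jacobi_step(grid):
    """Return a new grid in which every interior point is the average
    of its four neighbors in the old grid (assumed stand-in solver)."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # boundary values are copied unchanged
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Each update reads only the old grid, so all interior points
            # are independent and could be computed on different processors.
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


grid = [[1.0, 1.0, 1.0],
        [1.0, 0.0, 1.0],
        [1.0, 1.0, 1.0]]
print(jacobi_step(grid)[1][1])
```

Because a sweep writes into a separate grid, rows (or blocks of rows) can be handed to different processors with communication needed only along the block borders.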
5 Case Study 2: Simulating Galaxy Evolution. Simulate interactions of many stars evolving over time. Computing forces is expensive: the brute-force approach is $O(n^2)$. Hierarchical methods take advantage of the force law: $G \frac{m_1 m_2}{r^2}$.
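The brute-force approach can be sketched as a double loop over all pairs of stars. This is an illustrative sketch only: it works in two dimensions, assumes no two bodies share a position, and the name `pairwise_forces` is hypothetical.

```python
import math

G = 6.674e-11  # gravitational constant in SI units


def pairwise_forces(masses, positions, g=G):
    """Compute the net gravitational force on each body by visiting
    every pair once, which costs O(n^2) work for n bodies."""
    n = len(masses)
    forces = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            r2 = dx * dx + dy * dy          # squared distance (assumed nonzero)
            r = math.sqrt(r2)
            f = g * masses[i] * masses[j] / r2  # magnitude: G m1 m2 / r^2
            fx, fy = f * dx / r, f * dy / r     # direction from i toward j
            forces[i][0] += fx
            forces[i][1] += fy
            forces[j][0] -= fx              # equal and opposite force on j
            forces[j][1] -= fy
    return forces


print(pairwise_forces([1.0, 1.0], [(0.0, 0.0), (1.0, 0.0)]))
```

Since the force falls off as $1/r^2$, a distant cluster of stars can be approximated by a single point mass, which is the property hierarchical methods exploit to avoid visiting every pair.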
6 Case Study 2: Barnes-Hut. Many time steps, plenty of concurrency across stars. Locality goal: particles close together in space should be on the same processor. Difficulties: non-uniform, dynamically changing. [Figure: Spatial domain; Quad-tree]
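A quad-tree over the spatial domain can be sketched as follows. This is a minimal illustration, not the Barnes-Hut code the slides refer to: the class name `QuadNode`, its fields, and the rule of one body per leaf are assumptions, and bodies are assumed to lie inside the root square at distinct positions.

```python
class QuadNode:
    """A square region holding either one body (leaf) or four child squares."""

    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half  # center and half-width
        self.mass = 0.0           # total mass inside this square
        self.mx = self.my = 0.0   # mass-weighted coordinate sums
        self.body = None          # (x, y, m) when this leaf holds one body
        self.children = None      # four QuadNode objects once subdivided

    def _child_for(self, x, y):
        # Quadrant index: bit 0 is the right half, bit 1 is the upper half.
        idx = (1 if x >= self.cx else 0) + (2 if y >= self.cy else 0)
        return self.children[idx]

    def _subdivide(self):
        h = self.half / 2
        self.children = [QuadNode(self.cx - h, self.cy - h, h),
                         QuadNode(self.cx + h, self.cy - h, h),
                         QuadNode(self.cx - h, self.cy + h, h),
                         QuadNode(self.cx + h, self.cy + h, h)]

    def insert(self, x, y, m):
        # Every node on the path accumulates the mass of the new body.
        self.mass += m
        self.mx += m * x
        self.my += m * y
        if self.children is None:
            if self.body is None:      # empty leaf: store the body here
                self.body = (x, y, m)
                return
            old, self.body = self.body, None
            self._subdivide()          # occupied leaf: split, push old body down
            self._child_for(old[0], old[1]).insert(*old)
        self._child_for(x, y).insert(x, y, m)

    def center_of_mass(self):
        return (self.mx / self.mass, self.my / self.mass)


root = QuadNode(0.0, 0.0, 1.0)
root.insert(-0.5, -0.5, 1.0)
root.insert(0.5, 0.5, 3.0)
print(root.center_of_mass())
```

The tree is deeper where stars are dense, which is why the work per region is non-uniform, and it must be rebuilt or updated as stars move, which is why the partition changes dynamically.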
7 Case Study 3: Rendering by Ray Tracing. Goal is to produce an image from a representation of the real world. Shoot rays into the scene through pixels in the projection plane; the result is a color for each pixel. Rays shot through pixels in the projection plane are called primary rays. They reflect and refract when they hit objects. This recursive process generates a ray tree per primary ray. Tradeoffs between execution time and image quality. [Figure: viewpoint, projection plane, 3D scene; ray from viewpoint to upper right corner pixel; dynamically generated ray]
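One way to see the tradeoff between execution time and image quality is to count the rays in a ray tree. The sketch below assumes the worst case in which every ray hits an object and spawns exactly one reflected and one refracted ray, with no other secondary rays; `count_rays` and the depth limit are hypothetical names for illustration.

```python
def count_rays(depth, max_depth):
    """Count the rays in a worst-case ray tree rooted at one ray.

    Assumption: every ray above the depth limit hits an object and
    spawns one reflected and one refracted ray.
    """
    if depth == max_depth:
        return 1  # rays at the depth limit spawn nothing further
    # This ray, plus the two subtrees it spawns.
    return 1 + 2 * count_rays(depth + 1, max_depth)


for d in range(4):
    print(d, count_rays(0, d))
```

Under that assumption the count per primary ray is $2^{d+1} - 1$ for depth limit $d$, so each extra level of reflection and refraction roughly doubles the work per pixel. Real ray trees vary in size from pixel to pixel, which is what makes the work hard to balance statically.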
8 Partitioning. Need dynamic assignment. Use contiguous blocks to exploit spatial coherence among neighboring rays, plus tiles for task stealing. A block is the unit of assignment; a tile is the unit of decomposition and stealing.
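The block and tile scheme can be sketched with one queue of tiles per worker. This is a simplified, single-threaded illustration: a block here is a contiguous run of tiles in row-major order rather than a two-dimensional region, the victim-selection rule (steal from the fullest queue) is an assumption, and all function names are hypothetical.

```python
from collections import deque


def make_tiles(width, height, tile):
    """Split an image into tile-sized rectangles (x0, y0, x1, y1), row by row."""
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in range(0, height, tile)
            for x in range(0, width, tile)]


def assign_blocks(tiles, workers):
    """Give each worker one contiguous run of tiles (its block) as a queue."""
    per = -(-len(tiles) // workers)  # ceiling division
    return [deque(tiles[w * per:(w + 1) * per]) for w in range(workers)]


def next_tile(queues, me):
    """Take the next tile from the worker's own block, or steal one."""
    if queues[me]:
        return queues[me].popleft()       # stay inside our own block
    victim = max(range(len(queues)), key=lambda w: len(queues[w]))
    if queues[victim]:
        return queues[victim].pop()       # steal from the far end of the victim
    return None                           # no work left anywhere


tiles = make_tiles(8, 8, 2)
queues = assign_blocks(tiles, 4)
for _ in range(5):
    print(next_tile(queues, 0))
```

Working through its own block in order keeps a worker on neighboring rays, while stealing single tiles from the far end of another queue moves work between processors only when a worker would otherwise sit idle.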
9 Sample Speedups. Speedups on a NUMA multiprocessor. Speedup = (best) time on 1 processor / time on multiple processors.
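As a small worked example of the speedup definition, the sketch below uses made-up timings (they are not measurements from the slides). The point of "(best)" is that the numerator should be the fastest sequential program, not the parallel program run on one processor.

```python
def speedup(best_serial_time, parallel_time):
    """Speedup = (best) time on 1 processor / time on multiple processors."""
    return best_serial_time / parallel_time


# Hypothetical timings in seconds, for illustration only.
best_serial = 120.0          # fastest known sequential program
parallel_on_1 = 150.0        # parallel program run on a single processor
parallel_on_8 = 20.0         # parallel program run on 8 processors

print(speedup(best_serial, parallel_on_8))    # speedup as defined above
print(speedup(parallel_on_1, parallel_on_8))  # inflated "self-relative" figure
```

With these numbers the defined speedup is 6.0, while dividing by the parallel program's own one-processor time would report 7.5 and hide the overhead that parallelization added.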
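10 Ideal/Linear Speedup? Amdahl’s Law: if a fraction $s$ of a computation is not parallelizable, then the speedup on $p$ processors is at most $\frac{1}{s + (1 - s)/p}$, so the best achievable speedup is $\frac{1}{s}$ no matter how many processors are used.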
11 Pictorial Depiction of Amdahl’s Law [Figure: execution time on 1 and on p processors, divided into sequential work and parallelizable work]
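Amdahl's Law is easy to evaluate numerically. The sketch below uses the standard form of the bound, $1 / (s + (1 - s)/p)$ for serial fraction $s$ on $p$ processors, which approaches $1/s$ as $p$ grows; the function names are hypothetical.

```python
def amdahl_speedup(s, p):
    """Upper bound on speedup with serial fraction s on p processors."""
    return 1.0 / (s + (1.0 - s) / p)


def amdahl_limit(s):
    """Bound on speedup as the number of processors grows without limit."""
    return 1.0 / s


# With 10% sequential work, adding processors gives diminishing returns.
for p in (1, 2, 8, 64, 1024):
    print(p, round(amdahl_speedup(0.10, p), 2))
print("limit", amdahl_limit(0.10))
```

Even a small sequential fraction dominates at scale: with 10% sequential work, no number of processors can deliver a speedup of more than 10.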
12 But Goal is not just Performance. At some point, we’re willing to trade some performance for ease of programming, high portability, and low cost. Ease of programming & high portability: parallel programming for the masses; leverage new or faster hardware as soon as possible. Low cost: high-end parallel machines are expensive resources.
13 Parallel Applications. Scientific computing is not the only class of parallel applications. Examples of non-scientific parallel applications: data mining, real-time rendering, distributed servers. Today, programmers are encouraged to find parallelism in all sorts of software.