Published by Leandro Lima Prada. Modified over 6 years ago.
1
Software Enablement for Multicore Architectures
David Bernstein, Bilha Mendelson
2
Technology Scaling – We’ve Hit The Wall
[Figure: relative device performance (axis from 0.2 to 20) by year (1988 to 2012) for conventional bulk CMOS, SOI (silicon-on-insulator), high mobility, and double-gate devices; the plot includes a "?" marker.]
11/14/2018
3
Has This Ever Happened Before?
[Figure: Watts / cm2 (axis from 20 to 140) by year (1950 to 2010) for bipolar and CMOS machines, from the IBM 360 through the Pentium 4, with the start of water cooling marked and a "?" after the latest points. Source: Bernie Meyerson, IBM]
4
Industry trends
- Intel Quad-Core
- Sun’s 8-Core Chips: T1 - Niagara
- Cell Broadband Engine
5
Hierarchy of Modular Building Blocks
- Systems will increasingly need to implement a hybrid execution model.
- New programming systems need to reduce the need for programmer awareness of the topology on which their program executes.
- The next-generation programming system must support programming simplicity while leveraging the performance of the underlying HW topology.
Levels of the hierarchy (from the diagram):
- Grid/Cluster (high speed network): hierarchical SMP servers with non-uniform memory access characteristics
- Rack (high speed network): hierarchical SMP servers with NUMA characteristics
- Board (SMP interconnect):
  - Homogeneous SMP on board: 2 – 128 HW contexts on board
  - Main processor(s) with accelerator(s): master-slave relationship between entities
- Chip:
  - Homogeneous SMP on chip: 2 – 32 HW contexts on chip, various forms of resource sharing
  - Heterogeneous collection of processors on chip: heterogeneity at data and control flow level
- Core: will support multiple HW threads sharing a single cache, exhibiting SMP characteristics
Diagram labels: Memory, Cache, I/O Attach, Interconnect Fabric, Mem Ctrl, Core
6
Architecture trends
- Several processor cores on a chip and specialized computing engines
  - XML processing, cryptography, graphics
- Questions:
  - how to interconnect a large number of processor cores
  - how to provide sufficient memory bandwidth
  - how to structure the multilevel caching subsystem
  - how to balance the general purpose computing resources with specialized processing engines and all the supporting memory, caching and interconnect structure, given a constant power budget
- Software development processes
  - how to program for multicore architectures
  - how to test and evaluate the performance of multithreaded applications
7
Programming multiprocessor systems
- Two main directions:
  - explicit manual programming
  - exploit the combination of compiler optimization, build tool chains, and run-time subsystems
- In HPC and embedded communities:
  - emphasis was more on explicit manual programming and special resources by expert programmers
  - resulted in numerous home-grown language directives and extensions, internal tools, obscure run-time systems
  - hardly portable to new generations of hardware
8
Programming languages
- Very few new languages were invented in the last 2 decades
  - Java - virtual machine, interpreter, JIT, garbage collection, set of libraries, etc.
- Can multicore spur development of a new language/environment for parallelism?
  - map-reduce, Cilk, UPC, X10, and STAPL
  - programmers can provide additional information related to parallelism
- Multicore provides multiple types of parallelism
  - thread-level parallelism (TLP) – coarse-grain
    - OpenMP - standard for shared-memory models
    - MPI - standard for distributed-memory models
    - pthreads, Java threads - explicitly
    - use automatic parallelization optimizations
      - Most of the original auto-parallelizing compilers focused on FORTRAN
  - data-level parallelism (DLP) – fine-grain
    - auto-vectorization, auto-simdification
- What about asymmetric multicore architectures (like the Cell processor)?
  - is it possible to have a single source compilation for multiple ISAs? - initial attempts…
  - how OpenMP can be used for programs - streaming
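The coarse-grain, map-reduce style of thread-level parallelism listed above can be sketched as follows. This is only an illustration, not code from the presentation: the function names (`chunked`, `partial_sum_of_squares`, `parallel_sum_of_squares`) and the workload are hypothetical. It uses Python's standard `concurrent.futures` thread pool; note that in CPython the global interpreter lock limits the speedup of CPU-bound threads, so the sketch shows the structure (split, map per chunk, reduce) rather than a performance result.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, n_chunks):
    """Split data into at most n_chunks contiguous slices."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def partial_sum_of_squares(chunk):
    # "Map" step: each worker processes one coarse-grain chunk independently.
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    chunks = chunked(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    # "Reduce" step: combine the per-chunk partial results.
    return sum(partials)


result = parallel_sum_of_squares(list(range(1000)))
```

Because the chunks share no mutable state, no locking is needed; the only synchronization point is the final reduction.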
9
Performance Analysis Tools
- Profile based tools – data aggregation
  - FDPR-Pro, Code Analyzer, Diablo
- Performance evaluation is heavily influenced by thread interaction
  - stalls, locks, races, memory thrashing, polluted hardware counters
- Trace-based analysis and visualization
  - introduces timeline views and data to deal with communication issues
  - lack of scalability: traces tend to grow fast, making them difficult to manipulate and visualize
  - In HPC context: selecting arbitrary subset of cores/threads and arbitrary time intervals
  - tracing might disturb the program's behavior
  - HPCToolkit, TAU, Paraver, VTune, Code Analyzer, PDT, Trace Analyzer
- Lack of determinism
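The difference between a trace and a profile, and the idea of selecting a subset of threads and a time interval, can be sketched as below. This is a minimal illustration under stated assumptions: the record layout `(thread_id, start, end, region)`, the sample data, and the helper names `select` and `profile` are all hypothetical and do not reflect the format of any tool named on the slide.

```python
from collections import defaultdict

# Hypothetical trace records: (thread_id, start_time, end_time, region_name)
trace = [
    (0, 0.0, 2.0, "compute"),
    (1, 0.0, 1.5, "compute"),
    (1, 1.5, 3.0, "lock_wait"),
    (0, 2.0, 2.5, "lock_wait"),
    (0, 2.5, 4.0, "compute"),
]


def select(events, threads, t_begin, t_end):
    """Keep events of the chosen threads, clipped to [t_begin, t_end]."""
    selected = []
    for tid, start, end, region in events:
        if tid not in threads:
            continue
        lo, hi = max(start, t_begin), min(end, t_end)
        if lo < hi:
            selected.append((tid, lo, hi, region))
    return selected


def profile(events):
    """Aggregate a trace into total time per region (a flat profile)."""
    totals = defaultdict(float)
    for _tid, start, end, region in events:
        totals[region] += end - start
    return dict(totals)


full_profile = profile(trace)
window_profile = profile(select(trace, threads={1}, t_begin=1.0, t_end=3.0))
```

The profile discards ordering and per-thread detail, which keeps it small; the trace retains both, which is what makes timeline views possible and also what makes traces grow quickly.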
10
Performance tools for multi-core: Cell
[Diagram: Visual Performance Analyzer 5.0 and Cell SDK 3.0]
- Components: Profile Analyzer, Code Analyzer, Pipeline Analyzer, Trace Analyzer, PDT, Lock Analyzer
- Annotations in the diagram:
  - Infrastructure for collecting profiles on several systems
  - Infrastructure for using databases for large data sets
  - Set of interconnected views
  - Cell support
  - Infrastructure for collecting traces on SDK 3.0 libraries
  - Analysis of lock usage
  - Input for Trace Analyzer
11
Debugging and testing tools
- Concurrency problems constitute about 10% of the bugs
- Bugs like crashes (races) or freezes (deadlocks) stay in the application, reducing the up-time
- Testing is done at load testing - very late in the process
- We have been working on a tool supported methodology
- Try to find the concurrency issues as early as possible:
  - teach how to write concurrent code
    - concurrent bug patterns
    - explain the concurrent programming constructs
    - teach general concurrency design patterns
  - reviews - developed a specialized review technique for concurrent code
  - teach how to do unit testing - developed synchronization coverage
- ConTest - a tool supported method for measuring contention
- Make tests that are likely to exhibit bugs - changing the internal timing
- Tools for pinpointing locations of bugs, if we have a test that can cause the application to fail some of the time
- Healing bugs so that the impact will not be seen
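The idea of making a test more likely to exhibit a concurrency bug by changing the internal timing can be sketched as follows. This is not ConTest and does not use its API; it is a hand-written illustration in which `noise`, `Counter`, and `run` are hypothetical names. A random delay is injected between the read and the write of a shared counter, which widens the window in which a race can occur.

```python
import random
import threading
import time


def noise():
    # Hypothetical instrumentation point: a short random delay that
    # perturbs the thread interleaving.
    time.sleep(random.uniform(0, 0.001))


class Counter:
    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def racy_increment(self):
        current = self.value      # read
        noise()                   # widen the race window
        self.value = current + 1  # write: may overwrite another thread's update

    def safe_increment(self):
        with self.lock:           # read-modify-write is now atomic
            current = self.value
            noise()
            self.value = current + 1


def run(method_name, n_threads=4, per_thread=10):
    counter = Counter()

    def worker():
        for _ in range(per_thread):
            getattr(counter, method_name)()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


expected = 4 * 10
racy_total = run("racy_increment")  # usually below expected: updates are lost
safe_total = run("safe_increment")  # always equals expected
```

The racy result varies from run to run, which is the lack of determinism that makes such bugs hard to reproduce without timing perturbation; the locked variant stays correct regardless of the injected delays.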
12
Software trends
- Software enablement system for multicores
  - Various directions for providing solutions
  - Active area of research
    - only some early results in the academic and industrial worlds in terms of established standards and technology
    - much more will evolve in the years to come
- Need:
  - programming models and compiler support for multicores
  - performance evaluation tools
  - testing and debugging tools
13
Thank You