Published by Andrea Mills. Modified over 9 years ago.
1
DATA STRUCTURES OPTIMISATION FOR MANY-CORE SYSTEMS
Matthew Freeman | Supervisor: Maciej Golebiewski
CSIRO Vacation Scholar Program 2013-14
2
The Multi-core Age
Mobile Phone: 2-4 cores
PC: 4-16 cores
Intel Xeon Phi: 61 cores
CSIRO ‘Bragg’ Compute Cluster: 2048 cores
3
Programming for multi-cores
Divide the problem.
[Diagram: the problem is split into machine instructions that execute on CPU Core 1, CPU Core 2, CPU Core 3 and CPU Core 4.]
4
Amdahl's Law
The maximum speedup depends on the percentage of the problem that can run in parallel.
Single-core processor: 1x speed
50% parallel: 2x maximum speedup
75% parallel: 4x maximum speedup
90% parallel: 10x maximum speedup
95% parallel: 20x maximum speedup
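The figures on this slide follow from the usual statement of Amdahl's Law, speedup = 1 / ((1 - p) + p / n). A minimal sketch is given here, where the function names and the parameters `p` (parallel fraction) and `n` (core count) are illustrative and not taken from the slides:

```cpp
#include <cassert>

// Amdahl's Law: speedup on n cores when a fraction p of the work can run
// in parallel. The names p and n are illustrative assumptions.
double amdahl_speedup(double p, double n) {
    assert(p >= 0.0 && p <= 1.0 && n >= 1.0);
    return 1.0 / ((1.0 - p) + p / n);
}

// Upper bound on the speedup as the core count grows without limit.
double amdahl_max_speedup(double p) {
    assert(p >= 0.0 && p < 1.0);
    return 1.0 / (1.0 - p);
}
```

For example, `amdahl_max_speedup(0.95)` gives the 20x bound listed above, while `amdahl_speedup(0.95, 61)` gives 15.25x for a 61-core processor.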
5
Data structures
Memory (data) is still a shared resource.
[Diagram: in a single-core computer, one CPU core accesses the memory (data); in a 4-core computer, the CPU cores all access the same memory (data).]
6
Linked-list (Stack) Data Structure
A “node” holds data and a link to the next data point.
TOP -> [Data A] -> EMPTY
7
Add new item (Push)
We want to add a chunk of data (Data B) to the structure.
TOP -> [Data A] -> EMPTY, new node: [Data B]
8
Add new item (Push)
Steps for new data B:
1) Find the start of the structure (TOP).
TOP -> [Data A] -> EMPTY, new node: [Data B]
9
Add new item
Steps for new data B:
2) Link into the structure.
TOP -> [Data A] -> EMPTY, [Data B] -> [Data A]
10
Add new item
Steps for new data B:
3) Update TOP.
TOP (new) -> [Data B] -> [Data A] -> NULL
11
Resulting structure
Like stacking dinner plates: you only need to keep track of where TOP is to access the rest.
TOP -> [Data] -> [Data] -> NULL
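The three push steps can be sketched as a single-threaded C++ stack. This is a minimal sketch, not the project's code: the names `Node`, `Stack`, `push` and `pop` are illustrative assumptions, and `pop` is added only so the example can be exercised.

```cpp
#include <string>
#include <utility>

// A "node" holds data plus a link to the next node (nullptr means EMPTY).
struct Node {
    std::string data;
    Node* next;
};

// Single-threaded stack: only TOP is tracked. Names are illustrative.
struct Stack {
    Node* top = nullptr;

    void push(std::string value) {
        Node* old_top = top;                               // 1) find TOP
        Node* node = new Node{std::move(value), old_top};  // 2) link into the structure
        top = node;                                        // 3) update TOP
    }

    // Removes the newest item; returns false if the stack is empty.
    bool pop(std::string& out) {
        if (top == nullptr) return false;
        Node* node = top;
        out = std::move(node->data);
        top = node->next;
        delete node;
        return true;
    }

    ~Stack() {
        std::string ignored;
        while (pop(ignored)) {}
    }
};
```

Pushing "Data A" and then "Data B" leaves TOP at Data B, which links to Data A, which links to nothing.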
12
What happens in multi-core systems?
Two threads try to operate on the stack structure:
Thread 1 attempts at time T.
Thread 2 attempts at time T + 1 nanosecond.
Because each of the steps takes time to complete, the two operations overlap and errors occur.
13
What happens in multi-core systems?
This causes the steps to interleave:
Thread 1 reads TOP (1)
Thread 2 reads TOP (1)
Thread 1 sets the next pointer (2)
Thread 2 sets the next pointer (2)
Thread 1 updates TOP (3)
Thread 2 updates TOP (3)
14
[Diagram: Data B and Data C, added by Thread 1 and Thread 2, both link to Data A -> EMPTY; TOP points to only one of them.]
Data B is lost forever because it is no longer linked to TOP (stack failure).
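The lost update can be replayed deterministically by performing the two threads' steps by hand on a single thread, in the interleaved order. This is only an illustration: the function name `replay_lost_update` is hypothetical, and the assignment of Data B to Thread 1 and Data C to Thread 2 is an assumption.

```cpp
#include <string>

struct Node {
    std::string data;
    Node* next;
};

// Replays the interleaving step by step on one thread, so the outcome is
// deterministic. Returns how many nodes are reachable from TOP afterwards.
int replay_lost_update() {
    Node a{"Data A", nullptr};
    Node b{"Data B", nullptr};   // assumed to be pushed by "thread 1"
    Node c{"Data C", nullptr};   // assumed to be pushed by "thread 2"
    Node* top = &a;

    Node* seen_by_t1 = top;      // thread 1 reads TOP (1)
    Node* seen_by_t2 = top;      // thread 2 reads TOP (1)
    b.next = seen_by_t1;         // thread 1 sets the next pointer (2)
    c.next = seen_by_t2;         // thread 2 sets the next pointer (2)
    top = &b;                    // thread 1 updates TOP (3)
    top = &c;                    // thread 2 updates TOP (3), overwriting thread 1

    int reachable = 0;
    for (Node* n = top; n != nullptr; n = n->next) ++reachable;
    return reachable;            // only Data C and Data A are reachable
}
```

Three nodes exist but only two are reachable from TOP, so Data B has dropped out of the structure.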
15
How do we fix this?
Use “data locks” to protect the 3 steps.
One thread at a time is granted access to the stack.
It completes an operation and releases the lock.
This is the standard approach for multithreaded structures.
16
Locks
Upside: Easy to use. 2 lines of code added to fix:
- Get Lock
- Steps 1, 2, 3
- Release Lock
Downside: Slow. Only one thread at a time can hold the lock, so this becomes sequential code. This is the code that cannot run in parallel.
Analogy: Merging highway traffic into a single lane.
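A lock-protected push might look like the sketch below. It uses `std::mutex` from the C++ standard library as an assumption; the slides do not say which lock the project used (an OpenMP lock would follow the same pattern). The class name `LockedStack` and the `pop` helper are illustrative.

```cpp
#include <mutex>
#include <string>
#include <utility>

struct Node {
    std::string data;
    Node* next;
};

// Stack guarded by one lock: only one thread at a time runs steps 1 to 3.
class LockedStack {
public:
    void push(std::string value) {
        // Allocate outside the lock so nothing can throw while it is held.
        Node* node = new Node{std::move(value), nullptr};
        lock_.lock();            // Get Lock
        Node* old_top = top_;    // 1) find TOP
        node->next = old_top;    // 2) link into the structure
        top_ = node;             // 3) update TOP
        lock_.unlock();          // Release Lock
    }

    // Illustrative helper; std::lock_guard releases the lock automatically.
    bool pop(std::string& out) {
        Node* node = nullptr;
        {
            std::lock_guard<std::mutex> guard(lock_);
            if (top_ == nullptr) return false;
            node = top_;
            top_ = node->next;
        }
        out = std::move(node->data);
        delete node;
        return true;
    }

    ~LockedStack() {
        std::string ignored;
        while (pop(ignored)) {}
    }

private:
    std::mutex lock_;
    Node* top_ = nullptr;
};
```

The explicit `lock()` and `unlock()` calls mirror the two added lines on the slide; in production C++ a `std::lock_guard`, as used in `pop`, is the more usual choice.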
17
Lock-free
New method: a lock-free data structure.
A special low-level instruction, called a Compare-Exchange, checks that TOP is unchanged and updates it in one atomic computer instruction.
This removes the need for locks.
18
Lock-free
Downside: Writing lock-free code is difficult (hence the project).
The Compare-Exchange operation forms the base for writing lock-free code.
The project takes specifications from research papers to implement.
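As an illustration of how a Compare-Exchange can replace the lock, here is a minimal lock-free push built on `std::atomic` from the C++ standard library. It follows a common pattern (often called a Treiber stack) and is not necessarily what the project implemented; `LockFreeStack` and `size_unsafe` are hypothetical names.

```cpp
#include <atomic>
#include <string>
#include <utility>

struct Node {
    std::string data;
    Node* next;
};

class LockFreeStack {
public:
    void push(std::string value) {
        Node* node = new Node{std::move(value), nullptr};
        Node* expected = top_.load(std::memory_order_relaxed);  // 1) read TOP
        do {
            node->next = expected;                              // 2) link into the structure
            // 3) update TOP only if it still equals `expected`. On failure,
            //    compare_exchange_weak reloads `expected` and the loop retries.
        } while (!top_.compare_exchange_weak(expected, node,
                                             std::memory_order_release,
                                             std::memory_order_relaxed));
    }

    // Counts nodes. Not thread-safe: call only when no thread is pushing.
    int size_unsafe() const {
        int count = 0;
        for (Node* n = top_.load(std::memory_order_acquire); n != nullptr; n = n->next) {
            ++count;
        }
        return count;
    }

    // Frees all nodes. Assumes no other thread is using the stack.
    ~LockFreeStack() {
        Node* n = top_.load(std::memory_order_acquire);
        while (n != nullptr) {
            Node* next = n->next;
            delete n;
            n = next;
        }
    }

private:
    std::atomic<Node*> top_{nullptr};
};
```

If another thread changes TOP between steps 1 and 3, the Compare-Exchange fails and the push retries against the new TOP, so no node is lost. A lock-free pop is left out on purpose: doing it safely requires handling memory reclamation and the ABA problem, which is part of why this code is hard to write.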
19
Lock-free
Implemented a range of lock-free optimizations for the stack.
Used open coding standards (C++, OpenMP).
Benchmarked on an Intel Xeon Phi 61-core processor.
The lock-free structure performed about 2x better for pure stack operations.
20
Summary
Amdahl’s Law shows that it’s important to optimize sequential sections of code.
The shared data structures are often sequential bottlenecks.
Implementing lock-free data structures reduced this bottleneck.