Published by Robyn Warner. Modified over 9 years ago.
1
Paging for Multi-Core Shared Caches
Alejandro López-Ortiz, Alejandro Salinger
ITCS, January 8th, 2012
2
3
Multi-Core challenges
Access to data is a key factor
Cache efficiency is determinant
– Algorithms
– Schedulers
– Paging strategies
Extensively studied for sequential case
Almost no previous theory for multi-core case
4
Agenda
Sequential paging
Multi-Core paging
Natural online strategies
The offline problem
Conclusions and open problems
5
Sequential Paging
[Figure: slow memory, a cache of size K, and the request sequence … p6 p3 p2 p4 p4 p2 p10 p11 p5 p4 …]
Page request: is p_i in the cache?
-Yes, do nothing (hit)
-No, fetch p_i from slow memory, evict one page from cache (fault)
Goal: minimize number of faults
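The hit/fault rule above can be sketched in a few lines. This is only an illustration: the slide does not fix an eviction policy, so LRU is assumed here, and the function name `lru_faults`, the cache size 3, and reading the slide's request sequence left to right are all assumptions.

```python
from collections import OrderedDict

def lru_faults(requests, k):
    """Count faults of LRU on one request sequence with a cache of k >= 1 pages."""
    cache = OrderedDict()  # keys are pages, ordered from least to most recently used
    faults = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)        # hit: only refresh recency
        else:
            faults += 1                    # fault: fetch the page from slow memory
            if len(cache) >= k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return faults

# Page indices taken from the slide's example sequence (order assumed left to right).
requests = [6, 3, 2, 4, 4, 2, 10, 11, 5, 4]
print(lru_faults(requests, 3))
```

Swapping the eviction line for another rule (FIFO, for instance) changes the policy without touching the hit/fault accounting.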
6
Sequential Paging
7
Multi-Core Paging
[Figure: RAM, a shared L2/L3 cache, and Core 1 to Core 4]
8
Multi-Core Paging
t: 1 2 3 4 5 6 7 8 9 10 11 12
R1: p2 p8 p1 p4 p3 p4 p10 p5 …
R2: p9 p1 ___ p8 p2 p1 p1 p4 p7 …
R3: p3 p18 p17 p8 p2 p3 p2 p9 …
(Column alignment with t is not preserved. The slide also shows the same sequences with R2 written without the blank slot: p9 p1 p8 p2 p1 p1 p4 p7 …)
9
Related Models
Multiple applications or threads
Multi-Core model [Hassidim, ICS’10]
– Makespan
– LRU is not competitive
– Scheduling
Our model:
– No scheduling of requests
– Separates scheduling and paging
– Minimize faults
10
Natural Strategies
Share the cache
– Eviction policy
Partition the cache among cores
– Partition function (static, dynamic)
– Eviction policy
Examples:
– Shared-LRU
– Optimal Static Partition with LRU
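A minimal sketch of a Shared-LRU strategy, under assumptions that are not stated on this slide: time advances in unit steps, every core that is not stalled issues one request per step, a fault stalls the faulting core for `tau` extra steps, and cores are served in index order within a step. The names `shared_lru_faults` and `tau` are hypothetical.

```python
from collections import OrderedDict

def shared_lru_faults(seqs, k, tau):
    """Simulate cores sharing one LRU cache of k >= 1 pages; return faults per core."""
    p = len(seqs)
    pos = [0] * p          # index of the next request of each core
    ready = [0] * p        # first time step at which each core may issue a request
    faults = [0] * p
    cache = OrderedDict()  # pages ordered from least to most recently used
    t = 0
    while any(pos[i] < len(seqs[i]) for i in range(p)):
        for i in range(p):  # assumed tie-break: lower core index is served first
            if pos[i] == len(seqs[i]) or ready[i] > t:
                continue    # core is finished or still stalled by a fault
            page = seqs[i][pos[i]]
            if page in cache:
                cache.move_to_end(page)        # hit
                ready[i] = t + 1
            else:
                faults[i] += 1                 # fault
                if len(cache) >= k:
                    cache.popitem(last=False)  # evict the globally least recent page
                cache[page] = True
                ready[i] = t + tau + 1         # assumed stall of tau extra steps
            pos[i] += 1
        t += 1
    return faults

print(shared_lru_faults([[1, 2, 1, 2], [3, 4, 3, 4]], 4, 1))
print(shared_lru_faults([[1, 2, 1, 2], [3, 4, 3, 4]], 2, 0))
```

Because the stall shifts when later requests arrive, the interleaving seen by the shared cache depends on earlier faults; with a single core the simulation reduces to sequential LRU.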
11
Partition vs. Shared
For any online dynamic partition that changes o(n) times
Partitions that don’t change enough are not competitive
12
Shared strategies
The same applies to FIFO, CLOCK, FWF
13
Proof idea
Faults of LRU ≥ n/2
Obs: Furthest-In-The-Future is not optimal
14
The Offline Problem
15
PARTIAL-INDIVIDUAL-FAULTS (PIF):
[Figure: request sequences over time steps 1 to 20, with blank slots (__) between some requests; row layout not recoverable]
E.g. At t=18, ?
16
PARTIAL-INDIVIDUAL-FAULTS (PIF):
Theorem: PIF is NP-complete
Optimization version (MAX-PIF): given an instance of PIF, maximize the number of sequences that fault within the given bound
Theorem: MAX-PIF is APX-hard
Unless P=NP, there is no PTAS for MAX-PIF
17
PIF vs. Min Faults
18
The Offline Problem
Offline algorithm can align sequences properly by means of faults
Algorithm could “force faults” for this sake
[Figure: cache contents and two request sequences, shown for “Regular execution” and for “Forcing a fault on p1”; blank slots (___) appear in the sequences]
19
The Offline Problem
However, this has no advantage over an honest offline algorithm
Theorem: Let A be an offline algorithm that forces faults. There exists an offline algorithm A’ such that for all disjoint R, A(R) = A’(R)
20
The Offline Problem
21
Conclusions
Multi-core paging is significantly different from sequential paging
Traditional paging strategies are not competitive
Serving a set of requests while limiting faults in each sequence is hard
Multi-core paging is in P when number of cores is constant
22
Open Problems
What are good online strategies?
What are good measures of performance?
– Fairness?
What is the complexity of minimizing the number of faults?
Can we obtain more efficient offline algorithms (exact or approximate)?
Thank you
23
24
Partition vs. Shared
E.g. K=12, p=3, OPT={5,5,2}
R1: p1 p2 p3 p4 p5 p1 p2 p3 p4 p5, then p1 repeated
R2: q1 repeated, then q1 q2 q3 q4 q5 q1 q2 q3 q4 q5, then q1 repeated
R3: s1 repeated, then s1 s2 s3 s4 s5 s1 s2 s3 s4 s5
For any online dynamic partition D that changes o(n) times
25
Dynamic Programming Alg.
Running time exponential in K and p
But polynomial in n (recall n>>p)
In practice p=2,4,8. p=O(log n)
Running time is infeasible
Both minimizing the number of faults and PIF are in P for constant p
Problem’s hardness stems from the number of sequences, not their length
26
Real world numbers
27
Hassidim’s model
28
Our Model
Offline cannot delay sequences
Requests must be served as they arrive
Separates paging algorithm and scheduler
Minimize number of faults
29
Static Partitions
30
Static Partitions
The choice of a good partition is more important than the choice of the eviction policy
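To illustrate why the partition matters, the sketch below brute-forces the best static partition when every part runs LRU. It assumes disjoint sequences, so each core's fault count depends only on its own part and can be computed by sequential LRU, and it assumes every core gets at least one cell (K >= p). The names `lru_faults` and `best_static_partition` are hypothetical, and the example data is made up.

```python
from collections import OrderedDict
from itertools import product

def lru_faults(requests, k):
    """Faults of sequential LRU with a private cache of k >= 1 pages."""
    cache = OrderedDict()
    faults = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)
        else:
            faults += 1
            if len(cache) >= k:
                cache.popitem(last=False)
            cache[page] = True
    return faults

def best_static_partition(seqs, K):
    """Try every split of K cells into one positive part per core (needs K >= p).

    Returns (total_faults, partition) for the best split, assuming disjoint
    sequences so that the parts do not interact.
    """
    best = None
    for parts in product(range(1, K + 1), repeat=len(seqs)):
        if sum(parts) != K:
            continue
        total = sum(lru_faults(s, k) for s, k in zip(seqs, parts))
        if best is None or total < best[0]:
            best = (total, parts)
    return best

# Core 0 cycles through three pages, core 1 keeps requesting one page.
seqs = [[1, 2, 3, 1, 2, 3], [10] * 6]
print(best_static_partition(seqs, 4))
```

In this toy instance an even split leaves the cycling core one cell short, so it faults on every request, while the uneven split only pays for the first load of each page. The enumeration is exponential in the number of cores and is meant for small examples only.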
31
The Offline Problem
Theorem: PIF is NP-complete
f__f__hhhhhhhhhhhf__f__f__f__f__f__f__f__f__f__f__f__f__...
f__f__f__f__f__f__f__hhhhhhhhhhhhhhhhhhhf__f__f__f__f__...
f__f__f__f__f__f__f__f__f__f__f__f__f__f__f__hhhhhhhhhhh…
(horizontal axis: t)
32
Dynamic Programming Alg.
33
Dynamic Programming Alg.
[Figure labels: positions in sequences, cache]
Polynomial for constants K and p
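The state named on the slide (positions in the sequences plus the cache contents) can be explored by a memoized exhaustive search. The sketch below is not the authors' dynamic program, only an illustration of the same kind of state space under assumed rules: unit time steps, a fault stalls its core for `tau` extra steps, cores are served in index order, pages are evicted only when a fault meets a full cache, and any resident page may be the victim. The name `min_total_faults` and the extra stall component of the state are assumptions.

```python
from functools import lru_cache

def min_total_faults(seqs, K, tau):
    """Minimum total faults over all eviction choices (tiny instances only)."""
    seqs = tuple(tuple(s) for s in seqs)
    p = len(seqs)

    @lru_cache(maxsize=None)
    def step(pos, wait, cache, i):
        # pos[j]: next request index of core j; wait[j]: stall steps left for core j
        # cache: frozenset of resident pages; i: next core to serve in this time step
        if all(pos[j] == len(seqs[j]) for j in range(p)):
            return 0
        if i == p:  # time step over: every stalled core gets one step closer to ready
            return step(pos, tuple(max(w - 1, 0) for w in wait), cache, 0)
        if pos[i] == len(seqs[i]) or wait[i] > 0:
            return step(pos, wait, cache, i + 1)
        page = seqs[i][pos[i]]
        new_pos = pos[:i] + (pos[i] + 1,) + pos[i + 1:]
        if page in cache:  # hit: nothing to decide
            return step(new_pos, wait, cache, i + 1)
        # Fault: tau + 1 so the core resumes tau + 1 steps after the faulting step.
        new_wait = wait[:i] + (tau + 1,) + wait[i + 1:]
        if len(cache) < K:  # free cell: no eviction needed
            return 1 + step(new_pos, new_wait, cache | {page}, i + 1)
        # Full cache: branch over every possible victim and keep the best outcome.
        return 1 + min(
            step(new_pos, new_wait, (cache - {victim}) | {page}, i + 1)
            for victim in cache
        )

    return step((0,) * p, (0,) * p, frozenset(), 0)

print(min_total_faults([[1, 2, 3, 1, 2]], 2, 0))
print(min_total_faults([[1, 2, 1, 2], [3, 4, 3, 4]], 4, 2))
```

The number of memoized states grows with the product of the sequence positions and the number of possible cache contents, which matches the slide's point that the approach is only polynomial when K and p are constants.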
34
Partition vs. Shared
Partitions that don’t change enough are not competitive
Competitive strategies must be either shared or change the partition often
In fact, these are equivalent (for disjoint sequences)
35
Partition vs. Shared
For any online dynamic partition D that changes o(n) times
Partitions that don’t change enough are not competitive
36
Related Models
Reference | Model | Scheduling | Delay | Remarks
[Fiat, Karlin ’95] | Multi-pointer in access graph (applications and threads) | No | No | Algorithm with optimal competitive ratio
[Barve et al. ’00] | Multiple applications | No | No |
[Feuerstein, Strejilevich de Loma ’02] | Multi-threaded paging | Yes | No |
[Hassidim ’10] | Multi-core | Yes | Yes |
[This] | Multi-core | No | Yes |