Paging for Multi-Core Shared Caches. Alejandro López-Ortiz, Alejandro Salinger. ITCS, January 8th, 2012.

Presentation transcript:


Multi-Core challenges
Access to data is a key factor
Cache efficiency is determinant
– Algorithms
– Schedulers
– Paging strategies
Extensively studied for sequential case
Almost no previous theory for multi-core case

Agenda
– Sequential paging
– Multi-Core paging
– Natural online strategies
– The offline problem
– Conclusions and open problems

Sequential Paging
Slow memory, cache of size K
Page requests: … p6 p3 p2 p4 p4 p2 p10 p11 p5 p4 …
Is p_i in the cache?
- Yes: do nothing (hit)
- No: fetch p_i from slow memory, evict one page from cache (fault)
Goal: minimize number of faults
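The hit/fault rule above can be sketched in a few lines. This is a minimal illustration, not something from the slides: it assumes LRU as the eviction policy and a cache size of K=3, and the function name `lru_faults` is hypothetical. The request list is the one shown on the slide.

```python
from collections import OrderedDict

def lru_faults(requests, K):
    """Count faults of LRU on a single request sequence with a cache of size K."""
    cache = OrderedDict()              # keys ordered from least to most recently used
    faults = 0
    for page in requests:
        if page in cache:              # hit: only refresh recency
            cache.move_to_end(page)
        else:                          # fault: fetch the page, evict if the cache is full
            faults += 1
            if len(cache) >= K:
                cache.popitem(last=False)   # drop the least recently used page
            cache[page] = None
    return faults

# Request sequence from the slide (p6 p3 p2 p4 p4 p2 p10 p11 p5 p4); K=3 is an assumption.
requests = [6, 3, 2, 4, 4, 2, 10, 11, 5, 4]
print(lru_faults(requests, 3))
```

With a larger K the same sequence only faults once per distinct page, which is the lower bound for any policy.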

Sequential Paging

Multi-Core Paging
[Figure: Core 1, Core 2, Core 3 and Core 4 share an L2/L3 cache backed by RAM]

Multi-Core Paging
Request sequences over time t, with a gap (___) in R2:
R1: p2 p8 p1 p4 p3 p4 p10 p5 …
R2: p9 p1 ___ p8 p2 p1 p1 p4 p7 …
R3: p3 p18 p17 p8 p2 p3 p2 p9 …
Without the gap:
R1: p2 p8 p1 p4 p3 p4 p10 p5 …
R2: p9 p1 p8 p2 p1 p1 p4 p7 …
R3: p3 p18 p17 p8 p2 p3 p2 p9 …

Related Models
Multiple applications or threads
Multi-Core model [Hassidim, ICS '10]
– Makespan
– LRU is not competitive
– Scheduling
Our model:
– No scheduling of requests
– Separates scheduling and paging
– Minimize faults

Natural Strategies
Share the cache
– Eviction policy
Partition the cache among cores
– Partition function (static, dynamic)
– Eviction policy
Examples:
– Shared-LRU
– Optimal Static Partition with LRU
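The two families of strategies can be compared with a small simulator. The sketch below is only an illustration under stated assumptions: the timing model (a fault stalls its core for `tau` extra steps, and cores are served in index order within a step) is assumed rather than taken from the slides, and the names `LRU`, `simulate`, `shared_lru` and `partitioned_lru` are hypothetical. Partition sizes are assumed to be at least 1.

```python
from collections import OrderedDict

class LRU:
    """LRU cache of a fixed size; access() returns True on a fault."""
    def __init__(self, size):
        self.size = size
        self.pages = OrderedDict()

    def access(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)
            return False
        if len(self.pages) >= self.size:
            self.pages.popitem(last=False)   # evict least recently used
        self.pages[page] = None
        return True

def simulate(seqs, caches, tau):
    """caches[i] serves core i; passing the same object to all cores models a shared cache."""
    pos = [0] * len(seqs)      # next request index per core
    wait = [0] * len(seqs)     # remaining stall steps per core (assumed delay model)
    faults = [0] * len(seqs)
    while any(pos[i] < len(seqs[i]) for i in range(len(seqs))):
        for i, seq in enumerate(seqs):
            if wait[i] > 0:
                wait[i] -= 1
            elif pos[i] < len(seq):
                if caches[i].access(seq[pos[i]]):
                    faults[i] += 1
                    wait[i] = tau
                pos[i] += 1
    return faults

def shared_lru(seqs, K, tau):
    cache = LRU(K)
    return simulate(seqs, [cache] * len(seqs), tau)

def partitioned_lru(seqs, sizes, tau):
    return simulate(seqs, [LRU(k) for k in sizes], tau)

seqs = [[1, 2, 1, 2], [3, 4, 3, 4]]
print(shared_lru(seqs, 4, tau=1))            # faults per core with one shared cache
print(partitioned_lru(seqs, (1, 3), tau=1))  # faults per core with a 1/3 static split
```

Sharing one `LRU` object between cores versus giving each core its own is the only difference between the two strategies here, which keeps the comparison focused on the partition choice.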

Partition vs. Shared
For any online dynamic partition that changes o(n) times
Partitions that don’t change enough are not competitive

Shared strategies
The same applies to FIFO, CLOCK, FWF

Proof idea
Faults of LRU ≥ n/2
Obs: Furthest-In-The-Future is not optimal

The Offline Problem

PARTIAL-INDIVIDUAL-FAULTS (PIF):
[Figure: request sequences laid out over time, with gaps (__) between some requests]
E.g. at t=18, ?

PARTIAL-INDIVIDUAL-FAULTS (PIF):
Theorem: PIF is NP-complete
Optimization version (MAX-PIF): given an instance of PIF, maximize the number of sequences whose faults stay within the given bound
Theorem: MAX-PIF is APX-hard
Unless P=NP, there is no PTAS for MAX-PIF

PIF vs. Min Faults

The Offline Problem
An offline algorithm can align sequences properly by means of faults
The algorithm could “force faults” for this purpose
[Figure: two panels, “Regular execution” and “Forcing a fault on p1”, showing request sequences with gaps (___)]

The Offline Problem
However, this has no advantage over an honest offline algorithm
Theorem: Let A be an offline algorithm that forces faults. There exists an offline algorithm A’ such that for all disjoint R, A(R) = A’(R)

The Offline Problem

Conclusions
– Multi-core paging is significantly different from sequential paging
– Traditional paging strategies are not competitive
– Serving a set of requests while limiting faults in each sequence is hard
– Multi-core paging is in P when number of cores is constant

Open Problems
What are good online strategies?
What are good measures of performance?
– Fairness?
What is the complexity of minimizing the number of faults?
Can we obtain more efficient offline algorithms (exact or approximate)?
Thank you


Partition vs. Shared
E.g. K=12, p=3, OPT={5,5,2}
p1 p2 p3 p4 p5 p1 p2 p3 p4 p5 p1 p1 p1 p1 p1 p1 p1 p1 p1 p1 p1 p1 p1 p1 p1 p1 p1 p1 p1 p1
q1 q1 q1 q1 q1 q1 q1 q1 q1 q1 q1 q2 q3 q4 q5 q1 q2 q3 q4 q5 q1 q1 q1 q1 q1 q1 q1 q1 q1 q1
s1 s1 s1 s1 s1 s1 s1 s1 s1 s1 s1 s1 s1 s1 s1 s1 s1 s1 s1 s1 s1 s2 s3 s4 s5 s1 s2 s3 s4 s5
For any online dynamic partition D that changes o(n) times
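A static partition can be evaluated by brute force, because each core then runs LRU on its own private part and the fault counts are independent of timing. The sketch below is an illustration under assumptions: it reads the three rows of the example as the request sequences of three cores, requires every core to get at least one cache cell, and uses hypothetical names (`lru_faults`, `best_static_partition`).

```python
from collections import OrderedDict
from itertools import product

def lru_faults(requests, k):
    """Faults of LRU on one sequence with a private cache of size k."""
    cache, faults = OrderedDict(), 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)
        else:
            faults += 1
            if len(cache) >= k:
                cache.popitem(last=False)
            cache[page] = None
    return faults

def best_static_partition(seqs, K):
    """Try every split of K cells among the cores (each core gets >= 1 cell)."""
    table = [{k: lru_faults(s, k) for k in range(1, K + 1)} for s in seqs]
    best = None
    for sizes in product(range(1, K + 1), repeat=len(seqs)):
        if sum(sizes) != K:
            continue
        total = sum(table[i][k] for i, k in enumerate(sizes))
        if best is None or total < best[0]:
            best = (total, sizes)      # keeps the first split reaching the minimum
    return best

cycle = lambda c: [f"{c}{j}" for j in (1, 2, 3, 4, 5)] * 2
P = cycle("p") + ["p1"] * 20
Q = ["q1"] * 10 + cycle("q") + ["q1"] * 10
S = ["s1"] * 20 + cycle("s")

total, sizes = best_static_partition([P, Q, S], 12)
print(total, sizes)
```

Under these assumptions the minimum is not unique: any split that gives the first two cores at least 5 cells each reaches the same total, and the search simply reports the first one it meets. Every static split still leaves one core cycling through 5 pages with fewer than 5 cells, which is the behaviour the example is built to expose.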

Dynamic Programming Alg.
Running time exponential in K and p
But polynomial in n (recall n>>p)
In practice p=2,4,8. p=O(log n)
Running time is infeasible
Both minimizing the number of faults and PIF are in P for constant p
Problem’s hardness stems from the number of sequences, not their length

Real world numbers

Hassidim’s model

Our Model
– Offline cannot delay sequences
– Requests must be served as they arrive
– Separates paging algorithm and scheduler
– Minimize number of faults

Static Partitions

Static Partitions
The choice of a good partition is more important than the choice of the eviction policy

The Offline Problem
Theorem: PIF is NP-complete
[Figure: three sequences over time t]
f__f__hhhhhhhhhhhf__f__f__f__f__f__f__f__f__f__f__f__f__...
f__f__f__f__f__f__f__hhhhhhhhhhhhhhhhhhhf__f__f__f__f__...
f__f__f__f__f__f__f__f__f__f__f__f__f__f__f__hhhhhhhhhhh…

Dynamic Programming Alg.

Dynamic Programming Alg.
[Figure: state given by the positions in the sequences and the cache contents]
Polynomial for constants K and p
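The idea of a state made of sequence positions plus cache contents can be illustrated with a memoized exhaustive search. This is a sketch under assumptions, not the algorithm from the talk: the timing model (a fault stalls its core for `tau` steps, cores are handled in index order within a step, K is at least 1) is assumed, the remaining stall per core is added to the state, and `min_faults` is a hypothetical name. It is only practical for tiny instances.

```python
from functools import lru_cache

def min_faults(seqs, K, tau):
    """Minimum total faults over all eviction choices, by memoized search."""
    p = len(seqs)

    @lru_cache(maxsize=None)
    def solve(i, pos, wait, cache):
        # State: core being handled in this step, positions, stalls, cache contents.
        if all(pos[j] == len(seqs[j]) for j in range(p)):
            return 0
        if i == p:                                   # step finished, start the next one
            return solve(0, pos, wait, cache)
        if wait[i] > 0:                              # core i is still stalled
            w = wait[:i] + (wait[i] - 1,) + wait[i + 1:]
            return solve(i + 1, pos, w, cache)
        if pos[i] == len(seqs[i]):                   # core i has no requests left
            return solve(i + 1, pos, wait, cache)
        page = seqs[i][pos[i]]
        new_pos = pos[:i] + (pos[i] + 1,) + pos[i + 1:]
        if page in cache:                            # hit
            return solve(i + 1, new_pos, wait, cache)
        new_wait = wait[:i] + (tau,) + wait[i + 1:]  # fault: stall core i
        if len(cache) < K:
            options = [cache | {page}]
        else:                                        # branch on every possible victim
            options = [(cache - {v}) | {page} for v in cache]
        return 1 + min(solve(i + 1, new_pos, new_wait, c) for c in options)

    return solve(0, (0,) * p, (0,) * p, frozenset())

print(min_faults(((1, 2, 3, 1, 2),), K=2, tau=0))
print(min_faults(((1, 1, 1), (2, 2, 2)), K=2, tau=1))
```

The number of distinct states is bounded by the positions (at most n per sequence), the stalls, and the subsets of pages of size at most K, which is where a bound that is polynomial in n for constant K and p comes from.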

Partition vs. Shared
Competitive strategies must be either shared or change the partition often
In fact, these are equivalent (for disjoint sequences)
Partitions that don’t change enough are not competitive

Partition vs. Shared
For any online dynamic partition D that changes o(n) times
Partitions that don’t change enough are not competitive

Related Models
Reference | Model | Scheduling / Delay | Remarks
[Fiat, Karlin '95] | Multi-pointer in access graph (applications and threads) | No | Algorithm with optimal competitive ratio
[Barve et al. '00] | Multiple applications | No |
[Feuerstein, Strejilevich de Loma '02] | Multi-threaded paging | Yes / No |
[Hassidim '10] | Multi-core | Yes |
[This] | Multi-core | No / Yes |
