Improving Cache Management Policies Using Dynamic Reuse Distances

Presentation transcript:

Nam Duong (1), Dali Zhao (1), Taesu Kim (1), Rosario Cammarota (1), Mateo Valero (2), Alexander V. Veidenbaum (1)

(1) University of California Irvine
(2) Universitat Politecnica de Catalunya and Barcelona Supercomputing Center

- Proposed new cache replacement and partitioning policies with a better balance between reuse and pollution
  - Cache lines must be kept long enough to be reused
  - Cache lines pollute the cache if kept too long without reuse
- Introduced a new concept, Protecting Distance (PD), that is based on a reuse distance balancing reuse and pollution
  - An inserted line cannot be evicted for PD accesses to its set
  - This can be guaranteed if cache bypass is used
- Developed single- and multi-core hit rate models as a function of PD, cache configuration and program behavior
  - The models' inputs are collected during execution and used to dynamically compute the PD that maximizes hit rate
  - For a multi-core shared cache, the partitioning problem is shown to be solved by computing a set of per-thread PDs
- Showed that PD-based cache management policies improve performance for both single- and multi-core systems

PD
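The protection rule summarized in the transcript (a line cannot be evicted for PD accesses to its set, with bypass when every line is still protected) can be illustrated with a small single-set simulation. This is only a sketch under stated assumptions, not the authors' implementation: the names `PDSet`, `hit_rate` and `best_pd` are hypothetical, the per-line counter is one possible way to track protection, and `best_pd` picks a PD by brute-force replay of a trace as a stand-in for the analytical hit rate model, whose details the transcript does not give.

```python
from typing import Dict, Sequence


class PDSet:
    """One cache set managed with a protecting distance (illustrative only)."""

    def __init__(self, ways: int, pd: int) -> None:
        self.ways = ways
        self.pd = pd
        # tag -> remaining protected accesses; 0 means the line may be evicted
        self.remaining: Dict[int, int] = {}

    def access(self, tag: int) -> bool:
        """Process one access to this set and return True on a hit."""
        hit = tag in self.remaining
        keep = hit  # whether the accessed line is resident after this access
        if not hit:
            if len(self.remaining) < self.ways:
                keep = True  # a free way is available
            else:
                # Only unprotected lines are eviction candidates.
                victim = next(
                    (t for t, r in self.remaining.items() if r == 0), None
                )
                if victim is not None:
                    del self.remaining[victim]
                    keep = True
                # Otherwise every line is protected: bypass the cache.
        # Each access to the set ages all resident lines by one.
        for t in self.remaining:
            if self.remaining[t] > 0:
                self.remaining[t] -= 1
        if keep:
            # Inserted or reused lines are protected for the next `pd` accesses.
            self.remaining[tag] = self.pd
        return hit


def hit_rate(trace: Sequence[int], ways: int, pd: int) -> float:
    """Replay a sequence of tags that all map to one set."""
    cache_set = PDSet(ways, pd)
    hits = sum(cache_set.access(tag) for tag in trace)
    return hits / len(trace) if trace else 0.0


def best_pd(trace: Sequence[int], ways: int, candidates: Sequence[int]) -> int:
    """Brute-force stand-in for the model: choose the PD with the top hit rate."""
    return max(candidates, key=lambda pd: hit_rate(trace, ways, pd))


if __name__ == "__main__":
    # Three tags cycling through a 2-way set: a pattern that thrashes LRU.
    trace = [0, 1, 2] * 4
    for pd in (1, 2, 3):
        print(pd, hit_rate(trace, ways=2, pd=pd))
    print("chosen PD:", best_pd(trace, ways=2, candidates=[1, 2, 3]))
```

In this toy trace a PD that is too small lets lines be evicted just before their reuse, while a PD at least as large as the gap between reuses keeps two of the three lines resident and bypasses the third, which is the reuse-versus-pollution balance the transcript describes.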