A Novel Cache Architecture with Enhanced Performance and Security
Zhenghong Wang and Ruby B. Lee

Outline  Introduction  Problems with current designs  Attempts to mitigate information leakage  Proposed Design  Cache Miss  Replacement Policy  Address Decoder  Results

Introduction  Current cache designs are susceptible to cache-based side-channel attacks.  At the same time, caches must have low miss rates, short access times, and good power efficiency.  The authors used the SPEC2000 suite to evaluate cache miss behavior, and CACTI and HSPICE to validate the circuit design.  The proposed cache architecture achieves miss rates comparable to a highly associative cache, with access times and power efficiency close to those of a direct-mapped cache.

Problems with Current Designs  Hardware caches in processors introduce interference between programs and users.  One process can evict the cache lines of another, causing that process to miss on subsequent accesses. Because the cache is shared, this interference can leak critical information.  Cache-based attacks can recover a full secret cryptographic key, and require far less time and computing power than traditional cryptanalysis.  A remote user can mount such an attack without any special equipment.
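To make the eviction channel concrete, here is a minimal prime-and-probe toy in Python. This is not the attack code from the paper: the cache is modeled as a plain list and the probe inspects it directly, whereas a real attacker would time its own memory accesses to detect evictions; all names here are illustrative.

```python
import random

NUM_SETS = 16  # toy direct-mapped cache: one slot per set

def victim_access(cache, secret_nibble):
    # The victim's table lookup lands in a set chosen by a secret key
    # nibble, evicting whatever the attacker had cached there.
    cache[secret_nibble] = "victim"

def prime_and_probe():
    cache = ["attacker"] * NUM_SETS        # prime: fill every set
    secret = random.randrange(NUM_SETS)    # the victim's secret nibble
    victim_access(cache, secret)           # victim runs
    # Probe: a real attacker re-reads its lines and times each access;
    # an evicted (slow) set reveals the secret index. Here we inspect.
    leaked = [s for s in range(NUM_SETS) if cache[s] != "attacker"]
    return secret, leaked

if __name__ == "__main__":
    secret, leaked = prime_and_probe()
    print(f"secret nibble: {secret}, evicted sets observed: {leaked}")
```

In a conventional cache the mapping from address to set is fixed and public, which is exactly what lets the probe step translate an evicted set back into key bits.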

Past Attempts to Mitigate Information Leakage  Software: mostly involves rewriting code to prevent known attacks from succeeding.  Hardware: cache line locking and cache partitioning.  Both prevent undesirable cache evictions, by placing sensitive objects in a private partition or locking them in the cache, which helps achieve constant execution time.  The drawback is cache underutilization; a randomized approach avoids this.

Proposed Design  New address decoder  New SecRAND replacement algorithm  Adopts the direct-mapped architecture and extends it with dynamic memory-to-cache remapping and a longer cache index.

Design (2)  Index length: n+k bits [n is the index length of a traditional direct-mapped cache of the same size; using n+k bits is equivalent to mapping the memory space onto a large logical direct-mapped (LDM) cache with 2^(n+k) lines].  So the cache has 2^n physical cache lines, but 2^(n+k) logical lines that can be mapped onto them.  RMT (Re-Mapping Table): allows different processes to have different memory-to-cache mappings.
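A minimal sketch of the remapping structure in Python, assuming small illustrative sizes. The names LNreg and RMT_ID follow the slides; the field layout and the helper find_line are my own simplification, not the authors' hardware.

```python
from dataclasses import dataclass
from typing import Optional

N, K = 3, 2  # illustrative: 2^n = 8 physical lines, n+k = 5 index bits

@dataclass
class LNreg:
    rmt_id: int      # which RMT entry (process mapping) owns this line
    index: int       # (n+k)-bit index into the logical LDM cache
    p: bool = False  # protection bit (see the cache-miss slide)

# One LNreg per physical line; the new address decoder matches
# (RMT_ID, index) against every LNreg instead of decoding n index
# bits directly, which is what makes the mapping dynamic.
lnregs: list[Optional[LNreg]] = [None] * (2 ** N)

def find_line(rmt_id: int, index: int) -> Optional[int]:
    """Return the physical line mapped to this logical line, or None."""
    for phys, reg in enumerate(lnregs):
        if reg is not None and reg.rmt_id == rmt_id and reg.index == index:
            return phys
    return None
```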

Cache Miss  Because the design uses dynamic remapping, a cache replacement policy is needed.  Index miss: none of the LNregs matches the given RMT_ID and index, so no cache line is selected. [Unique to this cache design.]  Tag miss: essentially the same as an ordinary miss in a traditional direct-mapped cache.  Both cases are sketched below.  Note from the figure: each LNreg also includes a protection bit.
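Continuing the LNreg sketch above, the two miss types can be expressed as a small classification function. The tags list standing in for the physical tag array is, like the function name, illustrative.

```python
def lookup(rmt_id: int, index: int, tag: int, tags: list) -> tuple:
    """Classify an access as 'hit', 'tag_miss', or 'index_miss'."""
    phys = find_line(rmt_id, index)   # from the previous sketch
    if phys is None:
        return "index_miss", None     # no LNreg matched: unique to this design
    if tags[phys] != tag:
        return "tag_miss", phys       # like a direct-mapped conflict miss
    return "hit", phys
```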

Replacement Policy  Tag misses are conflict misses in the LDM cache: the incoming data line and the line in the cache have the same index but different tags. Since no two LNregs may contain the same index bits, either the original line is replaced with the incoming line, or the incoming line is not cached at all.  The proposed replacement method is a modified random replacement policy, SecRAND.

Replacement Policy (2)  Tag miss (C and D most likely belong to the same process): to avoid information-leaking interference, if either C or D is protected, a random cache line is selected for eviction instead of C. Since D may not replace any line other than C, D is sent directly to the CPU core without being cached.  Index miss: C and D may or may not belong to the same process. The new memory block D can replace any cache line, selected at random. See the sketch below.  Random replacement requires less hardware than commonly used policies such as LRU and FIFO, due to its stateless nature.
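A sketch of the SecRAND decision logic as described on this slide, where C is the conflicting cached line and D the incoming line. The function and parameter names are mine, not from the paper.

```python
import random

def secrand_replace(outcome, victim, c_protected, d_protected, num_lines=8):
    """Sketch of the SecRAND decision: returns (line_to_evict, cache_D)."""
    if outcome == "index_miss":
        # D may replace any line: pick a victim uniformly at random, cache D.
        return random.randrange(num_lines), True
    if outcome == "tag_miss":
        if c_protected or d_protected:
            # Avoid leaking the conflict: evict a randomly chosen line
            # instead of C, and send D straight to the CPU uncached.
            return random.randrange(num_lines), False
        return victim, True   # normal case: replace C with the incoming D
    return None, False        # hit: nothing to replace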

Address Decoder

HSPICE Results on Address Decoder Delay

Cache Miss Rates  As reported by a cache simulator derived from sim-cache and sim-cheetah of the SimpleScalar toolset, run on all 26 SPEC2000 benchmarks.

Overall Power Consumption Obtained using CACTI 5.0

Additional Benefits / Conclusion  Fault tolerance: due to dynamic remapping.  Hot-spot mitigation: due to spatial and temporal locality.  Ability to optimize for power efficiency: unused cache lines can be turned off.  Many challenging, even conflicting design goals, such as security and high performance, can be achieved at the same time.  The proposed architecture is secure, yet has low hardware cost.