Towards Phase Change Memory as a Secure Main Memory. André Seznec, IRISA/INRIA.

Presentation transcript:

1 Towards Phase Change Memory as a Secure Main Memory André Seznec IRISA/INRIA

2 Phase Change Memories: the technology promises
Non-volatile RAM:
- More scalable than DRAM (up to 4X)
- No leakage
- Read access time in the same range as DRAM, or at least close
But limited write endurance:
- 10 Mwrites ? 100 Mwrites ? 1 Gwrites ?

3 ISCA 2009 (June)
3 papers on using PCM memories as main memory:
- Concentrate on showing that simple mechanisms would allow a PCM main memory to accommodate conventional applications for the computer lifetime
- Did not even notice the security breach:
  - Overwrite attack:
    - can just physically destroy the memory
    - can be run by any user without any privilege
    - « just want my machine to be replaced before the end of the 3-year guarantee »
Main memory should resist overwrite attacks for YEARS

4 [Figure: the memory controller performs the PA-to-PCMA translation between the physical address space and the PCM address space (four PCM banks)]

5 Start-Gap scheme, MICRO 2009 (December)
Still targeting « normal » user applications:
- Physical address to PCM address translation is dynamically changed at runtime
- Randomization to avoid « hot write cells » associated with spatial locality
- Security as a by-product of randomization
First study to consider a possible malicious attack:
- Region-based Start-Gap scheme

6 [Figure: the memory controller performs the PA-to-PCMA translation between the physical address space and the PCM address space (four PCM banks); the PCM address is invisible]

7 Start-Gap Wear Leveling
Two registers (Start & Gap) + 1 line (GapLine) to support movement. Move GapLine every G writes to memory.
[Figure: memory lines A, B, C, D with the START and GAP pointers]
PCMAddr = (Start + Addr); if (PCMAddr >= Gap) PCMAddr++
Storage overhead: less than 8 bytes (GapLine taken from spares)
Write overhead: one extra write every G writes → 1% (G = 100)
Randomized address space to avoid “hot region” and predictability
Courtesy of Moinuddin Qureshi
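The mapping on this slide can be sketched in Python. This is a minimal sketch, assuming the algorithm as described in the Start-Gap paper: N lines plus one spare GapLine, the address taken modulo N (the slide omits the modulo), and Start advancing by one each time the gap wraps around. The class and method names are illustrative, not taken from the slides.

```python
class StartGap:
    """Toy model of Start-Gap wear leveling over n lines plus one gap line."""

    def __init__(self, n, g):
        self.n = n                    # number of logical lines
        self.g = g                    # move the gap every g writes
        self.start = 0                # Start register
        self.gap = n                  # Gap register: index of the empty line
        self.mem = [None] * (n + 1)   # n lines + 1 GapLine
        self.writes = 0

    def translate(self, addr):
        # PCMAddr = (Start + Addr) mod N; if (PCMAddr >= Gap) PCMAddr++
        pcm_addr = (self.start + addr) % self.n
        if pcm_addr >= self.gap:
            pcm_addr += 1
        return pcm_addr

    def read(self, addr):
        return self.mem[self.translate(addr)]

    def write(self, addr, value):
        self.mem[self.translate(addr)] = value
        self.writes += 1
        if self.writes % self.g == 0:
            self._move_gap()

    def _move_gap(self):
        if self.gap == 0:
            # Wrap around: the last line moves into slot 0, Start advances.
            self.mem[0] = self.mem[self.n]
            self.gap = self.n
            self.start = (self.start + 1) % self.n
        else:
            # Copy the line just below the gap into the gap (the extra write).
            self.mem[self.gap] = self.mem[self.gap - 1]
            self.gap -= 1


sg = StartGap(n=8, g=3)
for i in range(8):
    sg.write(i, i)
for _ in range(500):          # hammer a single logical address
    sg.write(0, 0)
print([sg.read(i) for i in range(8)])   # data survives the line movements
```

Hammering one logical address still spreads the writes over all physical lines, but only at the pace of one line move every G writes, which is what the following slides exploit.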

8 The security of RBSG
W: the write endurance
On a given region of S blocks, the PA-to-PCMA translation of one block is changed every Gap writes, which induces an extra PCM block write
For a given physical block, the PA-to-PCMA translation is guaranteed to change every Gap*S writes
For a given physical block, the PA-to-PCMA translation is periodic with period Gap*S < W

9 RBSG (MICRO 2009)
W = 32M, S = 256K blocks, Gap = 100
4 GHz, write access time 4K cycles: 1 Mwrite/s
Basing security on low write bandwidth (256 Mbytes/s) ?
Resists overwriting the same physical block for 4 months (77 days by my count !!)
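As a quick check of the RBSG figures, the translation period Gap*S can be compared with the endurance W. This is only a worked-arithmetic sketch using the parameters quoted on the slide; the variable names are illustrative, and it does not attempt to reproduce the 4 months / 77 days estimate.

```python
# Parameters quoted for the MICRO 2009 RBSG configuration.
W = 32 * 2**20               # write endurance per block (32M writes)
S = 256 * 2**10              # blocks per region (256K)
GAP = 100                    # writes between two gap movements
WRITES_PER_SECOND = 10**6    # 1 Mwrite/s, as rounded on the slide
BLOCK_BYTES = 256            # assumption: implied by 256 Mbytes/s at 1 Mwrite/s

period = GAP * S             # writes before a given block's translation changes
print(period)                          # 26214400
print(period < W)                      # True: the period is shorter than W
print(period / WRITES_PER_SECOND)      # seconds a hammered block stays in place
print(WRITES_PER_SECOND * BLOCK_BYTES / 10**6, "Mbytes/s")
```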

10 Birthday Paradox Attack (BPA)
In a group of 24 persons it is likely (p > 1/2) that at least two persons have the same birthday.
In a sequence of 9645 randomly selected elements in a set of 64M memory blocks, it is likely that the same element appears twice.
MICRO RBSG hypothesis + 4 GB/s write bandwidth: should resist 4 years at full bandwidth
+ interleaving 16 sequences of 32M writes on 16 different addresses
4 1/2 hours of write endurance (first failure)
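The two birthday-paradox numbers can be checked with a few lines of Python. This is a sketch under the usual uniform-and-independent assumption; the function names are illustrative. The exact computation gives 23 as the smallest group size with p > 1/2 (so 24 is also sufficient), and the standard approximation sqrt(2 ln 2 · n) gives about 9645 draws for 64M blocks.

```python
import math


def smallest_group(n):
    """Smallest k such that k uniform draws among n values collide with p > 1/2."""
    p_no_collision = 1.0
    k = 0
    while 1.0 - p_no_collision <= 0.5:
        p_no_collision *= (n - k) / n   # the (k+1)-th draw avoids the first k
        k += 1
    return k


def approx_threshold(n):
    """Large-n approximation of the same threshold: sqrt(2 * ln(2) * n)."""
    return math.sqrt(2.0 * math.log(2.0) * n)


print(smallest_group(365))                  # 23
print(int(approx_threshold(64 * 2**20)))    # 9645
```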

11 Sandbagging RBSG against BPA
Reduce region size S, reduce Gap
- S*Gap << W
- S = 128K, Gap = 64
  - Optimized BPA: 11.5 days
  - RAA: 48 days
- S = 64K, Gap = 64
  - Optimized BPA: 97 days
  - RAA: 24 days
BUT..

12 Combined BPA-RAA
1/16th of the bandwidth for RAA, 15/16th for BPA
S = 64K, Gap = 64 → […] days
S = 256K, Gap = 8 → 61 days, but 10 % write overhead
- But no page mode ?

13 RBSG + page mode
The PA-to-PCMA translation granularity is a page
- 4 KB pages: write overhead 16 blocks
- Gap = 128 (12.5% write overhead), S = 32K pages
- 4 1/2 days
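The 12.5% figure follows from moving a whole page on each gap movement. A small worked example, assuming the 256-byte block size quoted later in the talk (the variable names are illustrative):

```python
PAGE_BYTES = 4 * 1024     # 4 KB pages, from the slide
BLOCK_BYTES = 256         # assumption: block size quoted on a later slide
GAP = 128                 # writes between two page movements

blocks_per_page = PAGE_BYTES // BLOCK_BYTES   # extra block writes per movement
write_overhead = blocks_per_page / GAP        # extra writes per normal write

print(blocks_per_page)                 # 16
print(f"{write_overhead:.1%}")         # 12.5%
```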

14 And spare lines ?
Main memories are implemented with spare blocks to get some permanent fault tolerance.
- Any spare line can replace any memory line
- Gap = 100, 64K spares, no page mode:
  - RAA-BPA: 51 days

15 Spare lines + page mode
- Gap = 128
  - 1K spares: 7.75 days, S = 32K pages
  - 64K spares: 16 days, S = 64K pages
- + Endurance = 128M writes
  - 1K spares: 65 days, S = 128K pages
  - 64K spares: 110 days, S = 128K pages

16 Still want to use PCM main memory and guarantee the hardware for 3 years ?

17 Or

18 S-PCM memory
Security as a first-class citizen
Should resist attacks for a sizeable fraction of the expected lifetime

19 Principles for a secure PCM main memory
Invisible PA-to-PCMA translation:
- A malicious user cannot figure out the PA-to-PCMA translation
Complete « randomization » of the PA-to-PCMA translation changes:
- Any physical block could be mapped onto any PCM block
- Defeats RAA
Frequent changes of the PA-to-PCMA translation:
- Defeats BPA:
  - Experimentally, the translation change frequency must be much higher than 1/W to reach 50 % of the expected memory life time (256/W in practice)

20 Implementation principles
Use of a PA-to-PCMA translation table:
- One entry for a region of R= blocks
- A physical region is mapped on a PCM region
- A block can be mapped on any block in the target region
- PA-to-PCMA translation change:
  - Only on writes
  - Randomly triggered with frequency F
  - No counter: only a random number generator
  - Swap two PA-to-PCMA translations
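The counter-free trigger can be sketched as a Bernoulli draw on every write: with probability F, the write also starts a translation swap. This is a minimal sketch assuming a software pseudo-random generator in place of the hardware one; `count_swaps` and its parameters are illustrative names.

```python
import random


def count_swaps(num_writes, f, seed=0):
    """Count how many writes trigger a translation change at frequency f."""
    rng = random.Random(seed)
    swaps = 0
    for _ in range(num_writes):
        # No per-region counter: each write draws a random number instead.
        if rng.random() < f:
            swaps += 1
    return swaps


writes = 200_000
f = 1 / 64
swaps = count_swaps(writes, f)
print(swaps, "swaps, expected about", int(writes * f))
```

On average a swap occurs once every 1/F writes, but the attacker cannot predict which write triggers it.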

21 Some implementation constraints
A region must be larger than a page:
- 16 GB memory, 4 KB pages: 4M pages..
- Regions should be large:
  - 256 KB → 64K entries
  - 4 MB → 4K entries
A PA-to-PCMA translation change induces 2R memory block reads and 2R memory block writes:
- To limit the write overhead, the frequency F should be limited

22 Dealing with the constraints
W = 32M, 16 GB memory, 256-byte blocks, 1 extra write per 8 writes
F = 256/W
- → 50 % total write endurance
- extra write bandwidth: 2S*F = 1/8 → S = 8K blocks
- 8K 26-bit translation table entries
  - 26 Kbytes, not a huge table !!
- → 52 % total write endurance
- 4 GB/s: 2 years of endurance to BPA or RAA
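The sizing on this slide can be retraced step by step. This is a worked-arithmetic sketch; it assumes each table entry holds log2(total blocks) = 26 bits (region address plus displacement), which matches the table sizes quoted in the talk, and all names are illustrative.

```python
W = 32 * 2**20                 # write endurance per block
MEM_BYTES = 16 * 2**30         # 16 GB
BLOCK_BYTES = 256
BANDWIDTH = 4 * 2**30          # 4 GB/s write bandwidth
F = 256 / W                    # translation change frequency
EXTRA = 1 / 8                  # 1 extra write per 8 writes

total_blocks = MEM_BYTES // BLOCK_BYTES        # 64M blocks
region_blocks = int(EXTRA / (2 * F))           # from 2*S*F = 1/8
entry_bits = total_blocks.bit_length() - 1     # assumed entry width: 26 bits


def table_kbytes(blocks_per_region):
    """Translation table size in Kbytes for a given region size."""
    entries = total_blocks // blocks_per_region
    return entries * entry_bits / 8 / 1024


print(region_blocks)                 # 8192 blocks per region
print(table_kbytes(region_blocks))   # 26.0 Kbytes
print(table_kbytes(4 * 1024))        # 52.0 Kbytes (4K-block regions)
print(table_kbytes(64 * 1024))       # 3.25 Kbytes (64K-block regions)

# Ideal lifetime if every block absorbed exactly W writes at full bandwidth.
seconds = total_blocks * W / (BANDWIDTH / BLOCK_BYTES)
years = seconds / (365 * 86400)
print(round(years, 2), "years ideal,", round(0.52 * years, 2), "years at 52 %")
```

The last line gives roughly 4 years for the ideal case and a little over 2 years at 52 % of the total endurance, in line with the figures quoted in the talk.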

23 Initializing the translation table
The translation table has to set a one-to-one mapping
- Boot-time initialization ? With « random » mapping ?

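24 [Figure: translating a physical address (region B, displacement X) from the physical memory address space into a PCM address (addr, disp) in the PCM address space]
addr = T(B).addr ⊕ B ⊕ R_init
disp = T(B).disp ⊕ X ⊕ D_init
Figure labels: « Initialized at boot-time », « Initialized with zeros at boot-time »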

25 Swapping two translation blocks
T(A).addr = oldT(B).addr ⊕ B ⊕ A
T(A).disp = oldT(A).addr ⊕ RAND
T(B).disp = oldT(B).addr ⊕ RAND
- Randomizing the displacement is needed to avoid attacks on a fixed position in the region
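The table update can be sketched in Python. Several assumptions are made here and should be read as such: the operator lost in the transcript is taken to be XOR (a later slide mentions « Table read + XOR »); the table is assumed to start at zero while R_init and D_init are random boot-time constants; the symmetric update of T(B).addr is assumed; and the displacements are simply redrawn at random. The class and its names are illustrative, and the data movement between the two PCM regions is not modeled.

```python
import random


class TranslationTable:
    """Toy XOR-based PA-to-PCMA table: one (addr, disp) entry per region."""

    def __init__(self, region_bits, disp_bits, seed=None):
        self.rng = random.Random(seed)
        self.disp_bits = disp_bits
        regions = 1 << region_bits
        self.addr = [0] * regions        # table initialized with zeros
        self.disp = [0] * regions
        # Assumed random boot-time constants (R_init, D_init).
        self.r_init = self.rng.getrandbits(region_bits)
        self.d_init = self.rng.getrandbits(disp_bits)

    def translate(self, region, offset):
        pcm_region = self.addr[region] ^ region ^ self.r_init
        pcm_offset = self.disp[region] ^ offset ^ self.d_init
        return pcm_region, pcm_offset

    def swap(self, a, b):
        old_a, old_b = self.addr[a], self.addr[b]
        self.addr[a] = old_b ^ b ^ a     # A now maps to B's former PCM region
        self.addr[b] = old_a ^ a ^ b     # assumed symmetric update for B
        # Fresh random displacements, so no fixed position can be targeted.
        self.disp[a] = self.rng.getrandbits(self.disp_bits)
        self.disp[b] = self.rng.getrandbits(self.disp_bits)


table = TranslationTable(region_bits=4, disp_bits=3, seed=1)
before_a = table.translate(5, 0)[0]
before_b = table.translate(9, 0)[0]
table.swap(5, 9)
print(table.translate(5, 0)[0] == before_b)   # True
print(table.translate(9, 0)[0] == before_a)   # True
```

Because every update is an XOR with a constant per region, the region mapping stays one-to-one after any sequence of swaps, and the offsets within a region remain a permutation.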

26 Managing region swaps
Large regions have to be swapped on PA-to-PCMA translation changes:
- Normal reads and writes should not be stopped
- Randomly triggered PA-to-PCMA translation changes
The memory controller must interleave normal access flows with region swapping:
- In practice, a random priority biased towards the normal access flow limits the buffer of regions to be swapped.

27 Endurance of the secure PCM memory
16 GB memory, 256 B blocks, 4K-block regions → 52 Kbytes translation table
Expected life time under attack:
Write overhead | Endurance 32M | 64M | 128M | 256M
[…] %          | 42%           | 53% | 66%  | 74%
12.5 %         | 62%           | 69% | 74%  | 79%

28 Endurance of the secure PCM memory
16 GB memory, 256 B blocks, 64K-block regions → 3.25 Kbytes translation table
Expected life time under attack:
Write overhead | Endurance 32M      | 64M   | 128M  | 256M
3.125 %        | 3 min              | 0.4 % | 7.4 % | 19 %
12.5 %         | 7.4 % (→ 3 months) | 19 %  | 38 %  | 51 % (→ 2 years endurance)

29 And « normal » applications ?
Region swap after 1/F writes (average)
In a swap interval:
- Malicious attacks:
  - One block receives 1/F writes, the other blocks no writes
- « Normal » applications:
  - A total of 1/F writes on different blocks in the same region
For a single PCM block: swap frequency is much higher than F
⇒ Endurance is very close to theoretical

30 S-PCM
+ Years of endurance
+ Address translation: table read + XOR
- Hardware logic for region swapping
RBSG
- Days of endurance
- Address translation: 1st logic + table read + 2nd logic
+ Simple logic for page moving

31 Conclusion
If PCM technology delivers, then secure PCM main memory will be possible
Wear leveling comes for free with security
Main overhead costs:
- Hardware logic to interleave region swapping with normal access flow
- Random number generator
- Will fix write overhead to less than 1 % for « normal » workload (just adapt ideas from Moinuddin)
No need for « monstrous » cell endurance

32 Disclaimer
There might be other forms of attacks:
- Probably not on the scheme by itself: randomization is a quite good defense
- Side-channel attacks against specific hardware implementations:
  - E.g. concentrate the attack on a single bank

33 An attack against Moinuddin’s new scheme

34 repeat A (x N) Random (x M)
With Moinuddin’s parameters: N = 84, M = 1792, Gap = min(128, d), LRU stack of 4 entries
Same block written 22M times before PA-to-PCMA translation change
+ BPA: 7 days and that is it !!
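The write pattern named on this slide can be sketched as a generator. This is an interpretation, not the author's code: « Random (x M) » is assumed to mean M writes to uniformly random block addresses, and the function name, the address-space size and the seed are illustrative.

```python
import itertools
import random


def attack_pattern(target, n, m, num_blocks, seed=0):
    """Yield block addresses: n writes to `target`, then m random writes, forever."""
    rng = random.Random(seed)
    while True:
        for _ in range(n):
            yield target
        for _ in range(m):
            yield rng.randrange(num_blocks)


N, M = 84, 1792                  # parameters quoted on the slide
pattern = attack_pattern(target=7, n=N, m=M, num_blocks=2**26)
one_period = list(itertools.islice(pattern, N + M))
print(one_period[:3], len(one_period))
print(f"{N / (N + M):.1%} of the writes hit the target block")
```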

35 But that might be corrected
Decrease the gap factor:
- Gap = Min(128, d/32), 3.5M consecutive writes
Decrease the region size:
- Gap = Min(128, d), 512K regions, 2.75M consecutive writes

36 Concern
Each new attack generates a new countermeasure:
- Extra hardware complexity
- New opportunity for new attacks
- Possibility of snowball effects

37 New attack opportunities
Decrease the gap factor:
- Gap = Min(128, d/32), 3.5M consecutive writes
- Combined with a RAA: 4 months
Decrease the region size:
- Gap = Min(128, d), 512K-block regions, 2.75M consecutive writes
- RAA is improved by an 8x factor