IVEC: Off-Chip Memory Integrity Protection for Both Security and Reliability Ruirui Huang, G. Edward Suh Cornell University.



2 Motivation
[Diagram: a processor connected to off-chip memory. Random transient errors are handled by ECC (ECC parity); malicious attacks are handled by IV (IV hash).]
[Table: ECC, Integrity Verification (IV), and IV+ECC compared on random error detection, malicious attack detection, and random error correction.]
- ECC: it is easy to compute the ECC parity bits for injected attack data.
- IV: execution is aborted when IV fails.
- IV+ECC: twice the overhead for random error detection.

3 IVEC: Integrity Verification with Error Correction
- Goal: extend IV to correct errors while ensuring a proper level of security
  - Cover both single-bit and multi-bit errors
- Challenge
  - Error correction is essentially finding the erroneous bits
  - The cryptographic hash in IV does not reveal error locations
Can we extend the capability of IV to handle both security and reliability errors with minimal overheads?

4 Outline
- Background
  - ECC
  - Integrity Verification (IV)
- IVEC error correction
  - Single-bit errors
  - Multi-bit errors
- HW Implementation
- Evaluation

5 ECC (SEC-DED)
- In general, a modern system uses (72, 64) SEC-DED ECC
  - For every 64 bits of data, 8 additional parity bits are needed
  - Memory space and bandwidth overheads of 12.5%
  - Corrects 1-bit errors
- ECC can be extended to correct common multi-bit errors
  - Chip-kill correct: corrects up to one DRAM chip failure
[Diagram: ECC DIMM with 18 x4 DRAM chips (DRAM 1 to DRAM 18) forming a 72-bit SEC-DED ECC word; two extra DRAM chips hold the 8-bit ECC parity.]
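The (72, 64) figure and the 12.5% overhead follow from the standard Hamming bound for single-error correction plus one overall parity bit for double-error detection. The short sketch below reproduces that arithmetic; the function name `secded_check_bits` is illustrative and not from the slides.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits needed by a Hamming-based SEC-DED code over `data_bits` data bits."""
    r = 1
    # Single-error correction needs 2^r >= data_bits + r + 1 distinct syndromes.
    while (1 << r) < data_bits + r + 1:
        r += 1
    # One extra overall-parity bit upgrades SEC to SEC-DED.
    return r + 1


if __name__ == "__main__":
    for k in (8, 16, 32, 64, 128):
        c = secded_check_bits(k)
        print(f"{k:>3} data bits -> {c} check bits ({k + c}, {k}) code, overhead {c / k:.1%}")
```

For 64 data bits this yields 8 check bits, that is, a 72-bit code word and 8/64 = 12.5% overhead.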

6 Cryptographic Hash
- IV relies on a cryptographic hash to detect any changes to data saved in untrusted memory
  - Fixed-length "fingerprint" of the data
  - Collision resistance is a key property
- A Message Authentication Code (MAC) is a keyed cryptographic hash that can also be used for IV
[Diagram: data (d) stored alongside its hash (h).]
On data access, check if h == H(d)
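The check h == H(d) can be sketched in a few lines. This is a minimal illustration, assuming HMAC-SHA256 truncated to 64 bits as a stand-in for the MAC (the deck's evaluation uses GMAC, which is not what is shown here); the key and the names `mac` and `verify` are hypothetical.

```python
import hashlib
import hmac

# Assumption: a secret key held on-chip. The value here is purely illustrative.
KEY = b"illustrative on-chip key"


def mac(data: bytes) -> bytes:
    """64-bit keyed fingerprint of `data` (HMAC-SHA256, truncated to 8 bytes)."""
    return hmac.new(KEY, data, hashlib.sha256).digest()[:8]


def verify(data: bytes, stored_mac: bytes) -> bool:
    """On data access, recompute the MAC and compare it with the stored one."""
    return hmac.compare_digest(mac(data), stored_mac)


if __name__ == "__main__":
    block = bytes(range(64))              # one 64B cache block
    tag = mac(block)                      # stored when the block is written
    print(verify(block, tag))             # unmodified data passes
    tampered = bytes([block[0] ^ 1]) + block[1:]
    print(verify(tampered, tag))          # a single flipped bit fails
```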

7 IV: Hash/MAC Trees
- Integrity verification techniques often rely on hash/MAC trees
  - Any changes in data memory would be detected
- Previous works suggest that IV's performance overhead is only 2-5% when using Cached MAC Trees
[Diagram: hash tree over the protected data in memory, with nodes labeled as the size of a cache block. Hashes h1, h2, h3, h4 reside in off-chip memory; the root hash H(h1 || h2 || h3 || h4) resides in the processor.]
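The root computation H(h1 || h2 || h3 || h4) generalizes to a tree in which each parent hashes the concatenation of its children. The sketch below assumes SHA-256 and an arity of 4 to match the four-hash example on the slide; a real design would use a keyed MAC and cache intermediate nodes, and the names `H` and `tree_root` are illustrative.

```python
import hashlib
from typing import List

ARITY = 4  # assumption: four child hashes per parent, as in the slide's example


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def tree_root(blocks: List[bytes]) -> bytes:
    """Hash every data block, then hash groups of ARITY hashes until one root remains."""
    level = [H(b) for b in blocks]
    while len(level) > 1:
        level = [H(b"".join(level[i:i + ARITY])) for i in range(0, len(level), ARITY)]
    return level[0]


if __name__ == "__main__":
    memory = [bytes([i]) * 64 for i in range(16)]   # sixteen 64B blocks
    root = tree_root(memory)                        # kept inside the processor
    memory[5] = b"\x00" * 64                        # tamper with off-chip data
    print(tree_root(memory) == root)                # the root no longer matches
```

Because only the root has to be trusted, any modification to a block or to an intermediate hash in off-chip memory changes the recomputed root.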


9 Single-bit Error Model
- A single-bit error in a cache block (64B)
- The error is detected by comparing the computed hash value with the stored hash value on-chip
- 64B cache block, 256 bits per read-block (2 read-blocks required to fill 1 cache block)
[Diagram: DIMM1 to DIMM4, each with DRAM 1 to DRAM 16, supplying the 1st and 2nd read-blocks (256 bits each).]

10 Single-bit Error Correction
- Correction as a search problem
  - Flip one bit at a time, over all possible bit positions, and check if the new value passes integrity verification
- 64B cache block, 256 bits per read-block (2 reads required to fill 1 cache block)
[Diagram: DIMM1 to DIMM4, each with DRAM 1 to DRAM 16, supplying the 1st and 2nd read-blocks (256 bits each); the erroneous bit is marked "Corrected!"]
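The search on this slide can be sketched as a loop over all 512 bit positions of a 64B cache block. This is a minimal sketch, assuming truncated HMAC-SHA256 in place of the hardware MAC; `mac64` and `correct_single_bit` are hypothetical names.

```python
import hashlib
import hmac
from typing import Optional

KEY = b"illustrative on-chip key"  # assumption: secret key held on-chip


def mac64(block: bytes) -> bytes:
    """64-bit MAC stand-in (HMAC-SHA256 truncated to 8 bytes)."""
    return hmac.new(KEY, block, hashlib.sha256).digest()[:8]


def correct_single_bit(block: bytes, trusted_mac: bytes) -> Optional[bytes]:
    """Return the block if it verifies, a one-bit-corrected copy if one exists, else None."""
    if hmac.compare_digest(mac64(block), trusted_mac):
        return block
    candidate = bytearray(block)
    for bit in range(len(block) * 8):
        candidate[bit // 8] ^= 1 << (bit % 8)       # flip one bit
        if hmac.compare_digest(mac64(bytes(candidate)), trusted_mac):
            return bytes(candidate)                 # this candidate passes IV
        candidate[bit // 8] ^= 1 << (bit % 8)       # undo and try the next bit
    return None                                     # not a single-bit error


if __name__ == "__main__":
    good = bytes(range(64))
    tag = mac64(good)
    bad = bytearray(good)
    bad[17] ^= 0x20                                 # inject a single-bit error
    print(correct_single_bit(bytes(bad), tag) == good)
```

Without any parity information, the worst case in this sketch is one MAC recomputation per bit of the cache block.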

11 Multi-bit Error Model
- Any bits in one DRAM chip can fail in each read-block
  - Similar to chip-kill correct
- 64B cache block, 256 bits per read-block (2 reads required to fill 1 cache block)
[Diagram: DIMM1 to DIMM4, each with DRAM 1 to DRAM 16, supplying the 1st and 2nd read-blocks (256 bits each).]

12 IVEC Error Correction with Parity
- Each parity bit covers one bit from every DRAM chip in a read-block
  - x4 DRAM: 4 parity bits per read-block
- 64B cache block, 256 bits per read-block (2 reads required to fill 1 cache block), 8 parity bits
[Diagram: DIMM1 to DIMM4, each with DRAM 1 to DRAM 16; parity bits P1 to P4 cover the 1st read-block and P5 to P8 cover the 2nd read-block (256 bits each).]

13 IVEC Correction with Parity
- Use parity bits to guide the correction search
  - The correction scheme can be extended to use more or fewer parity bits
- For hard faults, start searching from recent error locations
- 64B cache block, 256 bits per read-block (2 reads required to fill 1 cache block), 8 parity bits
[Diagram: DIMM1 to DIMM4, each with DRAM 1 to DRAM 16; parity bits P1 to P4 cover the 1st read-block and P5 to P8 cover the 2nd read-block (256 bits each); the failed chip's bits are marked "Corrected!"]
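One way to read the parity-guided search: since each parity bit covers the same bit position in every chip, the XOR of the stored and recomputed parity gives the error pattern of the failed chip, and only the chip's identity remains to be searched. The sketch below is built on assumptions not stated in the slides: a read-block is modeled as 64 x4 chips contributing one nibble each (4 DIMMs of 16 chips, inferred from the diagram), the parity bits themselves are error-free, and truncated HMAC-SHA256 stands in for the MAC. All names are hypothetical.

```python
import hashlib
import hmac
from functools import reduce
from itertools import product
from typing import List, Optional

KEY = b"illustrative on-chip key"   # assumption: secret key held on-chip
READ_BLOCK_BYTES = 32               # 256-bit read-block
CHIPS = 64                          # assumption: 64 x4 chips, one nibble each


def mac64(block: bytes) -> bytes:
    return hmac.new(KEY, block, hashlib.sha256).digest()[:8]


def nibbles(rb: bytes) -> List[int]:
    """Split a read-block into per-chip 4-bit values."""
    return [n for byte in rb for n in (byte >> 4, byte & 0xF)]


def pack(nibs: List[int]) -> bytes:
    return bytes((nibs[i] << 4) | nibs[i + 1] for i in range(0, len(nibs), 2))


def parity(rb: bytes) -> int:
    """4 parity bits: bit j is the XOR of bit j of every chip's nibble."""
    return reduce(lambda a, b: a ^ b, nibbles(rb), 0)


def correct_with_parity(block: bytes, stored_parity: List[int],
                        trusted_mac: bytes) -> Optional[bytes]:
    rbs = [block[i:i + READ_BLOCK_BYTES] for i in range(0, len(block), READ_BLOCK_BYTES)]
    # Syndrome = error pattern of the (single) failed chip in each read-block.
    syndromes = [parity(rb) ^ p for rb, p in zip(rbs, stored_parity)]
    # A zero syndrome means "leave this read-block alone"; otherwise try every chip.
    choices = [range(CHIPS) if s else [None] for s in syndromes]
    for combo in product(*choices):
        fixed = []
        for rb, s, chip in zip(rbs, syndromes, combo):
            nibs = nibbles(rb)
            if chip is not None:
                nibs[chip] ^= s                     # apply the error pattern to this chip
            fixed.append(pack(nibs))
        candidate = b"".join(fixed)
        if hmac.compare_digest(mac64(candidate), trusted_mac):
            return candidate
    return None


if __name__ == "__main__":
    good = bytes(range(64))
    par = [parity(good[:32]), parity(good[32:])]
    tag = mac64(good)
    bad = bytearray(good)
    bad[2] ^= 0x0B                                  # chip 5 of the 1st read-block
    bad[32 + 20] ^= 0x60                            # chip 40 of the 2nd read-block
    print(correct_with_parity(bytes(bad), par, tag) == good)
```

In this model the worst case is 64 x 64 = 4096 correction attempts, when a different chip has failed in each of the two read-blocks. Trying recently failed chips first, as the slide suggests for hard faults, would only change the iteration order.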

14 Parity Handling
- Parity bits are stored in regular memory space
- Parity bits are not needed for reads unless there is an error
  - They are only updated on write-back operations
  - Decoupled error detection and correction
- A parity cache can be used to load and store parity bits when necessary


16 IVEC Hardware Implementation
- Blue: new blocks for IVEC
- Yellow: blocks that already exist in a system with IV
[Block diagram labels: IVEC Control, Parent MAC from cache, Counter Cache, L2 Cache, AES, Check, GF Multiply, LDQ, To memory, From memory, IV Queue, Data Queue, MACQ, Correction Buffer, To L2, Result to control, Parity Cache.]


18 Error Detection
- IV detects any error pattern unless there is a hash/MAC collision
- Error detection probability depends on the length of the hash/MAC
  - ↑ hash/MAC length, ↓ collision rate
  - For example, a 64-bit MAC has a 1/2^64 collision rate

19 Error Correction
- Mis-correction happens if there is a hash/MAC collision on a correction attempt
  - Every time a hash is recomputed for a possible correction (correction attempt), there is a chance of a collision
  - ↑ number of correction attempts, ↑ mis-correction rate
- Security is weakened by correction attempts
  - An integrity violation is not detected on a mis-correction
  - ↑ number of correction attempts, ↓ security
- Correction latency
  - GMAC: 4-8 cycles per correction attempt
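The trade-off on this slide can be written as a simple bound. The derivation below is a sketch under the usual idealization that each incorrect candidate matches an n-bit MAC with probability 2^-n; the symbols N (number of correction attempts), n (MAC length in bits), and t_attempt (cycles per attempt) are introduced here for illustration.

```latex
% Union bound over N correction attempts with an n-bit MAC
\Pr[\text{mis-correction}] \;\le\; \frac{N}{2^{n}}
\quad\Longrightarrow\quad
\text{effective security} \;\approx\; n - \log_2 N \ \text{bits}

% Examples with n = 64:
%   N = 2^{8}  = 256  attempts  ->  64 - 8  = 56 bits
%   N = 2^{12} = 4096 attempts  ->  64 - 12 = 52 bits

% Worst-case correction latency with GMAC
T_{\max} = N \cdot t_{\text{attempt}}, \qquad 4 \le t_{\text{attempt}} \le 8 \ \text{cycles}
```

An attacker benefits in the same way: each correction attempt is another chance for tampered data to be accepted, which is why the number of attempts has to be bounded.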

20 Worst-Case Numbers
- Maximum number of correction attempts
[Table: maximum number of correction attempts by parity configuration (rows, starting with "None"), for single-bit errors and multi-bit errors with x4, x8, and x16 DRAM chips (columns).]
- Security is reduced by ~12 bits (64 bits -> 52 bits)
- Security is reduced by ~8 bits (64 bits -> 56 bits); max correction latency: 4096 cycles
- 512-bit cache block, 256-bit read-block

21 Memory Space Overhead
- ECC: 64 parity bits per cache block (512 bits)
- IV: 64-bit MAC per cache block (512 bits) in a MAC tree structure, plus metadata

22 Performance Evaluation
- Run-time overheads
  - Error correction latency: negligible with a typical soft error rate (SER)
  - Performance overhead due to off-chip bandwidth usage from updating parity bits
- Tools
  - Pin instrumentation tool and TAXI performance simulator
- Parameters
  - Core2-like single processor: 4-issue OoO core
- Baseline is chosen to have IV implemented
  - 64-bit GMAC-tree with split counter mode (< 5% overhead)

23 Memory Bandwidth Overhead
- Traditional ECC bandwidth overhead is 12.5%
- IVEC memory bandwidth overhead is <= 9% in the worst case
- Performance overhead is negligible (0.5% in the worst case)
[Chart: memory bandwidth overhead, with labeled values 9% and 3.2%.]

24 Related Work
- Memory integrity verification
- Off-chip DRAM ECC
  - SEC-DED ECC
  - Chip-kill Correct
- Tiered ECC
- Reliability and Security Engine (RSE)

25 Conclusion
- IVEC enables efficient protection of off-chip memory from both security attacks and random errors
  - Can handle both single-bit errors and multi-bit errors
  - Minimal impact on security
- IVEC is able to eliminate the use of traditional ECC for off-chip memory when a system requires IV for security