Landon Cox January 17, 2018 January 22, 2018

Presentation transcript:

Meltdown and Spectre Landon Cox January 17, 2018 January 22, 2018

Understanding these attacks. Building blocks: address spaces, speculative execution, cache side channels. Attack targets: kernel memory, web browser sandboxes (e.g., JavaScript).

Speculative execution: “Dynamic self-analysis.” Allow a thread to run into the future on faked data in parallel with retrieving the real data. If the faked data turns out to be the same as the real data, then you can use the speculated results. If the faked data turns out to be different from the real data, then you proceed as normal. Why is this approach appealing? It doesn’t require apps to be modified. If you’re good at guessing the faked data, there can be huge performance benefits. However, if you’re not good at guessing the faked data, it’s a huge waste of effort. Also, the incorrect speculation must not produce visible side effects.
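The guess-then-check pattern on this slide can be sketched in a few lines of C. This is only an illustration under assumptions: `compute`, `speculate`, and `outcome_t` are hypothetical names that do not come from the slides, and real hardware does this work in parallel rather than sequentially.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical work that depends on an input that is slow to obtain. */
static int compute(int input) { return input * 2 + 1; }

typedef struct {
    int  result;      /* value the caller should use */
    bool speculated;  /* true if the speculative result was kept */
} outcome_t;

/* Run ahead on a guessed input, then check the guess against the real one. */
outcome_t speculate(int guessed_input, int real_input) {
    int speculative_result = compute(guessed_input); /* work done "in the future" */
    outcome_t out;
    if (guessed_input == real_input) {
        out.result = speculative_result;   /* guess was right: keep the work */
        out.speculated = true;
    } else {
        out.result = compute(real_input);  /* guess was wrong: discard and redo */
        out.speculated = false;
    }
    return out;
}
```

Calling `speculate(3, 3)` keeps the speculative result, while `speculate(3, 5)` throws it away and recomputes. The sketch has no side effects to leak, which is exactly the property the last sentence of the slide asks for.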

Speculative execution [Diagram: sequential instructions followed by a branch instruction.] Why are branch instructions relatively slow? Pipelined architectures utilize knowledge of the next instruction. On a branch, the next instruction may not be known.

Speculative execution: Compare the speculation input to the actual value (=?).

Speculative execution: If speculation was wrong (!=), discard state.

Speculative execution: If speculation was correct (==), swap in state.

Speculative execution: When speculation is correct, we can get a speed up.

Speculative execution: A few things have to be true of the speculative execution in case it is wrong…

How good speculation went bad: Speculation modifies the processor cache. Changes to the cache are visible when speculation is wrong. Speculation runs without normal protections. For example, a speculative thread will operate on page mappings. For Meltdown, accessing mappings may ignore protections.

Exploiting branch misprediction: The victim code is if (x < array1_size) { y = array2[array1[x] * 256]; }. [Diagram: the victim’s virtual memory, containing secret (value k), array1, array1_size, and array2.]

Secret is cached. array1_size and array2 are not cached.

Attacker controls the value of x.

Attacker trains the CPU to predict x < array1_size.

Primary thread accesses array1_size, causing a cache miss.

Attacker chooses x so that array1[x] lands on secret, so the speculative load is effectively y = array2[k * 256].

Speculative thread reads from address array2[k * 256], which causes a cache miss.

Meanwhile, main thread realizes that the prediction was wrong.

But! The value of array2[k * 256] is now in the cache.
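The walkthrough above can be replayed with a toy model in which the cache is just an array of flags. Everything here is an assumption made for illustration: the flat `memory` buffer, the `predict_taken` parameter, and the helper names are hypothetical, and no real speculation or timing is involved. The point is only that the cache fill survives even though the architectural result is discarded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define VALUES        256
#define SECRET_OFFSET 16   /* toy layout: secret sits right after array1 */

unsigned char memory[24];            /* bytes 0..15: array1, 16..23: secret */
const size_t array1_size = 16;
bool array2_line_cached[VALUES];     /* toy cache: flag v stands for array2[v * 256] */

/* Put one secret byte in place and start with array2 fully uncached. */
void reset_toy(unsigned char k) {
    memset(memory, 0, sizeof memory);
    memory[SECRET_OFFSET] = k;
    memset(array2_line_cached, 0, sizeof array2_line_cached);
}

/* Models the victim branch. Returns true only if the body retires
   architecturally; a mispredicted body still leaves its cache fill behind. */
bool victim(size_t x, bool predict_taken) {
    bool in_bounds = x < array1_size;
    if ((in_bounds || predict_taken) && x < sizeof memory) {
        unsigned char k = memory[x];      /* transient read, may be the secret */
        array2_line_cached[k] = true;     /* array2[k * 256] is now cached */
        if (in_bounds) return true;       /* result is architecturally visible */
    }
    return false;                         /* squashed or skipped: no result */
}

/* Number of array2 entries currently marked as cached. */
int cached_count(void) {
    int n = 0;
    for (int v = 0; v < VALUES; v++) n += array2_line_cached[v];
    return n;
}
```

After `reset_toy('K')`, calling `victim(16, true)` returns false (the out-of-bounds access never retires), yet exactly one flag, the one for value 'K', is set.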

Exploiting branch misprediction: Attacker can iterate through values of array2 to recover the secret if it can access array2. The one entry that loads quickly is the one that was cached during speculation:

    for (i = 0; i < N; i++) {
        time1 = getTime();
        z = array2[i * 256];
        time2 = getTime();
        if (time2 - time1 < BOUND) print("secret = " + i);
    }
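A compilable version of the probing loop is sketched below. Real attacks time loads with a cycle counter; here the timer is passed in as a callback so the logic can be checked deterministically. `timer_fn`, `recover_secret`, and `fake_timer` are hypothetical names, and the latencies 40 and 300 are made-up numbers, not measurements.

```c
#include <assert.h>
#include <stddef.h>

#define STRIDE 256
#define VALUES 256

/* Hypothetical timing callback: how long a read of array2[index] took. */
typedef unsigned long (*timer_fn)(size_t index);

/* Returns the first byte value whose array2 entry loads faster than bound
   (fast means cached), or -1 if no entry looks cached. */
int recover_secret(timer_fn time_access, unsigned long bound) {
    for (int i = 0; i < VALUES; i++) {
        if (time_access((size_t)i * STRIDE) < bound)
            return i;
    }
    return -1;
}

/* Stand-in timer: pretends only array2['K' * 256] is cached. */
unsigned long fake_timer(size_t index) {
    return index == (size_t)'K' * STRIDE ? 40 : 300;
}
```

With these stand-in latencies, `recover_secret(fake_timer, 100)` yields 'K'. Note the direction of the comparison: a cached entry is the fast one, so the test is "less than the bound".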

Exploiting branch misprediction: Why was it critical for array2 to be uncached? It allowed the attacker to see which entry was cached after the misprediction.

Why was it critical for array1_size to be uncached? The miss allowed the speculative thread time to run ahead of the main thread.

Why was it critical for secret to be cached? A miss on secret would be slow and would prevent the read of array2[k * 256] before the main thread finished.

Does it matter whether array1 is cached or not? No, since the attack doesn’t actually read array1’s own contents; the out-of-bounds index array1[x] reads secret instead.

Why does the attack read in chunks of 256 bytes? x86 cache lines are typically 64 bytes, so the stride must be at least that large for each byte value to land on its own line; 256 to be safe?
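The stride question comes down to simple arithmetic, sketched here. The function name is hypothetical, and the sketch assumes array2 starts on a cache-line boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if every byte value 0..255 maps to its own cache line when
   array2 is indexed as array2[value * stride]. Because the line index never
   decreases as value grows, comparing neighbours is sufficient. */
bool values_get_distinct_lines(size_t stride, size_t line_size) {
    assert(line_size > 0);
    for (size_t v = 1; v < 256; v++) {
        size_t prev_line = ((v - 1) * stride) / line_size;
        size_t this_line = (v * stride) / line_size;
        if (this_line == prev_line) return false;  /* two values share a line */
    }
    return true;
}
```

A stride of 256 separates all values for both 64-byte and 128-byte lines, while a stride of 32 with 64-byte lines would put two values on the same line and make them indistinguishable to the probe.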

How good speculation went bad: Why would I want to attack my own address space? Lots of code runs in a managed runtime, e.g., JavaScript. The assumption is that code cannot break out. What kind of secrets might malicious JavaScript read? Browser tabs have their own address space. Malicious JavaScript could read state from other websites, e.g., log in to Google, then browse to mal.org in the same tab.

Example JavaScript [the code shown on these slides was not captured in the transcript]: simpleByteArray acts as array1; probeTable acts as array2. One highlighted expression is like “k * 256”.

Attacking kernel memory: Virtual memory layout. User data (different for every process) runs from 0GB to 3GB (0xc0000000). Kernel data (same for all page tables) runs from 3GB to 4GB.
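The split on this slide can be written down as a one-line check. This is a sketch for the 32-bit layout shown here only; `KERNEL_BASE` and `is_kernel_address` are illustrative names, not taken from any real kernel header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KERNEL_BASE 0xc0000000u  /* 3GB: where kernel data starts on the slide */

/* True if a 32-bit virtual address falls in the kernel portion (3GB to 4GB),
   which is mapped identically in every process's page tables. */
bool is_kernel_address(uint32_t vaddr) {
    return vaddr >= KERNEL_BASE;
}
```

Addresses at or above 0xc0000000 are present in every process's mappings and are normally guarded only by a permission check, which is what the following slides go after.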

Attacking kernel memory: Why is this design extremely dangerous if a process could read kernel memory?

In what settings might a malicious process want to read another process’s memory?

What is an example exception?

Explain why line 3 could still be executed.

Load the byte value at the kernel address into the least significant byte of the RAX register, represented by AL.

This will trigger an exception, but it will also run in parallel with subsequent instructions.

No part of our probe array can be cached.

Multiply the secret kernel value by the page size (4KB, i.e., shift left by 0xc = 12 bits).

Retry if the multiplied value is zero (we’ll come back to this).

If the multiplied value is non-zero, then index into our probe array with the multiplied value.
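The index arithmetic in these steps is sketched below. It shows only how a byte value selects a page of the probe array; it does not perform any kernel read. `PAGE_SHIFT`, `PROBE_BYTES`, and `probe_offset` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT  0xc                           /* 4KB pages: 1 << 12 == 4096 */
#define PROBE_BYTES ((size_t)256 << PAGE_SHIFT)   /* one page per byte value */

/* Offset into the probe array that is touched for a given byte value.
   Shifting left by 12 is the same as multiplying by the 4KB page size. */
size_t probe_offset(uint8_t kernel_byte) {
    return (size_t)kernel_byte << PAGE_SHIFT;
}
```

A byte value of 1 lands at offset 4096, and 255 lands on the last page of a 1MB probe array, so every possible byte value gets its own page.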

Attacking kernel memory: But how do we read this array after the process is killed? Map the probe array into a partner process. The probing process will die but the partner will survive.

Why retry on zero? If an exception is triggered while reading kernel memory, the register value is zeroed out. We don’t want to falsely read zero when the register holds it only due to losing the race.

How do we detect a true zero? If all of the probe array remains uncached, the true value was zero.
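The decoding rule on the partner's side can be sketched as follows. It assumes the partner has already timed each probe page and recorded the result in a `cached` array; that array and the name `decode_probe` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define VALUES 256

/* cached[v] is true if the probe page for byte value v was found cached.
   Page 0 is never touched because the attack retries on zero, so only
   values 1..255 are scanned. If nothing is cached, the byte was a true zero. */
int decode_probe(const bool cached[VALUES]) {
    for (int v = 1; v < VALUES; v++) {
        if (cached[v]) return v;
    }
    return 0;
}
```

An all-uncached probe array therefore decodes to 0, while a single hit on page 0x41 decodes to 0x41.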

Attacking kernel memory How do we prevent this attack? Remove kernel mappings from address space except for exception handlers. (KAISER)

Next time: A little history lesson: THE by Edsger Dijkstra. Send me your groups if you haven’t already.