Mondrix: Memory Isolation for Linux using Mondriaan Memory Protection

Slides:



Advertisements
Similar presentations
Nooks: Safe Device Drivers with Lightweight Kernel Protection Domains Mike Swift, Steve Martin Hank Levy, Susan Eggers, Brian Bershad University of Washington.
Advertisements

Operating Systems Components of OS
More on Processes Chapter 3. Process image _the physical representation of a process in the OS _an address space consisting of code, data and stack segments.
Emmett Witchel Krste Asanović MIT Lab for Computer Science Hardware Works, Software Doesn’t: Enforcing Modularity with Mondriaan Memory Protection.
Lightweight Remote Procedure Call Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy Presented by Alana Sweat.
Considerations for Mondriaan-like Systems 2009 Workshop on Duplicating, Deconstructing, and Debunking Emmett Witchel University of Texas at Austin.
Computer ArchitectureFall 2007 © November 21, 2007 Karem A. Sakallah Lecture 23 Virtual Memory (2) CS : Computer Architecture.
CS 300 – Lecture 22 Intro to Computer Architecture / Assembly Language Virtual Memory.
Nooks: an architecture for safe device drivers Mike Swift, The Wild and Crazy Guy, Hank Levy and Susan Eggers.
1 Last Class: Introduction Operating system = interface between user & architecture Importance of OS OS history: Change is only constant User-level Applications.
Operating System Organization
CS533 Concepts of OS Class 16 ExoKernel by Constantia Tryman.
1 OS & Computer Architecture Modern OS Functionality (brief review) Architecture Basics Hardware Support for OS Features.
Basics of Operating Systems March 4, 2001 Adapted from Operating Systems Lecture Notes, Copyright 1997 Martin C. Rinard.
I/O Systems ◦ Operating Systems ◦ CS550. Note:  Based on Operating Systems Concepts by Silberschatz, Galvin, and Gagne  Strongly recommended to read.
IMPROVING THE RELIABILITY OF COMMODITY OPERATING SYSTEMS Michael M. Swift Brian N. Bershad Henry M. Levy University of Washington.
Emmett Witchel Krste Asanovic MIT Lab for Computer Science Mondriaan Memory Protection.
Operating Systems ECE344 Ding Yuan Paging Lecture 8: Paging.
Hardware process When the computer is powered up, it begins to execute fetch-execute cycle for the program that is stored in memory at the boot strap entry.
Virtual Memory Review Goal: give illusion of a large memory Allow many processes to share single memory Strategy Break physical memory up into blocks (pages)
Introduction to virtualization
Operating Systems Security
Processes and Virtual Memory
Thread basics. A computer process Every time a program is executed a process is created It is managed via a data structure that keeps all things memory.
Efficient software-based fault isolation Robert Wahbe, Steven Lucco, Thomas Anderson & Susan Graham Presented by: Stelian Coros.
Protection of Processes Security and privacy of data is challenging currently. Protecting information – Not limited to hardware. – Depends on innovation.
Efficient Software-Based Fault Isolation Authors: Robert Wahbe Steven Lucco Thomas E. Anderson Susan L. Graham Presenter: Gregory Netland.
Improving the Reliability of Commodity Operating Systems Michael M. Swift, Brian N. Bershad, Henry M. Levy Presented by Ya-Yun Lo EECS 582 – W161.
CS5204 Fall 20051Oct. 26, 2005 Mondrix: Memory Isolation for Linux using Mondriaan Memory Protection Emmett Witchel Junghwan Rhee Krste Asanovic Sreeram.
Emmett Witchel Josh Cates Krste Asanovic MIT Lab for Computer Science Mondrian Memory Protection.
CSCI/CMPE 4334 Operating Systems Review: Exam 1 1.
1 Chapter 2: Operating-System Structures Services Interface provided to users & programmers –System calls (programmer access) –User level access to system.
CS703 - Advanced Operating Systems By Mr. Farhan Zaidi.
Translation Lookaside Buffer
Efficient Software-Based Fault Isolation
Modularity Most useful abstractions an OS wants to offer can’t be directly realized by hardware Modularity is one technique the OS uses to provide better.
CMSC 611: Advanced Computer Architecture
Module 3: Operating-System Structures
Introduction to Kernel
Introduction to Operating Systems
Free Transactions with Rio Vista
Virtualization Virtualize hardware resources through abstraction CPU
Memory COMPUTER ARCHITECTURE
CS 6560: Operating Systems Design
From Address Translation to Demand Paging
Today How was the midterm review? Lab4 due today.
143A: Principles of Operating Systems Lecture 6: Address translation (Paging) Anton Burtsev October, 2017.
CSE 153 Design of Operating Systems Winter 2018
Introduction to Operating Systems
Introduction to Operating Systems
Lecture 14 Virtual Memory and the Alpha Memory Hierarchy
IMPROVING THE RELIABILITY OF COMMODITY OPERATING SYSTEMS
Chapter 3: Operating-System Structures
Free Transactions with Rio Vista
Process Description and Control
Translation Lookaside Buffer
Hardware Works, Software Doesn’t: Enforcing Modularity with Mondriaan Memory Protection Emmett Witchel Krste Asanović MIT Lab for Computer Science.
Lecture Topics: 11/1 General Operating System Concepts Processes
CSE 451: Operating Systems Autumn 2005 Memory Management
Virtual Memory Overcoming main memory size limitation
Chapter 2: Operating-System Structures
CSE 451: Operating Systems Autumn 2003 Lecture 10 Paging & TLBs
CSE 451: Operating Systems Autumn 2003 Lecture 9 Memory Management
CSE 451: Operating Systems Autumn 2003 Lecture 10 Paging & TLBs
CSE 451: Operating Systems Autumn 2003 Lecture 9 Memory Management
Outline Operating System Organization Operating System Examples
CSE 471 Autumn 1998 Virtual memory
CS703 - Advanced Operating Systems
CSE 153 Design of Operating Systems Winter 2019
Chapter 2: Operating-System Structures
Presentation transcript:

Mondrix: Memory Isolation for Linux using Mondriaan Memory Protection Emmett Witchel Junghwan Rhee Krste Asanović University of Texas at Austin Purdue University MIT CSAIL

Uniprocessor Performance Not Scaling OS can help HW designers keep their job Graph by Dave Patterson

Lightweight HW Protection Domains Divisions within address space Backwards compatible with binaries, OS, ISA Linear addressing – one datum per address HW complexity about same as TLB Switching protection contexts faster than addressing contexts Protection check off load critical path No pipeline flush on cross-domain call thttpd MySQL find User ide-mod ide-disk ne unix rtc OS

Problems With Modern Modules Single Address Space Modules in a single address space Simple Inter-module calls are fast Data sharing is easy (no marshalling) No isolation Bugs lead to bad memory accesses One bad access crashes system ide.o ide.o Read-write Read-only Execute No access

Current Hardware Broken Page based memory protection Came with virtual memory, not designed for protection A reasonable design point, but not for safe modules Modules are not clean abstractions Hardware capabilities have problems Different programming model Revocation difficult [System/38, M-machine] Tagged pointers complicate machine x86 segment facilities are broken capabilities HW that does not nourish SW Doing it all in SW is as dumb as doing it all in HW. After mondriaan, possibly put in capabilities and x86 segment (brain dead version of capabilities) i432 done with x86 bashing

Mondriaan + Linux = Mondrix Each kernel module in different protection domain to increase memory isolation Failure indicated before data corruption Failures localized, damage bounded Mondriaan Memory Protection (MMP) makes legacy software memory safe Verify HW design by building software (OS) ASPLOS ’02, the MMP permission table Nine months SOSP ’05, Linux support + MMP redesign Two years

Mondrix In Action Memory Addresses Kernel loader establishes initial permission regions 0xFFF… unix.o rtc.o 3 4 No perm Kernel calls mprotect(buf0, RO, 2) Read-write ide.o calls mprotect(req_q,RW,1) mprotect(buf1, RW, 2) Read-only mprotect(kfree, EX, 2) Execute-read If you are working late at night putting a bunch of these together looks like Mondriaan’s works. Pop up small copy of mondriaan painting mprotect(mod_init,EX,1) 0xC00… 1 2 Kernel ide.o Multiple protection domains

Challenges for Mondrix Memory supervisor Manage permissions, enforce sharing policy Memory allocators Keep semantics of kfree even with memory sharing Cross-domain calling (lightweight, local RPC) e.g., kernel calls start_recv in network driver Group domains Permissions for groups of memory locations whose members change with time Device drivers (disk and net) Evaluation (safety and performance)

Memory Supervisor Kernel subsystem to manage memory permissions (Mtop). Not trust kernel. Exports device independent protection API mprot_export(ptr,len,prot,domain-ID) Tracks memory owned by each domain Enforces memory isolation policy Non-owner can not increase permissions Regulates domains joining a group domain Writes protection tables (Mbot) All-powerful. Small. OS Mtop Mbot HW

Memory Allocation Memory allocators kept out of supervisor Allocator finds block of proper length Supervisor grants permissions Supervisor tracks sharing relationships kfree applies to all domains & groups No modifications to kernel to track sharing Slab allocator made MMP aware Allows some writes to uninitialized memory

Cross-Domain Calling Mondrix guarantees: Kernel Module Domain ID Mondrix guarantees: Module only entered at switch gate Return gate returns to instruction after call, to calling domain “Marshalling” = Giving permissions Stack allocated parameters are OK HW writes cross-domain call stack push add call mi ret pop ret Mention stack mi: push mov ret

MMP Hardware CPU Memory Domain ID Stack Regs Program Counter Protection Lookaside Buffer (PLB) Gate Lookaside Buffer Refill Refill Memory Xscale TLB is 15% Stack Permissions Table Permissions Table Only permissions table is large Gate Table

Group Protection Domains Domains need permission on group of related memory objects. Group domain virtual until a regular domain joins. Supervisor regulates membership ide.o 3 No perm Read-write Read-only Execute-read 0xC00… 1 2 Kernel ne.o inodes

Disk and Network Device Drivers Disk driver (EIDE) Permission granted before device read/write Permission revoked after device read/write DMA supported Network driver (NE2000) Permissions tightly controlled Read-write to 32 of 144 bytes of sk_buff Device driver does not write kernel pointers Device does not support DMA

Net Driver Example Kernel loader modifications mprot_export(&skb, PROT_RW,sr_pd); dev->start_recv(skb, dev); // XD mprot_export(&skb,PROT_NONE,sr_pd) Kernel loader modifications start_recv becomes cross-domain call Also add module memory sharing policy Permission grant/revoke explicit

Evaluation Methodolgy Turned x86 into x86 with MMP Instrumented SimICS & bochs machine simulator Complete system simulation, including BIOS 4,000 lines of hardware model of MMP Turned Linux into Mondrix 4,000 lines of memory supervisor top 1,720 lines of memory supervisor bottom 2,000 lines of kernel changes Modified allocators, tough but only done once Modified disk & network code easier

Fault Injection Experiments Ext2 file system, RIO/Nooks fault injector Symptom # runs MMP catch None 157 4 (2.5%) Hang 23 9 (39%) Panic 20 18 (90%) Mondrix prevented 3 of 5 cases where filesystem became corrupt (lost data) MMP detected problems before propagation 2 of 3 errors detected outside device driver

Workloads ./configure for xemacs-21.4.14 thttpd Launches many processes, creates many temporary files thttpd Web server with cgi scripts find /usr –print | xargs grep kangaroo MySQL – client test subset 150 test transactions

Performance Model 1 instruction per cycle 16KB 4-way L1 I & D cache 2MB 8-way associative unified L2 cache 4 GHz processor, 50ns memory L1 miss = 16 cycles, L2 miss 200 cycles Slowdown = Total time of Mondrix workload/Total time of Linux workload

config-xemacs

Performance Benchmark Slow Cyc*109 Mbot Mtop Kern conf-xemacs 4.4% 16.5 2.4% 0.7% 1.3% thttpd 14.8% 0.23 9.3% 2.0% 3.7% find 3.3% 14.3 1.2% 0.8% MySQL 9.6% 0.21 4.0% 2.3% Benchmark Mem XD Cy/XD PLB conf-xemacs 10.2% 0.3% 1,286 0.8% thttpd 1.1% 939 3.8% find 7.8% 0.2% 846 0.4% MySQL 1.6% 0.7% 664 1.7%

Performance, Protection, Programming Incremental performance cost for incremental isolation Loader only (~0.1%) Gates, inaccessible words between strings Memory allocation package (~1.0%) Guard words Fault on accessing uninitialized data Module-specific policies (~10%)

Related Work Safe device drivers with Nooks [Swift ’04] Asbestos [Efstathopoulos ’05] event processes Isolating user state perfect task for MMP Failure oblivious software [Rinard ’04] MMP optimizes out some memory checks Useful to implement safe languages? Unmanaged pieces/unsafe extensions Reduce trusted computing base

Thanks to the PC, and I hope SOSP ’07 accepts ~20% Conclusion Mondrix demonstrates that legacy software can be made safe (efficiently) MMP enables fast, robust, and extensible software systems Previously it was pick two out of three OS should demand more of HW Thanks to the PC, and I hope SOSP ’07 accepts ~20%