Nooks: an architecture for safe device drivers Mike Swift, The Wild and Crazy Guy, Hank Levy and Susan Eggers.

What are the big problems?
- Performance?
  - Solved by Intel
- Functionality?
  - Solved by Microsoft
- Scalability?
  - Solved by Akamai
- Reliability?
  - Solved by Boeing, NASA

Reliability is the problem
- When do my parents call me?
  - When their computer crashes.
- Reliability is getting better!
  - Computers now execute 100x more cycles between crashes than 10 years ago
- But that was on a …
- But I now have three computers in my office and two at home…
- But my computers are on 24x7 so I can check the weather faster…

Windows 2000 Failure Analysis
Source: Brendan Murphy, sample from PSS incidents
NT4:
- Device drivers: 16%
- Core NT: 43%
- Other third-party drivers: 16%
- Anti-virus: 12%
- Hardware failure: 13%
Windows 2000:
- Drivers for HCL hardware: 7%
- Drivers for non-HCL hardware: 20%
- Hardware failure: 22%
- Anti-virus: 4%
- System configuration: 34%
- Other third-party kernel code: 11%
- Microsoft internal code: 2%
- Other IFS drivers: 0%

Drivers are the culprit!
- 32% of NT 4 faults, 27% of W2k faults
  - Microsoft knows how to fix bugs
- Drivers are the bulk of the code in the kernel
  - Account for the largest portion of source code
  - Account for a large portion of runtime code
- Hardware failures make things worse

Why are drivers hard?
- Not written by software companies
- Challenging programming environment
- Absolute correctness required
- Complex asynchronous device protocols

What can we do about it?
- There have been past projects on isolating code:
  - Multics
  - Microkernels: Mach, L4, Fluke
  - Extensible kernels: SPIN, Exokernel, VINO
  - Safe code: SFI (software fault isolation), Java
- Why not isolate drivers?

Goals
- Preserve investment in existing OS
  - Don't require rewrite of large portions of kernel
- Preserve investments in existing drivers
  - Allow existing drivers to execute safely with just recompilation
- Allow different isolation techniques for different drivers, depending on needs
  - SFI for low latency
  - VM protection for high throughput

Why is this feasible?
- Drivers:
  - Have a limited interface to the kernel
  - Have limited dependencies on other code
  - Are designed to be loaded/unloaded independently
  - Make few performance-critical callbacks into the kernel

How hard is this?
- What makes it hard?
  - Shared state between drivers and kernel
  - Weak processors
- What makes it easy?
  - Read-only parameters
  - Void functions

Architecture

Optimizations
- Defer as much work as possible
  - Timers are only manipulated when already context switching
  - Packets are only received when context switching
- Provide local resource pools
  - Local pool of socket buffers, stacks, local heaps
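The two optimizations on this slide can be modelled in a few lines. What follows is a minimal user-space sketch, not the Nooks implementation: `LocalPool`, `DeferredCalls`, and the `refill` callback are hypothetical names, and a counter stands in for the cost of crossing between protection domains.

```python
class LocalPool:
    """Driver-local pool of buffers (hypothetical model).

    Allocations are served locally; the expensive cross-domain allocator
    is only called when the pool runs dry, and it refills a whole batch.
    """

    def __init__(self, refill, batch=4):
        self._refill = refill   # assumed cross-domain allocator: n -> list of buffers
        self._batch = batch
        self._free = []
        self.crossings = 0      # how many times we had to leave the driver domain

    def alloc(self):
        if not self._free:
            self.crossings += 1
            self._free.extend(self._refill(self._batch))
        return self._free.pop()

    def free(self, buf):
        # Returned buffers stay in the local pool for reuse.
        self._free.append(buf)


class DeferredCalls:
    """Queue kernel-bound operations and run them at the next domain switch."""

    def __init__(self):
        self._pending = []

    def defer(self, func, *args):
        self._pending.append((func, args))

    def flush(self):
        """Run everything queued so far; meant to be called when a switch
        is happening anyway. Returns the number of operations run."""
        pending, self._pending = self._pending, []
        for func, args in pending:
            func(*args)
        return len(pending)
```

With a batch size of 4, four consecutive allocations cost a single crossing, and deferred timer or packet operations run only when `flush()` is called at a switch.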

Implementation
- Implemented in Linux
  - 147 calls into kernel
  - 10 interfaces to drivers
    - File operations, VM operations, network device operations, timers, interrupts …
    - 103 calls into drivers
- Duplicated kernel page table grants drivers read-only access to kernel memory
- Lowered privilege level prevents drivers from deadlocking

Wrapping and Protection
- Protection domain switch when calling into drivers
  - Identify all calls to/from kernel
  - Implement wrapper functions for all calls
- Grant drivers read-only access to kernel memory
- Trap privileged instructions when running with lowered privileges
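The wrapper idea can be illustrated with a small user-space model. This is a sketch under stated assumptions rather than the actual Nooks wrappers: every name (`enter_domain`, `wrap_driver_entry`, `driver_open`, the `jiffies` key) is hypothetical, a dictionary stands in for the current protection domain, and `types.MappingProxyType` stands in for the read-only mapping of kernel memory.

```python
from contextlib import contextmanager
from types import MappingProxyType

# Hypothetical model of protection-domain state; not real kernel structures.
KERNEL, DRIVER = "kernel", "driver"
state = {"domain": KERNEL, "switches": 0}


@contextmanager
def enter_domain(domain):
    """Switch protection domain for one call and always switch back."""
    previous = state["domain"]
    state["domain"] = domain
    state["switches"] += 1
    try:
        yield
    finally:
        state["domain"] = previous
        state["switches"] += 1


def wrap_driver_entry(func):
    """Wrap a kernel-to-driver call: the driver body runs in the driver
    domain and sees kernel state only through a read-only view."""
    def wrapper(kernel_state, *args):
        with enter_domain(DRIVER):
            return func(MappingProxyType(kernel_state), *args)
    return wrapper


@wrap_driver_entry
def driver_open(kernel_view, minor):
    # Reading kernel state is allowed.
    return (state["domain"], kernel_view["jiffies"] + minor)


@wrap_driver_entry
def buggy_driver(kernel_view):
    # Writing through the read-only view raises TypeError.
    kernel_view["jiffies"] = 0
```

Calling `driver_open({"jiffies": 100}, 3)` returns `("driver", 103)` and leaves the domain set back to `"kernel"`. A write attempted by `buggy_driver` fails without modifying the kernel's dictionary, and the domain is still restored.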

Hacks for evaluation
- Don't run with separate page table
  - Just flush TLB instead
- Don't run with lowered privileges
  - Just trap to kernel at appropriate times

Evaluation
- Test platform: Blackbox machines
  - 1.7 GHz P4
  - 1 GB SDRAM
  - Intel PRO/1000 gigabit Ethernet NIC
    - 200 microsecond round trip time
- Configurations
  - Isolate performance impact of wrapping calls, flushing TLB, trapping to kernel

Ongoing / Future work
- Create page table structure for safe drivers on IA-32
- Allow recovery of drivers without full restart
  - Hardware is idempotent…
  - Rather than rebooting driver, just retry request
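The retry idea in the last bullet might look roughly like the sketch below. The slide lists this as future work, so none of this reflects an existing design: `DriverFault`, `call_with_retry`, and `reset_driver` are hypothetical names, and the approach is only sound if repeating a request is idempotent, as the slide assumes.

```python
class DriverFault(Exception):
    """Stand-in for a fault caught at the isolation boundary (hypothetical)."""


def call_with_retry(request, reset_driver, max_attempts=3):
    """Run `request`; on a driver fault, reset driver state and try again.

    Assumes the request is idempotent, so issuing it more than once is safe.
    The final failure is re-raised so the caller can fall back to a restart.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except DriverFault:
            if attempt == max_attempts:
                raise
            reset_driver()
```

A request that faults twice and then succeeds is retried transparently, with `reset_driver` called once per failed attempt.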

Conclusions
- Operating systems should remove their dependence on driver safety
- Processors are fast enough to spend a little performance on isolation
- Existing operating systems can be extended to run existing driver code safely