Hints for Computer System Design Butler W. Lampson
System Design: Introduction Designing a computer system is hard! Is it like designing an algorithm? Designers hesitate about unclear choices & how they affect Flexibility Performance The freedom to make subsequent choices within the constraints of previous ones because: OSs don’t partition into isolated subsystems very well OS subsystems may interact in unexpected ways
System Design: Introduction Other reasons it’s so hard: Moore’s law OSs don’t improve by a factor of 100 every decade Backward compatibility OSs are fundamentally different than standard apps. Extremely large programs UNIX: 1 Million lines of code Windows: 29 million lines of code OSs have to deal with potentially hostile users & at same time allow users to share information
System Design: Introduction Even more reasons it’s so hard: OSs live for a very long time UNIX is over a quarter of a century old Must prepare for hardware/software changes OS designers do not have a good idea how their systems will ultimately be used Email & web browsers came after UNIX OSs have to run on multiple hardware platforms OSs frequently have to be backward compatible
System Design: The Slogans
Functionality: The Interface Do one thing well (keep it simple) Don’t generalize Extensible OS designs Microkernels: general interface, few assumptions made about implementation Exokernel: kernel provides resource protection, just exports hardware Get it right Make it fast Don’t hide power Virtualization: virtual hardware exposed is functionally identical to the underlying hardware Exokernel: the high-performance of the hardware is just exported to the user
Functionality: The Interface Use procedure arguments to provide flexibility in an interface Leave it to the client Monitors: locking & signaling mechanisms to very little, leaving all the real work to the client Keep basic interfaces stable Keep a place to stand if you must change interfaces (backward compatibility) Examples: Emulation library: microkernels emulate interfaces by trap redirection Elephant file system: Keep One Virtualization: VMM multiplexing at the granularity of an operating system
Functionality: Implementation Plan to throw one away Keep secrets (encapsulation) Virtualization: Hide the messy stuff like NUMA Monitors Allow access only at certain entry points Use a good idea again Divide and Conquer RPC/LRPC/URPC: Progression solved a problem in parts: cross-machine, cross-domain, better cross-domain (less kernel involvement)
Functionality: Completeness Separate normal and worst case The normal case must be fast The worst case must make progress RPC & its variations Cross-domain common case Cross-machine happens only sometimes Specialization Partial evaluation when possible Replugging when necessary
Speed: Interface Split resources Use static analysis VMM: Multiplexing resources Guest OSs aren’t even aware that they’re sharing Use static analysis Specialization (static): predicates known at compile time allows for partial evaluation Dynamic Translation Specialization (dynamic): at run-time predicates that hold for the remainder of execution or some bounded time period (packet filter interpreter)
Speed: Implementation Cache Answers to expensive computations Use hints The Ethernet: exponential backoff Links in web pages When in doubt use brute force Compute in background Use batch processing Soft timers: uses trigger states to batch process handling events to avoid trashing the cache more often than necessary RCU: memory reclamation done in batches Cleanup in log structured file systems: segment cleaning could be scheduled at nighttime.
Speed: Completeness Shed load to control demand Safety First SEDA Adaptive load shedding Apache Bounded thread pool Safety First When allocating resources remember safety first Avoid temptation to try to obtain optimum performance (instead try to avoid disaster) Progression of kernel structuring approaches for extensibility Microkernels, sandboxing, SPIN vs. Mach (some versions) & Chorus
Fault-tolerance End-to-end Log updates We’ll see this next! Log updates Logs can be reliably written/read Logs can be cheaply forced out to disk, which can survive a crash Log structured file systems RAID5 in Elephant Make actions atomic or restartable RCU: changes are seen as atomic to all processes
System Design: Conclusion Just remember: Designing an OS starts with determining what it should do The interface should be simple, complete, & efficient There probably isn’t a “best” solution for even one part of any system Just don’t choose a terrible solution Have a clear division of responsibility among the parts
“The unavoidable price of reliability is simplicity.” System Design: Quotes “Perfection is reached not when there is no longer anything to add, but when there is no longer anything to take away” --A. Saint-Exupery “The unavoidable price of reliability is simplicity.” --C. Hoare
System Design: References Jon Walpole “Hints for Computer System Design” Lampson. Modern Operating Systems, Second Edition. Tannenbaum, pp. 855-894. http://c2.com/cgi/wiki?SpiralModel http://www.dpw.wau.nl/pv/temp/clipart/ http://firstmonday.org/issues/issue7_11/tuomi/