Windows Operating System Internals - by David A. Solomon and Mark E. Russinovich with Andreas Polze Unit OS B: Comparing the Linux and Windows Kernels.



2 Copyright Notice
© David A. Solomon and Mark Russinovich
These materials are part of the Windows Operating System Internals Curriculum Development Kit, developed by David A. Solomon and Mark E. Russinovich with Andreas Polze.
Microsoft has licensed these materials from David Solomon Expert Seminars, Inc. for distribution to academic organizations solely for use in academic environments (and not for commercial use).

3 The History of Linux
- The real history of Linux starts in 1969, when Ken Thompson developed the first version of UNIX at Bell Labs
- After Dennis Ritchie, designer of the C programming language, joined the project, it debuted to the research community in an academic paper in 1974
- Bell Labs released the first commercial version in 1976 as UNIX Version 6 (V6)
- UNIX spread throughout universities, and in 1978 Bell Labs released UNIX Time-Sharing System, a version with portability in mind

4 Linux History Continued
- Because Bell Labs distributed UNIX with source code, the early 1980s saw three major branches grow on the UNIX tree:
  - UNIX System III from Bell Labs' UNIX Support Group (USG)
  - UNIX Berkeley Software Distribution (BSD) from the University of California at Berkeley
  - Microsoft's XENIX
- The UNIX market fragmented further in the 1980s, despite the IEEE's POSIX standard and the X/Open Group's Portability Guide

5 Linus and Linux
- In 1991 Linus Torvalds took a college computer science course that used the Minix operating system
  - Minix is a “toy” UNIX-like OS written by Andrew Tanenbaum as a learning workbench
  - Linus wanted to make Minix more usable, but Tanenbaum wanted to keep it ultra-simple
- Linus went in his own direction and began working on Linux
  - In October 1991 he announced Linux v0.02
  - In March 1994 he released Linux v1.0

6 The History of Windows (NT)
- The history of Windows really begins in the mid-1970s, when Dick Hustvedt, Peter Lipman and David Cutler designed the VMS operating system for Digital’s 32-bit VAX processor
- Digital shipped VMS v1.0 in 1978
- Cutler moved to Seattle to open DECWest and worked on the Digital Mica OS for a new CPU codenamed Prism
  - 12 engineers went with him and the facility grew to 200
- In 1988 Digital cancelled the project

7 The History of Windows Continued
- Bill Gates wanted a UNIX rival
  - He hired Cutler and 20 Digital engineers in 1989
  - The new project was called NT OS/2 because it focused on OS/2 backward compatibility
- With the success of Windows 3.0’s 1990 release, Gates refocused the project on Windows compatibility
  - The project was renamed Windows NT
- Microsoft released Windows NT 3.1 in August 1993

8 Windows and Linux
- Both Linux and Windows are based on foundations developed in the mid-1970s
[Figure: parallel timelines. UNIX/Linux: UNIX born, UNIX public, UNIX V6, Linux v1.0, v2.0, v2.1, v2.2, v2.3, v2.4. VMS/Windows: VMS v1.0, Windows NT 3.1, NT 4.0, Windows 2000, Windows XP, Server 2003.]

9 Comparing the Architectures
- Both Linux and Windows are monolithic
  - All core operating system services run in a shared address space in kernel mode
  - All core operating system services are part of a single module
    - Linux: vmlinuz
    - Windows: ntoskrnl.exe
- Windowing is handled differently:
  - Windows has a kernel-mode windowing subsystem
  - Linux has a user-mode X Window System

10 Kernel Architectures
[Figure: side-by-side kernel architecture diagrams.
Linux: Application and X-Windows in user mode; System Services, Process Management, Memory Management, I/O Management, etc., Device Drivers and Hardware Dependent Code in kernel mode.
Windows: Application in user mode; System Services, Win32 Windowing, Process Management, Memory Management, I/O Management, etc., Device Drivers and Hardware Dependent Code in kernel mode.]

11 Linux Kernel
- Linux is a monolithic but modular system
  - All kernel subsystems form a single piece of code with no protection between them
- Modularity is supported in two ways:
  - Compile-time options
  - Most kernel components can be built as a dynamically loadable kernel module (DLKM)
- DLKMs
  - Built separately from the main kernel
  - Loaded into the kernel at runtime and on demand (infrequently used components take up kernel memory only when needed)
  - Kernel modules can be upgraded incrementally
  - Support for minimal kernels that automatically adapt to the machine and load only those kernel components that are used

12 Windows Kernel
- Windows is a monolithic but modular system
  - No protection among pieces of kernel code and drivers
- Support for modularity is somewhat weak:
  - Windows drivers allow for dynamic extension of kernel functionality
  - Windows XP Embedded has special tools / packaging rules that allow coarse-grained configuration of the OS
- Windows drivers are dynamically loadable kernel modules
  - Significant amount of code runs as drivers (including network stacks such as TCP/IP and many services)
  - Built independently from the kernel
  - Can be loaded on demand
  - Dependencies among drivers can be specified

13 Comparing Portability
- Both Linux and Windows kernels are portable
  - Mainly written in C
  - Have been ported to a range of processor architectures
- Windows
  - i486, MIPS, PowerPC, Alpha, IA-64, x86-64
  - Only x86-64 and IA-64 currently supported
  - > 64MB memory required
- Linux
  - Alpha, ARM, ARM26, CRIS, H8300, i386, IA-64, M68000, MIPS, PA-RISC, PowerPC, S/390, SuperH, SPARC, VAX, v850, x86-64
  - DLKMs allow for minimal kernels for microcontrollers
  - > 4MB memory required

14 Comparing Layering, APIs, Complexity
- Windows
  - Kernel exports about 250 system calls (accessed via ntdll.dll)
  - Layered Windows/POSIX subsystems
  - Rich Windows API ( functions on top of native APIs)
- Linux
  - Kernel supports about 200 different system calls
  - Layered BSD, Unix Sys V, POSIX shared system libraries
  - Compact APIs (1742 functions in Single Unix Specification Version 3; not including X Window APIs)

15 Comparing Architectures
- Processes and scheduling
- SMP support
- Memory management
- I/O
- File caching
- Security

16 Process Management
Windows
- Process
  - Address space, handle table, statistics and at least one thread
  - No inherent parent/child relationship
- Threads
  - Basic scheduling unit
  - Fibers - cooperative user-mode threads
Linux
- Process is called a Task
  - Address space, handle table, statistics
  - Parent/child relationship
  - Basic scheduling unit
- Threads
  - No threads per se
  - Tasks can act like Windows threads by sharing handle table, PID and address space
  - PThreads - cooperative user-mode threads


18 Scheduling Priorities
Windows
- Two scheduling classes
  - “Real time” (fixed) - priority 16-31
  - Dynamic - priority 1-15
- Higher priorities are favored
  - Priorities of dynamic threads get boosted on wakeups
  - Thread priorities are never lowered below their base priority
Linux
- Has 3 scheduling classes:
  - Normal – priority
  - Fixed Round Robin - priority 0-99
  - Fixed FIFO - priority 0-99
- Lower priorities are favored
  - Priorities of normal threads go up (decay) as they use CPU
  - Priorities of interactive threads go down (boost)

19 Scheduling Priorities (cont)
[Figure: priority ranges by scheduling class. Windows: Fixed, Dynamic (with I/O boost). Linux: Fixed FIFO, Fixed Round-Robin, Normal (with CPU and I/O adjustments).]

20 Linux Scheduling Details
- Most threads use a dynamic priority policy
  - Normal class - similar to the classic UNIX scheduler
  - A newly created thread starts with a base priority
  - Threads that block frequently (I/O bound) will have their priority gradually increased
  - Threads that always exhaust their time slice (CPU bound) will have their priority gradually decreased
- “Nice value” sets a thread’s base priority
  - Larger values = less priority, lower values = higher priority
  - Valid nice values are in the range of -20 to +19
  - Nonprivileged users can only specify positive nice values
- Dynamic priority policy threads have static priority zero
  - Execute only when there are no runnable real-time threads
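The nice-value rules on this slide can be sketched in a few lines of Python. The helper names are hypothetical, and the mapping of nice values onto internal priorities 100 to 139 is an assumption taken from 2.6-era kernels, not from the slide.

```python
import os

NICE_MIN, NICE_MAX = -20, 19  # valid nice range on Linux


def clamp_nice(value):
    """Clamp a requested nice value into the valid range."""
    return max(NICE_MIN, min(NICE_MAX, value))


def normal_class_priority(nice):
    """Internal priority of a normal-class task (lower number = more favored).

    Assumption: 2.6-era kernels place nice -20..19 at priorities 100..139,
    directly above the real-time range 0..99.
    """
    return 120 + clamp_nice(nice)


def current_nice():
    """Return the caller's nice value, or None where os.nice is unavailable."""
    nice_call = getattr(os, "nice", None)
    if nice_call is None:
        return None
    # Adding 0 leaves the value unchanged and returns the current niceness
    return nice_call(0)
```

Calling `normal_class_priority(10)` for a "nicer" task yields a numerically larger, and therefore less favored, priority than `normal_class_priority(0)`.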

21 Real-Time Scheduling on Linux
- Linux supports two static priority scheduling policies: round-robin and FIFO (first in, first out)
  - Selected with the sched_setscheduler() system call
  - Use static priority values in the range of 1 to 99
  - Executed strictly in order of decreasing static priority
- FIFO policy lets a thread run to completion
  - Thread needs to indicate completion by calling sched_yield()
- Round-robin lets threads run for up to one time slice
  - Then switches to the next thread with the same static priority
- RT threads can easily starve lower-priority threads from executing
- Root privileges or the CAP_SYS_NICE capability are required for the selection of a real-time scheduling policy
- Long-running system calls can cause priority inversion
  - Same as in Windows; but compare RTLinux
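As a minimal sketch of the policy selection described above, Python exposes the same system call through `os.sched_setscheduler`. The function name `try_fifo_briefly` is hypothetical; the example assumes a Linux-like platform and simply reports failure when the caller lacks root or CAP_SYS_NICE.

```python
import os


def try_fifo_briefly(priority=1):
    """Try SCHED_FIFO on the calling process, then restore the old policy.

    Returns True if the switch was permitted, False otherwise.
    """
    if not hasattr(os, "sched_setscheduler"):
        return False  # platform without these POSIX scheduling calls
    try:
        old_policy = os.sched_getscheduler(0)  # pid 0 means the caller
        old_param = os.sched_getparam(0)
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        # Typically PermissionError: neither root nor CAP_SYS_NICE
        return False
    # Restore, so the example does not leave a real-time process behind
    os.sched_setscheduler(0, old_policy, old_param)
    return True
```

An unprivileged caller should see `False`, which matches the privilege requirement listed on the slide.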

22 Windows Scheduling Details
- Most threads run in variable priority levels
  - Priorities 1-15
  - A newly created thread starts with a base priority
  - Threads that complete I/O operations experience priority boosts (but never higher than 15)
  - A thread’s priority will never be below base priority
- The Windows API function SetThreadPriority() sets the priority value for a specified thread
  - This value, together with the priority class of the thread's process, determines the thread's base priority level
  - Windows will dynamically adjust priorities for non-realtime threads
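The way a process priority class and a relative thread priority combine into a base priority can be modelled as below. This is a simplified model, not Windows code: the class base values are taken from the documented Windows priority table rather than from the slide, only the relative levels -2 to +2 are handled, and all names are hypothetical.

```python
# Assumed base levels per process priority class (documented Windows table)
CLASS_BASE = {
    "IDLE": 4,
    "BELOW_NORMAL": 6,
    "NORMAL": 8,
    "ABOVE_NORMAL": 10,
    "HIGH": 13,
    "REALTIME": 24,
}

DYNAMIC_MAX = 15  # top of the variable (dynamic) range


def base_priority(priority_class, relative):
    """Base priority for a relative level of -2 (lowest) .. +2 (highest)."""
    if relative not in (-2, -1, 0, 1, 2):
        raise ValueError("only relative levels -2..+2 are modelled")
    return CLASS_BASE[priority_class] + relative


def boosted_priority(base, boost):
    """Priority after a wakeup boost; real-time threads are never boosted."""
    if base > DYNAMIC_MAX:
        return base
    # A dynamic thread is boosted above its base but never beyond 15
    return min(DYNAMIC_MAX, base + boost)
```

For example, `boosted_priority(base_priority("HIGH", 0), 6)` is capped at 15, while a thread at 24 keeps its priority unchanged.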

23 Real-Time Scheduling on Windows
- Windows supports static round-robin scheduling policy for threads with priorities in real-time range (16-31)
  - Threads run for up to one quantum
  - Quantum is reset to full turn on preemption
  - Priorities never get boosted
- RT threads can starve important system services
  - Such as CSRSS.EXE
  - SeIncreaseBasePriorityPrivilege required to elevate a thread’s priority into real-time range (this privilege is assigned to members of Administrators group)
- System calls and DPC/APC handling can cause priority inversion

24 Scheduling Timeslices
Windows
- The thread timeslice (quantum) is 10ms-120ms
  - When quanta can vary, has one of 2 values
- Reentrant and preemptible
- [Figure: quantum lengths. Fixed: 120ms; Background: 20ms; Foreground: 60ms]
Linux
- The thread quantum is 10ms-200ms
  - Default is 100ms
  - Varies across entire range based on priority, which is based on interactivity level
- Reentrant and preemptible
- [Figure: quantum range from 10ms through the 100ms default to 200ms]

25 Virtual Memory Management
Windows
- 32-bit versions split user-mode/kernel-mode from 2GB/2GB to 3GB/1GB
- Demand-paged virtual memory
  - 32 or 64 bits
  - Copy-on-write
  - Shared memory
  - Memory mapped files
- [Figure: address space layout, User from 0 to 2GB, System from 2GB to 4GB]
Linux
- Splits user-mode/kernel-mode from 1GB/3GB to 3GB/1GB
  - 2.6 has “4/4 split” option where kernel has its own address space
- Demand-paged virtual memory
  - 32 bits and/or 64 bits
  - Copy-on-write
  - Shared memory
  - Memory mapped files
- [Figure: address space layout, User from 0 to 3GB, System from 3GB to 4GB]
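Memory-mapped files and copy-on-write, which both systems list, can be observed from user mode. The sketch below uses Python's `mmap` module with `ACCESS_COPY`, which requests a private copy-on-write mapping; the function name `cow_demo` and the sample contents are illustrative only.

```python
import mmap
import os
import tempfile


def cow_demo():
    """Map a file copy-on-write, modify the mapping, and compare with disk."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello world")
        path = tmp.name
    try:
        with open(path, "rb") as f:
            # ACCESS_COPY: writes go to private pages, never to the file
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as view:
                view[0:5] = b"HELLO"
                in_memory = bytes(view[:])
        with open(path, "rb") as f:
            on_disk = f.read()
    finally:
        os.unlink(path)
    return in_memory, on_disk
```

The first returned value reflects the write while the second still holds the original bytes, which is the behavior copy-on-write is meant to provide.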

26 Physical Memory Management
Windows
- Per-process working sets
  - Working set tuner adjusts sets according to memory needs using the “clock” algorithm
- No “swapper”
Linux
- Global working set management uses “clock” algorithm
- No “swapper” (the working set trimmer code is called the swap daemon, however)
[Figure: LRU page-reuse diagrams for each system; labels: Process LRU, Other Process LRU, LRU, Reused Page]

27 I/O Management
Windows
- Centered around the file object
- Layered driver architecture throughout driver types
- Most I/O supports asynchronous operation
- Internal interrupt request level (IRQL) controls interruptibility
- Interrupts are split between an Interrupt Service Routine (ISR) and a Deferred Procedure Call (DPC)
- Supports plug-and-play
Linux
- Centered around the vnode
- No layered I/O model
- Most I/O is synchronous
  - Only sockets and direct disk I/O support asynchronous I/O
- Internal interrupt request level (IRQL) controls interruptibility
- Interrupts are split between an ISR and soft IRQ or tasklet
- Supports plug-and-play
[Figure: IRQL levels, with lower levels masked]

28 File Caching
Windows
- Single global common cache
- Virtual file cache
  - Caching is at file vs. disk block level
  - Files are memory mapped into kernel memory
- Cache allows for zero-copy file serving
Linux
- Single global common cache
- Virtual file cache
  - Caching is at file vs. disk block level
  - Files are memory mapped into kernel memory
- Cache allows for zero-copy file serving
[Figure: for both systems, File Cache sits above the File System Driver, which sits above the Disk Driver]

29 Security
Windows
- Very flexible security model based on Access Control Lists
- Users are defined with:
  - Privileges
  - Member groups
- Security can be applied to any Object Manager object
  - Files, processes, synchronization objects, …
- Supports auditing
Linux
- Two models:
  - Standard UNIX model
  - Access Control Lists (SELinux)
- Users are defined with:
  - Capabilities (privileges)
  - Member groups
- Security is implemented on an object-by-object basis
- Has no built-in auditing support
- Version 2.6 includes Linux Security Module framework for add-on security models

30 Monitoring - Linux procfs
- Linux supports a number of special filesystems
  - Like special files, they are of a more dynamic nature and tend to have side effects when accessed
- Prime example is procfs (mounted at /proc)
  - Provides access to and control over various aspects of Linux (e.g., scheduling and memory management)
  - /proc/meminfo contains detailed statistics on the current memory usage of Linux
  - Content changes as memory usage changes over time
- Services for Unix implements procfs on Windows
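Because /proc/meminfo is plain text, monitoring tools can read it like any file. The following is a small sketch, with hypothetical helper names, that assumes the usual `Key:   value kB` line layout.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    stats = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields:
            continue  # skip blank or malformed lines
        # Most entries are in kB; a few (e.g. HugePages_Total) are counts
        stats[key.strip()] = int(fields[0])
    return stats


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file; only meaningful where procfs is mounted."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling `read_meminfo()` twice a few seconds apart on a busy Linux machine will typically show different `MemFree` values, illustrating that the content is generated on each read.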

31 Windows’ Evolution Towards Linux
- Services for Unix really targeted at POSIX, not Linux
  - POSIX threads, full POSIX subsystem (Interix)
  - X Window clients+server (X-Win32 LX)
  - nfs, NIS, pam
  - proc-file system for Windows
- Configurability / Module Management
  - Windows XP Embedded
  - Target Designer/Component Designer/Component Management Database
- Editions targeting new Application Domains
  - Windows Compute Cluster Server 2003
- POSIX compatibility in Windows actually predates Linux and was one of the original design goals

32 Linux’s Evolution Towards Windows
- I/O processing
- Kernel reentrancy
- Kernel preemptibility
- Per-processor memory allocation
- O(1) scheduler and per-CPU ready queues
- Zero-Copy SendFile
- Wake-One socket semantics
- Asynchronous I/O
- Light-weight synchronization

33 I/O Processing
- Linux 2.2 had the notion of bottom halves (BH) for low-priority interrupt processing
  - Fixed number of BHs
  - Only one BH of a given type could be active on an SMP system
- Linux 2.4 introduced tasklets, which are non-preemptible procedures called with interrupts enabled
- Tasklets are the equivalent of Windows Deferred Procedure Calls (DPCs)

34 Kernel Reentrancy
- Mark Russinovich’s April 1999 Windows NT Magazine article, “Linux and the Enterprise”, pointed out that much of Linux 2.2 was not reentrant
- Ingo Molnar stated in rebuttal: “his example is a clear red herring.”
  - A month later he made all major paths reentrant
[Figure: timelines for cpu 1 and cpu 2, non-reentrant vs. reentrant, showing the time saved when both CPUs can run kernel code at once]

35 Kernel Preemptibility
- A preemptible kernel is more responsive to high-priority tasks
- Through the base release of v2.4, Linux was only cooperatively preemptible
  - There are well-defined safe places where a thread running in the kernel can be preempted
- The kernel is preemptible in v2.4 patches and v2.6
- Windows NT has always been preemptible

36 Scheduling
- The Linux 2.4 scheduler is O(n)
  - If there are 10 active tasks, it scans 10 of them in a list in order to decide which should execute next
  - This means long scans and long durations under the scheduler lock
[Figure: a single ready list that is scanned to find the highest-priority task]

37 Scheduling
- Linux 2.6 has a revamped scheduler from Ingo Molnar that’s O(1) and that:
  - Calculates a task’s priority at the time it makes a scheduling decision
  - Has per-CPU ready queues where the tasks are pre-sorted by priority
[Figure: an array of per-priority queues; the scheduler picks from the highest-priority non-empty queue]
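The difference between scanning one ready list and picking from pre-sorted queues can be sketched as follows. This is an illustrative model, not kernel code: the 140-level range and the bitmap of non-empty queues are assumptions based on the 2.6 design, and the class and function names are hypothetical.

```python
from collections import deque

NUM_PRIORITIES = 140  # assumed 2.6-style range; 0 is the most favored level


def pick_linear(tasks):
    """O(n) style: examine every runnable (priority, name) pair."""
    best = None
    for task in tasks:
        if best is None or task[0] < best[0]:
            best = task
    return best


class RunQueue:
    """One per CPU: a FIFO queue per priority plus a bitmap of non-empty ones."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]
        self.bitmap = 0  # bit p is set exactly when queues[p] is non-empty

    def enqueue(self, priority, name):
        self.queues[priority].append(name)
        self.bitmap |= 1 << priority

    def pick_next(self):
        """Cost depends on the fixed number of levels, not on the task count."""
        if self.bitmap == 0:
            return None
        # Isolate the lowest set bit: the most favored non-empty level
        priority = (self.bitmap & -self.bitmap).bit_length() - 1
        queue = self.queues[priority]
        name = queue.popleft()
        if not queue:
            self.bitmap &= ~(1 << priority)
        return priority, name
```

With tasks enqueued at levels 120, 100 and 120, `pick_next()` returns the level-100 task first and then the two level-120 tasks in arrival order, without ever walking the full task list.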

38 Scheduling
- Windows NT has always had an O(1) scheduler based on pre-sorted thread priority queues
- Server 2003 introduced per-CPU ready queues
  - Linux load balances queues
  - Windows does not
    - Not seen as an issue in performance testing by Microsoft
    - Applications where it might be an issue are expected to use affinity

39 Zero-Copy Sendfile
- Linux 2.2 introduced Sendfile to efficiently send file data over a socket
  - I pointed out that the initial implementation incurred a copy operation, even if the file data was cached
- Linux 2.4 introduced zero-copy Sendfile
- Windows NT pioneered zero-copy file sending with TransmitFile, the Sendfile equivalent, in Windows NT 4
[Figure: 1-copy path (File Data Buffer copied to Network Adapter Buffer, then to the Network Driver) vs. 0-copy path (File Data Buffer handed directly to the Network Driver)]
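From user mode, the Sendfile interface looks like the sketch below. It uses Python's `socket.sendfile`, which calls `os.sendfile` where the platform provides it and otherwise falls back to ordinary `send` calls, so whether a given run is truly zero-copy is platform dependent. The function name and the use of a local socket pair are illustrative choices.

```python
import socket
import tempfile


def send_file_over_socket(payload):
    """Write payload to a temp file, send it with sendfile, return what arrives.

    Keep payload small: nothing reads the peer socket until sending is done.
    """
    sender, receiver = socket.socketpair()
    try:
        with tempfile.TemporaryFile() as f:
            f.write(payload)
            f.flush()
            f.seek(0)
            # The kernel moves file data to the socket without a user buffer
            sender.sendfile(f)
        sender.close()  # signals end-of-stream to the receiver
        chunks = []
        while True:
            data = receiver.recv(4096)
            if not data:
                break
            chunks.append(data)
        return b"".join(chunks)
    finally:
        sender.close()
        receiver.close()
```

A real server would pass an open file for a requested document and a connected client socket instead of the temporary file and socket pair used here.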

40 Wake-one Socket Semantics
- Linux 2.2 kernel had the thundering herd or overscheduling problem
  - In a network server application there are typically several threads waiting for a new connection
  - In v2.2, when a new connection came in, all the waiters would race to get it
- Ingo Molnar’s response:
  - 5/2/99: “here he again forgets to _prove_ that overscheduling happens in Linux.”
  - 5/7/99: “as of my wake-one implementation and waitqueues rewrite went in”
- In Linux 2.4 only one thread wakes up to claim the new connection
- Windows NT has always had wake-1 semantics

41 Asynchronous I/O
- Linux 2.2 only supported asynchronous I/O on socket connect operations and ttys
- Linux 2.6 adds asynchronous I/O for direct-disk access
  - AIO model includes efficient management of asynchronous I/O
  - Also added alternate epoll model
  - Useful for database servers managing their database on a dedicated raw partition
  - Database servers that manage a file-based database suffer from synchronous I/O
- Windows I/O is inherently asynchronous
- Windows has had completion ports since NT 3.5
  - More advanced form of AIO
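The epoll model mentioned above is readiness based: the application registers descriptors and is told which ones can be read without blocking. A minimal sketch using Python's `selectors` module follows; `DefaultSelector` is expected to pick epoll on Linux, and the function name and socket pair are illustrative.

```python
import selectors
import socket


def wait_for_readable(timeout=1.0):
    """Register a socket, then observe it becoming readable after a write."""
    sel = selectors.DefaultSelector()
    a, b = socket.socketpair()
    try:
        b.setblocking(False)
        sel.register(b, selectors.EVENT_READ, data="peer")
        before = sel.select(timeout=0)  # poll: nothing has been sent yet
        a.sendall(b"ping")
        after = sel.select(timeout=timeout)  # returns once b is readable
        received = [key.fileobj.recv(16) for key, _events in after]
        return len(before), received
    finally:
        sel.close()
        a.close()
        b.close()
```

This differs from completion ports, where the operating system reports finished operations rather than descriptors that are ready to be operated on.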

42 Light-Weight Synchronization
- Linux 2.6 introduces Futexes
  - There’s only a transition to kernel mode when there’s contention
- Windows has always had CriticalSections
  - Same behavior
- Futexes go further:
  - Allow for prioritization of waits
  - Work interprocess as well

43 A Look at the Future
- The kernel architectures are fundamentally similar
  - There are differences in the details
  - Linux implementation is adopting more of the good ideas used in Windows
- For the next 2-4 years Windows has and will maintain an edge
  - Linux is still behind on the cutting edge of performance tricks
  - Large performance team and lab at Microsoft has direct ties into the kernel developers
- As time goes on the technological gap will narrow
  - Open Source Development Labs (OSDL) will feed performance test results to the kernel team
  - IBM and other vendors have Linux technology centers
  - Squeezing performance out of the OS gets much harder as the OS gets more tuned

44 Linux Technology Unknowns
- Linux kernel forking
  - Red Hat has already done it: Red Hat Enterprise Server v3.0 is Linux 2.4 with some Linux 2.6 features
- Backward compatibility philosophy
  - Linus Torvalds makes decisions on kernel APIs and architecture based on technical reasons, not business reasons

45 Further Reading
- Transaction Processing Council
- SPEC
- NT vs Linux benchmarks: benchmarks.html
- The C10K problem
- Linus Torvalds’s home
- Linux Kernel Archives
- Linux history
- Veritest Netbench result
- Mark Russinovich’s 1999 article, “Linux and the Enterprise”
- The Open Group's Single UNIX Specification