Protection, identity, and trust Jeff Chase Duke University.

Slides:



Advertisements
Similar presentations
Remote Procedure Call (RPC)
Advertisements

Access Control Jeff Chase Duke University. The need for access control Processes run programs on behalf of users. (“subjects”) Processes create/read/write/delete.
CSE331: Introduction to Networks and Security Lecture 28 Fall 2002.
CSI 400/500 Operating Systems Spring 2009 Lecture #20 – Security Measures Wednesday, April 29 th.
Inter Process Communication:  It is an essential aspect of process management. By allowing processes to communicate with each other: 1.We can synchronize.
16: Distributed Systems1 DISTRIBUTED SYSTEM STRUCTURES NETWORK OPERATING SYSTEMS The users are aware of the physical structure of the network. Each site.
G Robert Grimm New York University Protection and the Control of Information Sharing in Multics.
What Is TCP/IP? The large collection of networking protocols and services called TCP/IP denotes far more than the combination of the two key protocols.
CSE 451: Operating Systems Autumn 2013 Module 6 Review of Processes, Kernel Threads, User-Level Threads Ed Lazowska 570 Allen.
CSE 451: Operating Systems Winter 2012 Processes Mark Zbikowski Gary Kimura.
Agenda  Terminal Handling in Unix File Descriptors Opening/Assigning & Closing Sockets Types of Sockets – Internal(Local) vs. Network(Internet) Programming.
1 Chapter Client-Server Interaction. 2 Functionality  Transport layer and layers below  Basic communication  Reliability  Application layer.
Systems Security & Audit Operating Systems security.
Protection, identity, and trust Jeff Chase Duke University.
UNIX! Landon Cox September 3, Dealing with complexity How do you reduce the complexity of large programs? Break functionality into modules Goal.
Controlling Files Richard Newman based on Smith “Elementary Information Security”
Jozef Goetz, Application Layer PART VI Jozef Goetz, Position of application layer The application layer enables the user, whether human.
D u k e S y s t e m s CPS 210 Security in Networked Systems Always Use Protection Jeff Chase Duke University
4P13 Week 1 Talking Points. Kernel Organization Basic kernel facilities: timer and system-clock handling, descriptor management, and process Management.
Lecture 3 Process Concepts. What is a Process? A process is the dynamic execution context of an executing program. Several processes may run concurrently,
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 14: Protection.
Inter-process communication: Socket. socket Internet socket From Wikipedia, the free encyclopedia Jump to: navigation,
CE Operating Systems Lecture 3 Overview of OS functions and structure.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Protection (Chapter 14)
CE Operating Systems Lecture 21 Operating Systems Protection with examples from Linux & Windows.
(a) What is the output generated by this program? In fact the output is not uniquely defined, i.e., it is not always the same. So please give three examples.
Linux Security. Authors:- Advanced Linux Programming by Mark Mitchell, Jeffrey Oldham, and Alex Samuel, of CodeSourcery LLC published by New Riders Publishing.
14.1/21 Part 5: protection and security Protection mechanisms control access to a system by limiting the types of file access permitted to users. In addition,
CE Operating Systems Lecture 13 Linux/Unix interprocess communication.
Processes CS 6560: Operating Systems Design. 2 Von Neuman Model Both text (program) and data reside in memory Execution cycle Fetch instruction Decode.
Lecture 18 Page 1 CS 111 Online OS Use of Access Control Operating systems often use both ACLs and capabilities – Sometimes for the same resource E.g.,
Operating Systems Security
The Client-Server Model And the Socket API. Client-Server (1) The datagram service does not require cooperation between the peer applications but such.
Wireless and Mobile Security
CSC414 “Introduction to UNIX/ Linux” Lecture 2. Schedule 1. Introduction to Unix/ Linux 2. Kernel Structure and Device Drivers. 3. System and Storage.
Lecture 4 Page 1 CS 111 Online Modularity and Virtualization CS 111 On-Line MS Program Operating Systems Peter Reiher.
Lecture 4 Mechanisms & Kernel for NOSs. Mechanisms for Network Operating Systems  Network operating systems provide three basic mechanisms that support.
1 Process Description and Control Chapter 3. 2 Process A program in execution An instance of a program running on a computer The entity that can be assigned.
Silberschatz, Galvin and Gagne ©2011 Operating System Concepts Essentials – 8 th Edition Chapter 2: The Linux System Part 2.
Lecture 5 Rootkits Hoglund/Butler (Chapters 1-3).
The UNIX Time-Sharing System Landon Cox February 10, 2016.
Implementing Remote Procedure Call Landon Cox February 12, 2016.
LINUX Presented By Parvathy Subramanian. April 23, 2008LINUX, By Parvathy Subramanian2 Agenda ► Introduction ► Standard design for security systems ►
Security-Enhanced Linux Stephanie Stelling Center for Information Security Department of Computer Science University of Tulsa, Tulsa, OK
Chapter 29: Program Security Dr. Wayne Summers Department of Computer Science Columbus State University
Protection, identity, and trust Illustrated with Unix “setuid” Jeff Chase Duke University.
Information Flow Control for Standard OS Abstractions Landon Cox April 6, 2016.
Modularity Most useful abstractions an OS wants to offer can’t be directly realized by hardware Modularity is one technique the OS uses to provide better.
Kernel Design & Implementation
Introduction to Operating Systems
Protecting Memory What is there to protect in memory?
Operating Systems Protection Alok Kumar Jagadev.
Chapter 14: System Protection
Chapter 3: Process Concept
Protecting Memory What is there to protect in memory?
Protecting Memory What is there to protect in memory?
Outline What does the OS protect? Authentication for operating systems
The UNIX Time-Sharing System
Outline What does the OS protect? Authentication for operating systems
Chapter 3: Windows7 Part 4.
Introduction to Operating Systems
Chapter 14: Protection.
CE Operating Systems Lecture 21
Chapter 2: The Linux System Part 2
Lecture Topics: 11/1 General Operating System Concepts Processes
Chapter 14: Protection.
Authorization and Identity
CS510 Operating System Foundations
Preventing Privilege Escalation
Isolation Enforced by the Operating System
Presentation transcript:

Protection, identity, and trust Jeff Chase Duke University

Processes and you Your processes act on your behalf. They represent you to the outside world. They can spy on you. Why do you trust them? What is “trust” anyway? How might an attacker exploit your trust? What about servers?

The need for access control Processes run on behalf of users. (“subjects”) Processes create/read/write/delete files. (“objects”) The OS kernel mediates these accesses. How should the kernel determine which subjects can access which objects? This problem is called access control or authorization (“authz”). It also encompasses the question of who is authorized to make a given statement. The concepts are general, but we can consider Unix as an initial example.

Principles of Computer System Design Saltzer & Kaashoek 2009 Concept: reference monitor Reference monitor Example: Unix kernel Isolation boundary

More terminology A reference monitor monitors references to its objects… …by a subject (one or more), also called a principal. – I generally reserve the term “principal” for an entity with responsibility and accountability in the real world (e.g.: you). – A subject identity may be a program speaking for a user, which is distinct from the user herself. The reference monitor must determine the true and correct identity of the subject. – This is called authentication. Example: system login – We’ll return to this later. In Unix, once a user is logged in, the kernel maintains the identity across chained program executions. – Things are different on a network, where there is no single trusted kernel to anchor all interactions.

Reference monitor What is the nature of the isolation boundary? If we’re going to post a guard, there should also be a wall. Otherwise somebody can just walk in past the guard, right? subject requested operation “boundary” protected state/objects program Alic e guard identity

Example: Unix kernel The Unix kernel is an example of a reference monitor. – Reference monitor is a general concept. – We’re going to keep most of the discussion general and abstract. How does the kernel protect itself? What is the boundary? The kernel protects all objects accessed by user processes through the syscall API.

Concept: access control matrix Alice Bob obj1obj2 RW R --- We can imagine the set of all allowed accesses for all subjects over all objects as a huge matrix. How is the matrix stored?

Concept: ACLs and capabilities Alice Bob obj1obj2 RW R --- How is the matrix stored? Capabilities: each subject holds a list of its rights and presents them as proof of access rights. A capability is an unforgeable token that confers rights to access a specific object. Access control list (ACL): each object stores a list of subjects permitted to access it. capability list ACL

Concept: roles or groups Alice Bob wizardstudent yes no A role or group is a named set S of subjects. Or, equivalently, it is a named boolean predicate that is true for s iff s  S. Roles/groups provide a level of indirection that simplifies the access control matrix, and makes it easier to store and manage. roles, groups wizard student obj1obj2 objects RW R ---

Attributes and policies More generally, we may view a subject/object identity as a list of named attributes with typed values (e.g., boolean). Generally, an access policy for a requested access is a boolean (logic) function over the subject and object identities. name = “Alice” wizard = true Name = “obj1” access: wizard name = “Bob” student = true Name = “obj2” access: student

Access control: general model We use this simple model for identity and authorization. Real-world access control schemes use many variants of and restrictions of the model, with various terminology. – Role-Based Access Control (RBAC) – Attribute-Based Access Control (ABAC) A guard is the code that checks access for an object, or (alternatively) the access policy that the code enforces. There are many extensions. E.g., some roles are nested (a partial order or lattice). The guard must reason about them. – Security clearance level TopSecret  Secret [subset] – Equivalently: TopSecret(s)  Secret(s) [logical implication]

Identity in an OS (Unix) Every process has a security label that governs access rights granted to it by kernel. Abstractly: the label is a list of named attributes and values. An OS defines a space of attributes and their interpretation. Some attributes and values represent an identity bound to the process. Unix: e.g., userID: uid There is a special administrator uid==0 called root or superuser or su. Root “can do anything”: normal access checks do not apply to root. uid=“alice” Alice uid credentials Every Unix user account has an associated userID (uid). (an integer)

Unix: identity and access A Unix program running in a process with Alice’s identity can do anything that Alice herself can do. In particular, a process running as Alice can access any of Alice’s files. – Saved passwords, security keys, etc. – Programs and their configuration files – Browser history, cookies… How does Alice sleep at night? How does the system determine what program executions can run as Alice? Alice

Init and Descendents Kernel “handcrafts” initial root process to run “init” program. Other processes descend from init, and also run as root, including user login guards. Login invokes a setuid system call to run user shell in a child process after user authenticates. Children of user shell inherit the user’s identity (uid).

Protection systems Every process (or other entity) has a label that governs its access rights on the system. The label is a list of named attributes and optional values. Each system defines the space of attributes and their interpretation. Some attributes and values represent an identity bound to the process. E.g.: uid=“alice” login shell tool log in fork, setuid(“alice”), exec fork/exec uid=“alice” Every file and every process is labeled/tagged with a user ID. A root process may change its user ID. A process inherits its userID from its parent process. Alice

Unix: setuid and login A process with uid==root may change its userID with the setuid system call. This means that a root process can speak for any user or act as any user, if it tries. This mechanism enables system processes to set up an environment for a user after login. login shell tool log in fork, setuid(“alice”), exec fork/exec uid=“alice” Alice A privileged login program verifies a user password and starts a command interpreter (shell) and/or window manager for a logged-in user. A user may then interact with a shell to direct launch of other programs. They run as children of the shell, with the user’s uid.

while (1) { Char *cmd = getcmd(); int retval = fork(); if (retval == 0) { // This is the child process // Setup the child’s process environment here // E.g., where is standard I/O, how to handle signals? exec(cmd); // exec does not return if it succeeds printf(“ERROR: Could not execute %s\n”, cmd); exit(1); } else { // This is the parent process; Wait for child to finish int pid = retval; wait(pid); } Inside the Shell

Reference monitor revisited subject (with attributes) requested operation “boundary” protected state/objects program The reference monitor pattern can work with various protection mechanisms. Let’s talk about it generally. Later we’ll see how it applies to servers as well. Alic e identity

Labels and access control login shell tool foo login shell tool log in fork, setuid(“alice”), exec fork/exec creat(“foo”) write,close open(“foo”) read fork/exec fork, setuid(“bob”), exec owner=“alice” uid=“alice” uid=“bob” Every file and every process is labeled/tagged with a user ID. A process inherits its userID from its parent process. A file inherits its owner userID from its creating process. A privileged process may set its user ID. Alice Bob

Labels and access control login shell tool foo login shell tool creat(“foo”) write,close open(“foo”) read owner=“alice” uid=“alice” uid=“bob” Should processes running with Bob’s userID be permitted to open file foo? Alice Bob Every system defines rules for assigning security labels to subjects (e.g., Bob’s process) and objects (e.g., file foo). Every system defines rules to compare the security labels to authorize attempted accesses.

File permissions in Unix (vanilla) The owner of a Unix file may tag it with a mode value specifying access rights for subjects. (Don’t confuse with kernel/user mode.) – Access types = {read, write, execute} [3 bits] – Subject types = {owner, group, other/anyone} [3 bits] – If the file is executed, should the system setuid the process to the userID of the file’s owner. [1 bit, the setuid bit] – 10 bits total: (3x3)+1. Usually given in octal: e.g., “777” means 9 bits set: anyone can r/w/x the file, but no setuid. – Unix “mode bits” are a simple form of an access control list (ACL). Later systems like AFS have richer ACLs. Unix provides a syscall and shell command for owner to set the permission bits on each file or directory (inode). – chmod, chown, chgrp “Group” was added later and is a little more complicated: a user may belong to multiple groups.

Another way to setuid login shell tool log in fork setuid(“alice”) exec fork exec(“sudo”) setuid bit set /usr/bin/sudo uid=“root” Alice Example: sudo program runs as root, checks user authorization to act as superuser, and (if allowed) executes requested command as root. fork exec(“tool) sudo A program file itself may be tagged to setuid to program owner’s uid on exec. This is called setuid bit. Some call it the most important innovation of Unix. It enables users to request protected ops by running secure programs. The user cannot choose the code: only the program owner can choose the code to run with the owner’s uid. Parent cannot subvert/control a child after program launch: a property of Unix exec.

Protected programs login shell tool log in fork setuid(“alice”) exec fork/exec uid = 0 uid=“root” Alice Example: sudo program runs as root, checks user authorization to act as superuser, and (if allowed) executes requested command as root. fork exec(“tool”) sudo xkcd.com “sudo tool” With great power comes great responsibility.

The secret of setuid The setuid bit is best seen as a mechanism for programs to function as reference monitors. A setuid program can govern access to files. – Protect the files with the program owner’s uid. – Make them inaccessible to other users. – Other users can access them via the setuid program. – But only in the manner permitted by the program. – Example: “moo accounting problem” in 1974 paper (cryptic) What is the reference monitor’s “isolation boundary”? What protects it from its callers? Note: setuid can also be used to restrict the program’s privileges by stripping it of the user’s uid.

MOO accounting problem Multi-player game called Moo Want to maintain high score in a file Should players be able to update score? Yes Do we trust users to write file directly? No, they could lie about their score High score Game client (uid y) Game client (uid y) Game client (uid x) “x’s score = 10” “y’s score = 11” [Landon Cox]

MOO accounting problem Multi-player game called Moo Want to maintain high score in a file Could have a trusted process update scores Is this good enough? High score Game client (uid y) Game client (uid y) Game client (uid x) Game server “x’s score = 10” “y’s score = 11” “x:10 y:11” [Landon Cox]

MOO accounting problem Multi-player game called Moo Want to maintain high score in a file Could have a trusted process update scores Is this good enough? Can’t be sure that reported score is genuine Need to ensure score was computed correctly High score Game client (uid y) Game client (uid y) Game client (uid x) Game server “x’s score = 100” “y’s score = 11” “x:100 y:11” [Landon Cox]

Access control Insight: sometimes simple inheritance of uids is insufficient Tasks involving management of “user id” state Logging in (login) Changing passwords (passwd) Why isn’t this code just inside the kernel? This functionality doesn’t really require interaction w/ hardware Would like to keep kernel as small as possible How are “trusted” user-space processes identified? Run as super user or root (uid 0) Like a software kernel mode If a process runs under uid 0, then it has more privileges [Landon Cox]

Access control Why does login need to run as root? Needs to check username/password correctness Needs to fork/exec process under another uid Why does passwd need to run as root? Needs to modify password database (file) Database is shared by all users What makes passwd particularly tricky? Easy to allow process to shed privileges (e.g., login) passwd requires an escalation of privileges How does UNIX handle this? Executable files can have their setuid bit set If setuid bit is set, process inherits uid of image file’s owner on exec [Landon Cox]

MOO accounting problem Multi-player game called Moo Want to maintain high score in a file How does setuid solve our problem? Game executable is owned by trusted entity Game cannot be modified by normal users Users can run executable though High-score is also owned by trusted entity This is a form of trustworthy computing Only trusted code can update score Root ownership ensures code integrity Untrusted users can invoke trusted code High score Gam e client (root) Gam e client (root) Gam e client (root) Gam e client (root) Shell (uid y) Shell (uid y) Shell (uid x) Shell (uid x) “ fork/exec game” “x’s score = 10” “y’s score = 11” “ fork/exec game”

Reference monitor revisited What is the nature of the isolation boundary?

Protection: abstract concept Programs run with an ability to access some set of data/state/objects. We call that set a domain or protection domain. – Defines the “powers” that the running code has available to it. – Determined by context and/or identity (label/attributes). – The domain is not (directly) a property of the program itself. – Protection domains may overlap. Some mechanism must prevent the code from accessing state outside of its domain. Examples?

Protection Programs may invoke other programs. – E.g., call a procedure/subprogram/module. – E.g., launch a program with exec. Often/typically/generally the invoked program runs in the same protection domain as the caller. (Or i.e., it has the same privilege over some set of objects. It may also have some private state. We’re abstracting.) Examples ?

Protection Real systems always use protection. They have various means to invoke a program with more/less/different privilege than the caller. In the reference monitor pattern, B has powers that A does not have, and that A wants. B executes operations on behalf of A, with safety checks. AB Be sure that you understand the various examples from Unix. Why?

Subverting protection If we talk about protection systems, we must also talk about how they can fail. One way is for an attacker to breach the boundary. A more common way is for an attacker to somehow choose the code that executes inside the boundary. For example, it can replace the guard, install a backdoor, or just execute malicious ops directly. Install or control code inside the boundary. Inside job But how?

Program integrity

Malware [Source: somewhere on the web.] [Source: Google Chrome comics.]

Any program you install or run can be a Trojan Horse vector for a malware payload.

Trusting Programs In Unix – Programs you run use your identity (process UID). – Maybe you even saved them with setuid so others who trust you can run them with your UID. – The programs that run your system run as root. You trust these programs. – They can access your files – send mail, etc. – Or take over your system… Where did you get them?

Trusting Trust Perhaps you wrote them yourself. – Or at least you looked at the source code… You built them with tools you trust. But where did you get those tools?

Where did you get those tools? Thompson’s observation: compiler hacks cover tracks of Trojan Horse attacks.

Login backdoor: the Thompson Way Step 1: modify login.c – (code A) if (name == “ken”) login as root – This is obvious so how do we hide it? Step 2: modify C compiler – (code B) if (compiling login.c) compile A into binary – Remove code A from login.c, keep backdoor – This is now obvious in the compiler, how do we hide it? Step 3: distribute a buggy C compiler binary – (code C) if (compiling C compiler) compile code B into binary – No trace of attack in any (surviving) source code

Reflections on trust What is “trust”? – Trust is a belief that some program or entity will be faithful to its expected behavior. Thompson’s parable shows that trust in programs is based on a chain of reasoning about trust. – We often take the chain for granted. And some link may be fragile. – Corollary: successful attacks can propagate down the chain. The trust chain concept also applies to executions. – We trust whoever launched the program to select it, configure it, and protect it. But who booted your kernel?

How much damage can malware do? If it compromises a process with your uid? – Read and write your files? – Overwrite other programs? – Install backdoor entry points for later attacks? – Attack your account on other machines? – Attack your friends? – Attack other users? – Subvert the kernel? What if an attacker compromises a process running as root?

Process-based protection, isolation, interprocess communication, and servers, and…

Alic e

Isolation We need protection domains and protected contexts (“sandboxes”), even on single-user systems like your smartphone. There are various dimensions of isolation for protected contexts (e.g., processes): Fault isolation. One app or app instance (context or process) can fail independently of others. If it fails, the OS can kill the process and reclaim all of its memory, etc. Performance isolation. The OS manages resources (“metal and glass”: computing power, memory, disk space, I/O throughput capacity, network capacity, etc.). Each instance needs the “right amount” of resources to run properly. The OS prevents apps from impacting the performance of other apps. E.g., the OS can prevent a program with an endless loop from monopolizing the processor or “taking over the machine”. (How?) Security. An app may contain malware that tries to corrupt the system, steal data, or otherwise compromise the integrity of the system. The OS uses protected contexts and a reference monitor to check and authorize all accesses to data or objects outside of the context.

Example: Chrome browser [Google Chrome Comics]

Processes in the browser [Google Chrome Comics] Chrome makes an interesting choice here. But why use processes?

Problem: heap memory and fragmentation [Google Chrome Comics]

Solution: whack the whole process [Google Chrome Comics] When a process exits, all of its virtual memory is reclaimed as one big slab.

Processes for fault isolation [Google Chrome Comics]

Concept: enforced modularity pipe (or other channel) An important theme By putting each program or component instance in a separate process, we can enforce modularity boundaries among components. Each component runs in a sandbox: they can interact only through pipes or some other channel. Neither can access the internals of the other.

Communication endpoints and channels channel pipe binding connection endpoint port data transfer stream flow messages request/reply “Remote Procedure Call” (RPC) events operations advertise (bind) listen connect (bind) close write/send read/receive If one side advertises a named endpoint, we call it a server. If one side initiates a channel to a named endpoint, we call it a client.

Interprocess communication (IPC) and services RPC (API) GET/PUT (content) Clients initiate connection and send requests. Server listens for and accepts clients, handles requests, sends replies

Concept: RPC Remote Procedure Call (RPC) is request/response interaction through a published API, using IPC messaging to cross an inter- process boundary. API stubs RPC is used in many standard Internet services. It is also the basis for component frameworks like DCOM, CORBA, and Android. Software is packaged into named “objects” or components. Components may publish interfaces and/or invoke published interfaces of other components. Components may execute in different processes and/or on different nodes. generated from an Interface Description Language (IDL) Establishing an RPC connection to a named remote interface is often called binding.

Server as reference monitor What is the nature of the isolation boundary? Clients can interact with the server only by sending messages through a socket channel. The server chooses the code that handles received messages. subject requested operation “boundary” protected state/objects program Alic e guard

“Subsystems” A server process may provide trusted system functions to other processes, outside of the kernel. – E.g., this code is trusted, but like other processes it cannot manipulate the hardware state except by invoking the kernel. Android AMS subsystem JVM+lib Example: Android Activity Manager subsystem provides many functions of Android, e.g., component launch and brokering of component interactions. With no special kernel support! It uses same syscalls as anyone else. AMS controls app contexts by forking them with a trusted lib, and issuing RPC commands to that lib. Linux kernel “binder” messaging driver in kernel

Binding to a service (Android) public abstract boolean bindService (Intent service, ServiceConnection conn, int flags) Connect to an application service, creating it if needed. …The given conn will receive the service object when it is created and be told if it dies and restarts. … This function will throw SecurityException if you do not have permission to bind to the given service. In Android, binding is a protected operation.

Subverting services There are lots of security issues here. TBD Q: Is networking secure? How can the client and server authenticate over a network? How can they know the messages aren’t tampered? How to keep them private? A: crypto. TBD Q: Can an attacker inject malware scripting into my browser? What are the isolation defenses? Q for now: Can an attacker penetrate the server, e.g., to choose the code that runs in the server? Install or control code inside the boundary. Inside job But how?

“confused deputy”

wrd[3] wrd[2] wrd[1] wrd[0] const2=0 wrd[3] wrd[2] wrd[1] wrd[0] const2=0 main b= 0x00234 RA=0x804838c b= 0x00234 RA=0x804838c cap 0xfffffff 0x0 Memory void cap (char* b){ for (int i=0; b[i]!=‘\0’; i++) b[i]+=32; } int main(char*arg) { char wrd[4]; strcpy(arg, wrd); cap (wrd); return 0; } 0x x804838c Code Stack … SP 0x00234 What can go wrong? Can overflow wrd variable … Overwrite cap’s RA …

The Point You should understand the basics of a “stack smash” or “buffer overflow” attack as a basis for pathogens. These have caused a lot of pain and damage in the real world.

Networked services: big picture Internet “cloud” server hosts with server applications client applications NIC device kernel network software client host Data is sent on the network as messages called packets.

Sockets The socket() system call creates a socket object. Other socket syscalls establish a connection (e.g., connect). A file descriptor for a connected socket is bidirectional. Bytes placed in the socket with write are returned by read in order. The read syscall blocks if the socket is empty. The write syscall blocks if the socket is full. Both read and write fail if there is no valid connection. A socket is a buffered channel for passing data over a network. socket client int sd = socket( ); gethostbyname(“ connect(sd, ); write(sd, “abcdefg”, 7); read(sd, ….);

Unix “file descriptors” illustrated user space socket per-process descriptor table kernel space “open file table” Disclaimer: this drawing is oversimplified pointer There’s no magic here: processes use read/write (and other syscalls) to operate on sockets, just like any Unix I/O object (“file”). A socket can even be mapped onto stdin or stdout. Deeper in the kernel, sockets are handled differently from files, pipes, etc. Sockets are the entry/exit point for the network protocol stack. int fd pipe file tty

The network stack, simplified TCP/IP Client Network adapter Global IP Internet TCP/IP Server Network adapter Internet client hostInternet server host Sockets interface (system calls) Hardware interface (interrupts) User code Kernel code Hardware and firmware Note: the “protocol stack” should not be confused with a thread stack. It’s a layering of software modules that implement network protocols: standard formats and rules for communicating with peers over a network.

Ports and packet demultiplexing Data is sent on the network in messages called packets addressed to a destination node and port. Kernel network stack demultiplexes incoming network traffic: choose process/socket to receive it based on destination port. Network adapter hardware aka, network interface controller (“NIC”) Incoming network packets Apps with open sockets

TCP/IP Ports Each transport endpoint on a host has a logical port number (16-bit integer) that is unique on that host. This port abstraction is an Internet Protocol concept. – Source/dest port is named in every IP packet. – Kernel looks at port to demultiplex incoming traffic. What port number to connect to? – We have to agree on well-known ports for common services – Look at /etc/services – Ports 1023 and below are ‘reserved’. Clients need a return port, but it can be an ephemeral port assigned dynamically by the kernel.

A peek under the hood chase$ netstat -s tcp: packets sent data packets ( bytes) 4927 data packets ( bytes) retransmitted ack-only packets (10662 delayed) window update packets packets received acks (for bytes) duplicate acks packets ( bytes) received in-sequence completely duplicate packets ( bytes) 225 old duplicate packets 24 packets with some dup. data (2126 bytes duped) out-of-order packets ( bytes) 73 discarded for bad checksums connection requests 21 connection accepts

inetd Classic Unix systems run an inetd “internet daemon”. Inetd receives requests for standard services. – Standard services and ports listed in /etc/services. – inetd listens on all the ports and accepts connections. For each connection, inetd forks a child process. Child execs the service configured for the port. Child executes the request, then exits. [Apache Modeling Project: Modeling Project:

Children of init: inetd New child processes are created to run network services. They may be created on demand on connect attempts from the network for designated service ports. Should they run as root?

Multi-process server architecture Each of P processes can execute one request at a time, concurrently with other processes. If a process blocks, the other processes may still make progress on other requests. Max # requests in service concurrently == P The processes may loop and handle multiple requests serially, or can fork a process per request. – Tradeoffs? Examples: – inetd “internet daemon” for standard /etc/services – Design pattern for (Web) servers: “prefork” a fixed number of worker processes.