TELE 402 Lecture 10: Unix domain …

Overview
Last Lecture
–Daemon processes and advanced I/O functions
This Lecture
–Unix domain protocols, nonblocking I/O, and ioctl operations
–Source: Chapters 15, 16, and 17 of Stevens’ book
Next Lecture
–Advanced UDP sockets and threads
–Source: Chapters 22 and 26 of Stevens’ book

Unix domain sockets
A way of performing client-server communication on a single host using the same socket API
Two types: stream and datagram
Why use Unix domain sockets?
–When both peers are on the same host, Unix domain sockets are often twice as fast as TCP sockets (example: the X Window System uses them for local clients)
–They can be used to pass descriptors between processes on the same host
–Newer implementations can provide the client’s credentials (user ID and group IDs) to the server for an additional security check

Unix domain socket protocol address
Addresses are pathnames within the normal filesystem
These files cannot be read from or written to except through a socket

Socket address structure
    struct sockaddr_un {
        sa_family_t sun_family;
        char        sun_path[104];
    };
sun_family should be AF_LOCAL (AF_UNIX is the historical name for the same family).
sun_path is a pathname string terminated with a \0. Its size is implementation-dependent: 104 bytes on BSD-derived systems, 108 on Linux.
The unspecified address is indicated by a null string as the pathname.
The pathname should be an absolute pathname, not a relative pathname.
The macro SUN_LEN takes a pointer to a sockaddr_un structure and returns the length of the structure, including the non-null bytes of the pathname.

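The address structure above might be filled in and bound roughly as follows. This is a minimal sketch, not code from the lecture: the helper name bind_unix_stream is hypothetical, and it passes sizeof(addr) to bind rather than SUN_LEN, which is an assumption made for portability.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: create a Unix domain stream socket bound to `path`.
 * Returns the descriptor, or -1 on error. */
int bind_unix_stream(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;                      /* pathname would not fit in sun_path */

    fd = socket(AF_LOCAL, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    unlink(path);                       /* bind fails if the pathname already exists */

    memset(&addr, 0, sizeof(addr));     /* zero fill also null-terminates sun_path */
    addr.sun_family = AF_LOCAL;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    /* sizeof(addr) is used here instead of SUN_LEN(&addr); both are accepted. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would pass an absolute pathname, use the returned descriptor with listen and accept, and unlink the pathname when the server exits.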
socketpair function 1
Creates two sockets that are then connected together
    int socketpair(int family, int type, int protocol, int sockfd[2]);
family must be AF_LOCAL
protocol must be 0
type can be either SOCK_STREAM or SOCK_DGRAM

socketpair function 2
The two sockets created are returned as sockfd[0] and sockfd[1]. Both are unnamed: no implicit bind is involved.
If their type is SOCK_STREAM, they form a stream pipe, which is full-duplex.

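To illustrate the full-duplex property, the sketch below sends one short message in each direction over the same pair. The function name socketpair_roundtrip and the message contents are made up for this example.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>

/* Hypothetical demo: send one message in each direction over a stream pipe.
 * Returns 0 if both messages arrive intact, -1 otherwise. */
int socketpair_roundtrip(void)
{
    int sockfd[2];
    char buf[16];
    int ok = -1;

    if (socketpair(AF_LOCAL, SOCK_STREAM, 0, sockfd) < 0)
        return -1;

    /* First direction: sockfd[0] to sockfd[1]. */
    if (write(sockfd[0], "ping", 4) == 4 &&
        read(sockfd[1], buf, sizeof(buf)) == 4 &&
        memcmp(buf, "ping", 4) == 0 &&
        /* Second direction: the same pair carries the reply back. */
        write(sockfd[1], "pong", 4) == 4 &&
        read(sockfd[0], buf, sizeof(buf)) == 4 &&
        memcmp(buf, "pong", 4) == 0)
        ok = 0;

    close(sockfd[0]);
    close(sockfd[1]);
    return ok;
}
```

In practice the two descriptors usually end up in different processes, for example by calling socketpair before fork and closing one end in each process.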
Differences from inet sockets 1
–The default file permissions for a pathname created by bind should be 0777, modified by the current umask value.
–The pathname associated with a Unix domain socket should be an absolute pathname, not a relative pathname.
–The pathname specified in a call to connect must be currently bound to an open Unix domain socket of the same type (stream or datagram).
–A bind fails if the pathname already exists, so call unlink before bind.
–The permission testing for connect on a Unix domain socket is the same as if open had been called for write-only access.

Differences from inet sockets 2
–Unix domain stream sockets are similar to TCP sockets: they provide a byte stream with no record boundaries.
–If a call to connect on a stream socket finds that the listening socket’s queue is full, ECONNREFUSED is returned immediately.
–Unix domain datagram sockets are similar to UDP sockets: they provide an unreliable datagram service that preserves record boundaries.
–Unlike UDP, sending a datagram on an unbound Unix domain datagram socket does not bind a pathname to the socket. The sender must call bind itself if it wants the receiver to be able to reply.

Passing descriptors 1
Descriptors can be shared between processes in the following ways
–A child process shares all the open descriptors with the parent after a call to fork
–All descriptors normally remain open when exec is called
–Descriptors can be passed between any two processes over a Unix domain socket using sendmsg and recvmsg

Passing descriptors 2
Steps involved in passing a descriptor
–Create Unix domain sockets (preferably SOCK_STREAM) and connect them for communication between a server and a client
–One process opens a descriptor. Any type of descriptor can be exchanged.
–The sender builds a msghdr structure containing the descriptor to be passed, and calls sendmsg with the structure across one of the Unix domain sockets
–The receiver calls recvmsg to receive the descriptor from the other Unix domain socket
The client and server must have an application protocol so that the receiver knows when a descriptor is to be passed.

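The sender and receiver steps can be sketched as below, assuming the descriptor travels as SCM_RIGHTS ancillary data alongside one byte of ordinary data. The names send_fd, recv_fd, and union fd_control are hypothetical; the course's own versions are lib/write_fd.c and lib/read_fd.c and may differ in detail.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Control buffer sized for one descriptor; the union keeps it aligned
 * for struct cmsghdr. */
union fd_control {
    struct cmsghdr cm;
    char control[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor `sendfd` plus one byte of data over `sock`.
 * Returns the number of data bytes sent (1) or -1 on error. */
ssize_t send_fd(int sock, int sendfd)
{
    struct msghdr msg;
    struct iovec iov;
    struct cmsghdr *cmptr;
    union fd_control u;
    char byte = 0;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.control;
    msg.msg_controllen = sizeof(u.control);

    cmptr = CMSG_FIRSTHDR(&msg);
    cmptr->cmsg_len = CMSG_LEN(sizeof(int));
    cmptr->cmsg_level = SOL_SOCKET;
    cmptr->cmsg_type = SCM_RIGHTS;      /* ancillary data carries a descriptor */
    memcpy(CMSG_DATA(cmptr), &sendfd, sizeof(int));

    return sendmsg(sock, &msg, 0);
}

/* Receive one byte of data and the descriptor that accompanies it.
 * Returns the new descriptor, or -1 if none was passed or on error. */
int recv_fd(int sock)
{
    struct msghdr msg;
    struct iovec iov;
    struct cmsghdr *cmptr;
    union fd_control u;
    char byte;
    int fd;

    memset(&msg, 0, sizeof(msg));
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.control;
    msg.msg_controllen = sizeof(u.control);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    cmptr = CMSG_FIRSTHDR(&msg);
    if (cmptr == NULL || cmptr->cmsg_len != CMSG_LEN(sizeof(int)) ||
        cmptr->cmsg_level != SOL_SOCKET || cmptr->cmsg_type != SCM_RIGHTS)
        return -1;

    memcpy(&fd, CMSG_DATA(cmptr), sizeof(int));
    return fd;
}
```

The received descriptor generally has a different number from the one that was sent, but it refers to the same open file. The sender may close its copy after sendmsg returns.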
Example 1
Refer to unixdomain/mycat.c, unixdomain/myopen.c, unixdomain/openfile.c, lib/read_fd.c, and lib/write_fd.c

Example 2

Passing user credentials 1
User credentials (user ID, group IDs) can be passed along a Unix domain socket as the fcred structure
    struct fcred {
        uid_t fc_ruid;
        gid_t fc_rgid;
        char  fc_login[MAXLOGNAME];
        uid_t fc_uid;
        short fc_ngroups;
        gid_t fc_groups[NGROUPS];
    };

Passing user credentials 2
The credentials are always available on a Unix domain socket, subject to the following conditions
–The credentials are sent as ancillary data when data is sent on the Unix domain socket, but only if the receiver of the data has enabled the LOCAL_CREDS socket option. The level for this option is 0.
–On a datagram socket, the credentials accompany every datagram. On a stream socket, the credentials are sent only once, the first time data is sent.
–Credentials cannot be sent along with a descriptor
–Users are not able to forge credentials

Distributed Shared Memory
Use a local duplicate of the shared memory
Consistency maintenance
–Message passing based on UDP
–Stop-and-wait protocol
–Client/server model
–Two connections between any pair of nodes

Blocking and nonblocking 1
Input operations: read, readv, recv, recvfrom, and recvmsg
–Blocking: if there is no data available in the socket receive buffer, the process is put to sleep
–Nonblocking: if there is no data available, the call returns immediately with an error of EWOULDBLOCK

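A minimal sketch of the nonblocking input case follows. The helper names set_nonblocking and read_nowait and the READ_WOULD_BLOCK return code are assumptions for illustration. POSIX allows the error to be reported as either EWOULDBLOCK or EAGAIN, so the sketch checks both.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#define READ_WOULD_BLOCK (-2)   /* hypothetical code: no data available right now */

/* Turn on O_NONBLOCK while preserving the other file status flags.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Read without sleeping. Returns the byte count, 0 on end of file,
 * READ_WOULD_BLOCK if nothing is available yet, or -1 on a real error. */
ssize_t read_nowait(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
        return READ_WOULD_BLOCK;
    return n;
}
```

A caller would normally combine this with select, so that it only retries once the descriptor is reported readable instead of polling in a loop.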
Blocking and nonblocking 2
Output operations: write, writev, send, sendto, and sendmsg
–Blocking: if there is no room in the socket send buffer, the process is put to sleep
–Nonblocking: if there is no room at all in the socket send buffer, the call returns immediately with an error of EWOULDBLOCK
–In general, output on a UDP socket does not block, since UDP has no actual socket send buffer. Some implementations might still block in the kernel due to buffering and flow control.

Blocking and nonblocking 3
Accepting incoming connections: accept
–Blocking: if there is no new connection available, the process is put to sleep
–Nonblocking: if there is no new connection available, the call returns immediately with an error of EWOULDBLOCK
Initiating outgoing connections: connect
–Blocking: the process is blocked for at least one round-trip time (RTT) to the server
–Nonblocking: if the connection cannot be established immediately, connection establishment is initiated and the error EINPROGRESS is returned
–Some connections can be established immediately, e.g. when the server and the client are on the same host

Example 1
Nonblocking reads and writes for the str_cli function (refer to strclinonb.c)

Example 1 (cont.)
Buffer for data from standard input going to the socket

Example 1 (cont.)
Buffer for data from the socket going to standard output

Example: nonblocking timeline

Simple version of Example 1
Use fork to remove the blocking factors
–Refer to strclifork.c

Nonblocking connect 1
To set a socket nonblocking, use fcntl to set the O_NONBLOCK flag
Three uses for nonblocking connect
–Overlap other processing with the three-way handshake (use select later to test whether the connection has completed)
–Establish multiple connections at the same time
–Shorten the timeout for connect by using select with a specified time limit
Examples
–Overlap other processing with the three-way handshake
–Use select to shorten the timeout

Nonblocking connect 2
There are a couple of details to attend to if we use this technique:
–If the server is on the same host, the connection is normally established immediately, so connect may return 0 instead of EINPROGRESS. We need to handle this case.
–Berkeley-derived implementations follow these rules: the descriptor becomes writable when the connection completes successfully; if connection establishment encounters an error, the descriptor becomes both readable and writable.

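One way to combine these points is sketched below, loosely following the connect_nonb pattern from Stevens' book. It is a simplified sketch under stated assumptions, not the course's code: it fetches SO_ERROR instead of relying on the readable/writable rules, it leaves closing the socket on failure to the caller, and connect_loopback is a hypothetical usage helper.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Connect with a time limit of nsec seconds (0 means no limit).
 * Returns 0 on success, -1 with errno set on failure or timeout. */
int connect_nonb(int sockfd, const struct sockaddr *sa, socklen_t salen, int nsec)
{
    int flags, n, error = 0;
    socklen_t len = sizeof(error);
    fd_set rset, wset;
    struct timeval tval;

    flags = fcntl(sockfd, F_GETFL, 0);
    if (flags < 0 || fcntl(sockfd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    n = connect(sockfd, sa, salen);
    if (n < 0 && errno != EINPROGRESS)
        return -1;

    if (n != 0) {                       /* handshake still in progress */
        FD_ZERO(&rset);
        FD_SET(sockfd, &rset);
        wset = rset;
        tval.tv_sec = nsec;
        tval.tv_usec = 0;

        n = select(sockfd + 1, &rset, &wset, NULL, nsec ? &tval : NULL);
        if (n == 0) {
            errno = ETIMEDOUT;          /* time limit expired */
            return -1;
        }
        if (n < 0)
            return -1;
        /* Readable and/or writable: SO_ERROR tells success from failure. */
        if (getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &error, &len) < 0)
            return -1;
    }
    /* n == 0 from connect means the connection completed immediately. */

    fcntl(sockfd, F_SETFL, flags);      /* restore the original flags */
    if (error) {
        errno = error;
        return -1;
    }
    return 0;
}

/* Hypothetical usage: connect to 127.0.0.1:port within nsec seconds.
 * Returns the connected descriptor, or -1 on failure. */
int connect_loopback(unsigned short port, int nsec)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (connect_nonb(fd, (struct sockaddr *)&addr, sizeof(addr), nsec) < 0) {
        close(fd);                      /* a failed socket cannot be reused */
        return -1;
    }
    return fd;
}
```

Checking SO_ERROR works whether the descriptor was reported as writable only or as both readable and writable, which avoids depending on implementation-specific select behaviour.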
Web client
Use multiple connections to send requests and to receive responses (refer to nonblock/web.c)
Control flow
–Associate each request with a nonblocking socket and initiate its connection, up to the maximum number of connections allowed at once
–Use select to wait for any socket to be ready
–Scan the request array to find out which sockets are readable or writable, and react accordingly: if writable, send the request; if readable, receive the response
–Repeat the above until all requests are processed

Nonblocking accept
Normally a nonblocking accept is not necessary if we use select, since a completed connection must exist when select reports the listening socket as readable.
However, if the server is busy with something else between the return from select and the call to accept, the client may abort the connection by sending an RST in the meantime. The completed connection is then removed from the queue, and accept blocks until another connection arrives.
To fix the problem
–Always set a listening socket nonblocking
–Ignore the following errors on the call to accept: EWOULDBLOCK, ECONNABORTED, EPROTO, and EINTR

ioctl
ioctl has traditionally been the system interface used for everything that did not fit into some other nicely defined category
POSIX is getting rid of ioctl
However, numerous ioctls remain for implementation-dependent features related to network programming
–Obtaining the interface information
–Accessing the routing table
–Accessing the ARP cache
The ioctls introduced here are implementation-dependent and may not be supported by Linux

ioctl
    int ioctl(int fd, int request, ... /* void *arg */);
Requests can be divided into six categories
–Socket operations
–File operations
–Interface operations
–ARP cache operations
–Routing table operations
–STREAMS system

Interface configuration
Get interface configuration information
–Use the SIOCGIFCONF, SIOCGIFFLAGS, and SIOCGIFBRDADDR requests, among others
–An ifconf structure is the argument to SIOCGIFCONF; the per-interface requests such as SIOCGIFFLAGS take an ifreq structure

SIOCGIFCONF