More on Socket API
More on Socket I/O Functions Scatter read and gather write
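A scatter read and a gather write can be sketched with readv and writev. The helper names send_two_parts and recv_two_parts are hypothetical, chosen only for this illustration; the sketch assumes a descriptor that delivers the whole message in one call.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Gather write: send a header and a body with a single writev() call.
 * Hypothetical helper. Returns bytes written, or -1 on error. */
ssize_t send_two_parts(int fd, const char *hdr, const char *body)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)hdr;
    iov[0].iov_len  = strlen(hdr);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = strlen(body);
    return writev(fd, iov, 2);
}

/* Scatter read: one readv() fills hdr first (up to hdrlen bytes),
 * then continues into body. Returns bytes read, or -1 on error. */
ssize_t recv_two_parts(int fd, char *hdr, size_t hdrlen,
                       char *body, size_t bodylen)
{
    struct iovec iov[2];

    iov[0].iov_base = hdr;
    iov[0].iov_len  = hdrlen;
    iov[1].iov_base = body;
    iov[1].iov_len  = bodylen;
    return readv(fd, iov, 2);
}
```

On a stream socket a single readv may return fewer bytes than were sent, so real code would loop until the expected length has arrived.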
More Advanced Socket I/O Functions
Ancillary data - cmsghdr Structure
Socket I/O Summary
Socket and Standard I/O
  Buffering in the standard I/O library
    fully buffered: all streams except terminal devices
    line buffered: terminal devices
    unbuffered: stderr
  Caution: standard I/O functions (fgets, fputs) used on a socket are fully buffered
Name and Address Conversions
Domain Name System
  A lookup mechanism for translating objects into other objects
  A globally distributed, loosely coherent, scalable, reliable, dynamic database
  Comprised of three components
    Name space
    Servers making that name space available
    Resolvers (clients) which query the servers about the name space
Domain Name Space
  DNS's distributed database is indexed by domain names.
  Each domain name is essentially just a path in a large inverted tree, called the domain name space.
Domain, Delegation, Zone
Name Server Architecture
  [Figure: a name server process made up of authoritative data (primary master and slave zones), cache data (responses from other name servers), and an agent (looks up queries on behalf of resolvers). Authoritative data is loaded from a zone data file on disk or by zone transfer from a master server.]
Authoritative Data
  [Figure: a resolver sends a query to the name server process and receives a response from its authoritative data.]
Using Other Name Servers
  [Figure: a resolver sends a query; the agent queries an arbitrary name server, receives its response, and returns a response to the resolver.]
Cached Data
  [Figure: a resolver sends a query to the name server process and receives a response from its cache data.]
Name Resolution
  A DNS query has three parameters:
    A domain name (e.g., ice.hufs.ac.kr). Remember, every node has a domain name!
    A class (e.g., IN)
    A type (e.g., A)
  DNS message format
    Header
    Question: the question for the name server
    Answer: RRs answering the question
    Authority: RRs pointing toward an authority
    Additional: RRs holding additional information
Mapping Addresses to Names
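For IPv4, address-to-name mapping is done by looking up a name under in-addr.arpa whose labels are the address octets in reverse order. A minimal sketch of building that name follows; reverse_name is a hypothetical helper, not part of any library.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/* Builds the in-addr.arpa name for a dotted-decimal IPv4 address.
 * Hypothetical helper. Returns 0 on success, -1 if ip is not a valid
 * IPv4 address or out is too small. */
int reverse_name(const char *ip, char *out, size_t outlen)
{
    struct in_addr addr;
    const unsigned char *b;
    int n;

    if (inet_pton(AF_INET, ip, &addr) != 1)
        return -1;

    /* s_addr is in network byte order, so b[0] is the first octet. */
    b = (const unsigned char *)&addr.s_addr;
    n = snprintf(out, outlen, "%u.%u.%u.%u.in-addr.arpa",
                 b[3], b[2], b[1], b[0]);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

The resulting name is what a resolver would query with type PTR.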
Resource Records
  Major RRs
    SOA (Start Of Authority) record: indicates that the name server holds authoritative data for the domain
    NS (Name Server) record: identifies the name server to which the domain is delegated
    A record: maps a domain name to an IPv4 address
    AAAA record: maps a domain name to an IPv6 address
    CNAME record: defines an alias (another name) for a domain name
    MX (Mail eXchanger) record: controls the mail routing path for the host
    PTR (Pointer) record: reverse-maps an IP address to a domain name; used in reverse zone files
  RR format
    NAME (variable length) | TYPE (2 bytes) | CLASS (2 bytes) | TTL (4 bytes) | RDLENGTH (2 bytes) | RDATA (variable length)
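The fixed-size part of a resource record (TYPE, CLASS, TTL, RDLENGTH) can be decoded from the wire with a few shifts. The sketch below assumes the caller has already skipped the variable-length NAME field; struct rr_fixed and parse_rr_fixed are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-size RR fields, converted to host byte order. */
struct rr_fixed {
    uint16_t type;      /* e.g., 1 = A */
    uint16_t rr_class;  /* e.g., 1 = IN */
    uint32_t ttl;       /* seconds */
    uint16_t rdlength;  /* length of RDATA that follows */
};

/* p points just past NAME; len is the number of bytes available.
 * All fields are big-endian on the wire.
 * Returns 0 on success, -1 if fewer than 10 bytes are available. */
int parse_rr_fixed(const unsigned char *p, size_t len, struct rr_fixed *out)
{
    if (len < 10)
        return -1;
    out->type     = (uint16_t)((p[0] << 8) | p[1]);
    out->rr_class = (uint16_t)((p[2] << 8) | p[3]);
    out->ttl      = ((uint32_t)p[4] << 24) | ((uint32_t)p[5] << 16) |
                    ((uint32_t)p[6] << 8)  |  (uint32_t)p[7];
    out->rdlength = (uint16_t)((p[8] << 8) | p[9]);
    return 0;
}
```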
Name and Address Conversion Functions
  Domain name -> IPv4 address: gethostbyname
  IPv4 address -> domain name: gethostbyaddr
  gethostbyname/gethostbyaddr are not reentrant! The result lives in static storage that the next call overwrites:

    static struct hostent host;    /* result stored here */

    struct hostent *
    gethostbyname(const char *hostname)
    {
        /* call DNS functions for A or AAAA query */
        /* fill in host structure */
        return (&host);
    }
Service Name Conversion Functions
  Service name -> port: getservbyname (see /etc/services)
  Port -> service name: getservbyport
Example: names/daytimetcpcli1.c
Network-related information
New Name/Address Conversion Function (1)
  getaddrinfo arguments
    hostname: hostname or address string
    service: service name or decimal port number string
    result: addrinfo data structure, dynamically allocated (release with freeaddrinfo)
  Re-entrant, thread-safe, and protocol-independent (supports IPv4 and IPv6)
New Name/Address Conversion Function (2)
  hints: NULL or pointer to an addrinfo data structure
  The following members can be set by the caller
    ai_flags     /* AI_PASSIVE for server, AI_CANONNAME */
    ai_family    /* AF_xxx */
    ai_socktype  /* SOCK_xxx */
    ai_protocol  /* 0 or IPPROTO_xxx for IPv4 and IPv6 */
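A call to getaddrinfo with a hints structure might look like the sketch below. The wrapper resolve_first is hypothetical and keeps only the first returned address; real clients usually walk the whole ai_next list.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolves host/service for a TCP socket and copies the first result
 * into *out. Hypothetical wrapper.
 * passive != 0 sets AI_PASSIVE (server side, for bind).
 * Returns 0 on success or a getaddrinfo error code (see gai_strerror). */
int resolve_first(const char *host, const char *service, int passive,
                  struct sockaddr_storage *out, socklen_t *outlen)
{
    struct addrinfo hints, *res;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags    = passive ? AI_PASSIVE : 0;
    hints.ai_family   = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;    /* TCP */

    rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0)
        return rc;

    memcpy(out, res->ai_addr, res->ai_addrlen);
    *outlen = res->ai_addrlen;
    freeaddrinfo(res);                  /* result list is dynamically allocated */
    return 0;
}
```

Passing a numeric address string and a decimal port, as in resolve_first("127.0.0.1", "8080", 0, &ss, &len), needs no DNS query.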
getaddrinfo Actions and Results
UNP Library Functions
  #include "unp.h"

  struct addrinfo *host_serv(const char *hostname, const char *service, int family, int socktype);
    Returns: pointer to addrinfo structure if OK, NULL on error

  int tcp_connect(const char *hostname, const char *service);
    Returns: connected socket descriptor if OK, no return on error
  int tcp_listen(const char *hostname, const char *service, socklen_t *lenptr);
    Returns: listening socket descriptor if OK, no return on error

  int udp_client(const char *hostname, const char *service, void **saptr, socklen_t *lenp);
    Returns: unconnected socket descriptor if OK, no return on error
    saptr: address of a pointer to a socket address structure that stores the destination IP address and port number for future calls to sendto
  int udp_connect(const char *hostname, const char *service);
    Returns: connected socket descriptor if OK, no return on error
  int udp_server(const char *hostname, const char *service, socklen_t *lenptr);
    Returns: unconnected socket descriptor if OK, no return on error
tcp_connect and tcp_listen Functions
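The idea behind these two functions can be sketched on top of getaddrinfo. This is not the UNP library source: tcp_connect_sketch and tcp_listen_sketch are hypothetical names, they return -1 on failure instead of terminating the process, and the listener omits the lenptr argument and uses an arbitrary backlog of 16.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Tries each address returned for host/service until connect() succeeds.
 * Returns a connected descriptor, or -1 on failure. */
int tcp_connect_sketch(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1, rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if ((rc = getaddrinfo(host, service, &hints, &res)) != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;                   /* try the next address */
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                      /* success */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}

/* Binds and listens on the first usable address for host/service.
 * host may be NULL for the wildcard address.
 * Returns a listening descriptor, or -1 on failure. */
int tcp_listen_sketch(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1, rc, on = 1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags    = AI_PASSIVE;
    hints.ai_family   = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if ((rc = getaddrinfo(host, service, &hints, &res)) != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
        if (bind(fd, ai->ai_addr, ai->ai_addrlen) == 0 && listen(fd, 16) == 0)
            break;                      /* success */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

Because neither function names an address family, the same code serves IPv4 and IPv6 hosts.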
Protocol-Independent Daytime TCP Client/Server
  names/daytimetcpsrv1.c
  names/daytimetcpcli.c
Protocol-Independent Daytime UDP Client/Server
  names/daytimeudpsrv2.c
  names/daytimeudpcli1.c
getnameinfo Function
  If the caller does not want the host (serv) string returned, specify a hostlen (servlen) of 0
  flags: NI_DGRAM, NI_NAMEREQD, NI_NOFQDN, NI_NUMERICHOST, NI_NUMERICSERV
Socket Options
getsockopt and setsockopt Functions
Generic Socket Options (1)
  SO_BROADCAST
    Enables the process to send broadcast messages
    The application must set this option before broadcasting
  SO_DEBUG
    The kernel keeps track of detailed information about all packets sent or received by TCP for the socket
  SO_DONTROUTE
    Bypasses the normal routing mechanism
    The packet is directed to the appropriate local interface. If the local interface cannot be determined (i.e., the destination is not on the same network), ENETUNREACH is returned.
  SO_ERROR
    Gets the value of the pending error
    When an error occurs on a socket, so_error (the pending error) is set first, and errno is set afterwards
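Setting a flag option and fetching a value option follow the same pattern. The sketch below uses SO_BROADCAST and SO_ERROR; enable_broadcast and pending_error are hypothetical helper names.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Turns on SO_BROADCAST. Returns 0 on success, -1 on error. */
int enable_broadcast(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on));
}

/* Fetches the socket's pending error (0 means no error pending).
 * Returns -1 if getsockopt itself fails. */
int pending_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}
```

A typical use of pending_error is after a non-blocking connect completes, to find out whether the connection succeeded.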
Generic Socket Options (2)
  SO_KEEPALIVE
    Makes TCP automatically send a keepalive probe to the peer if no data has been exchanged for 2 hours (to detect whether the peer host has crashed)
    The peer's response is one of:
      ACK: everything is OK (the application is not notified)
      RST: the peer host has crashed and rebooted. The socket's pending error is set to ECONNRESET and the socket is closed
      No response: TCP sends 8 additional probes, 75 sec apart, and waits for a response (11 min 15 sec in total). If there is still no response, the pending error is set to ETIMEDOUT and the socket is closed
      ICMP error received: EHOSTUNREACH or …
    To shorten the probe period
      Use the TCP_KEEPALIVE option (not all implementations support it). Where the keepalive timers are kernel-wide parameters, changing them affects all sockets on the host.
      Or implement your own heartbeat mechanism.
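Enabling keepalive on one socket can be sketched as follows. The per-socket idle time uses TCP_KEEPIDLE, which is an assumption about the platform: it is the Linux name for this setting, so the call is compiled only where the macro exists. enable_keepalive is a hypothetical helper.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Enables keepalive probes on a TCP socket. Hypothetical helper.
 * idle_secs is applied only where TCP_KEEPIDLE is available (assumed
 * Linux-style option); elsewhere the system default idle time is kept.
 * Returns 0 on success, -1 on error. */
int enable_keepalive(int fd, int idle_secs)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
#ifdef TCP_KEEPIDLE
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
#else
    (void)idle_secs;    /* no per-socket idle time on this platform */
#endif
    return 0;
}
```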
Generic Socket Options (3)
  SO_LINGER: linger when the socket is closed
  Data type for option value: struct linger (members l_onoff and l_linger)
  If l_onoff == 0: default close(), i.e., close() returns immediately, but the remaining data is still delivered to the peer
  Else (l_onoff != 0): linger when the socket is closed
    If l_linger == 0: TCP aborts the connection, i.e., TCP discards any data still remaining in the socket send buffer and sends an RST to the peer
    Else: close() lingers while any data still remaining in the socket send buffer is sent. If the linger time expires first, close() returns EWOULDBLOCK and any remaining data is discarded
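The three cases above map onto the two members of struct linger. A minimal sketch, with set_linger as a hypothetical helper:

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Sets SO_LINGER on fd. Hypothetical helper.
 *   onoff == 0               : default close() behaviour
 *   onoff != 0, seconds == 0 : close() aborts the connection (RST)
 *   onoff != 0, seconds  > 0 : close() lingers up to that many seconds
 * Returns 0 on success, -1 on error. */
int set_linger(int fd, int onoff, int seconds)
{
    struct linger lg;

    lg.l_onoff  = onoff;
    lg.l_linger = seconds;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

For example, set_linger(fd, 1, 0) followed by close(fd) discards unsent data and resets the connection.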
Generic Socket Options (4)
  SO_OOBINLINE
    OOB data is placed in the normal input queue
  SO_RCVBUF and SO_SNDBUF
    Set or get the socket receive buffer size / socket send buffer size
    TCP: advertises the receive buffer (window) to the peer
      sliding window mechanism (flow control)
      buffer size: at least 3 * MSS
    UDP: typical send buffer size 9,000 bytes, receive buffer size 40,000 bytes
      If the data received exceeds the available receive buffer, the datagram is discarded
    Capacity of a full-duplex pipe (bandwidth-delay product)
      bandwidth * RTT (<= 64 KB)
  SO_RCVLOWAT, SO_SNDLOWAT: socket low-water marks
    data in socket receive buffer >= SO_RCVLOWAT -> readable
    available space in socket send buffer >= SO_SNDLOWAT -> writable
    SO_RCVLOWAT = 1, SO_SNDLOWAT = 2048 by default
  SO_RCVTIMEO, SO_SNDTIMEO: socket timeouts for receiving and sending messages
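A receive timeout is set with a struct timeval. The sketch below wraps recv so that a timeout is reported distinctly; recv_timed and its -2 return value are conventions invented for this example.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Receives with a timeout of ms milliseconds (ms > 0) using SO_RCVTIMEO.
 * Hypothetical helper. Returns the number of bytes received, -2 if the
 * timeout expired, or -1 on any other error. */
ssize_t recv_timed(int fd, void *buf, size_t len, long ms)
{
    struct timeval tv;
    ssize_t n;

    tv.tv_sec  = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
        return -1;

    n = recv(fd, buf, len, 0);
    /* An expired timeout is reported as EAGAIN or EWOULDBLOCK. */
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

The timeout stays in effect for later reads on the same socket; a timeval of zero restores blocking without a time limit.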
Generic Socket Options (6)
  SO_REUSEADDR: reuse a local address and port that are already in use
    Allows a listening server to start and bind its well-known port even if previously established connections exist that use this port
    Allows multiple instances of the same server to be started on the same port, as long as each instance binds a different local IP address
      e.g., hosting multiple HTTP servers using the IP alias technique
    Allows a single process to bind the same port to multiple sockets, as long as each bind specifies a different local IP address
    Allows completely duplicate bindings, where the same IP address and port are already bound to another socket
      Used with multicasting to allow the same application to be run multiple times on the same host when the SO_REUSEPORT option is not supported
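The first use, restarting a listening server, requires setting the option before bind. A minimal IPv4-only sketch; listen_reuseaddr is a hypothetical helper and the backlog of 16 is arbitrary.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a TCP socket listening on the wildcard address and the given
 * port (0 lets the kernel pick one), with SO_REUSEADDR set before bind().
 * Hypothetical helper. Returns the descriptor, or -1 on error. */
int listen_reuseaddr(unsigned short port)
{
    struct sockaddr_in sin;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family      = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);
    sin.sin_port        = htons(port);

    /* The option must be set before bind() to have any effect. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0 ||
        bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```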
TCP Socket Options
  TCP_KEEPALIVE: specifies the idle time in seconds for the connection before TCP starts sending keepalive probes
    Default: 7200 sec (2 hours)
    Effective only when the SO_KEEPALIVE socket option is enabled
  TCP_MAXSEG: fetch or set the MSS for a TCP connection
  TCP_NODELAY: disables the Nagle algorithm
    Relevant to applications that send many small packets, e.g., rlogin, telnet
    TCP's Nagle algorithm
      No small packet (< MSS) is sent until the existing data is ACKed
    TCP's delayed ACK algorithm
      TCP sends an ACK after a small delay (50 ~ 200 msec), hoping to piggyback it on outgoing data
Nagle Algorithm Enabled/Disabled
  Nagle algorithm enabled: no small packet (< MSS) is sent until the existing data is ACKed
  Nagle algorithm disabled: small packets are sent immediately, without waiting for the ACK
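Disabling the Nagle algorithm on one connection is a single setsockopt call at the IPPROTO_TCP level. A minimal sketch, with disable_nagle as a hypothetical helper:

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Sets TCP_NODELAY so that small writes are sent without waiting for
 * outstanding data to be ACKed. Hypothetical helper.
 * Returns 0 on success, -1 on error. */
int disable_nagle(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}
```

An alternative that keeps the Nagle algorithm enabled is to combine small pieces into one write, for example with writev.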