TDC561 Network Programming Camelia Zlatea, PhD Week 8: Multicasting; Socket Options;
Network Programming (TDC561) Winter 2003
References
W. Richard Stevens, UNIX Network Programming, Volume 1: Networking APIs: Sockets and XTI, 2nd edition, 1998 (ISBN X)
– Chap. 7, 11, 19, 21, 22
Addressing in the Internet
Addressing is tied to reachability
– Every host interface has its own IP address
– Router interfaces usually have their own IP addresses
IP here is version 4 (IPv4 addresses)
– 4 bytes long
– Two-part hierarchy
  » network number and host number
– Different types of boundary indicator
  » class, subnet mask, prefix
– The goal of boundaries is address aggregation
Address classes
Historical first choice
– Fixed network-host partition, with 8 bits of network number
Generalization
– Class A addresses have 8 bits of network number
– Class B addresses have 16 bits of network number
– Class C addresses have 24 bits of network number
Distinguished by leading bits of address
– leading 0 => class A (first byte < 128)
– leading 10 => class B (first byte in the range 128-191)
– leading 110 => class C (first byte in the range 192-223)
– leading 1110 => class D (multicast)
– leading 1111 => class E (reserved)
Address evolution
The class-based scheme was too inflexible
Two problems
– Too many routes
– Too few addresses
Four extensions
– Subnetting (flexible boundaries within a network)
– CIDR (Classless Inter-Domain Routing: flexible grouping of networks)
– Dynamic host configuration (reuse of addresses)
– A bigger address (IPv6)
One issue
– Network address translation
What is Multicast?
Multicast is a communication paradigm
– 1 source, multiple destinations
Applications:
– bulk-data distribution to subscribers
  » (e.g., newspaper, software, and video tape distribution)
– connection-time-based charging data distribution
  » (e.g., financial data, stock market information, and news ticker broadcasting)
– streaming (e.g., real-time video/audio distribution)
– push applications, web-casting
– distance learning, conferencing, collaborative work, distributed simulation, and interactive games
The Internet group model
– Multicast/group communication means...
  » 1-to-n as well as n-to-m
– A group is identified by a class D IP address (224.0.0.0 to 239.255.255.255)
  » an abstract notion that does not identify any host
[Figure: a multicast group with one source (host_1) and two receivers (host_2, host_3), shown from a logical view and then from a physical view: hosts on Ethernets at site 1 and site 2, connected through multicast routers and the Internet by a multicast distribution tree.]
IP Multicast: Basic Idea
Multicast groups: abstract "rendez-vous" points.
Set up an optimal spanning tree over the participants of each group.
Make it cheap by not providing strong guarantees: send out packets and hope for the best.
The Internet group model (cont'd)
The group model is an open model
– Anybody can belong to a multicast group
  » no authorization is required
– A host can belong to many different groups
  » no restriction
– A source can send to a group, no matter whether it belongs to the group or not
  » membership not required
– The group is dynamic: a host can subscribe or leave at any time
– A host (source/receiver) does not know the number or identity of the members of the group
Mapping IP Multicast onto Ethernet Multicast
IP multicast (class D IP address):
– Class D: 224.x.x.x-239.x.x.x (in hex: Ex.xx.xx.xx); the group identifier is 28 bits
– No further structure (unlike class A, B, or C)
– Not addresses but identifiers of groups
– Some of them are assigned by the IANA to permanent host groups
Mapping a class D IP address into an Ethernet multicast address:
– The low-order 23 bits of the class D address are copied into the low-order 23 bits of the Ethernet multicast address
– Many-to-one mapping: the remaining 5 bits of the group identifier are not used, so 32 groups share each Ethernet address
– More filtering has to be done at the IP level
Ethernet Multicast
Ethernet is a broadcast medium
– Every frame can potentially be seen by every host
Each Ethernet card has a unique Ethernet address
Broadcast address:
– ff:ff:ff:ff:ff:ff
Ethernet multicast address range for IP:
– 01:00:5e:00:00:00 to 01:00:5e:7f:ff:ff
The Internet group model (cont'd)
Local-area multicast
  » uses the potential diffusion capabilities of the physical layer (e.g. Ethernet)
  » efficient and straightforward
Wide-area multicast
  » requires going through multicast routers, using IGMP, multicast routing, ...
  » routing in the same administrative domain is simple and efficient
  » inter-domain routing is complex, not fully operational
Multicast and the TCP/IP layered model
[Figure: the protocol stack split into user space and kernel space. Labels: Application; higher-level services; building blocks (congestion control, reliability, management, other); multicast routing; socket layer; TCP; UDP; IP / IP multicast; ICMP; IGMP; device drivers.]
What is Multicast?
Several applications need efficient means to transmit data to multiple destinations with:
– less bandwidth
– higher throughput
– lower delay
– higher reliability
Classification
– Data dissemination
– Transactions
– Large-scale virtual environments
Build on top of the existing Internet and take into account group communication constraints
– Manage groups
– Create and maintain multicast routes
– Efficient end-to-end delay (reliability, flow control, time constraints)
Ideal Multicast
Senders (S) and receivers (R) are not aware of each other's position in the network.
Scalable.
Low latency (join, data propagation).
Low bandwidth and processing overhead.
"Reliable", if this is cheap ("end-to-end"?)
Easy to join/leave.
Why IP multicast?
Scalability...
– scales to an unlimited number of users
Reduced costs...
– cheaper equipment and access line
Increased speed...
– increases the delivery speed
[Figure: a content server reaching clients through an ISP, the Internet and access lines, drawn twice: "use unicast?" and "...or multicast?"]
Multicast Features: Multicast Scope Control
Who gets which packets?
– Send everything to everybody...
TTL scope
– Keeps multicast traffic within an administrative domain by setting TTL thresholds on the interfaces of the border router
Administratively scoped addresses
– A multicast boundary can be set up on the borders for addresses in the range 239.0.0.0 to 239.255.255.255
– Better than TTL scope
Multicasting: Receiving a multicast message
For a process to receive multicast messages it needs to perform the following steps:
1. Create a UDP socket msd
    msd = socket(AF_INET, SOCK_DGRAM, 0);
2. Bind it to a UDP port, UDPport. All processes must bind to the same port in order to receive the multicast messages.
    struct sockaddr_in groupHost;
    memset(&groupHost, 0, sizeof(groupHost));   /* clear unused fields */
    groupHost.sin_family = AF_INET;
    groupHost.sin_port = htons(UDPport);
    groupHost.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(msd, (struct sockaddr *) &groupHost, sizeof(groupHost));
Multicasting: Receiving a multicast message (cont'd)
3. Join a multicast group address GroupIPaddress, e.g.,
    joinGroup(msd, GroupIPaddress);
4. Use recv or recvfrom to read the messages, e.g.,
    nbytes = recv(msd, recvBuf, BufLen, 0);
Multicast Groups and Addresses
Every IP multicast group has a group address.
IP multicast provides only open groups
– It is not necessary to be a member of a group in order to send datagrams to the group.
Multicast addresses are like the IP addresses used for single hosts, and are written in the same way: A.B.C.D.
– Multicast addresses never clash with host addresses because a portion of the IP address space is specifically reserved for multicast: 224.0.0.0 to 239.255.255.255
– Multicast addresses from 224.0.0.0 to 224.0.0.255 are reserved for multicast routing information
– Application programs should use multicast addresses outside this range.
Multicasting: Receiving a multicast message
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    /* This function sets the socket option to make the local host join the multicast group */
    void joinGroup(int s, char *group)
    {
        struct sockaddr_in groupStruct;
        struct ip_mreq mreq;   /* multicast group info structure */

        if ((groupStruct.sin_addr.s_addr = inet_addr(group)) == INADDR_NONE)
            printf("error in inet_addr\n");

        /* check if group address is indeed a Class D address */
        if (!IN_MULTICAST(ntohl(groupStruct.sin_addr.s_addr)))
            printf("%s is not a Class D address\n", group);

        mreq.imr_multiaddr = groupStruct.sin_addr;
        mreq.imr_interface.s_addr = htonl(INADDR_ANY);   /* let the kernel pick the interface */

        if (setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, (char *) &mreq, sizeof(mreq)) == -1) {
            printf("error in joining group \n");
            exit(-1);
        }
    }
Receiving Multicast Datagrams
Join a particular multicast group. This is done with a call to setsockopt:
    struct ip_mreq mreq;
    setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
The definition of struct ip_mreq is as follows:
    struct ip_mreq {
        struct in_addr imr_multiaddr;   /* multicast group to join */
        struct in_addr imr_interface;   /* interface to join on */
    };
Multicasting: Receiving a multicast message
    /* This function removes the process from the group */
    void leaveGroup(int recvSock, char *group)
    {
        struct sockaddr_in groupStruct;
        struct ip_mreq dreq;   /* multicast group info structure */

        if ((groupStruct.sin_addr.s_addr = inet_addr(group)) == INADDR_NONE)
            printf("error in inet_addr\n");

        dreq.imr_multiaddr = groupStruct.sin_addr;
        dreq.imr_interface.s_addr = htonl(INADDR_ANY);

        if (setsockopt(recvSock, IPPROTO_IP, IP_DROP_MEMBERSHIP, (char *) &dreq, sizeof(dreq)) == -1) {
            printf("error in leaving group \n");
            exit(-1);
        }
        printf("process quitting multicast group %s \n", group);
    }
Multicasting: Sending a multicast message
For a process to send multicast messages it needs to perform the following:
1. Use the UDP socket msd for sending multicast messages
    struct sockaddr_in dest;
    memset(&dest, 0, sizeof(dest));
    dest.sin_family = AF_INET;
    dest.sin_port = htons(UDPport);   /* port must be in network byte order */
    dest.sin_addr.s_addr = inet_addr(GroupIPaddress);
    sendto(msd, sendBuf, BufLen, 0, (struct sockaddr *) &dest, sizeof(dest));
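The sending step can be wrapped up as in the following sketch. Both function names (build_group_addr, send_to_group) are hypothetical; the point of the example is that the port goes through htons and that the destination is simply the group address.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: build the destination address for a multicast group.
 * Returns 0 on success, -1 if `group` is not a dotted-quad address. */
int build_group_addr(const char *group, unsigned short port,
                     struct sockaddr_in *dest)
{
    assert(group != NULL && dest != NULL);
    memset(dest, 0, sizeof(*dest));
    dest->sin_family = AF_INET;
    dest->sin_port = htons(port);            /* network byte order */
    return inet_pton(AF_INET, group, &dest->sin_addr) == 1 ? 0 : -1;
}

/* Hypothetical helper: send one datagram to the group on a UDP socket.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t send_to_group(int sock, const char *group, unsigned short port,
                      const void *buf, size_t len)
{
    struct sockaddr_in dest;

    if (build_group_addr(group, port, &dest) == -1)
        return -1;
    return sendto(sock, buf, len, 0, (struct sockaddr *) &dest, sizeof(dest));
}
```

No membership call is needed on the sending socket, which matches the open group model described earlier.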
Multicasting: Sending a multicast message (cont'd)
A sender does not have to be a member of the group. If the sending process also wants to receive the group's messages:
2. Join the multicast group address GroupIPaddress, e.g.,
    joinGroup(msd, GroupIPaddress);
3. Use recv or recvfrom to read the messages, e.g.,
    nbytes = recv(msd, recvBuf, BufLen, 0);
Multicasting: Time-to-live
– Controls how far the messages can go, e.g., 2 means at most 2 routers away. (The default is 1, which results in multicast packets going only to other hosts on the local network.)
    u_char TimeToLive;
    TimeToLive = 2;
    setTTLvalue(s, &TimeToLive);

    /* This function sets the Time-To-Live value */
    void setTTLvalue(int s, u_char *ttl_value)
    {
        if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, (char *) ttl_value, sizeof(u_char)) == -1) {
            printf("error in setting TTL value\n");
        }
    }
Multicasting: Time-to-live
– To provide meaningful scope control, multicast routers enforce the following "thresholds" on forwarding based on the TTL field:
    TTL    Scope
    0      restricted to the same host
    1      restricted to the same subnet
    32     restricted to the same site
    64     restricted to the same region
    128    restricted to the same continent
    255    unrestricted
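Setting a TTL from the threshold table and reading it back could be sketched as follows. The helper names are hypothetical, and passing the value as a single unsigned char is an assumption chosen to match the u_char usage on the earlier slide.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: set the multicast TTL of socket s.
 * Returns 0 on success, -1 on error. */
int set_mcast_ttl(int s, unsigned char ttl)
{
    assert(s >= 0);
    return setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));
}

/* Hypothetical helper: read the multicast TTL of socket s.
 * Returns the TTL (0..255), or -1 on error. */
int get_mcast_ttl(int s)
{
    unsigned char ttl = 0;
    socklen_t len = sizeof(ttl);

    assert(s >= 0);
    if (getsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, &len) == -1)
        return -1;
    return ttl;
}
```

A call such as set_mcast_ttl(s, 32) would, per the table, keep the traffic within the same site.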
Multicasting: Loop-back
– To allow the process to get a copy of its own transmission we use:
    u_char loop;
    loop = 1;   /* 1 = enable loopback (default), 0 = disable loopback */
    setLoopback(s, loop);

    /* This function enables or disables multicast loopback */
    void setLoopback(int s, u_char loop)
    {
        if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, (char *) &loop, sizeof(u_char)) == -1) {
            printf("error in setting loopback\n");
        }
    }
By default, messages sent to the multicast group are looped back to the local host. Calling setLoopback with loop = 0 disables that.
Multicasting: Reuse-port
– To allow multiple multicast processes to run on the same host:
    reusePort(s);

    /* This function sets a socket option that allows multiple processes to bind to the same port */
    void reusePort(int s)
    {
        int one = 1;
        if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (char *) &one, sizeof(one)) == -1) {
            printf("error in setsockopt, SO_REUSEADDR \n");
            exit(-1);
        }
    }
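The option only helps if it is set before bind. A minimal sketch of that ordering is shown below; the helper name bind_shared_udp is hypothetical, and whether a second process can then bind the same port with SO_REUSEADDR alone depends on the operating system (some systems need SO_REUSEPORT for that).

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: enable SO_REUSEADDR on UDP socket s, then bind it to
 * `port` on all interfaces. Returns 0 on success, -1 on error. */
int bind_shared_udp(int s, unsigned short port)
{
    struct sockaddr_in addr;
    int one = 1;

    assert(s >= 0);
    /* Must happen before bind(), otherwise it has no effect on this bind. */
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    return bind(s, (struct sockaddr *) &addr, sizeof(addr));
}
```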
Multicasting - Example
multicast.h
multicastUtilities.c
multicastChat.c
Reliable One-One Communication
Use reliable transport protocols (TCP) or handle reliability at the application layer
Client/server semantics in the presence of failures
Possibilities
– Client unable to locate server
– Lost request messages
– Server crashes after receiving request
– Lost reply messages
– Client crashes after sending request
Reliable One-Many Communication
Reliable multicast
– Lost messages => need to retransmit
Possibilities
– ACK-based schemes
  » Sender can become a bottleneck
– NACK-based schemes
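In a NACK-based scheme the receiver stays silent until it notices a gap in the sequence numbers. The sketch below shows only that gap detection; the function name and the use of 32-bit sequence numbers without wraparound handling are simplifying assumptions for illustration, not a description of any particular protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for a NACK-based receiver.
 * next_expected: lowest sequence number not yet received in order.
 * arrived:       sequence number of the packet that just arrived.
 * Stores up to `cap` missing sequence numbers in missing[] and returns how
 * many were stored; these are the packets to request again (NACK).
 * Sequence number wraparound is ignored in this sketch. */
size_t nack_missing(uint32_t next_expected, uint32_t arrived,
                    uint32_t *missing, size_t cap)
{
    size_t n = 0;
    uint32_t seq;

    assert(missing != NULL);
    for (seq = next_expected; seq < arrived && n < cap; seq++)
        missing[n++] = seq;
    return n;
}
```

If the receiver expects packet 5 and packet 8 arrives, the function reports 5, 6 and 7 as missing; an in-order or duplicate arrival yields nothing, so no feedback is sent and the sender is not flooded with ACKs.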
Atomic Multicast
Reliable group communication
– Processes can fail
– Atomicity of multicast is required
  » Atomicity?
Group membership
– A multicast and a corresponding group of recipients
– Failures of processes can be viewed as changes to group membership
System model
– Separates receiving a message from delivering it to an application
– Group view: a list of processes associated with a message
View change
– A special multicast message
– Race between a message m and a view change vc
Condition
– Either m is delivered to all processes before any process is delivered a new vc
– Or m is not delivered at all
Atomic Multicast
Atomic multicast: a guarantee that either all processes receive the message or none do
– Replicated database example
Problem: how to handle process crashes?
Solution: group view
– Each message is uniquely associated with a group of processes
  » The view of the process group when the message was sent
  » All processes in the group should have the same view (and agree on it)
Virtually synchronous multicast
Reliable Multicast Transport Protocol (RMTP)
Sender (S) and receivers (R) use windows.
Designated Receivers (DRs) eliminate ACK implosion.
ACKs are sent to DRs.
DRs and S cache data and retransmit it when needed.
A smart "session manager" elects DRs and sets parameters. How? Just like that...
RMTP (2)
After setup, S starts sending data.
Receivers send periodic ACKs after the first packet is received.
If there are no ACKs for a long time, the connection terminates.
DRs or S retransmit data using unicast or multicast, depending on the number of errors.
An immediate TX request is sent to DRs for receivers that join the session.
Sender window advance is determined by the slowest receiver.
ACKs must not be repeated too often. Measure RTT to AP.
S decreases the send window to 1 if there are many errors, then increases it linearly.
DRs are fixed, but each R chooses its DR. (A DR sends SND_ACK_TOME with the TTL fixed to a known value.)
Socket Options
Various attributes that are used to determine the behavior of sockets.
Setting options tells the OS/protocol stack the behavior we want.
Support for generic options (apply to all sockets) and protocol-specific options.

Option types
Many socket options are Boolean flags indicating whether some feature is enabled (1) or disabled (0).
Other options are associated with more complex types including int, timeval, in_addr, sockaddr, etc.
Read-only socket options
– Some options are readable only (we can't set the value).
Setting and Getting option values
getsockopt() gets the current value of a socket option.
setsockopt() is used to set the value of a socket option.
    #include

getsockopt()
    int getsockopt(int sockfd, int level, int optname, void *optval, socklen_t *optlen);
level specifies whether the option is a general option or a protocol-specific option (what level of code should interpret the option).

setsockopt()
    int setsockopt(int sockfd, int level, int optname, const void *optval, socklen_t optlen);
General Options
Protocol-independent options.
Handled by the generic socket system code.
Some general options are supported only by specific types of sockets (SOCK_DGRAM, SOCK_STREAM).

Some Generic Options
SO_BROADCAST
SO_DONTROUTE
SO_ERROR
SO_KEEPALIVE
SO_LINGER
SO_RCVBUF, SO_SNDBUF
SO_REUSEADDR
SO_BROADCAST
Boolean option: enables/disables sending of broadcast messages.
The underlying data-link layer must support broadcasting!
Applies only to SOCK_DGRAM sockets.
Prevents applications from inadvertently sending broadcasts (the OS looks for this flag when a broadcast address is specified).

SO_DONTROUTE
Boolean option: enables bypassing of normal routing.
Used by routing daemons.
SO_ERROR
Integer value option.
The value is an error indicator value (similar to errno).
Readable only.
Reading (by calling getsockopt()) clears any pending error.
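Fetching the pending error might be sketched as follows. The helper name take_socket_error is hypothetical; the getsockopt call with SOL_SOCKET and SO_ERROR is the standard one.

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical helper: fetch and clear the pending error on a socket.
 * Returns the pending error code (0 if none), or -1 if getsockopt fails. */
int take_socket_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    assert(fd >= 0);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
        return -1;
    return err;   /* an errno-style value such as ECONNREFUSED, or 0 */
}
```

Because reading clears the value, the result should be stored rather than fetched twice.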
SO_KEEPALIVE
Boolean option: enabled means that STREAM sockets should send a probe to the peer if no data flows for a "long time".
Used by TCP: allows a process to determine whether the peer process/host has crashed.
Consider what would happen to an open telnet connection without keepalive.
SO_LINGER
Value is of type:
    struct linger {
        int l_onoff;    /* 0 = off */
        int l_linger;   /* time in seconds */
    };
Used to control whether and how long a call to close will wait for pending ACKs.
Connection-oriented sockets only.

SO_LINGER usage
By default, calling close() on a TCP socket returns immediately.
The closing process has no way of knowing whether or not the peer received all data.
Setting SO_LINGER means the closing process can determine that the peer machine has received the data (but not that the data has been read()!).
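Turning lingering on for a socket could look like the sketch below. The helper name set_linger is hypothetical; the struct linger fields are the ones defined above.

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical helper: make close() on fd wait up to `seconds` for queued
 * data to be acknowledged. Returns 0 on success, -1 on error. */
int set_linger(int fd, int seconds)
{
    struct linger lg;

    assert(seconds >= 0);
    lg.l_onoff = 1;            /* turn lingering on */
    lg.l_linger = seconds;     /* upper bound on how long close() may block */
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

With this in place the return value of close() becomes worth checking, since it can now report that the linger time expired before the data was acknowledged.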
shutdown() vs SO_LINGER
shutdown() can be used to find out when the peer process has read all the sent data [R. Stevens, 7.5]
TCP Connection Termination
[Figure: client/server timeline for a plain close. Client: write, close, close returns. Server: data queued by TCP; application reads queued data and FIN; close. Segments: FIN SN=X, ACK=X+1, FIN SN=Y, ACK=Y.]

TCP Connection Termination: close w/ SO_LINGER
[Figure: the same client/server timeline with SO_LINGER set. Client: write, close, close returns. Server: data queued by TCP; application reads queued data and FIN; close. Segments: FIN SN=X, ACK=X+1, FIN SN=Y, ACK=Y.]

TCP Connection Termination w/ shutdown
[Figure: client/server timeline using shutdown. Client: write, shutdown WR, read blocks, read returns 0. Server: data queued by TCP; application reads queued data and FIN; close. Segments: FIN SN=X, ACK=X+1, FIN SN=Y, ACK=Y.]
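The shutdown sequence (write, shutdown for writing, then read until it returns 0) can be sketched as below. The function name close_after_peer_done is hypothetical, and interrupted reads (EINTR) are simply treated as errors to keep the example short.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: half-close fd, then wait until the peer has closed
 * its side too. Returns 0 if the peer closed in an orderly way, -1 on error.
 * The descriptor is closed once shutdown() has succeeded. */
int close_after_peer_done(int fd)
{
    char buf[256];
    ssize_t n;

    assert(fd >= 0);
    if (shutdown(fd, SHUT_WR) == -1)     /* sends FIN, reading stays open */
        return -1;

    /* read() blocks until the peer sends data or closes; 0 means the peer's
     * FIN arrived, i.e. the peer application has read our data and closed. */
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        ;                                /* discard anything still sent */

    close(fd);
    return n == 0 ? 0 : -1;
}
```

Unlike SO_LINGER, this tells the caller that the peer application itself reached its close, not merely that the peer's TCP acknowledged the data.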
SO_RCVBUF and SO_SNDBUF
Integer value options: change the receive and send buffer sizes.
Can be used with STREAM and DGRAM sockets.
With TCP, this option affects the window size used for flow control, so it must be set before the connection is established.
SO_REUSEADDR
Boolean option: enables binding to an address (port) that is already in use.
Used by servers that are transient: allows binding a passive socket to a port currently in use (with active sockets) by other processes.
Can be used to establish separate servers for the same service on different interfaces (or different IP addresses on the same interface).
Virtual Web servers can work this way.

IP Options (IPv4)
IP_HDRINCL: used on raw IP sockets when we want to build the IP header ourselves.
IP_TOS: allows us to set the "Type-of-service" field in an IP header.
IP_TTL: allows us to set the "Time-to-live" field in an IP header.
TCP socket options
TCP_KEEPALIVE: sets the idle time used when SO_KEEPALIVE is enabled (Linux names this option TCP_KEEPIDLE).
TCP_MAXSEG: sets the maximum segment size sent by a TCP socket.
TCP_NODELAY: disables TCP's Nagle algorithm, which delays sending small packets while there is unACKed data pending.
TCP_NODELAY does not turn off delayed ACKs; those are a separate, receiver-side mechanism (TCP ACKs are cumulative).
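Switching the Nagle algorithm off for one connection might be sketched like this. The helper names are hypothetical; TCP_NODELAY at level IPPROTO_TCP is the standard option.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turn TCP_NODELAY on (1) or off (0) for a TCP socket.
 * Returns 0 on success, -1 on error. */
int set_tcp_nodelay(int fd, int on)
{
    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Hypothetical helper: report whether TCP_NODELAY is set.
 * Returns 1 if set, 0 if not, -1 on error. */
int get_tcp_nodelay(int fd)
{
    int on = 0;
    socklen_t len = sizeof(on);

    assert(fd >= 0);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, &len) == -1)
        return -1;
    return on != 0;
}
```

This is typically worth doing only for interactive traffic made of many small writes, since the Nagle algorithm otherwise reduces the number of tiny segments on the wire.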