Linux TCP/IP Stack.


TCP/IP vs. OSI model

OSI layers 7 (Application), 6 (Presentation), 5 (Session): Process, with the socket layer beneath it
OSI layers 4 (Transport), 3 (Network): Protocol layer (TCP/IP)
OSI layer 2 (Data Link): Interface layer (Ethernet, etc.)
OSI layer 1 (Physical): Physical layer

TCP/IP Stack Overview

Output path (process to physical media):
1: sosend(...)            Socket layer
2: tcp_output(...)        Protocol layer (TCP layer)
3: ip_output(...)         Protocol layer (IP layer)
4: ethernet_output(...)   Interface layer (Ethernet device driver), which feeds the output queue

Input path (physical media to process):
2: ethernet_input(...)    Interface layer (Ethernet device driver), which reads from the input queue
3: ip_input(...)          Protocol layer (IP layer)
4: tcp_input(...)         Protocol layer (TCP layer)
5: recvfrom(...)          Socket layer

Process Layer to TCP Layer

Process:
    send(int socket, const char *buf, int length, int flags)

Kernel:
    sendto(int socket, const char *data_buffer, int length, int flags, struct sockaddr *destination, int destination_length)
    sendit(struct proc *p, int socket, struct msghdr *mp, int flags, int *return_size)    [uipc_syscalls.c]
    sosend(struct socket *s, struct mbuf *addr, struct uio *uio, struct mbuf *top, struct mbuf *control, int flags)    [uipc_socket.c]
    tcp_usrreq(struct socket *s, int request, struct mbuf *m, struct mbuf *nam, struct mbuf *control)    [tcp_usrreq.c]

TCP Layer:
    tcp_output(struct tcpcb *tp)    [tcp_output.c]

Socket Layer

    sendto(int socket, const char *data_buffer, int length, int flags, struct sockaddr *destination, int destination_length)

MBUF chain: a 150-byte data_buffer is stored in a chain of two 128-byte mbufs.

Field              First mbuf          Second mbuf
m_next             second mbuf         NULL
m_nextpkt          NULL                NULL
m_len              100                 50
m_data             start of data       start of data
m_type             MT_DATA             MT_DATA
m_flags            M_PKTHDR            0
m_pkthdr.len       150                 (not present)
m_pkthdr.recvif    NULL                (not present)
Header size        28 bytes            20 bytes
Data               100 bytes           50 bytes
Unused space       none                58 bytes
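The 150-byte example above can be sketched in C. This is a simplified, hypothetical model, not the kernel's struct mbuf: the sk_ names, the flag value, and the fixed two-mbuf limit are assumptions made for illustration. Only the sizes (128-byte mbufs with 28-byte and 20-byte headers) come from the slide.

```c
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel's mbuf; sizes follow the slide. */
#define SK_MHLEN   100      /* data bytes in a packet-header mbuf (128 - 28) */
#define SK_MLEN    108      /* data bytes in a plain mbuf (128 - 20) */
#define SK_PKTHDR  0x0002   /* assumed flag value, for illustration only */

struct sk_mbuf {
    struct sk_mbuf *m_next;     /* next mbuf in this chain */
    struct sk_mbuf *m_nextpkt;  /* next packet (chain) in a queue */
    int  m_len;                 /* bytes of data held in this mbuf */
    int  m_flags;               /* SK_PKTHDR on the first mbuf only */
    int  pkthdr_len;            /* total packet length, first mbuf only */
    char m_dat[SK_MLEN];        /* data area */
};

/* Split len bytes across a packet-header mbuf and one plain mbuf.
 * Returns 0 on success, -1 if the data does not fit in two mbufs. */
int sk_fill_chain(struct sk_mbuf *first, struct sk_mbuf *second,
                  const char *buf, int len)
{
    if (len < 0 || len > SK_MHLEN + SK_MLEN)
        return -1;
    int n0 = len < SK_MHLEN ? len : SK_MHLEN;  /* bytes for the first mbuf */
    int n1 = len - n0;                         /* remainder for the second */

    memset(first, 0, sizeof *first);
    memset(second, 0, sizeof *second);
    memcpy(first->m_dat, buf, (size_t)n0);
    first->m_len = n0;
    first->m_flags = SK_PKTHDR;
    first->pkthdr_len = len;
    if (n1 > 0) {
        memcpy(second->m_dat, buf + n0, (size_t)n1);
        second->m_len = n1;
        first->m_next = second;
    }
    return 0;
}

/* Walk the chain and add up m_len, which should equal pkthdr_len. */
int sk_chain_len(const struct sk_mbuf *m)
{
    int total = 0;
    for (; m != NULL; m = m->m_next)
        total += m->m_len;
    return total;
}
```

Filling the chain with 150 bytes leaves 100 bytes in the first mbuf and 50 in the second, so 58 of the second mbuf's 108 data bytes stay unused, as in the table.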

Socket Layer: sosend passes data and control information to the protocol layer.

    sosend(struct socket *s, struct mbuf *addr, struct uio *uio, struct mbuf *data_buffer, struct mbuf *control, int flags)

1. Initialize a new memory buffer and variables to hold flags.
2. Is there enough space in the send buffer? Check sbspace(&s->so_snd).
   no: the data cannot be copied until space becomes available.
   yes: continue.
3. Copy data_buffer into an mbuf.
4. Hand the mbuf to TCP: int error = tcp_usrreq(s, PRU_SEND, mbuf, addr, control)
   On error: free the memory buffers received.
5. More buffers to send?
   yes: repeat from step 2.
   no: return the value of error to sendto().

TCP Layer: tcp_usrreq(struct socket *s, int request, struct mbuf *data_buffer, struct mbuf *nam, struct mbuf *control)

1. Initialize the internet protocol control block inp and the TCP control block tp, which store the information TCP needs.
2. Convert the socket to an internet protocol control block: inp = sotoinpcb(so)
3. Convert the internet protocol control block to a TCP control block: tp = intotcpcb(inp)
4. If request is PRU_SEND: int error = tcp_output(tp)
5. tcp_output() returns error to tcp_usrreq().

TCP Layer (tcp_output.c): tcp_output(struct tcpcb *tp)

Called by tcp_usrreq for one of the following reasons:
- To send the initial SYN
- To send a finished_sending message (FIN)
- To send data
- To send a window update after data has been received

tcp_output() functionality:

1. Determine whether TCP can send a segment, depending on:
   - flags in the data sent by the socket layer (to send an ACK, etc.)
   - the size of the window advertised by the receiver
   - the amount of data ready to send
   - whether unacknowledged data already exists for the connection
2. Calculate the amount of data to be sent, depending on:
   - the size of the receiver's window
   - the number of bytes in the send buffer
3. Check for window shrink.
4. Send a segment:
   - Allocate a buffer for the TCP and IP headers from the header template.
   - Copy the TCP and IP header template into the buffer to be sent.
   - Fill in the fields of the TCP header.
   - Decrement the number of buffers to be sent, so that the end can be checked.
   - Set the sequence number and acknowledgement fields.
   - Set three fields in the IP header: IP length, TTL and TOS.
   - Pass the datagram to IP.

TCP Layer (tcp_output.c): tcp_output(struct tcpcb *tp)

    struct socket *so = tp->t_inpcb->inp_socket

Initialize a TCP header, tcp_header.

idle is true when the highest sequence number sent (snd_max) equals the oldest unacknowledged sequence number (snd_una), that is, when no ACK is expected from the other end.

    int idle = (tp->snd_max == tp->snd_una)

idle true: no acknowledgement is expected, so set the congestion window to one segment: tp->snd_cwnd = tp->t_maxseg;
idle false: check the ACK flag.

TCP Layer: tcp_output(struct tcpcb *tp)

If no acknowledgement is expected, set the congestion window to one segment:

    tp->snd_cwnd = tp->t_maxseg;

off is the offset in bytes, from the beginning of the send buffer, of the first data byte to send. The first off bytes have already been sent and their acknowledgement is awaited.

    int off = tp->snd_nxt - tp->snd_una

Determine the length of data that should be transmitted and the flags to be used. win is the minimum of the receiver's advertised window and the congestion window. len is the minimum of the number of bytes in the send buffer and win, less the off bytes already sent.

    len = min(so->so_snd.sb_cc, win) - off

Determine the flags (TH_ACK, TH_FIN, TH_RST, TH_SYN) from the connection state:

    flags = tcp_outflags[tp->t_state]
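The idle test and the length calculation can be sketched as one small C function. The struct below is a hypothetical, cut-down stand-in for struct tcpcb (only the fields named on the slides), and sk_send_len is an illustrative name, not a kernel function. It follows the slide's simplified rule that an idle connection resets the congestion window to one segment.

```c
#include <stdint.h>

/* Hypothetical subset of the TCP control block; field names follow the slides. */
struct sk_tcpcb {
    uint32_t snd_una;   /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;   /* next sequence number to send */
    uint32_t snd_max;   /* highest sequence number sent */
    long     snd_wnd;   /* window advertised by the receiver */
    long     snd_cwnd;  /* congestion window */
    long     t_maxseg;  /* maximum segment size */
};

static long sk_min(long a, long b) { return a < b ? a : b; }

/* Returns the number of bytes that may be sent now, given sb_cc bytes
 * in the send buffer. A negative result means the window has shrunk
 * below what was already sent. */
long sk_send_len(struct sk_tcpcb *tp, long sb_cc)
{
    int idle = (tp->snd_max == tp->snd_una);
    if (idle)
        tp->snd_cwnd = tp->t_maxseg;   /* restart from one segment */

    /* Sequence numbers wrap, so take the difference modulo 2^32. */
    long off = (long)(int32_t)(tp->snd_nxt - tp->snd_una);
    long win = sk_min(tp->snd_wnd, tp->snd_cwnd);
    return sk_min(sb_cc, win) - off;
}
```

For example, with 1000 bytes buffered, a 4096-byte receiver window, a 512-byte congestion window and 200 bytes in flight, the function returns 312.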

TCP Layer: tcp_output(struct tcpcb *tp)

Determine the flags (TH_ACK, TH_FIN, TH_RST, TH_SYN) from the connection state:

    flags = tcp_outflags[tp->t_state]

Then test, in order:

1. tp->t_flags & TF_ACKNOW
   true: send an acknowledgement.
   false: go to the next test.
2. flags & (TH_SYN | TH_RST)
   true: send the sequence number (SYN) or a reset (RST).
   false: go to the next test.
3. flags & TH_FIN
   true: finished sending (send FIN).
   false: continue.

Check the flags to determine the type of message:
- window probe
- retransmission
- normal data transmission

Allocate an mbuf for the TCP and IP headers and, if possible, the data:

    MGETHDR(m, M_DONTWAIT, MT_HEADER)

M_DONTWAIT indicates that if memory is not available for the mbuf, the routine returns immediately with an error.

Length of data < 44 bytes? The limit is 100 - 40 - 16 = 44: the 100-byte data area of a packet-header mbuf, less 40 bytes for the TCP and IP headers and 16 bytes reserved for the link-layer header.
    yes: copy the data from the socket send buffer into the new packet-header mbuf.
    no: create a new mbuf chain, copy the surplus data into it, and link it to the first mbuf.

Pass the segment to IP:

    ip_output(m, tp->t_inpcb->inp_options, &tp->t_inpcb->inp_route, so->so_options & SO_DONTROUTE, 0)

ip_output.c

    ip_output(struct mbuf *m, struct mbuf *opt, struct route *ro, int flags, struct ip_moptions *imo)

Phases:
1. Header initialization
2. Route selection
3. Source address selection and fragmentation

1. Header initialization

Packets damaged? Check whether any errors occurred while headers were added in the higher layers. Most of the fields of the IP header are predefined by the higher-layer protocols.
    yes: ERROR
    no: examine flags.

The value of flags decides what is to be done with the data:
    IP_FORWARDING        Forward packet
    IP_ROUTETOIF         Route directly to interface
    IP_ALLOWBROADCAST    Allow broadcasting of packet
    IP_RAWOUTPUT         Packet contains a pre-constructed header

Is IP_FORWARDING or IP_RAWOUTPUT set in flags?
    yes: the packet is being forwarded to another host (the machine is acting as a router) or already carries a header. The IP header of a forwarded packet should not be modified by ip_output.
    no: the packet originates here and has to be sent to another host, so construct and initialize the IP header: set ip_v = 4, clear ip_off, and assign a unique identifier to ip_id. Length, offset, TTL, protocol, TOS, etc. are set by the higher layers.

The header length is saved in hlen for the fragmentation algorithm.

2. Route Selection

A cached route may be provided to ip_output as an argument. UDP and TCP maintain a route cache associated with each socket. If a route has not been provided, ip_output uses a temporary route structure called iproute.

Verify the cached route for the destination address: check whether the cached route leads to the correct destination.

If (cached_route == destination)
    yes: find the interface on which the packet has to be placed. ifp points to the interface's ifnet structure.
    no: locate a route by calling rtalloc(dst_ip).

If the packet is being routed, rtalloc locates a route to the address specified by dst.
- If rtalloc fails to find a route, an EHOSTUNREACH (host unreachable) error is returned. If ip_forward called ip_output, the error is converted to an ICMP error.
- If a route is found, ifp is made to point to the ifnet structure for the interface.
- If the next hop is not the packet's final destination, dst is changed to point to the next-hop router.

3. Source address selection and Fragmentation

The final section of ip_output ensures that the IP header has a valid source IP address. This could not have been done earlier because the route had not been selected yet.

Is a valid source address specified?
    no: select the IP address of the outgoing interface as the source address.
    yes: continue.

Does the packet have to be fragmented?
    yes: fragment the packet if its size is greater than the MTU. Packets that exceed the MTU must be fragmented before they can be sent.
    no: continue.

In either case (fragmented or not) the checksum is computed with in_cksum. If there are no errors, the data is passed to the if_output function of the selected output interface.
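A rough sketch of the fragmentation arithmetic follows. It assumes the standard IPv4 rule that every fragment except the last carries a payload that is a multiple of 8 bytes. The sk_ function names are hypothetical helpers for illustration and do not appear in ip_output.c.

```c
/* Payload bytes carried by each non-final fragment: the largest
 * multiple of 8 that fits in the MTU after the IP header. */
int sk_frag_payload(int mtu, int hlen)
{
    return (mtu - hlen) & ~7;
}

/* Number of packets handed to the interface for a datagram of ip_len
 * bytes (header included). Returns 1 when no fragmentation is needed
 * and -1 when the MTU is too small to carry any payload. */
int sk_frag_count(int ip_len, int hlen, int mtu)
{
    if (ip_len <= mtu)
        return 1;
    int per = sk_frag_payload(mtu, hlen);
    if (per < 8)
        return -1;
    int payload = ip_len - hlen;
    return (payload + per - 1) / per;   /* round up */
}

/* Payload bytes in the final fragment (the whole payload when the
 * datagram is not fragmented), or -1 if fragmentation is impossible. */
int sk_last_frag_payload(int ip_len, int hlen, int mtu)
{
    int n = sk_frag_count(ip_len, hlen, mtu);
    if (n < 0)
        return -1;
    if (n == 1)
        return ip_len - hlen;
    return (ip_len - hlen) - (n - 1) * sk_frag_payload(mtu, hlen);
}
```

With a 20-byte header and a 1500-byte MTU, a 4020-byte datagram is sent as three fragments carrying 1480, 1480 and 1040 bytes of payload.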

Interface Layer (if_ethersubr.c)

    ether_output(struct ifnet *ifp, struct mbuf *mbuf, struct sockaddr *destination, struct rtentry *routing_entry)

Phases:
1. Verification
2. Protocol-Specific Processing
3. Frame Construction
4. Interface Queuing

1. Verification

Ethernet port up and running? Test ifp->if_flags & (IFF_UP | IFF_RUNNING).
    no: senderr(ENETDOWN)
    yes: continue.

Interface Layer (if_ethersubr.c): ether_output(struct ifnet *ifp, struct mbuf *mbuf, struct sockaddr *destination, struct rtentry *rt_entry)

Function: takes the data portion of an Ethernet frame, encapsulates it with a 14-byte header, and places it on the interface send queue.

Phases: Verification, Protocol-Specific Processing, Frame Construction, Interface Queuing.

Arguments:
- ifp points to the outgoing interface's ifnet structure
- mbuf is the data to be sent
- destination is the destination address
- rt_entry points to the routing entry

Initialize the Ethernet header: struct eth_header *eh

Verification

Ethernet port up and running? Test ifp->if_flags & (IFF_UP | IFF_RUNNING).
    no: senderr(ENETDOWN)
    yes: continue.

Verification (continued)

Route valid?
    rt_entry = rtalloc1(destination, 1)
    If no route is found: senderr(EHOSTUNREACH)

Next hop a gateway?
    If so, use the gateway's route: rt_entry = rt_entry->rt_gwroute

Destination responding to ARP requests? Test rt_entry->rt_flags & RTF_REJECT.
    If the destination is not responding, do not send more packets, to avoid flooding.
    no (RTF_REJECT not set): continue with Protocol-Specific Processing.

Protocol-Specific Processing

Functionality: finds the Ethernet address corresponding to the IP address of the destination.

Switch on destination->sa_family:
    AF_INET: send an ARP broadcast to find the Ethernet address corresponding to the destination IP address. Use m_copy() to keep the packet until an acknowledgement is received.

Next phase: Frame Preparation.

Frame Preparation

Make sure there is room for the 14-byte Ethernet header:

    M_PREPEND(m, sizeof(ethernet_header), M_DONTWAIT)

Form the Ethernet header from the Ethernet frame type, the destination Ethernet (MAC) address (for example, that of the default gateway for a host), and the unicast Ethernet address associated with the output interface.
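The 14-byte header is six bytes of destination address, six bytes of source address and a two-byte frame type. The sketch below builds a frame in a flat buffer instead of prepending to an mbuf. The struct and function names are hypothetical. The value 0x0800 is the standard EtherType for IPv4.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_ETHERTYPE_IP 0x0800   /* EtherType for IPv4 */

/* Byte arrays only, so the struct has no padding and is exactly 14 bytes. */
struct sk_ether_header {
    uint8_t dst[6];    /* destination MAC address */
    uint8_t src[6];    /* MAC address of the output interface */
    uint8_t type[2];   /* frame type, network byte order */
};

/* Write header + payload into frame. Returns the frame length,
 * or 0 if the buffer is too small. */
size_t sk_build_frame(uint8_t *frame, size_t cap,
                      const uint8_t dst[6], const uint8_t src[6],
                      uint16_t type, const uint8_t *payload, size_t len)
{
    struct sk_ether_header eh;
    if (cap < sizeof eh + len)
        return 0;
    memcpy(eh.dst, dst, 6);
    memcpy(eh.src, src, 6);
    eh.type[0] = (uint8_t)(type >> 8);     /* high byte first */
    eh.type[1] = (uint8_t)(type & 0xff);
    memcpy(frame, &eh, sizeof eh);
    memcpy(frame + sizeof eh, payload, len);
    return sizeof eh + len;
}
```

Storing the type as two bytes keeps the example independent of host byte order, which the kernel handles with htons().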

Interface Queuing

Is the output queue full?
    yes: discard the frame, free the memory buffer, and senderr(ENOBUFS).
    no: place the frame on the interface's send queue, if_snd, then call lestart(ifp).
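The queue-full decision can be sketched with a small linked queue. The types and function names here are hypothetical. The length, limit and drop-counter fields are modelled on the usual BSD interface queue, which is an assumption about the structure behind if_snd.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical frame and queue types, for illustration only. */
struct sk_frame {
    struct sk_frame *next;
};

struct sk_ifqueue {
    struct sk_frame *head;
    struct sk_frame *tail;
    int ifq_len;      /* frames currently queued */
    int ifq_maxlen;   /* queue limit */
    int ifq_drops;    /* frames discarded because the queue was full */
};

/* Returns 0 on success or ENOBUFS when the queue is full. The caller
 * is expected to free a rejected frame. */
int sk_if_enqueue(struct sk_ifqueue *q, struct sk_frame *f)
{
    if (q->ifq_len >= q->ifq_maxlen) {
        q->ifq_drops++;
        return ENOBUFS;
    }
    f->next = NULL;
    if (q->tail != NULL)
        q->tail->next = f;
    else
        q->head = f;
    q->tail = f;
    q->ifq_len++;
    return 0;
}

/* What a start routine such as lestart would do first: take the
 * oldest frame off the queue, or return NULL if it is empty. */
struct sk_frame *sk_if_dequeue(struct sk_ifqueue *q)
{
    struct sk_frame *f = q->head;
    if (f != NULL) {
        q->head = f->next;
        if (q->head == NULL)
            q->tail = NULL;
        q->ifq_len--;
    }
    return f;
}
```

With a limit of two frames, a third enqueue is rejected with ENOBUFS and counted as a drop, and frames come back out in the order they were queued.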

Interface Layer (if_le.c): lestart(struct ifnet *ifp)

Function: dequeues frames from the interface output queue and arranges for them to be transmitted by the Ethernet card.

    struct le_softc *le = &le_softc[ifp->if_unit]

1. Test le->sc_if.if_flags & IFF_RUNNING. If the interface is not running, return an error.
2. Copy the frame in the mbuf to the hardware buffer.
3. Set IFF_OACTIVE to indicate that the device is busy transmitting.

ip_input.c: void ipintr()

Phases:
1. Verification of incoming packets
2. Option processing and forwarding
3. Packet reassembly
4. Demultiplexing

Storing IP packets: IP packets are stored in a linked list (chain) of mbuf structs. The header must be stored in one mbuf.

Verification

Dequeue packets: get the IP header from ipintrq in the first mbuf, then check:
1. If no IP addresses are set yet but the interfaces are receiving, nothing can be done with incoming packets. This occurs during system initialization, when interfaces have not been configured.
2. If the length of the packet in the mbuf is less than the length of struct ip, increment ipstat.ips_toosmall.
3. Check the IP version.
4. Check the header length.
5. Compute ip_sum = in_cksum(); the result should be 0. (in_cksum is used by all protocols, although over different parts of the packet.)
6. Convert from network byte order to host byte order.
7. ip_len > m_pkthdr.len indicates that some bytes are missing.
8. Trim the buffers if they are longer than expected.
9. Drop the packet if it is shorter than expected.

Packets damaged?
    yes: discard.
    no: continue with option processing.

Option processing and forwarding

ip_dooptions(): if ip_dooptions() == 1, an ICMP error message is involved and processing goes to the next buffer.

Is ip_dst a local address? Look for ip_dst in in_ifaddr (the list of configured addresses).
    ip_dst found (host in the same subnet): the packet is processed locally.
    ip_dst not found: test ipforwarding.
        ipforwarding == 0: discard the packet and free its memory.
        otherwise: call ip_forward().
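Verification step 5 relies on a property of the Internet checksum: summing a header that already contains a correct checksum yields 0. The function below is a plain, byte-oriented sketch of that one's-complement sum (the algorithm described in RFC 1071), not the optimized in_cksum from the kernel. The name sk_in_cksum is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum over len bytes, treating the data
 * as big-endian 16-bit words. Returns the checksum as a host-order value. */
uint16_t sk_in_cksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)                          /* odd trailing byte, padded with 0 */
        sum += (uint32_t)data[i] << 8;

    while (sum >> 16)                     /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A sender computes the value with the checksum field set to zero and stores it in the header; a receiver runs the same function over the whole header and accepts the packet only if the result is 0.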

ip_forward(struct mbuf *m, int srcrt)

m points to the packet to be forwarded. If srcrt is nonzero, the packet is being forwarded because of a source route option.

Phase I: Is the packet eligible for forwarding?

1. Multicast packet?
    yes: ip_mforward()
    no: go to the next test.
2. Is the packet one of the following?
    - a link-level broadcast packet
    - a loopback packet
    - addressed to network 0 or a class E address
    yes: discard.
    no: go to the next test.
3. TTL <= 1?
    yes: send an ICMP error message and discard.
    no: decrement the TTL and locate the next hop.

Locate next hop

    struct route {
        struct rtentry *ro_rt;      // pointer to the struct with the route information
        struct sockaddr ro_dst;     // destination associated with the route entry pointed to by ro_rt
    };

Cache the most recent route, since consecutive packets usually have the same destination. Save at most 64 bytes of the packet in case an ICMP message has to be sent.
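The Phase I eligibility tests can be sketched as a pure function over the destination address and TTL. The enum, the function name and the host-byte-order address argument are all assumptions made for illustration. The address ranges are the standard IPv4 ones: 224 to 239 for multicast, 127 for loopback, 0 for network 0, and 240 and above for class E.

```c
#include <stdint.h>

/* Hypothetical result codes for the eligibility check. */
enum sk_fwd {
    SK_FORWARD,        /* eligible: decrement TTL and locate next hop */
    SK_MFORWARD,       /* multicast: hand to the multicast forwarder */
    SK_DROP,           /* not eligible: discard silently */
    SK_TTL_EXCEEDED    /* discard and report with an ICMP error */
};

/* dst is the destination address in host byte order; link_bcast is
 * nonzero if the frame arrived as a link-level broadcast. */
enum sk_fwd sk_forward_check(uint32_t dst, uint8_t ttl, int link_bcast)
{
    uint8_t top = (uint8_t)(dst >> 24);   /* first octet of the address */

    if (top >= 224 && top <= 239)
        return SK_MFORWARD;
    if (link_bcast || top == 127 || top == 0 || top >= 240)
        return SK_DROP;
    if (ttl <= 1)                         /* TTL would reach zero */
        return SK_TTL_EXCEEDED;
    return SK_FORWARD;
}
```

Keeping the check free of side effects leaves the TTL decrement and route lookup to the caller, mirroring the split between Phase I and the next-hop step.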

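Packet reassembly

ip == NULL?
    yes: unable to reassemble a complete datagram yet; go to the next packet.
    no: ip points to a full datagram.

Transport Demultiplexing

Use ip_p, the protocol field of the IP header, as an index into the ip_protox[] array to select the protocol that receives the datagram:

    1  UDP
    2  TCP
    3  IP (raw)
    4  ICMP
    5  IGMP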
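A sketch of the lookup table follows. It assumes the slide's numbers 1 to 5 are the values stored in ip_protox[] and that any protocol without its own entry falls through to raw IP. The IP protocol numbers used as indexes (1 ICMP, 2 IGMP, 6 TCP, 17 UDP) are the standard assigned values. All sk_ names are hypothetical.

```c
#include <stdint.h>

/* Values stored in the table, numbered as on the slide. */
enum {
    SK_SW_UDP  = 1,
    SK_SW_TCP  = 2,
    SK_SW_RAW  = 3,
    SK_SW_ICMP = 4,
    SK_SW_IGMP = 5
};

static uint8_t sk_ip_protox[256];

/* Fill the table: every protocol number defaults to raw IP, and the
 * protocols with their own input routine override that default. */
void sk_protox_init(void)
{
    for (int i = 0; i < 256; i++)
        sk_ip_protox[i] = SK_SW_RAW;
    sk_ip_protox[1]  = SK_SW_ICMP;   /* IP protocol 1  */
    sk_ip_protox[2]  = SK_SW_IGMP;   /* IP protocol 2  */
    sk_ip_protox[6]  = SK_SW_TCP;    /* IP protocol 6  */
    sk_ip_protox[17] = SK_SW_UDP;    /* IP protocol 17 */
}

/* Map the ip_p field of a received datagram to a table entry. */
int sk_demux(uint8_t ip_p)
{
    return sk_ip_protox[ip_p];
}
```

Because the table has one slot for each of the 256 possible ip_p values, the lookup is a single array access with no bounds check needed.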