LINUX NETWORK IMPLEMENTATION
Jianyong Zhang

Introduction
The layered structure of the network implementation:
1) BSD socket layer: general data structures for the different protocols
2) INET socket layer: end points for the IP-based protocols TCP and UDP
3) ARP layer
4) Link layer: Ethernet, SLIP, PLIP
5) Hardware: NIC, serial port, parallel port

Socket system call
C interface system call routines: socket(), bind(), listen(), connect(), accept(), send(), sendto(), recv(), recvfrom(), getsockopt(), setsockopt(). All are based on the system call socketcall().
socket() returns a file descriptor, so read(), write(), select() and ioctl() also work on a socket. They go through struct file: file -> f_op -> sock_read.
Socket inode:
    struct socket *sock_alloc(void)
    {
        …
        inode->i_mode = S_IFSOCK|S_IRWXUGO;
        inode->i_sock = 1;
        inode->i_uid = current->fsuid;
        inode->i_gid = current->fsgid;
        sock->inode = inode;
        …
    }
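The effect of the S_IFSOCK mode set in sock_alloc() can be observed from user space. The following is a minimal sketch, not taken from the slides: the function name socket_fd_is_socket_inode() is hypothetical, and AF_UNIX is chosen only because the mode is set in the generic socket layer, so any address family should behave the same way.

```c
#define _POSIX_C_SOURCE 200112L
#include <sys/socket.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: creates a socket, asks fstat() about its
 * descriptor, and reports whether the inode is marked as a socket.
 * Returns 1 if it is, 0 if it is not, -1 on error. */
int socket_fd_is_socket_inode(void)
{
    struct stat st;
    int result;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    /* S_ISSOCK() tests the S_IFSOCK file type bits in st_mode. */
    result = S_ISSOCK(st.st_mode) ? 1 : 0;
    close(fd);
    return result;
}
```

Because the descriptor is backed by an ordinary struct file, the same fd could also be handed to read(), write() or close(), as the slide notes.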

Generic system call
socketcall() function:
    asmlinkage int sys_socketcall(int call, unsigned long *args)
    {
        …
        unsigned long a0, a1;
        /* copy_from_user should be SMP safe. */
        if (copy_from_user(a, args, nargs[call]))
            return -EFAULT;
        a0 = a[0];
        a1 = a[1];
        switch (call) {
        case SYS_SOCKET:
            err = sys_socket(a0, a1, a[2]);
            break;
        case SYS_BIND:
            err = sys_bind(a0, (struct sockaddr *)a1, a[2]);
            break;
        …
        }
        …
    }
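The multiplexing idea behind sys_socketcall() (one entry point, a call number, and a table saying how many arguments to copy) can be sketched in plain user-space C. Everything here is illustrative: the DEMO_* constants, demo_nargs and the demo_* functions are made-up names, the table counts arguments rather than bytes, and a simple loop stands in for copy_from_user().

```c
#include <stddef.h>

enum { DEMO_SOCKET = 1, DEMO_BIND = 2, DEMO_MAX = 2 };

/* Number of arguments each call takes, indexed by call number. */
static const size_t demo_nargs[DEMO_MAX + 1] = { 0, 3, 3 };

/* Toy "system calls" that only encode their arguments in the result. */
static long demo_socket(long family, long type, long protocol)
{
    return family * 100 + type * 10 + protocol;
}

static long demo_bind(long fd, long addr, long addrlen)
{
    (void)addr;
    return fd + addrlen;
}

/* Single entry point: validate the call number, copy the argument
 * vector (stand-in for copy_from_user), then dispatch on the number. */
long demo_socketcall(int call, const long *args)
{
    long a[3] = { 0, 0, 0 };
    size_t i;

    if (call < 1 || call > DEMO_MAX)
        return -1;
    for (i = 0; i < demo_nargs[call]; i++)
        a[i] = args[i];

    switch (call) {
    case DEMO_SOCKET:
        return demo_socket(a[0], a[1], a[2]);
    case DEMO_BIND:
        return demo_bind(a[0], a[1], a[2]);
    }
    return -1;
}
```

A caller would pass the call number and an array of arguments, for example DEMO_SOCKET with {2, 1, 0}.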

Important structures
1. struct socket
    struct socket {
        socket_state state;   /* SS_FREE, SS_UNCONNECTED, SS_CONNECTING, SS_CONNECTED, SS_DISCONNECTING */
        unsigned long flags;
        struct proto_ops *ops;
        struct inode *inode;
        struct fasync_struct *fasync_list;   /* Asynchronous wake up list */
        struct file *file;   /* File back pointer */
        struct sock *sk;
        struct wait_queue *wait;
        short type;   /* SOCK_STREAM, SOCK_DGRAM, SOCK_RAW */
        unsigned char passcred;
        unsigned char tli;
    };

Important structures
2. struct proto_ops (argument lists are mostly omitted)
    struct proto_ops {
        int family;
        int (*dup) (struct socket *newsock, struct socket *oldsock);
        int (*release) (struct socket *sock, struct socket *peer);
        int (*bind) ();
        int (*connect) ();
        int (*socketpair) (struct socket *sock1, struct socket *sock2);
        int (*accept) ();
        int (*getname) ();
        unsigned int (*poll) ();
        int (*ioctl) ();
        int (*listen) (struct socket *sock, int len);
        int (*shutdown) (struct socket *sock, int flags);
        int (*setsockopt) (struct socket *sock, int level, int optname, /* … */);
        int (*getsockopt) ();
        int (*fcntl) ();
        int (*sendmsg) ();
        int (*recvmsg) ();
    };
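The relationship between struct socket and struct proto_ops is an operations table: the generic layer calls through function pointers that each protocol family fills in. The sketch below shows the pattern with a cut-down, hypothetical pair of structures (demo_socket, demo_proto_ops); none of these names or fields are the kernel's, and only two operations are modelled.

```c
#include <stddef.h>

#define DEMO_AF_INET 2   /* illustrative family number */

struct demo_socket;

/* Cut-down operations table, one function pointer per operation. */
struct demo_proto_ops {
    int family;
    int (*listen)(struct demo_socket *sock, int len);
    int (*shutdown)(struct demo_socket *sock, int flags);
};

/* Cut-down socket: it only remembers which ops table serves it. */
struct demo_socket {
    const struct demo_proto_ops *ops;
    int backlog;
    int shut_flags;
};

/* Family-specific implementations. */
static int demo_inet_listen(struct demo_socket *sock, int len)
{
    sock->backlog = len;
    return 0;
}

static int demo_inet_shutdown(struct demo_socket *sock, int flags)
{
    sock->shut_flags = flags;
    return 0;
}

static const struct demo_proto_ops demo_inet_ops = {
    DEMO_AF_INET, demo_inet_listen, demo_inet_shutdown
};

/* Binds a socket to the family's ops table. */
void demo_socket_init(struct demo_socket *sock)
{
    sock->ops = &demo_inet_ops;
    sock->backlog = 0;
    sock->shut_flags = 0;
}

/* Generic layer: no knowledge of the family, just an indirect call. */
int demo_sys_listen(struct demo_socket *sock, int len)
{
    return sock->ops->listen(sock, len);
}

int demo_sys_shutdown(struct demo_socket *sock, int flags)
{
    return sock->ops->shutdown(sock, flags);
}
```

Adding another protocol family in this model would mean supplying a second table, with no change to the generic functions.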

Important structures
3. struct sk_buff {...}: manages individual communication packets; kept in a doubly linked list.
4. struct sock {…}: the INET socket.
5. struct device {…}: controls an abstract network device, i.e. a network interface.

Getting the data from A to B
1. A and B each call socket(), then are connected by calling connect() and accept().
2. A calls write(socket, data, len); verify_area() checks the user buffer.
    {
        …
        file = fget(socket);
        inode = file->f_dentry->d_inode;
        if (!file->f_op || !(write = file->f_op->write))
            goto out;
        down(&inode->i_sem);
        ret = write(file, data, len, &file->f_pos);
        up(&inode->i_sem);
        …
    }
3. sock_write()
    {
        …
        struct socket *sock;
        sock = socki_lookup(file->f_dentry->d_inode);
        …
        msg.msg_iov = &iov;
        iov.iov_base = (void *)ubuf;
        …
        return sock_sendmsg(sock, &msg, size);
    }
4. For an INET socket, sock_sendmsg() calls inet_sendmsg().

Getting the data from A to B
5. inet_sendmsg()
    {
        struct sock *sk = sock->sk;
        …
        return sk->prot->sendmsg(sk, msg, size);   /* calls tcp_v4_sendmsg() */
    }
6. tcp_v4_sendmsg() calls tcp_do_sendmsg(sk, msg)
    {
        …
        struct sk_buff *skb;
        tmp = MAX_HEADER + sk->prot->max_header;
        skb = sock_wmalloc(sk, tmp, 0, GFP_KERNEL);
        skb_reserve(skb, MAX_HEADER + sk->prot->max_header);
        skb->csum = csum_and_copy_from_user(from, skb_put(skb, copy), copy, 0, &err);
        /* TCP data bytes are SKB_PUT() on top, later TCP+IP+DEV headers are SKB_PUSH()'d beneath. */
        tcp_send_skb(sk, skb, queue_it);
        …
    }
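The reserve/put/push discipline mentioned in the comment above can be illustrated with a toy buffer. This is a sketch under simplifying assumptions, not the kernel's sk_buff: the demo_skb structure and demo_skb_* functions are hypothetical, the buffer size is fixed, and there are no bounds checks.

```c
#include <stddef.h>
#include <string.h>

/* Toy packet buffer: data..tail marks the bytes currently in use. */
struct demo_skb {
    unsigned char buf[256];
    unsigned char *data;   /* start of used area */
    unsigned char *tail;   /* end of used area */
};

void demo_skb_init(struct demo_skb *skb)
{
    skb->data = skb->tail = skb->buf;
}

/* Leave len bytes of headroom for headers added later. */
void demo_skb_reserve(struct demo_skb *skb, size_t len)
{
    skb->data += len;
    skb->tail += len;
}

/* Append len bytes at the tail; returns where to write them. */
unsigned char *demo_skb_put(struct demo_skb *skb, size_t len)
{
    unsigned char *old_tail = skb->tail;
    skb->tail += len;
    return old_tail;
}

/* Prepend len bytes into the headroom; returns the new start. */
unsigned char *demo_skb_push(struct demo_skb *skb, size_t len)
{
    skb->data -= len;
    return skb->data;
}

size_t demo_skb_len(const struct demo_skb *skb)
{
    return (size_t)(skb->tail - skb->data);
}
```

With headroom reserved up front, the payload is copied once with put, and each lower layer prepends its header with push without moving the payload.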

Getting the data from A to B
5. tcp_send_skb() calls tcp_transmit_skb(sk, skb_clone(skb, GFP_KERNEL));
6. tcp_transmit_skb(struct sock *sk, struct sk_buff *skb)
    {
        …
        struct tcp_opt *tp = &(sk->tp_pinfo.af_tcp);
        /* Build TCP header and checksum it. */
        …
        tp->af_specific->queue_xmit(skb);
        …
    }
7. ip_queue_xmit()
    /* Queues a packet to be sent, and starts the transmitter if necessary. This routine also needs to put in the total length and compute the checksum. */
    {
        …
        /* Make sure we can route this packet. */
        skb->dst = dst_clone(sk->dst_cache);
        /* OK, we know where to send it, allocate and build IP header. */
        …
        /* Do we need to fragment. Again this is inefficient. We need to somehow lock the original buffer and use bits of it. */
        …
        /* Add an IP checksum. */
        …

Getting the data from A to B
ip_queue_xmit(), continued:
        skb->dst->output(skb);
        …
    }
7. Bottom-half synchronization with a barrier: start_bh_atomic(void), end_bh_atomic(void)
8. dev_queue_xmit()
    {
        …
        start_bh_atomic();
        q = dev->qdisc;
        if (q->enqueue) {
            q->enqueue(skb, q);
            qdisc_wakeup(dev);
            end_bh_atomic();
            …
            return;
        }
        if (dev->flags&IFF_UP) {
            dev->hard_start_xmit(skb, dev);
            end_bh_atomic();
            return;
        }
    }
9. For the WD8013 card, this calls ei_start_xmit(), which passes the data to the network adaptor, which in turn sends the packet onto the Ethernet.

Getting the data from A to B
10. The data, embedded in an Ethernet packet, is received by the NIC in B (assumed to be a WD8013).
11. The NIC triggers an interrupt, which is handled by ei_interrupt(). This calls ei_receive(). (The ei_* functions are chip-specific code for many 8390-based Ethernet adaptors.)
12. ei_receive()
    {
        …
        struct sk_buff *skb;
        skb = dev_alloc_skb(pkt_len+2);
        …
        netif_rx(skb);
        …
    }
13. netif_rx() receives a packet from a device driver and queues it for the upper (protocol) levels:
    {
        skb_queue_tail(&backlog, skb);
        mark_bh(NET_BH);
    }
14. There is only one backlog list in the entire system.
15. do_bottom_half() calls net_bh().
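The split between steps 13 and 15 (the interrupt handler only queues the packet and marks work as pending; the bottom half drains the queue later) can be sketched as a toy single-threaded model. All names here (demo_pkt, demo_queue, demo_netif_rx, demo_net_bh) are hypothetical, and the bh_pending flag merely stands in for mark_bh(NET_BH); real code would also need protection against concurrent access.

```c
#include <stddef.h>

/* Toy packet and FIFO backlog queue. */
struct demo_pkt {
    int id;
    struct demo_pkt *next;
};

struct demo_queue {
    struct demo_pkt *head;
    struct demo_pkt *tail;
    int bh_pending;   /* stand-in for mark_bh(NET_BH) */
};

/* "Interrupt" side: append to the backlog and flag deferred work. */
void demo_netif_rx(struct demo_queue *q, struct demo_pkt *p)
{
    p->next = NULL;
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    q->bh_pending = 1;
}

/* "Bottom half" side: dequeue up to max packets in arrival order,
 * recording their ids; returns how many were processed. */
int demo_net_bh(struct demo_queue *q, int *ids, int max)
{
    int n = 0;

    while (q->head && n < max) {
        struct demo_pkt *p = q->head;
        q->head = p->next;
        if (!q->head)
            q->tail = NULL;
        ids[n++] = p->id;
    }
    if (!q->head)
        q->bh_pending = 0;
    return n;
}
```

The point of the design is that the interrupt side does a constant amount of work per packet, leaving protocol processing to run later.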

Getting the data from A to B
10. net_bh()
    {
        …
        skb = skb_dequeue(&backlog);
        /* Bump the pointer to the next structure. skb->data and skb->nh.raw point to the MAC and encapsulated data */
        skb->h.raw = skb->nh.raw = skb->data;
        /* Fetch the packet protocol ID. */
        type = skb->protocol;
        /* We got a packet ID. Now loop over the "known protocols" list. There are two lists. The ptype_all list of taps (normally empty) and the main protocol list which is hashed perfectly for normal protocols. */
        …
        if (ptype->type == type && (ptype->dev==skb->dev)) {
            /* We already have a match queued. Deliver to it */
            skb2 = skb_clone(skb, GFP_ATOMIC);
            pt_prev->func(skb2, skb->dev, pt_prev);
            …
        }
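The "known protocols" lookup in net_bh() amounts to matching the packet's protocol ID against registered handlers and calling the one that matches. A simplified sketch follows; it uses a plain linked list rather than the hashed list the kernel comment describes, ignores the per-device match and the tap list, and all demo_* names and handler return values are hypothetical. The two protocol IDs are the standard EtherType values for IPv4 and ARP.

```c
#include <stddef.h>

#define DEMO_ETH_P_IP  0x0800   /* EtherType for IPv4 */
#define DEMO_ETH_P_ARP 0x0806   /* EtherType for ARP */

/* One registered handler per protocol type. */
struct demo_packet_type {
    unsigned short type;
    int (*func)(const unsigned char *data, size_t len);
    struct demo_packet_type *next;
};

/* Toy handlers: return a distinct code so the caller can tell
 * which one ran. */
int demo_ip_rcv(const unsigned char *data, size_t len)
{
    (void)data;
    (void)len;
    return 1;
}

int demo_arp_rcv(const unsigned char *data, size_t len)
{
    (void)data;
    (void)len;
    return 2;
}

/* Walk the handler list and deliver to the first matching type.
 * Returns the handler's result, or -1 if no handler is registered. */
int demo_deliver(const struct demo_packet_type *list, unsigned short type,
                 const unsigned char *data, size_t len)
{
    const struct demo_packet_type *pt;

    for (pt = list; pt != NULL; pt = pt->next) {
        if (pt->type == type)
            return pt->func(data, len);
    }
    return -1;
}
```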

Getting the data from A to B
10. For IP packets, the handler called is ip_rcv()
    {
        …
        /* check the header for correctness and deal with all the IP options. ip_forward() and ip_defrag() */
        …
        return skb->dst->input(skb);
    }
11. ip_local_deliver()
    {
        …
        /* Reassemble IP fragments. */
        skb = ip_defrag(skb);
        /* Deliver to raw sockets. This is fun as to avoid copies we want to make no surplus copies. */
        …
        /* Pass on the datagram to each protocol that wants it, based on the datagram protocol. */
        ...
        ipprot->handler(skb2, ntohs(iph->tot_len) - (iph->ihl * 4));
        …
    }
12. The handler is one of tcp_v4_rcv(), udp_rcv(), icmp_rcv().
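The length passed to the protocol handler above is the IP total length minus the header length, where ihl counts 32-bit words. A small stand-alone sketch of that arithmetic on a raw IPv4 header follows. The function name demo_ipv4_payload_len() is hypothetical, and the bytes of tot_len are combined by hand, which gives the same result as ntohs() on the 16-bit field.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the number of payload bytes that follow an IPv4 header,
 * i.e. tot_len - ihl * 4, or -1 if the header is too short or
 * inconsistent. hdr points at the first byte of the IP header. */
long demo_ipv4_payload_len(const uint8_t *hdr, size_t avail)
{
    unsigned int version, ihl, tot_len;

    if (avail < 20)            /* minimum IPv4 header size */
        return -1;

    version = hdr[0] >> 4;     /* high nibble of byte 0 */
    ihl = hdr[0] & 0x0f;       /* low nibble: header length in words */
    /* Bytes 2..3 hold tot_len in network (big-endian) byte order. */
    tot_len = ((unsigned int)hdr[2] << 8) | hdr[3];

    if (version != 4 || ihl < 5 || ihl * 4 > tot_len)
        return -1;
    return (long)(tot_len - ihl * 4);
}
```

For a header starting 0x45 0x00 0x00 0x3c, ihl is 5 and tot_len is 60, so the handler would be given 40 bytes.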

Getting the data from A to B
13. tcp_v4_rcv()
    {
        …
        /* check the header for correctness */
        …
        if (!atomic_read(&sk->sock_readers))
            return tcp_v4_do_rcv(sk, skb);
        __skb_queue_tail(&sk->back_log, skb);
        …
    do_time_wait:
        …
        case TCP_TW_ACK:
            tcp_v4_send_ack();
        …
    }
14. tcp_v4_do_rcv()
    {
        …
        __skb_queue_tail(&nsk->back_log, skb);
        if (sk->state == TCP_ESTABLISHED) {   /* Fast path */
            if (tcp_rcv_established(sk, skb, skb->h.th, skb->len))
                goto reset;
            return 0;
        }
        tcp_rcv_state_process(sk, skb, skb->h.th, skb->len);
        …
    }

Getting the data from A to B
15. tcp_rcv_established(): the TCP receive function for the ESTABLISHED state. It is split into a fast path and a slow path. The fast path is disabled when:
- A zero window was announced by us (zero window probing is only handled properly in the slow path).
- Out of order segments arrived.
- Urgent data is expected.
- There is no buffer space left.
- Unexpected TCP flags/window values/header lengths are received (detected by checking the TCP header against pred_flags).
- Data is sent in both directions. The fast path only supports pure senders or pure receivers (this means either the sequence number or the ack value must stay constant).
When the fast path cannot be used, processing drops into a standard receive procedure patterned after RFC 793 to handle all cases. The first three cases are guaranteed by proper pred_flags setting; the rest are checked inline. Fast processing is turned on in tcp_data_queue when everything is OK.

Getting the data from A to B
16. tcp_data() enters the buffer (sk_buff) in the list.
17. data_ready() wakes up the waiting processes.
18. The preceding actions are carried out in the kernel, outside of any process.
19. B executes read(socket, data, len).
20. The call goes through sys_read() -> sock_read() -> inet_recvmsg() -> tcp_recvmsg().
21. This completes the data's travels from process A to process B.
22. The data is copied only four times:
    1) From the user space of A to kernel memory
    2) From kernel memory to the network card
    3) From the network card to the other computer's kernel memory
    4) From B's kernel memory to B's user space
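From user space, the whole journey reduces to a write() on one descriptor and a read() on the other. The sketch below shows that view with a connected pair of sockets inside one process. Note the assumptions: demo_send_a_to_b() is a hypothetical helper, and it uses socketpair() with AF_UNIX for simplicity, so the INET, TCP, IP and device steps described above are not exercised; only the descriptor-based entry and exit points are the same.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends msg from endpoint A to endpoint B over a connected socket
 * pair and reads it back on B's side into out. Returns the number of
 * bytes B received, or -1 on error. */
long demo_send_a_to_b(const char *msg, char *out, size_t outlen)
{
    int sv[2];   /* sv[0] plays process A, sv[1] plays process B */
    ssize_t n;
    size_t len = strlen(msg);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* A: write() copies the data from user space into the kernel. */
    if (write(sv[0], msg, len) != (ssize_t)len) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* B: read() copies the data from the kernel into B's buffer. */
    n = read(sv[1], out, outlen);

    close(sv[0]);
    close(sv[1]);
    return (long)n;
}
```

In this local variant only the first and last of the four copies listed above take place, since no network card is involved.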