COMS W6998 Spring 2010 Erich Nahum

Network Layer: IP

Outline
- IP Layer Architecture
- Netfilter
- Receive Path
- Send Path
- Forwarding (Routing) Path

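Recall What IP Does
- Encapsulates/decapsulates transport-layer messages into/from IP datagrams
- Routes datagrams to their destination
- Handles static and/or dynamic routing updates
- Fragments/reassembles datagrams
- Does all of this unreliably (best effort)

IP packet format (bit offsets 0-31 within each 32-bit word):
  Bits 0-3: Version | Bits 4-7: IHL | Bits 8-15: Codepoint | Bits 16-31: Total length
  Fragment-ID (16 bits) | Flags, including DF and MF (3 bits) | Fragment-Offset (13 bits)
  Time to Live (8 bits) | Protocol (8 bits) | Checksum (16 bits)
  Source address (32 bits)
  Destination address (32 bits)
  Options and payload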
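The header layout above can be made concrete with a small userspace sketch that decodes the fixed 20-byte part of an IPv4 header from raw bytes. The struct `ipv4_fields` and the function `ipv4_parse` are hypothetical names chosen for illustration; they are not the kernel's `struct iphdr` or any kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded view of the fixed IPv4 header (illustrative, not struct iphdr). */
struct ipv4_fields {
    uint8_t  version;       /* 4 bits */
    uint8_t  ihl;           /* header length in 32-bit words */
    uint8_t  codepoint;     /* second header byte (former TOS) */
    uint16_t total_length;  /* header plus payload, in bytes */
    uint16_t fragment_id;
    int      df;            /* Don't Fragment flag */
    int      mf;            /* More Fragments flag */
    uint16_t frag_offset;   /* in units of 8 bytes */
    uint8_t  ttl;
    uint8_t  protocol;
    uint16_t checksum;
    uint32_t saddr;
    uint32_t daddr;
};

/* Read big-endian (network order) 16- and 32-bit values. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if fewer than 20 bytes are available. */
int ipv4_parse(const uint8_t *buf, size_t len, struct ipv4_fields *out)
{
    uint16_t flags_and_offset;

    if (len < 20)
        return -1;
    out->version = buf[0] >> 4;
    out->ihl = buf[0] & 0x0f;
    out->codepoint = buf[1];
    out->total_length = rd16(buf + 2);
    out->fragment_id = rd16(buf + 4);
    flags_and_offset = rd16(buf + 6);
    out->df = (flags_and_offset >> 14) & 1;
    out->mf = (flags_and_offset >> 13) & 1;
    out->frag_offset = flags_and_offset & 0x1fff;
    out->ttl = buf[8];
    out->protocol = buf[9];
    out->checksum = rd16(buf + 10);
    out->saddr = rd32(buf + 12);
    out->daddr = rd32(buf + 16);
    return 0;
}
```

Reading the fields byte by byte avoids any dependence on host byte order or bit-field layout, which is why this sketch does not overlay a struct on the buffer.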

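IP Implementation Architecture
[Diagram: call graph of the IPv4 layer, between the higher layers and the device layer (dev.c)]
- Receive (ip_input.c): netif_receive_skb -> ip_rcv -> NF_INET_PRE_ROUTING -> ip_rcv_finish -> ip_local_deliver -> NF_INET_LOCAL_IN -> ip_local_deliver_finish -> higher layers
- Forwarding (ip_forward.c): ip_rcv_finish -> ip_forward -> NF_INET_FORWARD -> ip_forward_finish -> ip_output
- Multicast: ip_rcv_finish -> ip_mr_input
- Send (ip_output.c): higher layers -> ip_queue_xmit -> ip_local_out -> NF_INET_LOCAL_OUT -> ip_output -> NF_INET_POST_ROUTING -> ip_finish_output -> ip_finish_output2 -> ARP (neigh_resolve_output) -> dev_queue_xmit
- ROUTING (Forwarding Information Base): consulted via ip_route_input on receive and ip_route_output_flow on send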

Sources of IP Packets
- Packets arrive on an interface and are passed to the ip_rcv() function.
- TCP/UDP packets are packed into an IP packet and passed down to IP via ip_queue_xmit().
- The IP layer generates IP packets itself:
  - Multicast packets
  - Fragments of a large packet
  - ICMP/IGMP packets

What is Netfilter?
- A framework for packet "mangling".
- A protocol defines "hooks", which are well-defined points in a packet's traversal of that protocol's stack.
  - IPv4 defines 5 hooks.
  - Other protocols include IPv6, ARP, bridging, and DECnet.
- At each of these points, the protocol calls the netfilter framework with the packet and the hook number.
- Parts of the kernel can register to listen to the different hooks for each protocol.
- When a packet is passed to the netfilter framework, it calls all registered callbacks for that hook and protocol.

Netfilter IPv4 Hooks
- NF_INET_PRE_ROUTING: all incoming packets pass this hook in ip_rcv(), before routing.
- NF_INET_LOCAL_IN: all incoming packets addressed to the local host pass this hook in ip_local_deliver().
- NF_INET_FORWARD: all incoming packets not addressed to the local host pass this hook in ip_forward().
- NF_INET_LOCAL_OUT: all outgoing packets created by this local computer pass this hook in ip_local_out() (called from, e.g., ip_queue_xmit() and ip_build_and_send_pkt()).
- NF_INET_POST_ROUTING: all outgoing packets (forwarded or locally created) pass this hook in ip_output(), which continues in ip_finish_output().

Netfilter Callbacks
- Kernel code can register a callback function to be called when a packet arrives at each hook; callbacks are free to manipulate the packet.
- The callback then tells netfilter to do one of five things:
  - NF_DROP: drop the packet; don't continue traversal.
  - NF_ACCEPT: continue traversal as normal.
  - NF_STOLEN: I've taken over the packet; stop traversal.
  - NF_QUEUE: queue the packet (usually for userspace handling).
  - NF_REPEAT: call this hook again.
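The callback and verdict mechanism can be sketched in userspace as a list of function pointers run in order at one hook point. This is a toy model under stated assumptions: `struct packet`, `struct hook_point`, `hook_register`, `hook_run`, and the three example callbacks are hypothetical names, and callbacks run in registration order here rather than by the priority ordering real netfilter uses. Only the verdict names come from the slide.

```c
#include <stddef.h>

/* Verdict names as listed on the slide. */
enum verdict { NF_DROP, NF_ACCEPT, NF_STOLEN, NF_QUEUE, NF_REPEAT };

/* Hypothetical stand-in for an sk_buff. */
struct packet {
    int mark;  /* value the callbacks inspect or modify */
    int seen;  /* number of counting callbacks that saw the packet */
};

typedef enum verdict (*hook_fn)(struct packet *pkt);

#define MAX_CALLBACKS 8

/* One hook point, e.g. the toy equivalent of NF_INET_PRE_ROUTING. */
struct hook_point {
    hook_fn fns[MAX_CALLBACKS];
    size_t count;
};

/* Appends a callback; returns -1 when the hook point is full. */
int hook_register(struct hook_point *hp, hook_fn fn)
{
    if (hp->count == MAX_CALLBACKS)
        return -1;
    hp->fns[hp->count++] = fn;
    return 0;
}

/*
 * Runs the callbacks in registration order. NF_ACCEPT moves on to the next
 * callback, NF_REPEAT calls the same callback again, and any other verdict
 * ends traversal and is returned to the caller. A callback that always
 * returns NF_REPEAT would loop forever in this sketch.
 */
enum verdict hook_run(struct hook_point *hp, struct packet *pkt)
{
    size_t i = 0;

    while (i < hp->count) {
        enum verdict v = hp->fns[i](pkt);

        if (v == NF_REPEAT)
            continue;
        if (v != NF_ACCEPT)
            return v;
        i++;
    }
    return NF_ACCEPT;
}

/* Example callbacks (hypothetical). */
enum verdict cb_count(struct packet *pkt)
{
    pkt->seen++;
    return NF_ACCEPT;
}

enum verdict cb_drop_negative(struct packet *pkt)
{
    return pkt->mark < 0 ? NF_DROP : NF_ACCEPT;
}

enum verdict cb_bump_to_three(struct packet *pkt)
{
    if (pkt->mark < 3) {
        pkt->mark++;
        return NF_REPEAT;
    }
    return NF_ACCEPT;
}
```

With `cb_count`, `cb_drop_negative`, `cb_count` registered in that order, a packet with a negative mark is counted once and then dropped, so the third callback never sees it.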

IPTables
- A packet selection system called IP Tables has been built over the netfilter framework.
- It is a direct descendant of ipchains (which came from ipfwadm, which came from BSD's ipfw), with extensibility.
- Kernel modules can register a new table and ask for a packet to traverse a given table.
- This packet selection method is used for:
  - Packet filtering (the 'filter' table)
  - Network Address Translation (the 'nat' table)
  - General pre-route packet mangling (the 'mangle' table)

Naming Conventions
- Methods are frequently broken into two stages, where the second has the same name with a suffix of "finish" or "slow". This is typical for networking kernel code.
  - E.g., ip_rcv and ip_rcv_finish
- The "slow" suffix is usually used when the first method looks in some cache and the second method performs a lookup in a more complex data structure, which is slower.

Receive Path: ip_rcv (ip_input.c)
- Packets that are not addressed to the host (packets received in promiscuous mode) are dropped.
- Does some sanity checking:
  - Does the packet have at least the size of an IP header?
  - Is this IP version 4?
  - Is the checksum correct?
  - Does the packet have a wrong length?
- If the buffer is longer than the total length given in the IP header (skb->len > iph->tot_len), it is trimmed with skb_trim(skb, iph->tot_len).
- Invokes the netfilter hook NF_INET_PRE_ROUTING.
- ip_rcv_finish() is called.
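The sanity checks listed for ip_rcv() can be sketched as a userspace function over a raw byte buffer. This is an illustration under assumptions, not the kernel code: `ipv4_sanity_check`, `header_sum`, and the `rcv_result` values are hypothetical names, and the buffer is assumed to start at the IP header.

```c
#include <stddef.h>
#include <stdint.h>

enum rcv_result {
    RCV_OK,
    RCV_TOO_SHORT,     /* not even a full IP header in the buffer */
    RCV_BAD_VERSION,   /* not IPv4 */
    RCV_BAD_IHL,       /* header length field below the 20-byte minimum */
    RCV_BAD_CHECKSUM,
    RCV_BAD_LENGTH     /* total length inconsistent with the buffer */
};

/*
 * One's-complement sum over the header's 16-bit words, checksum field
 * included. A header with a correct checksum sums to 0xffff.
 */
static uint16_t header_sum(const uint8_t *hdr, size_t hdr_len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < hdr_len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * Applies the checks in roughly the order the slide lists them. On RCV_OK,
 * *trimmed_len is the datagram's total length, i.e. the size the buffer
 * should be trimmed to if it carries trailing padding.
 */
enum rcv_result ipv4_sanity_check(const uint8_t *pkt, size_t len,
                                  size_t *trimmed_len)
{
    size_t hdr_len, total;

    if (len < 20)
        return RCV_TOO_SHORT;
    if ((pkt[0] >> 4) != 4)
        return RCV_BAD_VERSION;
    hdr_len = (size_t)(pkt[0] & 0x0f) * 4;
    if (hdr_len < 20)
        return RCV_BAD_IHL;
    if (hdr_len > len)
        return RCV_TOO_SHORT;
    if (header_sum(pkt, hdr_len) != 0xffff)
        return RCV_BAD_CHECKSUM;
    total = (size_t)((pkt[2] << 8) | pkt[3]);
    if (total < hdr_len || total > len)
        return RCV_BAD_LENGTH;
    *trimmed_len = total;
    return RCV_OK;
}
```

A caller would pass the received length as `len` and, on `RCV_OK`, shrink its view of the packet to `*trimmed_len` bytes before continuing.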

Receive Path: ip_rcv_finish (ip_input.c)
- If skb->dst is NULL, ip_route_input() is called to find the route of the packet.
  - It may already be set: someone else could have filled it in.
- skb->dst is set to an entry in the routing cache, which stores both the destination IP and a pointer to an entry in the hard header cache (the cache for layer 2 frame headers).
- If the IP header includes options, an ip_options structure is created.
- skb->dst->input() now points to the function that should be used to handle the packet (delivered locally or forwarded further):
  - ip_local_deliver()
  - ip_forward()
  - ip_mr_input()

Receive Path: ip_local_deliver (ip_input.c)
- The only task of ip_local_deliver(skb) is to reassemble fragmented packets by invoking ip_defrag().
- The netfilter hook NF_INET_LOCAL_IN is invoked.
- This in turn calls ip_local_deliver_finish().

Receive Path: ip_local_deliver_finish (ip_input.c)
- Removes the IP header from skb with __skb_pull(skb, ip_hdrlen(skb)).
- The protocol ID in the IP header is used to compute the index into the inet_protos hash table.
- The packet is passed to a raw socket if one exists (which copies skb).
- If the transport protocol is found, its handler is invoked:
  - tcp_v4_rcv(): TCP
  - udp_rcv(): UDP
  - icmp_rcv(): ICMP
  - igmp_rcv(): IGMP
- Otherwise the packet is dropped and an ICMP Destination Unreachable message is returned.

Hash Table inet_protos
[Diagram: the array inet_protos[MAX_INET_PROTOS]; each used slot points to a net_protocol structure]
- net_protocol fields: handler, err_handler, gso_send_check, gso_segment, gro_receive, gro_complete
- UDP entry: handler = udp_rcv(), err_handler = udp_err()
- IGMP entry: handler = igmp_rcv(), err_handler = NULL
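The lookup in inet_protos can be modelled in userspace as an array of pointers to handler structures, indexed by the protocol number. Everything prefixed `toy_` or `TOY_` below is a hypothetical name for illustration, and the table size of 256 is an assumption (one slot per 8-bit protocol value); the real net_protocol structure carries more fields than the single handler shown.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PROTOS 256    /* assumption: one slot per protocol number */
#define TOY_UNREACHABLE (-1)  /* caller would answer with ICMP unreachable */

struct toy_packet {
    uint8_t protocol;  /* protocol field copied from the IP header */
};

typedef int (*toy_handler)(struct toy_packet *pkt);

/* Reduced stand-in for net_protocol: only the receive handler. */
struct toy_net_protocol {
    toy_handler handler;
};

static const struct toy_net_protocol *toy_protos[TOY_MAX_PROTOS];

/* Registers a protocol; returns -1 if the slot is already taken. */
int toy_add_protocol(const struct toy_net_protocol *p, uint8_t num)
{
    unsigned hash = num & (TOY_MAX_PROTOS - 1);

    if (toy_protos[hash] != NULL)
        return -1;
    toy_protos[hash] = p;
    return 0;
}

/* Dispatches to the registered handler, or reports "no such protocol". */
int toy_deliver(struct toy_packet *pkt)
{
    const struct toy_net_protocol *p =
        toy_protos[pkt->protocol & (TOY_MAX_PROTOS - 1)];

    if (p == NULL || p->handler == NULL)
        return TOY_UNREACHABLE;
    return p->handler(pkt);
}

/* Example handlers that just return their protocol number. */
int toy_tcp_rcv(struct toy_packet *pkt) { (void)pkt; return 6; }
int toy_udp_rcv(struct toy_packet *pkt) { (void)pkt; return 17; }

const struct toy_net_protocol toy_tcp = { toy_tcp_rcv };
const struct toy_net_protocol toy_udp = { toy_udp_rcv };
```

After registering `toy_tcp` under 6 and `toy_udp` under 17, `toy_deliver()` on a packet with protocol 17 reaches `toy_udp_rcv()`, while an unregistered protocol number yields `TOY_UNREACHABLE`.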

Send Path: ip_queue_xmit (1) (ip_output.c)
- skb->dst is checked to see if it contains a pointer to an entry in the routing cache.
  - Many packets are routed through the same path, so storing a pointer to a routing entry in skb->dst saves an expensive routing table lookup.
- If a route is not present (e.g., for the first packet of a socket), ip_route_output_flow() is invoked to determine a route.

Send Path: ip_queue_xmit (2) (ip_output.c)
- The header is pushed onto the packet: skb_push(skb, sizeof(header) + options_length).
- The fields of the IP header are filled in (version, header length, TOS, TTL, addresses, and protocol).
- If IP options exist, ip_options_build() is called.
- ip_local_out() is invoked.

Send Path: ip_local_out (ip_output.c)
- The checksum is computed: ip_send_check(iph).
- Netfilter is invoked with NF_INET_LOCAL_OUT, continuing through dst_output(), which calls skb->dst->output().
  - This is ip_output().
- If the packet is for the local machine:
  - dst->output = ip_output
  - dst->input = ip_local_deliver
  - ip_output() sends the packet on the loopback device.
  - We then go into ip_rcv() and ip_rcv_finish(), but this time dst is NOT NULL, so we end up in ip_local_deliver().

Send Path: ip_output (ip_output.c)
- ip_output() does very little; it is essentially an entry into the output path from the forwarding layer.
- Updates some stats.
- Invokes netfilter with NF_INET_POST_ROUTING and ip_finish_output().

Send Path: ip_finish_output (ip_output.c)
- Checks the message length against the destination MTU.
- Calls either:
  - ip_fragment(), or
  - ip_finish_output2()
- The latter is actually a very long inline, not a function.
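The MTU decision and the resulting fragment layout can be sketched as plain arithmetic. The function `plan_fragments` and `struct frag` are hypothetical names; the sketch assumes every fragment repeats a header of the same size and ignores the Don't Fragment bit and IP options handling, so it is not a model of ip_fragment() itself.

```c
#include <stddef.h>

/* Layout of one fragment (illustrative). */
struct frag {
    size_t offset_units;   /* Fragment-Offset field, in units of 8 bytes */
    size_t payload_len;    /* payload bytes carried by this fragment */
    int    more_fragments; /* MF flag */
};

/*
 * Plans how a datagram with hdr_len header bytes and payload_len payload
 * bytes is sent over a link with the given MTU. Returns the number of
 * entries written to out: 1 if the datagram fits unfragmented, more if it
 * must be split, and 0 if the MTU is too small or out has too few slots.
 */
size_t plan_fragments(size_t hdr_len, size_t payload_len, size_t mtu,
                      struct frag *out, size_t max_out)
{
    size_t chunk, n = 0, off = 0;

    if (max_out == 0)
        return 0;
    if (hdr_len + payload_len <= mtu) {
        /* Fits: corresponds to going straight to ip_finish_output2(). */
        out[0] = (struct frag){ 0, payload_len, 0 };
        return 1;
    }
    if (mtu < hdr_len + 8)
        return 0;
    /* Every fragment except the last carries a multiple of 8 bytes. */
    chunk = (mtu - hdr_len) & ~(size_t)7;
    while (off < payload_len) {
        size_t left = payload_len - off;
        size_t len = left > chunk ? chunk : left;

        if (n == max_out)
            return 0;
        out[n].offset_units = off / 8;
        out[n].payload_len = len;
        out[n].more_fragments = (off + len < payload_len);
        n++;
        off += len;
    }
    return n;
}
```

For a 4000-byte datagram (20-byte header, 3980 bytes of payload) and an MTU of 1500, this yields three fragments with payloads of 1480, 1480, and 1020 bytes at offsets 0, 185, and 370.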

Send Path: ip_finish_output2 (ip_output.c)
- Checks skb for room for the MAC header. If there is not enough, calls skb_realloc_headroom().
- Sends the packet to a neighbor via dst->neighbour->output(skb).
- arp_bind_neighbour() sees to it that the L2 address (a.k.a. the MAC address) of the next hop will be known.
- These eventually end up in dev_queue_xmit(), which passes the packet down to the device.

Forwarding: ip_forward (1) (ip_forward.c)
- Does some validation and checking, e.g.:
  - If skb->pkt_type != PACKET_HOST, the packet is dropped.
  - If TTL <= 1, the packet is dropped and an ICMP message with ICMP_TIME_EXCEEDED is returned.
  - If the packet is too large (skb->len > mtu) and fragmentation is not allowed (the Don't Fragment bit is set in the IP header), the packet is discarded and an ICMP message with ICMP_FRAG_NEEDED is sent back.
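The three checks above can be written as one small decision function, which makes their order explicit. This is a sketch, not the kernel's ip_forward(): `forward_precheck`, `enum fwd_action`, and `enum toy_pkt_type` are hypothetical names, and the inputs are passed as plain values instead of being read from an sk_buff.

```c
#include <stddef.h>

enum toy_pkt_type { TOY_PACKET_HOST, TOY_PACKET_OTHERHOST };

enum fwd_action {
    FWD_CONTINUE,            /* packet may be forwarded */
    FWD_DROP,                /* silently dropped */
    FWD_ICMP_TIME_EXCEEDED,  /* dropped, ICMP time exceeded sent back */
    FWD_ICMP_FRAG_NEEDED     /* dropped, ICMP fragmentation needed sent back */
};

/*
 * Applies the checks in the order the slide lists them: packet type first,
 * then TTL, then size against the outgoing MTU with Don't Fragment set.
 */
enum fwd_action forward_precheck(enum toy_pkt_type type, unsigned ttl,
                                 size_t len, size_t mtu, int dont_fragment)
{
    if (type != TOY_PACKET_HOST)
        return FWD_DROP;
    if (ttl <= 1)
        return FWD_ICMP_TIME_EXCEEDED;
    if (len > mtu && dont_fragment)
        return FWD_ICMP_FRAG_NEEDED;
    return FWD_CONTINUE;
}
```

An oversized packet without the Don't Fragment bit passes this check; it is fragmented later on the output path.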

Forwarding: ip_forward (2) (ip_forward.c)
- skb_cow(skb, headroom) is called to check whether there is still sufficient space for the MAC header of the output device. If not, skb_cow() calls pskb_expand_head() to create sufficient space.
- The TTL field of the IP packet is decremented by 1.
  - ip_decrease_ttl() also incrementally modifies the header checksum.
- The netfilter hook NF_INET_FORWARD is invoked.
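The incremental checksum update can be illustrated on a raw header buffer. Because the TTL is the high byte of its 16-bit word, decrementing it lowers the one's-complement sum by 0x0100, so the stored checksum rises by 0x0100 with an end-around carry. The function name `ipv4_decrease_ttl` is a hypothetical userspace analogue, not the kernel's ip_decrease_ttl(), and it assumes the caller has already verified TTL > 1.

```c
#include <stdint.h>

/*
 * Decrements the TTL (byte 8) of an IPv4 header and patches the checksum
 * (bytes 10-11, network byte order) without re-summing the whole header.
 * Returns the new TTL.
 */
uint8_t ipv4_decrease_ttl(uint8_t *hdr)
{
    uint32_t check = ((uint32_t)hdr[10] << 8) | hdr[11];

    check += 0x0100;             /* compensate for TTL going down by one */
    check += (check >= 0xffff);  /* end-around carry of one's complement */
    hdr[10] = (uint8_t)(check >> 8);
    hdr[11] = (uint8_t)check;
    return --hdr[8];
}
```

For a header with TTL 0x40 and checksum 0xb861, the call leaves TTL 0x3f and checksum 0xb961, the same value a full recomputation over the modified header gives.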

Forwarding: ip_forward_finish (ip_forward.c)
- Increments some stats.
- Handles any IP options if they exist.
- Calls the destination output function via skb->dst->output(skb), which is ip_output().

IP Backup

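Recall the IP Header
IP packet format (bit offsets 0-31 within each 32-bit word):
  Bits 0-3: Version | Bits 4-7: IHL | Bits 8-15: Codepoint | Bits 16-31: Total length
  Fragment-ID (16 bits) | Flags, including DF and MF (3 bits) | Fragment-Offset (13 bits)
  Time to Live (8 bits) | Protocol (8 bits) | Checksum (16 bits)
  Source address (32 bits)
  Destination address (32 bits)
  Options and payload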

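Recall the sk_buff Structure
[Diagram: an sk_buff and its packet data area; source: linux-2.6.31/include/linux/skbuff.h]
- List linkage: next and prev, chained from an sk_buff_head
- Fields shown: sk (points to a struct sock), tstamp, dev (points to a net_device), ...lots of stuff..., transport_header, network_header, mac_header, head, data, tail, end, truesize, users
- Packet data area, from head to end: headroom, MAC header (mac_header), IP header (network_header), UDP header (transport_header), UDP data, tailroom
- skb_shared_info: dataref: 1, nr_frags, ..., destructor_arg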
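The role of the head, data, tail, and end pointers, and of operations such as skb_push() and __skb_pull() used on the send and receive paths, can be shown with a toy buffer. All `toy_` names are hypothetical userspace stand-ins; unlike the kernel helpers, they return an error instead of panicking when there is not enough room.

```c
#include <stddef.h>
#include <stdint.h>

/* Four pointers into one flat buffer, as in the diagram. */
struct toy_skb {
    uint8_t *head;  /* start of the buffer */
    uint8_t *data;  /* start of the bytes currently in use */
    uint8_t *tail;  /* end of the bytes currently in use */
    uint8_t *end;   /* end of the buffer */
};

void toy_skb_init(struct toy_skb *skb, uint8_t *buf, size_t size)
{
    skb->head = skb->data = skb->tail = buf;
    skb->end = buf + size;
}

size_t toy_headroom(const struct toy_skb *skb) { return (size_t)(skb->data - skb->head); }
size_t toy_tailroom(const struct toy_skb *skb) { return (size_t)(skb->end - skb->tail); }
size_t toy_len(const struct toy_skb *skb)      { return (size_t)(skb->tail - skb->data); }

/* Reserve headroom in an empty buffer (like skb_reserve). */
int toy_reserve(struct toy_skb *skb, size_t n)
{
    if (toy_len(skb) != 0 || n > toy_tailroom(skb))
        return -1;
    skb->data += n;
    skb->tail += n;
    return 0;
}

/* Append n bytes at the tail (like skb_put). */
uint8_t *toy_put(struct toy_skb *skb, size_t n)
{
    uint8_t *old_tail = skb->tail;

    if (n > toy_tailroom(skb))
        return NULL;
    skb->tail += n;
    return old_tail;
}

/* Prepend n bytes, e.g. a header on the send path (like skb_push). */
uint8_t *toy_push(struct toy_skb *skb, size_t n)
{
    if (n > toy_headroom(skb))
        return NULL;
    skb->data -= n;
    return skb->data;
}

/* Strip n bytes from the front on the receive path (like skb_pull). */
uint8_t *toy_pull(struct toy_skb *skb, size_t n)
{
    if (n > toy_len(skb))
        return NULL;
    skb->data += n;
    return skb->data;
}
```

On the send path a caller would reserve headroom for all headers first, put the payload, and then push the transport, IP, and MAC headers one after another; on the receive path each layer pulls its own header before handing the buffer up.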

Recall pkt_type in sk_buff
pkt_type specifies the type of a packet:
- PACKET_HOST: a packet sent to the local host
- PACKET_BROADCAST: a broadcast packet
- PACKET_MULTICAST: a multicast packet
- PACKET_OTHERHOST: a packet not destined for the local host, but received in promiscuous mode
- PACKET_OUTGOING: a packet leaving the host
- PACKET_LOOPBACK: a packet sent by the local host to itself