GigE for the MSR
Fred Kuhns
Washington University in St. Louis

Ethernet Forwarding Scenario 1
Fred Kuhns - 1/9/01

[Figure: hosts, a router, and the MSR connected through two Ethernet switches.
 Host MACs: 08:00:20:7C:E3:25, 08:00:20:7C:F2:45, 00:40:33:A3:4C:04, 08:00:20:54:6C:4A.
 Router: Port 0 MAC 00:01:03:7C:23:03, Port 1 MAC 00:01:03:7C:56:34.
 MSR (ports P0, P1, P3): Port 1 MAC 00:00:5E:04:00:01.]

Packet arrives with destination host on local network.
Output port must map destination IP address to MAC address.
  – Use the Address Resolution Protocol (ARP) to map the destination IP address to 08:00:20:7C:E3:25.
  – Encapsulate datagram in Ethernet frame and send.
Frame: Destination Addr | IP hdr | data

Ethernet Forwarding Scenario 2

[Figure: same topology as Scenario 1. The router (Port 0 MAC 00:01:03:7C:23:03, Port 1 MAC 00:01:03:7C:56:34) forwards to the final destination host.]

Packet arrives with destination host NOT on locally attached network.
Output port must send to the next hop router.
  – Next hop router IP address must be used in the ARP request: map the next hop IP address to 00:01:03:7C:23:03.
  – Encapsulate datagram in Ethernet frame and send.
Frame: Destination Addr | IP hdr | data
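
The two scenarios differ only in which IP address the output port resolves with ARP. A minimal sketch of that choice follows; the function name `arp_target` and all IP addresses are hypothetical, since the addresses in the figures are not available.

```python
import ipaddress


def arp_target(dst_ip: str, local_net: str, next_hop_ip: str) -> str:
    """Return the IP address that the output port should resolve with ARP.

    Scenario 1: the destination is on the locally attached network, so the
    destination IP itself is resolved.
    Scenario 2: the destination is elsewhere, so the next hop router's IP
    is resolved instead.
    """
    if ipaddress.ip_address(dst_ip) in ipaddress.ip_network(local_net):
        return dst_ip
    return next_hop_ip


# Hypothetical addresses, for illustration only.
print(arp_target("192.168.1.20", "192.168.1.0/24", "192.168.1.1"))
print(arp_target("10.0.0.5", "192.168.1.0/24", "192.168.1.1"))
```

In both cases the IP destination address in the datagram is unchanged; only the Ethernet destination address differs.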

Ethernet Frame Format

Ethernet Header:
  Destination Address (6 B) | Source Address (6 B) | Ether Type (2 B)
IP Datagram:
  IP Header:
    Version | H-length | TOS | Total length
    Identification | Flags | Fragment offset
    TTL | Protocol | Header checksum
    IP Source Address
    IP Destination Address
  Transport Header

IP Encapsulation in Ethernet Frames

Ethernet Encapsulation, RFC 894:
  dst address (6) | src address (6) | type 0800 | Data | Pad (0-46) | FCS (4)
IEEE 802.3/802.2 encapsulation, RFC 1042:
  dst address (6) | src address (6) | len (2) | LLC 802.2: DSAP AA, SSAP AA, ctl 03 | SNAP: Org Code 00, type | Data | Pad (0-46) | FCS (4)

Ethernet frame size: 64-1518 Bytes
If the type/len field <= 1500, then IEEE frame, otherwise Ethernet V2.
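
The type/len rule above can be sketched as a small classifier. This is an illustration only; `classify_frame` is a hypothetical name, and the frame is assumed to start at the destination address (no preamble) with the FCS omitted.

```python
import struct


def classify_frame(frame: bytes) -> str:
    """Classify a frame by the 2-byte field that follows the two addresses."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 12-13 hold either a length (IEEE 802.3) or a type (Ethernet V2).
    (type_or_len,) = struct.unpack("!H", frame[12:14])
    if type_or_len <= 1500:
        return "IEEE 802.3"
    return "Ethernet V2"


# Ethernet V2 frame carrying IPv4 (type 0x0800), zero-filled payload.
v2_frame = bytes(12) + struct.pack("!H", 0x0800) + bytes(46)
# IEEE 802.3 frame: length field, then LLC (AA AA 03) and zero-filled rest.
ieee_frame = bytes(12) + struct.pack("!H", 46) + b"\xaa\xaa\x03" + bytes(43)

print(classify_frame(v2_frame))
print(classify_frame(ieee_frame))
```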

ARP Frame

Ethernet header:
  Destination Address (6 B)
  Source Address (6 B)
  Ether Type (2 B)
ARP message:
  Hardware Address Space (2 B)
  Protocol Address Space (2 B)
  Byte length of Hardware address = 6 (1 B)
  Byte length of Protocol address = 4 (1 B)
  Operation Code, 1 = Request, 2 = Reply (2 B)
  Hardware Address of Sender (6 B)
  Protocol Address of Sender (4 B)
  Hardware Address of Destination (6 B)
  Protocol Address of Destination (4 B)

ARP Message Formats

ARP Request:
  dst address ff:ff:ff:ff:ff:ff | src address | type 0806 | has 0001 | pas 0800 | hl 6 | pl 4 | op 01 | sha | spa | tha | tpa | pad | FCS
ARP Reply:
  dst address | src address | type 0806 | has 0001 | pas 0800 | hl 6 | pl 4 | op 02 | sha | spa | tha | tpa | pad | FCS

Example: the Request (01) carries Host A Eth and Host A IP and asks for Host B IP; the Reply (02) returns Host B Eth.

Ethernet Frame with ARP Request/Reply - 64 Bytes:
  Ethernet Header (14 B)
  Ethernet Data - pad with zeros to 46 Bytes:
    ARP Message (28 Bytes for Request or Reply)
    18 Byte Pad
  FCS (4 B)
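
As a rough illustration of the request layout above, the sketch below packs an ARP Request into a 60-byte frame (64 bytes once the 4-byte FCS is appended, which is assumed to be done by the hardware). The helper names and the IP addresses are hypothetical; the source MAC is the MSR port address from the forwarding scenarios.

```python
import socket
import struct

BROADCAST = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_arp_request(src_mac: str, src_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame (without FCS) carrying an ARP Request."""
    sha = mac_bytes(src_mac)
    eth_header = BROADCAST + sha + struct.pack("!H", ETHERTYPE_ARP)  # 14 B
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        0x0001,                       # has: Ethernet
        0x0800,                       # pas: IP
        6,                            # hl
        4,                            # pl
        1,                            # op: Request
        sha,                          # sha
        socket.inet_aton(src_ip),     # spa
        bytes(6),                     # tha: unknown, zero-filled
        socket.inet_aton(target_ip),  # tpa
    )                                 # 28 B
    pad = bytes(46 - len(arp))        # 18 B of zeros
    return eth_header + arp + pad


frame = build_arp_request("00:00:5E:04:00:01", "192.168.1.1", "192.168.1.20")
print(len(frame))
```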

IP over ATM (RFC 791 and RFC 2684)

IP Datagram:
  IP Header:
    Version | H-length | TOS | Total length
    Identification | Flags | Fragment offset
    TTL | Protocol | Header checksum
    Source Address
    Destination Address
    Options ??
  IP data (transport header and transport data)
AAL5 padding
AAL5 Trailer:
  CPCS-UU (0) | Length (IP packet + LLC/SNAP) | CRC

IP Header Fields (RFC 791)

Version - support IPv4 (4)
Header Length - length in 32 bit words (>= 5)
TOS - layout: Precedence (3 bits) | D | T | R | reserved (0)
  – Precedence field, highest to lowest: Network Control, Internetwork Control, CRITIC/ECP, Flash Override, Flash, Immediate, Priority, Routine
  – Remaining TOS fields: D - 1 = Low Delay, T - 1 = High Throughput, R - 1 = High Reliability
Total Length - length of datagram in octets
Id - assists in reassembling fragments
Flags - layout: 0 | DF | MF
  – DF - 1 = Don't Fragment, MF - 1 = More Fragments
Fragment Offset - where fragment belongs in the datagram; offset is in units of 8 octets
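
The bit layouts of the TOS byte and of the 16-bit flags/fragment-offset word can be illustrated with a short decoder. This is a sketch based on the RFC 791 layout (precedence in the top three bits of TOS; fragment offset counted in units of 8 octets); the function names are hypothetical.

```python
def decode_tos(tos: int) -> dict:
    """Split the 8-bit TOS field into precedence and the D, T, R bits."""
    return {
        "precedence": (tos >> 5) & 0x7,       # 7 = Network Control, 0 = Routine
        "low_delay": bool(tos & 0x10),        # D
        "high_throughput": bool(tos & 0x08),  # T
        "high_reliability": bool(tos & 0x04), # R
    }


def decode_flags_fragment(word: int) -> dict:
    """Split the 16-bit word that holds the flags and the fragment offset."""
    return {
        "dont_fragment": bool(word & 0x4000),   # DF
        "more_fragments": bool(word & 0x2000),  # MF
        "offset_octets": (word & 0x1FFF) * 8,   # field counts 8-octet units
    }


print(decode_tos(0xB0))
print(decode_flags_fragment(0x2003))
```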

IP Header Fields

TTL - router must decrement; if 0 then discard packet
Protocol - UDP/TCP/ICMP/RSVP to name a few
Header Checksum - 16 bit one's complement of the one's complement sum of all 16 bit words in header
Source Address - sending host's IP address
Destination Address - destination host's IP address
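
The header checksum definition above can be made concrete with a few lines of code. This is a minimal sketch, not the MSR implementation; the function name is hypothetical and the sample header is an arbitrary 20-byte IPv4 header with its checksum field zeroed.

```python
import struct


def ip_header_checksum(header: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the header.

    The checksum field inside `header` must be zero when computing a new
    checksum. Running the function over a header that already contains a
    valid checksum yields 0.
    """
    if len(header) % 2:
        raise ValueError("header length must be a multiple of 2 bytes")
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    # Fold the carries back in (end-around carry of one's complement addition).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample header with the checksum field (bytes 10-11) set to zero.
sample = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = ip_header_checksum(sample)
print(hex(checksum))
```

Because a router decrements TTL, it must also update this checksum before forwarding.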

Packet Routing Within MSR

ATM uses VCs as link layer address.
Ethernet: Base VC used for directly attached hosts, subports are for next hop routers.

Ingress (from previous hop router or endstation):
  Link Interface -> InVC -> FPX (FIPL shim proc., add shim) -> SPC (shim demux, FIPL IP proc plugins, shim update) -> WUGS on VC = out port + IntBase
Egress:
  WUGS -> VC = in port + IntBase -> SPC (shim demux, FIPL IP proc plugins, shim update) -> FPX (rem shim) -> OutVC

Inbound VC = SPI + ExtBase, 0 <= SPI <= 15
  – Currently support at most 4 inbound VCs: one for Ethernet or four for ATM
Outbound VC = SPI + ExtBase, 0 <= SPI <= 15
  – Currently support at most 4

IP processing for FPX:
1. Broadcast and multicast destination address
2. IP options
3. ICMP messages
4. Packet not recognized

Current VCI support:
1) 64 ports (PN)
2) 16 sub-ports (SP)

GigE Link Interface (from FPX/SPC, to next hop or endstation)

Frame conversion: [IP Header | data | AAL5 trailer] -> [Ethernet | IP Header | data]

Pkt VC = 50: endsystem, broadcast or multicast address
  – Send to pkt->dst: if bcast or mcast, map to eaddr; else resolve w/ARP
  – Map multicast or broadcast to ethernet address
If VC != 50 (to a next hop router): lookup VC in VIN table returns IP used for ARP lookup (support N = 4)
  – NH #1 = Base + 1 = 51
  – NH #2 = Base + 2 = 52
  – NH #3 = Base + 3 = 53

VIN Table - 4 entries:
  VC | MyIP   | NhIP
  50 | MyIP   |
  51 | MyIP 0 | NhIP 0
  52 | MyIP 1 | NhIP 1
  53 | MyIP 2 | NhIP 2
Software creates VIN table at boot time by writing to interface.

ARP Table (M entries):
  IP   | MAC
  IP 1 | MAC 1
  ...
  IP M | MAC M
No ARP entry aging!
If ARP table lookup fails, send ARP request to broadcast address and drop packet. No retries are made.

Add Ethernet header using the derived destination address and our source address. Protocol is IP.

Ethernet Assigned Numbers

RFC 1700 obsoleted by online database at IANA
Ethernet Address - 6 octets:
  – 3 high-order octets = Organizationally Unique Identifier (OUI)
  – 3 low-order octets = the interface number
Multicast bit = least significant bit of the most significant byte (xxxx xxx1)
  – first byte odd => multicast or broadcast
  – first byte even => unicast address
  – multicast address = ((OUI | 0x010000) << 24) | Group_ID
Ethernet Broadcast: FF:FF:FF:FF:FF:FF
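
The odd/even rule for the first byte can be checked directly in code. The following sketch uses hypothetical helper names and addresses that appear elsewhere in these slides.

```python
def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    raw = bytes(int(part, 16) for part in mac.split(":"))
    if len(raw) != 6:
        raise ValueError("expected 6 octets")
    return raw


def is_group_address(mac: str) -> bool:
    """True for multicast or broadcast: lsb of the first byte is 1."""
    return bool(mac_bytes(mac)[0] & 0x01)


def is_broadcast(mac: str) -> bool:
    """True only for FF:FF:FF:FF:FF:FF."""
    return mac_bytes(mac) == b"\xff" * 6


def oui(mac: str) -> str:
    """Return the 3 high-order octets (the OUI) as text."""
    return ":".join(f"{b:02X}" for b in mac_bytes(mac)[:3])


print(is_group_address("01:00:5E:00:00:01"))
print(is_group_address("08:00:20:7C:E3:25"))
print(oui("00:00:5E:04:00:01"))
```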

IP and Ethernet Multicast

IANA has allocated address block with OUI = 00:00:5E
  – Used for unicast addresses for "IETF standard track protocols"
  – Half of multicast addresses reserved for IP, remaining for "special use". Leaves 23 bits for multicast addresses: 01:00:5E:00:00:00 to 01:00:5E:7F:FF:FF
  – Could use this block for our interface, see ethernet numbers
IP Multicast
  – Class D address: 0xE followed by a 28 bit Group ID
  – 224.0.0.0 to 239.255.255.255 (0xE0000000 to 0xEFFFFFFF)
IP to Ethernet Mapping
  – RFC 1112, Host Extensions for IP Multicasting
  – Non-unique mapping: 28 bit IP group to 23 bit Ethernet group; 32 IP multicast groups per mapped ethernet multicast address

Multicast: IP to Ethernet Mappings

Network byte ordering, Internet standard bit order (big-endian).

[Figure: a Class D (multicast) IP address, 1110 xxxx xxxx xxxx xxxx xxxx xxxx xxxx. The bits between the 1110 prefix and the LSB 23 bits are not used in the IP to Ethernet mapping. The LSB 23 bits (xxx xxxx xxxx xxxx xxxx xxxx) are placed in the low-order bits of the block of Ethernet multicast addresses, whose first byte carries the multicast bit.]
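
The mapping in the figure can be sketched as follows: keep the low 23 bits of the Class D address and place them in the 01:00:5E:00:00:00 block. The function name is hypothetical; the example also shows why the mapping is not unique.

```python
import socket
import struct

MULTICAST_BLOCK = 0x01005E000000  # 01:00:5E:00:00:00


def multicast_ip_to_mac(group: str) -> str:
    """Map a Class D IP address to its Ethernet multicast address."""
    (addr,) = struct.unpack("!I", socket.inet_aton(group))
    if addr >> 28 != 0xE:
        raise ValueError("not a Class D (multicast) address")
    low23 = addr & 0x7FFFFF          # the 5 bits above these are dropped
    mac = MULTICAST_BLOCK | low23
    return ":".join(f"{(mac >> shift) & 0xFF:02X}" for shift in range(40, -1, -8))


print(multicast_ip_to_mac("224.0.0.1"))
# 224.128.0.1 differs only in the dropped bits, so it maps to the same address.
print(multicast_ip_to_mac("224.128.0.1"))
```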

IP Broadcast

No direct impact on GigE interface.
IP broadcast: by default, we will not forward directed broadcasts.
  – Limited broadcast: {-1, -1}. Must not be forwarded; destination address only.
  – Directed broadcast: {Network-Number, -1}; destination address only.
  – Subnet directed broadcast: {Network-Number, Subnet-Number, -1}
  – Directed broadcast to all subnets: {Network-Number, -1, -1}

Unicast - If we use the IANA Block

[Figure: Ethernet address from MSB to LSB. The IANA block of Ethernet addresses, with the multicast bit set to 0, is followed by 16 bits (xxxx xxxx xxxx xxxx) labeled ARL Interface Number.]

GigE Link Interface (from next hop or endstation, to FPX/SPC)

Frame conversion: [Ethernet | IP Header | data] -> [IP Header | data | AAL5 trailer], sent on Base VC to FPX/SPC
ARP Table (M entries): IP 1 -> MAC 1, ..., IP M -> MAC M
*Unicast MAC address filtering

receive ethernet frame: eth
if (eth->type == ARP)
    if (eth->arp->has != Ethernet/0001) Drop Frame
    if (eth->arp->pas != IP/0800) Drop Frame
    update {eth->arp->spa, eth->arp->sha} in ARP table
    if (eth->arp->tpa NOT in {MyIP0, MyIP1, MyIP2})
        Drop Frame  // target IP not ours
    if (eth->arp->op == Request/01) {
        swap source and target ARP info
        set operation to Reply
        set ether header src and dst address
        send reply
    }
    // Already handled eth->arp->op == Reply/02
    // when updated cache above
else if (eth->type == IPv4)
    remove ethernet header, padding and CRC
    add AAL5 trailer and required padding
    break into cells and send on default Base VC
else
    Error, drop packet
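
The ARP branch of the receive logic above can be sketched in software. This is an illustration under stated assumptions, not the interface's hardware logic: the ARP message is represented as a plain dictionary, the ARP table as a dictionary keyed by IP address, and the name `handle_arp` and all IP addresses are hypothetical.

```python
HW_ETHERNET = 0x0001
PROTO_IP = 0x0800
OP_REQUEST = 1
OP_REPLY = 2


def handle_arp(arp: dict, arp_table: dict, my_ips: set, my_mac: str):
    """Process one received ARP message; return a reply dict or None."""
    if arp["has"] != HW_ETHERNET or arp["pas"] != PROTO_IP:
        return None                          # drop frame
    # Learn the sender mapping for both requests and replies.
    arp_table[arp["spa"]] = arp["sha"]
    if arp["tpa"] not in my_ips:
        return None                          # target IP not ours
    if arp["op"] == OP_REQUEST:
        # Swap source and target info and answer with our own address.
        return {
            "has": HW_ETHERNET,
            "pas": PROTO_IP,
            "op": OP_REPLY,
            "sha": my_mac,
            "spa": arp["tpa"],
            "tha": arp["sha"],
            "tpa": arp["spa"],
        }
    return None                              # reply: cache already updated


table = {}
request = {
    "has": 0x0001, "pas": 0x0800, "op": 1,
    "sha": "08:00:20:7C:E3:25", "spa": "192.168.1.20",
    "tha": "00:00:00:00:00:00", "tpa": "192.168.1.1",
}
reply = handle_arp(request, table, {"192.168.1.1"}, "00:00:5E:04:00:01")
print(reply)
```

Note that, as in the slide's logic, the sender mapping is cached even when the target IP is not one of ours.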

Notes

Packet received on ATM interface:
  – If received on Base_VC (i.e. 50) then map IP destination (ip->dst_addr) to ethernet representation. Unicast uses ARP table; multicast and broadcast use the appropriate mapping.
  – Otherwise, look up VC in VIN table: table entry index = RX_VC - Base_VC. ARP the resulting Next Hop IP address.
  – This permits a simple mechanism for "tunneling" traffic to a gateway. This allows us to support directed broadcast and provides a convenient mechanism for testing.
Packet received on Ethernet interface:
  – If IPv4 then send all (unicast, multicast and broadcast) to input port processor on the Base_VC (i.e. 50)

ARP Cache

IP Address = Network_Prefix.Host or simply Net.Host
  – Assume a prefix length of at least 24 bits, which leaves 8 bits for the host
  – An interface can have at most 3 unique IP addresses
Interface may communicate with at most 256 hosts per network
Implement ARP cache as a table with 768 entries (3 * 256). See next slide.

VIN Table:
  Entry Number | Prefix Mask | Local IP Address | Next Hop IP Address
  0            | Mask 0      | MyIP 0           | NH 0
  1            | Mask 1      | MyIP 1           | NH 1
  2            | Mask 2      | MyIP 2           | NH 2
  Net 0 = Mask 0 & MyIP 0
  Net 1 = Mask 1 & MyIP 1
  Net 2 = Mask 2 & MyIP 2

ARP Table:
  Net   | IP                   | Ethernet
  Net 0 | IP 0,0 ... IP 0,255  | Ether 0,0 ... Ether 0,255
  Net 1 | IP 1,0 ... IP 1,255  | Ether 1,0 ... Ether 1,255
  Net 2 | IP 2,0 ... IP 2,255  | Ether 2,0 ... Ether 2,255

Implementing the ARP Table

'get next packet':  // received frame from ATM interface
    if (RX_VC == Base_VC)
        ipdst = ip->dst_addr;
    else
        ipdst = VIN_Table[RX_VC - Base_VC].NextHop;
    // ipdst == IP address of host we must send packet to

    // determine network
    for (i = 0; i < 3; i++) {
        if ((ipdst & Mask[i]) == (MyIP[i] & Mask[i])) {
            index = (i << 8) | (ipdst & ~Mask[i]);
            break;
        }
    }
    if (i == 3)
        drop packet, goto 'get next packet'
    // i corresponds to the Network Number (0 - 2)

    if (ArpTable[index].EtherAddress != 00:00:00:00:00:00) {
        construct ethernet frame
        send packet
        goto 'get next packet'
    } else {
        send ARP Request for ipdst
        drop packet, goto 'get next packet'
    }

VIN Table:
  Entry Number | Prefix Mask | Local IP Address | Next Hop IP Address
  0            | Mask 0      | MyIP 0           | NH 0
  1            | Mask 1      | MyIP 1           | NH 1
  2            | Mask 2      | MyIP 2           | NH 2

ARP Table (addressed by index; don't need to store IP address):
  IP 0,0 ... IP 0,255 -> Ether 0,0 ... Ether 0,255
  IP 1,0 ... IP 1,255 -> Ether 1,0 ... Ether 1,255
  IP 2,0 ... IP 2,255 -> Ether 2,0 ... Ether 2,255
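
The index computation can be tried out with a small model of the VIN table. The sketch below is an illustration only: the table contents and the names `VIN_TABLE` and `arp_index` are hypothetical, prefixes are assumed to be at least 24 bits long, and the host part is masked to 8 bits so that each network occupies exactly 256 table entries.

```python
import ipaddress
from typing import Optional


def ip(text: str) -> int:
    """Dotted-quad string to 32-bit integer."""
    return int(ipaddress.ip_address(text))


# Hypothetical VIN table: one entry per local network (at most 3).
VIN_TABLE = [
    {"mask": ip("255.255.255.0"), "my_ip": ip("192.168.1.1"), "next_hop": ip("192.168.1.254")},
    {"mask": ip("255.255.255.0"), "my_ip": ip("10.0.0.1"), "next_hop": ip("10.0.0.254")},
    {"mask": ip("255.255.255.0"), "my_ip": ip("172.16.5.1"), "next_hop": ip("172.16.5.254")},
]


def arp_index(ipdst: int, vin_table=VIN_TABLE) -> Optional[int]:
    """Return the ARP table index (0-767) for ipdst, or None to drop."""
    for i, entry in enumerate(vin_table):
        mask = entry["mask"]
        if (ipdst & mask) == (entry["my_ip"] & mask):
            # Network number selects a 256-entry block; host bits select the slot.
            return (i << 8) | (ipdst & ~mask & 0xFF)
    return None  # not on any attached network: drop packet


print(arp_index(ip("10.0.0.77")))
print(arp_index(ip("8.8.8.8")))
```

Because the index is derived from the address itself, the table only needs to hold the Ethernet address for each slot.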

Notes and Issues

GigE Control Interface for software configuration:
1. Reset interface to defaults
2. Clear ARP cache
3. Read ARP table
4. Read VIN table
5. Read ethernet address
6. Set VIN table entries and other registers
  – Set BASE VC (currently 50)
  – Set entries in the VIN table
  – Add static ARP entries??

Notes and Issues

Comprehensive testing scenarios need defining
  – verify multicast and broadcast
VC to control line card

References

RFC 1122: Requirements for Internet Hosts
  – Must send and receive using RFC 894 - compliant
  – Should receive RFC 1042 mixed with RFC 894 - we do not
  – May send using RFC 1042 - we do not
  – Must use ARP
  – Must flush out-of-date ARP cache entries - not compliant
  – Must prevent ARP floods - we only try once
  – Should have configurable ARP cache timeout - no
  – Should save at least one (latest) unresolved (by ARP) packet - no
  – Must report broadcasts to IP layer - compliant
  – IP layer must pass TOS to link layer - via the header
  – Must not report no ARP entry as "destination unreachable" - compliant

References

RFC 826: Address Resolution Protocol
  – Maps IP address to 48 bit Ethernet address
  – Our processing differs in minor ways
RFC 1700: Assigned Numbers
  – Ethertype values defined by RFC 1700
  – IP to ethernet multicast address mapping defined
RFC 1812: Requirements for IPv4 Routers
  – Must not believe ARP reply if it contains multicast or broadcast address - not compliant
  – Must be compliant with RFC - Partial Support
Ethernet V2 only
  – RFC 894: IP encapsulation in Ethernet V2 - Supported
  – RFC 1042: IP encapsulation in IEEE 802.3/802.2 frames - Not Supported