

Mart Haitjema SPP Version 1 NAT Daemon (natd)

2 - Mart Haitjema - 5/6/2015 NATD Overview

Manages NAT connections for a Linecard (LC) in SPP
  » Creates NAT connections:
    - Manages UDP and TCP ports and ICMP IDs on a per-interface basis
    - Translates a board's (GPE or CP) UDP/TCP port number or ICMP ID to an interface's externally visible port or ICMP ID
    - Enables a connection by installing an ingress and an egress filter in the LC's TCAM
  » Tracks connection state:
    - UDP/ICMP: by hardware activity monitoring using TCAM aging bits (see Aging)
    - TCP: by tracking connection state (see TCP State Machine)
  » Removes connections:
    - Removes inactive UDP/ICMP connections whose filters have timed out
    - Removes stale TCP connections that have timed out in a particular state
    - Disables a connection by removing its ingress and egress filters

Supported NAT connections:
  » Connections initiated from a board in SPP
    - UDP: identified by a 2-tuple, maps to a public UDP port
        [board MAC, board port] -> public port
    - TCP: identified by a 4-tuple, maps to a public TCP port
        [board MAC, board port, remote IP, remote port] -> public port
    - ICMP echo-request (ping): identified by a 2-tuple, maps to a public ICMP ID
        [board MAC, board ICMP ID] -> public ID
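The UDP mapping above ([board MAC, board port] -> public port) can be sketched as a small table backed by a pool of public ports. This is an illustrative sketch only, not natd's code: the names `UdpKey` and `UdpNatMap`, the 64-bit MAC representation, and the "lowest free port first" policy are all assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <utility>

// Hypothetical key for a UDP NAT connection: [board MAC, board port].
// The MAC is held in a 64-bit integer purely for brevity (assumption).
using UdpKey = std::pair<std::uint64_t, std::uint16_t>;

class UdpNatMap {
public:
    // The pool covers the inclusive public port range [firstPort, lastPort].
    UdpNatMap(std::uint16_t firstPort, std::uint16_t lastPort) {
        assert(firstPort <= lastPort);
        for (std::uint32_t p = firstPort; p <= lastPort; ++p)
            free_.insert(static_cast<std::uint16_t>(p));
    }

    // Returns the public port for the key, allocating the lowest free
    // port on first use. Returns -1 when the pool is exhausted.
    int translate(const UdpKey& key) {
        auto it = map_.find(key);
        if (it != map_.end()) return it->second;
        if (free_.empty()) return -1;
        std::uint16_t port = *free_.begin();
        free_.erase(free_.begin());
        map_.emplace(key, port);
        return port;
    }

    // Removes the connection and returns its port to the pool.
    // Returns false if no connection exists for the key.
    bool remove(const UdpKey& key) {
        auto it = map_.find(key);
        if (it == map_.end()) return false;
        free_.insert(it->second);
        map_.erase(it);
        return true;
    }

private:
    std::set<std::uint16_t> free_;          // unallocated public ports
    std::map<UdpKey, std::uint16_t> map_;   // connection -> public port
};
```

A TCP table would follow the same pattern with the 4-tuple as the key, and an ICMP table with [board MAC, board ICMP ID] mapping to a public ICMP ID.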

3 - Mart Haitjema - 5/6/2015 NATD Overview

Daemon can reside anywhere
  » Intended to run on LC Ingress XScale for performance
Interacts with:
  » SCD
    - Sends packet meta-data for NAT from datapath to natd
    - natd sends back updated meta-data and instructs SCD to forward, drop, or ignore packet
    - Receives write and remove filter instructions from natd
    - Ingress SCD:
      - Polls TCAM for filters that have timed out (see "Aging")
      - Informs natd of timed out filters
  » SRM
    - Determines queue/scheduler for NAT to use on each link (board-interface mapping) (see "Links")
    - natd queries for this information at startup
  » Flow stats
    - natd informs flow stats of new/removed NAT connections

4 - Mart Haitjema - 5/6/2015 NAT Message Exchange

Egress SCD to NATD
  » nat_egress: process packet requiring NAT from egress
Ingress SCD to NATD
  » nat_ingress: process packet requiring NAT from ingress
  » timed_out_filters: the following filter IDs have timed out through aging
NATD to Ingress SCD
  » nat_filters: tells SCD which filter IDs to use aging with
  » write_fltr: install a filter for NAT in LC's TCAM
  » rem_fltr_by_fid: remove a NAT filter from LC's TCAM
NATD to SRM
  » get_sched_map: get queue/scheduler information for use by NAT connections

[Figure: message exchange diagram. Labels: Line card, XScale, INGRESS, EGRESS, TCAM, SCD, NATD, PCI BUS, Control Processor (CP), SRM. Messages: nat_ingress, timed_out_filters, nat_egress, nat_filters, write_fltr, rem_fltr_by_fid, get_sched_map.]

5 - Mart Haitjema - 5/6/2015 NATD Interface

result egress_natd(meta-data)
    valBuf_t meta-data {
        dw4_t words[8];   // the meta-data as defined on meta-data slides
    }
    valBuf_t result {
        dw4_t retCode;    // code to scd to drop, forward, or ignore packet
        dw4_t words[7];   // updated meta-data as defined on meta-data slides
    }
  » Sends packet meta-data to natd so natd can manage state for the packet's connection. natd returns updated meta-data with an instruction for SCD to drop, forward, or ignore the packet.

result ingress_natd(meta-data)
    valBuf_t meta-data {
        dw4_t words[8];   // the meta-data as defined on meta-data slides
    }
    valBuf_t result {
        dw4_t retCode;    // code to scd to drop, forward, or ignore packet
        dw4_t words[6];   // updated meta-data as defined on meta-data slides
    }
  » Sends packet meta-data to natd so natd can manage state for the packet's connection. natd returns updated meta-data with an instruction for SCD to drop, forward, or ignore the packet.

6 - Mart Haitjema - 5/6/2015 NATD Interface

status timed_out_filters(ingStartFid, numIngFids, egrStartFid, numEgrFids, ingFids, egrFids)
    dw4_t ingStartFid    // start of range of ingress filters
    dw4_t egrStartFid    // start of range of egress filters
    dw4_t numIngFids     // number of filters polled in ingress DB
    dw4_t numEgrFids     // number of filters polled in egress DB
    valBuf_t ingFids {
        dw4_t fids[]     // list of timed out filter IDs in ingress DB
    }
    valBuf_t egrFids {
        dw4_t fids[]     // list of timed out filter IDs in egress DB
    }
  » Sets/clears the timeout flag for all the filters that natd has state for in the range of filters specified for each database
  » See "Aging" for how the call is used

7 - Mart Haitjema - 5/6/2015 Links

NAT traffic:
  » Routed across links
  » One link between each SPP board and LC interface
  » A link specifies which queue manager, scheduler, queue, and VLAN should be used to route traffic both in and out of the LC
  » Mappings are retrieved at startup by querying the SRM using the get_sched_map(...) call
See

8 - Mart Haitjema - 5/6/2015 SCD Changes

Both SCDs:
  » New thread
    - Periodically (10ms) polls for packets in datapath to XScale scratch ring
    - Sends packet meta-data to natd to process
      - nat_ingress(...) call for ingress
      - nat_egress(...) call for egress
      - natd returns
        - updated meta-data if packet needs to be forwarded
        - instruction to drop, forward or ignore packet
    - If hit bit is not set, XScale has a copy of the packet and must either drop or forward the packet
Ingress only:
  » Starts when natd calls nat_filters(...) on ingress SCD
  » Periodically checks TCAM activity bits for NAT filters (see Aging)
  » Uses timed_out_filters(...) to inform natd which filters have timed out and which have not

9 - Mart Haitjema - 5/6/2015 SCD to NATD: Packet meta-data

Egress (fields as labeled on the slide):
  » Buf Handle (24b)
  » IP Pkt Length (16b)
  » Eth Hdr Len (8b)
  » Flags (8b)
  » IP_SAddr (32b)
  » SrcMAC (8b)
  » TCP/UDP SPort or ICMP ID (16b)
  » IP Proto (8b)
  » ICMP Type (8b)
  » IP_DAddr (32b)
  » TCP/UDP DPort (16b)
  » TCAM Hit Index (32b)
  » IP Hdr 1st Word (32b)
  » IP Hdr Top 16 bits of 2nd Word (16b)
Ingress (fields as labeled on the slide):
  » Buf Handle (24b)
  » IP Pkt Length (16b)
  » Eth Hdr Len (8b)
  » Reserved (8b)
  » Flags (8b)
  » IP DAddr (32b)
  » Intf (4b)
  » TCP/UDP DPort or ICMP ID (16b)
  » Protocol (8b)
  » ICMP Type (8b)
  » Rsv (4b)
  » IP_SAddr (32b)
  » TCP/UDP SPort (16b)
  » TCAM Hit Index (32b)
  » IP Hdr 1st Word (32b)
  » IP Hdr Top 16 bits of 2nd Word (16b)
Flags (8b), both directions:
  » TCP Flags (6b): S (SYN), R (RST), P (PSH), A (ACK), F (FIN), U (URG), 1b each
  » H (1b): Hit
  » Rsvd (labeled 1b on one layout and 3b on the other)
Notes:
  » TCP state on XScale uses full 5-tuple
  » TCP state updates include TCAM Hit Index
From:

10 - Mart Haitjema - 5/6/2015 NATD to SCD: updated meta-data

Egress (fields as labeled on the slide):
  » Buf Handle (24b)
  » IP DAddr (32b)
  » IP Pkt Length (16b)
  » Reserved (8b)
  » Eth Hdr Len (8b)
  » IP Hdr 1st Word (32b)
  » Flags (8b)
  » Translated SPort (16b)
  » Stats Index (16b)
  » VLAN (12b)
  » PerSchedQID (15b)
  » Sch (3b)
  » QM (2b)
  » IP Hdr Top 16 bits of 2nd Word (16b)
  » Reserved (16b)
Ingress (fields as labeled on the slide):
  » Reserved (8b)
  » Buf Handle (24b)
  » IP Pkt Length (16b)
  » Translated DPort/ID (16b)
  » Stats Index (16b)
  » Eth Hdr Len (8b)
  » IP Hdr 1st Word (32b)
  » Flags (8b)
  » VLAN (12b)
  » PerSchedQID (15b)
  » Sch (3b)
  » QM (2b)
  » IP Hdr Top 16 bits of 2nd Word (16b)
  » Reserved (16b)
Natd updates fields in dark blue
Flags (8b), both directions: Reserved (3b), N (1b), H (1b), I (1b), U (1b), T (1b)
  » H: HIT - Lookup was a valid hit.
  » N: NAT - NAT translation is required
  » I: ICMP - ICMP pkt
  » U: UDP - UDP pkt
  » T: TCP - TCP pkt
At most one of I/U/T should be set at any time
If N is 0, then I/U/T will be ignored
  » HF does not need to do any protocol specific operations for packets that do not require NAT translation
No need to send any H=0 pkts to HF.
From:
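The flag rules on this slide (at most one of I/U/T set, and I/U/T ignored when N is 0) can be expressed as a small decoder. The sketch below is hedged: the bit positions assume the slide lists the flags byte from most to least significant bit (Reserved, N, H, I, U, T), which the transcript does not confirm, and the names `NatProto`, `natProtocol`, `isHit`, and `makeFlags` are invented for illustration. Treating "N set with no protocol bit" as invalid is also an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit positions: Reserved(7..5) N(4) H(3) I(2) U(1) T(0).
constexpr std::uint8_t kFlagN = 1u << 4;  // NAT translation required
constexpr std::uint8_t kFlagH = 1u << 3;  // lookup was a valid hit
constexpr std::uint8_t kFlagI = 1u << 2;  // ICMP packet
constexpr std::uint8_t kFlagU = 1u << 1;  // UDP packet
constexpr std::uint8_t kFlagT = 1u << 0;  // TCP packet

enum class NatProto { None, Icmp, Udp, Tcp, Invalid };

bool isHit(std::uint8_t flags) { return (flags & kFlagH) != 0; }

// Decodes which protocol needs NAT handling.
// If N is 0, the I/U/T bits are ignored and None is returned.
NatProto natProtocol(std::uint8_t flags) {
    if ((flags & kFlagN) == 0) return NatProto::None;
    switch (flags & (kFlagI | kFlagU | kFlagT)) {
        case kFlagI: return NatProto::Icmp;
        case kFlagU: return NatProto::Udp;
        case kFlagT: return NatProto::Tcp;
        default:     return NatProto::Invalid;  // zero or several bits set
    }
}

// Builds a flags byte; N is set only when a protocol is given.
std::uint8_t makeFlags(bool hit, NatProto proto) {
    assert(proto != NatProto::Invalid);
    std::uint8_t flags = hit ? kFlagH : 0;
    switch (proto) {
        case NatProto::Icmp: flags |= kFlagN | kFlagI; break;
        case NatProto::Udp:  flags |= kFlagN | kFlagU; break;
        case NatProto::Tcp:  flags |= kFlagN | kFlagT; break;
        default: break;  // None: leave N and I/U/T clear
    }
    return flags;
}
```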

11 - Mart Haitjema - 5/6/2015 NATD - Top Level

Single threaded, uses event queue for timed events
On start up retrieves scheduler information for board/interface mappings from SRM using the get_sched_map(...) call
Main loop:
  » Process messages from SCDs until next scheduled timeout event
    - i.e. nat_ingress(...), nat_egress(...), and timed_out_filters(...)
    - Installs and removes connections by calling write_fltr(...) and rem_fltr_by_fid(...) on Ingress SCD
  » Service timeout events
    - Events to remove UDP/ICMP connections with timed out filters
    - Events to remove stale TCP connections
    - See slides on Timeout Events
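A single-threaded loop of this shape is commonly built around a time-ordered event queue: the loop handles messages until the earliest deadline, then services whatever is due. The following is a minimal sketch under that assumption; `TimedEvent` and `EventQueue` are hypothetical names, time is a plain integer in seconds, and the sketch does not support removing a queued event.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

struct TimedEvent {
    std::uint64_t fireAt;            // absolute time, in seconds
    std::function<void()> action;    // work to do when the event fires
};

// Orders the priority queue as a min-heap on fireAt.
struct Later {
    bool operator()(const TimedEvent& a, const TimedEvent& b) const {
        return a.fireAt > b.fireAt;
    }
};

class EventQueue {
public:
    void schedule(std::uint64_t fireAt, std::function<void()> action) {
        assert(static_cast<bool>(action));
        q_.push(TimedEvent{fireAt, std::move(action)});
    }

    // Writes the time of the earliest event to `out`; the main loop
    // would process SCD messages until then. Returns false if empty.
    bool nextDeadline(std::uint64_t& out) const {
        if (q_.empty()) return false;
        out = q_.top().fireAt;
        return true;
    }

    // Runs every event due at or before `now`, earliest first.
    // Returns the number of events that fired.
    int service(std::uint64_t now) {
        int fired = 0;
        while (!q_.empty() && q_.top().fireAt <= now) {
            TimedEvent ev = q_.top();
            q_.pop();
            ev.action();
            ++fired;
        }
        return fired;
    }

private:
    std::priority_queue<TimedEvent, std::vector<TimedEvent>, Later> q_;
};
```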

12 - Mart Haitjema - 5/6/2015 New NAT connection example

[Figure: sequence diagram for a new NAT connection. Labels: Datapath (Lookup, Hdr Format, TCAM), SCR, XScale, SCD, NATD. Steps shown: poll for packets; packet meta-data sent via nat_ingress/egress; install ingress filter with write_fltr(...); install egress filter with write_fltr(...); natd response (drop/forward/ignore) with updated meta-data.]

13 - Mart Haitjema - 5/6/2015 Table Structure

[Figure: table structure diagram. Labels: natTable (IP Address: XXX.XXX.XXX.XXX, Ifn: X), tcpTable, icmpTable, filterTable, tcpConnection, udpConnection, icmpConnection; each connection is shown with an ingressFilter and an EgressFilter.]

One NAT Table per interface
All NAT tables share a pool of filters from the FilterTable

14 - Mart Haitjema - 5/6/2015 TCP State Machine

[Figure: state diagram. States: NULL, SYN-WAIT, ESTABLISHED, INGRESS CLOSED, EGRESS CLOSED, FIN-WAIT. Edge labels: syn, syn ack, fin (ingress), fin (egress), rst. Edges are annotated with the action numbers below.]

Transition: Action:
  1  create connection instance, install filters, add tcpSynTout event
  2  remove tcpSynTout event, add tcpIdleTout event
  3  remove tcpIdleTout event, add tcpFinTout event
  4  remove tcpFinTout/tcpIdleTout, re-add tcpIdleTout event
  5  remove connection, filters, & all timeout events

15 - Mart Haitjema - 5/6/2015 Timeout Events: TCP

TCP timeouts
  » All timeouts remove the connection when they fire
  » tcpSynTout:
    - Period: 5 minutes
    - Installed when connection transitions to SYN-WAIT state
    - Removed when connection transitions to ESTABLISHED state
  » tcpIdleTout:
    - Period: 24 hours
    - Installed when connection transitions to ESTABLISHED state
    - Removed when connection transitions to FIN-WAIT state
  » tcpFinTout:
    - Period: 5 minutes
    - Installed when connection transitions to FIN-WAIT state
    - Removed if connection is closed
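The three TCP timeouts amount to a per-state period, with the connection removed once the period elapses. The sketch below only encodes the periods listed on this slide (300 s, 86400 s, 300 s); the names `TcpState`, `TcpTimeouts`, `timeoutFor`, and `removalDeadline` are hypothetical, and it ignores the re-arming of the idle timeout that the state machine slide mentions.

```cpp
#include <cassert>
#include <cstdint>

// Only the three states that carry a timeout on the slide.
enum class TcpState { SynWait, Established, FinWait };

// Periods in seconds, defaulting to the values given on the slide.
struct TcpTimeouts {
    std::uint32_t synSeconds = 5 * 60;         // tcpSynTout: 5 minutes
    std::uint32_t idleSeconds = 24 * 60 * 60;  // tcpIdleTout: 24 hours
    std::uint32_t finSeconds = 5 * 60;         // tcpFinTout: 5 minutes
};

// Returns the timeout period that applies while in `state`.
std::uint32_t timeoutFor(TcpState state,
                         const TcpTimeouts& cfg = TcpTimeouts()) {
    switch (state) {
        case TcpState::SynWait:     return cfg.synSeconds;
        case TcpState::Established: return cfg.idleSeconds;
        case TcpState::FinWait:     return cfg.finSeconds;
    }
    assert(false && "unknown TCP state");
    return 0;
}

// Absolute time at which a connection that entered `state` at
// `enteredAt` would be removed if no transition happens first.
std::uint64_t removalDeadline(TcpState state, std::uint64_t enteredAt,
                              const TcpTimeouts& cfg = TcpTimeouts()) {
    return enteredAt + timeoutFor(state, cfg);
}
```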

16 - Mart Haitjema - 5/6/2015 Timeout Events: UDP/ICMP

UDP & ICMP timeouts
  » udpAgeTout / icmpAgeTout
    - Period: 5 minutes
    - Remove connection if both ingress & egress filter for connection have timed out

17 - Mart Haitjema - 5/6/2015 Aging

Hardware aging:
  » Uses TCAM's hardware activity bits
  » See "TCAM and Aging" in
Algorithm:
  » SCD
    - Polls TCAM for filters that have timed out
      - Uses the range of filter IDs specified by the nat_filters(...) call. Range must be a multiple of 32
      - Calls IdtSearchDatabaseSwAgeAndGetAgedEntries(...) to get timed out filters in a subset of the range of filter IDs in each database
      - Checks entire range of NAT filters every 5 minutes
      - Checks the same range of filter IDs in ingress & egress database at the same time
    - Informs natd which filters have timed out in each range via the timed_out_filters(...) call
  » Natd
    - Updates state of each filter in the range of filters specified in timed_out_filters(...)
      - For each filter in the specified range that natd has state for:
      - Sets the timed out flag if SCD reported the filter as timed out
      - Clears the timed out flag otherwise
    - Each UDP/ICMP connection has a timeout event that fires every 5 minutes
      - If both filters have timed out, the connection is removed
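The natd side of the aging algorithm can be sketched in two steps: apply a timed_out_filters(...) report to the per-filter flags for one database, then decide whether a UDP/ICMP connection should be removed. This is a sketch under assumptions, not the natd implementation: `FilterState`, `FilterDb`, `Connection`, `applyReport`, and `shouldRemove` are hypothetical names, and filter state is kept in a `std::map` for simplicity.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical per-filter state kept by natd.
struct FilterState {
    bool timedOut = false;
};

// Filter ID -> state, one instance per database (ingress or egress).
using FilterDb = std::map<std::uint32_t, FilterState>;

// Applies one database's part of a timed_out_filters(...) report:
// within [startFid, startFid + numFids), filters that natd has state
// for are flagged if reported as timed out and cleared otherwise.
void applyReport(FilterDb& db, std::uint32_t startFid,
                 std::uint32_t numFids,
                 const std::vector<std::uint32_t>& timedOutFids) {
    for (std::uint32_t fid : timedOutFids)
        assert(fid >= startFid && fid - startFid < numFids);
    std::set<std::uint32_t> reported(timedOutFids.begin(),
                                     timedOutFids.end());
    for (auto it = db.lower_bound(startFid);
         it != db.end() && it->first - startFid < numFids; ++it)
        it->second.timedOut = reported.count(it->first) > 0;
}

// A UDP/ICMP connection owns one ingress and one egress filter.
struct Connection {
    std::uint32_t ingressFid;
    std::uint32_t egressFid;
};

// True only when both of the connection's filters have timed out.
bool shouldRemove(const Connection& c, const FilterDb& ingress,
                  const FilterDb& egress) {
    auto i = ingress.find(c.ingressFid);
    auto e = egress.find(c.egressFid);
    return i != ingress.end() && e != egress.end() &&
           i->second.timedOut && e->second.timedOut;
}
```

The per-connection timeout event that fires every 5 minutes would call something like `shouldRemove` and tear the connection down when it returns true.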

18 - Mart Haitjema - 5/6/2015 Status

To do:
  » Finish TCAM aging (need to debug IDT call): FINISHED
  » Fix eventManager to allow events on queue to be removed
  » Send connection information to flow stats
  » Implement hash functions for faster connection state lookup
Open issues:
  » Burst of UDP packets not handled well

19 - Mart Haitjema - 5/6/2015 File Structure

techX repository: wu_arl/dnet/npe/natd
Files:
  » bitmap.{cc,h}
    - bitmap/portmap class used for managing freelist of available ports/IDs
  » boards.{cc,h}
    - defines board & link classes
  » connections.{cc,h}
    - defines ICMP, UDP, and TCP connection data structures
  » events.{cc,h}
    - all timeout events
  » filters.{cc,h}
    - filter code and filter table
    - includes calls to SCD to install/uninstall filters
  » natd.{cc,h}
    - reads configuration file, gets scheduler mappings from SRM, includes main processing loop
  » statOp.{cc,h}
    - code for natd interface calls [egress_nat(...), ingress_nat(...), and timed_out_filters(...)]
  » tables.{cc,h}
    - defines all table data structures [natTable, icmpTable, udpTable, and tcpTable]
    - manages all connection state (e.g. open/close connection, TCP state transitions, etc.)
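A bitmap-backed freelist of ports or IDs, as attributed to bitmap.{cc,h} above, might look roughly like the following. The actual class is not shown in the slides, so the name `PortBitmap`, its interface, and the linear "lowest free ID" scan are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One bit per ID in the inclusive range [first, last]; 1 means in use.
class PortBitmap {
public:
    PortBitmap(std::uint32_t first, std::uint32_t last)
        : first_(first),
          count_(last - first + 1),
          words_((count_ + 31) / 32, 0) {
        assert(first <= last);
    }

    // Returns the lowest free ID and marks it used, or -1 if none is free.
    std::int64_t allocate() {
        for (std::uint32_t i = 0; i < count_; ++i) {
            std::uint32_t bit = 1u << (i % 32);
            if ((words_[i / 32] & bit) == 0) {
                words_[i / 32] |= bit;
                return static_cast<std::int64_t>(first_) + i;
            }
        }
        return -1;
    }

    // Marks an ID free again. Returns false if the ID is outside the
    // range or was not in use.
    bool release(std::uint32_t id) {
        if (id < first_ || id - first_ >= count_) return false;
        std::uint32_t i = id - first_;
        std::uint32_t bit = 1u << (i % 32);
        if ((words_[i / 32] & bit) == 0) return false;
        words_[i / 32] &= ~bit;
        return true;
    }

private:
    std::uint32_t first_;               // first ID in the range
    std::uint32_t count_;               // number of IDs in the range
    std::vector<std::uint32_t> words_;  // packed in-use bits
};
```

With the example UDP range from the configuration slide, `PortBitmap pm(30000, 30499)` would hand out 30000 first and reuse a released port on the next allocation.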

20 - Mart Haitjema - 5/6/2015 Configuration File Format

myAddr = 0                       natd's address
myPort = 5050                    natd's port
scdAddr = 0                      scd's address
scdPort = 7070                   scd's port
srmAddr =                        srm's address
srmPort = 6060                   srm's port
loglvl = Loud                    logging verbosity

[GeneralParameters]
tcpSynTimeOut = 300              timeout in syn-wait state
tcpFinTimeOut = 300              timeout in fin-wait state
tcpIdleTimeOut = 86400           timeout in established state
agingPollInterval = 300          period for udp/icmp timeout
ingressStartFid = 0              first filter ID reserved for nat in ingress DB
ingressEndFid = 8191             last filter ID reserved for nat in ingress DB (range must be a multiple of 32)
egressStartFid = 0               first filter ID reserved for nat in egress DB
egressEndFid = 8191              last filter ID reserved for nat in egress DB (currently range must be same as ingress)

[ Interface ]                    defined for each interface
# Link name drn05
ifn = 0                          interface number
IPAddress = 0x80fc99d1           interface's IP address
udpStartPort = 30000             first udp port reserved for nat
udpEndPort = 30499               last udp port reserved for nat
tcpStartPort = 30000             first tcp port reserved for nat
tcpEndPort = 30499               last tcp port reserved for nat
icmpStartID = 0                  first icmp ID reserved for nat
icmpEndID = 65535                last icmp ID reserved for nat

[ Board ]                        defined for each board
# cp1, Slot 0
type = cp                        CP or GPE (not currently used)
MACAddress = 00:1E:C9:FE:76:23   board's MAC address
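Two of the values above lend themselves to a quick sanity check: the filter ID ranges, which must be a multiple of 32, and the hexadecimal `IPAddress`. The helpers below are a hypothetical sketch (`rangeCount`, `isValidFidRange`, and `dottedQuad` are not natd functions), and they assume "range must be a multiple of 32" refers to the number of IDs in the inclusive start-to-end range.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Number of IDs in the inclusive range [start, end].
std::uint64_t rangeCount(std::uint32_t start, std::uint32_t end) {
    assert(start <= end);
    return static_cast<std::uint64_t>(end) - start + 1;
}

// True when [startFid, endFid] is non-empty and holds a multiple of
// 32 filter IDs (assumed reading of the slide's constraint).
bool isValidFidRange(std::uint32_t startFid, std::uint32_t endFid) {
    return startFid <= endFid && rangeCount(startFid, endFid) % 32 == 0;
}

// Formats a 32-bit IPv4 address, most significant byte first,
// as dotted decimal.
std::string dottedQuad(std::uint32_t addr) {
    return std::to_string((addr >> 24) & 0xffu) + "." +
           std::to_string((addr >> 16) & 0xffu) + "." +
           std::to_string((addr >> 8) & 0xffu) + "." +
           std::to_string(addr & 0xffu);
}
```

For the example values, the filter range 0 to 8191 holds 8192 IDs (256 groups of 32), the port range 30000 to 30499 holds 500 ports, and 0x80fc99d1 corresponds to 128.252.153.209.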