Solaris IPoIB (ibd) Implementation
Kanoj Sarcar, Sr Staff Engineer, Sun Microsystems
(presented by Bill Strahm)

Architecture Overview
● DLPI driver
  – Only supported for GID0
● Interop (TopSpin, openib)
● Software structure (layers, top to bottom):
  – IP
  – SDP | uDAPL | ibd | ...
  – IBTF / IBMF
  – HCA driver
● Performance

Tools changes (20 byte MAC)
● New SIOC*XARP ioctls and “struct xarpreq”
  – arp(1M) changes
  – PPPD proxyarp etc
● Undocumented helper functions, à la ether_ntoa and ether_aton
  – extern char *_link_ntoa(const unsigned char *, char *, int, int);
  – extern unsigned char *_link_aton(const char *, int *);
● ifconfig(1M), snoop(1M)
● RARP
  – /etc/ethers, NIS assume 6 bytes!

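To illustrate what a length-agnostic counterpart of ether_ntoa has to do, here is a minimal sketch in C. The name fmt_link_addr and its signature are hypothetical; this is not the undocumented _link_ntoa listed above, whose argument semantics the slide does not spell out. It only assumes the colon-separated, no-leading-zero hex style that ifconfig(1M) prints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not _link_ntoa): format `len` address octets as
 * colon-separated hex without leading zeros. Returns buf on success,
 * or NULL if buf is too small to hold the whole string.
 */
char *
fmt_link_addr(const unsigned char *addr, size_t len, char *buf, size_t buflen)
{
	size_t off = 0;
	size_t i;

	assert(addr != NULL && buf != NULL);
	if (buflen == 0)
		return (NULL);
	buf[0] = '\0';
	for (i = 0; i < len; i++) {
		/* Prefix every octet except the first with ':' */
		int n = snprintf(buf + off, buflen - off, "%s%x",
		    i == 0 ? "" : ":", (unsigned int)addr[i]);
		if (n < 0 || (size_t)n >= buflen - off)
			return (NULL);	/* would have been truncated */
		off += (size_t)n;
	}
	return (buf);
}
```

Because the length is a parameter, the same routine covers a 6 byte Ethernet address and a 20 byte IPoIB address; fixed 6 byte buffers are exactly what breaks in the tools listed on this slide.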
Tools changes (IPoIB specific)
● DHCP agent
  – Usage of BROADCAST bit
  – Setting IB hardware type in DHCP packet in htype
  – Client id format
      Code | Len | Type |
       61  | 21  |  00  | 00 (4 octets) | 16 octet GID
● DHCP Server
  – No explicit ARP cache setting with DHCP chaddr
● Snoop(1M) parses IPoIB, disallows MAC filters

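The client id layout above can be sketched as a small option builder. This is an illustration only: build_ipoib_client_id and the macro names are hypothetical, and the code encodes nothing beyond the fields shown on the slide (code 61, length 21, type 00, four zero octets, then the 16 octet GID).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	DHCP_OPT_CLIENT_ID	61	/* option code from the slide */
#define	IPOIB_GID_LEN		16
#define	IPOIB_CID_LEN		21	/* 1 type + 4 zero octets + 16 GID */

/*
 * Hypothetical helper: write the client id option into `out`.
 * Returns the number of bytes written (code + len + payload),
 * or 0 if `out` is too small.
 */
size_t
build_ipoib_client_id(const uint8_t *gid, uint8_t *out, size_t outlen)
{
	size_t need = 2 + IPOIB_CID_LEN;

	assert(gid != NULL && out != NULL);
	if (outlen < need)
		return (0);
	out[0] = DHCP_OPT_CLIENT_ID;		/* Code */
	out[1] = IPOIB_CID_LEN;			/* Len */
	out[2] = 0x00;				/* Type */
	(void) memset(out + 3, 0, 4);		/* 00 (4 octets) */
	(void) memcpy(out + 7, gid, IPOIB_GID_LEN);	/* 16 octet GID */
	return (need);
}
```

The length field works out as 1 + 4 + 16 = 21, which matches the value on the slide.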
Tools changes (IPoIB specific) Part II
● ifconfig(1M)
    hme0: flags= mtu 1500 index 2
          inet netmask ff broadcast
          ether 0:3:ba:24:4:df
    ibd1: flags= mtu 2044 index 4
          inet netmask ffffff00 broadcast
          ipib 0:0:4:a:0:0:0:0:0:0:12:34:0:2:c9:1:9:76:57:11
● SNMP (net-snmp)
  – ifSpeed consistent with draft-ietf-ipoib-ibif-mib-07.txt
  – ifPhysAddress as MAC address (to match ARP)
  – octet/packet counts only for ipoib network interface

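The 20 byte "ipib" address in the ifconfig output can be read as a QPN plus a GID. The sketch below assumes the link-layer address layout from RFC 4391 (one reserved/flags octet, a 24 bit QPN, then the 16 octet port GID); the slide itself does not state the layout, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define	IPOIB_ADDRL	20

/* Hypothetical decoded form of a 20 byte IPoIB link-layer address */
typedef struct ipoib_addr {
	uint8_t		flags;		/* first octet (reserved/flags) */
	uint32_t	qpn;		/* 24 bit queue pair number */
	uint8_t		gid[16];	/* port GID */
} ipoib_addr_t;

/* Split a raw address into its parts (layout assumed per RFC 4391). */
void
ipoib_addr_split(const uint8_t *mac, ipoib_addr_t *out)
{
	assert(mac != NULL && out != NULL);
	out->flags = mac[0];
	out->qpn = ((uint32_t)mac[1] << 16) | ((uint32_t)mac[2] << 8) |
	    (uint32_t)mac[3];
	(void) memcpy(out->gid, mac + 4, sizeof (out->gid));
}
```

Under that assumption, the ibd1 address shown above decodes to QPN 0x00040a, with the remaining 16 octets as the GID.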
IPoIB oddities
● DLPI receiver indicates “src addr unavailable” for unicast packets
  – If transmitter guarantees GID0 usage, SM assisted reverse lookup possible
● RARP unsupported
  – Non-persistent MAC
  – Server requires client's MAC address for 3rd party lookup

IPoIB Configuration
● One time discovery
  – Reconfiguration boot or IO discovery command
  – Create possible ipoib instance per port/Pkey
  – SM must be up
● Interface initialization
  – IP “plumb” triggers MCG probe as validity check
  – SM must be up, fabric admin must create IP MCG
● Decisions
  – End nodes do not create IP MCG
  – End nodes do not have to know IB parameters
● Issues
  – IP multicast group membership requests
  – Can not wait, can not spool!
  – Solaris IP stack joins at interface init

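The slides do not show what the "IP MCG" looks like on the wire. As a hedged illustration, the sketch below builds the IPv4 broadcast group GID in the format defined by RFC 4391 (0xff, flags/scope, the 0x401b IPoIB signature, the P_Key, zero padding, then 32 one bits). That this is the group the plumb-time probe looks up is an assumption, and ipoib_bcast_mgid is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define	IB_GID_LEN	16

/*
 * Hypothetical helper: fill `mgid` with the IPv4 broadcast group GID
 * for a partition, following the RFC 4391 layout
 * ff1<scope>:401b:<P_Key>:0000:0000:0000:ffff:ffff.
 */
void
ipoib_bcast_mgid(uint16_t pkey, uint8_t scope, uint8_t *mgid)
{
	assert(mgid != NULL && scope <= 0xf);
	(void) memset(mgid, 0, IB_GID_LEN);
	mgid[0] = 0xff;				/* multicast GID prefix */
	mgid[1] = (uint8_t)(0x10 | scope);	/* flags = 1, then scope */
	mgid[2] = 0x40;				/* IPv4 IPoIB signature, */
	mgid[3] = 0x1b;				/* 0x401b */
	pkey |= 0x8000;				/* full membership bit set */
	mgid[4] = (uint8_t)(pkey >> 8);
	mgid[5] = (uint8_t)(pkey & 0xff);
	(void) memset(mgid + 12, 0xff, 4);	/* all ones in the low 32 bits */
}
```

With the default partition key 0xffff and link-local scope 2 this yields ff12:401b:ffff:0000:0000:0000:ffff:ffff. Since the GID is derived from the P_Key alone, an end node can probe for the group without knowing any other IB parameters, which fits the "Decisions" bullets above.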
Interesting Driver Details
● QPN/CQ/WQE/PathRecord
● Tx path copy vs registration
● GID/QPN to Pathrecord (hash/cache)
● Disallow (DLPI) change MAC address
● Async thread for (blocking) SM communication
● IB link up/down handling
  – ULP notification for IPMP failover
● MCG create/delete trap handling
● 2 MCG membership lists (full/sendonly, non)
● MCG pathrecord and membership implication
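The "GID/QPN to Pathrecord (hash/cache)" item can be pictured as a small hash table keyed by the 20 byte destination address. The sketch below is not the ibd data structure: all names are hypothetical, the hash is plain FNV-1a chosen for brevity, the cached value is an opaque pointer standing in for a resolved path, and the locking a real driver needs is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define	IPOIB_ADDRL	20
#define	PC_NBUCKETS	64	/* power of two; arbitrary size */

typedef struct path_ent {
	struct path_ent	*next;
	uint8_t		key[IPOIB_ADDRL];	/* QPN + GID address */
	void		*path;			/* stand-in for path info */
} path_ent_t;

typedef struct path_cache {
	path_ent_t	*bucket[PC_NBUCKETS];
} path_cache_t;

/* FNV-1a over the 20 address bytes, reduced to a bucket index */
static uint32_t
pc_hash(const uint8_t *key)
{
	uint32_t h = 2166136261u;
	int i;

	for (i = 0; i < IPOIB_ADDRL; i++) {
		h ^= key[i];
		h *= 16777619u;
	}
	return (h & (PC_NBUCKETS - 1));
}

/* Return the cached path for `key`, or NULL on a miss. */
void *
pc_lookup(const path_cache_t *pc, const uint8_t *key)
{
	const path_ent_t *e;

	assert(pc != NULL && key != NULL);
	for (e = pc->bucket[pc_hash(key)]; e != NULL; e = e->next) {
		if (memcmp(e->key, key, IPOIB_ADDRL) == 0)
			return (e->path);
	}
	return (NULL);
}

/* Insert or update an entry. Returns 0 on success, -1 if out of memory. */
int
pc_insert(path_cache_t *pc, const uint8_t *key, void *path)
{
	uint32_t b;
	path_ent_t *e;

	assert(pc != NULL && key != NULL && path != NULL);
	b = pc_hash(key);
	for (e = pc->bucket[b]; e != NULL; e = e->next) {
		if (memcmp(e->key, key, IPOIB_ADDRL) == 0) {
			e->path = path;		/* refresh existing entry */
			return (0);
		}
	}
	if ((e = malloc(sizeof (*e))) == NULL)
		return (-1);
	(void) memcpy(e->key, key, IPOIB_ADDRL);
	e->path = path;
	e->next = pc->bucket[b];
	pc->bucket[b] = e;
	return (0);
}
```

In this picture, a miss in pc_lookup on the transmit path is where the blocking SM query would be handed to the async thread mentioned above, with pc_insert called once the path is resolved.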