Telcordia - June 21, Internet data-plane signaling - revisiting RSVP Henning Schulzrinne Dept. of Computer Science Columbia University (with Robert Hancock, Hannes Tschofenig, S. van den Bosch, G. Karagiannis, A. McDonald, X. Fu and others)
Overview
- Signaling: application vs. data plane
- Resource control
  - DiffServ vs. IntServ
- What’s wrong with RSVP?
- Components of a general solution
- NSIS = NTLP (GIMPS) + {NSLP} + route change detection
Signaling – the big picture
[Figure: session signaling vs. datapath signaling across AS#1 and AS#2; labels: off-path NE, SIP proxy server, off-path signaling, on-path signaling, data]
Need for data plane state establishment
- Differentiated treatment of packets
  - QoS
  - firewall (loss = 100% vs. loss = 0%)
- Mapping state
  - network address translation (NAT)
- Counting packets
  - accounting
- Other state establishment
  - setting up active network capsules
  - MPLS paths
  - pseudo-wire emulation (PWE) – T1 over IP
- Related: visit subset of data path nodes, but don’t leave state behind
  - diagnostics
  - better traceroute: link speeds, load, loss, packet treatment, …
On-path vs. off-path signaling
- On-path (path-coupled): visit subset of routers on data path
- Off-path (path-decoupled): anything else, but presumably roughly along data path
  - one proposal: one “touch point” for each AS (bandwidth broker)
  - difficult part is resource tracking, not signaling
- No fundamental differences in protocol
  - separate out next-hop discovery to allow re-use
Differentiated packet handling
- Not just QoS, but also
  - firewall
  - network address translation
  - accounting and measurement
[Figure labels: filter management; traffic filtering; traffic shaping, handling & measurement; IntServ; DiffServ]
DiffServ vs. IntServ
- Filter always uses packet characteristic
  - 5-tuple (protocol, source/destination address + port) + global label (TOS)
- multiple “flows” can be mapped to one treatment mechanism

                 DiffServ   IntServ    in-band
identification   TOS        5-tuple?   5-tuple
mapping          fixed      signaled   TCP SYN
The scaling bogeyman
[Callout: “It doesn’t scale!”]
- Networks routinely handle large-scale per-flow state
  - firewalls
  - NATs
- scaling = cost per flow is constant (or decreasing)
- flow numbers are modest:
  - OC-48 can handle 31,875 DS-0 voice calls
  - mean call duration = 9 min, so about 60 requests/second
  - probably about 3 MB of data
- partially explained by poor initial RSVP explanations where flow search time ~ O(N) rather than O(1)
- likely limitations are in AAA, not router signaling
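The flow numbers on this slide can be checked with a few lines of arithmetic. The sketch below assumes steady-state call arrivals (Little's law) and roughly 100 bytes of state per flow; the per-flow size is an assumption chosen to match the slide's "about 3 MB", not a figure the slide gives.

```python
# Back-of-the-envelope check of the OC-48 signaling load.
CALLS = 31_875                # concurrent DS-0 voice calls on an OC-48 (from the slide)
MEAN_CALL_SECONDS = 9 * 60    # mean call duration of 9 min (from the slide)
BYTES_PER_FLOW = 100          # ASSUMPTION: per-flow state size, not stated on the slide

# Little's law: arrival rate = concurrent calls / mean holding time.
setups_per_second = CALLS / MEAN_CALL_SECONDS

# Total per-flow state if every call holds one state record.
state_megabytes = CALLS * BYTES_PER_FLOW / 1e6

print(f"{setups_per_second:.1f} call setups per second")
print(f"{state_megabytes:.2f} MB of per-flow state")
```

Both results land close to the slide's rounded figures of 60 requests/second and 3 MB.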
RSVP characteristics
- soft-state = state vanishes if not refreshed
- two-pass signaling = path discovery + reservation
- receiver-based resource reservation
- separation of QoS signaling from routing
  - with some router feedback
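The soft-state idea (state vanishes if not refreshed) can be illustrated with a small table keyed by flow. This is a minimal sketch, not RSVP's actual data structure: the class name `SoftStateTable`, the injectable clock, and the single fixed lifetime are all assumptions made for illustration.

```python
import time


class SoftStateTable:
    """Per-flow state that expires unless refreshed (hypothetical helper)."""

    def __init__(self, lifetime_s, clock=time.monotonic):
        self.lifetime_s = lifetime_s
        self.clock = clock        # injectable so the behaviour can be tested
        self._expiry = {}         # flow id -> absolute expiry time

    def refresh(self, flow_id):
        # Installing and refreshing state are the same operation.
        self._expiry[flow_id] = self.clock() + self.lifetime_s

    def remove(self, flow_id):
        # Explicit teardown is optional; the time-out is the safety net.
        self._expiry.pop(flow_id, None)

    def expire(self):
        # Drop every entry whose refresh did not arrive in time.
        now = self.clock()
        stale = [f for f, t in self._expiry.items() if t <= now]
        for f in stale:
            del self._expiry[f]
        return stale

    def __contains__(self, flow_id):
        return flow_id in self._expiry
```

A node would call `refresh` whenever a signaling message for the flow arrives and run `expire` periodically; a flow whose sender disappears is cleaned up without any explicit teardown message.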
The problem with RSVP
- Designed for QoS establishment, used mostly for other things (RSVP-TE)
- Designed for large-scale IP multicast
  - customer never materialized
  - adds significant complexity: receiver-based PATH + RESV
  - designed for ASM (any-source) rather than SSM (source-specific)
  - receiver-based motivated by receiver diversity – not very useful in practice
- Designed in simpler days (1997):
  - does not work well with mobile nodes (IP mobility or changing IP addresses)
  - no support for NATs
  - security mostly bolted on – non-standard mechanisms
  - single-purpose, with no clear extensibility model
  - very primitive transport mechanism
    - either refresh or exponential decay (refresh reduction, RFC 2961)
The cost of multicast for RSVP
- reservation styles
  - multiple senders in same group: shared vs. distinct
  - sender selection: explicit vs. wildcard
- receiver-oriented
  - motivated by heterogeneous receivers
  - can do leaf-initiated join rather than root-initiated
  - but still need periodic PATH to visit new sub-tree
- three different flow specs
  - Sender_TSpec, ADSpec, (TSpec, RSpec)
  - fairly tightly woven into core protocol
- state merging and management
  - killer reservation (KR-II)
  - generally, error handling problematic
  - draft-fu-rsvp-multicast-analysis
[Callout: ResvErr!]
IETF NSIS working group
- chartered in Dec. 2001, after BOF in March 2001
- Motivated by Braden’s two-layer model (draft-lindell-waypoint, draft-braden-2level-signal-arch)
- Active participation from Roke Manor, Siemens, NEC Europe, Nokia, Samsung, Columbia
- Based partially on CASP protocol designed by Columbia/Siemens group and prototyped at UKy
NSIS protocol structure
- client layer does the real work:
  - reserve resources
  - open firewall ports
  - …
- messaging layer:
  - establishes and tears down state
  - negotiates features and capabilities
- transport layer:
  - reliable transport
[Figure: protocol stack; labels: NSLP (C); NTLP (GIMPS); transport layer; QoS, NAT/FW, …; UDP, TCP, SCTP; IP router alert; GIMPS]
NSIS properties
- Network friendly
  - congestion-controlled
  - re-use of state across applications
- application-neutral
  - add more applications later
- transport neutral
  - any reliable protocol
  - initially, TCP and SCTP
  - also, UDP for initial probing
- policy neutral
  - no particular AAA policy or protocol
  - interaction with COPS, DIAMETER needs work
- soft state
  - per-node time-out
  - explicit removal of state
- extensible
  - data format
  - negotiation
NSIS properties, cont'd.
- Topology hiding
  - not recommended, but possible
- Light weight
  - implementation complexity
  - security associations (re-use)
  - may not need kernel implementation
What is GIMPS?
- Generic signaling transport service
- establishes state along path of data
- one sender, typically one receiver
  - can be multiple receivers: multicast (not in initial version)
- can be used for QoS per-flow or per-class reservation
  - but not restricted to that
- avoid restricting users of protocol (and religious arguments):
  - sender vs. receiver orientation
  - more or less closely tied to data path
    - initially, router-by-router (path-coupled)
    - later, network (AS) path (path-decoupled)
NSIS network model – path-coupled
- NTLP nodes form NTLP chain
- not every node processes all client protocols:
  - non-NTLP node: regular router
  - omnivorous: processes all NTLP messages
  - selective: bypassed by NTLP messages with unknown client protocols
[Figure: NTLP chain; node labels: QoS, midcom, QoS, selective, omnivorous]
Network model – path-decoupled
- Also route network-by-network
- can combine router-by-router with out-of-path messaging
[Figure labels: AS 1249, AS15465, AS17, Bandwidth broker, NAC, NTLP, data]
GIMPS messages
- Regular NTLP messages
  - establish or tear down state
  - carry client protocol
  - datagram (“D”) or connection (“C”) mode
- Hop-by-hop reliability
- Generated by any node along the chain
NSIS transport protocol usage
- Most signaling messages are small and infrequent
- but: not all applications
  - e.g., mobile code for active networks
  - digital signatures
  - re-"dialing" when resources are busy
- Need:
  - reliability to avoid long setup delays
  - flow control: avoid overloading signaling server
  - congestion control: avoid overloading network
  - fragmentation of long signaling messages
  - in-sequence delivery: avoid race conditions
  - transport-layer security: integrity, privacy
- This defines standard reliable transport protocols:
  - TCP
  - SCTP
- Avoid re-inventing wheel
  - see SIP experience
GIMPS transport protocol usage
- One transport connection, many NSLP sessions
  - may use multiple TCP/SCTP ports
- can use TLS for transport-layer security
  - compared to IPsec, well-exercised key establishment
  - not quite clear what the principal is
- re-use of transport
  - no overhead of TCP and SCTP session establishment
  - avoid TLS session setup
  - better timer estimates
  - SCTP avoids HOL blocking
Message forwarding
- Route stateless or stateful:
  - stateless: record route and retrace
  - stateful: based on next-hop information in ‘C’ node
- Destination:
  address            look at destination address
  address + record   record route
  route              based on recorded route
  state forward      based on next-hop state
  state backward     based on previous-hop state
- State:
  no-op   leave state as is
  ADD     add message (and maybe client) state
  DEL     delete message state
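The destination and state flags on this slide amount to a per-node dispatch, which the following sketch tries to make concrete. It is an illustration only: the `Message` and `Node` classes, the `process` function, and the field names are hypothetical, the recorded route is assumed to be retraced from its end, and ADD is assumed to be sent only in the downstream direction.

```python
from dataclasses import dataclass, field
from enum import Enum


class Dest(Enum):
    ADDRESS = "address"
    ADDRESS_RECORD = "address + record"
    ROUTE = "route"
    STATE_FORWARD = "state forward"
    STATE_BACKWARD = "state backward"


class StateOp(Enum):
    NOOP = "no-op"
    ADD = "ADD"
    DEL = "DEL"


@dataclass
class Message:
    session: str
    dest: Dest
    state_op: StateOp
    dest_addr: str = ""
    route: list = field(default_factory=list)   # recorded route


@dataclass
class Node:
    addr: str
    next_hop: dict = field(default_factory=dict)   # session -> next-hop address
    prev_hop: dict = field(default_factory=dict)   # session -> previous-hop address


def process(node, msg, came_from, ip_next_hop):
    """Pick the forwarding target, then apply the state flag at this node."""
    if msg.dest in (Dest.ADDRESS, Dest.ADDRESS_RECORD):
        target = ip_next_hop                 # ordinary routing toward dest_addr
        if msg.dest is Dest.ADDRESS_RECORD:
            msg.route.append(node.addr)      # record route as we go
    elif msg.dest is Dest.ROUTE:
        # ASSUMPTION: retrace the recorded route from its end.
        target = msg.route.pop() if msg.route else None
    elif msg.dest is Dest.STATE_FORWARD:
        target = node.next_hop.get(msg.session)
    else:  # Dest.STATE_BACKWARD
        target = node.prev_hop.get(msg.session)

    if msg.state_op is StateOp.ADD:
        # ASSUMPTION: ADD travels downstream, so came_from is the previous hop.
        node.prev_hop[msg.session] = came_from
        node.next_hop[msg.session] = target
    elif msg.state_op is StateOp.DEL:
        node.prev_hop.pop(msg.session, None)
        node.next_hop.pop(msg.session, None)
    return target
```

A first message would typically use "address + record" with ADD to install state; later messages can then travel "state forward" or "state backward" without consulting the IP routing table.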
Message format
- No GIMPS distinction between requests and responses
  - just routed in different directions
  - client protocol may define requests and responses
- Common header defines:
  - destination flag
  - state flag
  - session identifier
  - traffic selector: identify traffic "covered" by this session
  - message sequence number
  - response sequence number
  - message cookie: avoid IP address impersonation
  - origin address: may not be data source or sink
  - destination address or scope
[Figure: message layout: common header | extensions | client protocol data]
Message format, cont'd
- Limit session lifetime
- Avoid loops: hop counter
- Mobility:
  - dead branch removal flag
  - branch identifier
- Record route: gathers up addresses of NSIS nodes visited
- Route: addresses that NSIS message should visit
Capability negotiation
- NSIS has named capabilities
  - including client protocols
- Three mechanisms:
  - discovery: count capabilities along a path
    - "10 out of 15 can do QoS"
  - record: record capabilities for each node
  - require: for scout message, only stop once node supports all capabilities (or-of-and)
- avoid protocol versioning
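The three mechanisms can be sketched over a path modeled as a list of (node name, capability set) pairs. This is an illustrative model rather than the wire protocol: the function names, the capability strings, and the reading of "or-of-and" as "any one of several all-required sets" are assumptions.

```python
def discover(path, capability):
    """Discovery: count how many nodes on the path support a capability."""
    return sum(capability in caps for _, caps in path)


def record(path):
    """Record: list the capabilities of every node visited."""
    return {name: sorted(caps) for name, caps in path}


def require(path, alternatives):
    """Require: return the first node that supports every capability in at
    least one of the alternative sets (or-of-and), or None if no node does."""
    for name, caps in path:
        if any(required <= caps for required in alternatives):
            return name
    return None


# Hypothetical path with made-up capability names.
path = [
    ("r1", {"qos"}),
    ("r2", {"natfw"}),
    ("r3", {"qos", "natfw"}),
]
```

With this path, `discover(path, "qos")` counts two nodes, and `require(path, [{"qos", "natfw"}])` skips r1 and r2 and stops at r3, the first node that supports both.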
Next-hop discovery
- Next-in-path service
  - enhanced routing protocols
    - distribute information about node capabilities in OSPF
  - routing protocol with probing
  - service discovery, e.g., SLP
  - first hop, e.g., router advertisements, DHCP
  - scout protocol
- Next AS service (not in current version):
  - touch down once per autonomous system (AS)
  - new DNS name space: ASN.as.arpa, e.g., 17.as.arpa
  - use new DNS NAPTR and SRV for lookup
  - similar to SIP approach
Next-hop discovery
- scout messages are special NSIS messages
  - limited to less than MTU size
  - addressed to session destination
  - UDP with router alert option
    - get looked at by each router
  - reflected when matching NSIS node found
[Figure: flowchart with decisions “next IP hop NSIS-aware?” and “existing transport connection?” (Y/N branches), actions “use D mode to find next NSIS hop” and “establish transport connection”, and end state “done”]
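One plausible reading of the flowchart on this slide is sketched below. The branch ordering is an assumption, since the extracted slide does not show which Y/N label belongs to which arrow, and `reach_next_hop`, its parameters, and the string used as a connection handle are hypothetical stand-ins.

```python
def reach_next_hop(next_ip_hop, nsis_aware, connections, scout):
    """Return (peer, connection) for the next NSIS hop.

    next_ip_hop -- address of the next IP hop toward the session destination
    nsis_aware  -- set of addresses already known to be NSIS nodes
    connections -- dict mapping peer address to an open transport connection
    scout       -- callable that sends a D-mode scout message and returns the
                   address of the NSIS node that reflected it
    """
    # Decision 1: is the next IP hop NSIS-aware? If not, probe in D mode.
    peer = next_ip_hop if next_ip_hop in nsis_aware else scout()

    # Decision 2: is there an existing transport connection? If not, set one up.
    if peer not in connections:
        connections[peer] = f"conn-to-{peer}"   # stand-in for TCP/SCTP setup

    return peer, connections[peer]
```

Once a connection to a peer exists, later sessions through the same peer reuse it, which matches the slide on transport re-use.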
Mobility and route changes
- avoids session identification by end point addresses
- avoid use of traffic selector as session identifier
- remove dead branch
- discovers new route on refresh
[Figure: route change with old and new branch; labels: ADD B=2, DEL (B=2), B=1]
QoS-NSLP: resource reservation
- NSLP for signaling QoS reservations in the Internet
- both sender- and receiver-initiated reservations
- soft-state peer-to-peer signaling and refresh (rather than end-to-end)
- bundled sessions (e.g., video + audio)
- agnostic about QoS models (IntServ, DiffServ, RMD, …)
QoS-NSLP node architecture
[Figure: node architecture; components: GIMPS, QoS-NSLP, resource management, policy control, packet scheduler, packet classifier, outgoing interface selection (forwarding), input packet processing, traffic control; labels: select GIMPS packets, API, forwarding table manipulation]
QoS-NSLP actors
[Figure: flow sender, QNI, QNE (…), QNR, flow receiver; labels: IP address = flow source address; IP address = flow destination; e.g., access router; data; GIMPS + QoS-NSLP; QoS unaware NSLP nodes not shown]
QoS-NSLP: sender-initiated reservation
[Figure: message sequence between QNI, QNE and QNR; messages: RESERVE, RESPONSE; labels: (RSN #3), (RSN #17), (RSN #4)]
QoS-NSLP: receiver-initiated reservation
[Figure: message sequence between QNI, QNE and QNR; messages: QUERY, RESERVE, RESPONSE]
QoS flow aggregation
[Figure labels: aggregate; QoS-NSLP style (RFC 3175); traffic sink (LAN); sinktree style (BGRP)]
The weight of NSIS
- NSIS state = transport state + GIMPS state + NSLP state
- GIMPS state = two sockets
- transport state = O(100) bytes
- 10,000 users consume 1 MB
Route change detection
- Don’t want to wait for periodic rediscovery – delay of 30s+
- Not all route changes matter
  - e.g., only changes between NSIS routers
- Data plane detection
  - TTL change of arriving data packets
  - propagation delay change for data packets
    - monitoring propagation delay (~ min(e2e delay))
  - increases in packet loss or jitter
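The TTL-based detection listed above can be sketched as a small per-flow detector. This is a hedged illustration, not a mechanism the slides specify: the class name `TtlChangeDetector` and the `confirm` threshold (consecutive packets with the new TTL before a change is reported) are assumptions, added so that a single reordered or stray packet does not trigger a false alarm.

```python
class TtlChangeDetector:
    """Flags a suspected route change when arriving TTLs shift persistently."""

    def __init__(self, confirm=3):
        self.confirm = confirm    # ASSUMPTION: packets needed to confirm a change
        self.baseline = None      # TTL seen on the current route
        self.candidate = None     # new TTL value under observation
        self.count = 0            # consecutive packets seen with the candidate TTL

    def observe(self, ttl):
        """Feed the TTL of one arriving data packet; True means 'route changed'."""
        if self.baseline is None:
            self.baseline = ttl
            return False
        if ttl == self.baseline:
            # Back to normal: discard any pending candidate.
            self.candidate, self.count = None, 0
            return False
        if ttl == self.candidate:
            self.count += 1
        else:
            self.candidate, self.count = ttl, 1
        if self.count >= self.confirm:
            # Persistent shift: adopt the new TTL as baseline and report it.
            self.baseline, self.candidate, self.count = ttl, None, 0
            return True
        return False
```

A change in hop count is only a hint: a route change that keeps the same number of hops is invisible to this check, which is why the slide also lists delay, loss, and jitter as signals.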
Route change measurements
- 12 measurement sites (looking glass)
- one traceroute every 15 min
  - 2.75 hours per pair
- availability: 99.8%
  - 0.1% repeated IP addresses
  - 4.4% single hop with multiple IP addresses
- 422 route changes observed after data cleanup (13,074 records)
- 67 out of 422 also showed AS changes
  - often, indicates multi-homing
Route changes
On-going and planned work
- Finish NTLP (GIMPS) and NSIS clients (NAT-FW and QoS)
- Longer term: off-path signaling (new WG?)
- New applications: diagnostics
- Mobility support
Conclusion
- NSIS = unified infrastructure for data-affiliated sessions
- avoid making assumptions except that sessions want to "visit" data nodes or networks
- not just mobility, but also mobility
- route change detection challenging
- protocol framework in place, but need to work out packet formats