Supporting Real-Time Applications in an Integrated Services Packet Network (Clark, Shenker, Zhang (CSZ), SIGCOMM '92)
Problem
How to support real-time applications over packet-switched networks? These applications require a bound on per-packet delay.
–Packet networks do not actively manage their resources, i.e., there is no active scheduling
–Hence they cannot limit the delay
–To support real-time applications, the network architecture must be extended
Key Components of the New Architecture
–Nature of the network's commitment: guaranteed or predicted
–Service interface: the parameters passed between the source and the network
–Packet scheduling
–Admission control: the criteria by which the network decides to accept or deny a request
Properties of RT Traffic
Playback apps (most of today's real-time apps)
–Each packet should arrive before its playback time
–The network introduces jitter => buffering at the receiver
Other characteristics of RT apps
–They have some inherent packet generation process that lasts much longer than the e2e delay (e.g., a voice conversation runs for minutes while the e2e delay is milliseconds)
–Packet/traffic generation can be modeled by a filter
–Such a model can be used by the network for resource allocation
Service Requirements of RT Apps
–Interaction => the app is sensitive to delivery delay; it needs (absolute or statistical) information about per-packet delay to set its playback point
–Buffering => the exact packet arrival time does not matter => PACKETS CAN TRADE DELAY IN THE NETWORK
–Some losses can be tolerated
Delay
Delay = propagation (fixed) + queuing (variable)
–Queuing delay => jitter => must be bounded
–Queuing delay is the price of statistical multiplexing, which in turn is the price of resource sharing
–Resource sharing is the key benefit of packet-switched networks
–Bursty transmission can be accommodated by statistical multiplexing, but the degree of burstiness must be bounded
–The behavior of the aggregate traffic matters
Idea: benefit from sharing, but provide protection against its potentially negative effects.
Dealing with Delay
To predict performance, an app needs to know:
–How to set the playback point?
–What fraction of packets arrive late?
The network service is thus defined as a bound on delay plus how often that bound is missed.
To set its playback point, an application can use:
–the a priori advertised delay (rigid apps)
–the measured/observed delay (adaptive apps)
Adaptive apps get an earlier playback point (better performance) but may experience higher losses (a tradeoff!). There is also a limit to adaptability, e.g., the tolerable delay in interactive voice.
Two dominant classes: rigid-intolerant and adaptive-tolerant.
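A minimal sketch of how an adaptive application might set its playback point from measured delays, as described above. The sliding window of 1000 samples and the 99th-percentile target are illustrative assumptions, not values from the paper:

```python
from collections import deque

class AdaptivePlayback:
    """Sets the playback point from recently observed per-packet delays.

    A packet sent at time s is played at s + playback_offset(); packets
    arriving after that point are late (effectively lost). The window
    size and percentile are illustrative choices.
    """

    def __init__(self, window=1000, percentile=0.99):
        self.delays = deque(maxlen=window)   # recent one-way delay samples
        self.percentile = percentile

    def observe(self, send_time, arrival_time):
        self.delays.append(arrival_time - send_time)

    def playback_offset(self):
        # Choose an offset large enough that ~percentile of the recent
        # packets would have arrived in time.
        if not self.delays:
            return 0.0
        ordered = sorted(self.delays)
        idx = min(len(ordered) - 1, int(self.percentile * len(ordered)))
        return ordered[idx]
```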
Service Commitment
Basic idea
–The network must know the characteristics of the traffic in order to manage its resources
–The client must meet its traffic commitment
–Question: what other factors affect the service provided?
1) Guaranteed service
–No other assumption/factor affects performance, not even the behavior of other flows/clients => isolation
–Appropriate for intolerant-rigid applications
–Must be based on the worst-case delay (delay bound)
Service Commitment (cont.)
2) Predictive service
–Assumes that near-future behavior will be similar to the recent past
–Adaptive apps already make this assumption to adjust their playback point => the network can offer this service
–Two components:
–If the past predicts the future, the network will meet its commitment (in contrast to provisioning for worst-case behavior)
–The network attempts to minimize delay => apps can minimize their playback point
–Only an implicit assumption about other clients: overall network conditions do not change; there is no explicit assumption about the behavior of individual clients
3) Best-effort service: no commitment from the network
Scheduling for Guaranteed Traffic
Token bucket for traffic characterization
–Parameters: rate r and depth b
–A token bucket fills at rate r up to a maximum of b tokens; a packet of size p removes p tokens from the bucket
–Given a packet generation process => derive r and b(r) (the minimum bucket depth required at rate r)
Many scheduling algorithms that follow this principle exist, e.g., WFQ and Virtual Clock.
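A sketch of the (r, b) token bucket filter just described. The byte-granularity token accounting and wall-clock refill are implementation choices of this sketch, not prescribed by the paper:

```python
import time

class TokenBucket:
    """Token bucket (rate r, depth b) traffic filter.

    Tokens accumulate at r per second up to depth b; a packet of
    `size` conforms only if that many tokens are available.
    """

    def __init__(self, r, b):
        self.r = r              # fill rate (tokens/sec)
        self.b = b              # bucket depth (max tokens)
        self.tokens = b         # start full
        self.last = time.monotonic()

    def conforms(self, size):
        now = time.monotonic()
        # Refill tokens for the elapsed interval, capped at depth b.
        self.tokens = min(self.b, self.tokens + self.r * (now - self.last))
        self.last = now
        if self.tokens >= size:
            self.tokens -= size
            return True         # packet is within the (r, b) envelope
        return False            # packet violates the characterization
```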
Scheduling for Guaranteed Service
Assign a clock rate to each flow, i.e., its relative share of the link bandwidth.
WFQ: it was shown that, for any topology, if
–the clock rate of each flow is the same on all switches, and
–the sum of all clock rates on each switch is less than the link bandwidth,
then the queuing delay is bounded by the maximum delay a burst incurs in the token bucket:
–If the traffic source passes through an (r, b) leaky bucket, no further queuing delay is added inside the network; all queuing delay occurs in the leaky bucket
–The delay bound of each flow is independent of the characteristics of other flows
–The network itself provides isolation among flows under overload; it does not depend on source filters
Drawback: any rate r that gives a good delay bound b(r)/r is much higher than the average rate => network utilization is low (< 50%).
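To make the clock-rate idea concrete, here is a simplified fair-queueing sketch that orders packets by per-flow virtual finish times: each packet advances its flow's clock by size/weight, and the smallest finish time is served first. Real WFQ also tracks a system-wide virtual time to stay fair when flows go idle; that is omitted here for brevity:

```python
import heapq

class WFQScheduler:
    """Simplified weighted fair queueing via virtual finish times."""

    def __init__(self):
        self.finish = {}     # flow id -> last virtual finish time
        self.queue = []      # heap of (finish_time, seq, flow, size)
        self.seq = 0         # tie-breaker for equal finish times

    def enqueue(self, flow, size, weight):
        # The flow's virtual clock advances by size / weight, so a
        # flow with twice the weight (clock rate) drains twice as fast.
        f = self.finish.get(flow, 0.0) + size / weight
        self.finish[flow] = f
        heapq.heappush(self.queue, (f, self.seq, flow, size))
        self.seq += 1

    def dequeue(self):
        # Transmit the packet with the earliest virtual finish time.
        if not self.queue:
            return None
        _, _, flow, size = heapq.heappop(self.queue)
        return flow, size
```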
Scheduling for Predictive Service
WFQ provides maximal isolation at the price of low link utilization. If each flow's average rate conforms to its leaky bucket and the network is not over-committed, the median delay is low, but bursts can still cause jitter.
Adaptive applications allow some room to delay the delivery of a packet, within a deadline:
–Deadline-driven scheduling exploits this directly
–FIFO scheduling achieves some of the same goal
Why is the delay a flow suffers from its own bursts lower under FIFO? FIFO evenly distributes the total delay among all packets, while WFQ does not => the post-facto jitter is smaller.
This extends to a priority-based version of FIFO => jitter shifting between classes.
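The following toy comparison illustrates the point: unit-size packets, one smooth flow and one bursty flow, FIFO versus a GPS-style idealization of WFQ in which each flow owns an equal share of the link. The traffic pattern and numbers are purely illustrative:

```python
def fifo_delays(packets, rate=1.0):
    """Single shared queue; unit-size packets served at `rate`."""
    delays, free_at = [], 0.0
    for arr, flow in sorted(packets):
        done = max(arr, free_at) + 1.0 / rate
        free_at = done
        delays.append((flow, done - arr))
    return delays

def isolated_delays(packets, rate=1.0, nflows=2):
    """GPS-style idealization of WFQ: each flow owns rate/nflows."""
    free_at, delays = {}, []
    for arr, flow in sorted(packets):
        done = max(arr, free_at.get(flow, 0.0)) + nflows / rate
        free_at[flow] = done
        delays.append((flow, done - arr))
    return delays

# Flow 0 is smooth (one packet every 2s); flow 1 bursts 8 packets at t~0.
pkts = [(t, 0) for t in range(0, 20, 2)] + [(0.001, 1)] * 8
for name, fn in [("FIFO", fifo_delays), ("WFQ-like", isolated_delays)]:
    d = fn(pkts)
    burst = [x for f, x in d if f == 1]
    smooth = [x for f, x in d if f == 0]
    print(name, "burst max delay %.1f, smooth max delay %.1f"
          % (max(burst), max(smooth)))
```

Under FIFO the burst's delay is spread across both flows, so the bursty flow's own maximum delay drops; the WFQ-like discipline protects the smooth flow and makes the bursty flow absorb all of its own delay.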
Multi-hop Sharing
The issue with FIFO scheduling is that jitter grows with uncorrelated queuing delays at multiple hops; more hops => more opportunities for sharing.
How to correlate the sharing experience across multiple hops? => FIFO+
FIFO+: use FIFO-style sharing at all hops
–At each hop, compute diff(i) = avg_delay – observed_delay_for_the_pkt
–Carry the cumulative diff(i) in the packet header
–diff(i) shows the service a packet has observed relative to the average service in its class
–Use diff(i) to place the packet in the queue, adjusting the service the packet observes
=> queue management is no longer trivial!
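A sketch of a FIFO+ queue under the scheme above: each hop estimates its class's average delay with a running mean and orders packets by arrival time plus the cumulative diff carried in the header. The field names (`offset`, `arrival`) and the running-mean estimator are assumptions of this sketch:

```python
import heapq

class FifoPlusQueue:
    """FIFO+ sketch: packets that have so far been served faster than
    their class average (positive cumulative diff) are queued as if
    they arrived later, and vice versa."""

    def __init__(self):
        self.heap = []
        self.seq = 0
        self.avg_delay = 0.0     # running mean delay of this class
        self.samples = 0

    def enqueue(self, pkt, now):
        pkt["arrival"] = now
        # Order by "logical" arrival: actual arrival plus how far
        # ahead of average service this packet already is.
        key = now + pkt.get("offset", 0.0)
        heapq.heappush(self.heap, (key, self.seq, pkt))
        self.seq += 1

    def dequeue(self, now):
        if not self.heap:
            return None
        _, _, pkt = heapq.heappop(self.heap)
        delay = now - pkt["arrival"]
        self.samples += 1
        self.avg_delay += (delay - self.avg_delay) / self.samples
        # Accumulate diff = avg - observed into the header for
        # downstream hops (served faster than average => positive).
        pkt["offset"] = pkt.get("offset", 0.0) + (self.avg_delay - delay)
        return pkt
```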
Unified Scheduling
Putting the different scheduling schemes together: guaranteed + predictive + best-effort
–Isolate guaranteed flows from each other and from predictive service => use WFQ
–Each guaranteed client is a WFQ flow with rate r
–All predictive flows + datagram traffic are assigned to a single pseudo WFQ flow (flow 0)
–Within flow 0 there are a number of priority classes; within each priority class, use FIFO+
–Higher priorities can "steal" bandwidth from lower priorities => datagram traffic suffers the accumulated jitter
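A composition sketch of the unified scheme, reusing the WFQScheduler and FifoPlusQueue sketches above: guaranteed clients are ordinary WFQ flows, while flow 0 holds strict-priority levels of FIFO+ queues with best-effort datagrams at the lowest level. The dispatch details here are illustrative, not the paper's implementation:

```python
class UnifiedScheduler:
    """Guaranteed + predictive + best-effort under one link scheduler."""

    FLOW0 = 0   # pseudo-flow holding all predictive and datagram traffic

    def __init__(self, num_priorities):
        self.wfq = WFQScheduler()
        # index 0 = highest priority; last level = best-effort datagrams
        self.classes = [FifoPlusQueue() for _ in range(num_priorities)]

    def enqueue_guaranteed(self, flow, size, rate):
        # Each guaranteed client is a WFQ flow with clock rate r.
        self.wfq.enqueue(flow, size, weight=rate)

    def enqueue_flow0(self, pkt, priority, now, size=1, weight=1.0):
        self.classes[priority].enqueue(pkt, now)
        # Flow 0 competes in WFQ with the aggregate predictive weight.
        self.wfq.enqueue(self.FLOW0, size, weight)

    def dequeue(self, now):
        item = self.wfq.dequeue()
        if item is None:
            return None
        flow, size = item
        if flow != self.FLOW0:
            return flow, size
        # Strict priority across flow 0's classes: higher classes
        # "steal" bandwidth from lower ones (datagrams drain last).
        for q in self.classes:
            pkt = q.dequeue(now)
            if pkt is not None:
                return self.FLOW0, pkt
        return None
```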
Service Interface
Guaranteed service: specify only the rate r
–If the resulting worst-case delay is too high, request a higher rate
–The network does not need to check for conformance (WFQ already isolates misbehaving flows)
Predicted service: specify both traffic and service
–Traffic: the filter's rate and bucket size => used by the network for resource management
–Service: a delay and loss characterization
–The network must check for conformance at the edge
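Hypothetical encodings of the two service interfaces, just to make the parameter lists concrete; the field names and units are illustrative, not a wire format from the paper:

```python
from dataclasses import dataclass

@dataclass
class GuaranteedRequest:
    rate: float                 # r, the only parameter (e.g., bytes/sec)

@dataclass
class PredictedRequest:
    filter_rate: float          # token bucket rate r
    filter_depth: float         # token bucket depth b
    target_delay: float         # requested delay bound (seconds)
    loss_rate: float            # acceptable fraction of late packets
```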
Admission Control
Problem: how should the network decide whether to admit or reject a new flow?
–No specific strategy is proposed, only requirements
Two criteria:
1) Leave 10% of the bandwidth for datagram traffic (an admittedly ad hoc value) => datagram traffic can still get through, and some oscillation in load is absorbed
2) Admitting the new flow must not increase the predicted delay beyond the advertised bound
=> measurement-based admission control
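A sketch of an admission test implementing the two criteria. The M/M/1-style delay-growth factor is a crude illustrative stand-in for a real measurement-based predictor; the paper itself prescribes no formula:

```python
def admit(new_rate, reserved_rate, link_bw,
          measured_delays, delay_bounds, datagram_share=0.10):
    """measured_delays / delay_bounds: per-class measured delay and
    advertised delay bound (seconds), keyed by class id."""
    # Criterion 1: always keep `datagram_share` of the link free
    # for datagram (best-effort) traffic.
    if reserved_rate + new_rate > (1.0 - datagram_share) * link_bw:
        return False
    # Criterion 2: estimate how current delays would grow and reject
    # if any class would exceed its advertised bound.
    old_util = reserved_rate / link_bw
    new_util = (reserved_rate + new_rate) / link_bw
    scale = (1.0 - old_util) / (1.0 - new_util)   # crude growth factor
    return all(measured_delays[c] * scale <= delay_bounds[c]
               for c in measured_delays)
```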