Engineering for QoS and the limits of service differentiation
Jim Roberts, France Télécom R&D
IWQoS, June 2000
The central role of QoS
- quality of service: transparency, response time, accessibility
- service model: resource sharing, priorities, ...
- network engineering: provisioning, routing, ...
- a feasible technology and a viable business model
Engineering for QoS: a probabilistic point of view
- statistical characterization of traffic: notions of expected demand and of random processes for packets, bursts, flows, aggregates
- QoS in statistical terms: transparency: Pr[packet loss], mean delay, Pr[delay > x], ...; response time: E[response time], ...; accessibility: Pr[blocking], ...
- QoS engineering, based on a three-way relationship between performance, capacity and demand
Outline
- traffic characteristics
- QoS engineering for streaming flows
- QoS engineering for elastic traffic
- service differentiation
Internet traffic is self-similar
- a self-similar process: variability at all time scales
- due to: the infinite variance of flow sizes and TCP-induced burstiness
- a practical consequence: it is difficult to characterize a traffic aggregate
[figure: Ethernet traffic, Bellcore 1989]
Traffic on a US backbone link (Thomson et al, 1997)
- traffic intensity is predictable ...
- ... and stationary in the busy hour
Traffic on a French backbone link
- traffic intensity is predictable ...
- ... and stationary in the busy hour
[figure: traffic by time of day (06h, 12h, 18h, 00h), tuesday through monday]
IP flows
- a flow = one instance of a given application, a "continuous flow" of packets; basically two kinds of flow: streaming and elastic
- streaming flows: audio and video, real time and playback; rate and duration are intrinsic characteristics; not rate adaptive (an assumption); QoS: negligible loss, delay and jitter
- elastic flows: digital documents (Web pages, files, ...); rate and duration are measures of performance; QoS: adequate throughput (response time)
Flow traffic characteristics
- streaming flows: constant or variable rate; compressed audio (O[10^3 bit/s]), compressed video (O[10^6 bit/s]); highly variable duration; a Poisson flow arrival process (?)
- elastic flows: infinite variance size distribution; rate adaptive; a Poisson flow arrival process (??)
[figure: a variable rate video trace]
Modelling traffic demand
- stream traffic demand = arrival rate × bit rate × duration
- elastic traffic demand = arrival rate × size
- a stationary process in the "busy hour": eg, Poisson flow arrivals with independent flow sizes
[figure: busy hour traffic demand (Mbit/s) vs time of day]
Outline
- traffic characteristics
- QoS engineering for streaming flows
- QoS engineering for elastic traffic
- service differentiation
Open loop control for streaming traffic
- open loop control, a "traffic contract": QoS guarantees rely on traffic descriptors + admission control + policing
- time scale decomposition for performance analysis: packet scale, burst scale, flow scale
[figure: traffic contracts applied at the user-network and network-network interfaces]
Packet scale: a superposition of constant rate flows
- constant rate flows: packet size / inter-packet interval = flow rate; maximum packet size = MTU
- buffer size for negligible overflow? over all phase alignments, assuming independence between flows
- worst case assumptions: many low rate flows, MTU-sized packets
- buffer sizing for the M/D_MTU/1 queue: Pr[queue > x] ~ C e^{-rx}
[figure: log Pr[saturation] vs buffer size for M/D_MTU/1, with increasing flow numbers and packet sizes]
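The exponential tail of the M/D/1 queue can be checked numerically. A minimal simulation sketch (function name, load and sample count are illustrative, not from the talk): Poisson arrivals into a deterministic server via the Lindley recursion, estimating Pr[queue > x] from the workload seen by arrivals.

```python
import bisect
import random

def md1_tail(load, service=1.0, n_packets=200_000, seed=1):
    """Simulate an M/D/1 queue (Poisson arrivals, deterministic service)
    and return a function x -> estimated Pr[workload at arrival > x]."""
    random.seed(seed)
    rate = load / service              # arrival rate giving the target load
    w, samples = 0.0, []
    for _ in range(n_packets):
        samples.append(w)              # workload seen by this arrival (PASTA)
        inter = random.expovariate(rate)
        w = max(0.0, w + service - inter)   # Lindley recursion
    samples.sort()
    def tail(x):
        # fraction of arrivals that found more than x units of work queued
        return 1.0 - bisect.bisect_right(samples, x) / len(samples)
    return tail

tail = md1_tail(load=0.8)
for x in (2.0, 4.0, 6.0):
    print(x, tail(x))
```

Plotting log tail(x) against x gives a roughly straight line, the geometric decay behind the Pr[queue > x] ~ C e^{-rx} sizing rule.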
The "negligible jitter conjecture"
- constant rate flows acquire jitter, notably in multiplexer queues
- conjecture: if all flows are initially CBR, and in all queues the sum of flow rates is less than the service rate, they never acquire enough jitter to perform worse than a Poisson stream of MTU-sized packets
- M/D_MTU/1 buffer sizing thus remains conservative
Burst scale: fluid queueing models
- assume flows have an instantaneous rate, eg, the rate of on/off sources
- bufferless multiplexing: Pr[arrival rate > service rate] < ε
- buffered multiplexing: E[arrival rate] < service rate
[figure: arrival rate fluctuating at the packet and burst scales]
Buffered multiplexing performance: impact of burst parameters
[figure: log Pr[saturation] vs buffer size; the intercept at buffer size 0 is Pr[rate overload]; the decay worsens with longer bursts, with more variable burst lengths, and with long range dependence compared to short range dependence]
Choice of token bucket parameters?
- the token bucket is a virtual queue with service rate r and buffer size b
- non-conformance depends on burst size and variability, and on long range dependence
- a difficult choice for conformance: r >> mean rate ... or b very large
[figure: non-conformance probability vs bucket size b]
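The virtual-queue view of the token bucket can be made concrete. A minimal sketch (function name and traffic values are illustrative): tokens accumulate at rate r up to depth b, and a packet is conformant only if enough tokens are available when it arrives.

```python
def token_bucket_conformant(arrivals, r, b):
    """Check packet conformance against a token bucket (rate r, depth b),
    viewed as a virtual queue of service rate r and buffer size b.
    `arrivals` is a time-sorted list of (time, packet_size) pairs."""
    tokens, last = b, 0.0
    verdicts = []
    for t, size in arrivals:
        tokens = min(b, tokens + r * (t - last))   # refill since last packet
        last = t
        if size <= tokens:
            tokens -= size
            verdicts.append(True)
        else:
            verdicts.append(False)                 # non-conformant packet
    return verdicts

# a back-to-back burst exceeding depth b is non-conformant
# even though the long-run mean rate is far below r
print(token_bucket_conformant([(0.0, 500), (0.001, 800)], r=1000.0, b=1000))
```

This illustrates the dilemma on the slide: a bursty but low-rate source only conforms if r is set well above its mean rate or b is made very large.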
Bufferless multiplexing, alias "rate envelope multiplexing"
- provisioning and admission control ensure Pr[Λt > C] < ε, where Λt is the combined input rate and C the output rate
- performance depends only on the stationary rate distribution: loss rate ≈ E[(Λt − C)+] / E[Λt]
- insensitivity to self-similarity
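The loss-rate ratio can be evaluated directly once the stationary rate distribution is known. A sketch under illustrative assumptions (N independent on/off sources of identical peak rate, so the combined rate is binomial):

```python
from math import comb

def bufferless_loss_rate(n_sources, peak, p_on, capacity):
    """Loss rate E[(rate - C)+] / E[rate] for n_sources independent
    on/off sources of rate `peak`, each active with probability p_on."""
    mean_rate = n_sources * p_on * peak
    excess = 0.0
    for k in range(n_sources + 1):
        prob = comb(n_sources, k) * p_on**k * (1 - p_on)**(n_sources - k)
        excess += prob * max(0.0, k * peak - capacity)
    return excess / mean_rate

# peak rate 1% of link capacity, 10% utilization: loss is negligible;
# at 30% utilization the rate distribution starts to overflow C
print(bufferless_loss_rate(300, peak=0.01, p_on=0.1, capacity=1.0))
print(bufferless_loss_rate(300, peak=0.01, p_on=0.3, capacity=1.0))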
Efficiency of bufferless multiplexing
- small amplitude of rate variations: peak rate << link rate (eg, 1%) ...
- ... or low utilisation: overall mean rate << link rate
- we may have both in an integrated network: priority to streaming traffic, the residue shared by elastic flows
Flow scale: admission control
- accept a new flow only if transparency is preserved, given the flow's traffic descriptor and the current link status
- no satisfactory solution for buffered multiplexing (we do not consider deterministic guarantees): statistical performance is unpredictable
- measurement-based control for bufferless multiplexing: given the flow peak rate and the current measured rate (instantaneous rate, mean, variance, ...)
- an uncritical decision threshold if streaming traffic is light in an integrated network
Provisioning for negligible blocking
- "classical" teletraffic theory; assume M/M/m/m: Poisson arrivals at rate λ, constant rate r per flow, mean duration 1/μ; mean demand A = (λ/μ) r bit/s
- blocking probability for capacity C: B = E(C/r, A/r), where E(m, a) is Erlang's formula: E(m, a) = (a^m / m!) / Σ_{k=0..m} (a^k / k!)
- scale economies: the utilization a/m achievable at E(m, a) = 0.01 grows with m
- generalizations exist for different rates and for variable rates
[figure: utilization a/m for E(m, a) = 0.01 vs m]
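Erlang's formula and the scale-economies effect are easy to evaluate numerically. A minimal sketch (helper names are my own), using the standard stable recurrence for E(m, a):

```python
def erlang_b(m, a):
    """Erlang's loss formula E(m, a), computed via the recurrence
    E(0, a) = 1, E(k, a) = a*E(k-1, a) / (k + a*E(k-1, a))."""
    e = 1.0
    for k in range(1, m + 1):
        e = a * e / (k + a * e)
    return e

def utilization_at(blocking, m):
    """Largest utilization a/m such that E(m, a) <= blocking
    (bisection on the offered load a)."""
    lo, hi = 0.0, 10.0 * m
    for _ in range(100):
        mid = (lo + hi) / 2
        if erlang_b(m, mid) <= blocking:
            lo = mid
        else:
            hi = mid
    return lo / m

# scale economies: achievable utilization at 1% blocking grows with m
for m in (10, 100, 1000):
    print(m, round(utilization_at(0.01, m), 3))
```

The output shows why big links are efficient: at 1% blocking, a 10-circuit link runs well under half full while a 1000-circuit link runs above 90% utilization.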
Outline
- traffic characteristics
- QoS engineering for streaming flows
- QoS engineering for elastic traffic
- service differentiation
Closed loop control for elastic traffic
- reactive control: end-to-end protocols (eg, TCP) and queue management
- time scale decomposition for performance analysis: packet scale, flow scale
Packet scale: bandwidth and loss rate
- a multi-fractal arrival process
- but loss and bandwidth are related by TCP (cf. Padhye et al.): throughput B(p) as a function of loss rate p, with B ~ p^{-1/2} in congestion avoidance
- thus p = B^{-1}(b): the loss rate depends on the bandwidth share b
[figure: throughput B(p) vs loss rate p]
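The square-root relation and its inverse can be sketched as follows. This is only the simplified form of the Padhye et al. model (ignoring timeouts and the receiver window), and the MSS and RTT values are illustrative:

```python
from math import sqrt

def tcp_throughput(p, mss=1500 * 8, rtt=0.1):
    """Simplified square-root model: B(p) = (MSS/RTT) * sqrt(3/(2p)),
    in bit/s, for loss rate p in congestion avoidance."""
    return (mss / rtt) * sqrt(3.0 / (2.0 * p))

def loss_for_share(bandwidth, mss=1500 * 8, rtt=0.1):
    """Invert B: the loss rate p = B^{-1}(b) a TCP flow must
    experience to hold a given bandwidth share b."""
    return 1.5 * (mss / (rtt * bandwidth)) ** 2

print(tcp_throughput(0.01))      # bit/s at 1% loss
```

Quadrupling the loss rate halves the throughput, which is the B ~ p^{-1/2} behaviour quoted on the slide.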
Packet scale: bandwidth sharing
- reactive control (TCP, scheduling) shares bottleneck bandwidth unequally, depending on RTT, protocol implementation, etc., and on differentiated services parameters
- optimal sharing in a network, objectives and algorithms: max-min fairness, proportional fairness, max utility, ...
- ... but response time depends more on the traffic process than on the static sharing algorithm!
[figure: example: a linear network with routes 0, 1, ..., L]
Flow scale: performance of a bottleneck link
- assume perfect fair shares: link rate C, n elastic flows, each flow served at rate C/n
- assume Poisson flow arrivals: an M/G/1 processor sharing queue; load ρ = arrival rate × size / C
- performance insensitive to the size distribution: Pr[n transfers] = ρ^n (1 − ρ); E[response time] = size / (C (1 − ρ))
- instability if ρ > 1, i.e., unbounded response time; stabilized by aborted transfers ... or by admission control
[figure: a processor sharing queue; fair shares of link capacity C]
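The two processor-sharing results translate directly into code; a minimal sketch with illustrative parameter values:

```python
def ps_distribution(load, n_max):
    """Stationary distribution of the number of transfers in an
    M/G/1 processor-sharing queue: Pr[n] = (1 - load) * load**n."""
    return [(1 - load) * load**n for n in range(n_max)]

def ps_response_time(size, capacity, load):
    """Mean response time of a transfer of `size` bits on a link of
    `capacity` bit/s at load `load`; insensitive to the size distribution."""
    if load >= 1:
        raise ValueError("unstable: load >= 1 gives unbounded response time")
    return size / (capacity * (1 - load))

# a 10 Mbit transfer on a 100 Mbit/s link at 80% load:
# mean response time 0.5 s, versus 0.1 s on an empty link
print(ps_response_time(10e6, 100e6, 0.8))
```

The 1/(1 − ρ) stretch factor is the whole flow-scale story: response time degrades smoothly with load until ρ approaches 1, then blows up.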
Generalizations of the PS model
- non-Poisson arrivals: Poisson session arrivals, Bernoulli feedback
- discriminatory processor sharing: a weight for each class, class-i flows served at a rate proportional to their weight
- rate limitations (same for all flows): a maximum rate per flow (eg, the access rate); a minimum rate per flow (by admission control)
[figure: Poisson session arrivals; flows alternate between transfer (processor sharing) and think time (infinite server), repeating with probability p]
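Discriminatory processor sharing fits in a few lines (function and variable names are illustrative): with n_i flows of weight w_i in progress, each class-i flow is served in proportion to its weight.

```python
def dps_rates(counts, weights, capacity):
    """Per-flow service rates under discriminatory processor sharing:
    a class-i flow gets capacity * w_i / sum_j(n_j * w_j),
    given per-class flow counts and weights."""
    denom = sum(n * w for n, w in zip(counts, weights))
    return [capacity * w / denom for w in weights]

# two class-1 flows (weight 1) and one class-2 flow (weight 2)
# sharing a 30 Mbit/s link: the class-2 flow gets twice the rate
print(dps_rates([2, 1], [1.0, 2.0], 30.0))
```

Plain processor sharing is the special case of equal weights; the weights skew the shares without changing the total.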
Admission control can be useful ... to prevent disasters at sea!
Admission control can also be useful for IP flows
- improve the efficiency of TCP: reduce retransmission overhead ... by maintaining throughput
- prevent instability due to overload (ρ > 1) ... and retransmissions
- avoid aborted transfers: user impatience, "broken connections"
- a means for service differentiation ...
Choosing an admission control threshold
- N = the maximum number of flows admitted; minimum bandwidth per flow = C/N
- M/G/1/N processor sharing system: Pr[blocking] = ρ^N (1 − ρ) / (1 − ρ^{N+1}) → (1 − 1/ρ) for ρ > 1
- negligible blocking when ρ < 1, so the choice of threshold is uncritical: eg, a minimum share of 1% of link capacity (N = 100)
[figure: E[response time]/size and blocking probability vs N, for ρ = 0.9 and ρ = 1.5]
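The blocking formula and its overload limit can be checked directly; a minimal sketch (the load = 1 branch uses the uniform-distribution limit of the formula):

```python
def ps_blocking(load, n_max):
    """Blocking probability of an M/G/1/N processor-sharing link
    admitting at most n_max flows:
    B = load**N * (1 - load) / (1 - load**(N+1))."""
    if load == 1.0:
        return 1.0 / (n_max + 1)   # limit of the formula at load 1
    return load**n_max * (1 - load) / (1 - load**(n_max + 1))

# the threshold is uncritical: with N = 100 (min share 1% of the link),
# blocking is negligible below capacity and near its fluid floor above it
print(ps_blocking(0.9, 100))    # underload: essentially zero
print(ps_blocking(1.5, 100))    # overload: close to 1 - 1/1.5
```

Below capacity the threshold is almost never reached; above capacity blocking converges to 1 − 1/ρ whatever the exact N, which is why the choice of N is uncritical.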
Impact of access rate on backbone sharing
- TCP throughput is limited by the access rate (modem, DSL, cable) ...
- ... and by server performance
- a backbone link is a bottleneck only if saturated, ie, if ρ > 1
[figure: backbone link (rate C) fed by access links (rate << C); per-flow throughput capped at the access rate]
Provisioning for negligible blocking for elastic flows
- "elastic" teletraffic theory; assume M/G/1/m: Poisson arrivals at rate λ, mean size s; utilization ρ = λs/C; m = the admission control limit
- blocking probability for capacity C: B(ρ, m) = ρ^m (1 − ρ) / (1 − ρ^{m+1})
- impact of access rate: m = C / access rate; compare B(ρ, m) with the Erlang loss E(m, ρm)
[figure: utilization ρ for B = 0.01 vs m; curves for B(ρ, m) and E(m, ρm)]
Outline
- traffic characteristics
- QoS engineering for streaming flows
- QoS engineering for elastic traffic
- service differentiation
Service differentiation
- discriminating between streaming and elastic flows: transparency for streaming flows, response time for elastic flows
- discriminating between streaming flows: different delay and loss requirements ... or the best quality for all?
- discriminating between elastic flows: different response time requirements ... but how?
Integrating streaming and elastic traffic
- priority to the packets of streaming flows: low utilization, negligible loss and delay
- elastic flows use all remaining capacity: better response times; per-flow fair queueing (?)
- to prevent overload: flow based admission control ... and adaptive routing
- an identical admission criterion for streaming and elastic flows: available rate > R
[figure: streaming packets served with priority; elastic packets share the residue]
Differentiation for stream traffic
- different delays? priority queues, WFQ, ... but what guarantees?
- different loss? different utilization (CBQ, ...); "spatial queue priority": partial buffer sharing, push-out
- or negligible loss and delay for all: elastic-stream integration ... and low stream utilization
Differentiation for elastic traffic
- different utilization: separate pipes, class based queueing
- different per-flow shares: WFQ; impact of RTT, ...
- discrimination in overload: impact of aborts (?) or by admission control
[figure: per-class throughput vs access rate for 1st, 2nd and 3rd class]
Different accessibility
- block class 1 when N1 = 100 flows are in progress; block class 2 when N2 flows are in progress (N2 < N1)
- class 1 thus has higher admission priority than class 2
[figure: blocking probability vs N2, for ρ = 0.9 and ρ = 1.5]
Different accessibility
- block class 1 when N1 = 100 flows are in progress; block class 2 when N2 = 50 flows are in progress
- in underload: both classes have negligible blocking (B1 ≈ B2 ≈ 0)
- in overload: discrimination is effective:
  if ρ1 < 1 < ρ1 + ρ2: B1 ≈ 0, B2 ≈ (ρ1 + ρ2 − 1) / ρ2
  if 1 < ρ1: B1 ≈ (ρ1 − 1) / ρ1, B2 ≈ 1
[figure: B1 and B2 vs load, for ρ1 = ρ2]
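The overload regimes for two-class accessibility can be written out as a small function. This is only the fluid approximation of the per-class blocking probabilities, with illustrative naming:

```python
def two_class_overload_blocking(rho1, rho2):
    """Approximate (B1, B2) when class 1 has the higher admission
    threshold (fluid regime, thresholds large): class 2 absorbs the
    excess load first, then class 1 sheds its own excess."""
    total = rho1 + rho2
    if total <= 1:
        return 0.0, 0.0                     # underload: negligible blocking
    if rho1 < 1:
        return 0.0, (total - 1) / rho2      # only class 2 is blocked
    return (rho1 - 1) / rho1, 1.0           # class 2 is shut out entirely

# class 1 fits alone (0.8 < 1) but the total overloads the link
print(two_class_overload_blocking(0.8, 0.5))
```

The example shows the useful property of threshold-based differentiation: class 1 keeps negligible blocking as long as its own load fits, with class 2 carrying all the overload.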
Service differentiation and pricing
- different QoS requires different prices ... or users will always choose the best
- ... but streaming and elastic applications are qualitatively different: choose the streaming class for transparency, the elastic class for throughput; no need for streaming/elastic price differentiation?
- different prices exploit different "willingness to pay" ... bringing greater economic efficiency
- ... but QoS is not stable or predictable: it depends on route, time of day, ... and on factors outside network control: access, server, other networks, ...
- network QoS is not a sound basis for price discrimination
Pricing to pay for the network
- fix a price per byte to cover the cost of infrastructure and operation
- estimate demand at that price
- provision the network to handle that demand with excellent quality of service
[figure: demand and capacity vs time of day; at the optimal price, revenue = cost]
Outline
- traffic characteristics
- QoS engineering for streaming flows
- QoS engineering for elastic traffic
- service differentiation
- conclusions
Conclusions
- a statistical characterization of demand: a stationary random process in the busy period; a flow level characterization (streaming and elastic flows)
- transparency for streaming flows: rate envelope ("bufferless") multiplexing; the "negligible jitter conjecture"
- response time for elastic flows: a "processor sharing" flow scale model; instability in overload (i.e., E[demand] > capacity)
- service differentiation: distinguish streaming and elastic classes; limited scope for within-class differentiation; flow admission control in case of overload