Ph.D. Final Examination August 8, 2006


1 Analysis and Implementation of Multiplexing Techniques in Connection-Oriented Communication Networks
Ph.D. Final Examination, August 8, 2006
Tao Li, Department of Electrical and Computer Engineering, SEAS, University of Virginia

2 References
T. Li, D. Logothetis, M. Veeraraghavan, “Analysis of a polling system for telephony traffic with application to wireless LANs,” IEEE Transactions on Wireless Communications, vol. 5, pp , June 2006.
T. Li, M. Veeraraghavan, “Resource allocation for a polling system with application to wireless LANs,” to be submitted for journal publication.
H. Wang, M. Veeraraghavan, R. Karri, T. Li, “Design of a High-Performance RSVP-TE Signaling Hardware Accelerator,” IEEE Journal on Selected Areas in Communications (JSAC), vol. 23, no. 8, pp , August 2005.
H. Wang, M. Veeraraghavan, R. Karri, T. Li, “Hardware-Accelerated Implementation of the RSVP-TE Signaling Protocol,” in Proc. of IEEE ICC2004, June 20-24, 2004, Paris, France.

3 Outline
Background
Problem statement and contributions
Study of a polling system with vacations
Implementation of a signaling control card
Conclusions

4 Background
Applications have diverse Quality of Service (QoS) requirements (bandwidth, delay, loss, etc.)
  Deterministic QoS guarantees: mission-critical control
  Statistical QoS guarantees: most audio/video applications
  No specific requirements: best-effort applications
Two types of networking technologies
  Connectionless (CL): Internet, best-effort type of service
  Connection-Oriented (CO): support of QoS
    Circuit-switched networks: SONET, WDM, etc.
    Packet-switched networks: ATM, MPLS, etc.

5 Background (more)
Chief characteristics of CO networks
  Resources are reserved prior to data transfer in a call admission control (CAC) phase
  Resources are left idle during connection setup phase
  Per-connection state maintenance at control-plane
How to reserve resources? Through signaling protocols: RSVP-TE, PNNI, SS7, etc.
Figure: Architecture of a CO switch

6 Background (more)
In circuit-switched networks
  Reserve a dedicated circuit for a connection
In packet-switched networks
  Reserve bandwidth, buffer space, etc., for a connection
  Data plane: packet classification, policing, scheduling, buffer management
How much resource should be reserved?
  Depends on service model (hard QoS or soft QoS), traffic characteristics (burstiness), buffer size, scheduling algorithms

7 Background (more)
Multiplexing techniques in shared-medium based access
  Connection-Oriented
    Circuit-switched networks: FDMA, TDMA
    Packet-switched networks: polling, scheduling-based access
  Connectionless
    Random access

8 Problem statement
Our mission:
Study a polling system for QoS provisioning
  With application to IEEE
  Target real-time application: telephony
  A data-plane problem
Demonstrate that signaling protocols can, in spite of their complexity, be implemented in hardware
  Performance gain in terms of call-handling capacity and message processing delay
  A control-plane problem
Supported by NSF, DOE

9 Contributions
Study of a polling scheme
  CDF of delay in a single-queue scenario
    Assumes a continuous-time Markov Modulated Fluid (MMF) model
    Can be used to approximate the CDF of delay in certain multiple-queue cases
  Voice capacity and delay bounds (deterministic service)
    For the MMF model or a discrete-time Markov ON/OFF model
    Allows heterogeneity
  Voice capacity (statistical service)
    MMF model: results obtained by simulations
  Resource allocation (statistical service)
    Assumes a discrete-time Markov ON/OFF model
    Derives approximations for the tradeoff between a service degradation measure (overflow probability or packet loss ratio) and resource allocation

10 Contributions (more)
Implementation of a signaling control card
  Schematic design at a later stage
    Power regulation module
    Prior work completed by collaborators (Haobo Wang, Liji Wu)
  Collaborated with Appli-CAD Inc. for PCB design
    Provided a reference design for 1.25Gbps signal path
    Examination of placement and routing
  Design or VHDL implementation of some functional modules
    Configuration module, PCI interface module, FIFO interface unit, switch-fabric interface unit
  Software design (device driver; contributed to a message generator)
  Debugging (board and VHDL)

11 Overview
Background
Problem statement and contributions
Study of a polling system with vacations
  Motivation and related work
  System model
  Analysis with a continuous-time MMF model
  Analysis with a discrete-time Markov model
Implementation of a signaling control card
Conclusions

12 Motivation
Several communication systems simultaneously support CO and CL modes of operation
  IEEE polling and random access
  DOCSIS and IEEE Extended Real-Time Variable Rate and Best-Effort services
In CO mode: scheduling-based channel access
Figure labels: scheduler, downstream, upstream

13 Motivation (more)
Problem: queue status information is distributed among stations for the upstream direction
  Instantaneous queue status is not available to the scheduler
  Cannot directly use scheduling algorithms that need arrival times, queue occupancy, or packet size
Continuous exchange of queue status information can be expensive
  Wireless bandwidth is scarce

14 Motivation (more)
Polling emerges as a choice
  Serves all queues in round-robin order; does not require queue status information
  Easy to implement: O(1) time complexity
  Trades efficiency for timeliness (hard)
    Transmission of a poll signal consumes bandwidth
    If the interpoll time is bounded, delay is also bounded; suitable for delay-sensitive applications such as telephony
Question: how many calls can be admitted? Or, how much resource should be allocated for voice calls?

15 Related work
Papers on general polling systems
  Poisson arrival process; do not consider voice traffic
Papers on QoS provisioning in wired and wireless networks
  Do not specifically address the polling scheme considered in our work
Papers on voice support over MAC protocols
  Do not specifically address the polling scheme
Papers on voice support over IEEE polling mode
  Largely simulation-based

16 System model
Assume a superframe structure
  Polling period: supports voice calls
  Vacation period: other resource sharing schemes
  Partition between polling and vacation: vacation is at least θ×TS
Figure labels: frame, vacation stretch (VS), polling period, vacation, foreshortened polling period, superframe length TS

17 System model (more)
Polling order
  Round-robin with a restriction: each queue can be served at most once in a polling period
Walk time (Twalk)
  Time needed for the server to move from one queue to another; models physical and MAC layer overheads
Service discipline: gated service
  Pack all voice packets into one MAC frame when responding to a poll
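
The polling order, walk time, and gated-service rule can be sketched as a single polling period. This is a minimal illustration, not the model analyzed in the talk: the function name, the per-packet transmission time `t_packet`, the time `budget`, and the choice to stop at the first queue that no longer fits are all assumptions made for the example.

```python
def run_polling_period(queues, t_walk, t_packet, budget):
    """Poll each queue at most once, in round-robin order, within a time budget.

    queues   -- list of packet backlogs, one entry per station (mutated in place)
    t_walk   -- walk time charged for every poll, even if the queue is empty
    t_packet -- assumed transmission time of one voice packet
    budget   -- assumed length of the polling period
    Returns (served, time_used), where served lists (queue index, packets sent).
    """
    time_used = 0.0
    served = []
    for i, backlog in enumerate(queues):
        # Gated service: everything queued at the poll instant goes out
        # in one frame, so the cost is one walk time plus the payload time.
        cost = t_walk + backlog * t_packet
        if time_used + cost > budget:
            break  # assumed policy: stop when the next queue does not fit
        time_used += cost
        served.append((i, backlog))
        queues[i] = 0
    return served, time_used


# Hypothetical numbers (milliseconds): four stations, 0.23 ms walk time.
backlogs = [2, 0, 1, 3]
served, used = run_polling_period(backlogs, t_walk=0.23, t_packet=0.1, budget=1.0)
```

Note that an empty queue still costs one walk time, which is the bandwidth overhead of polling mentioned earlier in the talk.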

18 Overview
Background
Problem statement and contributions
Study of a polling system with vacations
  Motivation and related work
  System model
  Analysis with a continuous-time MMF model
    Source model
    Delay analysis in a single queue case
    Multiple-queue analysis and simulation
  Analysis with a discrete-time Markov model
Implementation of a signaling control card
Conclusions

19 Source model
Markov Modulated Fluid model
  Continuous in time
  a and b are transition rates
  When ON, a bit stream is created at a constant rate c; when OFF, silence
  Average ON time: 352ms
  Average OFF time: 650ms
  May and Zebo 1968 model
QoS requirements
  Stringent in delay
  Can tolerate a small loss ratio
Figure: two-state ON/OFF Markov chain with transition rates a and b
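
As a small worked example of the source model, the transition rates and the steady-state ON probability follow directly from the two mean sojourn times. The function and variable names are illustrative; the slide does not say which of a and b leaves which state, so the sketch names the rates by what they do instead.

```python
def on_off_parameters(mean_on_s, mean_off_s):
    """Return (rate leaving ON, rate leaving OFF, steady-state P(ON))."""
    rate_leave_on = 1.0 / mean_on_s     # exponential ON sojourn
    rate_leave_off = 1.0 / mean_off_s   # exponential OFF sojourn
    p_on = rate_leave_off / (rate_leave_on + rate_leave_off)
    return rate_leave_on, rate_leave_off, p_on


# Mean ON and OFF times from the slide, in seconds.
rate_on_to_off, rate_off_to_on, p_on = on_off_parameters(0.352, 0.650)

# Long-run average bit rate of one source for a given peak rate c (bits/s).
c = 64000.0
mean_rate = c * p_on
```

The ON probability equals 352/(352+650), so a source is active roughly 35% of the time.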

20 Delay analysis in a single queue case
Delay of interest: DW=DQ+DS
  DQ: queueing delay
  DS: service time, depends on service rate R and data size
  DS=0: empty packet, not of interest
First, compute the PDF of TI, given TI=TS+(stretch2 - stretch1)
  Assume: stretch1 and stretch2 are i.i.d. random variables with known PDF
Second, compute P{DQ≤q|TI=t, nonempty packet}, and then obtain P{DQ≤q|nonempty packet} by unconditioning
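
The first step can be illustrated with a discretized stretch distribution. This is a sketch under assumptions: the talk works with a PDF, whereas the example uses a probability mass function on a millisecond grid, and the stretch values are made up for illustration.

```python
from collections import defaultdict


def interval_pmf(ts, stretch_pmf):
    """PMF of TI = TS + (stretch2 - stretch1) for i.i.d. discrete stretches.

    ts          -- superframe length, in the same unit as the stretch values
    stretch_pmf -- dict mapping stretch value -> probability
    """
    pmf = defaultdict(float)
    for s1, p1 in stretch_pmf.items():
        for s2, p2 in stretch_pmf.items():
            pmf[ts + s2 - s1] += p1 * p2  # independence: probabilities multiply
    return dict(pmf)


# Hypothetical stretch distribution in whole milliseconds, TS = 30 ms.
stretch_pmf = {0: 0.5, 1: 0.3, 2: 0.2}
ti_pmf = interval_pmf(30, stretch_pmf)
```

Because the two stretches are identically distributed, the resulting distribution is symmetric around TS.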

21 Delay analysis (more)
Third, compute P{Z≤z|DQ=q} and P{DS≤s|DQ=q}
  Z: total time spent in the ON state during DQ
  Can be solved with a uniformization technique
  Z can be linked to DS by DS=Zc/R
  c: source rate; R: service rate
Finally, combine the results, given DW=DQ+DS
  P{DQ≤q|nonempty packet} obtained in the second step
  P{DS≤s|DQ=q} obtained in the third step
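
The relation DS=Zc/R can be explored numerically. The sketch below uses plain Monte Carlo sampling rather than the uniformization technique named on the slide, and it assumes the source is in steady state at the start of the interval, which may differ from the conditioning used in the analysis. All names and the rates in the usage lines are illustrative.

```python
import random


def sample_on_time(q, mean_on, mean_off, rng):
    """Sample Z, the time a two-state ON/OFF source spends ON during length q."""
    p_on = mean_on / (mean_on + mean_off)
    on = rng.random() < p_on              # assumed: steady-state start
    remaining, z = q, 0.0
    while remaining > 0.0:
        # Exponential sojourn in the current state (memoryless, so a fresh
        # draw at the start of the interval is valid in steady state).
        sojourn = rng.expovariate(1.0 / (mean_on if on else mean_off))
        dt = min(sojourn, remaining)
        if on:
            z += dt
        remaining -= dt
        on = not on
    return z


def service_time(z, c, r):
    """DS = Z*c/R: bits generated while ON, drained at service rate R."""
    return z * c / r


rng = random.Random(1)
z_samples = [sample_on_time(0.030, 0.352, 0.650, rng) for _ in range(20000)]
# Empirical P{Z <= z} at z = 10 ms, for a 30 ms interval.
cdf_at_10ms = sum(z <= 0.010 for z in z_samples) / len(z_samples)
```

An empirical CDF of this kind is only a sanity check; the closed-form route via uniformization is what the talk relies on.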

22 Delay analysis (more)
All numerical results: assume IEEE 802.11b PHY
Figure: CDF of DW with TS as a parameter; Twalk and C are set to 0.23ms and 8.5Kbps, respectively

23 Multiple-queue case
Deterministic service
  Each queue is guaranteed to be polled in a superframe
  Number of queues N ≤ Np (voice capacity)
  Referred to as the small-N regime of operation
Statistical service (when N > Np)
  Service degradation: not guaranteed to be polled in each superframe; statistical QoS guarantees
  Statistical multiplexing gain since N > Np
  Referred to as the large-N regime of operation

24 Computation of Np: worst-case analysis
Polls in the kth interval: empty packets
Polls in the (k+1)th interval: maximum-sized packets
Vacation stretch: VSmax
Admission condition
Np can be computed iteratively
Delay bound DWmax,i

25 Delay in small-N regime of operation
Simulation results: CCDF of DW; θ, codec rate, and Twalk are set to 0.5, 64Kbps, and 0.23ms, respectively
Implication: delay analysis in the single queue case is a fair approximation, given the range of parameter values under consideration

26 Cost of large-N regime of operation
Simulation: CCDF of delay with N' as a parameter. TS, θ, and codec rate are equal to 30ms, 0.5, and 8.5Kbps, respectively.
Implication of delay spikes: use DWmax as delay threshold, and P{DW>DWmax} as performance measure (Ploss)

27 Statistical multiplexing
Simulation results
Codec rate, Twalk, and stretch are set to 8.5Kbps, 0.23ms, and VSmax, respectively
Capacity increases with TS: payload size vs. Twalk
Multiplexing gain is small: large Twalk, small codec rate

28 Statistical multiplexing
Simulation results
Codec rate, Twalk, and stretch are set to 64Kbps, 0.13ms, and VSmax, respectively
Multiplexing gain is significant: small Twalk, large codec rate
Small Twalk is attainable

29 Overview
Background
Problem statement and contributions
Study of a polling system with vacations
  Motivation, related work
  System architecture
  Analysis with a continuous-time MMF model
  Analysis with a discrete-time Markov model
Implementation of a signaling control card
Conclusions

30 Assume a discrete-time Markov model
Motivation
  Voice traffic needs to be packetized for transmission in a packet-switched network
  A discrete-time Markov model is more realistic
  Tractability in analysis
Extend worst-case analysis for small-N regime of operation to discrete-time Markov model
  We derive voice capacity Nl and delay bound Dbound
  Details are omitted
  Delay performance is studied through simulations

31 Resource allocation for large-N
Tsrv: the total time spent on N queues in a superframe
Performance criterion: overflow probability, P{Tsrv>x} ≤ ε
  The smallest x satisfying the above criterion is the amount of time that should be allocated for the polling period, denoted as Tp(ε)
Difficulty in exact analysis of P{Tsrv}: correlation
Key approximation: correlation between DS,i, i=1,2,…,N, is small
  Approximate DS,i, i=1,2,…,N, as i.i.d. random variables

32 Analytical approach
Consider a reference service discipline
  Does not incur correlation between DS,i
  Perform an exact analysis for this reference service discipline
  View the results as approximations for the gated-service discipline
Figure: reference service discipline: serve 1, 2, 3, but not 4

33 Analytical approach
Other assumptions: TS=KL; synchronization
First, compute PK(m) for one queue
  PK(m): the probability of m arrivals in K time slots
  Using a recursive approach
Then compute the overflow probability
Computational complexity: O(NlogN) with FFT
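
The two steps can be sketched as follows. This is an illustration under several assumptions that do not come from the slides: a source emits exactly one packet per ON slot, the slot-to-slot transition probabilities `p_on_on` and `p_off_off` are made-up values, the time spent on a queue is one walk time plus a fixed time per packet, and the N-fold convolution is done directly rather than with an FFT.

```python
def arrivals_pmf(k_slots, p_on_on, p_off_off):
    """P_K(m): probability of m packets in K slots for one ON/OFF source."""
    p_on = (1 - p_off_off) / ((1 - p_on_on) + (1 - p_off_off))  # stationary
    # on[m] / off[m]: probability of m arrivals so far, ending ON / OFF.
    on = [0.0, p_on]          # first slot ON  -> one packet
    off = [1 - p_on, 0.0]     # first slot OFF -> no packet
    for _ in range(k_slots - 1):
        new_on = [0.0] * (len(on) + 1)
        new_off = [0.0] * (len(on) + 1)
        for m in range(len(on)):
            new_on[m + 1] += on[m] * p_on_on + off[m] * (1 - p_off_off)
            new_off[m] += on[m] * (1 - p_on_on) + off[m] * p_off_off
        on, off = new_on, new_off
    return [a + b for a, b in zip(on, off)]


def convolve(a, b):
    """Direct convolution of two PMFs given as lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def total_arrivals_pmf(pmf_one, n_queues):
    """PMF of the total packet count over N queues treated as i.i.d."""
    total = [1.0]
    for _ in range(n_queues):
        total = convolve(total, pmf_one)
    return total


def overflow_probability(total_pmf, n_queues, t_walk, t_packet, x):
    """P{Tsrv > x}, assuming Tsrv = N*Twalk + (total packets)*t_packet."""
    return sum(p for m, p in enumerate(total_pmf)
               if n_queues * t_walk + m * t_packet > x)


# Hypothetical parameters: K = 3 slots per superframe, N = 20 queues, times in ms.
pmf_one = arrivals_pmf(3, p_on_on=0.9, p_off_off=0.95)
pmf_all = total_arrivals_pmf(pmf_one, 20)
p_over = overflow_probability(pmf_all, 20, t_walk=0.23, t_packet=0.3, x=15.0)
```

Scanning x upward until the overflow probability drops below a target ε gives the polling-period allocation under these assumptions.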

34 Computation of loss ratio
If the waiting time is too long, the packet will be dropped
Define loss ratio as Ploss=E{Nloss}/E{Ntotal}
  Nloss: number of lost packets in a superframe
  Ntotal: number of created packets in a superframe
Ploss can be linked to overflow probability
  Ploss ≤ P{Tsrv>x}/PON, where PON is the probability of a voice source being in the ON state
  This approximation of Ploss is not very accurate
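
As a quick numerical reading of the bound, the sketch below evaluates P{Tsrv>x}/PON for an overflow probability chosen purely for illustration; PON is taken from the 352ms/650ms source parameters given earlier in the talk, and the function name is an assumption.

```python
def loss_ratio_bound(p_overflow, p_on):
    """Upper bound on Ploss from the overflow probability, capped at 1."""
    return min(1.0, p_overflow / p_on)


p_on = 0.352 / (0.352 + 0.650)          # steady-state ON probability
bound = loss_ratio_bound(1e-3, p_on)    # hypothetical P{Tsrv > x} = 0.001
```

Since PON is about 0.35, the bound is roughly three times the overflow probability, which is consistent with the remark that it is a loose approximation.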

35 Computation of loss ratio (more)
For the reference service discipline, an exact computation of Ploss is possible
Computational complexity: O(N²) with direct convolution
For Ω:

36 Numerical results
Tp: polling period length
TS: 30ms
Dbound is set to TS+L+2ms
  L: packetization interval, 10ms
Simulation: assume the gated-service discipline; drop the synchronization assumption; allow clock skew and phase error
The approximation of P{Tsrv>x} is satisfactory
Ploss can better approximate the “actual” loss ratio
  Cost: computational complexity
Implication: use P{Tsrv>x} as the QoS measure if computational complexity is a major concern

37 Overview
Background
Problem statement and contributions
Study of a polling system with vacations
Implementation of a signaling control card
  Motivation, related work, and solution approach
  System architecture, block diagram, and picture
  Modules of the signaling control card
  Performance
Conclusions

38 Motivation
Signaling protocols
  Characteristics
    Complex (parameters, timers, data-table lookups, state information)
    Requirement for flexibility
  Traditionally implemented in software
    Call-handling capacities: 1K calls/second ~ 10K calls/second
    Call-setup delay: on the order of hundreds of milliseconds
    Sycamore SN16000 switch: per-message processing delay 90ms

39 Motivation (more)
Problems with software implementation
  Call-setup delay impacts utilization
  Hard to meet the requirement for high call-handling capacities in future CO networks
Objective: demonstrate that signaling protocols can be implemented in hardware in spite of their complexity
  Reduce call-setup delay by at least two to three orders of magnitude
  Increase call-handling capacity significantly
Target signaling protocol and switch
  RSVP-TE with extensions for GMPLS
  SONET switch

40 Related work
TCP offloading engine
  Observation: overhead of TCP/IP processing overwhelms the server’s CPU
  Solution: move TCP/IP processing to dedicated hardware
Software implementations of RSVP-TE
  E.g., Sycamore SN16000 switch with a per-message processing delay of about 90ms

41 Solution approach
Manage the complexity of signaling protocols
  By supporting only the basic and most frequently used messages/parameters in hardware and relegating the rest to software
  Define a subset of the signaling protocol for hardware implementation (RSVP-TE with extensions for GMPLS)
    Four messages related to connection setup and release: Path, Resv, PathTear, and ResvTear
    Support all mandatory objects/parameters and the optional parameters needed for a SONET switch
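
The hardware/software split can be pictured as a simple dispatcher. This is a software sketch of the idea only: the class and attribute names are hypothetical, and a real design would also inspect the objects and parameters inside each message before deciding, which the example omits.

```python
from collections import deque

# The four message types the talk assigns to the hardware path.
HARDWARE_MESSAGES = {"Path", "Resv", "PathTear", "ResvTear"}


class SignalingDispatcher:
    """Route supported messages to a fast path and queue the rest for software."""

    def __init__(self):
        self.software_fifo = deque()   # unsupported messages wait for the host CPU
        self.hardware_handled = []     # stand-in for the hardware fast path

    def dispatch(self, msg_type, payload):
        if msg_type in HARDWARE_MESSAGES:
            self.hardware_handled.append((msg_type, payload))
            return "hardware"
        self.software_fifo.append((msg_type, payload))
        return "software"


dispatcher = SignalingDispatcher()
dispatcher.dispatch("Path", b"setup request")
dispatcher.dispatch("PathErr", b"error report")   # not in the hardware subset
```

Keeping the common case small is what makes a fixed-function implementation feasible, while the FIFO preserves full protocol coverage through software.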

42 Solution approach (more)
Meet the flexibility requirement using a reconfigurable Field Programmable Gate Array (FPGA)
  FPGA can be reloaded with updated versions
Achieve fast data-table lookups and state maintenance by using Ternary Content Addressable Memory (TCAM)
  TCAM: a special memory device designed for data-table lookups
  Complexity of a lookup operation: one clock cycle
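
Ternary matching can be modeled in software to show what the lookup does. This is a behavioral sketch only: a TCAM compares all entries in parallel in hardware, whereas the loop below checks them one by one, and the table contents are hypothetical route entries.

```python
class TernaryTable:
    """Software model of ternary matching: entries are (value, mask, result)."""

    def __init__(self):
        self.entries = []

    def add(self, value, mask, result):
        # Mask bit 1 means "compare this bit"; mask bit 0 means "don't care".
        self.entries.append((value & mask, mask, result))

    def lookup(self, key):
        # Entries are checked in insertion order, so the first (highest
        # priority) match wins, as with the lowest matching TCAM address.
        for value, mask, result in self.entries:
            if key & mask == value:
                return result
        return None


# Hypothetical routing entries: the longer prefix is stored first.
table = TernaryTable()
table.add(0x0A010000, 0xFFFF0000, "port 3")   # 10.1.0.0/16
table.add(0x0A000000, 0xFF000000, "port 1")   # 10.0.0.0/8
next_hop = table.lookup(0x0A010203)           # 10.1.2.3
```

Storing more specific entries at higher priority is the usual way to get longest-prefix behavior out of a first-match device.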

43 System architecture
Focus on signaling control card
Backplane: often proprietary; we assume a PCI bus
Switch fabric card: assume Vitesse 64x64 STS-12
  Cross-connection rate: STS-1 (51.8Mbps); total bandwidth: 40Gbps

44 Block diagram of implementation
Figure: block diagram with labels: optical fiber, Gbit Ethernet module, hardware signaling accelerator, PCI interface module, PCI bus, power regulation module, configuration module; voltage labels 5v, 3.3v and 1.5v, 1.8v, 2.5v

45 Top view of the card

46 Gigabit Ethernet module
Optical-fiber transceiver: convert between optical signals and differential PECL signals
SerDes: convert between serial PECL signals and parallel TTL signals
Ethernet controller: 8B/10B encoding/decoding, MAC layer operations

47 Hardware signaling accelerator module
Hardware signaling accelerator core: all major functions, such as message parsing, creating commands for route lookup, state maintenance, and switch-fabric programming
MAC/Switch fabric/FIFO/TCAM_SRAM interface units: data path, control/timing signals
FIFO: temporary storage of unsupported signaling messages
TCAM/SRAM: route lookup and state maintenance operations

48 PCI interface module
CPU card interface unit: move messages from FIFO to host memory space through Direct Memory Access (DMA)
Switch-fabric control unit: transmit programming commands using DMA
Access arbiter: give switch-fabric control unit higher priority
Configuration interface unit: facilitate management of the card
PCI core: provide commonly used functions for PCI access

49 Configuration Module
Enable configuration of MAC address, IP addresses, routing table, and other data tables
Initialize the GbE controller, SRAM, and TCAM
Create clock and control signals needed for each device

50 Performance
Call-handling capacity
  400K calls/second (Hardware signaling accelerator module)
    Software-based implementation: 1K~10K calls/second
  250K calls/second, limited by the 1Gbps link rate
  Load on the TCAM: about 6%
Processing delay
  Per-message processing delay ≤ 2.4 microseconds
  Sycamore SN16000 switch: ≈ 90 ms

51 Performance (more)
Concurrent connections
  64 ports, each consisting of 12 STS-1 circuits
  768 connections (total data rate: 768 × 51.8Mbps ≈ 40Gbps)
  Maintaining state for 768 connections consumes 1/32 of TCAM’s memory space
Better performance can be obtained in future implementation
  Call-handling capacity, processing delay, number of concurrent connections
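
The headline numbers on the two performance slides can be cross-checked with a few lines of arithmetic. Every input below is a figure quoted in the talk; the derived quantities, in particular the signaling bits per call implied by the link-limited rate, are inferences made here and assume the link is the only bottleneck.

```python
# Concurrent connections and aggregate rate of the switch fabric.
ports, circuits_per_port = 64, 12
sts1_mbps = 51.8
connections = ports * circuits_per_port
total_gbps = connections * sts1_mbps / 1000.0   # close to the quoted 40 Gbps

# Per-message processing delay: software (90 ms) versus hardware (2.4 us).
sw_delay_s, hw_delay_s = 90e-3, 2.4e-6
delay_ratio = sw_delay_s / hw_delay_s

# Call-handling capacity gain against the 10K and 1K calls/second software range.
capacity_gain = (400e3 / 10e3, 400e3 / 1e3)

# Implied signaling traffic per call when the 1 Gbps link caps the rate.
bits_per_call = 1e9 / 250e3
```

The link-limited figure works out to 4000 bits, or 500 bytes, of signaling per call under that assumption.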

52 Performance (more)
Define per-call utilization ratio as U=Tfile/(Tfile+Tsetup), where Tsetup=Tprocessing+Tpropagation&emission
Condition: U ≥ x%
Assume: Tpropagation&emission is fixed
Assume: software-based Tprocessing >> hardware-based Tprocessing
Figure: operational regions, file size versus circuit rate, for zero processing delay (ideal), hardware implementation, and software implementation; the average file size is determined by call-handling capacity and is marked for h/w and s/w
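
The utilization condition can be turned into a minimum file size. The sketch below is illustrative: the 10 ms propagation-and-emission time and the 90% target are made-up values, and Tprocessing is taken as a single per-message delay, whereas a real call setup involves several messages.

```python
def utilization(file_bits, rate_bps, t_setup_s):
    """U = Tfile / (Tfile + Tsetup) for one call."""
    t_file = file_bits / rate_bps
    return t_file / (t_file + t_setup_s)


def min_file_bits(target_u, rate_bps, t_setup_s):
    """Smallest file size meeting U >= target_u.

    U >= u  is equivalent to  Tfile >= u / (1 - u) * Tsetup.
    """
    return rate_bps * t_setup_s * target_u / (1.0 - target_u)


rate_bps = 51.8e6          # one STS-1 circuit
t_prop_s = 10e-3           # hypothetical propagation and emission time
target = 0.9               # hypothetical utilization target

min_bits_sw = min_file_bits(target, rate_bps, t_prop_s + 90e-3)    # software
min_bits_hw = min_file_bits(target, rate_bps, t_prop_s + 2.4e-6)   # hardware
```

With hardware processing the setup time is dominated by propagation, so the minimum worthwhile file size approaches the ideal zero-processing-delay case.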

53 Summary
Developed analytical models for polling-based access scheme
  For delay performance, deterministic service, and statistical service
  Can be used for CAC
  Limitations: simple traffic model; no consideration for channel variations; polling can be inefficient if traffic is extremely bursty (e.g., connection requests)
Implemented a subset of RSVP-TE with extensions for GMPLS in hardware
  Performance gain of 2-3 orders of magnitude
  Enables circuit-switched networks to efficiently support a wider range of applications

54 Questions? Thank you!

