Lecture 4: Packet-Cell Switching
Memory bandwidth
Basic OQ switch: Consider an OQ (output-queued) switch with N different physical memories, and all links operating at rate R bits/s. In the worst case, packets may arrive continuously from all inputs, destined to just one output. That output's memory must then absorb N writes while still sustaining one read, so the maximum memory bandwidth requirement for each memory is (N+1)R bits/s.
Shared memory switch: All N inputs write to, and all N outputs read from, a single memory, so the maximum memory bandwidth requirement for the memory is 2NR bits/s.
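The two worst-case figures above can be checked with a few lines of arithmetic. This is only a sketch: the function names and the example values (N = 32 ports, R = 10 Gbit/s) are illustrative assumptions, not taken from the lecture.

```python
def oq_memory_bandwidth(n: int, r: int) -> int:
    """Worst case for one output memory: N simultaneous writes plus 1 read."""
    return (n + 1) * r


def shared_memory_bandwidth(n: int, r: int) -> int:
    """Worst case for a single shared memory: N writes plus N reads."""
    return 2 * n * r


# Illustrative values (assumed): 32 ports at 10 Gbit/s each.
N = 32
R = 10_000_000_000
print(oq_memory_bandwidth(N, R))      # bits/s per output memory
print(shared_memory_bandwidth(N, R))  # bits/s for the shared memory
```

Note that the OQ figure applies to each of the N memories, whereas the shared-memory figure applies to one memory that serves the whole switch.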
Queue Terminology
[Figure: a single queue with arrival process A(t) at rate λ, occupancy Q(t), service discipline S with service rate μ, and departure process D(t)]
- Arrival process, A(t): In continuous time, usually the cumulative number of arrivals in [0,t]. In discrete time, usually an indicator function as to whether or not an arrival occurred at time t = nT.
- λ is the arrival rate: the expected number of arriving packets (or bits) per second.
- Queue occupancy, Q(t): Number of packets (or bits) in queue at time t.
- Service discipline, S: Indicates the sequence of departure: e.g. FIFO/FCFS, LIFO, …
- Service distribution: Indicates the time taken to process each packet: e.g. deterministic, exponentially distributed service time.
- μ is the service rate: the expected number of served packets (or bits) per second.
- Departure process, D(t): In continuous time, usually the cumulative number of departures in [0,t]. In discrete time, usually an indicator function as to whether or not a departure occurred at time t = nT.
More terminology
- Customer: queueing theory usually refers to queued entities as “customers”. In class, customers will usually be packets or bits.
- Work: each customer is assumed to bring some work which affects its service time. For example, packets may have different lengths, and their service time might be a function of their length.
- Waiting time: time that a customer waits in the queue before beginning service.
- Delay: time from when a customer arrives until it has departed.
Arrival Processes
Examples of deterministic arrival processes:
- E.g. 1 arrival every second; or a burst of 4 packets every other second.
- A deterministic sequence may be designed to be adversarial to expose some weakness of the system.
Examples of random arrival processes:
- (Discrete time) Bernoulli i.i.d. arrival process: Let A(t) = 1 if an arrival occurs at time t, where t = nT, n = 0, 1, … A(t) = 1 w.p. p and 0 w.p. 1-p. This is a series of independent coin tosses with a p-coin.
- (Continuous time) Poisson arrival process: Exponentially distributed interarrival times.
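The two random arrival processes above can be generated with Python's standard `random` module. This is a minimal sketch; the function names, the seed, and the parameter values are assumptions chosen for illustration.

```python
import random


def bernoulli_arrivals(p: float, slots: int, rng: random.Random) -> list:
    """Discrete time: A(t) = 1 with probability p, else 0, independently per slot."""
    return [1 if rng.random() < p else 0 for _ in range(slots)]


def poisson_arrival_times(lam: float, horizon: float, rng: random.Random) -> list:
    """Continuous time: arrival instants with exponential interarrival times of rate lam."""
    t = 0.0
    times = []
    while True:
        t += rng.expovariate(lam)  # draw the next interarrival time
        if t > horizon:
            return times
        times.append(t)


rng = random.Random(1)  # fixed seed so runs are repeatable
slots = bernoulli_arrivals(0.3, 10_000, rng)
times = poisson_arrival_times(2.0, 1_000.0, rng)
print(sum(slots) / len(slots))  # empirical arrival probability, close to p
print(len(times) / 1_000.0)     # empirical arrival rate, close to lam
```

The empirical rates only approach p and the arrival rate as the number of slots or the time horizon grows; short runs can deviate noticeably.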
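Adversarial Arrival Process
Example for the “Knockout” switch: memory write bandwidth = k·R < N·R.
[Figure: switch with inputs 1, 2, 3, …, N]
If our design goal was to not drop packets, then a simple discrete-time adversarial arrival process is one in which:
- A_1(t) = A_2(t) = … = A_{k+1}(t) = 1, and
- all packets are destined to output t mod N.
With k+1 packets arriving for the same output in every time slot and only k writes possible, at least one packet must be dropped per slot.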
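The adversarial pattern above is easy to replay in code. The sketch below assumes that each output memory accepts at most k writes per slot and simply counts the excess as drops; the function names and the values N = 8, k = 3 are hypothetical.

```python
def adversarial_slot(t: int, n: int, k: int) -> list:
    """Arrivals in slot t: inputs 1..k+1 each send one packet to output t mod n."""
    return [(i, t % n) for i in range(1, k + 2)]


def dropped_in_slot(arrivals: list, k: int) -> int:
    """Packets lost in one slot when each output memory accepts at most k writes."""
    per_output = {}
    for _, out in arrivals:
        per_output[out] = per_output.get(out, 0) + 1
    return sum(max(0, count - k) for count in per_output.values())


N, K = 8, 3  # assumed example values with K < N
total = sum(dropped_in_slot(adversarial_slot(t, N, K), K) for t in range(10))
print(total)  # one drop per slot over 10 slots
```

Because the target output rotates with t mod N, every output is hit in turn, yet the drop rate stays at one packet per slot.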
Bernoulli arrival process
Memory write bandwidth = N·R.
[Figure: N×N switch with arrival processes A_1(t), A_2(t), A_3(t), …, A_N(t) at inputs 1, 2, 3, …, N; all links run at rate R]
Assume A_i(t) = 1 w.p. p, else 0. Assume each arrival picks an output independently, uniformly and at random. Some simple results follow:
1. Probability that at time t a packet arrives to input i destined to output j is p/N.
2. Probability that two consecutive packets arrive to input i is the same as the probability that packets arrive to inputs i and j simultaneously; both equal p^2.
Questions:
1. What is the probability that two arrivals occur at input i in any three time slots?
2. What is the probability that two arrivals occur for output j in any three time slots?
3. What is the probability that queue i holds k packets?
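The simple results above can be verified exactly by enumerating slot outcomes with rational arithmetic. The sketch below is one way to do that. The function names are hypothetical, and reading "two arrivals" as "exactly two arrivals" when exploring question 1 is an assumption, since the slide leaves that open.

```python
from fractions import Fraction
from itertools import product


def prob_input_to_output(p: Fraction, n: int) -> Fraction:
    """Result 1: an arrival occurs (prob. p) and picks output j uniformly (prob. 1/N)."""
    return p / n


def prob_exactly_k_arrivals(p: Fraction, slots: int, k: int) -> Fraction:
    """Probability of exactly k arrivals at one input over the given number of slots."""
    total = Fraction(0)
    for outcome in product((0, 1), repeat=slots):  # every 0/1 arrival pattern
        if sum(outcome) == k:
            prob = Fraction(1)
            for bit in outcome:
                prob *= p if bit else 1 - p  # slots are independent
            total += prob
    return total


p = Fraction(1, 2)
print(prob_input_to_output(p, 4))         # p/N
print(prob_exactly_k_arrivals(p, 2, 2))   # two consecutive arrivals: p^2
print(prob_exactly_k_arrivals(p, 3, 2))   # question 1 under the "exactly two" reading
```

Enumeration is exponential in the number of slots, so it is only practical for small cases like these; it is meant as a check on hand calculations.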
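Simple deterministic model
[Figure: a queue with arrivals A(t), occupancy Q(t), a service process at rate R, and departures D(t); alongside, a plot of cumulative number of bits versus time showing the curves A(t) and D(t)]
- A(t): cumulative number of bits that arrived up until time t.
- D(t): cumulative number of departed bits up until time t.
Properties of A(t), D(t):
- A(t), D(t) are non-decreasing.
- A(t) >= D(t).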
Simple Deterministic Model
[Figure: plot of cumulative number of bits versus time; the vertical gap between A(t) and D(t) is Q(t), and the horizontal gap is d(t)]
- Queue occupancy: Q(t) = A(t) - D(t).
- Queueing delay, d(t), is the time spent in the queue by a bit that arrived at time t (assuming that the queue is served FCFS/FIFO).
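The relations Q(t) = A(t) - D(t) and d(t) can be traced on a small discrete-time example. The sketch below assumes a work-conserving FIFO server that drains up to `rate` bits per slot and may serve bits in the slot they arrive. Both are modelling assumptions, and the names and sample data are illustrative.

```python
def cumulative_curves(arrivals: list, rate: int) -> tuple:
    """Return cumulative arrivals A and departures D, one entry per slot."""
    A, D = [], []
    a_total = d_total = 0
    for a in arrivals:
        a_total += a
        # Serve up to `rate` bits this slot, but never more than has arrived.
        d_total = min(a_total, d_total + rate)
        A.append(a_total)
        D.append(d_total)
    return A, D


def occupancy(A: list, D: list) -> list:
    """Q(t) = A(t) - D(t): the vertical gap between the two curves."""
    return [a - d for a, d in zip(A, D)]


def fifo_delay(A: list, D: list, t: int):
    """d(t): slots until D catches up with A(t); None if that is beyond the horizon."""
    for tau in range(len(D) - t):
        if D[t + tau] >= A[t]:
            return tau
    return None


A, D = cumulative_curves([4, 0, 0, 3, 0, 0], rate=2)
print(A)                    # [4, 4, 4, 7, 7, 7]
print(D)                    # [2, 4, 4, 6, 7, 7]
print(occupancy(A, D))      # [2, 0, 0, 1, 0, 0]
print(fifo_delay(A, D, 0))  # 1
```

Here `fifo_delay` measures the delay of the last bit that arrived in slot t, which corresponds to the horizontal distance between the two curves at height A(t).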
Packet Switching
- Packet: varying-length sequence of bytes.
- Cell: fixed-length sequence of bytes.
- Frame is a synonym for packet in Ethernet.
A switch must derive at least the following 5 pieces of information from each incoming frame and from tables stored within the switch:
1. The length of the incoming frame.
2. Whether the incoming frame has been corrupted.
3. Which egress port the frame should be directed to.
4. What new label, if any, should be put on the frame when it exits the egress port.
5. With what priority the frame should compete with other frames for the required egress port.
Frame Length
The frame length can be determined in several ways:
- Explicitly, from a length field in the frame header.
- By embedding the frame in a lower-layer protocol that controls and specifies frame starting positions.
- In some protocols or combinations of protocols, both an explicit length field and lower-layer frame delimiting exist, which allows self-checking.
Self-Checking
Most frame protocols carry redundancy fields that allow the correctness of the received frame to be probabilistically verified. In some protocols, only the header information is covered because it is assumed that the payload is protected by some end-to-end, higher-layer protocol. In other protocols, the entire frame is self-checked (e.g., TCP).
It is also common for the frame header to carry information that can be checked for internal consistency: for instance, only a subset of the bit combinations of certain command or type fields may be used, in which case a check can be made for illegal codes.
Failure of these self-checking mechanisms as the frame is received causes the frame to be dropped, counted, and reported as a statistic to the management system, and possibly negatively acknowledged.
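As an illustration of a redundancy field covering an entire frame, the sketch below appends a CRC-32 trailer and verifies it on receipt. The frame layout (body followed by a 4-byte big-endian CRC) is a hypothetical one chosen for clarity, not the wire format of any particular protocol; `zlib.crc32` is the standard-library CRC-32 routine.

```python
import struct
import zlib


def append_check(body: bytes) -> bytes:
    """Build a frame: the body followed by a 4-byte CRC-32 trailer (assumed layout)."""
    return body + struct.pack(">I", zlib.crc32(body))


def frame_is_intact(frame: bytes) -> bool:
    """Recompute the CRC over the body and compare it with the received trailer."""
    if len(frame) < 4:
        return False  # too short to even carry a trailer
    body, trailer = frame[:-4], frame[-4:]
    return zlib.crc32(body) == struct.unpack(">I", trailer)[0]


frame = append_check(b"example payload")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit in transit
print(frame_is_intact(frame))      # True
print(frame_is_intact(corrupted))  # False
```

A switch using a check like this would drop and count the corrupted frame rather than forward it. The verification is probabilistic in the sense that some multi-bit error patterns can still pass.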
Next Hop Determination
In any N-ported frame switch, each arriving frame must specify in some manner at which of the N egress ports it should be directed. There are a large number of ways in which this is done:
1. Ethernet switches use a 48-bit unique Ethernet address as an argument to a table-lookup function that returns the desired egress port. The number of possible addresses is huge, so Ethernet switches use sparse table-lookup techniques such as content-addressable memory (CAM), hash tables, or binary trees.
2. IPv4 routers use a 32-bit address as their table-lookup argument. Commonly, routers will have the final routes for a relatively small number of addresses, and default routes for the rest of the huge address space. Again, sparse table-lookup techniques are useful.
3. ATM cells carry much smaller routing fields. Depending on where in the ATM network the cells are found, their effective routing fields can be 12 bits or 24 bits. Direct memory-lookup techniques are appropriate for ATM.
4. When MPLS frames are created by label edge routers (LERs), arbitrary aspects of the incoming frame are inspected to determine a forwarding equivalence class and the initial MPLS label to be appended in the new MPLS header. This process can be complex, so it is restricted to occurring only once per packet lifetime in an MPLS network.
5. Once MPLS frames have been classified and tagged by an LER, they carry a 20-bit label which is used to determine the next hop (and the next label) in each label switching router (LSR). These fields are intended to be small enough to allow direct memory lookup to determine the egress port.
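The lookup styles above differ mainly in how sparse the key space is. The sketch below contrasts three of them: a hash table for exact 48-bit matches, a longest-prefix match with a default route for IPv4, and a directly indexed table for a small routing field. All table contents, port numbers, and function names are invented for illustration; the linear scan over routes stands in for the CAM or tree structures a real router would use.

```python
import ipaddress

# Sparse exact-match table: only learned addresses are stored (hash table).
MAC_TABLE = {"00:1b:44:11:3a:b7": 2, "00:1b:44:11:3a:c9": 5}

# A few specific routes plus a default route covering the rest of the space.
ROUTES = [
    (ipaddress.ip_network("10.1.0.0/16"), 1),
    (ipaddress.ip_network("10.1.2.0/24"), 3),
    (ipaddress.ip_network("0.0.0.0/0"), 0),  # default route
]

# Small routing field: a 12-bit value can index a table directly.
DIRECT_TABLE = [None] * (1 << 12)
DIRECT_TABLE[0x02A] = 5


def ethernet_lookup(mac: str):
    """Exact match on the address; None means the address is not in the table."""
    return MAC_TABLE.get(mac.lower())


def ipv4_lookup(addr: str) -> int:
    """Longest-prefix match: among all matching routes, the most specific wins."""
    ip = ipaddress.ip_address(addr)
    matches = [(net.prefixlen, port) for net, port in ROUTES if ip in net]
    return max(matches)[1]


def direct_lookup(field: int):
    """Direct memory lookup: the routing field is the table index."""
    return DIRECT_TABLE[field]


print(ethernet_lookup("00:1B:44:11:3A:B7"))
print(ipv4_lookup("10.1.2.9"), ipv4_lookup("10.1.9.9"), ipv4_lookup("8.8.8.8"))
print(direct_lookup(0x02A))
```

The direct table spends memory on every possible key, which is affordable at 12 bits and, as the slide notes, is the intended approach for the 20-bit MPLS label as well. A direct table is not feasible for 48-bit Ethernet addresses.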
Next Label
- Some protocols use only end-point labels. Such labels are held constant during the lifetime of the frame as it is forwarded from switch to switch. IP, with its destination address label, is the most important example of this sort of protocol. With such protocols, the switch does not find a new label for each frame emitted.
- Some protocols determine a new label for each switch-to-switch hop that a frame takes. MPLS is an example of this approach. New labels are found by table lookup on the old, incoming label. The reason to swap labels is to avoid having to reserve labels all the way across a potentially large network. Since labels are unique only to individual switch pairs, each pair of connected switches can make their local agreement without concern for use of the label elsewhere in the network. This allows label reuse and the considerably smaller labels required by MPLS.
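Per-hop label swapping can be sketched as one small table per switch, mapping the incoming label to an egress port and an outgoing label. The switch names, labels, and ports below are hypothetical. Label 17 is deliberately reused on two different links to show that labels only need to be unique between a pair of connected switches.

```python
# One swap table per switch: incoming label -> (egress port, outgoing label).
SWAP_TABLES = {
    "S1": {17: (2, 42)},
    "S2": {42: (1, 17)},  # label 17 reused here, on a different link than S1's input
    "S3": {17: (3, 99)},
}


def forward(path: list, label: int) -> list:
    """Follow a frame hop by hop, swapping its label at every switch."""
    hops = []
    for switch in path:
        port, label = SWAP_TABLES[switch][label]  # lookup on the old, incoming label
        hops.append((switch, port, label))
    return hops


for hop in forward(["S1", "S2", "S3"], 17):
    print(hop)
```

With end-point labels such as IP destination addresses, the equivalent table would return only a port, and the label would leave each switch unchanged.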
Class of Service
Many of the protocols that are of interest in transport networks have some mechanism for specifying or determining a class of service for each frame. The details of these mechanisms vary widely and are sometimes quite complex. The key objective is to distinguish classes of traffic that require different levels of service. These different levels of service translate, through the queuing and arbitration mechanisms, into different priorities in being routed through the frame switch.
Flows and Queuing Systems
A flow is an ordered stream of frames from some ingress I, at some priority class C, to some egress E. We will use the notation F_ICE to denote a particular flow from ingress port I with class of service C and egress port E. A * in any position stands for all values of that index.
Example: F_1*3 denotes all the flows from ingress 1 to egress 3, in any class.
Each queue is used to hold a flow or a particular set of flows. We will use the notation Q_ICE similarly.
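The wildcard notation for flows maps naturally onto a three-field record with an "any" value. In the sketch below, `None` plays the role of `*`; the `Flow` type and the `matches` helper are hypothetical names introduced only for this illustration.

```python
from typing import NamedTuple, Optional, Tuple


class Flow(NamedTuple):
    ingress: int
    cls: int      # class of service
    egress: int


# A pattern has the same three positions; None stands for the wildcard "*".
Pattern = Tuple[Optional[int], Optional[int], Optional[int]]


def matches(pattern: Pattern, flow: Flow) -> bool:
    """True if every non-wildcard position of the pattern equals the flow's value."""
    return all(want is None or want == have for want, have in zip(pattern, flow))


F_1_any_3 = (1, None, 3)  # all flows from ingress 1 to egress 3
print(matches(F_1_any_3, Flow(1, 5, 3)))  # True
print(matches(F_1_any_3, Flow(2, 5, 3)))  # False
```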
Example
Support for priorities
Higher-priority frames are emitted first when both higher- and lower-priority frames are present within a switch. Within an ingress-to-egress flow, this means that it must be possible for higher-priority frames to pass lower-priority frames. Thus, when a lower-priority frame has been in a switch for some time and a higher-priority frame arrives before the lower-priority frame is emitted, the newly arrived higher-priority frame should be emitted first.
Recall that our queues are strictly FIFO. This property prevents higher-priority frames from passing lower-priority frames within any single queue. Similarly, all transfers on the connecting wires of our designs are inherently FIFO. Thus, it is not possible for higher-priority traffic to pass lower-priority traffic when all traffic from some ingress port A to some egress port B passes through one common queue anywhere in its ingress-to-egress path, regardless of its priority.
Multiple queues
Frames of different priorities are directed to different queues. The lower-priority queues can move more slowly than the higher-priority queues. At some point in the overall data paths, the differentiated priority flows must merge at a multiplexer that enforces the priority policy.
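One queue per priority class, merged by a multiplexer, can be sketched as follows. The example assumes a strict-priority policy (always serve the highest non-empty class) and that class 0 is the highest priority. Both are assumptions, since the slide does not fix a particular policy or numbering.

```python
from collections import deque


class PriorityMux:
    """Per-class FIFO queues merged by a strict-priority multiplexer."""

    def __init__(self, classes: int):
        # Index 0 is treated as the highest priority (assumed convention).
        self.queues = [deque() for _ in range(classes)]

    def enqueue(self, cls: int, frame: str) -> None:
        self.queues[cls].append(frame)  # FIFO order within a class

    def emit(self):
        """Return the next frame to send, or None if every queue is empty."""
        for queue in self.queues:  # scan from highest to lowest priority
            if queue:
                return queue.popleft()
        return None


mux = PriorityMux(classes=2)
mux.enqueue(1, "low-1")
mux.enqueue(1, "low-2")
mux.enqueue(0, "high-1")  # arrives last but is emitted first
print([mux.emit() for _ in range(4)])  # ['high-1', 'low-1', 'low-2', None]
```

Each individual queue is still strictly FIFO; the higher-priority frame overtakes only because it waits in a different queue, which is exactly what a single shared queue cannot offer.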
Queuing System
Q_I** means that there is one queue per ingress port, containing traffic for all classes and all egress ports.
Q_*CE means that there is one queue for each combination of class and egress port.
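The queue notation can be read as a rule for choosing a queue: positions marked with a letter distinguish queues, and positions marked `*` do not. The sketch below encodes a queuing system as a three-character string such as "I**" or "*CE". That encoding and the function name are assumptions made for illustration.

```python
def queue_for(system: str, ingress: int, cls: int, egress: int) -> tuple:
    """Queue identifier for a frame: keep the fields the system distinguishes,
    and collapse the wildcarded ones to '*'."""
    fields = (ingress, cls, egress)
    return tuple(value if mark != "*" else "*" for mark, value in zip(system, fields))


# Two frames from the same ingress but with different classes and egresses.
print(queue_for("I**", 1, 5, 3), queue_for("I**", 1, 0, 7))  # same queue
print(queue_for("*CE", 1, 5, 3), queue_for("*CE", 1, 0, 7))  # different queues
```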
Queuing Systems Lattice
- Si/Se denote a need for ingress/egress speedup.
- P denotes the inability to support priorities.
- H denotes head-of-line blocking.
Queuing System ***
- Speedup: factor of N.
- Priority blocking: since there is only one queue, priorities cannot be recognized.
- Head-of-line (HOL) blocking: if a particular egress is busy, and the HOL frame is destined for that busy egress, all following frames for other idle ports must wait.
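Head-of-line blocking is easy to see by comparing what a single FIFO can send in one slot with what separate per-egress queues could send. In the sketch below, a queue is represented only by the egress port of each waiting frame; the function names and the sample data are illustrative assumptions.

```python
from collections import deque


def sendable_single_fifo(queue: deque, busy: set):
    """Egress of the one frame a single FIFO can emit, or None if its head is blocked."""
    if queue and queue[0] not in busy:
        return queue[0]
    return None


def sendable_per_egress(queue: deque, busy: set) -> list:
    """Egresses that could be served if the same frames sat in per-egress queues."""
    return sorted({egress for egress in queue if egress not in busy})


waiting = deque([2, 0, 1, 2])  # egress port of each waiting frame, head first
busy = {2}                     # egress 2 cannot accept a frame this slot
print(sendable_single_fifo(waiting, busy))  # None: the head frame blocks the rest
print(sendable_per_egress(waiting, busy))   # [0, 1]: idle egresses could be served
```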
Queuing System I**
Queuing System ICE
Exercise 3
1. Draw each of the queuing systems.
2. Explain S, P and H.
Ingress ICE queue
Egress ICE queue
Central ICE queue
Sharing Queues
If we have N^2·C separate queues, each deep enough to hold W frames, and a maximal frame has M bytes, then we need N^2·C·W·M × 8 bits of queue space. For a modest switch with N = 32, C = 8, W = 8, and M = 2048, we find that we require 1 gigabit (2^30 bits) of storage!
Solution: sharing of queues!
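The storage figure above follows directly from the product of the five factors. The sketch below reproduces the arithmetic; treating W as the number of maximal frames each queue must hold is an assumption inferred from the formula, and the function name is hypothetical.

```python
def queue_storage_bits(n: int, c: int, w: int, m: int) -> int:
    """Total queue space in bits: N*N*C queues, W frames each, M bytes per frame."""
    queues = n * n * c           # one queue per (ingress, class, egress) combination
    return queues * w * m * 8    # 8 bits per byte


bits = queue_storage_bits(n=32, c=8, w=8, m=2048)
print(bits)           # 1073741824
print(bits == 2**30)  # True: exactly 2^30 bits, i.e. the "1 gigabit" on the slide
```

Because the queue count grows with N squared, doubling the port count alone would quadruple this figure, which is what motivates sharing queues.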
Possibilities for sharing queues
1. Share among ingresses at each unique class-X-egress.
2. Share among classes at each unique ingress-X-egress.
3. Share among egresses at each unique ingress-X-class.
4. Share among ingresses and classes at each unique egress.
5. Share among ingress and egress at each unique class.
6. Share among class and egress at each unique ingress.
7. Share among ingress, class, and egress over the entire queue structure.
For an ingress buffer location, only 2, 3, and 6 are possible.
For an egress buffer location, only 1, 2, and 4 are possible.
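The seven possibilities listed above are exactly the non-empty subsets of {ingress, class, egress}. The sketch below enumerates them in the same order and, as an added illustration, counts how many distinct queue pools remain once the chosen dimensions are shared. The sizes N = 32 and C = 8 and all names are assumptions for the example.

```python
from itertools import combinations
from math import prod

DIMENSIONS = ("ingress", "class", "egress")


def sharing_options() -> list:
    """All non-empty subsets of the three dimensions, smallest subsets first."""
    return [set(combo)
            for size in range(1, len(DIMENSIONS) + 1)
            for combo in combinations(DIMENSIONS, size)]


def pools_remaining(shared: set, n: int, c: int) -> int:
    """Number of separate pools left: the product of the sizes of unshared dimensions."""
    sizes = {"ingress": n, "class": c, "egress": n}
    return prod(sizes[d] for d in DIMENSIONS if d not in shared)


for number, shared in enumerate(sharing_options(), start=1):
    print(number, sorted(shared), pools_remaining(shared, n=32, c=8))
```

Option 7 leaves a single pool, which corresponds to sharing over the entire queue structure.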
No central buffer design
Ideally, we would like to have one central buffer pool for all. The problem is that central buffering demands that all the frame buffers and all the switching data paths and logic be co-located. For nontrivial switch sizes, this is not feasible. We are forced to move something off the central switching chip. The most rewarding and feasible candidates to move off the central switch chip(s) are the buffers. That leads us to ingress-buffered or egress-buffered designs.