Published by Clementine May. Modified over 9 years ago.
1
Queue Manager and Scheduler on Intel IXP
John DeHart, Amy Freestone, Fred Kuhns, Sailesh Kumar
2
2 - Sailesh Kumar - 2/5/2016
Overview
- Both the QM and the Scheduler run on a single microengine (ME)
- The packet discard policy also runs here
  - There is a separate interface for discarded packets
- Deficit round robin scheduling policy
- Uses the Q-array hardware and exploits an LRU-based eviction policy
- Aggregated scheduling based architecture
  - Runs on a single thread
  - Designed to operate in batch mode to hide memory latency
  - Issues a batch of up to 8 memory requests at a time
  - Data structures are also designed to support batch mode
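To make the scheduling policy named on this slide concrete, here is a minimal sketch of textbook deficit round robin. It is not the batched, segment-based implementation the slides describe; the function name `drr_schedule`, the use of the queue weight as the per-round quantum, and resetting credit when a queue empties are all assumptions made for illustration.

```python
from collections import deque


def drr_schedule(queues, quantum):
    """Return the send order of packets under deficit round robin.

    queues:  dict mapping queue id -> list of packet lengths (head first)
    quantum: dict mapping queue id -> credit added per round (assumed > 0)
    """
    pending = {q: deque(pkts) for q, pkts in queues.items()}
    credit = {q: 0 for q in queues}
    active = deque(q for q in queues if pending[q])  # the active list
    sent = []
    while active:
        q = active.popleft()
        credit[q] += quantum[q]
        # Send packets while the head packet fits in the remaining credit.
        while pending[q] and pending[q][0] <= credit[q]:
            length = pending[q].popleft()
            credit[q] -= length
            sent.append((q, length))
        if pending[q]:
            active.append(q)  # credit exhausted: back to the tail of the list
        else:
            credit[q] = 0     # queue went inactive: leftover credit is dropped
    return sent
```

With `queues = {"x": [100, 100, 100], "y": [300]}` and `quantum = {"x": 200, "y": 100}`, queue x drains over two rounds while y has to accumulate credit over three rounds before its single 300-byte packet fits.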
3
Overall Queuing Subsystem
- A set of parallel QM+SCH modules
- Each module handles a different set of meta-links
4
Queue Data Structure [figure]
5
Queue Caching Structure
[figure: Q-array (SRAM) and local memory (Local)]
The Q-array and local memory data shown in the figure are parallel data structures with the same index.
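The idea of parallel structures sharing one index can be sketched as follows. This is an illustrative model only: the class name `QueueCache`, the entry count of 16, and the field names are assumptions, and the hardware CAM lookup is imitated with a plain list search.

```python
NUM_ENTRIES = 16  # assumed cache size, for illustration only


class QueueCache:
    """Three parallel arrays addressed by the same entry index."""

    def __init__(self, num_entries=NUM_ENTRIES):
        self.cam = [None] * num_entries        # queue id held by each entry
        self.q_array = [None] * num_entries    # cached queue descriptor
        self.local_mem = [None] * num_entries  # queue length / max length

    def lookup(self, queue_id):
        """Return the entry index caching queue_id, or None on a miss."""
        try:
            return self.cam.index(queue_id)
        except ValueError:
            return None

    def load(self, entry, queue_id, descriptor, length, max_length):
        """Fill one entry; the same index is used in all three arrays."""
        self.cam[entry] = queue_id
        self.q_array[entry] = descriptor
        self.local_mem[entry] = {"length": length, "max_length": max_length}
```

A hit in `lookup` yields one index that addresses both the descriptor and the length record, so no second search is needed.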
6
Scheduling Data Structure [figure]
7
Enqueue and Dequeue Threads
- Enqueue and dequeue run on two threads.
- Enqueue and dequeue are synchronized using signals.
- All data structures (CAM, Q-array, local memory space) are shared between the enqueue and dequeue threads.
- The enqueue and dequeue processes consist of multiple phases.
  - At the end of each phase, a batch of 8 commands is dispatched to the SRAM/Q-array.
- The expectation is that generating the batch of 8 commands takes as much time as the SRAM read latency.
  - Otherwise the MEs will have idle cycles.
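The latency-hiding expectation on this slide amounts to a simple budget. The sketch below assumes a fixed cost per generated command; the function name `idle_cycles` and any cycle counts passed to it are hypothetical, since the slides give no figures.

```python
def idle_cycles(sram_latency, cycles_per_command, batch=8):
    """Cycles the ME waits if a batch is generated faster than SRAM replies.

    sram_latency and cycles_per_command are in ME cycles (assumed inputs).
    """
    return max(0, sram_latency - batch * cycles_per_command)
```

With made-up numbers, `idle_cycles(100, 10)` gives 20 idle cycles, while `idle_cycles(100, 15)` gives 0 because generating the batch already covers the read latency.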
8
Enqueue Process
Enqueue process phase 1
1. Grab up to 8 requests from the enqueue command FIFO.
2. Filter out the queues whose tail is already present in the Q-array.
3. For queues whose head is already cached, send a rd_qdesc_other command at the same Q-array entry.
4. For other queues, the LRU entry must be evicted from the Q-array. However, make sure to evict a Q-array entry that is not currently being dequeued.
5. While doing the eviction, make sure that the queue length is written back from local memory to SRAM. Also update the CAM bits.
6. Send a rd_qdesc_tail command at this entry.
7. Read the queue length/max length from SRAM into local memory at index = Q-array entry.
8. Update the CAM (set the tail-valid bits for the Q-array entry).
9. Switch to the dequeue process until the Q-array and the queue length in local memory are loaded.
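Steps 1 to 8 can be sketched as a function that turns a batch of requests into a list of commands. This is a simplified model, not the microengine code: the cache is an `OrderedDict` kept in least-recently-used order, `rd_qdesc_other` and `rd_qdesc_tail` are the command names from the slide, and `write_back_length`, `read_length`, and the flag names are hypothetical labels for steps 5 and 7.

```python
from collections import OrderedDict

BATCH = 8  # up to 8 requests per phase, as on the slide


def enqueue_phase1(requests, cache, capacity):
    """Return the (command, queue_id) pairs to dispatch for one batch.

    cache: OrderedDict queue_id -> flags, least recently used first.
    """
    commands = []
    for qid in requests[:BATCH]:
        entry = cache.get(qid)
        if entry is not None and entry["tail_valid"]:
            cache.move_to_end(qid)
            continue                                   # step 2: tail cached
        if entry is not None and entry["head_valid"]:
            commands.append(("rd_qdesc_other", qid))   # step 3: reuse entry
        else:
            if len(cache) >= capacity:                 # steps 4-5: evict LRU
                victim = next(
                    (q for q, e in cache.items() if not e["dequeuing"]), None)
                if victim is None:
                    raise RuntimeError("no evictable Q-array entry")
                del cache[victim]
                commands.append(("write_back_length", victim))
            cache[qid] = {"head_valid": False, "tail_valid": False,
                          "dequeuing": False}
            commands.append(("rd_qdesc_tail", qid))    # step 6
            commands.append(("read_length", qid))      # step 7
        cache[qid]["tail_valid"] = True                # step 8
        cache.move_to_end(qid)
    return commands
```

The eviction scan skips any entry flagged as being dequeued, so the least recently used entry is only chosen when the dequeue process is not working on it.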
9
Enqueue Process
Enqueue process phase 2
10. Check whether the packet is discarded.
11. If admitted, send an enqueue command to the Q-array for every queue for which an enqueue command was received.
12. Subsequent actions depend on four situations:
   a. Queue is inactive (count = 0)
   b. Queue is active and presently being dequeued by the dequeue process (count > 0 and dequeue flag set in CAM)
   c. Queue is active but is not being dequeued
   d. Queue is active and the packet is discarded
10
Case I – Queue is inactive
[figure: SRAM reads and SRAM writes]
Note that the next pointer of the tail segment will be set to the free segment just allocated. The newly allocated segment tail is always kept in local memory. Here we assume that no free segments are available in local memory.
11
Case II – Active but not dequeued
Do nothing.
12
Case III – Active and being dequeued
Update the queue length stored in local memory (indexed by the Q-array index) by adding the current packet's length to it.
13
Enqueue Process
Enqueue process phase 2 continued
13. Update the CAM bits accordingly.
14. Switch to the dequeue process.
15. Start over again.
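One possible reading of phase 2 and its cases, written as a per-packet decision. This sketch follows the case slides literally (activate an inactive queue, do nothing for an active queue that is not being dequeued, update the length for a queue that is being dequeued). The dictionary fields, the return labels, and the use of a plain list for the active list are assumptions, and incrementing `count` stands in for the Q-array enqueue command.

```python
def enqueue_phase2(queue, pkt_len, admitted, active_list):
    """Apply one admitted or discarded packet to a queue record.

    queue: dict with 'id', 'count', 'being_dequeued', 'length' (assumed layout)
    """
    if not admitted:
        return "discarded"             # no enqueue command is sent
    was_inactive = queue["count"] == 0
    queue["count"] += 1                # stands in for the Q-array enqueue
    if was_inactive:
        active_list.append(queue["id"])  # Case I: queue joins the active list
        return "activated"
    if queue["being_dequeued"]:
        queue["length"] += pkt_len     # Case III: update length in local memory
        return "length_updated"
    return "no_action"                 # Case II: nothing further to do
```

Calling it repeatedly on the same record shows the transitions: the first admitted packet activates the queue, and later packets only touch the length while the dequeue flag is set.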
14
Dequeue Process
Dequeue process phase 1
1. Begin from the head segment of the active list. If the head segment contains any queues, proceed as follows.
2. Skip queues whose head descriptor is already cached in the Q-array.
3a. For queues whose tail is already cached, send a rd_qdesc_other command at the same Q-array entry.
3b. For other queues, the LRU entry must be evicted from the Q-array. However, make sure to evict a Q-array entry that is not currently being enqueued.
3c. While doing the eviction, make sure that the queue length is also written back from local memory to SRAM. Also update the CAM bits.
4a. Send a rd_qdesc_head command at this entry. This will supply the queue entry in the active list with its credit and weight.
4b. Read the queue length/max length from SRAM into local memory at index = Q-array entry.
5. Update the CAM (set the head-valid bits for the Q-array entry).
6. Switch to the enqueue process until the Q-array is loaded.
15
Dequeue Process
Dequeue process phase 2
8. Send up to 8 dequeue requests for the queues in the head segment.
9. Switch to the enqueue process.
Dequeue process phase 3
10. After the dequeues are complete, send SRAM reads to read the packet lengths.
11. Switch to the enqueue process.
16
Dequeue Process
Dequeue process phase 4
12. Once the lengths of the dequeued packets are known, update the credits.
13. For queues that become inactive, send a wr_qdesc_count command and evict them from the Q-array if they are not being enqueued.
14. Queues whose credit is exhausted are moved from the head segment to the tail segment. If the tail segment has no space left, allocate a new segment from the free list as described in the enqueue process.
15. Update the CAM bits.
16. If the head segment becomes empty:
   a. Add it to the tail of the free segment list.
   b. Read the next head of the active list into local memory.
   c. Switch to the enqueue process.
   d. Go to step 1.
17. Else:
   a. Switch to the enqueue process.
   b. Go to step 8.
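Steps 12 to 17 can be sketched on a segment-based active list. This is a simplified model under stated assumptions: segments are Python lists, credit is allowed to go to zero or below and is replenished by the weight when a queue moves to the tail, a new tail segment is allocated when the head is the only segment, and the names `dequeue_phase4`, `SEGMENT_SIZE`, and the state fields are hypothetical. The Q-array and CAM updates (steps 13 and 15) are left out.

```python
from collections import deque

SEGMENT_SIZE = 8  # assumed number of queue entries per segment


def dequeue_phase4(active, free, state, sent_lengths):
    """Update credits for the head segment after one round of dequeues.

    active: deque of segments (lists of queue ids), head segment first
    free: list of spare segments
    state: queue id -> {"credit", "weight", "count"}
    sent_lengths: queue id -> length of the packet just dequeued
    """
    head = active[0]
    for qid in list(head):
        if qid not in sent_lengths:
            continue
        q = state[qid]
        q["credit"] -= sent_lengths[qid]      # step 12
        q["count"] -= 1
        if q["count"] == 0:
            head.remove(qid)                  # step 13: queue is now inactive
        elif q["credit"] <= 0:
            head.remove(qid)                  # step 14: move to tail segment
            q["credit"] += q["weight"]
            tail = active[-1]
            if tail is head or len(tail) >= SEGMENT_SIZE:
                tail = free.pop() if free else []
                active.append(tail)
            tail.append(qid)
    if not head:                              # step 16: head segment is empty
        active.popleft()
        free.append(head)
        return "go to step 1"
    return "go to step 8"                     # step 17
```

For a single active queue whose credit runs out, the queue lands in a freshly allocated tail segment, the old head segment returns to the free list, and the new tail becomes the head, which matches the "Life of a single active queue" slides.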
17
Enqueue Dequeue synchronization
Enqueue
- Phase I: Send up to 8 SRAM reads to load queue descriptors (tail) into the Q-array. Write back the LRU entry from the Q-array and write back the associated queue lengths. Special treatment for queues whose head is already cached. Update CAM entries.
- Phase II: Send the enqueue command to the Q-array. Update the queue length in local memory indexed by the Q-array index. Update the scheduling data structure (may involve one SRAM read and one SRAM write).
Dequeue
- Phase I: Send up to 8 SRAM reads to read the queue descriptors of the queues at the head of the active list. Write back LRU entries and queue lengths to SRAM. Update CAM entries.
- Phase II: Send up to 8 dequeue requests for the cached queues.
- Phase III: Read the lengths of the dequeued packets.
- Phase IV: Update the queue credits and the scheduling data structure. May involve 1 SRAM read and 1 SRAM write access.
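The hand-off between the two processes can be modeled with cooperative generators, where each `yield` stands for "dispatch the batch and switch to the other process". Strict alternation after every phase is an assumption of this sketch; on the hardware the switch is driven by signals, which the model does not reproduce.

```python
def enqueue():
    """Enqueue process: two phases, then start over."""
    while True:
        yield "enqueue phase 1"
        yield "enqueue phase 2"


def dequeue():
    """Dequeue process: four phases, then start over."""
    while True:
        for n in (1, 2, 3, 4):
            yield f"dequeue phase {n}"


def run(steps):
    """Alternate between the two processes, one phase per turn."""
    threads = [enqueue(), dequeue()]
    return [next(threads[i % 2]) for i in range(steps)]
```

`run(6)` yields enqueue phase 1, dequeue phase 1, enqueue phase 2, dequeue phase 2, enqueue phase 1, dequeue phase 3, so the two-phase enqueue loop wraps around twice for every pass through the four dequeue phases.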
18
Life of a single active queue
[figure: free list and active list in SRAM, head and tail pointers in local memory; head segment entry "8, w, c, x"; enqueue process]
Let's say 8 enqueue requests arrive for an inactive queue x (weight w).
19
Life of a single active queue
[figure: free list and active list in SRAM, head and tail pointers in local memory; head segment entry "3, w, c, x"; dequeue process]
After the enqueue, the dequeue process sends 5 packets and the credit is exhausted.
Credit is exhausted: allocate a free segment.
20
Life of a single active queue
[figure: free list and active list with head and tail segments; entry "3, w, c, x"]
- Put the queue in the allocated tail segment.
- Make the head of the active list the next head.
- Move the now-empty head segment to the free segment pool.
21
Life of multiple active queues
[figure: free list and active list; segments shown with entries "x, y, z, w …", "a, w, c, x …" and "a, b, c, d …"]
Let's say all queues' credits are exhausted.