
Slide 1: CS/CoE 536 Reconfigurable System On Chip Design
  Lecture 11 : Priority and Per-Flow Queuing in Machine Problem 3 (Revision 2)
  Washington University, Fall 2002
  http://www.arl.wustl.edu/~lockwood/class/cs536/
  John Lockwood (Lockwood@arl.wustl.edu)
  Copyright 2002

Slide 2: Detailed Machine Problem 3 Structure
  [Block diagram: Firewall on a Chip ( Implemented on the RAD on the FPX, a VirtexE 2000 FPGA )]
  Input: Traffic Data From Linecard
  Output: Traffic Data To Linecard or switch
  Blocks shown in the diagram:
    – Layered Protocol Wrappers
    – Content-based Match (regex) (MP2), which produces the Match vector
    – Expanded CAM-based Firewall (MP1 w/extra entries & FlowID), which produces the Flow# from CAM
    – Flow Buffer
    – Queue Manager (MP 3): identifies packets based on Head Pointers and Tail Pointers
    – Scheduler (RR, DRR, 3DQ)
    – Free List Manager (SDRAM Free pointers)
    – SDRAM Controller, connected to Off-Chip Synchronous Dynamic Random Access Memory (SDRAM)
    – SRAM Interface and SRAM Controller, connected to Off-Chip Static Random Access Memory (SRAM)

Slide 3: MP3 Assignment : Part 1 (Flow Classifier)
  – Extend Number of CAM entries
      Expanded to 4 entries
  – Add Additional FlowID Field in CAM
      FlowID returns 16-bit value
  – Augment Control Packet to allow updating of multiple CAM entries
      Offset and Number of CAMs can be specified
  – Implement Priority Encoder
      Output the FlowID from the lowest matching CAM entry.
      If no CAM matches, set FlowID = 01
  – Set TTL=0 for FlowID = 0
  – Design FSM for Queue Manager of Part 2

Slide 4: MP3 Assignment : Part 2 (Queue Manager)
  – Instantiate FIFOs
      Size = 512 deep by 16 bits wide
      Could be deeper to allow for more backlogged flows
  – Initialize SRAM for all flows
      Write Count = 0
      Read Count = 0
      Head[i].tail = FlowID
      Tail[i].head = FlowID
  – Implement EnQueue FSM
  – Implement DeQueue FSM
  – Implement Scheduler
      Read the FlowID from the highest priority service list
      Can be combinational logic that controls FIFOs

Slide 5: Packet Classifier w/extra entries and FlowID
  [Figure: four CAM entries compared in parallel against the packet fields]
  Match fields (112 bits): Src IP, Dest IP, Src Port, Dest Port, Proto, Content
  CAM entries: CAM_MASK_1 / CAM_VALUE_1 / FlowID_1 through CAM_MASK_4 / CAM_VALUE_4 / FlowID_4
  Each FlowID is 16 bits
  Labels on the figure:
    – Output = All 0's : match, 0 : Otherwise
    – Output = 0 : match, 1 : mismatch
    – Output = 0 : match and Highest Priority, 1 : mismatch or Lower priority
    – Output = FlowID : if Highest Priority, All 0's otherwise
    – Output = FlowID
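The classifier on this slide can be modeled in software as a masked compare followed by a priority select. The following Python sketch is a behavioral illustration only, not the MP3 VHDL. It assumes that a mask bit of 1 means "compare this bit" and 0 means "don't care" (the slide does not state the polarity), and the names cam_matches, classify, and default_flow_id are hypothetical.

```python
def cam_matches(key, value, mask):
    """Return True when key equals value on every bit selected by mask.

    Assumption: mask bit 1 = compare, mask bit 0 = don't care.
    """
    return ((key ^ value) & mask) == 0


def classify(key, entries, default_flow_id=0):
    """Return the FlowID of the lowest-numbered matching CAM entry.

    entries is a list of (mask, value, flow_id) tuples; index 0 is CAM 1.
    default_flow_id is returned when no entry matches.
    """
    for mask, value, flow_id in entries:
        if cam_matches(key, value, mask):
            return flow_id & 0xFFFF  # FlowID is a 16-bit field
    return default_flow_id
```

In hardware the same result comes from gating each entry's FlowID to all 0's unless it is the highest-priority match and combining the gated values; the loop above is simply the sequential equivalent. The default value is left as a parameter because the assignment text, not this sketch, defines the FlowID for the no-match case.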

Slide 6: Updated Control Packet
  Control Packet now can program up to N CAM entries
    – Not just the first two
  New Header Fields allocated for
    – Transmit enable
    – Base CAM Number
    – Number of CAMs
  New CAM Value Field Allocated for
    – FlowID
  [Figure: control packet layout]
    – ATM Header
    – Ver, HL, ToS, Packet Len
    – IP ID, Fragment
    – TTL, Proto, Checksum
    – Source IP address
    – 192.168.30.13 ( 0xC0A81E0D )
    – Src Port, Dest Port ( 0x0320 )
    – Length, Checksum
    – X, Base_CAM, #CAMs
    – CAM_VALUE_1: CAM_1_SRC_IP, CAM_1_DEST_IP, CAM_1_PORTS, CAM_1_PROTO, (PAD), Match_vector, Flow_ID
    – CAM_MASK_1
    – CAM_VALUE_2: CAM_2_SRC_IP, CAM_2_DEST_IP, CAM_2_PORTS, CAM_2_PROTO, (PAD), Match_vector, ...
    – CAM_MASK_2
    – … (if necessary)
    – AAL5 Pad, CPS-UU & CPI, Frame Len, AAL5 Frame Checksum

Slide 7: Protocol Interface to Flow Buffer
  Data { In, Out }
    – PktData
    – Valid
    – Start of Packet (SOPkt)
    – End of Packet (EOPkt)
  Control Pointers { Head, NextHead, Tail, NextTail }
    – Value
    – Valid
  SDRAM Memory Interface
    – Given to you

Slide 8: Data Interface to Flow Buffer
    -- Global signals
    clk             : in  std_logic;
    reset           : in  std_logic;  -- Active low: '0' clears all queues
    Initialization  : out std_logic;  -- Active high: goes to '1' once the
                                      -- circuit is ready to accept traffic
    BufferFull      : out std_logic;  -- Active high: goes to '1' if memory
                                      -- becomes full
    -- Packet data going to the flow buffer
    PktDataIn       : in  std_logic_vector(31 downto 0);
    PktDataInValid  : in  std_logic;
    SoPktIn         : in  std_logic;
    EoPktIn         : in  std_logic;
    -- Packet data going out of the flow buffer
    PktDataOut      : out std_logic_vector(31 downto 0);
    PktDataOutValid : out std_logic;
    SoPktOut        : out std_logic;
    EoPktOut        : out std_logic;
  [Figure: Flow Buffer, Queue Manager (MP 3), Scheduler (RR, DRR, 3DQ), SRAM Controller]

Slide 9: Signal Interface to Flow Buffer
    -- Interface with the Queue Context
    -- From the Queue Manager
    Tail          : in  std_logic_vector(31 downto 0);
    TailValid     : in  std_logic;
    -- To the Queue Manager
    NextTail      : out std_logic_vector(31 downto 0);
    NextTailValid : out std_logic;
    -- From the Queue Manager
    Head          : in  std_logic_vector(31 downto 0);
    HeadValid     : in  std_logic;
    -- To the Queue Manager
    NextHead      : out std_logic_vector(31 downto 0);
    NextHeadValid : out std_logic;
  [Figure: Packets Going to Flow Buffer and Packets Coming From Flow Buffer; the Flow Buffer and Queue Manager (MP 3) exchange Tail, Next Tail, Head, Next Head; Scheduler (RR, DRR, 3DQ); SRAM Controller to Off-Chip Static Random Access Memory (SRAM)]

Slide 10: EnQueue interface

Slide 11: DeQueue interface

Slide 12: Using the ZBT SRAM Controller
  SRAM_REQ
    – Request to use Interface
  SRAM_GR
    – Grant to use interface
  SRAM_D_OUT
    – Data Bus from module to SRAM (36 bits)
  SRAM_D_IN
    – Data Bus to module from SRAM (36 bits)
  SRAM_ADDR
    – Address (18 bits provide access to 256k words)
  SRAM_WR_RD
    – Write=0, Read=1
  [Figure: SRAM Interface to Off-Chip Static Random Access Memory (SRAM), with signals SRAM_D_OUT[35:0], SRAM_ADDR[17:0], SRAM_WR_RD, SRAM_REQ, SRAM_GR, SRAM_D_IN[35:0]]

Slide 13: Priorities and Flow Identifiers
  Flow # generated by the CAM is interpreted by the Queue Manager as having two parts
    – Priority
        00 = lowest, 01 = low, 10 = high, 11 = highest
        Lowest priority are the last cells to depart
    – Flow
        14-bit number that specifies a flow within a priority
  [Figure: 16-Bit Flow # From CAM. Flow occupies bits 15..2, Priority occupies bits 1..0]
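As a small illustration of the bit layout on this slide, the Python sketch below splits a 16-bit Flow # into its two parts. It follows the figure's bit positions (Flow in bits 15..2, Priority in bits 1..0); the function name split_flow_number is hypothetical.

```python
def split_flow_number(flow_number):
    """Split a 16-bit Flow # into (priority, flow).

    priority: bits 1..0 (0 = lowest, 3 = highest)
    flow:     bits 15..2 (14-bit flow number within that priority)
    """
    if not 0 <= flow_number <= 0xFFFF:
        raise ValueError("Flow # must fit in 16 bits")
    priority = flow_number & 0b11
    flow = flow_number >> 2
    return priority, flow
```

For example, a Flow # of 0x0005 (binary 0101) decodes to priority 01 (low) and flow 1.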

Slide 14: Detail of the Queue Manager and Scheduler
  The 3DQ Scheduler for MP3 combines priority-based scheduling with flow-based scheduling
    – Supports four priority levels
        Each implemented as a CoreGen FIFO that stores Flow IDs
    – Supports per-flow queuing
        Backlogged flows are serviced in a round-robin manner
  [Figure: Flow Buffer and Queue Manager exchanging Tail, Next Tail, Head, Next Head; Scheduler containing the P0, P1, P2, and P3 Queues, each with a Re-Entry path, together with a Priority Encoder and a Priority Decoder]

Slide 15: Queue Manager and Scheduler
  For any given priority level
    – A FIFO of flows tracks the flows that contain one or more packets
    – When a new packet arrives
        Tail pointer is read from SRAM to provide SDRAM with the address of the packet
        Flow's read and write counts are read
          – If (Write-Read)>0, then the flow was already backlogged
          – Else the flow should be pushed into the FIFO for later service
        Flow's new write count is incremented and written to SRAM
        Tail pointer is written to SRAM with the new NextTail value reported from SDRAM
  [Figure: P0 Queue with Re-Entry path]

Slide 16: Queue Manager and Scheduler (Continued)
    – When a packet is to be transmitted
        Head pointer is read from SRAM to provide SDRAM with the address of the packet
        Flow's read and write counts are read
          – If (Write-Read)=1, then this packet is the last in the flow.
          – Else the flow should re-enter the same priority queue
        Flow's read count is incremented and written to SRAM
        Head pointer is written to SRAM with the new NextHead value reported from SDRAM
  [Figure: P0 Queue with Re-Entry path]
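The arrival and departure rules on the last two slides can be sketched as a small behavioral model. This Python sketch is illustrative only and is not the MP3 FSM. It assumes the priority is taken from the two low bits of the FlowID, that a higher number means higher priority, and that the counters never wrap; the class name Scheduler3DQ and its methods are hypothetical, and the pointer updates to SRAM and SDRAM are left out.

```python
from collections import deque


class Scheduler3DQ:
    """Four priority FIFOs of FlowIDs plus per-flow read/write counts."""

    def __init__(self, levels=4):
        self.queues = [deque() for _ in range(levels)]
        self.reads = {}   # FlowID -> packets read so far
        self.writes = {}  # FlowID -> packets written so far

    def enqueue(self, flow_id):
        """Record the arrival of one packet on flow_id."""
        priority = flow_id & 0b11
        reads = self.reads.get(flow_id, 0)
        writes = self.writes.get(flow_id, 0)
        if writes - reads == 0:
            # Flow was not backlogged: push it into its priority FIFO.
            self.queues[priority].append(flow_id)
        self.writes[flow_id] = writes + 1

    def dequeue(self):
        """Return the FlowID to serve next, or None if nothing is queued."""
        for priority in range(len(self.queues) - 1, -1, -1):
            queue = self.queues[priority]
            if not queue:
                continue
            flow_id = queue.popleft()
            reads = self.reads.get(flow_id, 0)
            writes = self.writes.get(flow_id, 0)
            if writes - reads > 1:
                # More packets remain: re-enter the same priority queue.
                queue.append(flow_id)
            self.reads[flow_id] = reads + 1
            return flow_id
        return None
```

Because a backlogged flow re-enters at the back of its own FIFO, flows at the same priority are served round-robin, one packet per turn, while a lower-priority FIFO is only read when every higher one is empty.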

Slide 17: Queue Management
  [Figure: linked list of packets in the Packet Store, referenced from the Flow State]
  Packet Store: packets A, B, C and a Reserved Empty Slot [implementation dependent], at locations M[x], M[y], M[z], M[u]; packet D at M[v], referenced by the Head Pointer of Other Flow State
  Flow State for F_i: Head Pointer, Tail Pointer (Empty Slot), Packet Reads = 0, Packet Writes = 3
  Note that packets can be stored anywhere in memory

Slide 18: Initial State of Queue
  [Figure: Packet Store and Flow State before any packet arrives]
  Packet Store: Reserved Empty Slot [implementation dependent] at M[0], Reserved Empty Slot [implementation dependent] at M[1], Empty Slots up to M[64k]
  Flow State: F0 and F1, each with a Head Pointer and a Tail Pointer; for F_i, Packet Reads = 0, Packet Writes = 0
  Note that packets can be stored anywhere in memory

Slide 19: New Packet Arrives
  [Figure: Packet Store and Flow State after a packet arrives]
  Packet Store: packets A, B, C, New Data, and a Reserved Empty Slot [implementation dependent]; M[u]
  Flow State for F_i (Updated Pointers): Head Pointer, Tail Pointer (Empty Slot), Packet Reads = 0, Packet Writes = 4

Slide 20: Packet Departs
  [Figure: Packet Store and Flow State after a packet departs]
  Packet Store: packets B, C, and the Empty Slot
  Flow State for F_i: Head Pointer, Tail Pointer, Packet Reads = 1, Packet Writes = 3
  Note that packets can be stored anywhere in memory

Slide 21: Flow Pointer Initialization in SRAM
  Head & Tail pointers
    – For all possible FlowIDs (16 bits)
    – Every head/tail pointer unique.
        Initial values = {0,1,2,..,0xFFFF}
  Packet read & write counts reset
    – For all possible FlowIDs (16 bits)
    – All counts initially zero
        Initial values = {0}
  [Figure: per-flow state for FlowIDs x0, x1, x2, x3, x4, x5, x6, x7, ..., xFFFE, xFFFF. For FlowID n: Head Pointer: n, Tail Pointer: n, Packet Reads = 0, Packet Writes = 0]
  SRAM ADDR: FlowID & "00"
  SRAM ADDR: FlowID & "01"
  SRAM ADDR: FlowID & "10"
  SRAM ADDR: FlowID & "11"

Slide 22: Flow Pointer Initialization in SRAM
  FlowID (16 bit)   SRAM Address (18 bit)   SRAM Data (32 bit)   Queue Context Function
  0                 0                       0                    Head Pointer
  0                 1                       0                    Packet Reads
  0                 2                       0                    Packet Writes
  0                 3                       0                    Tail Pointer
  1                 4                       1                    Head Pointer
  1                 5                       0                    Packet Reads
  1                 6                       0                    Packet Writes
  1                 7                       1                    Tail Pointer
  2                 8                       2                    Head Pointer
  Callouts on the pointer rows:
    – Points to location in SDRAM
    – Points to location in SDRAM. It is initialized to FlowID when reset_l='0'
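The address mapping in this table (FlowID concatenated with a 2-bit field selector) can be checked with a short Python sketch. The constant and function names below are hypothetical, and the dictionary merely stands in for the SRAM contents after reset.

```python
# 2-bit field selectors, in the order shown in the table.
HEAD, READS, WRITES, TAIL = 0b00, 0b01, 0b10, 0b11


def sram_address(flow_id, field):
    """18-bit SRAM address: the 16-bit FlowID followed by a 2-bit selector."""
    return ((flow_id & 0xFFFF) << 2) | field


def initial_contents(num_flows):
    """Return {address: data} for the first num_flows flows after reset."""
    sram = {}
    for flow_id in range(num_flows):
        sram[sram_address(flow_id, HEAD)] = flow_id   # unique head pointer
        sram[sram_address(flow_id, READS)] = 0
        sram[sram_address(flow_id, WRITES)] = 0
        sram[sram_address(flow_id, TAIL)] = flow_id   # same slot as head
    return sram
```

Passing a small num_flows mirrors the simulation shortcut of initializing less memory; a full initialization covers all 65536 FlowIDs and therefore all 256k SRAM words.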

Slide 23: SRAM Multi-Module Interface
  Controls off-chip ZBT SRAM
  Provides independent interface for two (or more) modules
  Arbitrates requests and issues grant to winning module
  Modules retain access by holding request high after receiving grant
  ( Diagram : David Taylor )
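The request/grant behavior described on this slide can be illustrated with a cycle-by-cycle Python model. This is a sketch under assumptions: the slide does not say how the winner is chosen among simultaneous requests, so round-robin is used here purely for illustration, grant latency is not modeled, and the class name SramArbiter is hypothetical.

```python
class SramArbiter:
    """Grants the SRAM to one requester at a time.

    A module that has the grant keeps it for as long as it holds its
    request high. Assumption: round-robin selection among waiting modules.
    """

    def __init__(self, num_modules=2):
        self.num_modules = num_modules
        self.owner = None              # index of the module holding the grant
        self.last = num_modules - 1    # last owner, for round-robin order

    def step(self, requests):
        """Advance one cycle; requests is a list of booleans, one per module.

        Returns a list of booleans with at most one grant asserted.
        """
        if self.owner is not None and not requests[self.owner]:
            # Owner dropped its request: release the interface.
            self.last = self.owner
            self.owner = None
        if self.owner is None:
            for offset in range(1, self.num_modules + 1):
                candidate = (self.last + offset) % self.num_modules
                if requests[candidate]:
                    self.owner = candidate
                    break
        return [i == self.owner for i in range(self.num_modules)]
```

Holding the request high across a read and the following write is what lets a module treat ownership of the SRAM as a semaphore during a read-modify-write of a flow's counts.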

Slide 24: SRAM Interface Timing
  All I/O signals flopped at module boundary to ensure timing constraints are met
  Timing diagram takes reference point from inside module with boundary flops
  [Figure: timing diagram for the QM]
  ( Diagram : David Taylor )

Slide 25: Detailed Operation of the Flow Buffer
  Packet arrival
    1: New packet data arrives on Flow Number i
    2: Queue Manager provides an address. Memory pointer, x = Flow[i].tail
    3: Packet Data stored in M[x]
    4: New memory pointer, y, provided from free list
    5: Queue Manager provided with new memory pointer, y. Set Flow[i].tail = y
  Packet departure
    1: QM decides it is time for a packet to depart. Reads Flow[i].head = z
    2: Packet read from M[z]
    3: QM sets Flow[i].head = M[z].next
    4: M[z].data transmitted
    5: z returned to the Free List
  [Figure: Packet Data and Flow# enter the Flow Buffer; the Queue Manager holds Tail.Ptr and Head.Ptr; the Free List Manager holds Free.Ptr; packets are stored in Off-Chip Synchronous Dynamic Random Access Memory (SDRAM)]
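The two five-step sequences on this slide amount to a per-flow linked list whose tail always points at a reserved empty slot. The Python sketch below models that idea; it is not the flow buffer provided for MP3. It assumes that slot i is the reserved empty slot of flow i after reset (matching head = tail = FlowID), that the link M[x].next = y is written during arrival, and that each slot holds one whole packet. The names FlowBuffer and BufferFullError are hypothetical.

```python
from collections import deque


class BufferFullError(Exception):
    """Raised when the free list has no slot left (cf. the BufferFull signal)."""


class FlowBuffer:
    def __init__(self, num_flows, num_slots):
        self.data = [None] * num_slots   # M[k].data
        self.next = [None] * num_slots   # M[k].next
        # After reset, flow i owns slot i as its reserved empty slot.
        self.head = list(range(num_flows))
        self.tail = list(range(num_flows))
        self.free = deque(range(num_flows, num_slots))

    def enqueue(self, flow, packet):
        """Arrival steps 2-5: store at M[tail], then advance tail to a free slot."""
        if not self.free:
            raise BufferFullError("no free slot")
        x = self.tail[flow]
        y = self.free.popleft()
        self.data[x] = packet
        self.next[x] = y          # assumed: link written at arrival time
        self.tail[flow] = y

    def dequeue(self, flow):
        """Departure steps 1-5: read M[head], advance head, free the old slot."""
        z = self.head[flow]
        if z == self.tail[flow]:
            raise IndexError("flow has no packets")
        packet = self.data[z]
        self.head[flow] = self.next[z]
        self.data[z] = None
        self.next[z] = None
        self.free.append(z)
        return packet
```

Keeping one empty slot per flow means an arrival never has to wait for the free list before it knows where to write, and head == tail is a simple test for an empty flow.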

Slide 26: Detail of the Queue Manager and Scheduler
  [Figure: internal structure of the Queue Manager]
  Inputs: FlowID and Valid From CAM, TX Enable, reset
  Outputs: ready, To Protocol Wrappers
  Queue Manager blocks: Initialize, Enqueue FSM, DeQueue FSM
  Scheduler blocks: P0, P1, P2, and P3 Queues, each with a Re-Entry path; Priority Encoder; Priority Decoder
  Flow Buffer signals: Tail, Next Tail, Head, Next Head
  Memory: SRAM Controller to Off-Chip Static Random Access Memory (SRAM)

Slide 27: Other notes about the Queue Controller
  Managing SRAM Contention
    – Use control of SRAM to implement semaphore during read+modify+write operation
    – Relinquish Control of SRAM during packet storage and retrieval
        Bus bandwidth is required to be shared between the Enqueue and Dequeue components
  Saving Time During Simulation
    – Initialize less memory to save time
        But be sure to initialize all memory prior to synthesis

