March, 2007http://csg.csail.mit.edu/arvindIPlookup-1 IP Lookup Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.

Slides:



Advertisements
Similar presentations
Elastic Pipelines and Basics of Multi-rule Systems Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February.
Advertisements

Constructive Computer Architecture: Multirule systems and Concurrent Execution of Rules Arvind Computer Science & Artificial Intelligence Lab. Massachusetts.
March 2007http://csg.csail.mit.edu/arvindSemantics-1 Scheduling Primitives for Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
May 27, 2008 L1-1 Why formal verification remains on the fringes of commercial development Arvind Computer Science & Artificial.
Stmt FSM Richard S. Uhler Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology (based on a lecture prepared by Arvind)
Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology October 13, 2009http://csg.csail.mit.edu/koreaL12-1.
February 21, 2007http://csg.csail.mit.edu/6.375/L07-1 Bluespec-4: Architectural exploration using IP lookup Arvind Computer Science & Artificial Intelligence.
September 24, L08-1 IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab.
IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 22, 2011L06-1.
IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 4, 2013
February 22, 2005http://csg.csail.mit.edu/6.884/L07-1 Bluespec-1: Design Affects Everything Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
December 12, 2006http://csg.csail.mit.edu/6.827/L24-1 Scheduling Primitives for Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Realistic Memories and Caches Li-Shiuan Peh Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology March 21, 2012L13-1
September 3, 2009L02-1http://csg.csail.mit.edu/korea Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial.
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
March 6, 2006http://csg.csail.mit.edu/6.375/L10-1 Bluespec-4: Rule Scheduling and Synthesis Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Constructive Computer Architecture: Guards Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology September 24, 2014.
September 22, 2009http://csg.csail.mit.edu/koreaL07-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab.
Constructive Computer Architecture Sequential Circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology
Elastic Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 28, 2011L08-1http://csg.csail.mit.edu/6.375.
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
February 20, 2009http://csg.csail.mit.edu/6.375L08-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 8,
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Simple Inelastic and Folded Pipelines Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 14, 2011L04-1.
October 20, 2009L14-1http://csg.csail.mit.edu/korea Concurrency and Modularity Issues in Processor pipelines Arvind Computer Science & Artificial Intelligence.
October 6, 2009http://csg.csail.mit.edu/koreaL10-1 IP Lookup-2: The Completion Buffer Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Elastic Pipelines: Concurrency Issues
EHRs: Designing modules with concurrent methods
Bluespec-3: A non-pipelined processor Arvind
Concurrency properties of BSV methods and rules
Bluespec-6: Modeling Processors
Folded “Combinational” circuits
Scheduling Constraints on Interface methods
Blusepc-5: Dead cycles, bubbles and Forwarding in Pipelines Arvind
Sequential Circuits Constructive Computer Architecture Arvind
Sequential Circuits: Constructive Computer Architecture
IP Lookup: Some subtle concurrency issues
Stmt FSM Arvind (with the help of Nirav Dave)
Performance Specifications
Pipelining combinational circuits
Multirule Systems and Concurrent Execution of Rules
Bluespec-1: Design Affects Everything
Constructive Computer Architecture: Guards
Sequential Circuits Constructive Computer Architecture Arvind
Pipelining combinational circuits
EHR: Ephemeral History Register
Bluespec-4: Architectural exploration using IP lookup Arvind
Constructive Computer Architecture: Well formed BSV programs
Blusepc-5: Dead cycles, bubbles and Forwarding in Pipelines Arvind
Modules with Guarded Interfaces
Pipelining combinational circuits
Sequential Circuits - 2 Constructive Computer Architecture Arvind
Elastic Pipelines: Concurrency Issues
Bluespec-3: A non-pipelined processor Arvind
Multirule systems and Concurrent Execution of Rules
Stmt FSM Arvind (with the help of Nirav Dave)
Modular Refinement - 2 Arvind
IP Lookup Arvind Computer Science & Artificial Intelligence Lab
IP Lookup: Some subtle concurrency issues
Elastic Pipelines: Concurrency Issues
Elastic Pipelines and Basics of Multi-rule Systems
Constructive Computer Architecture: Guards
Elastic Pipelines and Basics of Multi-rule Systems
Multirule systems and Concurrent Execution of Rules
IP Lookup: Some subtle concurrency issues
Bluespec-5: Scheduling & Rule Composition
Modular Refinement Arvind
Implementing for Correct Concurrency
Constructive Computer Architecture: Well formed BSV programs
Presentation transcript:

March, 2007http://csg.csail.mit.edu/arvindIPlookup-1 IP Lookup Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology

March, 2007 IPlookup-2http://csg.csail.mit.edu/arvind IP Lookup block in a router Queue Manager Packet Processor Exit functions Control Processor Line Card (LC) IP Lookup SRAM (lookup table) Arbitration Switch LC A packet is routed based on the “Longest Prefix Match” (LPM) of it’s IP address with entries in a routing table Line rate and the order of arrival must be maintained line rate  15Mpps for 10GE

March, 2007 IPlookup-3http://csg.csail.mit.edu/arvind IP addressResultM Ref F F E C Sparse tree representation 3 A … A … B C … C … 5 D F … F … 14 A … A … 7 F … F … 200 F … F … F* E5.*.*.* D C * B A7.14.*.* F … F … F F … E A In this lecture:  Level 1: 16 bits  Level 2: 8 bits Level 3: 8 bits  1 to 3 memory accesses

March, 2007 IPlookup-4http://csg.csail.mit.edu/arvind “C” version of LPM int lpm (IPA ipa) /* 3 memory lookups */ { int p; /* Level 1: 16 bits */ p = RAM [ipa[31:16]]; if (isLeaf(p)) return value(p); /* Level 2: 8 bits */ p = RAM [ptr(p) + ipa [15:8]]; if (isLeaf(p)) return value(p); /* Level 3: 8 bits */ p = RAM [ptr(p) + ipa [7:0]]; return value(p); /* must be a leaf */ } Not obvious from the C code how to deal with - memory latency - pipelining … … … … 0 Must process a packet every 1/15 s or 67 ns Must sustain 3 memory dependent lookups in 67 ns Memory latency ~30ns to 40ns

March, 2007 IPlookup-5http://csg.csail.mit.edu/arvind Longest Prefix Match for IP lookup: 3 possible implementation architectures Rigid pipeline Inefficient memory usage but simple design Linear pipeline Efficient memory usage through memory port replicator Circular pipeline Efficient memory with most complex control Designer’s Ranking: 123 Which is “best”? Arvind, Nikhil, Rosenband & Dave ICCAD 2004

March, 2007 IPlookup-6http://csg.csail.mit.edu/arvind Circular pipeline The fifo holds the request while the memory access is in progress The architecture has been simplified for the sake of the lecture. Otherwise, a “completion buffer” has to be added at the exit to make sure that packets leave in order. enter? done? RAM yes inQ fifo no outQ

March, 2007 IPlookup-7http://csg.csail.mit.edu/arvind interface FIFO#(type t); method Action enq(t x);// enqueue an item method Action deq();// remove oldest entry method t first();// inspect oldest item endinterface FIFO n = # of bits needed to represent a value of type t not full not empty rdy enab n n rdy enab rdy enq deq first FIFO module

March, 2007 IPlookup-8http://csg.csail.mit.edu/arvind Request-Response Interface for Memory Synch Mem Latency N Addr Ready ctr (ctr > 0) ctr++ ctr-- deq Enable enq interface Mem#(type addrT, type dataT); method Action req(addrT x); method Action deq(); method dataT peek(); endinterface Data Ack Data Ready req deq peek

March, 2007 IPlookup-9http://csg.csail.mit.edu/arvind Circular Pipeline Code rule enter (True); IP ip = inQ.first(); ram.req(ip[31:16]); fifo.enq(ip[15:0]); inQ.deq(); endrule enter? done? RAM inQ fifo rule recirculate (True); TableEntry p = ram.peek(); ram.deq(); IP rip = fifo.first(); if (isLeaf(p)) outQ.enq(p); else begin fifo.enq(rip << 8); ram.req(p + rip[15:8]); end fifo.deq(); endrule When can enter fire? When can recirculate fire?

March, 2007 IPlookup-10http://csg.csail.mit.edu/arvind module mkFIFO1 (FIFO#(t)); Reg#(t) data <- mkRegU(); Reg#(Bool) full <- mkReg(False); method Action enq(t x) if (!full); full <= True; data <= x; endmethod method Action deq() if (full); full <= False; endmethod method t first() if (full); return (data); endmethod method Action clear(); full <= False; endmethod endmodule One Element FIFO enq and deq cannot even be enabled together much less fire concurrently! n not empty not full rdy enab rdy enab enq deq FIFO module The functionality we want is as if deq “happens” before enq; if deq does not happen then enq behaves normally We can build such a FIFO

March, 2007 IPlookup-11http://csg.csail.mit.edu/arvind Dead cycle elimination rule enter (True); IP ip = inQ.first(); ram.req(ip[31:16]); fifo.enq(ip[15:0]); inQ.deq(); endrule enter? done? RAM inQ fifo rule recirculate (True); TableEntry p = ram.peek(); ram.deq(); IP rip = fifo.first(); if (isLeaf(p)) outQ.enq(p); else begin fifo.enq(rip << 8); ram.req(p + rip[15:8]); end fifo.deq(); endrule Can a new request enter the system simultaneously with an old one leaving?

March, 2007 IPlookup-12http://csg.csail.mit.edu/arvind Scheduling conflicting rules When two rules conflict on a shared resource, they cannot both execute in the same clock The compiler produces logic that ensures that, when both rules are applicable, only one will fire Which one? source annotations (* descending_urgency = “recirculate, enter” *)

March, 2007 IPlookup-13http://csg.csail.mit.edu/arvind Circular Pipeline Code rule enter (True); IP ip = inQ.first(); ram.req(ip[31:16]); fifo.enq(ip[15:0]); inQ.deq(); endrule enter? done? RAM inQ fifo rule recirculate (True); TableEntry p = ram.peek(); ram.deq(); IP rip = fifo.first(); if (isLeaf(p)) outQ.enq(p); else begin fifo.enq(rip << 8); ram.req(p + rip[15:8]); end fifo.deq(); endrule In general these two rules conflict but when isLeaf(p) is true there is no apparent conflict!

March, 2007 IPlookup-14http://csg.csail.mit.edu/arvind Rule Spliting rule foo (True); if (p) r1 <= 5; else r2 <= 7; endrule rule fooT (p); r1 <= 5; endrule rule fooF (!p); r2 <= 7; endrule  rule fooT and fooF can be scheduled independently with some other rule

March, 2007 IPlookup-15http://csg.csail.mit.edu/arvind Spliting the recirculate rule rule recirculate (!isLeaf(ram.peek())); IP rip = fifo.first(); fifo.enq(rip << 8); ram.req(ram.peek() + rip[15:8]); fifo.deq(); ram.deq(); endrule rule exit (isLeaf(ram.peek())); outQ.enq(ram.peek()); fifo.deq(); ram.deq(); endrule rule enter (True); IP ip = inQ.first(); ram.req(ip[31:16]); fifo.enq(ip[15:0]); inQ.deq(); endrule Now rules enter and exit can be scheduled simultaneously