March, 2007http://csg.csail.mit.edu/arvindIPlookup-1 IP Lookup Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.

Slides:

Advertisements

Similar presentations

Elastic Pipelines and Basics of Multi-rule Systems Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February.

Advertisements

Constructive Computer Architecture: Multirule systems and Concurrent Execution of Rules Arvind Computer Science & Artificial Intelligence Lab. Massachusetts.

March 2007http://csg.csail.mit.edu/arvindSemantics-1 Scheduling Primitives for Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts.

May 27, 2008 L1-1 Why formal verification remains on the fringes of commercial development Arvind Computer Science & Artificial.

Stmt FSM Richard S. Uhler Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology (based on a lecture prepared by Arvind)

Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology October 13, 2009http://csg.csail.mit.edu/koreaL12-1.

February 21, 2007http://csg.csail.mit.edu/6.375/L07-1 Bluespec-4: Architectural exploration using IP lookup Arvind Computer Science & Artificial Intelligence.

September 24, L08-1 IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab.

IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 22, 2011L06-1.

IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 4, 2013

February 22, 2005http://csg.csail.mit.edu/6.884/L07-1 Bluespec-1: Design Affects Everything Arvind Computer Science & Artificial Intelligence Lab Massachusetts.

December 12, 2006http://csg.csail.mit.edu/6.827/L24-1 Scheduling Primitives for Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts.

Realistic Memories and Caches Li-Shiuan Peh Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology March 21, 2012L13-1

September 3, 2009L02-1http://csg.csail.mit.edu/korea Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial.

Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.

March 6, 2006http://csg.csail.mit.edu/6.375/L10-1 Bluespec-4: Rule Scheduling and Synthesis Arvind Computer Science & Artificial Intelligence Lab Massachusetts.

Constructive Computer Architecture: Guards Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology September 24, 2014.

September 22, 2009http://csg.csail.mit.edu/koreaL07-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab.

Constructive Computer Architecture Sequential Circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology

Elastic Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 28, 2011L08-1http://csg.csail.mit.edu/6.375.

Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.

February 20, 2009http://csg.csail.mit.edu/6.375L08-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts.

Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 8,

Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.

Simple Inelastic and Folded Pipelines Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 14, 2011L04-1.

October 20, 2009L14-1http://csg.csail.mit.edu/korea Concurrency and Modularity Issues in Processor pipelines Arvind Computer Science & Artificial Intelligence.

October 6, 2009http://csg.csail.mit.edu/koreaL10-1 IP Lookup-2: The Completion Buffer Arvind Computer Science & Artificial Intelligence Lab Massachusetts.

Elastic Pipelines: Concurrency Issues

EHRs: Designing modules with concurrent methods

Bluespec-3: A non-pipelined processor Arvind

Concurrency properties of BSV methods and rules

Bluespec-6: Modeling Processors

Folded “Combinational” circuits

Scheduling Constraints on Interface methods

Blusepc-5: Dead cycles, bubbles and Forwarding in Pipelines Arvind

Sequential Circuits Constructive Computer Architecture Arvind

Sequential Circuits: Constructive Computer Architecture

IP Lookup: Some subtle concurrency issues

Stmt FSM Arvind (with the help of Nirav Dave)

Performance Specifications

Pipelining combinational circuits

Multirule Systems and Concurrent Execution of Rules

Bluespec-1: Design Affects Everything

Constructive Computer Architecture: Guards

Sequential Circuits Constructive Computer Architecture Arvind

Pipelining combinational circuits

EHR: Ephemeral History Register

Bluespec-4: Architectural exploration using IP lookup Arvind

Constructive Computer Architecture: Well formed BSV programs

Blusepc-5: Dead cycles, bubbles and Forwarding in Pipelines Arvind

Modules with Guarded Interfaces

Pipelining combinational circuits

Sequential Circuits - 2 Constructive Computer Architecture Arvind

Elastic Pipelines: Concurrency Issues

Bluespec-3: A non-pipelined processor Arvind

Multirule systems and Concurrent Execution of Rules

Stmt FSM Arvind (with the help of Nirav Dave)

Modular Refinement - 2 Arvind

IP Lookup Arvind Computer Science & Artificial Intelligence Lab

IP Lookup: Some subtle concurrency issues

Elastic Pipelines: Concurrency Issues

Elastic Pipelines and Basics of Multi-rule Systems

Constructive Computer Architecture: Guards

Elastic Pipelines and Basics of Multi-rule Systems

Multirule systems and Concurrent Execution of Rules

IP Lookup: Some subtle concurrency issues

Bluespec-5: Scheduling & Rule Composition

Modular Refinement Arvind

Implementing for Correct Concurrency

Constructive Computer Architecture: Well formed BSV programs

Presentation transcript:

March, 2007http://csg.csail.mit.edu/arvindIPlookup-1 IP Lookup Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology

March, 2007 IPlookup-2http://csg.csail.mit.edu/arvind IP Lookup block in a router Queue Manager Packet Processor Exit functions Control Processor Line Card (LC) IP Lookup SRAM (lookup table) Arbitration Switch LC A packet is routed based on the “Longest Prefix Match” (LPM) of it’s IP address with entries in a routing table Line rate and the order of arrival must be maintained line rate  15Mpps for 10GE

March, 2007 IPlookup-3http://csg.csail.mit.edu/arvind IP addressResultM Ref F F E C Sparse tree representation 3 A … A … B C … C … 5 D F … F … 14 A … A … 7 F … F … 200 F … F … F* E5.*.*.* D C * B A7.14.*.* F … F … F F … E A In this lecture:  Level 1: 16 bits  Level 2: 8 bits Level 3: 8 bits  1 to 3 memory accesses

March, 2007 IPlookup-4http://csg.csail.mit.edu/arvind “C” version of LPM int lpm (IPA ipa) /* 3 memory lookups */ { int p; /* Level 1: 16 bits */ p = RAM [ipa[31:16]]; if (isLeaf(p)) return value(p); /* Level 2: 8 bits */ p = RAM [ptr(p) + ipa [15:8]]; if (isLeaf(p)) return value(p); /* Level 3: 8 bits */ p = RAM [ptr(p) + ipa [7:0]]; return value(p); /* must be a leaf */ } Not obvious from the C code how to deal with - memory latency - pipelining … … … … 0 Must process a packet every 1/15 s or 67 ns Must sustain 3 memory dependent lookups in 67 ns Memory latency ~30ns to 40ns

March, 2007 IPlookup-5http://csg.csail.mit.edu/arvind Longest Prefix Match for IP lookup: 3 possible implementation architectures Rigid pipeline Inefficient memory usage but simple design Linear pipeline Efficient memory usage through memory port replicator Circular pipeline Efficient memory with most complex control Designer’s Ranking: 123 Which is “best”? Arvind, Nikhil, Rosenband & Dave ICCAD 2004

March, 2007 IPlookup-6http://csg.csail.mit.edu/arvind Circular pipeline The fifo holds the request while the memory access is in progress The architecture has been simplified for the sake of the lecture. Otherwise, a “completion buffer” has to be added at the exit to make sure that packets leave in order. enter? done? RAM yes inQ fifo no outQ

March, 2007 IPlookup-7http://csg.csail.mit.edu/arvind interface FIFO#(type t); method Action enq(t x);// enqueue an item method Action deq();// remove oldest entry method t first();// inspect oldest item endinterface FIFO n = # of bits needed to represent a value of type t not full not empty rdy enab n n rdy enab rdy enq deq first FIFO module

March, 2007 IPlookup-8http://csg.csail.mit.edu/arvind Request-Response Interface for Memory Synch Mem Latency N Addr Ready ctr (ctr > 0) ctr++ ctr-- deq Enable enq interface Mem#(type addrT, type dataT); method Action req(addrT x); method Action deq(); method dataT peek(); endinterface Data Ack Data Ready req deq peek

March, 2007 IPlookup-9http://csg.csail.mit.edu/arvind Circular Pipeline Code rule enter (True); IP ip = inQ.first(); ram.req(ip[31:16]); fifo.enq(ip[15:0]); inQ.deq(); endrule enter? done? RAM inQ fifo rule recirculate (True); TableEntry p = ram.peek(); ram.deq(); IP rip = fifo.first(); if (isLeaf(p)) outQ.enq(p); else begin fifo.enq(rip << 8); ram.req(p + rip[15:8]); end fifo.deq(); endrule When can enter fire? When can recirculate fire?

March, 2007 IPlookup-10http://csg.csail.mit.edu/arvind module mkFIFO1 (FIFO#(t)); Reg#(t) data <- mkRegU(); Reg#(Bool) full <- mkReg(False); method Action enq(t x) if (!full); full <= True; data <= x; endmethod method Action deq() if (full); full <= False; endmethod method t first() if (full); return (data); endmethod method Action clear(); full <= False; endmethod endmodule One Element FIFO enq and deq cannot even be enabled together much less fire concurrently! n not empty not full rdy enab rdy enab enq deq FIFO module The functionality we want is as if deq “happens” before enq; if deq does not happen then enq behaves normally We can build such a FIFO

March, 2007 IPlookup-11http://csg.csail.mit.edu/arvind Dead cycle elimination rule enter (True); IP ip = inQ.first(); ram.req(ip[31:16]); fifo.enq(ip[15:0]); inQ.deq(); endrule enter? done? RAM inQ fifo rule recirculate (True); TableEntry p = ram.peek(); ram.deq(); IP rip = fifo.first(); if (isLeaf(p)) outQ.enq(p); else begin fifo.enq(rip << 8); ram.req(p + rip[15:8]); end fifo.deq(); endrule Can a new request enter the system simultaneously with an old one leaving?

March, 2007 IPlookup-12http://csg.csail.mit.edu/arvind Scheduling conflicting rules When two rules conflict on a shared resource, they cannot both execute in the same clock The compiler produces logic that ensures that, when both rules are applicable, only one will fire Which one? source annotations (* descending_urgency = “recirculate, enter” *)

March, 2007 IPlookup-13http://csg.csail.mit.edu/arvind Circular Pipeline Code rule enter (True); IP ip = inQ.first(); ram.req(ip[31:16]); fifo.enq(ip[15:0]); inQ.deq(); endrule enter? done? RAM inQ fifo rule recirculate (True); TableEntry p = ram.peek(); ram.deq(); IP rip = fifo.first(); if (isLeaf(p)) outQ.enq(p); else begin fifo.enq(rip << 8); ram.req(p + rip[15:8]); end fifo.deq(); endrule In general these two rules conflict but when isLeaf(p) is true there is no apparent conflict!

March, 2007 IPlookup-14http://csg.csail.mit.edu/arvind Rule Spliting rule foo (True); if (p) r1 <= 5; else r2 <= 7; endrule rule fooT (p); r1 <= 5; endrule rule fooF (!p); r2 <= 7; endrule  rule fooT and fooF can be scheduled independently with some other rule

March, 2007 IPlookup-15http://csg.csail.mit.edu/arvind Spliting the recirculate rule rule recirculate (!isLeaf(ram.peek())); IP rip = fifo.first(); fifo.enq(rip << 8); ram.req(ram.peek() + rip[15:8]); fifo.deq(); ram.deq(); endrule rule exit (isLeaf(ram.peek())); outQ.enq(ram.peek()); fifo.deq(); ram.deq(); endrule rule enter (True); IP ip = inQ.first(); ram.req(ip[31:16]); fifo.enq(ip[15:0]); inQ.deq(); endrule Now rules enter and exit can be scheduled simultaneously