Published by Cláudia Barroso Porto. Modified over 6 years ago.
1
MacroNET Lightnode Architecture & Network
ELE-580i PRESENTATION-II 04/29/2003 Canturk ISCI
2
OUTLINE
  - Introduction/Review [1 min]
  - Methodology [2 min]
  - Communications [3 min]
  - CPU & Router [6 min]
  - Transistor Counts [4 min]
  - Simulation [1 min]
  - MacroNET SIM DEMO [3 min]
11/17/2018
3
Introduction / Review
MacroNET:
  - Large area electronics, flexible & deformable <rolled like a window shade>
  - Distributed (light) processing units
Project Definition & Motivation:
  - Sparse topology, lightweight node, fault tolerant
  - Transistor budget
    - Compute & route on the same hardware; buffer = regfile/memory
  - Some extent of path diversity
Particular design challenges:
  - Tight: complexity, fault tolerance
  - Relaxed: latency, performance
Related Work:
  - More performance driven
  - Lightweight networks for on-chip processors focus on reducing latency
4
Methodology/Experience
Assume a Topology
  - Now TORUS (as the superset of others)
Define the nature of Communication
  - With external world / sensors
    - The strokes
  - Instructions among nodes
    - Zoom around node(r,c)
    - Dim / brighten
    - Maketop N/S/W/E
    - Compute distance
  - Instructions within nodes
Packet Types
  - Explicit acks for simplicity [alewife]
  - Short size [alpha]
  - Idempotency/delivery
    - Msg/ack/conf [idempotent] + Acked & broadcasting
Routing Model
  - Livelock avoidance
Define compute node
  - CPU + Router Architecture
  - T counts < 1000
    - 10% for pads
[Diagram labels: Transistor Budget; Topology; Routing & Flow Control; Traffic Pattern; Compute Node Model; Instructions & Communication]
5
COMMUNICATIONS
Input Method:
  - From sensors, e.g. strokes converted to instruction packets
  - Zoom/Dim/Brighten/Maketop:
    - Click on a cell -> its processor makes the message -> broadcasts
  - Compute Distance:
    - 1st click: 1st cell's processor makes the message -> broadcasts
    - 2nd click: 2nd cell's processor sets a status flag, waits for the message, computes the distance
Packet Types: 8 bit flits, phits = flits
  - CPU instruction: 000|xxxxx
  - MSG: [abc|xxxxx][xxxxxxxx][…]
  - Special Packets:
    - ACK: [111|11111]
    - ACKED: [111|10101]
    - CONFIRM: [111|01010]
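The flit layout above (3-bit tag, 5-bit payload, three reserved special patterns) can be sketched as a small classifier. This is only an illustration: the function names and the "UNKNOWN" label for unlisted 111-tagged flits are assumptions, and only the bit patterns come from the slide.

```python
# Bit patterns copied from the slide; everything else here is hypothetical.
ACK = 0b11111111      # [111|11111]
ACKED = 0b11110101    # [111|10101]
CONFIRM = 0b11101010  # [111|01010]

SPECIAL = {ACK: "ACK", ACKED: "ACKED", CONFIRM: "CONFIRM"}


def split_flit(flit):
    """Split an 8-bit flit into its 3-bit tag and 5-bit payload."""
    if not 0 <= flit <= 0xFF:
        raise ValueError("a flit is exactly 8 bits")
    return flit >> 5, flit & 0b11111


def classify_header(flit):
    """Classify a header flit as CPU, MSG, or one of the special packets."""
    if flit in SPECIAL:
        return SPECIAL[flit]
    tag, _payload = split_flit(flit)
    if tag == 0b000:
        return "CPU"      # local CPU instruction: 000|xxxxx
    if tag == 0b111:
        return "UNKNOWN"  # assumption: other 111-tagged patterns are unused
    return "MSG"          # network message: abc|xxxxx


print(classify_header(0b11110101))  # ACKED
```

Since phits equal flits, one such 8-bit value is also what crosses a channel per transfer.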
6
Instructions
All instructions go through the CPU pipe
  - Local instructions: [000|xxxxx]
  - Network instructions: [abc|xxxxx]
Zoom Around:
  - 1st flit tells the source
  - 2nd flit tells the color for the next node, or keep
  - [001|01|10|x][color/keep][bright/keep]
Dim / Brighten:
  - Single flit [010/011|xxxxx]
Maketop N/S/W/E:
  - Livelock problems when incorporating data exchange
  - Use Maketop + Exchange (unicast)
  - Maketop: single flit tells the direction [100|01|xxx]
  - Exchange: tells the destination [110|01|11|x][color][bright]
    - Something like dimensional routing
    - Needs a timeout for fault tolerance now
Compute Distance:
  - Single flit tells the destination [101|11|00|x]
  - Can be made unicast, but no need
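As a rough illustration of the instruction formats listed above, the 3-bit opcode can be mapped to an instruction name and a packet length in flits. The names and the decode function are hypothetical; the opcode values and flit counts are read off the bracketed layouts on the slide, and the meaning of the sub-fields after the opcode is left uninterpreted.

```python
# Opcode (top 3 bits of the header flit) -> instruction name.
# Names are illustrative; the bit values follow the slide.
OPCODES = {
    0b000: "local",
    0b001: "zoom_around",
    0b010: "dim",
    0b011: "brighten",
    0b100: "maketop",
    0b101: "compute_distance",
    0b110: "exchange",
    0b111: "special",  # ACK / ACKED / CONFIRM patterns
}

# Packets whose layout shows extra flits after the header:
# zoom around carries [color/keep][bright/keep], exchange carries [color][bright].
EXTRA_FLITS = {"zoom_around": 2, "exchange": 2}


def decode(header):
    """Return (instruction name, total packet length in flits)."""
    name = OPCODES[(header >> 5) & 0b111]
    return name, 1 + EXTRA_FLITS.get(name, 0)


print(decode(0b11001110))  # ('exchange', 3)
```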
7
CPU & ROUTER
Router = Processor + AUXILIARY
  - Packet headers are instructions
    - Tagging tells what is what
    - Headers are also decoded in ID
    - Messages also walk through the CPU pipe
  - Maketop
    - Different packets arrive simultaneously
    - Need simple arbitration (LRS)
  - Exchange
    - Routing
    - Need a simple router within the controller
  - Compute Distance: arithmetic
    - Need a simple ALU
  - Status Flags (recall HW March presentation)
    - For each outgoing channel
    - Global one for the CPU
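The arbitration step mentioned above could look roughly like the following, assuming LRS stands for "least recently served" (the slides do not expand the acronym). The class name and the four-port default are illustrative, not taken from the presentation.

```python
class LRSArbiter:
    """Grant the requesting port that was served longest ago (assumed LRS policy)."""

    def __init__(self, n_ports=4):
        # Front of the list = least recently served port.
        self.order = list(range(n_ports))

    def grant(self, requests):
        """Return the winning port among `requests`, or None if nobody asks."""
        for port in self.order:
            if port in requests:
                # The winner moves to the back: it is now most recently served.
                self.order.remove(port)
                self.order.append(port)
                return port
        return None


arb = LRSArbiter()
print(arb.grant({1, 3}))  # 1
print(arb.grant({1, 3}))  # 3
```

In hardware this ordering would be a handful of priority bits rather than a list; the list just makes the policy easy to read.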
8
Router Architecture
Mostly for broadcasting with minimal possible HW
  - Stupid, but T counts are the problem
  - Single cycle, in-order datapath
  - No buffering
  - Keeps certain node info
Examples in [NOW][mmr][flex] & [stallion] focus on other issues
We can't fit an 8 bit 4x4 Xbar ([Ruby] & [bkmrk])
We start from [SP2] & [Hwpres] and implement:
  - Mem based buffering/switching (static structure)
  - Allocation: LRS
  - ~Credit based flow control (creditMAX=1)
  - Register mapped (5 reg) network interface
9
Router Diagram
[Figure: router block diagram. Labels: CONTROLLER (routing, allocation, datapath related, message generation); I/p channel status and o/p channel status for ports P0, P1, P2, P3; process queue; pipeline I$, Dec, EX, D$; RF; NODE O/P (ack) (acked) (confirm/idle); timeout timer; CPU status flags; NODE INFO (color, brightness, X, Y, TOP)]
10
Status Fields
CPU Status flags
  - Such as: waiting for data from Px
  - Used to stall the datapath or block dispense of registers
I/p channel Status Fields [I|R/W|A|Acked]
  - I: Idle <default>
  - R/W: Routing or waiting for its turn
  - A: CPU processing the packet or sending downstream
    - Only 1 port can be active at a time
  - Acked: Another port already received/processed the same message
O/p channel Status Fields [Ack|Acked|Timeout|W]
  - Ack: Downstream idle (acked the last flit) <default>
  - Acked: Downstream answered the last packet with ACKED; don't send any more
  - Timeout: The last awaited ACK from downstream timed out
  - W: Waiting for ACK
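One way to read the output-channel status field is as a four-state machine in which a flit may only be sent while the channel is in the Ack state, which matches a credit limit of one. The sketch below is an interpretation, not the presented design: the class and method names are invented, and the default timeout of 15 cycles is an assumption based on the 4-bit timer mentioned elsewhere in the slides.

```python
from enum import Enum


class OutStatus(Enum):
    ACK = "Ack"          # downstream idle, last flit acknowledged (default)
    ACKED = "Acked"      # downstream already has this message; stop sending
    TIMEOUT = "Timeout"  # awaited ACK never arrived
    WAITING = "W"        # flit sent, waiting for ACK


class OutputChannel:
    def __init__(self, timeout_cycles=15):  # assumed: 4-bit timer saturates at 15
        self.status = OutStatus.ACK
        self.timeout_cycles = timeout_cycles
        self.timer = 0

    def can_send(self):
        return self.status is OutStatus.ACK

    def send_flit(self):
        """Send one flit if allowed; at most one flit is ever outstanding."""
        if not self.can_send():
            return False
        self.status = OutStatus.WAITING
        self.timer = 0
        return True

    def receive(self, reply):
        """Handle an "ACK" or "ACKED" reply from downstream."""
        if self.status is not OutStatus.WAITING:
            return
        if reply == "ACK":
            self.status = OutStatus.ACK
        elif reply == "ACKED":
            self.status = OutStatus.ACKED

    def tick(self):
        """Advance one cycle; flag a timeout if the ACK is overdue."""
        if self.status is OutStatus.WAITING:
            self.timer += 1
            if self.timer >= self.timeout_cycles:
                self.status = OutStatus.TIMEOUT
```

A channel stuck in Timeout is what a fault-tolerant sender would treat as a dead link; how the node recovers from that state is not specified on the slide.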
11
TRANSISTOR COUNTS
Apart from the datapath:
  - 13 x 8 bit registers
  - 1 x 4 bit timer
  - 3 x 2 bit X/Y/TOP
Datapath:
  - I$: may be none
  - Decode: combinational and small
  - RF: very few registers might do
  - EX: AT LEAST add/sub (compute distance)
  - D$: may be none
Control:
  - Few states, at most a 4 bit state register
  - Small NSE & output decoder
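To show why add/sub is enough for the EX stage, here is a sketch of a distance computation that uses only a comparison, a subtraction, and an addition. It assumes the distance is the Manhattan (row plus column) distance between two (r, c) cells without wraparound; the slides do not state which metric is meant, and the function names are made up.

```python
def abs_diff(a, b):
    """|a - b| using one comparison and one subtraction (no multiply, no abs unit)."""
    return a - b if a >= b else b - a


def grid_distance(src, dst):
    """Assumed metric: Manhattan distance between two (row, col) cells."""
    (r1, c1), (r2, c2) = src, dst
    return abs_diff(r1, r2) + abs_diff(c1, c2)


# With 2-bit X/Y coordinates the result is at most 3 + 3 = 6, so it fits in 3 bits.
print(grid_distance((0, 0), (3, 3)))  # 6
```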
12
Process Constraints
1000 Ts per node
  - [MIT RAW] ~10% wasted for padding
  - All the logic < 900 Ts
Only n-channel Ts
  - No CMOS
  - Alternatives:
    - Dynamic logic with delay on precharge (PCHRG) lines
    - Pseudo-NMOS logic without the PMOS
      - Static discharge
      - Can't help VT drop
    - Others we have EMD
13
Transistor Calculation
Registers are the killer
  - Non-overlapping 2 phase [stallion]: 6 Ts with pass gates
    - Complicated clk generation
  - C2MOS: 8 Ts & 2 phase
  - Double true single phase (master-slave): 12 T
  - True single phase: clk=Hi transparent, but 6 T
    - Controller should take care of clk
  - 13 x 8 bit + 3 x 2 bit + 1 x 4 bit = 660 Ts
EX: only add/sub & can do serial
  - TG adder, simplified to pass gates: less than 20 T
  - Can also add a barrel shifter: less than 20 Ts [colt][stallion]
Controller:
  - Even with > 100 states, an 8 bit register is sufficient
  - 50 Ts for state
RF: 1 reg = 48 T
  - #RF < 3!
Remaining: combinational DEC, combinational control, I$, D$ < ~150 T
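The budget arithmetic on this slide can be tallied as follows. The sketch assumes 6 transistors per stored bit (consistent with "1 reg = 48 T" for an 8-bit register) and treats each "less than" figure as an upper bound. Note that the slide quotes 660 Ts for the register state, while 114 bits at 6 T each comes to 684; the slide does not say where the difference comes from, so both totals are computed.

```python
T_PER_BIT = 6  # assumption: one of the 6T latch options listed on the slide


def register_transistors(bits, t_per_bit=T_PER_BIT):
    """Transistors needed to hold `bits` bits of state."""
    return bits * t_per_bit


# 13 x 8-bit registers + 3 x 2-bit X/Y/TOP + 1 x 4-bit timer
STATE_BITS = 13 * 8 + 3 * 2 + 1 * 4

# Upper bounds quoted on the slide for the other blocks.
OTHER_BLOCKS = {
    "EX add/sub": 20,
    "barrel shifter": 20,
    "controller state": 50,
    "DEC + control + I$ + D$": 150,
}


def node_total(register_ts):
    """Register transistors plus the other blocks' upper bounds."""
    return register_ts + sum(OTHER_BLOCKS.values())


print(STATE_BITS)                            # 114
print(register_transistors(STATE_BITS))      # 684
print(node_total(660), node_total(684))      # 900 924
```

With the slide's 660 figure the total lands exactly on the 900-transistor logic budget, which leaves no headroom for extra register-file entries.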
14
SIMULATION
Target:
  - Revert to Torus topology (4-ary 2-cube)
    - Anything else will be a subset
  - Only consider channel latencies
    - Will reveal network latency
  - Provide fault injection
    - Key to assessing other topologies
    - Also demonstrates fault tolerance
  - Emulate clicks: simple user input
    - Obviously, a GUI
  - Try to stay faithful to the actual decision making
Restrictions:
  - Cannot do zoom with a 4x4 network
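A minimal sketch of the simulated topology: neighbor lookup on a 4-ary 2-cube (4x4 torus) with optional link faults. The direction convention (N means row minus one), the function names, and the representation of a faulty link as an unordered pair of nodes are all assumptions made for illustration.

```python
K = 4  # 4-ary 2-cube: a 4x4 torus


def neighbors(r, c, k=K):
    """Torus neighbors of node (r, c); edges wrap around modulo k."""
    return {
        "N": ((r - 1) % k, c),
        "S": ((r + 1) % k, c),
        "W": (r, (c - 1) % k),
        "E": (r, (c + 1) % k),
    }


def live_neighbors(r, c, faulty_links, k=K):
    """Neighbors still reachable when some links have been marked faulty.

    `faulty_links` is a set of frozenset({node_a, node_b}) pairs, so a fault
    removes the channel in both directions.
    """
    node = (r, c)
    return {
        d: n
        for d, n in neighbors(r, c, k).items()
        if frozenset((node, n)) not in faulty_links
    }


faults = {frozenset(((0, 0), (0, 1)))}
print(live_neighbors(0, 0, faults))  # no "E" entry
```

Removing the wraparound links the same way turns the torus into a mesh, which is one reading of "anything else will be a subset".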
15
MacroNET SIM
  - Similar to the animation of the previous talk
  - Currently (28/4, 11:00am) dim works on Torus
  - Easy to add: Brighten, Compute distance, MakeTop
  - More effort: Fault injection & exchange
[Figure: 4x4 torus of nodes labeled (r,c), from (0,0) to (3,3). Legend: Initial Processor; Updated Processor; Sent Instruction; Sent ACK; Sent ACKed; Used Channel]
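The broadcast behavior that the simulator animates can be approximated by a breadth-first flood: each node acts on an instruction the first time it arrives and treats later copies as duplicates, which is where an ACKed reply would be sent. This is a simplified model rather than the simulator itself; in particular it assumes unit channel latency and that every node forwards on all four ports, including the one the message arrived on.

```python
from collections import deque


def flood(src, k=4, faulty_links=frozenset()):
    """Flood a broadcast from `src` over a k x k torus.

    Returns (hops, duplicates): hops[node] is the channel-hop count at first
    delivery, duplicates counts copies that reach an already-updated node.
    """
    hops = {src: 0}
    duplicates = 0
    queue = deque([src])
    while queue:
        r, c = queue.popleft()
        node = (r, c)
        for nxt in (((r - 1) % k, c), ((r + 1) % k, c),
                    (r, (c - 1) % k), (r, (c + 1) % k)):
            if frozenset((node, nxt)) in faulty_links:
                continue  # injected fault: this channel is unusable
            if nxt in hops:
                duplicates += 1  # receiver would answer ACKed
            else:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops, duplicates


hops, dups = flood((0, 0))
print(len(hops), max(hops.values()), dups)  # 16 4 49
```

Passing a faulty link, for example `flood((0, 0), faulty_links={frozenset(((0, 0), (0, 1)))})`, still reaches all 16 nodes, with (0,1) now three hops away instead of one.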
16
MacroNET SIM DEMO