Power-aware RAM Processing for FPGA Embedded Memory Blocks. Russell Tessier, University of Massachusetts. December 9, 2005.

1 Power-aware RAM Processing for FPGA Embedded Memory Blocks
Russell Tessier, University of Massachusetts
Vaughn Betz, David Neto, and Thiagaraja Gopalsamy, Altera Corporation
Power-Aware RAM Processing for FPGAs, December 9, 2005

2 Overview
° Operation of FPGA embedded memory blocks (EMBs)
° Power consumption in EMBs
° Opportunities for power saving
  - Shut down clocks to memory core
° Three automated power saving techniques
  - Unused memory port shutdown
  - Memory control signal transform
  - Memory mapping to multiple blocks
° Experimental results

3 FPGA Embedded Memory Blocks
° Embedded memory blocks (EMBs) are important parts of FPGAs
° Consume roughly 14% of Altera Stratix II dynamic power *
  - Increasing in recent designs
* Stratix II Low Power Applications Note, 2005

4 Stratix II Embedded Memory Block: External View
° Input ports (data, address, control) are synchronous
° Mode 1: Single port (ignore Port B)
° Mode 2: True dual port
[Diagram: memory core with Port A and Port B; each port has Data In, Address, R/W Enable, Data Out, and clock enables]

5 Stratix II Embedded Memory Block: External View
° Mode 3: Simple dual-port
  - Large majority of RAM implementations
[Diagram: memory core with Port A (Data In, Address, Write Enable, clock enables) and Port B (Data In, Address, Read Enable, Data Out, clock enables)]

6 Embedded Memory Block Port Internal View
[Diagram labels: Write Data, Write Enable, Pulse Gen., Write Buffers, Column Mux, Sense Amps, Row Decode, Column Decode, RAM cell, BIT lines, Bit Line Pre-charge, Read Data, Read Enable, Latch, Address, Clk, Clk Enable, MClk]

7 Embedded Memory Block Port Read: Step 1
° Substantial power required to charge bit lines
° Precharge BIT lines to VCC
[Diagram labels: BIT lines, Bit Line Pre-charge, Clk, Clk Enable = 1, MClk]

8 Embedded Memory Block Port Read: Step 2
° Data read out of RAM cells
[Diagram labels: Address, Row Decode, Column Decode, Column Mux, Sense Amps, RAM cell, BIT lines, Bit Line Pre-charge, Clk, Clk Enable = 1, MClk]

9 Embedded Memory Block Port Read: Step 3
° Data passes through latch to Read Data lines
[Diagram labels: Read Data, Latch, Read Enable = 1, Address, Row Decode, Column Decode, Column Mux, Sense Amps, RAM cell, BIT lines, Bit Line Pre-charge, Clk, Clk Enable = 1, MClk]

10 Embedded Memory Block Port Read Summary
° If read clock enable = 0, steps 1 and 2 suppressed
° If read enable = 0, step 3 suppressed
[Diagram labels: Read Data, Latch, Read Enable, Address, Row Decode, Column Decode, Column Mux, Sense Amps, RAM cell, BIT lines, Bit Line Pre-charge, Clk, Clk Enable, MClk]

11 Embedded Memory Block Port Write: Step 1
° Substantial power required to charge bit lines
° Precharge BIT lines to VCC
[Diagram labels: BIT lines, Bit Line Pre-charge, Clk, Clk Enable = 1, MClk]

12 Embedded Memory Block Port Write: Step 2
° Data loaded into write buffers based on write enable
[Diagram labels: Write Data, Write Enable, Pulse Gen., Write Buffers, Column Mux, Sense Amps, Bit Line Pre-charge, Clk, Clk Enable = 1, MClk]

13 Embedded Memory Block Port Write: Step 3
° Data loaded into RAM cells
[Diagram labels: Write Data, Write Enable, Pulse Gen., Write Buffers, Column Mux, Sense Amps, Address, Row Decode, Column Decode, RAM cell, BIT lines, Bit Line Pre-charge, Clk, Clk Enable, MClk]

14 Embedded Memory Block Port Write Summary
° If write clock enable = 0, steps 1, 2, and 3 suppressed
° If write enable = 0, step 2 suppressed
[Diagram labels: Write Data, Write Enable, Pulse Gen., Write Buffers, Column Mux, Sense Amps, Row Decode, Column Decode, RAM cell, BIT lines, Bit Line Pre-charge, Clk, Clk Enable, MClk]
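The read and write summaries can be written down as a small truth-table model. This is only a sketch that encodes the summary bullets literally; the function name `active_steps` and the idea that an operation completes only when all three steps run are assumptions made for illustration, not part of the presentation.

```python
def active_steps(op, clk_enable, rw_enable):
    """Return the set of step numbers (1-3) that run in one cycle.

    op is "read" or "write"; rw_enable is the read enable or write enable.
    Encodes the two summary slides literally (illustrative model only).
    """
    if op not in ("read", "write"):
        raise ValueError("op must be 'read' or 'write'")
    steps = set()
    if op == "read":
        if clk_enable:
            steps |= {1, 2}      # precharge, then read out of the RAM cells
        if rw_enable:
            steps.add(3)         # latch passes data to the Read Data lines
    else:
        if clk_enable:
            steps |= {1, 3}      # precharge, then load the RAM cells
            if rw_enable:
                steps.add(2)     # load the write buffers
    return steps


def operation_completes(op, clk_enable, rw_enable):
    # Assumption: the access takes effect only if all three steps ran.
    return active_steps(op, clk_enable, rw_enable) == {1, 2, 3}
```

In this model a disabled clock enable removes the precharge step for either operation, which is the step the slides identify as power hungry.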

15 Reducing RAM Power Consumption
° Each RAM element can use an enabled or free running clock
° Use enabled clocks rather than free running clocks to prevent bit-line pre-charge
° Only enable RAMs when access is necessary
° Read enable not always specified by designer
  - Write enable created for functionality
[Diagram labels: Clk, Clk Enable, MClk]
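A rough way to see the benefit of an enabled clock is to count the cycles in which the bit lines get precharged. The sketch below assumes one precharge per clocked cycle and a made-up access trace; `precharge_cycles` is a hypothetical helper, not something from the presentation or from Quartus.

```python
def precharge_cycles(access_trace, gated):
    """Count cycles in which the memory core is clocked (bit lines precharged).

    access_trace: iterable of booleans, True when the design needs the RAM.
    gated: True if the clock enable follows the access need,
           False for a free running clock.
    """
    return sum(1 for needed in access_trace if needed or not gated)


# Hypothetical trace: the RAM is needed in 2 of 8 cycles.
trace = [True, False, False, True, False, False, False, False]
free_running = precharge_cycles(trace, gated=False)
enabled = precharge_cycles(trace, gated=True)
```

With this trace the free running clock precharges on all 8 cycles and the enabled clock on 2, so the saving tracks how often the RAM sits idle.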

16 Power Optimization #1
° For single-port memories
  - Tie Port B clock enable to GND
  - Previously tied high, with write enable disabled
[Diagram: memory core with Port A (Data In, Address, R/W Enable, Data Out, clock enables) in use; Port B clock shut off, Port B Write Enable = 0]

17 Single Port Optimization Experiments
° Determine power effect of shutting off clock to unused Port B
° Only impacts single-port RAM and ROM designs
° 43 Stratix II designs
  - Large customer designs with memory
  - Targeted to smallest achievable FPGA
  - Hand-generated input vectors
° Quartus 5.0
° Target maximum frequency

18 Memory Power: Port Optimization
° 9.2% average power reduction for designs with memories (only impacts ROMs and single port memories)
[Chart: Memory Dynamic Power; % power reduction per design]

19 Dynamic Power: Port Optimization
° 2.4% average power reduction for designs with memories (only impacts ROMs and single port memories)
[Chart: Dynamic Power; % power reduction per design]

20 FPGA RAM Processing
° FIFOs and shift registers converted into logical RAMs
° Logical RAMs broken into RAM blocks of sizes appropriate for physical implementation
° Each RAM block assigned to a physical embedded memory block
[Flow: FIFO, shift register, RAM specification -> Create Logical Memory -> Logical RAMs -> Logical-to-physical RAM processing -> RAM blocks/logic -> Memory/logic placement -> Placed memory]

21 FIFO Elaboration to Logical RAM
° Convert to logic and synchronous RAM with signal pattern found on EMB
[Diagram, before: FIFO with Clock, Data, Wrreq, Rdreq, Q]
[Diagram, after: logical RAM with Data, Write Address, Read Address, Write enable (from Wrreq), Read enable (from Rdreq), Q; Wr clk enable and Rd clk enable tied to Vcc; address counters implemented in LUTs/FFs]
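The elaboration can be pictured as a RAM plus two address counters, with Wrreq and Rdreq acting as the write and read enables. The following behavioral sketch is an illustration only: the class name `ElaboratedFifo`, the read-before-write ordering, and the omission of full/empty flags are all assumptions, not details from the slides.

```python
class ElaboratedFifo:
    """Behavioral model of a FIFO built from a RAM and two counters."""

    def __init__(self, depth):
        self.ram = [None] * depth  # the logical RAM
        self.wr_addr = 0           # write address counter (LUTs/FFs on the slide)
        self.rd_addr = 0           # read address counter (LUTs/FFs on the slide)
        self.q = None              # registered read output

    def clock(self, data=None, wrreq=False, rdreq=False):
        """Advance one clock edge; wrreq/rdreq act as write/read enable."""
        depth = len(self.ram)
        if rdreq:
            # Assumption: the read sees the RAM contents from before this edge.
            self.q = self.ram[self.rd_addr]
            self.rd_addr = (self.rd_addr + 1) % depth
        if wrreq:
            self.ram[self.wr_addr] = data
            self.wr_addr = (self.wr_addr + 1) % depth
        return self.q
```

A short usage example: after two `clock(data=..., wrreq=True)` calls, two `clock(rdreq=True)` calls return the values in the order they were written.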

22 Power Optimization #2
° Convert EMB read enable/write enable signals to associated read/write clock enable signals
° Limitations
  - Each port must have a dedicated read or write enable signal (simple dual-port)
  - Embedded memory block must have a read enable
[Diagram, before: Wren and Rden drive Write enable and Read enable; Wr clk enable and Rd clk enable tied to Vcc]
[Diagram, after: Wren and Rden drive Wr clk enable and Rd clk enable; Write enable and Read enable tied to Vcc]

23 Read Port Control Signal Equivalence
° Memory core inactive if read clock enable inactive
° Read operation will occur if both read enable and read clock enable are high
  - One signal could be tied to VCC
[Diagram labels: Read Address, Read Enable, Latch, Read Data, Row Decode, Column Decode, Column Mux, Write Buffers, RAM cell, BIT lines, Bit Line Conditioning, Clk, VCC, MClk]

24 Read Port Control Signal Equivalence
° If read clock enable = 0 and read enable = 1, read suppressed
° If read clock enable = 1 and read enable = 0, read suppressed
[Diagram labels: Read Data, Latch (enable tied to VCC), Address, Row Decode, Column Decode, Column Mux, Sense Amps, RAM cell, BIT lines, Bit Line Pre-charge, Clk, Read enable, MClk]

25 Write Port Control Signal Equivalence
° Memory core inactive if write clock enable is inactive
° Write operation will occur if both write enable and write clock enable are high
  - One signal could be tied to VCC
[Diagram labels: Write Data, Write Enable, Pulse Gen., Write Buffers, Column Mux, Write Address, Row Decode, Column Decode, RAM cell, BIT lines, Bit Line Conditioning, Clk, VCC, MClk]

26 Write Port Control Signal Equivalence
° If write clock enable = 0 and write enable = 1, write suppressed
° If write enable = 0 and write clock enable = 1, write suppressed
[Diagram labels: Write Data, Pulse Gen. (enable tied to VCC), Write Buffers, Column Mux, Sense Amps, Row Decode, Column Decode, RAM cell, BIT lines, Bit Line Pre-charge, Clk, Write enable, MClk]
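The equivalence argument for both read and write ports can be checked exhaustively, since each port has only two control inputs. The sketch below assumes, as the equivalence slides state, that an access occurs only when both the clock enable and the R/W enable are high; the function names are hypothetical.

```python
def operation_occurs(clk_enable, rw_enable):
    # Per the slides: the access happens only if both signals are high.
    return clk_enable and rw_enable


def before(user_enable):
    """(clock enable pin, R/W enable pin) before the transform."""
    return True, user_enable      # clock enable tied to VCC


def after(user_enable):
    """(clock enable pin, R/W enable pin) after the transform."""
    return user_enable, True      # R/W enable tied to VCC


for user_enable in (False, True):
    # Same functional behavior in both wirings...
    assert operation_occurs(*before(user_enable)) == operation_occurs(*after(user_enable))
    # ...but only the transformed wiring stops clocking the core when idle.
    assert before(user_enable)[0] is True
    assert after(user_enable)[0] is user_enable
```

The functional result is identical for every input, while the clock enable pin now follows the user's enable, so the memory core is clocked only on real accesses.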

27 Quartus II Implementation
° Conversion mode
  - Quartus II default
  - Ties off R/W enable to RAM clock enables
  - Doesn't make transform if CE already present on port
° Combining mode
  - AND user RAM clock enables with derived R/W clock enable
  - Could impact performance
[Diagram labels: Write Enable, User-defined Write Clk Enable, Clk, MClk]
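The two modes can be summarized as a choice of what drives each control pin of a port. This is a sketch under assumptions: `map_port_controls` is a hypothetical helper, not a Quartus II API, and tying the R/W enable pin high in combining mode is assumed by analogy with conversion mode rather than stated on the slide.

```python
def map_port_controls(rw_enable, user_clk_enable=None, combine=False):
    """Return (clock_enable_pin, rw_enable_pin) for one port in one cycle.

    rw_enable: the design's read or write enable.
    user_clk_enable: the design's own clock enable, or None if the port has none.
    combine: True for combining mode, False for conversion mode.
    """
    if user_clk_enable is None:
        # Conversion: the R/W enable becomes the clock enable.
        return rw_enable, True
    if combine:
        # Combining: AND the user's clock enable with the derived one.
        # Assumption: the R/W enable pin is then tied high.
        return (user_clk_enable and rw_enable), True
    # Conversion mode leaves a port that already has a clock enable untouched.
    return user_clk_enable, rw_enable
```

In every case the product of the two pins equals the original "clock enabled and R/W enabled" condition; only the combining case adds an AND gate in front of the clock enable, which is the extra logic the slide flags as a possible performance cost.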

28 Clock Enable Conversion Experiments
° 40 Stratix II RAM-based designs
° Quartus 5.1
° Target max frequency
° Quartus II simulation with test vectors
° Dynamic power evaluated with Quartus II PowerPlay power analyzer
° Covers the following optimizations
  - Automatic conversion of R/W enable to R/W clock enable
  - Combining of R/W enable with existing R/W clock enable

29 Memory Power: Clock Enable Optimization
° 9.7% average power reduction for convert and combine for all designs (6.3% for convert only)
[Chart: Memory Dynamic Power; % power reduction per design; series: Enable convert, Enable convert/combine]

30 Core Dynamic Power: Clock Enable Optimization
° 2.6% average power reduction for convert and combine for all designs (1.8% for convert only)
[Chart: Core Dynamic Power; % power reduction per design; series: Enable convert, Enable convert/combine]

31 Mapping RAM to Multiple EMBs
° User-defined memory often too large to fit in one EMB
° Must use RAM in multiple EMBs to implement logical RAM
° Implementation choice can impact design area, performance, and power
[Diagram: user-defined (logical) memory, 4K deep x 4 wide, 16K bits; physical (EMB) memory, M4K, 4K bits]

32 Memory Organization
° Each EMB can be configured to have different depth and width (e.g. Stratix II M4K)
° All hold 4K bits
° Slightly lower power consumption for wider EMB configurations (not including routing)
[Diagram: example configurations: 4K words deep x 1 bit wide; 512 words deep x 8 bits wide; 128 words deep x 32 bits wide]
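Because every configuration holds the same 4K bits, the depth follows directly from the width. The short sketch below enumerates power-of-two widths from 1 to 32, which covers the three configurations shown on the slide; the restriction to power-of-two widths and the function name `emb_configs` are assumptions, and any extra parity bits a real block may offer are ignored.

```python
EMB_BITS = 4096  # "All hold 4K bits" per the slide


def emb_configs(bits=EMB_BITS, max_width=32):
    """List (depth, width) pairs with depth * width == bits."""
    configs = []
    width = 1
    while width <= max_width:
        configs.append((bits // width, width))
        width *= 2
    return configs


for depth, width in emb_configs():
    print(f"{depth} words deep x {width} bits wide")
```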

33 Area and Delay Optimal Mapping
° Configure each EMB to be as deep as possible
° Number of address bits on each EMB same as on logical memory
° Area and performance efficient: no external logic needed
° Power inefficient: all EMBs must be active during each logical RAM access
[Diagram: vertical slicing. Logical memory 4K words deep x 4 bits wide, built from four EMBs each 4K words deep x 1 bit wide; Addr[0:11], Data[0:3]; 4 EMBs active during access]

34 Alternative Mapping
° Configure EMB to have width of logical RAM (e.g. 1Kx4)
  - Allows shutdown of some RAMs each cycle
  - But adds some logic
° Saves RAM power, adds combinational logic and register power
° More power efficient: 1 EMB active during access
[Diagram: horizontal slicing. Logical memory 4K words deep x 4 bits wide, built from four EMBs each 1K deep x 4 wide; Addr[0:9] to each EMB; Addr[10:11] to address decoder and 4-way output mux; Data[0:3]]
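The difference between the two mappings comes down to simple counting. The sketch below computes how many EMBs a mapping needs, how many are active per access, and how wide the output mux must be; `slice_ram` is a hypothetical helper, and it assumes that only one depth-wise slice is enabled per access.

```python
def ceil_div(a, b):
    return -(-a // b)


def slice_ram(logical_depth, logical_width, emb_depth, emb_width):
    """Count EMBs for a logical RAM built from identically configured EMBs."""
    rows = ceil_div(logical_depth, emb_depth)   # depth-wise slices
    cols = ceil_div(logical_width, emb_width)   # width-wise slices
    return {
        "total_embs": rows * cols,
        "active_embs": cols,   # one row of EMBs is enabled per access
        "mux_ways": rows,      # output mux selects among the rows
    }


vertical = slice_ram(4096, 4, emb_depth=4096, emb_width=1)
horizontal = slice_ram(4096, 4, emb_depth=1024, emb_width=4)
```

For the 4K x 4 example, both mappings use four EMBs, but vertical slicing keeps all four active with no mux, while horizontal slicing activates one EMB and needs a 4-way mux driven by Addr[10:11].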

35 RAM Slicing: Example
° Power reduction available with different slicing
[Chart: 4kx32 Dynamic Power; x-axis: Maximum Depth (128, 256, 512, 1k, 2k, 4k); y-axis: Dynamic Power (mW); annotations: Best range, Multiplexer Power Increasing, EMB Power Increasing]

36 Power Optimization #3: Power-aware RAM Partitioning
° Power optimal EMB configuration often between "horizontal" and "vertical"
° Need algorithm to consider possible logical to physical RAM mappings
[Flow: FIFO, Shift Register -> Create Logical Memory -> Logical RAMs -> Logical to Physical RAM processing (Power-aware RAM Partitioner, Insert Decode and Mux Logic) -> RAM blocks/Logic -> Memory/Logic Placement -> Completed placement]

37 Power-aware RAM Partitioning Algorithm
° For each EMB type
  - For each EMB depth versus width configuration
    - Determine number of required EMBs, decoder, and output mux circuits
    - Estimate power of RAM access (active EMBs, decoder, and output mux)
    - Limit to four-way muxing at most
  - Save lowest power configuration
° Rank possible EMB implementations by power
° Select lowest-power, feasible choice
  - Check if EMB usage overflowed by choice
  - If yes, select next choice
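The loop structure of the algorithm can be sketched as follows. This is a minimal illustration, not the Quartus II implementation: the power model (a fixed cost per active EMB plus a cost per output bit per extra mux way, in arbitrary integer units), the power-of-two width enumeration, and every name in the code are assumptions.

```python
def ceil_div(a, b):
    return -(-a // b)


def best_config(depth, width, bits, max_width, emb_power, mux_power, max_mux_ways=4):
    """Lowest-power (power, emb_count, emb_depth, emb_width) for one EMB type."""
    best = None
    emb_width = 1
    while emb_width <= max_width:
        emb_depth = bits // emb_width
        rows = ceil_div(depth, emb_depth)     # output mux ways
        cols = ceil_div(width, emb_width)     # EMBs active per access
        if rows <= max_mux_ways:              # limit to four-way muxing
            # Hypothetical power estimate: active EMBs plus mux/decoder logic.
            power = cols * emb_power + (rows - 1) * width * mux_power
            candidate = (power, rows * cols, emb_depth, emb_width)
            if best is None or candidate < best:
                best = candidate
        emb_width *= 2
    return best


def partition(depth, width, emb_types, available):
    """Pick the lowest-power EMB type whose block count fits what is available."""
    choices = []
    for name, spec in emb_types.items():
        best = best_config(depth, width, **spec)
        if best is not None:
            choices.append((best[0], name, best[1], best[2], best[3]))
    for power, name, count, emb_depth, emb_width in sorted(choices):
        if count <= available.get(name, 0):   # overflow check
            return name, (emb_depth, emb_width), count, power
    return None


# Hypothetical coefficients for a 4K-bit block type.
emb_types = {"M4K": dict(bits=4096, max_width=32, emb_power=100, mux_power=1)}
choice = partition(4096, 32, emb_types, available={"M4K": 32})
```

With these made-up coefficients the 4K x 32 logical RAM lands on a 1K x 4 configuration, between the fully vertical and fully horizontal extremes. A real partitioner would take its power numbers from characterized EMB, decoder, and mux data.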

38 Experimental Approach
° Simulation and power estimation performed
  - Multi-bit input multiplexers
  - Decoders
  - EMB blocks in different configurations
° 40 designs evaluated
° Quartus 5.1
° Mapped to smallest possible device and target max frequency
° Simulation with test vectors, power analysis with PowerPlay
° Approach used in combination with clock enable conversion and combining

39 Memory Power
° 21.0% average power reduction for all techniques for memory designs (9.7% with only enable convert/combine)

40 Overall Core Dynamic Power
° 6.8% average power reduction for all techniques for memory designs (2.6% with convert/combine)
[Chart: % dynamic power reduction per design; series: Enable convert/combine, Enable convert/combine + mem partition]

41 Design Performance
° 1.0% average performance loss for all techniques (0.1% for enable convert/combine)
[Chart: Average Design Clock Frequency; % frequency improvement per design; series: Enable Convert/Combine, Enable Convert/Combine + Mem Partition]

42 Results Summary
° Almost 7% core dynamic power reduction across all designs
  - Some designs benefit more than others
° Minimal clock frequency hit for most designs

                        Enable     Enable convert/   Enable convert/combine
                        convert    combine           + mem partition
Core dynamic power      -1.8%      -2.6%             -6.8%
Memory dynamic power    -6.3%      -9.7%             -21.0%
Max clk freq            -0.1%      -0.2%             -1.0%
LUT count                0.0%       0.1%              0.7%

43 Impact of Multiple Embedded Memory Blocks
° Rerun 40 designs but only allow one type of target EMB for each mapping
° All designs targeted to Stratix II EP2S180
° Significant power impact for most designs versus EP2S180 target with no restrictions

                        M512       M4K       M-RAM
Designs completed       23         38        4
Core dynamic power      40.4%      6.6%      47.3%
Memory power            279.5%     33.3%     754.0%
Max clk freq.           -2.2%      0.6%      -1.0%
LUT count               0.4%       -0.5%     0.0%

44 Summary
° Key to reducing RAM power is keeping clocks disabled
° Single-port RAMs are a straightforward optimization
° Movement of read/write enables to clock enables limits dynamic activity
° Power-aware RAM partitioner attempts to select power-optimal mapping, combined with clock enable enhancement
° Overall
  - About 30% average memory power reduction
    - 9% single port optimization
    - 21% enable convert/combine and memory partitioning
  - About 9% average dynamic power reduction
    - 2% single port optimization
    - 7% enable convert/combine and memory partitioning

