Arria 10 & Stratix 10 EMIF Architecture


1 Arria 10 & Stratix 10 EMIF Architecture
Today I would mainly like to share the new EMIF architecture of Arria 10.

2 Agenda
- Arria 10 EMIF Introduction
- Arria 10 EMIF Architecture
- Stratix 10 EMIF: Key differences from Arria 10
- Arria 10 EMIF IP: altera_emif

3 Altera EMIF IP: Overview
Diagram labels: Memory Model, Controller, PHY, Driver (Traffic Generator), Avalon, Altera EMIF IP Core, Memory, AFI, Example Design, Example Testbench, Pass/Fail
- Configurable GUI
- Generates IP, example design, and simulation testbench
- Provides simple interface for user logic
  - Avalon interface
- Handles complexity of FPGA to memory device interface

4 Summary of Arria 10 EMIF Support
Table columns: Support Model | PHY | Controller | Protocols | Width, Depth
- Tier 1 (Turn-key): Hard; DDR MHz, up to 144 bit, 4 rank; DDR3/3L 1066 MHz
- Tier 2 (Turn-key):
  - Soft (3rd Party): RLDRAM MHz, up to 72 bit, 1CS (1)
  - Soft: QDR II+/II+Xtreme 633 MHz, up to 72 bit, 1CS; QDR IV 1066 MHz; LPDDR3 800 MHz; up to 144 bit, 4CS; DDRx Ping Pong PHY, up to 2x72 bit, 2x2CS
  - PHY-only, customer designed: DDR4, DDR3, RLDRAM 3, LPDDR3
- Tier 3 (Customer Owned): customer-built on top of PHYlite; customer builds PHY, sequencer, and controller
  - TCAM, Flash, DDR2, QDR II, DDR II/II+, LLDRAM II, LPDDR2, Mobile DDR
  - Other emerging protocols
  - Promoted to turn-key or PHY-only if justified
Legend: Tier 1 protocol, Tier 2 protocol, Tier 3 protocol
(1) Can support 2CS for DDP, with no tracking, some performance loss

5 Arria 10 EMIF Architecture

6 Arria 10 I/O Sub-System
Implemented as 2 columns
- Each column has up to 8 tiles
- Tile = Bank (e.g. 2A, 2B, …)
Supports
- General Purpose I/Os (GPIO)
  - I/O registers and I/O buffers
  - On-chip Termination Control (OCT)
- PLLs
  - IOPLL for EMIF and user logic
- LVDS
- External Memory Interfaces (EMIF)
  - Hard memory controller
  - Hard PHY
  - Hard Nios / calibration logic
  - DLL
- HSSI

7 A10 EMIF Architecture: I/O Tile
Each I/O tile has:
- Hard memory controller
- Sequencer
- PLL and PHY clock trees
- Clock phase alignment (CPA)
- DLL
- DQSin clock trees
- 48 pins, organized into 4 I/O lanes
  - Up to 4 x8/x9 read data groups, using 1 lane per group
  - Up to 2 x18 read data groups, using 2 lanes per group
  - 1 x36 read data group, using all 4 lanes
  - Address/command pins
  - PLL ref clock pins and RZQ pins
  - Pins unused by EMIF can be GPIOs
Diagram labels: to/from tile i+1, to/from FPGA core, to/from pins, to/from tile i-1
A single tile has all the hardware needed to build an EMIF (e.g. DDR3 x8). A wider EMIF is built by stitching together multiple tiles.
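The lane arithmetic on this slide can be written out as a small Python sketch. The constant and function names are illustrative, not part of any Altera tool; the numbers (48 pins, 4 lanes, and 1, 2 or 4 lanes per read data group) are the ones quoted above.

```python
# Figures quoted on the slide: 48 pins per tile, organized into 4 I/O lanes.
PINS_PER_TILE = 48
LANES_PER_TILE = 4
PINS_PER_LANE = PINS_PER_TILE // LANES_PER_TILE

# Lanes consumed by one read data group of each width (from the slide).
LANES_PER_GROUP = {"x8/x9": 1, "x18": 2, "x36": 4}


def max_groups_per_tile(group_width):
    """Return how many read data groups of this width fit in one tile."""
    return LANES_PER_TILE // LANES_PER_GROUP[group_width]


if __name__ == "__main__":
    print(f"pins per lane: {PINS_PER_LANE}")
    for width in LANES_PER_GROUP:
        print(f"{width}: up to {max_groups_per_tile(width)} group(s) per tile")
```

The sketch ignores the lanes taken by address/command pins, so it gives an upper bound per tile rather than a usable data width.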

8 A10 EMIF Architecture: I/O Column
- One hard Nios with memory (IOAUX)
  - Calibration of all EMIFs in one column
  - Memory stores:
    - “Super-calibration” algorithm (supports all protocols)
    - Calibration runtime data
  - Cannot be re-purposed for other usage
  - Can be interfaced with soft logic via Avalon-MM bus, e.g. debug toolkit access
    - Note: need extra debug component in some cases
  - Clocked using an on-die oscillator
- Up to 13 I/O tiles
  - A tile corresponds to an I/O sub-bank (e.g. 3B)

9 Building an A10 EMIF Interface
An EMIF is built from:
- IOAux (hard Nios)
- One or more consecutive tiles
- One tile designated as the A/C tile
  - Contains all A/C pins
  - Fixed A/C pinout within tile (see altera.com)
  - Unused lanes can be used for data
  - Contains active hard controller (if used)
Diagram labels: Tile 2, Tile 1, Tile 0

10 Example: DDR3 x8 w/ Hard Controller
DDR3 x8 w/ Hard Controller:
- 1 tile
- 3 lanes for A/C pins
- 1 lane for data
- 1 hard controller
- 1 sequencer
- 1 PLL and set of PHY clock trees
Diagram labels: Tile 2, Tile 1, Tile 0; Controller, Sequencer, PLL, CPA; A/C lane 0, A/C lane 1, A/C lane 2, DQ Group 0

11 Example: DDR3 x72 w/ Hard Controller
DDR3 x72 w/ Hard Controller:
- 3 tiles
- 3 lanes for A/C pins
- 9 lanes for data
- Middle tile used as A/C tile
- 1 hard controller
- 1 sequencer
- 3 PLLs and 3 sets of PHY clock trees
Diagram labels: Tile 2, Tile 1, Tile 0. One outer tile holds DQ Groups 0 to 3 and a PLL; the middle tile holds the controller, sequencer, PLL, CPA, A/C lanes 0 to 2 and DQ Group 4; the other outer tile holds DQ Groups 5 to 8 and a PLL.
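The two DDR3 examples follow the same lane budget, sketched below in Python. It assumes 3 A/C lanes (the figure both slides use), one lane per x8 data group, and 4 lanes per tile; `emif_budget` and its field names are hypothetical and only reproduce the slide arithmetic, not the rules the IP or Fitter actually apply.

```python
import math

LANES_PER_TILE = 4
AC_LANES = 3  # both DDR3 examples on the slides use 3 lanes for A/C pins


def emif_budget(data_width_bits, bits_per_group=8):
    """Estimate lanes and tiles for a DDR3 interface with a hard controller.

    Assumes one I/O lane per x8 data group, as in the slide examples.
    """
    data_lanes = math.ceil(data_width_bits / bits_per_group)
    total_lanes = AC_LANES + data_lanes
    tiles = math.ceil(total_lanes / LANES_PER_TILE)
    return {
        "data_lanes": data_lanes,
        "total_lanes": total_lanes,
        "tiles": tiles,
        # Index of the middle tile, where the slides place the A/C tile.
        "ac_tile": tiles // 2,
    }


if __name__ == "__main__":
    for width in (8, 72):
        print(f"DDR3 x{width}: {emif_budget(width)}")
```

For x8 this gives 1 data lane in a single tile, and for x72 it gives 9 data lanes across 3 tiles with the A/C tile at index 1, matching the two slides.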

12 Example: 2 x16 interfaces sharing a tile
Diagram labels: x12 I/O lane, delay chain, control, write buffer, read buffer, controller, sequencer, DBC. Each of the two interfaces has its own fixed address/command pinout; the remaining lanes are marked either as data path or as unused (free for GPIO, but not LVDS).

13 Rules/Guidelines for Address/Command Pins
- A/C pins of an interface must be in a single tile
  - In a multi-tile interface, the A/C tile should be in the middle to avoid a latency penalty
- A lane must not be used by both A/C and data pins
  - But pins unused by EMIF can be used by GPIOs
- A/C pins must follow predefined locations within a tile
  - For soft controller, rule may be relaxed in a future release
- Every protocol and format has a different pin mapping scheme
  - Scheme documented in the “readme” file dynamically generated by MW/QSYS
  - Also documented in pin tables
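As a rough illustration, the first two rules (plus the middle-tile guideline) can be checked mechanically. The Python sketch below uses a made-up data model in which each pin is a `(tile, lane, role)` tuple; it is not how Quartus represents pin assignments, and it does not check the fixed per-protocol A/C pin locations.

```python
def check_ac_rules(pins):
    """Return a list of rule violations for one interface.

    pins: iterable of (tile, lane, role), with role in {"ac", "data", "gpio"}.
    This tuple layout is a hypothetical model chosen for the sketch.
    """
    pins = list(pins)
    problems = []

    # Rule: all A/C pins of an interface must be in a single tile.
    ac_tiles = {tile for tile, _, role in pins if role == "ac"}
    if len(ac_tiles) > 1:
        problems.append("A/C pins span more than one tile")

    # Rule: a lane must not be used by both A/C and data pins.
    roles_by_lane = {}
    for tile, lane, role in pins:
        roles_by_lane.setdefault((tile, lane), set()).add(role)
    for (tile, lane), roles in sorted(roles_by_lane.items()):
        if {"ac", "data"} <= roles:
            problems.append(f"tile {tile} lane {lane} mixes A/C and data pins")

    # Guideline: in a multi-tile interface the A/C tile should be in the middle.
    emif_tiles = sorted({t for t, _, r in pins if r in ("ac", "data")})
    if len(emif_tiles) >= 3 and len(ac_tiles) == 1:
        ac_tile = next(iter(ac_tiles))
        if ac_tile in (emif_tiles[0], emif_tiles[-1]):
            problems.append("A/C tile is at the edge of the interface")
    return problems


if __name__ == "__main__":
    # Layout modelled on the DDR3 x72 example: A/C lanes in the middle tile.
    x72 = [(1, 0, "ac"), (1, 1, "ac"), (1, 2, "ac"), (1, 3, "data")]
    x72 += [(t, lane, "data") for t in (0, 2) for lane in range(4)]
    print(check_ac_rules(x72))
```

GPIO pins sharing a lane with EMIF pins are deliberately not flagged, since the slide allows pins unused by EMIF to serve as GPIOs.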

14 Sharing Address/Command Pins – “Ping-Pong PHY”
Why?
- Saves pins
- Time-multiplexed A/C pins shared by two logically independent controllers
Criteria:
- Same protocol, rate, frequency
- DDR3 and DDR4 only
- 8/16/32/64 with or without ECC enabled
- Maximum 2CS (or dual-rank)
- Controllers must be in two adjacent tiles
  - Upper one drives the external A/C bus and will use all 4 I/O lanes
  - Lower one sends A/C output to upper tile
How?
- Enable DDRx Ping-Pong PHY mode
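The criteria above lend themselves to a simple eligibility check. The following Python sketch is only an illustration: the `Interface` record and its fields are invented for the example, and it assumes that "8/16/32/64" refers to the data width in bits before any ECC bits are added.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    """Hypothetical description of one controller's interface."""
    protocol: str      # e.g. "DDR3" or "DDR4"
    rate: str          # e.g. "quarter" or "half"
    freq_mhz: float
    data_width: int    # assumed to exclude ECC bits
    chip_selects: int
    tile: int          # index of the tile holding the controller


def ping_pong_problems(a, b):
    """Return the Ping-Pong PHY criteria that the pair (a, b) fails."""
    problems = []
    if (a.protocol, a.rate, a.freq_mhz) != (b.protocol, b.rate, b.freq_mhz):
        problems.append("protocol, rate and frequency must match")
    if any(i.protocol not in ("DDR3", "DDR4") for i in (a, b)):
        problems.append("only DDR3 and DDR4 are supported")
    if any(i.data_width not in (8, 16, 32, 64) for i in (a, b)):
        problems.append("data width must be 8, 16, 32 or 64")
    if any(i.chip_selects > 2 for i in (a, b)):
        problems.append("maximum of 2 chip selects")
    if abs(a.tile - b.tile) != 1:
        problems.append("controllers must be in two adjacent tiles")
    return problems


if __name__ == "__main__":
    upper = Interface("DDR3", "quarter", 1066.0, 32, 2, tile=3)
    lower = Interface("DDR3", "quarter", 1066.0, 32, 2, tile=2)
    print(ping_pong_problems(upper, lower))
```

An empty list means the pair passes every criterion listed on the slide; the check says nothing about pin placement within the two tiles.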

15 Clocking Architecture
- Improved PHY clock trees
  - Every tile has a PLL and a PHY clock tree
    - Can only drive I/Os in the same tile
  - A multi-tile interface uses multiple PLLs and PHY clock trees
    - Recall: a 28nm interface spanning multiple sub-banks is driven by one PLL and PHY clock tree
  - Shorter PHY clock trees enable lower jitter and DCD to achieve GHz performance
- Clocks in a multi-tile EMIF are synchronized by:
  - Restricting PLL ref clock frequencies and M/N values for a given EMIF speed
    - EMIF GUI shows valid options
    - New concept for customers
  - Using a Reference Clock Tree to route a common ref clock signal to all PLLs
    - New hardware in Arria 10
    - Handled automatically by Fitter and should be completely transparent to customers

16 Transfers between Core and Periphery
- C2P/P2C timing closure was challenging for 28nm
  - Cause 1: Long delays for signals crossing the C2P/P2C boundary (esp. corners)
  - Cause 2: High skew between core clock and PHY clock
- 20nm improvements
  - Much lower delays crossing C2P/P2C boundary
  - Clock phase alignment (CPA) dynamically aligns core clock to PHY clock
    - Should completely eliminate the need to manually tune clock phases to close C2P/P2C timing
    - Turned on automatically for all Arria 10 EMIFs (but not in ACDS 13.1)
    - Caveat: Multi-tile challenges due to CCPR
Diagram labels: Global, To FPGA core…, CPA

17 Clock Phase Alignment Block (CPA)
In Arria 10, EMIF core clock networks are always driven by CPA.
Diagram label: User Logic

18 Power Domains

19 Sharing Resources
- Often desirable to share resources between interfaces
- Certain resources cannot be or do not need to be shared
  - Hard memory controller: not sharable
  - PLL/DLL: do not need to be shared
- In Arria 10, the following resources can be shared:
  - I/O tile
  - Hard Nios
  - PLL ref clock pins
  - Core clock networks
  - OCT block and RZQ pin
  - Address/command pins (Ping-Pong PHY)
- More information in the External Memory Interface Handbook

20 Stratix 10 EMIF Architecture: Key Differences from Arria 10

21 Key EMIF Architecture Differences from Arria 10
Performance:
- Topline DDR4 spec remains at 1333 MHz
  - HMC operates in half-rate up to 667 MHz (w/ QR conversion registers)
- Mid speed grade DDR4 performance boosted to 1200 MHz (A10 is 1066 MHz)
Size:
- ND7 has up to 3 I/O columns
- Up to 12 I/O tiles per column
Features:
- Support native x4 with full set of shadow registers
- Add DDR4 3DS support (details TBD)
- IOPLL self-calibration
Architecture:
- New device configuration scheme
  - Share hard Nios with device configuration block
  - No 3V I/O tile; 3V support now embedded in SDM sector
- Redesigned (shorter) PHY clock tree
  - Target DDR4 JEDEC spec of +/- 2% DCD on CK/CK#
- A10 HPS → S10 APS

22 Stratix 10 Performance Targets (for fast speedgrade)
Table columns: Tier | Memory Standard | Configuration | Speed (MHz) for 1 Rank/CS, 2 Rank/CS, 4 Rank/CS | Max Freq, User, MC, PHY
Tier 1:
- DDR4 UDIMM / RDIMM / SODIMM / Device: 1333 333 667, 1200 300 600, 933 233 467
- DDR4 LRDIMM
- DDR3/3L UDIMM / RDIMM / SODIMM / Device: 1067 267 533 167
- DDR3/3L LRDIMM
Tier 2:
- LPDDR3: 800 200 400
- RLDRAM3: 300* 267* 233*
- DDRx Ping-Pong PHY
- QDR IV XP
- QDR IV HP
- QDR II+Xtreme: 633 317
- QDR II+: 550 275
- QDR II: 350 175
* Supported via 3rd party soft controller
Color legend: purple = quarter-rate (QR), red = half-rate (HR)
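The number triples in this table (for example 1333 / 333 / 667) are consistent with dividing the memory clock by four for quarter-rate and by two for half-rate. The Python sketch below reproduces that arithmetic. It assumes that nominal values such as 1333, 1067 and 933 MHz stand for 1333.33, 1066.67 and 933.33 MHz; the function name and the mapping are illustrative only.

```python
def derived_clocks_mhz(mem_clock_mhz):
    """Return (quarter_rate, half_rate) clocks in MHz, rounded to integers."""
    return round(mem_clock_mhz / 4), round(mem_clock_mhz / 2)


# Nominal table value -> assumed exact memory clock in MHz.
EXACT_MHZ = {
    1333: 4000 / 3,
    1200: 1200.0,
    1067: 3200 / 3,
    933: 2800 / 3,
    800: 800.0,
}

if __name__ == "__main__":
    for nominal, exact in EXACT_MHZ.items():
        quarter, half = derived_clocks_mhz(exact)
        print(f"{nominal} MHz memory clock: "
              f"quarter-rate {quarter} MHz, half-rate {half} MHz")
```

Using the exact frequencies rather than the rounded nominal ones avoids half-way cases such as 1333 / 2 = 666.5, which Python's `round` would take down to 666.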

23 Arria 10 EMIF IP: altera_emif

24 What is altera_emif?
- Find in GUI as “Arria 10 External Memory Interface IP”
- New EMIF IP
  - Delivers a complete interface solution through a single IP
  - Will no longer be using names such as “UniPHY” and “HPCII”
- Supports latest FPGA families: Arria 10
  - Will re-use for Stratix 10 and beyond
- Supports all memory protocols through a single IP
  - Protocol is simply a parameter
- Supports both PHY-only and PHY+controller configuration

25 Notable Changes in altera_emif
- Single entry point for all protocols
- Faster and more robust generation mechanism
- Automatic pin assignments
- More flexible example design generation flow
- Dynamically generated data-sheet (readme.txt)
- No-warning policy
- No RTL connections from OCT to I/Os
- Simplified filesets
- Simulation model accuracy and simulation speed
- Improved timing analysis
- Roadmap for Debug Toolkit and Example Driver enhancements

26 Single Entry-Point for All Protocols
Screenshot callouts:
- DDR3 Presets
- RLDRAM3 Presets
- Preset filtering tool
- GUI and default values updated as you change protocol selection

27 General altera_emif Parameters

28 I/O Parameters

29 Memory Topology Parameters

30 Memory Timing Parameters

31 Memory Timing Parameters

32 Board Timing Parameters

33 Board Timing Parameters

34 Controller Parameters

35 Diagnostics Parameters

36 Generated example and simulation example design

37 Generated Filesets

38 Generated .qip File

39 Example Design Fileset
- readme.txt: instructions for the customer
- ed_sim.qsys: .qsys file capturing the simulation example design
- ed_synth.qsys: .qsys file capturing the QII example design
  - Both .qsys files are loadable into QSYS to add more components or to incorporate into a user system
- make_qii_design.tcl: script to generate the QII example design project
- make_sim_design.tcl: script to generate the simulation example design project
  - Execute these scripts to fully elaborate self-contained example designs

40 Elaborating Example Design
QII project:
    quartus_sh -t make_qii_design.tcl [device_name]
Simulation example design:
    quartus_sh -t make_sim_design.tcl [VERILOG|VHDL]
Notes:
- Scripts take less than 1 minute to execute
- You can re-run them as you need
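If the two scripts need to be driven from a build flow, the command lines above can be assembled programmatically. The Python sketch below only builds the argument lists shown on the slide. It assumes the square brackets mark optional arguments, and the helper names and the `run_in` wrapper are illustrative rather than part of any Altera tooling.

```python
import subprocess


def qii_design_cmd(device_name=None):
    """Build the command that generates the QII example design project."""
    cmd = ["quartus_sh", "-t", "make_qii_design.tcl"]
    if device_name:
        cmd.append(device_name)  # assumed optional, per the [device_name] notation
    return cmd


def sim_design_cmd(hdl=None):
    """Build the command that generates the simulation example design."""
    cmd = ["quartus_sh", "-t", "make_sim_design.tcl"]
    if hdl is not None:
        hdl = hdl.upper()
        if hdl not in ("VERILOG", "VHDL"):
            raise ValueError("hdl must be VERILOG or VHDL")
        cmd.append(hdl)
    return cmd


def run_in(example_design_dir, cmd):
    """Run a command inside the generated example design directory."""
    return subprocess.run(cmd, cwd=example_design_dir, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them; quartus_sh may not be on PATH.
    print(" ".join(qii_design_cmd()))
    print(" ".join(sim_design_cmd("vhdl")))
```

Since the slide notes that the scripts can be re-run as needed, calling `run_in` repeatedly on the same directory should be safe, though that has not been verified here.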

41 Synthesizable Example Design
Diagram labels: qii_emif_ex_design, altera_emif, arch_nf, Avalon Traffic Gen; signals global_reset_n, soft_reset_n, pll_ref_clk, EMIF status

42 Simulation Example Design
Diagram labels: sim_emif_ex_design, altera_emif, arch_nf, Sim clock source, Sim reset source, Sim memory model, Avalon Traffic Gen, Sim pass/fail checker

43 Making Pin Location Assignment

44 RTL View: PHY + Hard Controller
altera_emif (arch_nf) instantiates:
- PLL (iopll)
- Nios (ioaux)
- Hard controller, PHY and CPA (tiles, lanes, buffers)
Diagram labels: global_reset_n, soft_reset_n, pll_ref_clk, EMIF, emif_usr_clk, emif_usr_reset_n, Ctrl AMM / AST, Ctrl MMR (opt), Ctrl sideband (opt)
Diagram regions: FPGA Core (user logic), FPGA Core (emif soft logic), FPGA Periphery, Board

45 RTL View: PHY + Soft Controller
altera_emif (arch_nf) instantiates:
- PLL
- Nios
- PHY and CPA
Diagram labels: global_reset_n, soft_reset_n, pll_ref_clk, EMIF, Soft controller, emif_usr_clk, emif_usr_reset_n, afi_clk, afi_half_clk, afi_reset_n, AFI, Ctrl AMM / AST, Ctrl MMR (opt), Ctrl sideband (opt)
Diagram regions: FPGA Core (user logic), FPGA Core (emif soft logic), FPGA Periphery, Board

46 RTL View: PHY-Only
altera_emif (arch_nf) instantiates:
- PLL
- Nios
- PHY and CPA
Diagram labels: global_reset_n, soft_reset_n, pll_ref_clk, EMIF, afi_clk, afi_half_clk, afi_reset_n, AFI
Diagram regions: FPGA Core (user logic), FPGA Core (emif soft logic), FPGA Periphery, Board

47 RTL View: Inside arch_nf
- Fitter merges IOAUX for interfaces in the same column
- Fitter can rotate pins within a lane based on pin assignments, but can’t move pins across lanes
- Fitter duplicates PLL for multi-tile interfaces and merges PLL when a tile is shared
- IP RTL instantiates the minimum tiles/lanes required
  - Fitter re-allocates physical tiles based on pin assignments
  - RTL assumes A/C tile is in the middle of interface
Diagram labels: arch_nf, IOAUX, IOPLL, TILE_CTRL (x2), io_12_lane (x8), I/O Buffers
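The "rotate within a lane, never across lanes" constraint can be expressed as a check on two pin maps. The Python sketch below is only a model: it assumes pins in a tile are indexed 0 to 47, that each lane is a consecutive block of 12 pins, and that "rotate" means any rearrangement within a lane. None of this is taken from Fitter documentation.

```python
PINS_PER_LANE = 12  # 48 pins per tile in 4 lanes; contiguous numbering is assumed


def lane_of(pin_index):
    """Map a pin index within a tile to its lane, under the assumed numbering."""
    return pin_index // PINS_PER_LANE


def is_within_lane_rotation(before, after):
    """Return True if every signal stays in its original lane.

    before, after: dicts mapping signal name -> pin index within the tile.
    """
    if before.keys() != after.keys():
        return False
    if len(set(after.values())) != len(after):
        return False  # two signals would land on the same pin
    return all(lane_of(before[s]) == lane_of(after[s]) for s in before)


if __name__ == "__main__":
    rtl = {"dq0": 0, "dq1": 1, "dq8": 12}
    swapped = {"dq0": 1, "dq1": 0, "dq8": 12}   # swap inside lane 0
    moved = {"dq0": 13, "dq1": 1, "dq8": 12}    # dq0 moved into lane 1
    print(is_within_lane_rotation(rtl, swapped))
    print(is_within_lane_rotation(rtl, moved))
```

A check like this could flag pin assignments that would require moving a pin across lanes before the design is handed to the Fitter.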

48 Thank You Questions?

