Lecture 11: External SRAM

Lecture 11: External SRAM UCSD ECE 111 Prof. Farinaz Koushanfar Fall 2016 Some slides courtesy of MIT 6.111 Instructor: Chris Terman http://web.mit.edu/6.111/www/f2008/

Memories: A practical primer

Memories in Verilog

Multi-port memories, a.k.a. register files

FIFOs

FIFO in action

FPGA memory implementation

LUT-based RAMs

LUT-based RAM module

Tools often build it for you

BRAM
Block RAM (BRAM) is a configurable random access memory module that is embedded throughout an FPGA for data storage. Use BRAM to:
- Transfer data between multiple clock domains.
- Transfer data between an FPGA target and a host processor.
- Transfer data between FPGA targets.
- Store large data sets on an FPGA target more efficiently than RAM built from look-up tables.

Block Memories (BRAM)

Memory classification and metrics

Overview
Random Access Memory (RAM) is used for bulk storage. A register file is faster and more flexible, but it is feasible only for small storage because of its large circuit size. A read or write operation on an asynchronous static RAM (SRAM) requires that the data, address, and control signals be asserted in a specific order and remain stable for a certain amount of time. The SRAM is therefore accessed through a memory controller, which ensures that these requirements are met.

Static RAM: Latch-based memory

Memory array architecture

Static SRAM cell (6T cell)

Using external memory devices

Basic Memory Controller
The memory controller provides a ‘synchronous wrap’ around the SRAM.
[Figure: memory controller between the main system and the SRAM; extracted signal labels: data_f2s_r, data_s2f, data_f2s_ur]

Basic Memory Controller
When the main system wants to access the memory, it places the address and, for a write operation, the data on the bus, then activates mem and sets r/w. At the rising edge of the clock, the memory controller samples all signals and performs the requested operation.

Basic Memory Controller
Ports on the side of the main system:
- mem: asserted (1) to initiate a memory operation
- r/w: specifies whether the operation is a read (1) or a write (0)
- addr: 18-bit address
- data_f2s: 16-bit data to be written from the FPGA to the SRAM
- data_s2f_r: 16-bit registered data retrieved from the SRAM to the FPGA
- data_s2f_ur: 16-bit unregistered data retrieved from the SRAM to the FPGA
- ready: indicates whether the controller is ready for a new command

Block Diagram of a Memory Controller
- The FSM follows the timing diagrams in Figures 11.2 and 11.3 to generate a proper control sequence.
- Two data registers: one each for read and write operations.
- A tri-state buffer.

MCM 6264C 8k * 8 SRAM

Functional Table of SRAM
[Table: operation (disabled, read, write) versus ce_n, we_n, oe_n, lb_n, and ub_n, with the resulting dio (lower) and dio (upper) values: Z when disabled, data_out on a read, data_in on a write]

Functional Table of SRAM
Default: the ce_n, lb_n, and ub_n signals are always activated (the SRAM is always enabled and uses both bytes of the data bus). Simplified functional table with ce_n, lb_n, and ub_n set to these defaults:

Operation           we_n  oe_n  dio (16 bits)
output disabled     1     1     Z
read 16-bit word    1     0     data_out
write 16-bit word   0     -     data_in
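
The simplified functional table can be read as a small decoding function. The sketch below is a Python behavioral model, not HDL. It assumes the usual asynchronous-SRAM convention that a write needs we_n = 0 with oe_n as a don't-care, a read needs we_n = 1 and oe_n = 0, and the outputs are disabled otherwise. The function name `sram_operation` is made up for illustration.

```python
def sram_operation(we_n: int, oe_n: int) -> str:
    """Decode the simplified table (ce_n, lb_n, ub_n assumed tied active)."""
    if we_n == 0:
        return "write"            # oe_n is a don't-care; dio carries data_in
    if oe_n == 0:
        return "read"             # the SRAM drives data_out onto dio
    return "output disabled"      # dio is high impedance (Z)


if __name__ == "__main__":
    for we_n in (0, 1):
        for oe_n in (0, 1):
            print(f"we_n={we_n} oe_n={oe_n} -> {sram_operation(we_n, oe_n)}")
```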

Reading an Asynchronous SRAM

Address controlled reads

Writing to asynchronous SRAM

Sample memory interface logic

Tri state data buses in Verilog

Synchronous SRAM memories

ZBT eliminates the wait state

Pipelining allows faster clock

Register File vs SRAM
- A register file usually has one write port and multiple read ports, while an SRAM usually has a common read/write port.
- The read and write ports of a register file can be accessed at the same time.
- Writing to a register takes only one clock cycle.
- Data from a register file's read ports is always available; the read operation involves no clock or additional control signals.

Register File vs SRAM A register file is faster and more flexible. However, due to the circuit size of an FF, a register file is feasible only for small storage.

EEPROM Electrically Erasable Programmable ROM

Interacting with Flash and EEPROM

Dynamic RAM (DRAM) Cell

Asynchronous DRAM operation

Addressing with memory maps

Memory devices (helpful knowledge)

You should understand why

Memory Controller FSM for SRAM
We will consider several design choices. First we will describe a safe design that provides large timing margins and does not impose any stringent timing constraints. Then we will consider some aggressive designs, the challenges they bring, and some potential solutions.

Safe Design
Defaults: oe_n = 1; we_n = 1; tri_n = 1; ready = 0.
The FSM is initially in the idle state and starts a memory operation when the mem signal is activated. The r/w signal determines whether it is a read or a write operation:
- read: go to the r1 state
- write: go to the w1 state

Safe Design: Read Operation
- The memory address, addr, is sampled and stored in the raddr register at the transition out of the idle state.
- The data is stored in the rs2f register at the transition from r2 to idle, and the oe_n signal is deactivated afterwards.
- data_s2f_r is a registered output. It is available after the FSM exits the r2 state and remains available until the next read cycle.
- data_s2f_ur is an unregistered output connected directly to the SRAM's dio bus. Its data becomes valid one clock cycle earlier than data_s2f_r but is removed after the FSM enters the idle state.

Safe Design: Write Operation
- The memory address, addr, and the data, data_f2s, are sampled and stored in the raddr and rf2s registers at the transition out of the idle state.
- The we_n and tri_n signals are both activated in the w1 state. tri_n controls the data flow from the FPGA to the SRAM.
- In the w2 state, we_n is deactivated but tri_n remains asserted to ensure that the data is properly latched into the SRAM during the 0-to-1 edge of we_n.
- At the end of the write cycle, the FSM returns to the idle state and tri_n is deactivated to remove the data from the dio bus.
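
The state sequence of the safe design can be sketched as a small behavioral model. This is a Python illustration, not the lecture's HDL. The state names and output defaults follow the slides. Asserting ready only in the idle state is an assumption, and the names `next_state`, `outputs`, and `run` are made up for this sketch.

```python
# Default output values, overridden per state (Moore-style outputs).
OUTPUT_DEFAULTS = {"oe_n": 1, "we_n": 1, "tri_n": 1, "ready": 0}
STATE_OUTPUTS = {
    "idle": {"ready": 1},            # assumption: ready is asserted only in idle
    "r1": {"oe_n": 0},
    "r2": {"oe_n": 0},
    "w1": {"we_n": 0, "tri_n": 0},
    "w2": {"tri_n": 0},              # we_n released, data still driven
}


def next_state(state: str, mem: int, rw: int) -> str:
    """Next-state logic: mem and rw are only examined in the idle state."""
    if state == "idle":
        if not mem:
            return "idle"
        return "r1" if rw else "w1"
    return {"r1": "r2", "r2": "idle", "w1": "w2", "w2": "idle"}[state]


def outputs(state: str) -> dict:
    """Control outputs for a state: defaults merged with state overrides."""
    return {**OUTPUT_DEFAULTS, **STATE_OUTPUTS[state]}


def run(requests):
    """requests holds one (mem, rw) pair per clock cycle; returns the state per cycle."""
    state, trace = "idle", []
    for mem, rw in requests:
        trace.append(state)
        state = next_state(state, mem, rw)
    return trace


if __name__ == "__main__":
    # A single write request followed by idle cycles.
    for s in run([(1, 0), (0, 0), (0, 0), (0, 0)]):
        print(s, outputs(s))
```

Holding mem high with rw = 1 in every cycle still yields idle, r1, r2, idle in this model, which reflects the mandatory return to idle between operations.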

Safe Design: Timing Analysis Assumptions: FSM is controlled by a 50-MHz clock and thus stays in each state for 20 ns. The SRAM is IS61LV25616A with the following timing parameters

Safe Design: Timing Analysis
Read operation:
- During the read cycle, oe_n is asserted for two states, i.e., 40 ns, which provides a 30 ns margin over the 10 ns tAA.
- The data is stored in the data_s2f register when the FSM moves from the r2 state to the idle state.
- Although oe_n is deasserted at this transition, the data remains valid for a small interval because of the FPGA's pad delay and the tHZOE delay of the SRAM chip, so it can be sampled properly by the clock edge.

Safe Design: Timing Analysis
Write operation:
- During the write cycle, we_n is asserted in the w1 state, and the 20 ns interval exceeds the 8 ns tPWE1 requirement.
- The tri_n signal remains asserted in the w2 state and thus ensures that the data is still stable during the 0-to-1 transition edge of the we_n signal.
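
The margins quoted for the safe design follow from simple arithmetic. The Python sketch below reproduces them from the numbers given in the slides (50 MHz clock, tAA = 10 ns, tPWE1 = 8 ns). The variable names are illustrative only.

```python
CLOCK_MHZ = 50
T_STATE_NS = 1000 // CLOCK_MHZ   # 20 ns spent in each FSM state
T_AA_NS = 10                     # address access time
T_PWE1_NS = 8                    # minimum we_n pulse width

# Read: oe_n is asserted during r1 and r2.
read_window_ns = 2 * T_STATE_NS
read_margin_ns = read_window_ns - T_AA_NS

# Write: we_n is asserted during w1 only.
write_window_ns = 1 * T_STATE_NS
write_margin_ns = write_window_ns - T_PWE1_NS

if __name__ == "__main__":
    print(f"read:  window {read_window_ns} ns, margin {read_margin_ns} ns")
    print(f"write: window {write_window_ns} ns, margin {write_margin_ns} ns")
```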

Safe Design: Timing Analysis Performance: Both read and write operations take two clock cycles to complete. During the read operation, data_s2f_ur is available just before the rising edge of the second clock cycle and the data_s2f_r is available right after the rising edge of the second clock cycle. Both read and write operations must return to idle state after completion. The main system must wait for another clock cycle to issue a new memory operation, and thus the back-to-back memory access takes three clock cycles.

Timing Issues on Asynchronous SRAM
Deactivation of the we_n signal:
- The 0-to-1 transition of we_n functions somewhat like the clock edge of an FF, at which the data is latched and stored in the internal memory element.
- Even though the data hold time (tHD) is zero for this SRAM, deactivating we_n and removing the data at the same time is not a reliable approach because of variations in propagation delays.
- The controller must ensure that we_n is deactivated before the data is removed from the bus.

Timing Issues on Asynchronous SRAM
Potential conflict on the data bus, dio:
- The data bus is bidirectional and is used for both read and write operations.
- A condition known as fighting occurs if both the controller and the SRAM place data on the bus at the same time.

Alternative Design I
Target: reduce the back-to-back operation overhead.
- Instead of always returning to the idle state, the memory controller checks the mem signal at the end of the current memory operation (i.e., in the r2 or w2 state) and determines the next state.
- It initiates a new memory operation immediately if there is a pending request.

Alternative Design I
[Figure: FSM state diagram, starting from the idle state]

Alternative Design I: Timing Analysis
Back-to-back memory operations may cause fighting on the data bus. For example, if a write operation is performed immediately after a read operation:
- Tri-state buffer of the SRAM: passing → high impedance
- Tri-state buffer of the FPGA: high impedance → passing
If either tri-state buffer changes mode too slowly (delays tHZOE and tLZOE), both buffers may drive the bus during a small interval, and fighting occurs.
*The timing issues with the basic ‘safe’ design also apply to this design.
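
One way to reason about fighting is to compare how long the releasing driver takes to reach high impedance with how long the other driver takes to start driving. The Python sketch below assumes both changes are triggered by the same clock edge. The delay values are placeholders, not data-sheet numbers, and `fighting_interval_ns` is a name made up for this illustration.

```python
def fighting_interval_ns(release_delay_ns: float, drive_delay_ns: float) -> float:
    """Length of the interval during which both drivers are on the bus.

    release_delay_ns: time for the old driver to reach high impedance.
    drive_delay_ns:   time for the new driver to start driving.
    """
    return max(0.0, release_delay_ns - drive_delay_ns)


if __name__ == "__main__":
    # Write immediately after read: the SRAM releases, the FPGA starts driving.
    T_HZOE_NS = 4.0        # placeholder for the SRAM's output-disable delay
    FPGA_ENABLE_NS = 2.5   # placeholder for the FPGA tri-state enable delay
    overlap = fighting_interval_ns(T_HZOE_NS, FPGA_ENABLE_NS)
    print(f"possible fighting interval: {overlap} ns")
```

A result of zero means the new driver turns on no earlier than the old one turns off. Any positive value is a window in which both sides drive dio.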

Alternative Design II
Target: perform a single memory operation in one clock cycle.
- This is permitted by the timing parameters, because the read and write cycles of the SRAM are each 10 ns and each clock cycle is 20 ns.
- The r2 and w2 states are removed.
- A memory access takes one clock cycle to complete, and back-to-back operations require two clock cycles.

Alternative Design II

Alternative Design II: Timing Analysis
Read operation:
- The address signal first propagates through the FPGA's I/O pads to the SRAM's address bus, and the retrieved data then propagates back through the I/O pads to the FPGA's internal logic.
- Need to satisfy: SRAM address access time (tAA = 10 ns) + two pad delays (4 ns to 10 ns each) < one clock cycle (20 ns).
Write operation:
- we_n must be deactivated before the data is removed from the bus so that the data is properly latched into the SRAM, which normal synthesis cannot guarantee.
Fine-tuning of the synthesis is required to achieve this.
*The timing issues with the basic ‘safe’ design also apply to this design.
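
The read constraint can be checked numerically. The Python sketch below uses the figures from the slide (tAA = 10 ns, 20 ns clock period, pad delay between 4 ns and 10 ns) and ignores internal routing and register setup time, which is a simplification. The function names are illustrative.

```python
T_CLK_NS = 20.0   # one clock cycle at 50 MHz
T_AA_NS = 10.0    # SRAM address access time


def read_path_ns(pad_delay_ns: float) -> float:
    """Address goes out through one pad, data comes back through another."""
    return T_AA_NS + 2 * pad_delay_ns


def meets_one_cycle_read(pad_delay_ns: float) -> bool:
    return read_path_ns(pad_delay_ns) < T_CLK_NS


# The pad delay must stay below this bound for a one-cycle read.
max_pad_delay_ns = (T_CLK_NS - T_AA_NS) / 2

if __name__ == "__main__":
    for pad in (4.0, 10.0):
        print(f"pad {pad} ns: path {read_path_ns(pad)} ns, ok={meets_one_cycle_read(pad)}")
    print(f"pad delay must be below {max_pad_delay_ns} ns")
```

With the fastest quoted pad delay the path fits in one cycle, and with the slowest it does not, which is why this design depends on controlling the pad delay.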

Alternative Design III
Target: combine the features of the two preceding designs.
- A memory access takes one clock cycle to complete, and back-to-back operations take one clock cycle each.
- The we_n signal must be asserted for only a fraction of the clock period and cannot be shown in the diagram. It is derived from the we_tmp signal shown in the w1 state.
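
The four designs can be compared by the clock cycles they need per access. The Python sketch below tabulates the cycle counts at the 50 MHz clock used in the slides. The two-cycle back-to-back figure for Alternative Design I is inferred from its description (the idle state is skipped, r1/r2 or w1/w2 remain) rather than stated on the slide.

```python
T_CLK_NS = 20  # 50 MHz clock

# (cycles for a single access, cycles per access when back-to-back)
DESIGNS = {
    "safe": (2, 3),
    "alternative I": (2, 2),     # inferred: the return to idle is skipped
    "alternative II": (1, 2),
    "alternative III": (1, 1),
}


def back_to_back_ns(name: str) -> int:
    return DESIGNS[name][1] * T_CLK_NS


def throughput_mops(name: str) -> float:
    """Sustained accesses per microsecond, i.e., millions of operations per second."""
    return 1000 / back_to_back_ns(name)


if __name__ == "__main__":
    for name in DESIGNS:
        print(f"{name:16s} {back_to_back_ns(name):3d} ns/access "
              f"{throughput_mops(name):6.2f} Mops/s")
```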

Alternative Design III
[Figure: FSM state diagram, starting from the idle state]

Alternative Design III: Timing Analysis
The data is latched to the SRAM at the 0-to-1 transition of the we_n signal. During back-to-back write operations, the FSM remains in the w1 state and we_n would stay asserted (0) continuously, so no latching edge would occur between consecutive writes. One possible solution is to assert the signal only during the first half of the clock period, which is 10 ns (> tPWE1 = 8 ns):
    assign we_n = we_tmp | ~clk;
This is not reliable because of potential glitches and delay variation. Better alternatives are discussed next.
*The timing issues with the previous two designs also apply to this design.
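
The effect of gating we_tmp with the clock can be seen by sampling both signals every half cycle. The Python sketch below is an idealized model of the expression we_tmp | ~clk. It has no glitches or delays, which are exactly what makes the real circuit unreliable. The helper names are made up.

```python
def gated_we_n(we_tmp: int, clk: int) -> int:
    """Single-bit model of: we_n = we_tmp | ~clk."""
    return we_tmp | (clk ^ 1)


def rising_edges(wave) -> int:
    """Count 0-to-1 transitions, i.e., latching edges of an active-low we_n."""
    return sum(1 for a, b in zip(wave, wave[1:]) if (a, b) == (0, 1))


# Half-cycle samples: clk is high in the first half of each cycle.
clk = [1, 0, 1, 0, 1, 0]
# Two back-to-back writes (we_tmp = 0 for two cycles), then no write.
we_tmp = [0, 0, 0, 0, 1, 1]
we_n = [gated_we_n(w, c) for w, c in zip(we_tmp, clk)]

if __name__ == "__main__":
    print("we_tmp:", we_tmp, "latching edges:", rising_edges(we_tmp))
    print("we_n:  ", we_n, "latching edges:", rising_edges(we_n))
```

In this model the ungated signal produces a single latching edge for two writes, while the gated signal produces one edge per write.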

Advanced FPGA Features to Solve Timing Issues
An FSM cannot generate a control sequence that is "finer" than the period of its clock signal. Some device-dependent (Spartan-3 in this case) and software-dependent ad hoc features provide better control:
- Digital Clock Manager (DCM): obtains a “finer” control sequence by using a faster clock
- Input/Output Block (IOB): minimizes the off-chip pad delay

Digital Clock Manager (DCM)
- A DCM can multiply or divide the frequency, or shift the phase, of an incoming clock to generate a new clock.
- It is possible to drive the memory controller with a DCM-generated 200 MHz (period = 5 ns) clock signal.
- Example: to satisfy the 10 ns we_n requirement, one can expand the w1 state into two states and assert the we_n signal in both. The complete write operation then requires four states, but they amount to only 20 ns.
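
The number of states needed to hold a signal for a given time follows from the clock period. The Python sketch below works in integer picoseconds to avoid rounding issues and uses the 200 MHz clock and 10 ns pulse from the example. The function `states_needed` is a made-up helper and assumes the clock period is a whole number of picoseconds.

```python
def states_needed(duration_ns: int, clock_mhz: int) -> int:
    """Smallest number of clock periods that covers duration_ns."""
    period_ps = 1_000_000 // clock_mhz            # e.g., 200 MHz -> 5000 ps
    return -(-duration_ns * 1000 // period_ps)    # ceiling division


if __name__ == "__main__":
    we_states = states_needed(10, 200)
    print(f"states with we_n asserted at 200 MHz: {we_states}")
    # Four states in total for the write, as in the example above.
    total_states = 4
    print(f"complete write: {total_states} states = {total_states * 5} ns")
```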

Input/Output Block (IOB)
- An input/output block (IOB) of a Spartan-3 FPGA provides a programmable interface between an I/O pin and the device's internal logic.
- To minimize the off-chip pad delay, the output registers of the memory controller can be placed in the FFs of the IOBs, and the driver can be configured with a proper slew rate.
- An IOB contains a double data rate (DDR) register, which has two clocks and two inputs. Conceptually, the inputs are sampled independently by the two clocks, and the sampled values are stored in the same register.

Input/Output Block (IOB)
Combining DDR and DCM to generate we_n:
- clk180, generated by the DCM, is 180 degrees out of phase with clk.
- A 1 is always loaded at the rising edge of the clk180 signal (the falling edge of the clk signal), which deactivates we_n during the second half of the cycle and generates a clean half-cycle signal.

Extra slides…

Block Diagram of SRAM
[Figure: SRAM block diagram with an 18-bit address bus, a bidirectional 16-bit data bus, and the chip enable, write enable, output enable, lower byte enable, and upper byte enable inputs]

Block Diagram of SRAM
- 18-bit address bus, ad
- Bidirectional 16-bit data bus, dio, divided into upper and lower bytes, which can be accessed individually
- Five control signals (‘_n’ denotes active low): ce_n (chip enable), we_n (write enable), oe_n (output enable), lb_n (lower byte enable), ub_n (upper byte enable)

Timing Diagrams for Read Operation
[Figure: timing diagram of an address-controlled read cycle, annotated with tRC, tAA, and tOHA; we_n = 1, oe_n = 0]
- tRC (≈ tAA): read cycle time, the minimal elapsed time between two read operations
- tAA: address access time, the time required to obtain stable output data after an address change
- tOHA: output hold time, the time that the output data remains valid after the address changes (not to be confused with the hold time of an FF)
Find the specific values for the device in the data sheet.

Timing Diagrams for Read Operation
[Figure: timing diagram of an oe_n-controlled read cycle, annotated with tLZOE, tDOE, and tHZOE; we_n = 1]
- tDOE: output enable access time, the time required to obtain valid data after oe_n is activated
- tHZOE: output enable to high-Z time, the time for the tri-state buffer to enter the high-impedance state after oe_n is deactivated
- tLZOE: output enable to low-Z time, the time for the tri-state buffer to leave the high-impedance state after oe_n is activated
Find the specific values for the device in the data sheet.

Timing Diagram for Write Operation
[Figure: timing diagram of a write cycle, annotated with tWC, tSA, tPWE1, tSD, and tHD]
- tWC: write cycle time, the minimal elapsed time between two write operations
- tSA: address setup time, the minimal time that the address must be stable before we_n is activated
- tHA: address hold time, the minimal time that the address must be stable after we_n is deactivated
- tPWE1: we_n pulse width, the minimal time that we_n must be asserted
- tSD: data setup time, the minimal time that the data must be stable before the latching edge (the edge at which we_n moves from 0 to 1)
- tHD: data hold time, the minimal time that the data must be stable after the latching edge

Control Sequences for Read Cycle
we_n should be deactivated during the entire operation.
1. Place the address on the ad bus and activate the oe_n signal. These two signals must be stable for the entire operation.
2. Wait for at least tAA. The data from the SRAM becomes available after this interval.
3. Retrieve the data from dio and deactivate the oe_n signal.

Control Sequences for Write Cycle
1. Place the address on the ad bus and the data on the dio bus, and activate the we_n signal. These signals must be stable for the entire operation.
2. Wait for at least tPWE1.
3. Deactivate the we_n signal. The data is latched to the SRAM at the 0-to-1 transition edge.
4. Remove the data from the dio bus.
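
The read and write control sequences can be exercised against a toy behavioral model of the SRAM. The Python sketch below is an illustration only. It does not model any timing (the tAA and tPWE1 waits appear as comments), unwritten locations read as 0 by assumption, and the class and helper names are made up. The model is deliberately pessimistic: a write is latched only if the data is still on the bus at the 0-to-1 edge of we_n.

```python
class AsyncSram:
    """Toy model: latches on the 0-to-1 edge of we_n, drives dio on a read."""

    def __init__(self):
        self.mem = {}
        self.we_n = 1

    def set_pins(self, addr, we_n, oe_n, dio_in=None):
        """Apply pin values; return the value the SRAM drives (None = high-Z)."""
        if self.we_n == 0 and we_n == 1 and dio_in is not None:
            self.mem[addr] = dio_in          # latching edge with data still present
        self.we_n = we_n
        if we_n == 1 and oe_n == 0:
            return self.mem.get(addr, 0)     # assumption: unwritten words read as 0
        return None


def write_word(sram, addr, data):
    sram.set_pins(addr, we_n=0, oe_n=1, dio_in=data)   # address, data, we_n active
    # ... wait at least tPWE1 (not modeled) ...
    sram.set_pins(addr, we_n=1, oe_n=1, dio_in=data)   # deactivate we_n: data latched
    sram.set_pins(addr, we_n=1, oe_n=1)                # remove data from dio


def read_word(sram, addr):
    data = sram.set_pins(addr, we_n=1, oe_n=0)         # address and oe_n; wait tAA
    sram.set_pins(addr, we_n=1, oe_n=1)                # deactivate oe_n
    return data


if __name__ == "__main__":
    sram = AsyncSram()
    write_word(sram, 5, 0xBEEF)
    print(hex(read_word(sram, 5)))
```

Removing the data in the same step that deactivates we_n (passing no dio_in on the latching call) leaves the location unwritten in this model, which mirrors the reason for keeping the data on the bus until after the latching edge.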