Presentation is loading. Please wait.

Presentation is loading. Please wait.

Hardware Design This material exempt per Department of Commerce license exception TSU.

Similar presentations


Presentation on theme: "Hardware Design This material exempt per Department of Commerce license exception TSU."— Presentation transcript:

2 Hardware Design This material exempt per Department of Commerce license exception TSU

3 Objectives After completing this module, you will be able to:
List the functionality that defines an arbiter, a master, and a slave List the various buses available in the MicroBlaze processor Discuss the six configurations of the MicroBlaze processor

4 Outline Buses 101: Arbiter, Master, Slave Fast Simplex Links (FSL)
On-Chip Peripheral Bus (OPB) Local Memory Bus (LMB) Fast Simplex Links (FSL) MicroBlaze Processor Programmer’s Model MicroBlaze Processor Configurations PowerPC Processor Programmer’s Model Reset Logic in the PowerPC Processor JTAG Configurations in the Virtex-II Pro Device

5 Buses 101 A bus is a multiwire path on which related information is delivered Address, data, and control buses Processor and peripherals communicate through buses Peripherals may be classified as: Arbiter, master, slave, or master/slave (bridge) Arbiter Master Arbiter Master/ Slave Master This is a typical system structure where one or more devices are present. In a processor-based system, you will find one or more processors. Master devices are capable of driving all three buses: address, data, and control. When more than one master device is present, an arbiter is required to guide the master devices when they can drive the buses. Slave devices will simply respond to master requests. A hybrid devicemaster and slave, also called a bridgeis required when buses are segmented into multiple segments. The master and slave devices transfers information from one bus segment onto another bus segment. Slave Slave Slave

6 Buses 101 Bus masters have the ability to initiate a bus transaction
Bus slaves can only respond to a request Bus arbitration is a three-step process: A device requesting to become a bus master asserts a bus request signal The arbiter continuously monitors the request and outputs an individual grant signal to each master according to the master’s priority scheme and the state of the other master requests at that time The requesting device samples its grant signal until the master is granted access. The master then initiates a data transfer between the master and a slave when the current bus master releases the bus Arbitration mechanisms Fixed priority, round-robin, hybrid Bus arbitration is a three-step process: 1. A master device that wants to become the next master asserts a request signal to the arbiter. 2. The arbiter continuously monitors requests from all master devices. When a request from a higher priority master arrives, the arbiter will assert a grant signal. 3. The requesting master that receives a grant signal becomes the next master when the current master ceases its mastership according to the bus protocol. The arbitration can be hidden or nonhidden. Hidden arbitration indicates that another device is to become master when the current device is driving the bus. Nonhidden arbitration prevents another device from getting a grant signal when the current master is driving the bus.

7 MicroBlaze Processor Bus Example
CacheLink Ext Mem IOPB Bus IIC ILMB Bus UART IXCL BRAM ILMB IOPB Ext Mem GPIO MicroBlaze™ Ethernet DLMB DOPB DXCL LCD DLMB Bus DOPB Bus Ext Mem CacheLink BRAM INTC The MicroBlaze processor core is organized as a Harvard architecture with separate bus interface units for data access and instruction access. Each bus interface unit is further split into a Local Memory Bus (LMB) and IBM’s On-Chip Peripheral Bus (OPB). The LMB provides single-cycle access to on-chip dual-port block RAM. The OPB interface provides a connection to both on-chip and off-chip peripherals and memory. In addition, the MicroBlaze processor core provides eight input and eight output interfaces to Fast Simplex Link (FSL) buses. The FSL buses are unidirectional, nonarbitrated, dedicated communication channels. With version 3.0 core, MicroBlaze also have a CacheLink interface (FSL like) for providing cache ports to external memory using FSL like interface. The MicroBlaze processor bus interfaces include the following features: OPB V2.0 bus interface with byte-enable support. LMB provides simple synchronous protocol for efficient block RAM transfers. LMB provides guaranteed performance of 125 MHz for the local memory subsystem. FSL provides a fast nonarbitrated streaming communication mechanism. CacheLink provides a fast nonarbitrated external memory caching mechanism. Definitions: ILMB: Instruction-Side Local Memory Bus DLMB: Data-Side Local Memory Bus IOPB: Instruction-Side On-Chip Peripheral Bus DOPB: Data-Side On-Chip Peripheral Bus IXCL: Xilinx Instruction CacheLink Interface DXCL: Xilinx Data CacheLink Interface All buses are 32 bits OPB ARB

8 CoreConnect Bus Architecture
The IBM CoreConnect bus architecture standard provides three buses for interconnecting cores, library macros, and custom logic: Processor Local Bus (PLB) On-Chip Peripheral Bus (OPB) Device Control Register (DCR) bus IBM offers a no-fee, royalty-free CoreConnect bus architecture license Licenses receive the PLB arbiter, OPB arbiter, and PLB/OPB bridge designs along with bus-model toolkits and bus-functional compilers for the PLB, OPB, and DCR buses Required only if you create your own CoreConnect bus architecture peripheral or you are using the Bus Functional Model (BFM)

9 Outline Buses 101: Arbiter, Master, Slave Fast Simplex Links (FSL)
On-Chip Peripheral Bus (OPB) Local Memory Bus (LMB) Fast Simplex Links (FSL) MicroBlaze Processor Programmer’s Model MicroBlaze Processor Configurations PowerPC Processor Programmer’s Model Reset Logic in the PowerPC Processor JTAG Configurations in the Virtex-II Pro Device

10 On-Chip Peripheral Bus (OPB)
The On-Chip Processor Bus (OPB) decouples lower bandwidth devices from the PLB It is a less complex protocol than the PLB No split transaction or address pipelining capability Centralized bus arbitration—OPB arbiter Connection infrastructure for the master and slave peripheral devices The OPB bus is designed to alleviate system performance bottlenecks by reducing capacitive loading on the PLB Fully synchronous to one clock Shared 32-bit address bus; shared 32-bit data bus Supports single-cycle data transfers among the master and the slaves Supports multiple masters, determined by arbitration implementation The bridge function can be the master on the PLB or OPB The OPB (one of the elements of IBM CoreConnect bus architecture) is a general-purpose synchronous bus designed for easy connection of on-chip peripheral devices. Features include: • 32-bit or 64-bit data bus • Up to 64-bit address • Supports 8-bit, 16-bit, 32-bit, and 64-bit slaves • Supports 32-bit and 64-bit masters • Dynamic bus sizing with byte, halfword, fullword, and doubleword transfers • Optional byte enable support • Distributed multiplexer bus instead of 3-state drivers • Single-cycle data transfers between the OPB master and the OPB slaves (not including arbitration) • 16-cycle bus time-out (provided by the arbiter) • Support for multiple OPB bus masters and sequential address protocol • Support for bus parking, bus locking, and slave-requested retry • Bus arbitration overlapped with the last cycle of bus transfers The OPB_V20 design includes several sources for a bus reset. A power-on reset circuit asserts the OPB_Rst for 16 clock cycles anytime the FPGA has completed configuration. External resets that occur during the 16 clock reset time are ignored. After the 16-clock reset has completed, external resets can be applied to the OPB_V20 reset signals. The external resets are SYS_Rst, WDT_Rst, and Debug_SYS_Rst. SYS_Rst is the main user reset for the OPB and can be connected to internal logic or an external signal or switch. WDT_Rst can be connected to the reset output of a watchdog timer to allow for OPB resets in the event of a watchdog time-out. Debug_SYS_Rst can be connected to the reset output of a debug peripheral, such as the JTAG UART, so that the debugger can remotely reset the OPB. SYS_Rst, WDT_Rst, and Debug_SYS_Rst are synchronized to the OPB_Clk, but the width of these signals is otherwise unaltered.

11 On-Chip Peripheral Bus (OPB)
Supports 16 masters and an unlimited number of slaves (only limited by the expected performance) The OPB arbiter receives bus requests from the OPB masters and grants the bus to one of them Fixed and dynamic (LRU) priorities Bus logic is implemented with AND-OR logic. Inactive devices drive zeros Read and write data buses can be separated to reduce loading on the OPB_DBus signal An OPB system is composed of masters, slaves, a bus interconnect, and an arbiter. The OPB_V20 core implements the bus interconnect and arbiter function in an OPB system. The OPB bus signals are created by logically “OR’ing” the signals that drive the bus. OPB devices that are not active during a transaction are required to drive zeros into the OR structure. This structure forms a distributed multiplexer and results in efficient bus implementations. Bus arbitration signals, such as M_request and OPB_MGrant, are directly connected between the OPB arbiter and each OPB master device. The OPB_V20 supports up to 16 masters and an unlimited number of slaves (up to the hardware resources available). The port widths into the OPB_V20 are designed to increase in size as more masters or slaves are added. The priority arbitration mode is configurable via a design parameter. Fixed priority arbitration with processor access to read/write priority registers, and dynamic priority arbitration implementing a true least recent used (LRU) algorithm. OPB Bus arbitration uses the following protocol: 1. An OPB master asserts its bus request signal. 2. The OPB arbiter receives the request and outputs an individual grant signal to each master according to the priority of the signal and the state of the other requests. 3. An OPB master samples its grant signal at the rising edge of the OPB clock. In the following cycle, the OPB master initiates a data transfer between the master and a slave by asserting its select signal. The OPB arbiter issues a bus grant signal only during valid arbitration cycles as follows: • Idle: No data transfer is in progress. • Overlapped arbitration cycle: Arbitration in this cycle allows another master to begin a transfer in the following cycle, avoiding the need for a dead cycle on the bus. The Xilinx OPB arbiter can be configured to have either registered or combinational grant outputs. Registered grant outputs are asserted one clock cycle after each arbitration cycle, resulting in one dead cycle on the bus. However, registered grant outputs allow the OPB bus to run at higher clock rates.

12 Outline Buses 101: Arbiter, Master, Slave Fast Simplex Links (FSL)
On-Chip Peripheral Bus (OPB) Local Memory Bus (LMB) Fast Simplex Links (FSL) MicroBlaze Processor Programmer’s Model MicroBlaze Processor Configurations PowerPC Processor Programmer’s Model Reset Logic in the PowerPC Processor JTAG Configurations in the Virtex-II Pro Device

13 Local Memory Bus (LMB) The Local Memory Bus (LMB) provides single-cycle access to on-chip dual-port block RAM for MicroBlaze™ processors The LMB provides simple synchronous protocol for efficient block RAM transfers The LMB provides maximum guaranteed performance of 125 MHz in Virtex-II devices for the local memory subsystem DLMB: Data interface, local memory bus (block RAM only) ILMB: Instruction interface, local memory bus (block RAM only) The LMB is a fast, local bus for connecting MicroBlaze processor instruction and data ports to high-speed peripherals, primarily on-chip block RAM (block RAM). The LMB is an efficient, single-master bus, requiring no arbiter. It has separate read and write data buses and utilizes low FPGA resources. The bus provides a synchronous protocol operating at 125 MHz. Peak data rate of 500 MB/s and typical data rate of 333 MB/s can be achieved. The MicroBlaze processor can have one or two LMB controllers, depending on whether or not instruction or data memory is present on the LMB side.

14 LMB Timing Rules for generating an LMB clock
The MB, LMB, and OPB clock must be the same clock Use timing constraints to determine the clock speed The LMB is a synchronous bus used primarily to access on-chip block RAM. It uses a minimum number of control signals and a simple protocol to ensure that local block RAM is accessed in a single clock cycle. LMB signals and definitions are shown in the following table. All LMB signals are high true. Signal Data Interface Instr. Interface Type Description Addr[0:31] Data_Addr[0:31] Instr_Addr[0:31] O Address bus Byte_Enable[0:3] not used Byte enables Data_Write[0:31] Write data bus AS D_AS I_AS Address strobe Read_Strobe IFetch Read in progress Write_Strobe Write in progress Data_Read[0:31] Instr[0:31] I Read data bus Ready DReady IReady Ready for next transfer Clk Bus clock

15 Bus Summary 1. Maximum clock rates are estimates and are presented for comparison only. The actual maximum clock rate for each bus is dependent on device family, device speed grade, design complexity, and other factors. 2. Peak data rate is the maximum theoretical data transfer rate at the clock rate shown for each bus. 3. The typical data rates are intended to illustrate data rates that are representative of actual system configurations. The typical data is highly dependent on the application software and system hardware configuration. 4. Assumes primarily cache-line fills, minimal read and write concurrency (66.7-percent bus utilization). 5. Assumes the minimal use of sequential address capabilities and three clock cycles per OPB transfer. 6. The OCM controller operates at the PPC405 core clock rate, but its data transfer rate is limited by the access time of the on-chip memory. The typical data rate assumes percent bus utilization. 7. Assumes 66.7-percent bus utilization. 8. Assumes that the DCR operates at the same clock rate as the PLB and each DCR access requires five clock cycles. The number of clock cycles per DCR transfer is dependent on how many DCR devices are present in the system. Each additional DCR device adds latency to all DCR transfers.

16 Bus Summary

17 Outline Buses 101: Arbiter, Master, Slave Fast Simplex Links (FSL)
Processor Local Bus (PLB) On-Chip Peripheral Bus (OPB) Device Control Register (DCR) On-Chip Memory Bus (OCM) Local Memory Bus (LMB) Fast Simplex Links (FSL) MicroBlaze Processor Programmer’s Model MicroBlaze Processor Configurations PowerPC Processor Programmer’s Model Reset Logic in the PowerPC Processor JTAG Configurations in the Virtex-II Pro Device

18 The Software Streaming Data Challenge
Suppose you want to move data through a hardware/software processing application with following characteristics Data may be of a streaming or burst nature Deterministic latency between hardware and software Possible solutions include Bus peripheral, maybe OPB Multiple clock-cycle overhead Address decode time Arbitration, loss of hardware/software coherency Custom microprocessor instruction access to peripheral hardware May require processor to be stalled Complex logic can slow overall processor speed May require assembly language to access special instruction Fast Simplex Links (see next slide)

19 MicroBlaze Processor-Based Embedded Design
I-Cache BRAM Local Memory Bus MicroBlaze 32-Bit RISC Core Flexible Soft IP BRAM Configurable Sizes D-Cache BRAM Possible in Virtex™-II Pro Arbiter OPB On-Chip Peripheral Bus Fast Simplex Link 0,1….7 Custom Functions Custom Functions UART 10/100 E-Net Memory Controller Build MB: Because the MicroBlaze processor is a soft-logic processor, it runs on all current FPGA families. Build Caches: With the MicroBlaze processor, you can select whether to use an instruction cache or a data cache or both, and their sizes. FPGA block RAM is used for these caches. Build LMB: The MicroBlaze processor has a 32-bit Local Memory Bus (LMB) that is used for low-latency access to on-chip block RAM. The LMB provides single-cycle access to on-chip dual-port block RAM and is split into instruction-side LMB and data-side PMB. Build Off-Chip Memory: Starting with the 6.1 release (September 2003), MicroBlaze includes a tightly coupled off-chip Flash/SRAM memory controller to provide high-speed, low-latency access. Build Fast Simplex Link: The MicroBlaze processor contains zero to eight input Fast Simplex Link interfaces and zero to eight output Local Link interfaces. The Fast Simplex Link channels are dedicated unidirectional point-to-point data-streaming interfaces. The Fast Simplex Link interfaces on the MicroBlaze processor are 32-bits wide. In addition, the same Fast Simplex Link channels can be used to transmit or receive either control or data words. A separate bit indicates whether the transmitted (received) word is control or data information. There are two cycle-assembly instructions: get and put. Wrapping this into custom C functions with appropriate custom hardware also provides an optimal method by which you can implement custom instructions for the processor. Build OPB: The MicroBlaze processor also shares the On-Chip Peripheral Bus (OPB) with the PowerPC processor. The same peripherals that work on the OPB with the PowerPC processor can also be used with the MicroBlaze processor. Build PPC: Finally, because the OPB is shared, you can attach a MicroBlaze processor as an OPB peripheral connected to a PowerPC system on a Virtex-II Pro device. Off-Chip Memory FLASH/SRAM FSL Channels

20 Another Alternative: Fast Simplex Links ( FSL)
Unidirectional point-to-point FIFO-based communication Dedicated (unshared) and nonarbitrated architecture Dedicated MicroBlaze™ C and ASM instructions for easy access High speed, access in as little as two clocks on processor side, 600 MHz at hardware interface Available in Xilinx Platform Studio (XPS) as a bus interface library core from Hardware → Create or Import Peripheral Wizard FSL_M_Clk FIFO FSL_S_Clk FSL_M_Data [0:31] FSL_S_Data [0:31] FSL core illustrating simplified inputs and outputs Internal FIFO, control bit, data path, configured using the FSL Wizard FSL_M_Control 32-bit data FSL_S_Control FSL_M_Write FSL_S_Read FSL_M_Full FSL_S_Exists FIFO Depth

21 FSL Advantages It is simple, fast, and easy to use
FSL versus Custom uP instruction Custom hardware is not dependent on instruction decode Clock speed is not slowed down by new hardware FSL is faster than a bus interface – saves clock cycles Eliminates bus signaling overhead No arbitration No address decode No acknowledge cycles Decoupled data clock from CPU allows for asynchronous operation Minimal FPGA fabric overhead No need to stall the processor because of custom function clock latency No need to modify the C compiler (could be risky!) Control bits limit the need for a complex interrupt structure

22 FSL Highlights FSL architecture is automatically generated based on user requirements . Required resources from 21 to 451 slices Core can be configured as master or slave Separate control bit channel MicroBlaze™ allows for up to eight parallel FSL channels Easy to use C calls for MicroBlaze Predefine compiler functions Optional use of assembler language

23 FSL Features Configurable data bus widths – 8, 16, 32 bits
Configurable FIFO depths – 1 to 8193 using SRL16 or block RAM Synchronous or asynchronous FIFO clocking with respect to the MicroBlaze™ system clock Selectable use of control bit Blocking and nonblocking software instructions for data and control (get and put) Simple software interface using predefined C instructions; Automatically generated C drivers

24 Knowledge Check Which buses are included in the CoreConnect bus architecture standard? What is the maximum performance of the LMB in a Virtex™-II device?

25 Answers Which buses are included in the CoreConnect bus architecture standard? PLB, OPB, and DCR What is the maximum performance of the LMB in a Virtex™-II device? 125 MHz

26 Knowledge Check What is FSL? How many FSL channels are supported by
MicroBlaze How FSL can improve performance?

27 Answers What is FSL? How many FSL channels are supported by
FSL is a dedicated simplex link with a FIFO interface How many FSL channels are supported by MicroBlaze 8 How FSL can improve performance? Since FSL is a dedicated link bus arbitration does not exist. Fixed data latency of two clocks assuming data is available

28 Outline Buses 101: Arbiter, Master, Slave Fast Simplex Links (FSL)
Processor Local Bus (PLB) On-Chip Peripheral Bus (OPB) Device Control Register (DCR) On-Chip Memory Bus (OCM) Local Memory Bus (LMB) Fast Simplex Links (FSL) MicroBlaze Processor Programmer’s Model MicroBlaze Processor Configurations PowerPC Processor Programmer’s Model Reset Logic in the PowerPC Processor JTAG Configurations in the Virtex-II Pro Device

29 MicroBlaze Processor Embedded soft RISC processor (version 4.00a)
32-bit address and data busses 32-bit instruction word (three operands and two addressing modes) 32 registers (32-bit wide) 3 pipe stages (single issue) Big-endian format Buses Full Harvard-architecture OPB (CoreConnect bus architecture standard), instruction, and data LMB for connecting to local block RAM (faster), instruction, and data Fast Simplex Links: Dedicated unidirectional point-to-point data streaming interfaces Dedicated CacheLink ports for instruction and data caching with four-words cache line size and critical word first access capability The MicroBlaze embedded soft core processor includes the following features: Thirty-two 32-bit general purpose registers Thirty-bit instruction word with three operands and two addressing modes Separate 32-bit instruction and data buses that conform to IBM’s OPB Separate 32-bit instruction and data buses with direct connection to on-chip RAM through an LMB 32-bit address bus Single-issue pipeline Instruction and data cache Hardware debug logic Fast Simplex Link (FSL) support Hardware multiplier (in the Virtex™-II and subsequent devices) Floating Point Unit (v4.00a onwards) Instruction and data caching is allowed either on CacheLink or OPB independent of each other operation- i.e. instruction may use CacheLink, data may use OPB

30 MicroBlaze Processor ALU Floating Point Unit Program counter
Uses hardware multipliers/DSP48 in the Virtex™-II or later families Barrel shifter Floating Point Unit Implements IEEE 754 single precision floating point standards Supports addition, subtraction, multiplication, division, and comparision Program counter Instruction decode Instruction cache Direct-mapped Configurable caching over OPB or CacheLink Configurable size: 2 KB, 4 KB, 8 KB, 16 KB, 32 KB, 64 KB

31 MicroBlaze Processor Performance
All instruction takes one clock cycle, except: Load and store (two clock cycles) Multiply (two clock cycles) Branches (three clock cycles, can be one clock cycle) Operating Frequency 180 MHz on Virtex-4 LX (-12) devices 150 MHz on Virtex-II Pro (-7) devices 100 MHz on Spartan 3 (-5) devices Dhrystone-MIPs (2.1 standard benchmark) using LMB block RAM 166 MHz on Virtex-4 LX (-12) devices 138 MHz on Virtex-II Pro (-7) devices 92 MHz on Spartan 3 (-5) devices Peak performance of 0.92 DIMPS/MHz 1269 LUTs in Virtex-4, 1225 LUTS in Virtex-II Pro, 1318 LUTs in Spartan 3 The MicroBlaze architecture balances execution performance against implementation size. It is important to keep in mind that benchmark performance will vary depending on the processor configuration, implementation tool results, targeted FPGA architecture, and device speed grade. The performance numbers presented on this slide are best case comparison numbers achieved under conditions equivalent to those used in the industry. The maximum performance and maximum clock frequency will vary from one design to another according to configuration options used.

32 MicroBlaze Processor Memory Space
Memory and peripherals The MicroBlaze processor uses 32-bit addresses Special addresses MicroBlaze processors must have user-writable memory from 0x through 0x F Each vector consists of two instructions IMM followed by a BRAI instruction to address full memory range 0x0000_0000 0x0000_0008 0x0000_0010 0xFFFF_FFFF 0x0000_0018 Reset Address Exception Address Interrupt Address LMB Memory Reserved OPB Memory Peripherals 0x0000_0020 0x0000_0028 0x0000_004F Break Hardware Exception The actual address map is defined in the MicroBlaze Hardware Specification (MHS) file. This file contains an address map specifying the addresses of LMB memory, OPB memory, and external memory and peripherals. The address range grows from 0. The lowest range is the LMB memory. This is followed by the OPB memory, the external memory, and the peripherals. Some addresses in this address space have predefined meaning. The processor jumps to address 0x0 on reset, to address 0x8 on exception, address 0x10 on interrupt. To address 0x18 on non-maskable, maskable hardware and software Break, and to 0x20 on hardware exception. Each vector allocates two addresses to allow full address range branching (requires an IMM followed by a BRAI instruction). Register R14 is used to save return address on interrupt, R16 on break, and R17 on hardware exception. The address range 0x28 to 0x4F is reserved for future software support by Xilinx.

33 Outline Buses 101: Arbiter, Master, Slave Fast Simplex Links (FSL)
Processor Local Bus (PLB) On-Chip Peripheral Bus (OPB) Device Control Register (DCR) On-Chip Memory Bus (OCM) Local Memory Bus (LMB) Fast Simplex Links (FSL) MicroBlaze Processor Programmer’s Model MicroBlaze Processor Configurations

34 Bus Configuration Systems
MicroBlaze processor bus interfaces are available in six configurations. The optimal configuration for your application depends on code size and data spaces and whether or not you require fast access to the internal block RAM.

35 Configuration 1 Large external instruction memory
Reference Configuration 1 Large external instruction memory Fast internal instruction memory (block RAM) Large external data memory Fast internal data memory (block RAM) Use Configuration 1 when your application requires more instruction and data memory than is available in the on-chip block RAM. Critical sections of instruction and data memory can be allocated to the faster ILMB block RAM to improve your application’s performance. Depending on how much data memory is required, the data-side memory controller may not be present. The data-side OPB is also used for other peripherals, such as UARTs, timers, general purpose I/O, additional block RAM, and custom peripherals. The OPB-to-OPB bridge is only required when the data-side OPB needs access to the instruction-side OPB peripherals, such as for software-based debugging. Because of the extra logic required to implement two buses per side, the maximum clock rate of the CPU may be slightly less than configurations with one bus per side (about 110 MHz). This configuration allows debugging of application code through either software-based debugging (resident monitor debugging) or hardware-based JTAG debugging. Typical applications: MPEG decoder Communications controller Complex state machine for process control and other embedded applications Set top boxes

36 Configuration 2 Large external instruction memory
Reference Configuration 2 Large external instruction memory Large external data memory Fast internal data memory (block RAM) Use Configuration 2 when your application requires more instruction and data memory than is available in the on-chip block RAM. In this configuration, all of the instruction memory is resident in off-chip memory or on-chip memory on the instruction-side OPB. Depending on how much data memory is required, the data-side memory controller may not be present. The data-side OPB is also used for other peripherals, such as UARTs, timers, general-purpose I/O, additional BRAM, and custom peripherals. The OPB-to-OPB bridge is only required if the data-side OPB needs access to the instruction-side OPB peripherals, such as for software-based debugging. This configuration allows the CPU core to operate at the maximum clock rate (typically, 125 MHz) because of the simpler instruction-side bus structure. Instruction fetches on the OPB, however, are slower than fetches from block RAM on the LMB. Overall processor performance is lower than implementations with the LMB, unless a large percentage of code is run from the internal instruction history buffer. This configuration allows debugging of application code through either software-based debugging (resident monitor debugging) or hardware-based JTAG debugging. Typical applications: MPEG decoder Communications controller Complex state machine for process control and other embedded applications Set top boxes

37 Configuration 3 Fast internal instruction memory (block RAM)
Reference Configuration 3 Fast internal instruction memory (block RAM) Large external data memory Fast internal data memory (block RAM) Use Configuration 3 when your application code fits into the on-chip block RAM, but more memory may be required for data memory. Critical sections of data memory can be allocated to the faster DLMB block RAM to improve your application’s performance. Depending on how much data memory is required, the data-side memory controller may not be present. The data-side OPB is also used for other peripherals, such as UARTs, timers, general-purpose I/O, additional block RAM, and custom peripherals. This configuration allows the CPU core to operate at the maximum clock rate because of the simpler instruction-side bus structure. The instruction-side LMB provides two-cycle pipelined read access from the BRAM for an effective access rate of one instruction per clock. This configuration allows debugging of application code through either software-based debugging (resident monitor debugging) or hardware-based JTAG debugging. Typical applications: Data-intensive controllers Small to medium state machines

38 Configuration 4 Large external instruction memory
Reference Configuration 4 Large external instruction memory Fast internal instruction memory (block RAM) Large external data memory Use Configuration 4 when your application requires more instruction and data memory than is available in the on-chip block RAM. Critical sections of instruction memory can be allocated to the faster ILMB block RAM to improve your application’s performance. The data-side OPB is used for one or more external memory controllers and other peripherals, such as UARTs, timers, general-purpose I/O, additional block RAM, and custom peripherals. The OPB-to-OPB bridge is only required if the data-side OPB needs access to the instruction side OPB peripherals, such as for software-based debugging. Because of the extra logic required to implement two buses per side, the maximum clock rate of the CPU may be slightly less than configurations with one bus per side (about 110 MHz). This configuration allows debugging of application code through either software-based debugging (resident monitor debugging) or hardware-based JTAG debugging. However, software-based debugging of code in the ILMB block RAM can only be performed if a block RAM memory controller is included on the data-side OPB bus to provide write access to the LMB block RAM. Typical applications: MPEG decoder Communications controller Complex state machine for process control and other embedded applications Set top boxes

39 Configuration 5 Large external instruction memory
Reference Configuration 5 Large external instruction memory Large external data memory Use Configuration 5 when your application requires external instruction and data memory. In this configuration, all of the instruction and data memory is resident in off-chip memory or on-chip memory on the OPB buses. The data-side OPB is used for one or more external memory controllers and other peripherals such as UARTs, timers, general-purpose I/O, block RAM, and custom peripherals. The OPB-to-OPB bridge is only required if the data-side OPB needs access to the instruction-side OPB peripherals, such as for software-based debugging. This configuration allows the CPU core to operate at the maximum clock rate (125 MHz) because of the simpler instruction-side bus structure. However, instruction fetches on the OPB are slower than fetches from block RAM on the LMB. Overall processor performance is lower than implementations using LMB, unless a large percentage of code is run from the internal instruction history buffer. This configuration allows debugging of application code through either software-based debugging (resident monitor debugging) or hardware-based JTAG debugging. Typical applications: MPEG decoder Communications controller Complex state machine for process control and other embedded applications Set top boxes

40 Configuration 6 Fast internal instruction memory (block RAM)
Reference Configuration 6 Fast internal instruction memory (block RAM) Large external data memory Use Configuration 6 when your application code fits into the on-chip ILMB block RAM, but more memory may be required for data memory. The data-side OPB is used for one or more external memory controllers and other peripherals, such as UARTs, timers, general-purpose I/O, additional block RAM, and custom peripherals. This configuration allows the CPU core to operate at the maximum clock rate because of the simpler instruction-side bus structure. The instruction-side LMB provides two-cycle pipelined read access from the block RAM for an effective access rate of one instruction per clock. This configuration allows debugging of application code through either software-based debugging (resident monitor debugging) or hardware-based JTAG debugging. However, software-based debugging of code in the ILMB block RAM can only be performed if a block RAM memory controller is included on the data-side OPB bus to provide write access to the LMB block RAM. Typical applications: Minimal controllers Small to medium state machines

41 Outline Buses 101: Arbiter, Master, Slave Fast Simplex Links (FSL)
Processor Local Bus (PLB) On-Chip Peripheral Bus (OPB) Device Control Register (DCR) On-Chip Memory Bus (OCM) Local Memory Bus (LMB) Fast Simplex Links (FSL) MicroBlaze Processor Programmer’s Model MicroBlaze Processor Configurations

42 Skills Check

43 Knowledge Check Where do the following vectors reside in the MicroBlaze processor? Reset Interrupt Exception What is the starting address of the code if LMB memory is present?

44 Answers Where do the following vectors reside in the MicroBlaze processor? Reset (0x ) Interrupt (0x ) Exception (0x ) What is the starting address of the code if LMB memory is present? 0x

45 Where Can I Learn More? Tool documentation Processor documentation
Processor IP Reference Guide Processor documentation MicroBlaze™ Processor Reference Guide Support Website EDK Website:


Download ppt "Hardware Design This material exempt per Department of Commerce license exception TSU."

Similar presentations


Ads by Google