This material exempt per Department of Commerce license exception TSU Hardware Design
Objectives
After completing this module, you will be able to:
– List the functionality that defines an arbiter, a master, and a slave
– List various buses available for the PowerPC processor
– Discuss the JTAG interface in Virtex-II Pro devices
Outline
Supported Buses
– Processor Local Bus (PLB)
– On-Chip Peripheral Bus (OPB)
– Device Control Register (DCR)
– On-Chip Memory (OCM)
Processor Use Models
PowerPC Processor Programmer’s Model
Reset Logic in PowerPC
JTAG Configurations in Virtex-II Pro
PowerPC Bus Example
[Block diagram. Components shown: PPC405 (ISOCM, DSOCM, ISPLB, and DSPLB interfaces), INTC, BRAM, DDR, PLB ARB, SDRAM, DCR, PLB2OPB, OPB2PLB, IIC, OPB ARB, GPIO, UART, Ethernet, LCD]
Bus widths
– ISOCM bus: data 64 bits, address 32 bits
– DSOCM bus: data 32 bits, address 32 bits
– PLB bus: data 64 bits, address 32 bits
– OPB bus: data 32 bits, address 32 bits
– DCR bus: data 32 bits, address 10 bits
CoreConnect
The IBM CoreConnect standard provides three buses for interconnecting cores, library macros, and custom logic:
– Processor Local Bus (PLB)
– On-Chip Peripheral Bus (OPB)
– Device Control Register (DCR) bus
IBM offers a no-fee, royalty-free CoreConnect architectural license
– Licensees receive the PLB arbiter, OPB arbiter, and PLB/OPB bridge designs along with bus-model toolkits and bus-functional compilers for the PLB, OPB, and DCR buses
– The license is required only if you create your own CoreConnect peripheral or use a Bus Functional Model (BFM)
PLB Bus
– Connection infrastructure for high-bandwidth master and slave devices
– Fully synchronous to one clock
– Centralized bus arbitration (PLB arbiter)
– 64-bit data bus
– Addresses high-performance, low-latency, and design-flexibility issues through:
  – Decoupled address, read data, and write data buses with split-transaction capability
  – Concurrent read and write transfers, yielding a maximum bus utilization of two data transfers per clock
  – Address pipelining that reduces bus latency by overlapping a new write request with an ongoing write transfer, and up to three read requests with an ongoing read transfer
  – Ability to overlap the bus request and grant protocol with an ongoing transfer
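As a rough illustration of the "two data transfers per clock" figure, the sketch below computes the theoretical peak PLB throughput from the 64-bit data width. The function name is hypothetical, and the 100 MHz clock is only the example PLB frequency used in this module; sustained throughput in a real system will be lower.

```python
PLB_DATA_BITS = 64  # PLB data bus width stated in this module


def plb_peak_bytes_per_second(plb_clock_hz, concurrent_read_write=True):
    """Theoretical peak PLB throughput in bytes per second.

    With separate read and write data buses, one read and one write
    transfer can complete in the same clock: two transfers per clock.
    """
    transfers_per_clock = 2 if concurrent_read_write else 1
    bytes_per_transfer = PLB_DATA_BITS // 8
    return plb_clock_hz * transfers_per_clock * bytes_per_transfer


if __name__ == "__main__":
    # 100 MHz is an example clock, not a requirement.
    peak = plb_peak_bytes_per_second(100_000_000)
    print(peak / 1e6, "MB/s theoretical peak")
```

The number is an upper bound: it assumes a read and a write complete on every clock, which arbitration and slave wait states will prevent in practice.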
PLB Interconnect / Architecture
– One to 16 PLB masters each connect all of their signals to the PLB arbiter
– The PLB arbiter multiplexes signals from the masters onto a shared bus to which all the slave inputs are connected
– One to n PLB slaves OR their outputs together to drive a shared bus back to the PLB arbiter
– The PLB arbiter handles bus arbitration and the movement of data and control signals between masters and slaves
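The data movement described above can be sketched as a small behavioral model: the arbiter picks one requesting master, multiplexes that master's signals onto the shared bus, and ORs the slave outputs on the return path. This is an illustration only. The fixed-priority policy, the function names, and the convention that unselected slaves drive zero are assumptions, not details taken from the PLB specification.

```python
def fixed_priority_grant(requests):
    """Return the index of the lowest-numbered requesting master, or None.

    Fixed priority is an assumed policy chosen to keep the sketch short.
    """
    for master, requesting in enumerate(requests):
        if requesting:
            return master
    return None


def shared_bus_value(master_outputs, granted):
    """Multiplex the granted master's signals onto the shared bus."""
    return 0 if granted is None else master_outputs[granted]


def or_slave_outputs(slave_outputs):
    """OR the slave outputs together (unselected slaves assumed to drive 0)."""
    combined = 0
    for value in slave_outputs:
        combined |= value
    return combined


if __name__ == "__main__":
    granted = fixed_priority_grant([False, True, True])
    print("granted master:", granted)
    print("bus value:", hex(shared_bus_value([0xAA, 0xBB, 0xCC], granted)))
    print("read data:", hex(or_slave_outputs([0, 0x1234, 0])))
```

The OR on the return path only yields valid data if every slave that is not responding drives zeros.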
PLB Bridge
– The PLB-to-OPB bridge translates PLB transactions into OPB transactions
– The bridge functions as a slave on the PLB side and a master on the OPB side
– The bridge contains a DCR slave interface that provides access to its bus error status registers
– The bridge is necessary in systems where a PLB master device, such as a CPU, requires access to OPB peripherals
OPB Bus
The OPB bus decouples lower-bandwidth devices from the PLB
It is a less complex protocol than the PLB
– No split-transaction or address-pipelining capability
Centralized bus arbitration (OPB arbiter)
Connection infrastructure for the master and slave peripheral devices
The OPB bus is designed to alleviate system performance bottlenecks by reducing capacitive loading on the PLB
– Fully synchronous to one clock
– Shared 32-bit address bus, shared 32-bit data bus
– Supports single-cycle data transfers between the master and the slaves
– Supports multiple masters, determined by the arbitration implementation
– The bridge function can be the master on the PLB or the OPB
OPB Bus
Supports 16 masters and an unlimited number of slaves (limited in practice by the expected performance)
The OPB arbiter receives bus requests from the OPB masters and grants the bus to one of them
– Fixed and dynamic (LRU) priorities
Bus logic is implemented with AND-OR logic; inactive devices drive zeros
Read and write data buses can be separated to reduce loading on the OPB_DBus signal
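To make the dynamic (LRU) priority option concrete, here is a minimal sketch of least-recently-used arbitration in general. The class and method names are hypothetical, and the sketch is not a description of how the OPB arbiter core is actually built.

```python
class LruArbiter:
    """Grant the bus to the requesting master that was served least recently."""

    def __init__(self, num_masters):
        # Front of the list = least recently granted = highest priority.
        self.order = list(range(num_masters))

    def grant(self, requests):
        """requests: set of requesting master indices. Returns the winner or None."""
        for master in self.order:
            if master in requests:
                # The winner moves to the back and becomes lowest priority.
                self.order.remove(master)
                self.order.append(master)
                return master
        return None


if __name__ == "__main__":
    arbiter = LruArbiter(3)
    for cycle in range(4):
        print("cycle", cycle, "grant ->", arbiter.grant({0, 1}))
```

With two masters requesting continuously, grants alternate between them, which is the fairness property that a fixed-priority scheme does not provide.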
DCR Bus
Device Control Register bus
– IBM CoreConnect standard
– Used to talk to control registers (1024 total)
– 32-bit-wide data; all cycles are word-oriented
– Supports read and write only, no burst cycles
– Simple acknowledgement termination
– The CPU supports special privileged instructions for access to the DCR
Normal DCR access requires special CPU assembly code
– There is a “fixed” 1024-word I/O space
– The CPU must be in privileged mode to access the registers
– Requires macros or inline assembly
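DCR Bus
[Signal diagram: PPC405 DCR interface connected to DCR devices]
PPC405 DCR signals
– C405DCRABUS
– C405DCRDBUSOUT
– C405DCRREAD
– C405DCRWRITE
– DCRC405ACK
– DCRC405DBUSIN
DCR device signals
– dcr_ABus
– dcr_WrData
– dcr_RdData
– dcr_Read
– dcr_Write
– dcr_Ack
– dcr_Clk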
Memory-Mapped DCR
DCR bridges allow memory mapping of the DCR space anywhere within the system memory
– OPB DCR bridge
– Allows DCR devices to exist within 4 KB of contiguous space
– Must be accessed on word boundaries and one word at a time
– Easier to use, but each access requires a PLB and an OPB transaction
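The 4 KB figure follows from 1024 registers of 4 bytes each. The sketch below shows one plausible way to compute the memory address of a DCR register behind a bridge. The function name, the base address 0xD0000000, the one-word-per-register layout, and the 4 KB alignment check are all assumptions made for illustration; check the bridge documentation for the real mapping.

```python
DCR_REGISTER_COUNT = 1024  # 10-bit DCR address space
DCR_WORD_BYTES = 4         # 32-bit registers
DCR_WINDOW_BYTES = DCR_REGISTER_COUNT * DCR_WORD_BYTES  # 4 KB


def dcr_to_memory_address(bridge_base, dcr_number):
    """Map a DCR register number to a byte address in the bridge window.

    Assumes consecutive DCR numbers occupy consecutive 32-bit words.
    """
    if not 0 <= dcr_number < DCR_REGISTER_COUNT:
        raise ValueError("DCR number must fit in 10 bits")
    if bridge_base % DCR_WINDOW_BYTES != 0:
        raise ValueError("bridge base assumed to be 4 KB aligned")
    return bridge_base + dcr_number * DCR_WORD_BYTES


if __name__ == "__main__":
    base = 0xD0000000  # hypothetical bridge base address
    print(hex(dcr_to_memory_address(base, 0)))
    print(hex(dcr_to_memory_address(base, 1023)))
```

Because every resulting address is a multiple of four, accesses computed this way automatically satisfy the word-boundary requirement.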
OCM Bus
405 OCM interfaces
– The PPC405 has a separate interface used for high-speed access to on-chip memory
– The PPC405 presents the address on both the PLB bus and the OCM bus
  – Addresses cannot exist in both PLB and OCM space
– OCM addresses are non-cacheable, leaving the cache resources for PLB accesses
The processor block contains the OCM controllers
– The processor block contains dedicated controllers to interface between the OCM interface and FPGA BRAM
– There are separate, independent controllers for the I-side and D-side to provide higher performance
All signals are in big-endian format
OCM Bus
Features
– Independent 16-MB logical space for each of the DSOCM and ISOCM
  – 16 MB must be reserved regardless of the actual memory used
– 64-bit ISOCM and 32-bit DSOCM
– Up to 128 KB / 64 KB (ISOCM / DSOCM) using programmable BRAM aspect ratios
– Programmable processor-to-BRAM clock ratio
– DSBRAM load: BRAM initialization (Data2MEM), CPU, and FPGA using dual-port BRAM
– ISBRAM load: BRAM initialization (Data2MEM) and DCR
– CPU DCR-accessible registers
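Because each OCM side reserves a full 16 MB and an address must not exist in both PLB and OCM space, it can help to check a planned memory map for collisions. The following is a minimal sketch; the function names, the example base address 0xA8000000, and the requirement that the region be 16 MB aligned are assumptions for illustration.

```python
OCM_REGION_BYTES = 16 * 1024 * 1024  # 16 MB logical space per OCM side


def in_ocm_region(address, region_base):
    """True if the address falls inside the 16 MB region at region_base."""
    if region_base % OCM_REGION_BYTES != 0:
        raise ValueError("region base assumed to be 16 MB aligned")
    return region_base <= address < region_base + OCM_REGION_BYTES


def regions_overlap(base_a, size_a, base_b, size_b):
    """True if two address ranges share at least one byte."""
    return base_a < base_b + size_b and base_b < base_a + size_a


if __name__ == "__main__":
    dsocm_base = 0xA8000000  # hypothetical DSOCM base address
    # A PLB peripheral placed inside the reserved 16 MB would collide,
    # even if the BRAM behind the OCM is much smaller than 16 MB.
    print(regions_overlap(dsocm_base, OCM_REGION_BYTES, 0xA8800000, 0x1000))
    print(regions_overlap(dsocm_base, OCM_REGION_BYTES, 0xA9000000, 0x1000))
```

The check uses the full 16 MB rather than the installed BRAM size, since the whole region is reserved regardless of how much memory is attached.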
OCM Bus
Benefits
– Avoids loads into the cache, reducing pollution and thrashing
– Provides fast, fixed-latency execution
– On the D-side, dual-port BRAM enables a bidirectional data connection with the processor
Sample uses
– I-side: interrupt service routines, boot-code storage
– D-side: scratch-pad memory, bidirectional data transfer
Bus Timing
Use timing constraints to determine which ratio to use
PLB CLK
– Transaction synchronous with: processor clock
– Clock ratio: 1:1 to 16:1
– Example: processor clock at 300 MHz, PLB at 100 MHz
OPB CLK
– Transaction synchronous with: PLB clock
– Clock ratio: 1:1 to 4:1
– Example: PLB at 100 MHz, OPB at 50 MHz
DCR CLK
– Transaction synchronous with: processor clock
– Clock ratio: 1:1 to 8:1
– Example: processor clock at 300 MHz, DCR at 100 MHz
OCM CLK *
– Transaction synchronous with: processor clock
– Clock ratio: 1:1 to 4:1
– Example: processor clock at 300 MHz, OCM at 150 MHz
* There are two independent OCM clocks, one for each OCM controller:
– BRAMDSOCMCLK
– BRAMISOCMCLK
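The ratio limits above lend themselves to a quick sanity check before running the tools. The sketch below validates a candidate clock pair against those limits. The function and table names are hypothetical, and the requirement that the ratio be a whole number is an assumption of this sketch.

```python
# Maximum reference-clock to bus-clock ratio per bus, from the ratios listed above.
MAX_RATIO = {"PLB": 16, "OPB": 4, "DCR": 8, "OCM": 4}


def check_ratio(bus, reference_hz, bus_hz):
    """Return the integer ratio N (meaning N:1), or raise ValueError.

    reference_hz is the clock the bus is synchronous with: the processor
    clock for PLB, DCR, and OCM, and the PLB clock for OPB.
    """
    if reference_hz % bus_hz != 0:
        raise ValueError("reference clock is not an integer multiple of the bus clock")
    ratio = reference_hz // bus_hz
    if not 1 <= ratio <= MAX_RATIO[bus]:
        raise ValueError(f"{bus} ratio {ratio}:1 is outside 1:1 to {MAX_RATIO[bus]}:1")
    return ratio


if __name__ == "__main__":
    print("PLB", check_ratio("PLB", 300_000_000, 100_000_000))
    print("OPB", check_ratio("OPB", 100_000_000, 50_000_000))
    print("DCR", check_ratio("DCR", 300_000_000, 100_000_000))
    print("OCM", check_ratio("OCM", 300_000_000, 150_000_000))
```

A passing ratio check does not replace timing constraints; it only rules out combinations the bus cannot support at all.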
Skills Check
Review Questions
– What is the advantage of using the memory-mapped DCR component?
– What is the disadvantage of using the memory-mapped DCR component?
– Which buses are included in the CoreConnect standard?
Answers
What is the advantage of using the memory-mapped DCR component?
– It does not require inline assembly instructions to access the bus
What is the disadvantage of using the memory-mapped DCR component?
– It requires a PLB and an OPB transaction
Which buses are included in the CoreConnect standard?
– PLB, OPB, and DCR
Processor Use Models
Range of use models
1. State machine
– Lowest cost; no peripherals, no RTOS, and no bus structures
– Example applications: VGA and LCD controllers
– Low/high performance
2. Microcontroller
– Medium cost; some peripherals, possible RTOS and bus structures
– Example applications: control and instrumentation
– Moderate performance
3. Custom embedded
– Highest integration; extensive peripherals, RTOS, and bus structures
– Example applications: networking and wireless
– High performance
PowerPC Processor Note: The OCM bus does not connect to the cache controller
PowerPC Processor
A 32-bit implementation of the PowerPC embedded-environment architecture
Support for embedded-systems applications
– Flexible memory management
– Multiply and accumulate instructions for computationally intensive applications
– Enhanced debug capabilities
– 64-bit time base
– Programmable interval (PIT), fixed interval (FIT), and watchdog timers
Performance-enhancing features
– Static branch prediction
– Five-stage pipeline
– Hardware multiply/divide for faster integer arithmetic
– Enhanced string and multiple-word handling
– Minimized interrupt latency
PowerPC
Memory and peripherals
– The PPC405 uses 32-bit addresses
Special addresses
– Every PowerPC system should have the boot section starting at 0xFFFFFFFC
– The default program space occupies a contiguous address space from 0xFFFF0000 to 0xFFFFFFFF
– If interrupt handlers are present, the vector table must start on a 64 KB boundary
[Memory map diagram: addresses 0x0000_0000, 0xFFFF_0000, and 0xFFFF_FFFC (reset address); regions labeled Peripherals and PLB/OPB Memory]
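The special-address rules above reduce to simple arithmetic checks, sketched below. The function names are hypothetical. Note that only four bytes lie between the reset address and the top of the 32-bit address space, which is room for a single 32-bit instruction.

```python
RESET_VECTOR = 0xFFFFFFFC
DEFAULT_PROGRAM_START = 0xFFFF0000
DEFAULT_PROGRAM_END = 0xFFFFFFFF
ADDRESS_SPACE_BYTES = 1 << 32  # the PPC405 uses 32-bit addresses


def is_64k_aligned(address):
    """True if the address sits on a 64 KB boundary (low 16 bits are zero)."""
    return address & 0xFFFF == 0


def in_default_program_space(address):
    """True if the address lies in the default program space."""
    return DEFAULT_PROGRAM_START <= address <= DEFAULT_PROGRAM_END


def bytes_from_reset_vector_to_top():
    """Number of bytes between the reset address and the end of memory."""
    return ADDRESS_SPACE_BYTES - RESET_VECTOR


if __name__ == "__main__":
    print(in_default_program_space(RESET_VECTOR))
    print(is_64k_aligned(DEFAULT_PROGRAM_START))
    print(bytes_from_reset_vector_to_top())
```

The alignment helper can be applied to a candidate vector-table base address when interrupt handlers are present.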
Reset Sequence
PROC_SYS_RESET manages the order in which the reset signals are released:
– First: bus structures come out of reset
  – The PLB and OPB arbiters and bridges, for example
– Second: peripherals come out of reset 16 clocks later
  – UART, SPI, and IIC, for example
– Third: the CPUs come out of reset 16 clocks after the peripherals
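The staggered release can be modeled as a small schedule, which is handy when reasoning about what is safe to access at a given point in simulation. This is a behavioral sketch only; the group names and function names are hypothetical, and the clock being counted is not specified here.

```python
RELEASE_GAP_CLOCKS = 16  # gap between successive release stages


def reset_release_schedule(bus_release_clock=0):
    """Return the clock count at which each group leaves reset."""
    peripherals = bus_release_clock + RELEASE_GAP_CLOCKS
    cpu = peripherals + RELEASE_GAP_CLOCKS
    return {
        "bus_structures": bus_release_clock,
        "peripherals": peripherals,
        "cpu": cpu,
    }


def in_reset(group, clock, schedule):
    """True if the given group is still held in reset at this clock count."""
    return clock < schedule[group]


if __name__ == "__main__":
    schedule = reset_release_schedule()
    for clock in (0, 16, 31, 32):
        held = [g for g in schedule if in_reset(g, clock, schedule)]
        print("clock", clock, "still in reset:", held)
```

The ordering guarantees that buses and peripherals are already running when the CPU fetches its first instruction.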
JTAG TAP Options
At design time, you control whether each PowerPC JTAG TAP (there may be more than one PowerPC) is incorporated into the FPGA JTAG TAP chain after FPGA configuration or remains a separate chain
This is accomplished by instantiating or not instantiating the dedicated JTAGPPC block
Virtex-II Pro Split JTAG Chains
The isolated chain supports embedded development and debug tools
User-defined JTAG pins
– Provide a direct and isolated connection to the PPC405 JTAG interface
– The JTAGPPC block is not used in this configuration
[Diagram labels: PPC405 CPU JTAG debug port (TDI, TDO) on user-defined JTAG pins on the FPGA; FPGA JTAG configuration port on fixed/dedicated JTAG pins on the FPGA]
Virtex-II Pro Combined JTAG Chains
The combined chain supports development and debug tools
– ChipScope Pro (PC4)
– iMPACT (PC4)
– GDB (PC4)
– SingleStep XE (visionPROBEII)
Using the JTAGPPC block
– Integrates the PPC405 with the FPGA fabric JTAG chain (dedicated JTAG pins)
[Diagram labels: PPC405 CPU JTAG debug port; JTAGPPC block; FPGA JTAG configuration port; TDI and TDO; user-defined JTAG pins on the FPGA; fixed/dedicated JTAG pins on the FPGA]
Skills Check
Review Questions
– What connections must be made to debug software on the IBM PowerPC processor?
– Where does the reset vector reside in the PowerPC processor?
Answers
What connections must be made to debug software on the IBM PowerPC processor?
– The PowerPC JTAG ports must connect to either the JTAGPPC component or the external pins
Where does the reset vector reside in the PowerPC processor?
– 0xFFFFFFFC
Where Can I Learn More?
Tool documentation
– Processor IP Reference Guide
Processor documentation
– PowerPC™ Processor Reference Guide
– PowerPC 405 Processor Block Reference Guide
– MicroBlaze™ Processor Reference Guide
Support website
– EDK Website: