Keystone PCIe Usage Eric Ding.


Agenda PCIe Overview Address Translation Configuration PCIe Demo


PCIe Topology Example
PCIe uses a tree structure in which nodes are connected to each other by point-to-point links. The root node is called the root complex (RC), the leaf nodes are called endpoints (EP), and the nodes that connect multiple devices to each other are called switches (SW).

KeyStone Architecture
[Block diagram: CorePac & Memory Subsystem (1 to 8 C66x CorePac cores at up to 1.25 GHz, each with L1 P-Cache, L1 D-Cache and L2 SRAM; MSMC with MSM SRAM; 64-bit DDR3 EMIF for memory expansion), application-specific coprocessors, Multicore Navigator (Queue Manager and Packet DMA), Network Coprocessor (accelerator and security blocks), HyperLink, TeraNet, EDMA, PLL, power management, debug & trace, boot ROM, semaphore, and the external interfaces listed below.]

External interfaces:
• SGMII allows two 10/100/1000 Ethernet interfaces
• Four high-bandwidth Serial RapidIO (SRIO) lanes for inter-DSP applications
• SPI for boot operations
• UART for development/testing
• Two PCIe lanes at 5 Gbps
• I2C for EEPROM at 400 Kbps
• Application-specific interfaces: Antenna Interface 2 (AIF2) for wireless applications; Telecommunications Serial Port (TSIP) x2 for media applications

Shannon expands on Nyquist's processing power by doubling the C66x DSP core count to 8. This configuration provides a high-performance IP network transport solution: it excels at layer 2 processing and also addresses the layer 3 processing requirements of wireless base stations, while offering significantly lower power consumption than traditional GPPs. Shannon provides a layer 2, layer 3 and IP network/transport coprocessor for fast-path processing, along with Linux support, which makes it an excellent platform for layer 2 and 3 processing. The Shannon and Nyquist / Turbo Nyquist combination is ideal for large-scale macro base stations.

Using HyperLink 50, Shannon can also be configured as an extension of Nyquist / Turbo Nyquist. HyperLink 50 is a very low latency (<200 ns), high data rate (50 Gbps) interface that is typically used as a bridge between Shannon and Turbo Nyquist. In addition to its low latency and high data rate, this connection is transparent to the software. A Shannon / Turbo Nyquist pairing can combine into a very high performance 12-core SoC. The coprocessors can work as a 'farm' shared between the 12 cores. Multicore Navigator and its unified communication capabilities facilitate such coupling of devices natively. Optionally, the I/O can be powered down to reduce overall power consumption if only the Shannon cores are required.

PCIe Features
• Compliant with the PCI-SIG PCI Express Base Specification (Rev. 2.0)
• Root Complex (RC) and End Point (EP) operation modes. In EP mode, both legacy EP mode and native PCIe EP mode are supported.
• The mode is set from the bootstrap pins PCIESSMODE[1:0] at power-up (00 -> EP, 01 -> Legacy EP, 10 -> RC). Software can override the setting by changing the PCIESSMODE bits in the DEVSTAT register.
• Gen1 (2.5 Gbps) and Gen2 (5.0 Gbps) link speeds
• x2 lanes
• Outbound/inbound maximum payload size of 128/256 bytes
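The PCIESSMODE encoding above can be sketched as a small decoder. This is an illustrative sketch only: the type and function names are hypothetical, and the position of the PCIESSMODE field inside DEVSTAT is not given on the slide, so the function takes the two bits after they have already been extracted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; only the 2-bit encoding comes from the slide. */
typedef enum {
    PCIE_MODE_EP        = 0, /* 00 -> native PCIe End Point */
    PCIE_MODE_LEGACY_EP = 1, /* 01 -> legacy End Point      */
    PCIE_MODE_RC        = 2, /* 10 -> Root Complex          */
    PCIE_MODE_UNLISTED  = 3  /* 11 -> not listed on the slide */
} pcie_mode_t;

/* Decode a PCIESSMODE[1:0] value that has already been shifted and
 * masked out of DEVSTAT by the caller. */
static pcie_mode_t pcie_decode_mode(uint32_t pciessmode_bits)
{
    assert(pciessmode_bits <= 3u);
    return (pcie_mode_t)pciessmode_bits;
}

/* Short human-readable label for logging. */
static const char *pcie_mode_name(pcie_mode_t mode)
{
    switch (mode) {
    case PCIE_MODE_EP:        return "EP";
    case PCIE_MODE_LEGACY_EP: return "Legacy EP";
    case PCIE_MODE_RC:        return "RC";
    default:                  return "unlisted";
    }
}
```

For example, a value of binary 10 read from the bootstrap pins decodes to PCIE_MODE_RC.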


Address Translation
A PCIe device uses PCIe addresses to transmit and receive packets over a PCIe link.
• Outbound transfer: the local device initiates the transactions to write to or read from the external device. The CPU or the device-level EDMA is used for outbound data transfer; the PCIe module does not have a built-in EDMA.
• Inbound transfer: the external device initiates the transactions to write to or read from the local device. The PCIe module has a master port to transfer the data to or from the device memory, so no CPU or EDMA is needed in the local device for inbound transfer.
• BAR (Base Address Register): used to accept or reject a TLP (Transaction Layer Packet).

Outbound Translation - 1
• The PCIe data space is 256 MB (0x6000_0000~0x6FFF_FFFF).
• Outbound translation is enabled or disabled through the CMD_STATUS register. When enabled, an outbound address in the PCIe data space (0x6000_0000~0x6FFF_FFFF) can be modified to a new address based on the outbound translation rules.
• The data space is equally divided into 32 regions.
Registers for outbound (OB) translation:
• OB_SIZE: sets the size of the 32 equally sized translation regions to 1 MB, 2 MB, 4 MB or 8 MB.
• OB_OFFSET_INDEXn (n = 0 to 31): holds bits [31:20] of the PCIe address for 32-bit or 64-bit addressing. Not all of these bits are used (this depends on OB_SIZE). Bit [0] enables the outbound region.
• OB_OFFSETn_HI (n = 0 to 31): holds bits [63:32] of the PCIe address for 64-bit addressing. It must be zero for 32-bit addressing.
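The register layout described above can be sketched as follows. This is a minimal sketch, not driver code: the function names are hypothetical, and it assumes only what the slide states, namely that OB_SIZE values 0 to 3 select 1/2/4/8 MB regions, that the upper address bits live in OB_OFFSET_INDEXn, and that bit 0 is the region enable.

```c
#include <assert.h>
#include <stdint.h>

/* Region size in bytes for an OB_SIZE encoding of 0..3 (1/2/4/8 MB). */
static uint32_t ob_region_bytes(uint32_t ob_size)
{
    assert(ob_size <= 3u);
    return (uint32_t)1u << (20u + ob_size);
}

/* Build a value for OB_OFFSET_INDEXn (hypothetical helper).
 * Keeps only the PCIe address bits above the region size and sets
 * bit 0 when the region should be enabled. */
static uint32_t ob_offset_index_value(uint32_t pcie_base_lo,
                                      uint32_t ob_size,
                                      int enable)
{
    uint32_t addr_mask = ~(ob_region_bytes(ob_size) - 1u);
    return (pcie_base_lo & addr_mask) | (enable ? 1u : 0u);
}
```

With 1 MB regions, a PCIe base address of 0x9000_0000 and the enable bit set, the helper produces 0x9000_0001.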

Outbound Translation - 2

Outbound Translation - 3

OB_SIZE     Region indexing (address bits)   Translation (OB_OFFSET_INDEXn bits)
0 (1 MB)    [24:20]                          [31:20]
1 (2 MB)    [25:21]                          [31:21]
2 (4 MB)    [26:22]                          [31:22]
3 (8 MB)    [27:23]                          [31:23]

Example:
OB_SIZE: 1 MB
OB_OFFSET_INDEX0 = 0x9000_0001
OB_OFFSET0_HI = 0x0
PCIe data space address: 0x6001_5678
What is the translated PCIe address?

Outbound Translation - 4
Example continues:
• Because OB_SIZE = 1 MB, bits [24:20] of the address are used for region indexing. For 0x6001_5678 these bits are 0, so the region index is 0. The lower 20 bits (bits 19 to 0) give the offset into the region.
• Region 0 uses OB_OFFSET_INDEX0 and OB_OFFSET0_HI. The upper 32 bits of the region base address come from OB_OFFSET0_HI = 0, and bits [31:20] of the base address come from the upper 12 bits of OB_OFFSET_INDEX0, so the combined base address of region 0 is 0x0000_0000_900X_XXXX.
• The translated PCIe address = bits [31:20] of 0x9000_0000 combined with bits [19:0] of 0x6001_5678 = 0x9001_5678.
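The outbound rules and the worked example can be checked with a small software model. This is a sketch under stated assumptions: the struct and function names are hypothetical, the registers are modeled as plain arrays rather than memory-mapped hardware, and the behavior for a disabled region (returning an error) is an assumption of the sketch rather than something the slides specify.

```c
#include <assert.h>
#include <stdint.h>

#define PCIE_DATA_BASE 0x60000000u /* start of the 256 MB PCIe data space */
#define PCIE_DATA_SIZE 0x10000000u
#define OB_NUM_REGIONS 32u

/* Software model of the outbound registers (hypothetical layout). */
typedef struct {
    uint32_t ob_size;                      /* 0..3 -> 1/2/4/8 MB regions */
    uint32_t offset_index[OB_NUM_REGIONS]; /* OB_OFFSET_INDEXn           */
    uint32_t offset_hi[OB_NUM_REGIONS];    /* OB_OFFSETn_HI              */
} ob_config_t;

/* Translate a data-space address to a PCIe address.
 * Returns 0 on success, -1 if the address is outside the data space
 * or the selected region is not enabled. */
static int ob_translate(const ob_config_t *cfg, uint32_t local_addr,
                        uint64_t *pcie_addr)
{
    assert(cfg->ob_size <= 3u);
    /* Unsigned subtraction also rejects addresses below the base. */
    if (local_addr - PCIE_DATA_BASE >= PCIE_DATA_SIZE)
        return -1;

    uint32_t shift = 20u + cfg->ob_size;           /* log2(region size)   */
    uint32_t region = (local_addr >> shift) & (OB_NUM_REGIONS - 1u);
    uint32_t offset_mask = ((uint32_t)1u << shift) - 1u;
    uint32_t reg = cfg->offset_index[region];

    if ((reg & 1u) == 0u)                          /* bit 0 is the enable */
        return -1;

    uint32_t base_lo = reg & ~offset_mask;         /* drops enable bit too */
    *pcie_addr = ((uint64_t)cfg->offset_hi[region] << 32)
               | base_lo
               | (local_addr & offset_mask);
    return 0;
}
```

Loading OB_OFFSET_INDEX0 = 0x9000_0001 and OB_OFFSET0_HI = 0 with OB_SIZE = 0 and translating 0x6001_5678 yields 0x9001_5678, matching the example.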

Inbound Translation - 1
• Inbound translation is enabled or disabled through the CMD_STATUS register.
• During negotiation, the RC and the EP exchange memory requests. These values are saved in the BAR registers.
• BARn: there are two BARs (BAR0~1) in RC mode and six BARs (BAR0~5) in EP mode. Each register overlays the initial address and the MASK (selected by the DBI_CS2 bit in the CMD_STATUS register).
• BAR0 cannot be remapped to any location other than the PCIe application registers (starting from 0x2180_0000 in KeyStone devices). This allows the RC device to control the EP in the absence of dedicated software running on the EP.
• During initialization, the values in the BARs are used to build up to four memory regions.

Inbound Translation - 2
Each memory region has the following registers:
• IB_BAR: Inbound Translation Match Register (write the MASK into IB_BAR; a read gives the BAR number to match for the inbound translation)
• IB_START_LO: Inbound Translation Start Address Low Register
• IB_START_HI: Inbound Translation Start Address High Register
• IB_OFFSET: Inbound Translation device base address
For an inbound address A, the following happens:
• Using the BAR MASK, A is compared with the low and high addresses of BAR0 to BAR5 to see if there is a match.
• If there is no match, the address is rejected.
• If there is a match, the internal device address is calculated as follows: the first BAR number that covers A is determined, and the IB_BARn registers (n = 0 to 3) are searched for the region that holds this BAR number. The difference A - IB_START (64 bits, high and low) is calculated, and the result is added to the device base address (IB_OFFSET).
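The inbound steps above can be sketched as a software model. Everything here is illustrative: the struct layouts, the `valid` flag and the function name are hypothetical, only 32-bit BARs are modeled, and a BAR with a zero mask is treated as unimplemented, which is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BARS       6u /* BAR0..BAR5 in EP mode       */
#define NUM_IB_REGIONS 4u /* up to four inbound regions  */

/* 32-bit BAR model: base address plus size mask (size - 1). */
typedef struct {
    uint32_t base;
    uint32_t mask;
} bar_t;

/* One inbound translation region (hypothetical software model). */
typedef struct {
    int      valid;       /* sketch-only flag: region is programmed */
    uint32_t ib_bar;      /* BAR number this region matches         */
    uint32_t ib_start_lo; /* IB_START_LO                            */
    uint32_t ib_start_hi; /* IB_START_HI                            */
    uint32_t ib_offset;   /* device base address                    */
} ib_region_t;

/* Returns 0 and writes the internal address on a match, -1 if the
 * inbound address is rejected. */
static int ib_translate(const bar_t *bars, const ib_region_t *regions,
                        uint64_t pcie_addr, uint32_t *local_addr)
{
    assert(bars && regions && local_addr);
    if ((pcie_addr >> 32) != 0u)
        return -1;                    /* sketch handles 32-bit BARs only */
    uint32_t a = (uint32_t)pcie_addr;

    for (uint32_t bar = 0u; bar < NUM_BARS; bar++) {
        if (bars[bar].mask == 0u)
            continue;                 /* treated as unimplemented BAR */
        if ((a & ~bars[bar].mask) != bars[bar].base)
            continue;                 /* this BAR does not cover A    */

        /* First covering BAR found: look for the region holding it. */
        for (uint32_t r = 0u; r < NUM_IB_REGIONS; r++) {
            if (regions[r].valid && regions[r].ib_bar == bar) {
                uint64_t start = ((uint64_t)regions[r].ib_start_hi << 32)
                               | regions[r].ib_start_lo;
                *local_addr = (uint32_t)(pcie_addr - start)
                            + regions[r].ib_offset;
                return 0;
            }
        }
        return -1;                    /* BAR matched, no region mapped */
    }
    return -1;                        /* no BAR covers A: rejected     */
}
```

With BAR1 covering 0xF740_0000 under mask 0x000F_FFFF, region 0 set to BAR number 1, IB_START0 = 0xF740_0000 and IB_OFFSET0 = 0x1080_0000, the inbound address 0xF740_1234 maps to 0x1080_1234.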

Inbound Translation - 3
Example (assume that BAR1 covers the inbound address):
For a 32-bit BAR, IB_BAR0 = 1 -> BAR1_MASK = 0x000F_FFFF
IB_START0_LO = 0xF740_0000
IB_START0_HI = 0x0
IB_OFFSET0 = 0x1080_0000
For PCIe address 0xF740_1234, what is the DSP device's internal address?

Inbound Translation - 4
Example calculation:
• The incoming address 0xF740_1234 (covered by BAR1), with the bits selected by the mask 0x000F_FFFF cleared, gives 0xF740_0000.
• This matches the IB_START0 register, so the address is accepted. BAR1 is the BAR number held in IB_BAR0.
• DSP internal address: 0xF740_1234 - 0xF740_0000 + 0x1080_0000 = 0x1080_1234 (local L2)


PCIe Initialization
Boot mode: PCIe boot is selected through the boot mode pins on the 6678/6670 EVM boards.
IBL code:
• PLL workaround (6678 Errata, advisory 8)
• Power up PCIe
• Configure PLL
• Configure PCIe registers
• Wait for PCIe link-up
• Stay inside IBL and monitor the magic address (6678: 0x87FFFC; 6670: 0x8FFFFC) for secondary boot
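The last step, monitoring the magic address, can be sketched as a bounded polling loop. This is not the IBL source: the function name is hypothetical, and the sketch assumes that the host signals secondary boot by writing a nonzero entry address to the magic location, which the slide does not state explicitly. The pointer is passed in so that the loop can be exercised off-target with an ordinary variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Magic addresses from the slide. */
#define MAGIC_ADDR_C6678 0x0087FFFCu
#define MAGIC_ADDR_C6670 0x008FFFFCu

/* Poll the magic word until it becomes nonzero or max_polls is reached.
 * Returns the value found (assumed here to be the boot entry address
 * written by the host over PCIe), or 0 if nothing was written. */
static uint32_t wait_for_boot_entry(volatile const uint32_t *magic,
                                    uint32_t max_polls)
{
    assert(magic != NULL);
    for (uint32_t i = 0u; i < max_polls; i++) {
        uint32_t entry = *magic; /* volatile read: host may change it */
        if (entry != 0u)
            return entry;
    }
    return 0u;
}
```

On a 6678 target the caller would pass a pointer built from MAGIC_ADDR_C6678 and, on a nonzero result, branch to the returned address.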

PCIe Boot


Demo C:\ti\pdk_C6678_xx_yy\packages\ti\drv\pcie\example\sample Example instructions are in the readme file and in the next slides

Demo
******************************************************************************
* FILE PURPOSE: Readme File for the PCIE Example Project
* FILE NAME: Readme.txt
* Copyright (C) 2011, Texas Instruments, Inc.
******************************************************************************
The example demonstrates the use of APIs provided in PCIE LLD.
This example does NOT work in the Shannon Simulator.
Check the release notes for pre-requisites, tools and version information.

------------------
Example Overview
------------------
In the PCIe sample example two Shannon EVMs are used to test the PCIe driver. As described in the following figure, Shannon 1 is configured as a Root Complex and Shannon 2 is configured as End Point.

      Shannon 1                                     Shannon 2
 ------------------                            ------------------
 |                |                            |                |
 |      Root      |         PCIe Link          |   End Point    |
 |    Complex     | <------------------------> |                |
 ------------------                            ------------------

Demo
At startup, each EVM configures its PCIe subsystem:
• Serdes, clock, PLL
• PCIe Mode and Power domain
• Inbound/Outbound address translation and BAR registers
• Link training is triggered
Once the PCIe link is established, the following sequence of events will happen:
• Shannon 1 sends data to Shannon 2
• Shannon 2 waits to receive all the data
• Shannon 2 sends the data back to Shannon 1
• Shannon 1 waits to receive all the data
• Shannon 1 verifies if the received data matches the sent data and declares test pass or fail.

Demo
-------------------------
Steps to run the example
-------------------------
1. Build the example
2. Do a System Reset in both EVMs
3. Load exampleProject.out in core zero in both EVMs
4. In Shannon 1, use the CCS watch window to modify the value of the global variable PcieModeGbl. In this example, Shannon 1 is a Root Complex, therefore set PcieModeGbl to pcie_RC_MODE (this can be done through a drop-down menu in the watch window).
5. Click the "Run" button in CCS for both EVMs (it is okay if a few seconds pass between clicking the "Run" button on each side).

Demo
-------------------------
Expected result
-------------------------
1. In the Shannon 1 CCS console the status of the test will be updated. At the end, the message "Test passed" is expected.
2. In the Shannon 2 CCS console the status of the test will be updated. At the end, the message "End of test" is expected.

For More Information
For more information, refer to the PCI Express (PCIe) for KeyStone Devices User's Guide. For questions regarding topics covered in this training, visit the support forums at the TI E2E Community website.