Published by Andra Thompson. Modified over 9 years ago.
1
Distributed Processors Allow Revolutionary Hardware & Software Partitioning
Version 1.1 – March 2002 – APD / J-L Brelet & P Hardy - All rights reserved - © XILINX 2002
8th Workshop on Electronics for LHC Experiments, 9 – 13 September 2002, Colmar (France)
Authors: Jean-Reynald Mace & Jean-Louis Brelet / Xilinx
2
Colmar Workshop XILINX, Sept. 02 p 2
Agenda
System Partitioning
– Traditional techniques
– Innovative approaches
Example 1: DES Encryption Algorithm
– HW solution compared to SW solution
Example 2: Wireless LAN
– HW / SW trade-off
Enabling Technology: Virtex-II Pro
3
System Partitioning
Definition:
– “The mapping of a system level architecture into specific HW and SW components based upon application requirements”
Today's implementation in:
– Fixed HW components: FPGA, ASIC, ASSP, …
– SW components: code running on CPUs, DSP processors, microcontrollers, …
Diagram labels: Hardware Components, Embedded Software, Application, Control, Management
4
Example System Functions
Hardware:
– Physical Layer
– Memory Interfaces
– Protocol Bridges
– Finite State Machine
– Signal Processing
– Encryption
Software:
– Protocol Stack
– User Interface
– Diagnostics
– Control
– Signal Processing
– Encryption
5
Optimal Solutions Enabled by On-Demand Architectural Synthesis
Hardware:
– Physical Layer
– Memory Interfaces
– Protocol Bridges
– FSM
– Signal Processing
– Encryption
Software:
– Protocol Stack
– User Interface
– Diagnostics
– Control
– Signal Processing
– Encryption
Diagram label: Flexible Mapping
6
Traditional System Design
Fixed HW / SW partitioning
Early and final architecture mapping
Critical commitment made at concept level
Diagram labels: SW mgr, SW dev, SW dev; Fixed Interface; HW mgr, HW eng, PCB eng; Hardware Components; Embedded Software
7
New System Partitioning
Flexible HW / SW partitioning
– Enables tradeoffs throughout the process
Architecture redefinition possible
– Tune for optimal performance and cost
Diagram labels: HW Team, SW Team, HW Team; Hardware Components; Embedded Software; Flexible Interface
8
Innovative Partitioning
New System Approach:
– Enables non-traditional system architecture: SW modules can be implemented in HW, and HW modules can be moved to SW
– Requires a scalable and flexible platform that enables optimal HW / SW integration
Co-Design Methodology:
– Design attributes optimized during development (performance, resource usage, …)
– SW developers and HW engineers create solutions at module level for optimal systems
9
Agenda
System Partitioning
– Traditional techniques
– Innovative approaches
Example 1: DES Encryption Algorithm
– HW solution compared to SW solution
Example 2: Wireless LAN
– HW / SW trade-off
Enabling Technology: Virtex-II Pro
10
DES Overview
DES Algorithm:
– Message is split into fixed-length blocks
– Each block is encoded with a fixed “key”
– Block length = 64 bits (advanced 128-b), key length = 56 bits
3DES Is an Enhanced Version of Encryption / Decryption
– If Key 1 = Key 2 = Key 3, then 3DES is fully compatible with DES
Diagram: Data → Encrypt (Key 1) → Decrypt (Key 2) → Encrypt (Key 3)
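The block splitting and the Encrypt / Decrypt / Encrypt chaining on this slide can be sketched in a few lines of Python. This is only an illustration of the structure: `toy_encrypt` and `toy_decrypt` are a hypothetical 64-bit Feistel cipher standing in for DES (they are not DES), and the zero padding and key schedule are assumptions made for the example.

```python
# Toy 64-bit Feistel cipher standing in for DES. NOT real DES: illustration only.
MASK32 = 0xFFFFFFFF

def _round(half: int, subkey: int) -> int:
    """Hypothetical round function; a Feistel network is invertible for any such function."""
    return ((half * 0x9E3779B1) ^ subkey ^ (half >> 7)) & MASK32

def _subkeys(key: int, rounds: int = 8) -> list[int]:
    """Toy key schedule (an assumption, unrelated to the DES key schedule)."""
    return [((key >> (r * 3)) ^ (key * (r + 1))) & MASK32 for r in range(rounds)]

def toy_encrypt(block: int, key: int) -> int:
    left, right = block >> 32, block & MASK32
    for k in _subkeys(key):
        left, right = right, left ^ _round(right, k)
    return (left << 32) | right

def toy_decrypt(block: int, key: int) -> int:
    left, right = block >> 32, block & MASK32
    for k in reversed(_subkeys(key)):
        left, right = right ^ _round(left, k), left
    return (left << 32) | right

def split_blocks(message: bytes, block_bytes: int = 8) -> list[int]:
    """Split a message into fixed-length 64-bit blocks (zero padding is an assumption)."""
    padded = message + b"\x00" * (-len(message) % block_bytes)
    return [int.from_bytes(padded[i:i + block_bytes], "big")
            for i in range(0, len(padded), block_bytes)]

def ede_encrypt(block: int, k1: int, k2: int, k3: int) -> int:
    """Triple encryption: Encrypt(Key 1), Decrypt(Key 2), Encrypt(Key 3)."""
    return toy_encrypt(toy_decrypt(toy_encrypt(block, k1), k2), k3)

def ede_decrypt(block: int, k1: int, k2: int, k3: int) -> int:
    """Inverse chain: Decrypt(Key 3), Encrypt(Key 2), Decrypt(Key 1)."""
    return toy_decrypt(toy_encrypt(toy_decrypt(block, k3), k2), k1)
```

With all three keys equal, the middle decryption undoes the first encryption, so the chain collapses to a single encryption. That is why the slide can claim compatibility with single DES.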
11
System Integrator’s Dilemma
DES Is a Simple Algorithm
System Engineer Has To Evaluate:
– SW coding compared to HW implementation
– Need for a specific processor and its performance
– Need for a dedicated solution
– Cost-effectiveness of an ASSP solution
– Level of customization required
– Fixed or flexible implementation
12
Architectural Options
Popular DES Algorithm Is Available As SW Code:
– Public domain C or C++ code
– Example encryption data rates for 128-b DES:
  TMS320C62xx at 200 MHz delivers ~100 Mbps (*)
  MIPS 64-b RISC at 250 MHz delivers ~400 Mbps (*)
  Pentium III at 1 GHz delivers ~460 Mbps (*)
HW Implementation Available At:
– www.opencores.org
– Over 1.5 Gbps data rate in Virtex-II at 130 MHz (*)
3DES 56-b Algorithm Achieves 10.7 Gbps Throughput
– Xilinx record-breaking announcement in April 2002
(*) Source: Helion Technology Limited, Xilinx Design Consultant (Xilinx Xcell Journal, Issue 43, Summer 2002)
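One way to compare these figures is to normalize each data rate by its clock frequency, giving bits encrypted per clock cycle. The short sketch below does that arithmetic using only the approximate numbers quoted on the slide; the dictionary and function names are hypothetical, and the Virtex-II rate is a lower bound ("over 1.5 Gbps").

```python
# (data rate in bit/s, clock in Hz), approximate figures as quoted on the slide.
IMPLEMENTATIONS = {
    "TMS320C62xx (SW)": (100e6, 200e6),
    "MIPS 64-b RISC (SW)": (400e6, 250e6),
    "Pentium III (SW)": (460e6, 1e9),
    "Virtex-II (HW)": (1.5e9, 130e6),  # "over 1.5 Gbps", so treat as a lower bound
}

def bits_per_cycle(rate_bps: float, clock_hz: float) -> float:
    """Throughput normalized by clock frequency."""
    return rate_bps / clock_hz

for name, (rate, clock) in IMPLEMENTATIONS.items():
    print(f"{name:22s} {bits_per_cycle(rate, clock):6.2f} bits/cycle")
```

On these numbers the software implementations stay between roughly 0.5 and 1.6 bits per cycle, while the FPGA implementation exceeds 11 bits per cycle. The hardware advantage comes from parallelism and not from clock speed.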
13
Mixed HW / SW Solution
Encryption / Decryption Data Path:
– DES encryption module is called twice
– Decryption requires more compute power
Diagram labels: Decrypt, Encrypt; DES Decryption Algorithm; Processor; DES Encryption Algorithm; Processor; HW; Data Flow
14
Full HW Implementation
Full HW Implementation:
– Shared Encryptor
Full HW Pipelined Solution:
– Easy to add parallelism
– Easy to couple to distributed processors
Or No Processor?
Diagram labels: Encrypt, Decrypt, Other Tasks, Processor, HW, Data Flow
15
Choices of HW / SW Partition
Various Solutions To Fit Each Performance / Cost Requirement:
– SW vs HW vs mixed HW / SW
New Approach:
– On-Demand Architecture Synthesis to modify the HW / SW trade-off dynamically
Distributed Processors Offer Another Level Of Flexibility Through Parallel Implementations
16
Agenda
System Partitioning
– Traditional techniques
– Innovative approaches
Example 1: DES Encryption Algorithm
– HW solution compared to SW solution
Example 2: Wireless LAN
– HW / SW trade-off
Enabling Technology: Virtex-II Pro
17
Networking Application: Wireless LAN
Intra Forwarding Technique:
– Video transmission: MPEG2
– File transfer: FTP
QoS
18
Wireless LAN: Access Point Architecture
Layers: Application Layer, Presentation Layer, Session Layer, Transport Layer, Network Layer, Data Link Layer, Physical Layer
Diagram labels: Bus, HOST I/F, Medium Access Control, Channel Access Control
19
Wireless LAN: QoS
Wireless LAN example:
– Intra forwarding technique
– Complex network access algorithms with a few levels of prioritization in order to guarantee the QoS
Select Most Urgent Frame
– Choice is based on a few parameters:
– Priority (P0 to Pn)
– Lifetime (Normalized Residual Lifetime, …)
Diagram labels: Pointer: CAPUPRLDISNRLDB; P0 … Pn; 256 Ptrs; 64 Bits; Ptr of the Selected Frame; Ptr of the Received Frame
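A minimal sketch of the "select most urgent frame" step might look as follows. The slide only names the two criteria (priority and normalized residual lifetime), so the ordering rule here is an assumption: priority 0 is taken as the most urgent class, ties are broken by the smallest residual lifetime, and frames whose lifetime has reached zero are skipped. `FramePtr` and `select_most_urgent` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass(frozen=True)
class FramePtr:
    ptr: int       # position of the frame in the pointer table
    priority: int  # 0 .. n; assumed here that 0 is the most urgent class
    nrl: int       # normalized residual lifetime; assumed smaller = closer to expiry

def select_most_urgent(table: Iterable[FramePtr]) -> Optional[FramePtr]:
    """Pick the live frame with the best priority, then the shortest residual lifetime."""
    live = [f for f in table if f.nrl > 0]  # expired frames are not candidates
    if not live:
        return None
    return min(live, key=lambda f: (f.priority, f.nrl))

table = [
    FramePtr(ptr=1, priority=2, nrl=5),
    FramePtr(ptr=3, priority=0, nrl=40),
    FramePtr(ptr=0, priority=0, nrl=12),
    FramePtr(ptr=10, priority=1, nrl=0),   # already expired
]
print(select_most_urgent(table))
```

A real access point would probably weigh the two criteria differently, for example letting a nearly expired low-priority frame overtake a fresh high-priority one. The strict lexicographic order above is just the simplest rule to state.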
20
QoS: Full Hardware
Design in FPGA:
– FSM-like design with adder/subtractor (~1000 LUTs / 50 MHz)
– One table of pointers implemented in FPGA Block RAM: 2 BRAMs used for 4 priorities
– Pipelining used
– Easy to manage the lifetime (update every 10 µs)
Complex Function in HW:
– Electing two frames from one table of pointers by scrolling and comparison techniques
Diagram labels: Table of ptrs of frames to be transmitted; Elected ptr of frame to transmit; F11, F1, F3, F0, F10; Permutation
21
QoS: Full Software
Design in Firmware:
– Simple: ~250 lines of C code
– Microprocessor used: PPC 405
– One table of pointers per priority in external memory (SDRAM)
– Sort algorithm is well known and easy to implement
Complex Function in SW:
– System real-time requirement
– Frame lifetime controlled by a set of timers: at the same time as a new frame arrives, an existing frame may have to move from the upper priority table …
Diagram labels: F41, F52, F7, F22, F11, F31, F10, F21, F1, F3, F0, F11; Highest Priority Table; Elected ptr of frame to transmit
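The firmware structure described here (one sorted pointer table per priority, with lifetimes aged by timers) could be modelled roughly as below. This is a sketch in Python, not the ~250 lines of C from the talk. The migration rule is an assumption, since the slide does not spell it out: a frame whose remaining lifetime drops below a hypothetical threshold `PROMOTE_BELOW` moves one table up, and a frame whose lifetime reaches zero is dropped.

```python
import bisect

NUM_PRIORITIES = 4   # the hardware variant in the talk also uses 4 priorities
PROMOTE_BELOW = 10   # hypothetical threshold (in ticks) for moving a frame up one table

class QosTables:
    """One pointer table per priority, each kept sorted by remaining lifetime."""

    def __init__(self) -> None:
        # tables[0] is assumed to be the highest-priority table; entries are (lifetime, ptr)
        self.tables = [[] for _ in range(NUM_PRIORITIES)]

    def insert(self, priority: int, lifetime: int, ptr: int) -> None:
        """Sorted insertion of a newly received frame pointer."""
        bisect.insort(self.tables[priority], (lifetime, ptr))

    def tick(self, elapsed: int = 1) -> None:
        """Timer handler: age every frame, drop expired ones, promote urgent ones."""
        aged = [[] for _ in range(NUM_PRIORITIES)]
        for prio, table in enumerate(self.tables):
            for lifetime, ptr in table:
                remaining = lifetime - elapsed
                if remaining <= 0:
                    continue  # frame expired before it could be sent
                target = prio - 1 if (prio > 0 and remaining < PROMOTE_BELOW) else prio
                bisect.insort(aged[target], (remaining, ptr))
        self.tables = aged

    def elect(self):
        """Return the ptr of the frame to transmit: head of the highest non-empty table."""
        for table in self.tables:
            if table:
                return table.pop(0)[1]
        return None
```

Every `tick` walks all tables, and its cost grows with the number of queued frames. That is one way to see why the real-time requirement is the hard part of the software-only design.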
22
QoS: Mixed HW / SW
Hardware Module:
– Lifetime update and moving ptrs between tables
– Design: FSM-like design with adder/subtractor (~200 LUTs / 50 MHz); 4 tables of pointers per priority in the FPGA Block RAM; lifetime updated by scrolling; semaphore
Software / Hardware Interface:
– Semaphore-based communication
Software Module:
– Insertion and sort of the tables
– Design: easy to write (~200 lines of C code); sort algorithm; semaphore lib
Diagram labels: F41, F52, F7, F22, …, F41, F52, F7, F22
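The division of labour on this slide can be mimicked in software to show how a semaphore arbitrates access to the shared pointer table. In the sketch below a Python thread stands in for the hardware module, and a single table replaces the per-priority tables. Both are simplifications, and `sw_insert` and `hw_scroll` are hypothetical names; the actual semaphore library from the talk is not shown in the source.

```python
import bisect
import threading

table_sem = threading.Semaphore(1)  # one owner at a time: HW module or SW module
table = []                          # shared pointer table of (lifetime, ptr), kept sorted

def sw_insert(lifetime: int, ptr: int) -> None:
    """Software module: sorted insertion of a new frame pointer."""
    with table_sem:
        bisect.insort(table, (lifetime, ptr))

def hw_scroll(elapsed: int = 1) -> None:
    """Stand-in for the hardware module: one scrolling pass that ages every entry."""
    with table_sem:
        # A uniform decrement preserves the sort order; expired entries are dropped.
        table[:] = [(life - elapsed, ptr) for life, ptr in table if life - elapsed > 0]

# Software fills the table, then the "hardware" performs five lifetime updates.
for life, ptr in [(30, 41), (3, 52), (12, 7)]:
    sw_insert(life, ptr)

hw = threading.Thread(target=lambda: [hw_scroll() for _ in range(5)])
hw.start()
hw.join()
print(table)
```

The software side keeps the part that is easy to write in C (insertion and sorting), while the periodic, timing-critical lifetime update is the part moved to hardware.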
23
Design Solutions Comparison
Full HW Solution
– Full control of event timing and easy parallel design
– Complex HDL coding of the FSM: the state machine architecture requires advanced expertise and significant validation time in the design cycle
Full SW Solution
– Easy coding in C (sort algorithm) and flexibility
– Difficult to handle real-time constraints: performance is limited by the Von Neumann architecture (processor)
Mixed HW / SW Solution: The Best Of Both Worlds
– Offers the advantages of HW and SW solutions with the right partitioning
24
Agenda
System Partitioning
– Traditional techniques
– Innovative approaches
Example 1: DES Encryption Algorithm
– HW solution compared to SW solution
Example 2: Wireless LAN
– HW / SW trade-off
Enabling Technology: Virtex-II Pro
25
Platform FPGA Architecture
A Solution that provides:
– IP Immersion: the ability to integrate a wide variety of Hard & Soft IP
– A single platform for multiple applications
– Total customization
– Full hardware and firmware upgradability
Diagram labels: Hard-IP, Soft-IP, System Connectivity, HW functions
26
Virtex-II Pro Platform FPGA
PowerPC 405 Core
– 300+ MHz / 450+ DMIPS performance
– Up to 4 per device
3.125 Gbps Multi-Gigabit Transceivers (MGTs)
– Supports 10 Gbps standards
– Up to 24 per device
IP-Immersion™ Fabric
– ActiveInterconnect™
– 18Kb Dual-Port RAM
– Xtreme™ Multipliers
– 16 Global Clock Domains
Diagram labels: MGT, Fabric
27
High-Bandwidth Communications
Code (SW) and data are stored in BRAM, without any external resources
On-Chip Memory (OCM) offers a unique data bandwidth between FPGA fabric (HW) and embedded PowerPC core (SW)
High-bandwidth communications between distributed processors
Diagram labels: OCM™ Technology; BlockRAMs; I-Cache 16KB; MMU; Fetch & Decode; Timers and Debug Logic; Execution Unit (32x32b GPR, ALU, MAC); D-Cache 16KB; Acceleration Logic; 6.4Gb/sec
28
Flexibility of Programmable Systems
Nearly all systems are composed of:
– Logic + Memory + Processor
Virtex-II Pro enables optimum “system partitioning” between hardware and software
– Performing SW tasks in HW is inefficient
– Performing HW tasks in SW is slow
– Virtex-II Pro provides the best of both worlds
29
Conclusion
Distributed Processors Allow Flexible HW / SW Partitioning:
– Optimal mapping at the module level
– Design with the best solution of both worlds
Virtex-II Pro Is The First Programmable System To Enable True Architectural Synthesis:
– Unique bandwidth between embedded processors and HW
– Unique on-chip solution provides an application-specific mix of logic, memory, integrated processors, and high-bandwidth I/O