Published by Jayson Shepherd. Modified over 9 years ago.
1
ESC499 – A TMD-MPI/MPE Based Heterogeneous Video System
Tony Zhou, Prof. Paul Chow
April 6th, 2010
2
ESC499 – EngSci Thesis
Background

The Background
Message Passing Interface (MPI) is a specification for an API that allows many computers to communicate with one another.
An API is an abstraction that defines and describes an interface for interacting with a set of functions.
MPI has become a de facto standard for communication among processes that model a parallel program running on a distributed-memory system.

Prof. Paul Chow’s Research
Hardware systems are better suited for parallel processing.
The reconfigurable nature of FPGAs makes hardware computing engine (CE) design easy.
Similar to what MPI provides to software developers, TMD-MPI provides software and hardware middleware layers of abstraction for communications, enabling portable interaction between embedded processors, CEs and x86 processors.
3
Filling the Gap, and Defining the Scope
TMD-MPI research is still in its infancy compared to the MPI standard; implementation and characterization of designs are lacking.
This project attempts to fill this gap by investigating alternative approaches to presenting hardware and software elements.
If a simple, feasible heterogeneous system is successfully demonstrated, this thesis will focus on expanding the software element network to exploit more parallelism.
[Diagram: Software Element and Hardware Element blocks]
4
Objectives
The goal is to create a heterogeneous video processing system that demonstrates TMD-MPI’s capabilities as the interface between CEs and software processes.
The system is called heterogeneous because it combines hardware engines and software processes.
Implement and characterize different configurations of the system.

Research and Groundwork
Manuel Saldana’s paper, “A Parallel Programming Model for a Multi-FPGA Multiprocessor Machine”.
TMD-MPI Library v1.0: a software MPI interface designed for the Xilinx Microblaze.
TMD-MPE v1.0: a hardware implementation of the send and receive commands of the TMD-MPI library.
Jeff Goeder’s Project Video System Groundwork: streams video from the VGA port to external memory, then to DVI-out, through MPE-MPE message passing.
5
System Block Diagram
6
High Level Implementation
The primary goal is functionality rather than performance. Speed and performance considerations aside, two high-level approaches can be adopted.

Distributed Memory
Each node has its own distributed memory.
The entire video is passed as continuous messages.
Network traffic: (640 x 480 px) x (32 bits/px) = 1200 KB per frame.
[Diagram: computing engines, each with its own local memory]

Shared Memory
All the nodes share one memory.
Only the pointer to the video in memory is passed.
Network traffic: 32-bit (4 B) memory addresses.
[Diagram: Computing Engine 1 and Computing Engine 2 attached to a shared memory for all devices]
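The per-frame traffic figures above can be checked with a few lines of arithmetic. This is only a sketch: the constant and function names are illustrative and not taken from the thesis, and "KB" is assumed to mean 1024 bytes, which is what makes the slide's 1200 KB figure come out exactly.

```python
# Per-frame network traffic for the two approaches (illustrative names).
FRAME_WIDTH_PX = 640
FRAME_HEIGHT_PX = 480
BYTES_PER_PIXEL = 4   # 32-bit pixels, as stated on the slide
POINTER_BYTES = 4     # one 32-bit memory address


def distributed_bytes_per_frame(width=FRAME_WIDTH_PX,
                                height=FRAME_HEIGHT_PX,
                                bytes_per_pixel=BYTES_PER_PIXEL):
    """Distributed memory: the whole frame travels over the network."""
    return width * height * bytes_per_pixel


def shared_bytes_per_frame(pointer_bytes=POINTER_BYTES):
    """Shared memory: only the address of the frame is sent."""
    return pointer_bytes


frame_bytes = distributed_bytes_per_frame()
frame_kb = frame_bytes // 1024                           # assumes 1 KB = 1024 B
reduction = frame_bytes // shared_bytes_per_frame()

print(f"distributed: {frame_bytes} B ({frame_kb} KB) per frame")
print(f"shared:      {shared_bytes_per_frame()} B per frame")
print(f"traffic reduction factor: {reduction}")
```

Under these assumptions the shared-memory message is smaller than a full frame by a factor equal to the number of pixels in the frame, since both a pixel and an address are 4 bytes.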
7
Distributed Memory
Distributed-memory, video streaming approach.
Single frame example:
[Diagram: V-Dec, Xilinx Microblaze, DVI out]
Multi-frame speed issue:
[Diagram: Video Decoder @ 100 MHz, Xilinx FSL (FIFO) with Entry 1 to Entry n, Microblaze @ 1-10 MHz, Xilinx FSL (FIFO) with Entry 1 to Entry n, DVI-output @ 100 MHz]
The Microblaze cannot pull data off the FIFO fast enough, due to several factors.
8
Microblaze PLB bus traffic
[Diagram: Video Decoder @ 100 MHz, FIFO with Entry 1 to Entry n, Microblaze @ 1-10 MHz, FIFO with Entry 1 to Entry n, DVI-output @ 100 MHz]
The Microblaze operates at 100 MHz; however, its speed is limited by other factors:
First, the Xilinx FSL (FIFO) interface access time.
Second, memory access time and bus arbitration.
Third, the implicit sequential execution of instructions in a normal processor.
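To get a feel for the rate mismatch, the sketch below estimates how quickly a FIFO fills when the producer outpaces the consumer. It rests on assumptions that are not in the slides: the FIFO starts empty, the decoder writes one word per cycle at 100 MHz, the Microblaze drains one word per cycle at its effective 1-10 MHz rate, and the FIFO depth of 512 words is a hypothetical value chosen only for illustration.

```python
def seconds_until_fifo_full(depth_words, producer_hz, consumer_hz):
    """Time until an initially empty FIFO fills, or None if it never does.

    Assumes one word written per producer cycle and one word read per
    consumer cycle (a simplification, not a measured behaviour).
    """
    net_fill_rate = producer_hz - consumer_hz  # words accumulating per second
    if net_fill_rate <= 0:
        return None  # the consumer keeps up, so the FIFO never fills
    return depth_words / net_fill_rate


FIFO_DEPTH_WORDS = 512      # hypothetical depth, for illustration only
DECODER_HZ = 100_000_000    # video decoder rate from the slide

for cpu_hz in (1_000_000, 10_000_000):  # effective Microblaze rates from the slide
    t = seconds_until_fifo_full(FIFO_DEPTH_WORDS, DECODER_HZ, cpu_hz)
    print(f"consumer at {cpu_hz // 1_000_000} MHz: FIFO full after {t * 1e6:.2f} us")
```

Whatever the real depth is, a FIFO of a few hundred words would fill within microseconds under these assumptions, long before a single 640 x 480 frame has been transferred.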
9
Shared Memory
Shared-memory, address-mapped tasks.

Single frame example:
Only 32-bit memory addresses are passed as messages between ranks.
This significantly reduces network traffic (before: 640 x 480 x 32 bits per frame).
Multiple Microblazes run in parallel.

Inside the memory:
Each Microblaze is assigned a different region in the common memory space.
Each Microblaze can have its own codec (e.g. on left) or the same one.
Each Microblaze then puts its own section of the frame into its corresponding place in the DVI-out memory space.
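One way the region assignment described above could look is sketched below. The slides do not say how the frame is divided, so splitting it into contiguous bands of rows is an assumption, as are the function name `assign_regions` and the base address of 0.

```python
def assign_regions(num_workers, base_addr=0, width=640, height=480,
                   bytes_per_pixel=4):
    """Split a frame buffer into one contiguous band of rows per worker.

    Returns a list of (rank, start_address, size_in_bytes) tuples.
    Row-band splitting and base_addr are illustrative assumptions.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    rows_per_worker, extra_rows = divmod(height, num_workers)
    row_bytes = width * bytes_per_pixel
    regions = []
    next_row = 0
    for rank in range(num_workers):
        # The first `extra_rows` workers take one extra row each.
        rows = rows_per_worker + (1 if rank < extra_rows else 0)
        start = base_addr + next_row * row_bytes
        regions.append((rank, start, rows * row_bytes))
        next_row += rows
    return regions


for rank, start, size in assign_regions(4):
    print(f"rank {rank}: start=0x{start:08X}, size={size} bytes")
```

In this sketch the start address of each region is the kind of 32-bit value that would be passed between ranks as a message, while the pixel data itself stays in the shared memory.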
10
Why Software & Why Hardware

Conclusion and Results
The TMD-MPI approach to heterogeneous systems proves to be easy and efficient in development.
The shared-memory approach significantly improves speed and is linearly scalable.

               Software    Hardware
Functionality  Very good   Bad
Performance    Slow        Very fast
Development    Fast        Slow
Cost           -           -

Suggestion: software-to-hardware, since TMD-MPI/MPE abstracts interface complexities away from the developer.
11
THANKS AND Q&A
Acknowledgements: Professor Paul Chow, Sami Sadaka, Kevin Lam, Kam Pui Tang, Manuel Saldana