
ESC499 – A TMD-MPI/MPE Based Heterogeneous Video System

Tony Zhou, Prof. Paul Chow. April 6th, 2010.


Presentation Transcript


  1. ESC499 – A TMD-MPI/MPE Based Heterogeneous Video System Tony Zhou, Prof. Paul Chow April 6th, 2010

  2. Background
     • The Background
       • The Message Passing Interface (MPI) is a specification for an API that allows many computers to communicate with one another.
       • An API is an abstraction that defines and describes an interface for interacting with a set of functions.
       • MPI has become a de facto standard for communication among the processes of a parallel program running on a distributed-memory system.
     • Prof. Paul Chow’s Research
       • Hardware systems are better suited for parallel processing. The reconfigurable nature of FPGAs makes hardware computing engine (CE) design easy.
       • Similar to what MPI provides to software developers, TMD-MPI provides software and hardware middleware layers of abstraction for communication, enabling portable interaction between embedded processors, CEs and x86 processors.
     ESC499 – EngSci Thesis

  3. Filling the Gap, and Defining the Scope
     • TMD-MPI research is still in its infancy compared to the MPI standard; implementations and characterizations of designs are lacking.
     • This project attempts to fill this gap by investigating alternative approaches to presenting hardware and software elements.
     • If a simple, feasible heterogeneous system is successfully demonstrated, this thesis will then focus on expanding the software element network to exploit more parallelism.
     [Figure: network of interconnected hardware elements and software elements]

  4. Objectives
     • The goal is to create a heterogeneous video processing system that demonstrates TMD-MPI’s capabilities as the interface between CEs and software processes.
       • The system is called heterogeneous because it combines hardware engines and software processes.
     • Implement and characterize different configurations of the system.
     • Research and Groundwork
       • Manuel Saldana’s paper, “A Parallel Programming Model for a Multi-FPGA Multiprocessor Machine”
         • TMD-MPI Library v1.0: software MPI interface designed for the Xilinx MicroBlaze.
         • TMD-MPE v1.0: hardware implementation of the send and receive commands of the TMD-MPI library.
       • Jeff Goeder’s Project
         • Video System Groundwork: streams video from the VGA port to external memory, then to DVI-out, through MPE-MPE message passing.

  5. System Block Diagram

  6. High Level Implementation
     • The primary goal focuses on functionality rather than performance.
     • Speed and performance considerations aside, two approaches can be adopted from the high-level perspective.
     • Distributed Memory
       • Distributed memory for each node
       • Pass the entire video as continuous messages
       • Network traffic: 640 × 480 px × 32 bits/px = 1200 KB per frame
     • Shared Memory
       • Shared memory for all the nodes
       • Pass only the pointer to the video in memory
       • Network traffic: 32-bit (4 B) memory addresses
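The traffic figures on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the payload ratio assumes that exactly one address message is sent per frame in the shared-memory design, which the slides do not state.

```python
# Frame geometry taken from the slide; everything else is illustrative.
WIDTH_PX = 640
HEIGHT_PX = 480
BITS_PER_PIXEL = 32
ADDRESS_BITS = 32  # size of one pointer message in the shared-memory design


def frame_bytes(width=WIDTH_PX, height=HEIGHT_PX, bits_per_pixel=BITS_PER_PIXEL):
    """Bytes needed to send one full frame as message payload."""
    return width * height * bits_per_pixel // 8


def pointer_bytes(address_bits=ADDRESS_BITS):
    """Bytes needed to send one memory address instead of the frame."""
    return address_bits // 8


if __name__ == "__main__":
    full = frame_bytes()
    ptr = pointer_bytes()
    print(f"distributed memory: {full} B = {full // 1024} KB per frame")
    print(f"shared memory:      {ptr} B per address message")
    # Assumes one address message per frame (not stated in the slides).
    print(f"payload ratio:      {full // ptr}:1")
```

The 1200 KB figure uses 1 KB = 1024 bytes: 640 × 480 × 4 B = 1,228,800 B.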

  7. Distributed Memory
     • Distributed-memory, video streaming approach.
     • The MicroBlaze cannot pull data off the FIFO fast enough due to several factors.
     [Figure: single-frame example and multi-frame speed issue. Labels: Video Decoder @ 100 MHz, Xilinx FSL (FIFO), Xilinx MicroBlaze @ 1-10 MHz, DVI output @ 100 MHz]

  8. MicroBlaze PLB bus traffic
     • The MicroBlaze operates at 100 MHz; however, its effective speed is limited by other factors:
       • First, the Xilinx FSL (FIFO) interface access time.
       • Second, memory access time and bus arbitration.
       • Third, the implicit sequential execution of instructions in a normal processor.
     [Figure: Video Decoder @ 100 MHz, FIFO, MicroBlaze @ 1-10 MHz, FIFO, DVI output @ 100 MHz]
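To get a feel for what an effective rate of 1-10 MHz means for streaming video, the following back-of-the-envelope sketch computes an upper bound on the frame rate. It assumes the processor moves one 32-bit pixel word per effective cycle; that is an assumption made for illustration, not a figure from the slides, and the function name is hypothetical.

```python
# Frame size from the slides: 640 x 480 pixels, one 32-bit word per pixel.
WORDS_PER_FRAME = 640 * 480


def max_frames_per_second(effective_hz, words_per_cycle=1.0):
    """Upper bound on frame rate if the processor moves `words_per_cycle`
    pixel words per effective cycle (an assumption, not a measured figure)."""
    if effective_hz <= 0 or words_per_cycle <= 0:
        raise ValueError("rates must be positive")
    return effective_hz * words_per_cycle / WORDS_PER_FRAME


if __name__ == "__main__":
    for label, hz in [("1 MHz effective", 1e6),
                      ("10 MHz effective", 10e6),
                      ("100 MHz nominal", 100e6)]:
        print(f"{label}: at most {max_frames_per_second(hz):.1f} frames/s")
```

Under that assumption, an effective 1-10 MHz corresponds to roughly 3 to 33 frames per second at best, compared with about 325 frames per second if the full 100 MHz were usable.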

  9. Shared Memory
     • Shared-memory, address-mapped tasks
       • Only 32-bit memory addresses are passed as messages between ranks.
       • Significant reduction in network traffic (previously 640 × 480 × 32 bits per frame).
     • Multiple MicroBlazes in parallel
       • Each MicroBlaze is assigned a different region in the common memory space.
       • Each MicroBlaze can run its own codec (as in the example on the left of the figure) or the same one.
       • Each MicroBlaze then puts its own section of the frame into its corresponding place in the DVI-out memory space.
     [Figures: single-frame example; layout inside the memory]
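One way to picture the region assignment is to split the frame buffer into contiguous bands of rows, one band per rank. The sketch below is a minimal illustration under stated assumptions: the slides do not say how the regions are shaped, so the row-band split, the function name `partition_frame`, and the base address are all hypothetical.

```python
def partition_frame(base_addr, width, height, bytes_per_pixel, num_ranks):
    """Split a frame buffer into contiguous bands of rows, one per rank.

    Returns a list of (rank, start_address, length_in_bytes) tuples.
    Row bands are an assumed layout, not the one used in the thesis.
    """
    if num_ranks < 1 or num_ranks > height:
        raise ValueError("need between 1 and `height` ranks")
    row_bytes = width * bytes_per_pixel
    rows_per_rank, extra = divmod(height, num_ranks)
    regions = []
    addr = base_addr
    for rank in range(num_ranks):
        # The first `extra` ranks take one additional row each.
        rows = rows_per_rank + (1 if rank < extra else 0)
        length = rows * row_bytes
        regions.append((rank, addr, length))
        addr += length
    return regions


if __name__ == "__main__":
    FRAME_BASE = 0x9000_0000  # hypothetical frame-buffer base address
    for rank, start, length in partition_frame(FRAME_BASE, 640, 480, 4, 4):
        print(f"rank {rank}: start=0x{start:08X} length={length} B")
```

In such a scheme, only each region's start address would need to travel over the network as a 32-bit message, which matches the traffic reduction described on the slide.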

  10. Results and Conclusion
     • Why Software & Why Hardware
     • The TMD-MPI approach to heterogeneous systems proves to be easy and efficient in development.
     • The shared-memory approach significantly improves speed and scales linearly.
     • Suggestion: take a software-to-hardware development path, since TMD-MPI/MPE abstracts interface complexities away from the developer.

  11. Acknowledgements: Professor Paul Chow, Sami Sadaka, Kevin Lam, Kam Pui Tang, Manuel Saldana. Thanks and Q&A.
