A Monte Carlo Simulation Accelerator using FPGA Devices

Final Year Project: LHW0304. Ng Kin Fung and Ng Kwok Tung. Supervisor: Professor LEONG, Heng Wai Philip.

Presentation Transcript


  1. A Monte Carlo Simulation Accelerator using FPGA Devices • Final Year Project: LHW0304 • Ng Kin Fung and Ng Kwok Tung • Supervisor: Professor LEONG, Heng Wai Philip

  2. Overview

  3. Overview • Objective • Background • Software-only Implementation • Hardware Implementation • FPGA • Soft-Core Micro-Processor

  4. Overview • Background • Interest Rate Modeling • Brace-Gatarek-Musiela (BGM) Model • Motivation and Contribution • System Design • System Design Overview • System Components • System Operations

  5. Overview • Experiment and Result • Resources • Performance • Data Transmission Overhead • Conclusion • Future Improvement • Q & A Section

  6. Objective

  7. Objective • What we achieved last semester • Studied and became familiar with the development tools • Implemented some simple examples to gain experience in developing FPGA systems with a Soft-core Micro-processor • First ever successful port of the Microblaze system to the Celoxica RC200 development board • Studied the performance and power consumption of the system

  8. Objective • Goals for this semester • Build a Monte Carlo Simulation Accelerator using FPGA technology and a Soft-core Micro-processor • Study the speed-up and performance • Study the overhead of the transmission channel between the user core and the Soft-core Micro-processor

  9. FPGA and Soft-Core Micro-Processor

  10. Software-only Implementation • The performance is NOT satisfactory • Sequential execution of instructions instead of parallel execution • Slow memory access • No ability to customize hardware • No way to save power by switching off hardware modules • Another approach is needed to solve the problem

  11. FPGA Technology • Increasingly popular in system design • Higher degree of parallelism • Fewer clock cycles required

  12. FPGA Technology • Explicitly hardwired to perform a certain operation • Optimized for a specific purpose → higher performance • Enables customization of hardware modules • Power saving • Reconfigurable • Enables reuse of hardware • Circuits can be simulated and synthesized from a high-level, program-like description • Easy system development and system testing • Shorter time to market → higher profit

  13. Soft-Core Micro-Processor • Most systems use a PC+FPGA accessed through a PCI bus • Bottleneck for entire system • Use of Soft-Core Micro-Processor • Everything is implemented in FPGA • Transmission of data is within the FPGA • A higher transmission bandwidth and lower latency

  14. Soft-Core Micro-Processor • Other advantages • Easier to develop • Retain the advantage of using FPGA • Flexible • Retargetable • Conclusion • FPGA technology + Soft-Core Micro-Processor

  15. Interest Rate Modeling

  16. Interest Rate Modeling • Importance of interest rate modeling • Simulate market behavior with historical parameter values • Explain interest rate movements in terms of an underlying model • → decision making on economic policy • → risk management

  17. Brace-Gatarek-Musiela (BGM) Model • One of the most popular interest rate models • Based on the Monte Carlo method • Looping part (the most computationally expensive)

  18. Implementing BGM Model using FPGA and Soft Core Microprocessor • The BGM core generates 50 paths with 9 fixed points

  19. Implementing BGM Model using FPGA and Soft Core Microprocessor • Implemented on the FPGA in parallel style • Post-processing calculation by Microblaze • Average and standard error • Fast Simplex Link bus for data transmission between the BGM core and Microblaze
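
The post-processing step named on this slide (average and standard error over the simulated paths) can be sketched in software as follows. This is a minimal illustration, not the project's Microblaze code: the function name `mean_and_standard_error` and the sample values are hypothetical, and the unbiased (n - 1) variance estimator is an assumption, since the slides do not say which estimator was used.

```python
import math

def mean_and_standard_error(samples):
    """Return (mean, standard error of the mean) for a list of path results."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    # Assumption: unbiased sample variance (divide by n - 1).
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    # Standard error of the mean shrinks as 1 / sqrt(n).
    return mean, math.sqrt(variance / n)

# Hypothetical per-path results, for illustration only.
path_values = [1.0, 2.0, 3.0, 4.0, 5.0]
mean, stderr = mean_and_standard_error(path_values)
```

Because the standard error falls only as the square root of the path count, halving it takes four times as many paths, which is why the path-generation loop dominates the cost of a Monte Carlo run.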

  20. Contribution

  21. Contribution • Improve the performance of the system

  22. System Design

  23. System Design Overview

  24. System Component

  25. Microblaze • A soft-core microprocessor • Delivered as HDL source code for synthesis • Designed in VHDL • Specially optimized for Xilinx FPGAs • A reduced instruction set computer (RISC) • Speed of Microblaze across different Xilinx devices (statistics from Xilinx)

  26. User Core – BGM • Connect the core designed in VHDL to the Microblaze system • Solve the most computationally expensive task fully in hardware • Must follow the signals and timing of the connected bus • A Microprocessor Peripheral Definition (MPD) file • Defines the interface of the peripheral • Ports, buses • A Peripheral Analyze Order (PAO) file • A list of the HDL files needed for synthesis, in order of compilation

  27. Fast Simplex Link (FSL) • 32-bit wide bus • Unidirectional point-to-point data streaming interfaces • Control and data communication support • FIFO-based communication • Fast internal data and control transmission • Peak bandwidth 300 MB/s

  28. Fast Simplex Link (FSL)

  29. Fast Simplex Link (FSL) Xilinx Fast Simplex Link Channel Product Specification DS449 (v1.1) Aug 06, 2003

  30. Fast Simplex Link (FSL) • Xilinx Fast Simplex Link Channel Product Specification DS449 (v1.1) Aug 06, 2003 • Use the read macro microblaze_bread_datafsl(val, id) to read data from the FSL FIFO into Microblaze

  31. On-Chip Memory, Local Memory Bus and Memory Bus Controller • On Chip Memory • Storage medium for the data and instruction • Minimize the transmission overhead between the Microblaze and the memory • Local Memory Bus • Single-cycle access to on-chip dual-port block RAM • Performance of 125 MHz • LMB BRAM Interface Controller • Interface between the LMB and the bram_block peripheral • Separate controller for data and control

  32. On-Chip Peripheral Bus (OPB Bus) • Connection between the main system and the peripherals • Make Microblaze System More Functional • In this project • UART • OPB Timer • GPIO

  33. Universal Asynchronous Receiver-Transmitter (UART) • Handles asynchronous serial communication • Libgen allows the mapping of standard input and output • Use of scanf and printf for the communication with user

  34. OPB Timer • Facilitates correct measurement of the performance • Initialize timer → Start timer → Stop timer → Get timer value • XStatus XTmrCtr_Initialize • void XTmrCtr_Start • void XTmrCtr_Stop • Xuint32 XTmrCtr_GetValue

  35. General Purpose Input Output (GPIO) • Problem found on the FSL bus • Reset signal is connected to ground • No way to reset the BGM core through the FSL bus • Solution • Change the VHDL source code • Use GPIO

  36. General Purpose Input Output (GPIO) • [Diagram: "Reset by FSL": Microblaze → FSL → BGM core, reset path blocked (X) • "Reset by GPIO": Microblaze → GPIO → BGM core]

  37. System Operations • [Flowchart: Microblaze system start → BGM core is reset → Timer is started → BGM process → Any more data? (yes: repeat BGM process; no: continue) → Post-processing calculation by Microblaze → Timer is stopped → Result is printed out → End of Microblaze system]

  38. System Operations • [Flowchart: BGM process start → BGM core generates paths → Data transfer from BGM core to Microblaze system → Data format transform → Temporary storage of data → End of Microblaze System]
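
The operation flow on the two slides above can be modeled in software roughly as below. This is only a sketch of the control flow (receive data while any remains, transform, store, then post-process). The FSL FIFO is stood in for by a Python `deque`, the names `run_system` and `fixed_to_float` are hypothetical, and the 16-fractional-bit fixed-point word format is an assumption, because the slides do not specify the data format. The core reset and the timer steps are left out.

```python
from collections import deque

FRACTION_BITS = 16  # assumed fixed-point format; not stated in the slides

def fixed_to_float(word):
    """Data format transform: read a 32-bit word as unsigned fixed point."""
    return word / (1 << FRACTION_BITS)

def run_system(fifo):
    """Drain the FIFO, convert and store each word, then post-process."""
    stored = []                    # temporary storage of data
    while fifo:                    # "any more data?"
        word = fifo.popleft()      # data transfer from core to processor
        stored.append(fixed_to_float(word))
    # Post-processing: average of the received values (None if no data).
    return sum(stored) / len(stored) if stored else None

# Hypothetical words standing in for BGM core output.
fifo = deque([1 << 16, 2 << 16, 3 << 16])
average = run_system(fifo)
```

In the real system the blocking read macro would stall the processor until the core writes a word, so the "any more data" test would count expected words rather than check for an empty queue.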

  39. Experimental Results

  40. Resources • Unable to place the whole system on the FPGA board • System simulated using ModelSim instead

  41. Performance • Comparison of the BGM core running on the FPGA and on a PC (by Dr. Zhang) • Speed-up factor: 19.87

  42. Performance • Comparison of the BGM core running on the FPGA and on a PC for different numbers of generated paths (by Dr. Zhang) • Performance is stable across different path counts

  43. Performance • Simulation of the Microblaze system • Total time required to generate 50 paths: 2.871 ms • Speed-up factor: 21.94
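
The two figures on this slide can be related with a little arithmetic. The sketch below assumes the speed-up factor is defined as software time divided by hardware time for the same 50 paths. The software baseline itself is not given in the slides, so the value computed here is only implied by that assumption.

```python
hardware_time_ms = 2.871  # from the slide: 50 paths, simulated system
speedup = 21.94           # from the slide
paths = 50                # from the slide

# Implied software-only time, assuming speedup = software / hardware.
implied_software_time_ms = hardware_time_ms * speedup

# Average simulated time per path, in microseconds.
time_per_path_us = hardware_time_ms / paths * 1000

print(f"implied software time: {implied_software_time_ms:.2f} ms")
print(f"time per path: {time_per_path_us:.2f} us")
```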

  44. Transmission Bandwidth

  45. Transmission Bandwidth • On the FSL bus, 32 bits of data are sent in about 40000 ps • Transmission bandwidth is around 100 MB per second • Same order of magnitude as the peak transmission bandwidth stated in the specification
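
The bandwidth figure follows directly from the measured transfer time. The sketch below reproduces the arithmetic using the slide's numbers (32 bits per transfer, about 40000 ps per transfer) and the 300 MB/s peak quoted on the FSL slide. It assumes 1 MB = 10^6 bytes.

```python
bits_per_word = 32          # from the slide
transfer_time_ps = 40_000   # from the slide (approximate)
peak_mb_per_s = 300.0       # peak bandwidth quoted on the FSL slide

bytes_per_word = bits_per_word // 8
transfer_time_s = transfer_time_ps * 1e-12

# Assumption: 1 MB = 10**6 bytes.
bandwidth_mb_per_s = bytes_per_word / transfer_time_s / 1e6
fraction_of_peak = bandwidth_mb_per_s / peak_mb_per_s

print(f"measured bandwidth: {bandwidth_mb_per_s:.0f} MB/s")
print(f"fraction of peak: {fraction_of_peak:.2f}")
```

The simulated channel therefore reaches about one third of the quoted peak.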

  46. Conclusion • A Monte Carlo Simulation Accelerator was implemented using FPGA technology and the Xilinx Microblaze Soft-core Micro-processor • A speed-up factor of 21.94 compared with the software-only implementation • Higher bandwidth and lower latency can be achieved using the FSL link between Microblaze and the BGM core • High performance, parallel execution of instructions, reconfigurability and reusability, and short development time…

  47. Future Development • Put the whole system on the FPGA board • Implement other applications for which high performance and short development time are the major considerations • Study the other included IP cores and improve the system

  48. Q & A
