
Presenter: 黎贺

Presentation Transcript


  1. Presenter: 黎贺. Experiment 4: MPSoC System Optimization for MJPEG Decoding

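  2. Experiment contents: (1) adding and using the DMA module; (2) adding and using the dedicated IDCT hardware accelerator; (3) overall results and problem analysis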

  3. Adding and using the DMA module

  4. DMA hardware integration
     1. Define a high-level hardware abstraction of the DMA module that models the behavior of the hardware circuit. In this experiment the high-level DMA model is already provided in the hardware platform; it consists of vci_dma.h, dma.h and vci_dma.c.
     2. Declare the DMA module in the hardware platform's top.cpp file:
        (1) Extend the mapping table:
            maptab.add(Segment("vcidma", DMA_BASE, DMA_SIZE, IntTab(4), false));
        (2) Instantiate the device module:
            soclib::caba::VciDma<vci_param> vcidma("vcidma", maptab, IntTab(4), IntTab(4), (1<<(vci_param::K-1)));
        (3) Define the DMA signals:
            soclib::caba::VciSignals<vci_param> signal_vci_dmai("signal_vci_dmai");
            soclib::caba::VciSignals<vci_param> signal_vci_dmat("signal_vci_dmat");
        (4) Connect the DMA to the other devices:
            vgmn.p_to_initiator[4](signal_vci_dmai);
            vgmn.p_to_target[4](signal_vci_dmat);
            vcidma.p_clk(signal_clk);
            vcidma.p_resetn(signal_resetn);
            vcidma.p_vci_target(signal_vci_dmat);
            vcidma.p_vci_initiator(signal_vci_dmai);
            vcidma.p_irq(signal_mips0_it3);
     3. Edit segmentation.h to define the DMA base address and size:
            #define DMA_BASE 0xC6000000
            #define DMA_SIZE 0x01000000
     4. Edit the platform_desc file to register the DMA module:
            Uses('vci_dma'),

  5. Software side
     1. Add dma.h and soclib_io.h to the header folder.
        dma.h defines the five functional registers needed for DMA operation:
            SoclibDmaRegisters{DMA_SRC,DMA_DST,DMA_LEN,DMA_RESET,DMA_IRQ_DISABLED};
        soclib_io.h already defines the register access functions used for DMA:
            static inline void soclib_io_set(void *comp_base, size_t reg, uint32_t val)
            static inline uint32_t soclib_io_get(void *comp_base, size_t reg)
     2. Edit the linker script ldscript/mips to add the number of DMA devices and the DMA base address:
            SOCLIB_DMA_NDEV = .; LONG(0x1)
            SOCLIB_DMA_DEVICES = .; LONG(0xC6000000)
     3. Edit the dispatch.c source file:
        (1) Include the DMA header:
            #include "dma.h"
        (2) Change the transfer of decoded data from:
            memcpy((void *) 0xC4000000, picture, SOF_section.width * SOF_section.height * 2);
        to (the address arguments are cast to uint32_t to match the soclib_io_set prototype):
            soclib_io_set(dma, DMA_DST, (uint32_t) 0xC4000000);
            soclib_io_set(dma, DMA_SRC, (uint32_t) picture);
            soclib_io_set(dma, DMA_LEN, SOF_section.width * SOF_section.height * 2);
            while (soclib_io_get(dma, DMA_LEN));
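The register sequence on this slide can be sketched as a pair of small helpers. This is a minimal sketch, not the lab's actual code: the numeric register indices are assumed from the order in which dma.h lists them, the bodies of soclib_io_set and soclib_io_get are an assumed plain memory-mapped implementation (the slide shows only their prototypes), and dma_start and dma_wait are hypothetical helper names. It also assumes that writing DMA_LEN starts the transfer and that DMA_LEN reads back as zero once the transfer has finished, which is what the polling loop on the slide implies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register indices, assumed to follow the order listed in dma.h. */
enum SoclibDmaRegisters {
    DMA_SRC,
    DMA_DST,
    DMA_LEN,
    DMA_RESET,
    DMA_IRQ_DISABLED
};

/* Assumed implementation: registers are consecutive 32-bit words
 * starting at the component base address. */
static inline void soclib_io_set(void *comp_base, size_t reg, uint32_t val)
{
    volatile uint32_t *regs = (volatile uint32_t *)comp_base;
    regs[reg] = val;
}

static inline uint32_t soclib_io_get(void *comp_base, size_t reg)
{
    volatile uint32_t *regs = (volatile uint32_t *)comp_base;
    return regs[reg];
}

/* Hypothetical helper: program one transfer of len bytes.
 * dst and src are 32-bit bus addresses. */
static void dma_start(void *dma, uint32_t dst, uint32_t src, uint32_t len)
{
    assert(len != 0);                  /* a zero length would look like "done" */
    soclib_io_set(dma, DMA_DST, dst);
    soclib_io_set(dma, DMA_SRC, src);
    soclib_io_set(dma, DMA_LEN, len);  /* written last: assumed to start the copy */
}

/* Hypothetical helper: busy-wait until DMA_LEN reads back as zero. */
static void dma_wait(void *dma)
{
    while (soclib_io_get(dma, DMA_LEN) != 0) {
        /* spin */
    }
}
```

On the target, dma would point at the DMA base address (0xC6000000 on this platform), and the change in dispatch.c becomes dma_start(dma, 0xC4000000, (uint32_t) picture, width * height * 2) followed by dma_wait(dma). Note that the processor spins inside dma_wait for the whole transfer, so in this form the copy does not overlap with any other work.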

  6. Adding and using the IDCT accelerator

  7. IDCT hardware integration
     1. Add the high-level abstract model of the IDCT hardware accelerator (provided for this experiment).
     2. Add the following to the hardware platform's top.cpp:
        (1) Extend the mapping table:
            maptab.add(Segment("idct", IDCT_BASE, IDCT_SIZE, IntTab(9), false));
        (2) Instantiate the device module:
            soclib::caba::VciIDCT<vci_param> idct("idct", maptab, IntTab(9));
        (3) Define the IDCT signal:
            soclib::caba::VciSignals<vci_param> signal_vci_idct("signal_vci_idct");
        (4) Connect the IDCT to the other devices:
            vgmn.p_to_target[9](signal_vci_idct);
            idct.p_clk(signal_clk);
            idct.p_resetn(signal_resetn);
            idct.p_t_vci(signal_vci_idct);
     3. Edit segmentation.h to define the IDCT base address and size:
            #define IDCT_BASE 0xC5000000
            #define IDCT_SIZE 0x00000100
     4. Edit the platform_desc file to register the IDCT module:
            Uses('vci_idct'),
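Because each new device claims its own segment in the mapping table, it can help to check that the address ranges do not collide. The following is a minimal sketch under stated assumptions: the base addresses and sizes are the ones given for segmentation.h in this deck, while segments_overlap and segment_contains are hypothetical helpers that are not part of SoCLib. The address 0xC4000000 is the destination of the picture copy in dispatch.c; its segment size is not given in the slides, so only its base address is checked.

```c
#include <assert.h>   /* for the checks shown in the usage note */
#include <stdbool.h>
#include <stdint.h>

/* Values taken from segmentation.h as shown in the slides. */
#define DMA_BASE  0xC6000000u
#define DMA_SIZE  0x01000000u
#define IDCT_BASE 0xC5000000u
#define IDCT_SIZE 0x00000100u

/* Hypothetical helper: true if the half-open ranges
 * [base_a, base_a + size_a) and [base_b, base_b + size_b) intersect.
 * End addresses are computed in 64 bits so they cannot wrap around. */
static bool segments_overlap(uint32_t base_a, uint32_t size_a,
                             uint32_t base_b, uint32_t size_b)
{
    uint64_t end_a = (uint64_t)base_a + size_a;
    uint64_t end_b = (uint64_t)base_b + size_b;
    return base_a < end_b && base_b < end_a;
}

/* Hypothetical helper: true if addr lies inside [base, base + size). */
static bool segment_contains(uint32_t base, uint32_t size, uint32_t addr)
{
    return addr >= base && (uint64_t)addr < (uint64_t)base + size;
}
```

A platform sanity check could then read assert(!segments_overlap(DMA_BASE, DMA_SIZE, IDCT_BASE, IDCT_SIZE)); with the values above, the DMA segment covers 0xC6000000 to 0xC6FFFFFF and the IDCT segment covers 0xC5000000 to 0xC50000FF, so the two do not intersect.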

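  8. Software changes
     1. Add idct.h to the header folder.
     2. Edit the linker script ldscript/mips to add the number of IDCT devices and the IDCT base address:
            SOCLIB_IDCT_NDEV = .; LONG(0x1)
            SOCLIB_IDCT_DEVICES = .; LONG(0xC5000000)
     3. Add the driver soclib_idct.c to the source folder.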

  9. IDCT module driver: soclib_idct.c
     1. Define the module's read and write ports:
            typedef struct soclib_idct_port {
                uint32_t status;
                uint32_t write;
                uint32_t read;
            } soclib_idct_port_t;
     2. Declare the module:
            module_t soclib_idct_module = {
                .name = "soclib_idct",
                .init = soclib_idct_init,
                .cleanup = soclib_idct_cleanup
            };

  10. IDCT module driver: soclib_idct.c (continued from slide 9)
     3. Map the driver functions:
            desc = file_allocate("idct_device");
            desc->open = soclib_idct_open;
            desc->read = soclib_idct_read;
            desc->write = soclib_idct_write;
            desc->ioctl = soclib_idct_ioctl;
            desc->stats.st_mode = 0;
            desc->cookie = (void *) &IDCTs[idx];
            device_register("idct", desc);
     4. Implement the driver functions such as soclib_idct_open and soclib_idct_write.
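The slides do not show the bodies of soclib_idct_write and soclib_idct_read, so the following is only a sketch of how such functions might move one 8x8 block through the port structure from slide 9. The helper names idct_port_push and idct_port_pull are hypothetical, the one-word-per-access protocol is an assumption, and the status field is left untouched because its meaning is not described. The element types (32-bit coefficients in, 8-bit samples out) are inferred from the transfer sizes used when the module is called.

```c
#include <assert.h>   /* for checks in calling code */
#include <stddef.h>
#include <stdint.h>

/* Port layout as shown on slide 9. */
typedef struct soclib_idct_port {
    uint32_t status;
    uint32_t write;
    uint32_t read;
} soclib_idct_port_t;

#define IDCT_BLOCK_LEN 64  /* one 8x8 block */

/* Hypothetical helper: feed count coefficients to the accelerator,
 * one 32-bit word per store to the write register (assumed protocol).
 * Returns the number of bytes consumed from coeffs. */
static size_t idct_port_push(volatile soclib_idct_port_t *port,
                             const int32_t *coeffs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        port->write = (uint32_t)coeffs[i];
    }
    return count * sizeof(uint32_t);
}

/* Hypothetical helper: fetch count result samples, one load from the
 * read register per sample (assumed protocol), keeping the low 8 bits.
 * Returns the number of bytes stored into pixels. */
static size_t idct_port_pull(volatile soclib_idct_port_t *port,
                             uint8_t *pixels, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        pixels[i] = (uint8_t)port->read;
    }
    return count * sizeof(uint8_t);
}
```

The port pointer is declared volatile so that the compiler keeps every store and load; without it, the 64 writes to the same register could legally be collapsed into one.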

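  11. Calling the IDCT module
     1. In main.c, open the IDCT unit with vfs_open():
            fb_id = vfs_open("/dev/idct.0", 0, 0);
     2. In compute.c, include the header:
            #include "idct.h"
        and change
            IDCT(&block_YCbCr[i * 64], &Idct_YCbCr[i * 64]);
        to
            vfs_write(fb_id, &block_YCbCr[i * 64], 64 * sizeof(uint32_t));
            vfs_read(fb_id, &Idct_YCbCr[i * 64], 64 * sizeof(uint8_t));
     In other words, the input block block_YCbCr[i * 64] is sent to the IDCT module's input, and the result is read back from the module into Idct_YCbCr[i * 64]. The computation itself is performed by the hardware circuit rather than by the original software IDCT function.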

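  12. Experimental conclusions and problem analysis

  13. Dual core, two threads (no DMA, no IDCT)

  14. Dual core, one thread (DMA and IDCT)

  15. Dual core, two threads (DMA and IDCT)

  16. Dual core, two threads, IDCT only

  17. Dual core, one thread, IDCT only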

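  18. Problems and discussion
     Conclusion: the dedicated IDCT hardware accelerator has a clear effect, cutting the elapsed time almost in half.
     Problems:
       1. Single core is 2 ns faster than dual core.
       2. Transferring data by DMA does not reduce the elapsed time significantly; it actually adds 3 to 5 cycles.
       3. With two cores and two threads (and likewise with four threads, etc.), only one frame is decoded; with two cores and one thread, all frames decode successfully.
     Possible causes:
       1. The division of work among threads, and the communication and data exchange between threads.
       2. ......
       3. Debugging shows that the partitioning of data among the threads becomes confused under multithreading; the exact cause is unknown.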

  19. Thank you
