1 / 14

Blocked 2D Convolution

Blocked 2D Convolution. Ravi Sankar P Nair 010469036. Implement 2D Convolution. Source: http://www.songho.ca/dsp/convolution/convolution2d_example.html. Implement 2D Convolution.cpp in GPU Kernel. Implement 2D Convolution.cpp in GPU Kernel. Use Constant memory to store M matrix.

bran
Download Presentation

Blocked 2D Convolution

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Blocked 2D Convolution Ravi Sankar P Nair 010469036

  2. Implement 2D Convolution Source: http://www.songho.ca/dsp/convolution/convolution2d_example.html

  3. Implement 2D Convolution.cpp in GPU Kernel

  4. Implement 2D Convolution.cpp in GPU Kernel Use Constant memory to store M matrix

  5. Implement 2D Convolution.cpp in GPU Kernel Use Constant memory to store M matrix

  6. Performance Testing CPU vs. GPU What is the measured floating-point computation rate for the CPU and GPU kernels on this application? How do they each scale with the size of the input? #include <sys/time.h>

  7. Performance Testing CPU vs. GPU What is the measured floating-point computation rate for the CPU and GPU kernels on this application? How do they each scale with the size of the input? Alternate Timer method

  8. Performance Testing CPU vs. GPU What is the measured floating-point computation rate for the CPU and GPU kernels on this application? How do they each scale with the size of the input? #include <sys/time.h>

  9. Performance Testing CPU vs. GPU 2. How much time is spent as an overhead cost of using the GPU for computation? Consider all code executed within your host function, with the exception of the kernel itself, as overhead. How does the overhead scale with the size of the input?

  10. Performance Testing CPU vs. GPU Table shows values in micro seconds. Run on GTX 480 pacman.ddns.uark.edu Total Setup = Setup M,N + Setup GPU call Over Head GPU = Setup GPU Call – GPU kernel Over Head Setup = Total Setup – GPU kernel Over Head Main = Total Main program – GPU Kernel

  11. Performance Testing CPU vs. GPU Table shows values in micro seconds. Run on GTX 480 pacman.ddns.uark.edu (Alternate Timer) Total Setup = Setup M,N + Setup GPU call Over Head GPU = Setup GPU Call – GPU kernel Over Head Setup = Total Setup – GPU kernel Over Head Main = Total Main program – GPU Kernel

  12. Performance Testing CPU vs. GPU Run on GTX 480 pacman.ddns.uark.edu

  13. Performance Testing CPU vs. GPU Table shows values in micro seconds. Run on GTX 295 stargate.uark.edu Total Setup = Setup M,N + Setup GPU call Over Head GPU = Setup GPU Call – GPU kernel Over Head Setup = Total Setup – GPU kernel Over Head Main = Total Main program – GPU Kernel

  14. Performance Testing CPU vs. GPU Run on GTX 295 stargate.uark.edu

More Related