
ECE 562 Computer Architecture and Design Project: Improving Feature Extraction Using SIFT on GPU



Presentation Transcript


  1. ECE 562 Computer Architecture and Design Project: Improving Feature Extraction Using SIFT on GPU Rodrigo Savage, Wo-Tak Wu

  2. Overview
     • Application:
       • Object tracking in real time
     • Challenges:
       • Static scene
       • Moving objects
       • Occlusion
       • Collision
       • Disappearance
       • Rotation
       • Scaling
     • Divide and conquer:
       • Feature extraction and tracking
     • Focus on:
       • Feature extraction, using SIFT
       • Improving an existing GPU implementation

  3. Scale Invariant Feature Transform (SIFT) Input: image Output: keypoints

  4. GPU Implementation
     • Selected the GPU implementation by Sinha et al. at UNC Chapel Hill
     • Open-source SiftGPU is available (latest V4.00, Sept. 2012)
     • SIFT is well suited to implementation on a GPU
       • Tens of thousands of threads each handle a subset of the data without communicating with each other
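The point about independent threads can be illustrated with a small CPU-side simulation. This is only a sketch of the usual CUDA indexing convention (global index = block index × block size + thread index), not code from SiftGPU. The names `launch_grid`, `global_index`, and `run_kernel` are hypothetical, and the Python loops stand in for threads that a GPU would run in parallel.

```python
def launch_grid(num_items, threads_per_block):
    """Number of blocks needed so that every item gets its own thread."""
    return (num_items + threads_per_block - 1) // threads_per_block


def global_index(block_idx, thread_idx, threads_per_block):
    """Map a (block, thread) pair to the index of the item it owns."""
    return block_idx * threads_per_block + thread_idx


def run_kernel(data, threads_per_block, fn):
    """Apply fn to each item as if one thread handled each element.

    Each simulated thread reads only data[i] and writes only out[i],
    so no thread needs to communicate with any other.
    """
    n = len(data)
    out = [None] * n
    for block_idx in range(launch_grid(n, threads_per_block)):
        for thread_idx in range(threads_per_block):
            i = global_index(block_idx, thread_idx, threads_per_block)
            if i < n:  # the last block may have idle threads
                out[i] = fn(data[i])
    return out
```

Because every simulated thread touches only its own element, the iterations could run in any order, which is the property that makes this kind of workload a good fit for a GPU.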

  5. Attempts to Speed Up
     • Tackled the 2 most time-consuming processing steps
       • Blurring images with a Gaussian low-pass filter
         • Changed the pixel data access pattern
         • Tried different data partitioning schemes
       • Keypoint descriptor (128-element vector) calculations
         • Optimized code in the kernel
           • Applied the usual optimization techniques
           • Changed GPU memory usage
         • Thread management
           • Experimented with kernel parameters
           • Maximized usage of available threads
     Result: reduced descriptor compute time from 73 ms to 22 ms (about a 70% reduction, roughly 3.3x faster)
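For readers unfamiliar with the blurring step, the following is a minimal CPU sketch of a separable Gaussian low-pass filter: one horizontal pass followed by one vertical pass. It is an illustration under stated assumptions, not the SiftGPU kernel. The function names, the 3-sigma kernel radius, and the clamp-to-edge border handling are all assumptions made for this example.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Return a normalized 1-D Gaussian kernel as a list of floats."""
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))  # assumed 3-sigma cutoff
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_rows(image, kernel):
    """Filter every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    """Swap rows and columns of a list-of-lists image."""
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma):
    """Blur a 2-D image with a horizontal pass, then a vertical pass."""
    kernel = gaussian_kernel(sigma)
    horizontal = convolve_rows(image, kernel)
    vertical = convolve_rows(transpose(horizontal), kernel)
    return transpose(vertical)
```

The horizontal pass reads pixels along a row while the vertical pass reads along a column, so the two passes access memory in different patterns. That difference is one reason the pixel access pattern and the data partitioning matter for this step on a GPU.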

  6. Conclusion
     • The existing implementation is already well optimized
     • It is hard to take full advantage of the architecture; doing so requires a good understanding of
       • Memory architecture
       • Thread usage
     • The CUDA C/C++ compiler (nvcc) optimizes code in different ways, so experimentation is needed to gain performance
     • Code running on the GPU is hard to debug
     • The Visual Profiler can provide valuable insight into code behavior

  7. Backup Slides

  8. References
     • SiftGPU available at http://cs.unc.edu/~ccwu/siftgpu/
     • D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, November 2004.
     • Sudipta N. Sinha et al., “GPU-based Video Feature Tracking And Matching,” Technical Report TR 06-012, Department of Computer Science, UNC Chapel Hill, May 2006.
     • NVIDIA GeForce GT 640M LE
       • CUDA Cores: 384
       • Total available graphics memory: 4095 MB

  9. Test image with keypoints

  10. Algorithm

  11. Algorithm

  12. Algorithm
