
Accelerating image recognition on mobile devices using GPGPU

Accelerating image recognition on mobile devices using GPGPU. Jari Hannuksela, Olli Silvén. Machine Vision Group, Infotech Oulu, Department of Electrical and Information Engineering, University of Oulu, Finland.


Presentation Transcript


  1. Accelerating image recognition on mobile devices using GPGPU. Jari Hannuksela, Olli Silvén. Machine Vision Group, Infotech Oulu, Department of Electrical and Information Engineering, University of Oulu, Finland. Miguel Bordallo (1), Henri Nykänen (2), Jari Hannuksela (1), Olli Silvén (1) and Markku Vehviläinen (3). (1) University of Oulu, Finland; (2) Visidon Ltd., Oulu, Finland; (3) Nokia Research Center, Tampere, Finland

  2. Contents • Introduction • Mobile image recognition: Local Binary Pattern • Graphics processor as a computing engine • GPU-accelerated image recognition: LBP fragment shader implementation, image preprocessing • Experiments and results: speed, power consumption

  3. Motivation • Face detection and recognition are key components of future multimodal user interfaces • Mobile computation power is still not harnessed properly for real-time computer vision • Computationally demanding tasks compromise battery life • Need for energy-efficient and computationally efficient solutions

  4. Face analysis using local binary patterns • Face analysis is one of the major challenges in computer vision • LBP method has already been adopted by many leading scientists • Excellent results in face recognition and authentication, face detection, facial expression recognition, gender classification

  5. Local Binary Pattern

  6. GPU as a computing engine • The GPU can be treated as an independent entity • Newer phones include a GPU chipset • OpenGL ES is a highly optimized and attractive accelerator interface • Emerging platforms (OpenCL EP) will facilitate using the GPU as a computing resource • Compatible data formats for the graphics and camera sub-systems are desirable

  7. Fixed pipeline (OpenGL ES 1.1) vs. programmable pipeline (OpenGL ES 2.0)

  8. Stream processing (OpenGL) vs. shared memory processing (CUDA)

  9. OpenCL (Embedded Profile) • Emerging platforms will offer needed flexibility • OpenCL Embedded Profile is a subset of OpenCL • Supports data and task parallel programming models • Code executed concurrently on CPU & GPU (& DSP) • Other current and future resources are compatible • Easier programming in a heterogeneous processor environment • High parallelization on image processing computations -> High efficiency

  10. GPU assisted face analysis process

  11. GPU-accelerated image recognition • OpenGL ES 2.0: • Image feature (LBP, ...) extraction • Image preprocessing • Image scaling • Displaying • C code: • Camera control • Classification

  12. LBP fragment shader implementation • Two versions: • Version 1: calculates LBP map in one grayscale channel • Version 2: calculates 4 LBP maps in RGBA channels • Access the image via texture lookup • Fetch the selected picture pixel • Fetch the neighbours values • Compute binary vector • Multiply by weighting factor
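The per-pixel steps listed on this slide can be sketched on the CPU as a reference. This is a minimal Python sketch, not the authors' shader: the 3x3 neighbourhood, the clockwise neighbour ordering, the `>=` comparison and the names `OFFSETS`, `lbp_pixel` and `lbp_map` are all assumptions made for illustration, since the slides do not specify them.

```python
# Neighbour offsets (dy, dx), clockwise from top-left. The ordering and
# the bit weights are assumptions; the slides do not specify them.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_pixel(img, y, x):
    """LBP code of the pixel at (y, x) in a 2D list of gray values."""
    center = img[y][x]                      # fetch the selected pixel
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        neighbour = img[y + dy][x + dx]     # fetch a neighbour value
        if neighbour >= center:             # one element of the binary vector
            code += 1 << bit                # multiply by the weight 2**bit
    return code

def lbp_map(img):
    """LBP codes for all interior pixels (border pixels are skipped)."""
    h, w = len(img), len(img[0])
    return [[lbp_pixel(img, y, x) for x in range(1, w - 1)]
            for y in range(1, h - 1)]

sample = [[5, 9, 1],
          [4, 6, 7],
          [2, 6, 8]]
print(lbp_map(sample))  # prints [[58]]
```

In a fragment shader the same loop body would run once per output pixel, with the list indexing replaced by texture lookups.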

  13. Preprocessing • Create quad • Render each piece in one channel • Divide texture & convert to grayscale
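One possible reading of this preprocessing step is sketched below in Python: the image is converted to grayscale, divided into four pieces, and each piece is stored in one channel of a smaller RGBA image so that a four-channel shader can process all pieces at once. The quadrant split, the channel assignment, the BT.601 luma weights and the names `to_gray` and `pack_quadrants` are assumptions for illustration, not details taken from the slides.

```python
def to_gray(pixel):
    """Integer luma approximation of an (r, g, b) pixel (BT.601 weights)."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000

def pack_quadrants(rgb):
    """Pack the four quadrants of an RGB image into one RGBA image.

    rgb is a 2D list of (r, g, b) tuples with even height and width.
    The result has half the height and half the width. Channel R holds
    the top-left quadrant, G the top-right, B the bottom-left and A the
    bottom-right (this channel assignment is an assumption).
    """
    h, w = len(rgb), len(rgb[0])
    hh, hw = h // 2, w // 2
    gray = [[to_gray(p) for p in row] for row in rgb]
    return [[(gray[y][x], gray[y][x + hw],
              gray[y + hh][x], gray[y + hh][x + hw])
             for x in range(hw)]
            for y in range(hh)]

tiny = [[(10, 10, 10), (20, 20, 20)],
        [(30, 30, 30), (40, 40, 40)]]
print(pack_quadrants(tiny))  # prints [[(10, 20, 30, 40)]]
```

Packing this way lets one texture fetch return four gray values, which is what makes a four-channel LBP pass attractive on a GPU with vector arithmetic.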

  14. Experimental setup • OMAP 3 family (OMAP3530) • ARM Cortex-A8 CPU • PowerVR SGX535 GPU • 3 set-ups: • Beagleboard revision 3 • Zoom AM3517EVM (TI Sitara) • Nokia N900

  15. Processing times: LBP extraction • Computing LBP in four channels (version 2) faster than computing in one • CPU faster than GPU • Concurrent execution of algorithms in GPU + CPU increases performance

  16. Processing times: Preprocessing • GPU outperforms CPU in pixelwise simple operations (scaling + interpolation) • Concurrent execution of algorithms in GPU + CPU slower than GPU alone due to data transfers
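The "scaling + interpolation" operation mentioned here is the kind of pixelwise work that GPU texture units perform in hardware. A CPU equivalent can be sketched as bilinear resizing. The Python below is an illustrative sketch under assumptions: the slides do not say which interpolation was used, and the names `bilinear_sample` and `resize` and the pixel-centre mapping are hypothetical choices.

```python
def bilinear_sample(img, y, x):
    """Sample a 2D list of gray values at fractional coordinates (y, x)."""
    h, w = len(img), len(img[0])
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0
    # Blend horizontally on the two rows, then vertically between them.
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def resize(img, new_h, new_w):
    """Resize a grayscale image with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = []
    for j in range(new_h):
        # Map the output pixel centre to input coordinates, clamped to the image.
        sy = min(max((j + 0.5) * h / new_h - 0.5, 0.0), h - 1)
        row = []
        for i in range(new_w):
            sx = min(max((i + 0.5) * w / new_w - 0.5, 0.0), w - 1)
            row.append(bilinear_sample(img, sy, sx))
        out.append(row)
    return out

small = [[0, 10],
         [20, 30]]
print(resize(small, 1, 1))  # prints [[15.0]]
```

Every output pixel is computed independently of the others, which is why this operation maps well onto a fragment shader.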

  17. Speed (II): Preprocessing

  18. Speed (II): Preprocessing

  19. Power and energy consumption • The power consumption of the GPU and of the CPU are independent of each other • CPU – 190 mW • GPU – 110 mW to 130 mW (increases with image size) • Energy consumption depends on processing time • The GPU uses less energy per operation
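The relation between the power figures above and energy is E = P × t. The Python sketch below works through the arithmetic. The power values come from the slide, but the processing times are hypothetical numbers chosen only for illustration, and the helper name `energy_mj` is not from the source.

```python
def energy_mj(power_mw, time_ms):
    """Energy in millijoules: E = P * t, with P in mW and t in ms."""
    return power_mw * time_ms / 1000.0   # mW * ms = microjoules

CPU_POWER_MW = 190.0   # from the slide
GPU_POWER_MW = 130.0   # upper end of the 110-130 mW range on the slide

# Hypothetical processing times, NOT measurements from the presentation.
cpu_time_ms = 100.0
gpu_time_ms = 120.0

cpu_energy = energy_mj(CPU_POWER_MW, cpu_time_ms)   # 19.0 mJ
gpu_energy = energy_mj(GPU_POWER_MW, gpu_time_ms)   # 15.6 mJ

# The GPU uses less energy as long as its time stays below this bound.
break_even_ms = cpu_time_ms * CPU_POWER_MW / GPU_POWER_MW
print(cpu_energy, gpu_energy, round(break_even_ms, 1))
```

With these assumed times the GPU is slower yet still uses less energy, because its lower power draw outweighs the longer run time up to the break-even point.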

  20. Summary • GPUs can be used as general-purpose processors • New platforms will offer more efficiency and flexibility • Non-optimized interfaces incur excessive overheads

  21. Future directions • Implementation of classifier • Implementations in OpenCL • Multi-scale LBP • Implementation of other feature extraction

  22. Thank you! • Any questions? Thanks to Texas Instruments for the donation of the hardware
