Acceleration of motion estimation by edge detection algorithm using PLX sub-word parallel ISA

Presentation Transcript


  1. Acceleration of motion estimation by edge detection algorithm using PLX sub-word parallel ISA • Dongkeun Oh • Sanghamitra Roy

  2. Low bit rate video coding (1) • Block-based algorithms • H.263, MPEG-1, MPEG-2 • Good: easy to implement, good image quality at low bit rates • Bad: image quality degrades at very low bit rates

  3. Low bit rate video coding (2) • Object- or segmentation-based algorithms • Subdivide an image into moving objects and background • Good: efficient compression rate • Bad: hard to implement • Necessary condition: accurate representation of the shape of objects

  4. Edge detection for object recognition • A block can be visually continuous or discontinuous • The lines along a discontinuity are edges • Coded edges capture the structure of an image • Edge detection methods • Sobel • Laplace • Canny

  5. Canny’s edge detection • Stages • 1. Gaussian smoothing • 2. First derivatives in x and y for all pixels • 3. Magnitude of the gradient • 4. Non-maximal suppression • 5. Hysteresis to mark the edge pixels • We simulate the 2nd stage using PLX code

  6. Derivative mask • Gx(z5) = z6 - z4 • Gy(z5) = z8 - z2 • (z1..z9 label the 3x3 neighborhood row by row, with z5 the center pixel)

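  7. Unfold C code for x-derivative calculation
         for (r = 0; r < rows; r++) {
             pos = r * cols;
             del_x[pos] = s[pos + 1] - s[pos];          /* left border: one-sided difference */
             pos++;                                     /* move to column 1 before the interior loop */
             for (c = 1; c < (cols - 1); c++, pos++) {
                 del_x[pos] = s[pos + 1] - s[pos - 1];  /* interior: z6 - z4 */
             }
             del_x[pos] = s[pos] - s[pos - 1];          /* right border: one-sided difference */
         }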
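The mask on slide 6 and the x-derivative loop on slide 7 can be combined into one routine that produces both derivatives. The following is a minimal sketch, not the authors' code: the function name `first_derivatives` is hypothetical, and the y-derivative border handling is an assumption made by symmetry with the x-derivative loop.

```c
#include <assert.h>

/* Hypothetical helper: first derivatives of a rows x cols image `s`
 * stored row by row. Interior pixels use the central differences of the
 * mask (z6 - z4 and z8 - z2). Border pixels fall back to one-sided
 * differences, as the x-derivative loop does for the first and last
 * column. The y-direction borders are an assumption by symmetry. */
void first_derivatives(const short *s, int rows, int cols,
                       short *del_x, short *del_y)
{
    assert(rows >= 2 && cols >= 2);
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            int pos   = r * cols + c;
            int left  = (c > 0)        ? pos - 1    : pos;
            int right = (c < cols - 1) ? pos + 1    : pos;
            int up    = (r > 0)        ? pos - cols : pos;
            int down  = (r < rows - 1) ? pos + cols : pos;
            del_x[pos] = (short)(s[right] - s[left]);
            del_y[pos] = (short)(s[down] - s[up]);
        }
    }
}
```

Calling `first_derivatives(s, rows, cols, del_x, del_y)` fills both output arrays in one pass. The per-pixel branches keep the sketch short, whereas the slide code peels the border columns out of the loop so that the interior loop is branch-free and easier to unfold.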

  8. Loop-unfolded C code for sub-word parallel implementation
         for (r = 0; r < 100; r++) {
             pos = r * cols;
             del_x[pos] = s[pos + 1] - s[pos];          /* left border */
             pos++;                                     /* move to column 1 before the interior loop */
             for (c = 1; c < 24; c++, pos += 4) {       /* four pixels per iteration */
                 del_x[pos]     = s[pos + 1] - s[pos - 1];
                 del_x[pos + 1] = s[pos + 2] - s[pos];
                 del_x[pos + 2] = s[pos + 3] - s[pos + 1];
                 del_x[pos + 3] = s[pos + 4] - s[pos + 2];
             }
             ….
             del_x[pos] = s[pos] - s[pos - 1];          /* right border */
         }

  9. PLX sub-word parallel ISA • Sub-word parallel ISA • 1-, 2-, 4-, or 8-byte sub-words • 32 general-purpose registers • Aligned memory addresses • 4/8 bytes • SIMD instructions operate on all sub-words of a register in parallel, which gives faster performance
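To illustrate what one sub-word parallel subtract does, the sketch below emulates it in portable C on four 16-bit sub-words packed into a 64-bit word. It is not PLX code and does not use any PLX mnemonic: the names `pack4`, `lane`, and `psub16` are hypothetical, and the lane order (element 0 in the low bits) is an assumption.

```c
#include <stdint.h>

/* High bit of each of the four 16-bit lanes. */
#define LANE_MSB 0x8000800080008000ULL

/* Hypothetical helper: pack four 16-bit values, element 0 in the low bits. */
uint64_t pack4(const int16_t v[4])
{
    uint64_t w = 0;
    for (int i = 0; i < 4; i++)
        w |= (uint64_t)(uint16_t)v[i] << (16 * i);
    return w;
}

/* Hypothetical helper: read lane i (0..3) as a signed 16-bit value. */
int lane(uint64_t w, int i)
{
    int bits = (int)((w >> (16 * i)) & 0xFFFFu);
    return (bits & 0x8000) ? bits - 65536 : bits;
}

/* Lane-wise x - y with 16-bit wraparound. Setting the top bit of every
 * lane of x and clearing it in y guarantees that no borrow crosses a
 * lane boundary; the final XOR restores the correct top bit per lane. */
uint64_t psub16(uint64_t x, uint64_t y)
{
    uint64_t low = (x | LANE_MSB) - (y & ~LANE_MSB);
    return low ^ ((x ^ ~y) & LANE_MSB);
}
```

With this emulation, the four statements in the body of the unfolded loop on slide 8 correspond to a single `psub16` of the packed values `s[pos + 1..pos + 4]` and `s[pos - 1..pos + 2]`.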

  10. Issues in PLX implementation • Interfacing with C code • short int = 2 bytes • Use fwrite/fread to write/read binary data from C • Memory-aligned loads • Load address must be a multiple of 4 bytes to avoid a trap • Load from an aligned address, then shift/add to get the required sub-words • Loops • Implemented using predicated jump instructions
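The "load from an aligned address, then shift" step can be sketched in portable C as follows. This is an emulation under stated assumptions, not PLX code: the helper name `load4_unaligned` is hypothetical, sub-words are 16 bits wide, each aligned 64-bit word holds four of them, and the lowest-numbered element sits in the low bits. The sketch merges the two shifted words with a bitwise OR.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return four consecutive 16-bit sub-words starting
 * at element index i, reading the buffer only as aligned 64-bit words.
 * words[k] holds elements 4k..4k+3, with element 4k in the low 16 bits.
 * The caller must ensure words[i / 4 + 1] exists when i is not a
 * multiple of 4. */
uint64_t load4_unaligned(const uint64_t *words, size_t i)
{
    size_t   k     = i / 4;                   /* aligned word holding element i */
    unsigned shift = (unsigned)(i % 4) * 16;  /* bit offset of element i */

    if (shift == 0)
        return words[k];                      /* already aligned: one load */

    /* Two aligned loads, then shift and merge the pieces. */
    return (words[k] >> shift) | (words[k + 1] << (64 - shift));
}
```

The aligned case is handled separately because shifting a 64-bit value by 64 bits is undefined in C.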

  11. Results • PLX: FFCF, FFB5, FFB5, 0002 • C: (values not preserved in this transcript)
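If the PLX results are 16-bit two's complement values (an assumption, based on the 2-byte short int noted on slide 10), they can be decoded for comparison with decimal output from C. The helper name `decode_s16` is hypothetical.

```c
#include <stdint.h>

/* Interpret a 16-bit pattern as a two's complement signed value. */
int decode_s16(uint16_t bits)
{
    return (bits & 0x8000u) ? (int)bits - 65536 : (int)bits;
}
```

Under that assumption, FFCF, FFB5, FFB5, 0002 decode to -49, -75, -75, 2.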

  12. Snapshot of PLX code

  13. Thanks !
