
Meandering Based Parallel 3DRS Algorithm for The Multicore Era

Ghiath Al-kadi‡, Jan Hoogerbrugge‡, Surendra Guntur‡, Andrei Terechko*, Marc Duranton‡ and Onno Eerenberg‡. ‡NXP Semiconductors, Eindhoven, the Netherlands. *Vector Fabrics, Eindhoven, the Netherlands.


Presentation Transcript


  1. Meandering Based Parallel 3DRS Algorithm for The Multicore Era. Ghiath Al-kadi‡, Jan Hoogerbrugge‡, Surendra Guntur‡, Andrei Terechko*, Marc Duranton‡ and Onno Eerenberg‡. ‡NXP Semiconductors, Eindhoven, the Netherlands. *Vector Fabrics, Eindhoven, the Netherlands. This paper appears in: Consumer Electronics (ICCE), 2010 Digest of Technical Papers International Conference on

  2. I. INTRODUCTION • True motion estimation • 3DRS • II. THE SCALABLE MEANDERING BASED 3DRS • III. EVALUATION, RESULTS AND CONCLUSION

  3. Introduction • True motion estimation: a method for finding the motion of objects • The motion vectors should represent the true motion of the objects in the video sequence

  4. Introduction For video compression applications it is enough to get the motion vector corresponding to the best match. This in turn results in lower residual energy and better compression. With traditional ME, true motion vectors can only be estimated for blocks containing enough texture.
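
The best-match search described on this slide can be sketched as an exhaustive block-matching search that minimises the sum of absolute differences (SAD). This is a minimal illustration, not code from the paper: the function names `sad` and `best_match`, the list-of-lists frame layout, and the choice of SAD as the match criterion are all assumptions.

```python
def sad(cur, ref, bx, by, dx, dy, bs):
    """SAD between the bs x bs block at (bx, by) in `cur` and the block
    displaced by (dx, dy) in `ref`. Returns None if it leaves the frame."""
    h, w = len(ref), len(ref[0])
    if not (0 <= bx + dx and bx + dx + bs <= w and
            0 <= by + dy and by + dy + bs <= h):
        return None
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(bs) for x in range(bs)
    )


def best_match(cur, ref, bx, by, bs, search):
    """Try every vector in a (2*search+1)^2 window and keep the lowest SAD."""
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cost = sad(cur, ref, bx, by, dx, dy, bs)
            if cost is not None and (best is None or cost < best[1]):
                best = ((dx, dy), cost)
    return best


# Toy frames (hypothetical data): `cur` is `ref` shifted one pixel to the left.
ref = [[y * 8 + x for x in range(8)] for y in range(8)]
cur = [[ref[y][min(x + 1, 7)] for x in range(8)] for y in range(8)]
print(best_match(cur, ref, 2, 2, 2, 1))  # ((1, 0), 0)
```

A search like this returns the vector with the lowest residual, which is what compression needs. In flat, low-texture blocks many vectors give almost the same SAD, so the winner need not be the true motion.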

  5. Difficulty of true motion estimation • When the video sequence is complex, especially when it contains small or fast-moving objects, motion vectors are not easy to estimate • Blocking artifacts • Object movement causes covering and uncovering (occlusion) of image regions

  6. Main application of true motion estimation • Frame rate up-conversion (FRC) • Raising the frame rate to 120 frames per second is becoming increasingly necessary with the advent of advanced high-resolution display technologies such as LCD and plasma • Motion estimation is an integral part of FRC • The quality of the interpolated frames depends on the quality of the motion vectors • FRC therefore needs true motion vectors

  7. How to find true motion vectors • The 3-Dimensional Recursive Search (3DRS) algorithm is one of the most widely used methods to find true motion • The 3DRS algorithm is based on block matching; in order to find true motion, it makes two assumptions: • (i) Objects are larger than a block of pixels; • (ii) Objects have inertia.

  8. 3DRS • For all other blocks, we have to rely on motion vectors that have already been estimated. • Construct a small set of candidate vectors based on spatial relations • Motion vectors are refined gradually, pass by pass, according to the motion of neighboring blocks, so that true motion is found by exploiting the spatial correlation of motion vectors

  9. 3DRS • However, since the picture is processed in a block-based fashion according to a specified scanning order, motion information for the current field is only available for the blocks that have already been processed in that scan order • For the remaining blocks, vectors estimated in a previous field are used; these are called temporal candidates.

  10. 3DRS • Each block has the following candidate motion vectors: • <1> Spatial prediction candidate set, defined relative to the position x of the current block in the current frame n • <2> Temporal candidate set (estimated from the previous frame)

  11. 3DRS • <3> Update candidate set: generated by adding small random vectors (u) to the spatial candidate set • The update vectors are relatively small • In theory, the update vector can be a random variable, e.g. with a Gaussian or uniform probability distribution • These (random) update vectors are essential for the convergence of the motion field and for correctly tracking variable object motion
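
The three candidate sets on slides 10 and 11 can be combined as in the sketch below. It is an illustration under stated assumptions: the name `build_candidates`, the tuple representation of vectors, the contents of `update_set`, and the removal of duplicate candidates are not taken from the paper.

```python
import random


def build_candidates(spatial, temporal, update_set, rng):
    """Return spatial + temporal candidates, plus one update candidate per
    spatial candidate (spatial vector + a randomly chosen small vector u)."""
    candidates = list(spatial) + list(temporal)
    for sx, sy in spatial:
        ux, uy = rng.choice(update_set)  # small random update vector u
        candidates.append((sx + ux, sy + uy))
    # Drop duplicates while keeping the original order (an assumption made
    # here to avoid evaluating the same vector twice).
    return list(dict.fromkeys(candidates))


# Hypothetical example values.
rng = random.Random(0)
update_set = [(1, 0), (-1, 0), (0, 1), (0, -1)]
cands = build_candidates([(2, 0), (2, 1)], [(3, 0)], update_set, rng)
print(len(cands) <= 5, cands[:3])
```

Only a handful of vectors is evaluated per block, which is what keeps 3DRS far cheaper than an exhaustive search.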

  12. General recursive process, from "True-Motion Estimation with 3-D Recursive Search Block Matching" by Gerard de Haan, Paul W. A. C. Biezen, Henk Huijgen, and Olukayode A. Ojo

  13. Relative position of spatial and temporal predictor

  14. Relative position of spatial and temporal predictor r = 2 has been experimentally found to be best for a block size of 8*8 pixels.

  15. 3DRS • Each pass of 3DRS motion estimation proceeds as follows: • Candidate vectors are taken from the candidate vector set of pass i-1 • Update vectors are randomly selected from the update set, US
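
A single recursive pass can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses a plain raster scan, takes the left and upper neighbours as spatial candidates, and tries every vector in a small update set instead of drawing one at random so that the result is reproducible. The names `run_pass`, `UPDATES` and `toy_cost` are hypothetical.

```python
UPDATES = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # assumed small update set


def run_pass(prev_field, cost, updates=UPDATES):
    """One pass over a 2D field of motion vectors (list of rows of tuples)."""
    rows, cols = len(prev_field), len(prev_field[0])
    field = [row[:] for row in prev_field]
    for i in range(rows):
        for j in range(cols):
            base = [prev_field[i][j]]          # result of pass i-1
            if j > 0:
                base.append(field[i][j - 1])   # left neighbour, this pass
            if i > 0:
                base.append(field[i - 1][j])   # upper neighbour, this pass
            cands = list(base)
            for vx, vy in base:                # add update candidates
                cands.extend((vx + ux, vy + uy) for ux, uy in updates)
            field[i][j] = min(cands, key=lambda v: cost(i, j, v))
    return field


def toy_cost(i, j, v):
    """Stand-in for a block-matching error; the true motion is (2, 0)."""
    return abs(v[0] - 2) + abs(v[1])


field0 = [[(0, 0)] * 3 for _ in range(2)]
field1 = run_pass(field0, toy_cost)
field2 = run_pass(field1, toy_cost)
print(field1[0])  # [(1, 0), (2, 0), (2, 0)]
print(field2[0])  # [(2, 0), (2, 0), (2, 0)]
```

In this toy case the first block can only move one update step per pass, while later blocks inherit and refine the improved vector immediately. That is the recursive propagation that makes the field converge.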

  16. convergence

  17. I. INTRODUCTION II. THE SCALABLE MEANDERING BASED 3DRS III. EVALUATION, RESULTS AND CONCLUSION

  18. THE SCALABLE MEANDERING BASED 3DRS • The scan order of 3DRS algorithms can follow either a “raster” or a “meandering” pattern, as shown in Fig. 1. One possible method involves processing Macro Blocks (MB) in scan order. • While the raster scanning pattern is easily parallelizable, it has inferior convergence properties compared to the meandering scan pattern. • The meandering scan, however, is quite challenging to parallelize due to the frequently changing scan direction, as shown in Fig. 1(A). • This paper addresses the above problem and presents a scalable, multi-(co)processor-friendly method to parallelize the meandering-based 3DRS motion estimation algorithm without compromising picture quality.
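
The two scan patterns can be written down as block visiting orders. A minimal sketch, assuming rows and columns are counted from 0 and that the meandering scan starts left to right on row 0 (the function names are illustrative):

```python
def raster_order(rows, cols):
    """Every row is scanned left to right."""
    return [(i, j) for i in range(rows) for j in range(cols)]


def meander_order(rows, cols):
    """The scan direction alternates on every row."""
    order = []
    for i in range(rows):
        js = range(cols) if i % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((i, j) for j in js)
    return order


print(raster_order(2, 3))   # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
print(meander_order(2, 3))  # [(0, 0), (0, 1), (0, 2), (1, 2), (1, 1), (1, 0)]
```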

  19. THE SCALABLE MEANDERING BASED 3DRS

  20. THE SCALABLE MEANDERING BASED 3DRS • An analysis of this algorithm leads to the following observations: • (i) each meandering scan is composed of two raster scans, one operating on the odd rows and one on the even rows, as depicted in Fig. 1(B); • (ii) the two raster scans depend on each other; • (iii) the relative position and temporal (spatial) nature of the candidates constantly change based on the current direction of the scan in progress.
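
Observation (i) can be checked with a short sketch. It assumes rows are counted from 0, with the even rows scanned left to right and the odd rows right to left. Which parity gets which direction is an assumption here, and `split_meander` is a hypothetical name.

```python
def split_meander(rows, cols):
    """Split a meandering scan into its two constituent raster scans."""
    left_to_right = [(i, j) for i in range(0, rows, 2) for j in range(cols)]
    right_to_left = [(i, j) for i in range(1, rows, 2)
                     for j in range(cols - 1, -1, -1)]
    return left_to_right, right_to_left


ltr, rtl = split_meander(4, 3)
print(ltr)  # [(0, 0), (0, 1), (0, 2), (2, 0), (2, 1), (2, 2)]
print(rtl)  # [(1, 2), (1, 1), (1, 0), (3, 2), (3, 1), (3, 0)]

# Interleaving the two scans row by row restores the meandering order
# (sorted() is stable, so the order inside each row is preserved).
meander = sorted(ltr + rtl, key=lambda mb: mb[0])
print(meander[:6])  # [(0, 0), (0, 1), (0, 2), (1, 2), (1, 1), (1, 0)]
```

Each of the two scans has a fixed direction, which is what makes it a candidate for the same parallelization as an ordinary raster scan.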

  21. THE SCALABLE MEANDERING BASED 3DRS If MB(i,j) is the current MB under consideration, then the spatial and temporal MBs available for candidate selection in the traditional 3DRS algorithm are S1ij and T1ij respectively.

  22. THE SCALABLE MEANDERING BASED 3DRS [Figure: spatial candidate sets S1, S2, S3 and temporal candidate sets T1, T2, T3]

  23. THE SCALABLE MEANDERING BASED 3DRS The variables α and β are presented for a left-to-right scan order; changing the scan direction implies swapping the contents of these variables.

  24. THE SCALABLE MEANDERING BASED 3DRS The motion information in the neighboring blocks that are processed in the same iteration (i.e. spatial candidates) is more accurate than that available from the previous scan iteration. With reference to the two raster scans shown in Fig. 1(B), the currently processed block MB(i,j), denoted as B, has only the MB denoted as A as a direct neighboring spatial candidate. All other direct neighboring candidates are temporal.

  25. THE SCALABLE MEANDERING BASED 3DRS

  26. THE SCALABLE MEANDERING BASED 3DRS • In order to maintain motion detection accuracy: • the spatial candidates are selected from the set S2ij instead of S1ij. • The temporal candidates are unchanged. • Thus, the parallel 3DRS algorithm constructs its candidate set from S2ij and T2ij.

  27. Parallelization of Raster Scan: “The 2D Wave” In Fig. 2, MB(i,j) can be processed as soon as MB(i-2,j+1) completes. This results in processing MBs in a diagonal wavefront manner, which is referred to as the “2D-Wave”.
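
The dependency rule on this slide can be turned into a simple schedule. The sketch below is an illustration under assumptions: every MB takes one time step, only one raster scan (every second row, left to right) is modelled, an MB also waits for its left neighbour in the same row, and at the right border the dependency falls back to the last MB of row i-2. The name `wave_schedule` is hypothetical.

```python
def wave_schedule(rows, cols, row_step=2):
    """Earliest start step of each MB when MB(i, j) waits for MB(i, j-1)
    and MB(i - row_step, j + 1)."""
    start = {}
    for i in range(0, rows, row_step):
        for j in range(cols):
            deps = []
            if j > 0:
                deps.append(start[(i, j - 1)])
            if i >= row_step:
                # Border assumption: clamp j + 1 to the last column.
                deps.append(start[(i - row_step, min(j + 1, cols - 1))])
            start[(i, j)] = max(deps) + 1 if deps else 0
    return start


sched = wave_schedule(rows=6, cols=4)
by_step = {}
for mb, t in sched.items():
    by_step.setdefault(t, []).append(mb)

print(sched[(2, 0)])  # 2
print(by_step[4])     # [(2, 2), (4, 0)]
```

MBs that share a start step lie on one diagonal and can run on different processors at the same time. In this model each row starts two steps after the row it depends on.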

  28. The runtime execution of the parallel 3DRS algorithm However, the quality of the motion detection can be compromised because the neighboring MBs (other than α or β) are not used as spatial candidates. To prevent this quality loss while still being able to find small objects, both raster scans can be executed simultaneously, as shown in Fig. 1(B). This is done by assigning a Motion Estimator (ME) (co)processor to each row.

  29. The runtime execution of the parallel 3DRS algorithm

  30. The runtime execution of the parallel 3DRS algorithm Fig. 3, for example, depicts a system in which the parallel 3DRS algorithm is mapped to four cores. The simultaneous execution of the 2D-wave processing of the two raster scans can be viewed as two distinct phases:

  31. The runtime execution of the parallel 3DRS algorithm Phase One: Execution from the start position of each row to around the middle of the row. Each raster scan executes the 2D wave, with ME1 using the (S1ij, T1ij) candidate set for block matching while the other processors use the (S2ij, T2ij) set (α and β are swapped according to the scan direction).

  32. The runtime execution of the parallel 3DRS algorithm Phase Two: Execution from around the middle of the row to the end of the row (see Fig. 3, right). By this point the processors executing the two raster scans have overlapped, so the eight neighboring MBs are spatial. Thus, ME1, ME2 and ME3 use the (S3ij, T3ij) candidate set while ME4 uses the (S1ij, T1ij) set for block matching.
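
For the four-core example of Fig. 3, the choice of candidate set per processor and phase can be summarised as a small lookup. The function name `candidate_set` and its arguments are hypothetical. The slides only describe the four-ME case, so using the same rule for other ME counts (first ME special in phase one, last ME special in phase two) is an assumption.

```python
def candidate_set(me_index, num_me, phase):
    """Return the (spatial, temporal) candidate set names used by
    processor ME<me_index> (1-based) in the given phase."""
    if not 1 <= me_index <= num_me:
        raise ValueError("me_index out of range")
    if phase == 1:
        # First half of the row: only ME1 keeps the traditional sets.
        return ("S1", "T1") if me_index == 1 else ("S2", "T2")
    if phase == 2:
        # Second half of the row: all but the last ME use the S3/T3 sets.
        return ("S1", "T1") if me_index == num_me else ("S3", "T3")
    raise ValueError("phase must be 1 or 2")


for phase in (1, 2):
    print(phase, [candidate_set(me, 4, phase) for me in range(1, 5)])
```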

  33. I. INTRODUCTION II. THE SCALABLE MEANDERING BASED 3DRS III. EVALUATION, RESULTS AND CONCLUSION

  34. EVALUATION The proposed parallel 3DRS algorithm is evaluated for various video streams by performing simulations on the NeXVP architecture. The underlying architecture consists of two homogeneous 4-issue-slot Trimedia cores with subset static interleaved multithreading (two foreground and two background threads).

  35. EVALUATION 3DRS motion estimation performs 125 scans/second for a Full HD (1920x1080) stream, compared to 29 scans/second on a single core running the parallel 3DRS code. For Quad HD (4096x2160) video, a rate of 100 scans/second was obtained on a similar architecture with 3 additional cores.
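
The reported rates can be put side by side with a little arithmetic. The derived figures below are not stated in the paper; they assume that one scan covers every pixel of the frame exactly once.

```python
full_hd_pixels = 1920 * 1080   # pixels per Full HD frame
quad_hd_pixels = 4096 * 2160   # pixels per Quad HD frame

speedup = 125 / 29                   # parallel rate vs. single-core rate
full_hd_rate = full_hd_pixels * 125  # pixels scanned per second
quad_hd_rate = quad_hd_pixels * 100  # pixels scanned per second

print(round(speedup, 2))                      # 4.31
print(full_hd_rate)                           # 259200000
print(quad_hd_rate)                           # 884736000
print(round(quad_hd_rate / full_hd_rate, 2))  # 3.41
```

Under that assumption, the Quad HD configuration processes roughly 3.4 times as many pixels per second as the Full HD configuration.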

  36. RESULT Qualitative evaluation of the picture quality indicates that the parallel implementation of the algorithm performs as well as the traditional 3DRS algorithm with no visible degradation in picture quality.

  37. RESULT

  38. Conclusion This paper presents a method to parallelize the meandering-based 3D recursive search (3DRS) motion estimation algorithm used in scan-rate up-conversion. The proposed algorithm is scalable and can easily be mapped to multiple processing units such as multithreaded processors, multicores and/or co-processors in order to cope with the increasingly hard-to-meet real-time requirements of next-generation video devices.

  39. Conclusion Experiments show that the picture quality of the proposed parallel 3DRS algorithm is as good as that of the original non-parallelized algorithm for most video sequences.
