
Highly Parallel Framework for HEVC Motion Estimation on Many-core Platform

Data Compression Conference 2013. Highly Parallel Framework for HEVC Motion Estimation on Many-core Platform. Chenggang Yan, Yongdong Zhang, Feng Dai and Liang Li. Outline. Introduction Related Work Proposed Method Experimental Results Conclusion. Introduction (1/2). HEVC





Presentation Transcript


  1. Data Compression Conference 2013 • Highly Parallel Framework for HEVC Motion Estimation on Many-core Platform • Chenggang Yan, Yongdong Zhang, Feng Dai and Liang Li

  2. Outline • Introduction • Related Work • Proposed Method • Experimental Results • Conclusion

  3. Introduction(1/2) • HEVC • coding tree unit (CTU)

  4. Introduction(2/2) • Local parallel method (LPM) • The maximum parallelism of LPM is less than or equal to 8. • Independent PUs (IPUs) • Directed acyclic graph (DAG)

  5. Related Work(1/2) • Local parallel method (LPM) [16] • Motion estimation region (MER) • [16] Minhua Zhou, “AHG10: Configurable and CU-group level parallel merge/skip,” JCTVC-H0082, Feb. 2012.

  6. Related Work(2/2) • Local parallel method (LPM) • M = 16 or 8 8

  7. Proposed Method • A. Data Dependency Analysis • B. DAG for CTUs • C. Highly Parallel Framework

  8. Proposed Method.A(1/3) • Independent PUs (IPUs) • The IPU’s left boundary and the MER’s left boundary do not overlap. • The IPU’s upper boundary and the MER’s upper boundary do not overlap.

  9. Proposed Method.A(2/3)

  10. Proposed Method.A(3/3) • Neighboring CTUs • left • upper • upper-left • upper-right

  11. Proposed Method • A. Data Dependency Analysis • B. DAG for CTUs • C. Highly Parallel Framework

  12. Proposed Method.B(1/4) • Generate a DAG to capture the dependency relationships of CTUs.

  13. Proposed Method.B(2/4) • DAG • Consists of a set of vertices V and a set of edges E. • A data dependency between two CTUs <=> an edge. • A processed CTU <=> its vertex is removed.
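
The vertex and edge sets described on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: it assumes each CTU depends on the four neighbours listed on slide 10 (left, upper, upper-left, upper-right), and the function name `build_ctu_dag` and the (row, column) coordinate convention are hypothetical.

```python
def build_ctu_dag(rows, cols):
    """Return (vertices, edges) for a rows x cols grid of CTUs.

    An edge (u, v) means CTU v depends on CTU u.
    Assumption: dependencies are the left, upper, upper-left and
    upper-right neighbours, as listed on slide 10.
    """
    # (row offset, column offset) of the neighbours a CTU depends on
    dependency_offsets = [(0, -1), (-1, 0), (-1, -1), (-1, 1)]
    vertices = [(i, j) for i in range(rows) for j in range(cols)]
    edges = []
    for i, j in vertices:
        for di, dj in dependency_offsets:
            ni, nj = i + di, j + dj
            # Neighbours outside the frame contribute no edge
            if 0 <= ni < rows and 0 <= nj < cols:
                edges.append(((ni, nj), (i, j)))
    return vertices, edges
```

Because every edge points from an earlier position in raster order to a later one, the graph cannot contain a cycle, which is what makes it a DAG.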

  14. Proposed Method.B(3/4) • Condition matrix (CM)
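
Since the CM is described in Step 1 as recording the number of related CTUs for each CTU, one plausible reading is that it holds the in-degree of each vertex in the DAG. The sketch below builds such a matrix under that assumption; the name `init_condition_matrix` and the use of the four neighbours from slide 10 are illustrative choices, not taken from the slides.

```python
def init_condition_matrix(rows, cols):
    """Build a condition matrix for a rows x cols grid of CTUs.

    Assumption: cm[i][j] is the number of in-frame neighbours
    (left, upper, upper-left, upper-right) that CTU (i, j) waits for.
    """
    dependency_offsets = [(0, -1), (-1, 0), (-1, -1), (-1, 1)]
    return [
        [
            sum(
                1
                for di, dj in dependency_offsets
                if 0 <= i + di < rows and 0 <= j + dj < cols
            )
            for j in range(cols)
        ]
        for i in range(rows)
    ]
```

For a frame of 3 x 4 CTUs this gives a first row of `[0, 1, 1, 1]` and `[2, 4, 4, 3]` for every later row, so only the top-left CTU is ready at the start.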

  15. Proposed Method.B(4/4)

  16. Proposed Method • A. Data Dependency Analysis • B. DAG for CTUs • C. Highly Parallel Framework

  17. Proposed Method.C(1/5)

  18. Proposed Method.C(2/5) • Step 1: Initialize DQ and CM. DQ is a waiting queue. CM records, for each CTU, the number of related CTUs. • Step 2: When values in the CM become zero, get the corresponding coordinates and push them into DQ.

  19. Proposed Method.C(3/5) • Step 3: Get coordinates from DQ and process the corresponding CTUs in parallel on the many-core platform. • Step 4: Update CM. When the CTU with coordinate (i, j) in CM has been processed, the values at coordinates (i+1, j), (i+1, j-1), (i, j+1) and (i+1, j+1) in CM are each decremented by one. • Step 5: Repeat steps 2 to 4 until all CTUs in the frame have been processed.
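
Steps 1 to 5 can be sketched as a small scheduling loop. This is a hedged illustration under stated assumptions, not the authors' code: the function name `schedule_ctus`, the `process` callback and the (row, column) convention are hypothetical, the initial CM values assume the four neighbour dependencies from slide 10, and the CTUs of each batch are processed one after another here as a stand-in for the parallel execution on a many-core platform.

```python
from collections import deque


def schedule_ctus(rows, cols, process=lambda ctu: None):
    """Simulate the DQ/CM scheduling loop; return the batches of CTUs
    that could run in parallel, in the order they become ready."""
    deps = [(0, -1), (-1, 0), (-1, -1), (-1, 1)]    # neighbours a CTU waits for
    dependents = [(0, 1), (1, -1), (1, 0), (1, 1)]  # CTUs updated in Step 4

    def inside(i, j):
        return 0 <= i < rows and 0 <= j < cols

    # Step 1: initialise CM (pending dependency counts) and DQ.
    cm = [[sum(inside(i + di, j + dj) for di, dj in deps)
           for j in range(cols)] for i in range(rows)]
    # Step 2 (initial): push coordinates whose CM value is already zero.
    dq = deque((i, j) for i in range(rows) for j in range(cols)
               if cm[i][j] == 0)

    batches = []
    while dq:  # Step 5: repeat until no CTU is left
        # Step 3: take every ready CTU; a real system would run these in parallel.
        batch = list(dq)
        dq.clear()
        for ctu in batch:
            process(ctu)
        # Step 4: decrement the CM entries of the dependent CTUs.
        for i, j in batch:
            for di, dj in dependents:
                ni, nj = i + di, j + dj
                if inside(ni, nj):
                    cm[ni][nj] -= 1
                    if cm[ni][nj] == 0:  # Step 2: newly ready CTU
                        dq.append((ni, nj))
        batches.append(batch)
    return batches
```

In this simplified batch model, where every CTU is assumed to take the same time, a frame of 3 x 4 CTUs is scheduled in 8 batches with at most 2 CTUs per batch. The figures reported in the presentation refer to the authors' own framework and are not reproduced by this sketch.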

  20. Proposed Method.C(4/5) • Maximum parallelism of CTU • Maximum parallelism of highly parallel framework • Average parallelism of highly parallel framework

  21. Proposed Method.C(5/5)

  22. Experimental Results(1/5)

  23. Experimental Results(2/5)

  24. Experimental Results(3/5)

  25. Experimental Results(4/5)

  26. Experimental Results(5/5)

  27. Conclusion(1/1) • The highly parallel framework provides sufficient parallelism for many-core platforms. • It uses the DAG-based order to parallelize CTUs.
