Deep Video Quality Assessor: From Spatio-temporal Visual Sensitivity to A Convolutional Neural Aggregation Network

Woojae Kim¹, Jongyoo Kim², Sewoong Ahn¹, Jinwoo Kim¹, and Sanghoon Lee¹
¹Department of Electrical and Electronic Engineering, Yonsei University
²Microsoft Research, Beijing, China

Presentation Transcript


  1. Deep Video Quality Assessor: From Spatio-temporal Visual Sensitivity to A Convolutional Neural Aggregation Network
     Woojae Kim¹, Jongyoo Kim², Sewoong Ahn¹, Jinwoo Kim¹, and Sanghoon Lee¹
     ¹Department of Electrical and Electronic Engineering, Yonsei University
     ²Microsoft Research, Beijing, China
     Yuwen Li

  2. Motivations
     • Temporal motion effect: i) temporal masking effect; ii) a severe error in the motion map makes spatial errors more visible to humans

  3. Motivations
     • Temporal memory for quality judgment

  4. Contributions
     • A Deep Video Quality Assessor (DeepVQA) that predicts the spatio-temporal sensitivity map
     • A Convolutional Neural Aggregation Network (CNAN) that borrows the idea of an attention mechanism

  5. Related Works
     • Spatio-temporal visual sensitivity: i) a spatio-temporal contrast sensitivity function (CSF); ii) a natural video statistics (NVS) theory; iii) existing deep-learning attempts failed to consider motion properties
     • Temporal pooling: i) average pooling; ii) adaptively pooling the temporal scores from the HVS perspective; iii) 'Neural Aggregation Network for Video Face Recognition' (CVPR 2017)
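
The two temporal-pooling options listed on slide 5, plain averaging and attention-style aggregation, can be contrasted in a small sketch. It is only an illustration under stated assumptions: the per-frame feature vectors, the `query` vector, and the function names are hypothetical, and the weighting follows the generic softmax-attention form rather than the exact CNAN layers.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def average_pool(frame_scores):
    """Baseline: every frame contributes equally to the video score."""
    return sum(frame_scores) / len(frame_scores)


def attention_pool(frame_features, frame_scores, query):
    """Weight each frame by softmax(query . feature), then sum the scores.

    `query` stands in for a learned kernel; here it is a hand-picked,
    hypothetical vector.
    """
    relevance = [sum(q * f for q, f in zip(query, feat)) for feat in frame_features]
    weights = softmax(relevance)
    pooled = sum(w * s for w, s in zip(weights, frame_scores))
    return pooled, weights


# Three frames: hypothetical 2-D features and per-frame quality scores.
features = [[0.1, 0.0], [0.9, 0.2], [0.2, 0.1]]
scores = [0.8, 0.3, 0.7]

uniform, _ = attention_pool(features, scores, query=[0.0, 0.0])
focused, weights = attention_pool(features, scores, query=[5.0, 0.0])
print(average_pool(scores), uniform, focused, weights)
```

With a zero query the weights are uniform and the result matches average pooling; a non-zero query lets a few frames dominate the video score, which is the behaviour an attention-based aggregator is meant to learn.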

  6. Framework

  7. Framework-Step 1
     • Input: distorted frame (normalized after subtracting the low-pass filtered frame); spatial error map

  8. Framework-Step 1
     • Input: frame difference map; temporal error map
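
The error-map inputs named on slides 7 and 8 can be sketched for a toy pair of frames. The transcript carries no equations for them, so every definition here is an assumption: a log-scaled squared difference for the spatial error map, absolute differences for the frame difference and temporal error maps, and `EPSILON = 1.0` as the stabilising constant. Frames are plain 2-D lists with intensities in [0, 1], and the low-pass normalization of the distorted frame is omitted.

```python
import math

EPSILON = 1.0  # assumed constant on the 0-255 intensity scale


def pixelwise(fn, a, b):
    """Apply fn to matching pixels of two equally sized 2-D lists."""
    return [[fn(x, y) for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def spatial_error_map(ref, dist, eps=EPSILON):
    """Assumed form: log(1 / (diff^2 + eps/255^2)) / log(255^2 / eps).

    Gives 1.0 where the frames match and about 0.0 at the maximum difference.
    """
    floor = eps / 255.0 ** 2
    scale = math.log(255.0 ** 2 / eps)
    return pixelwise(lambda r, d: math.log(1.0 / ((r - d) ** 2 + floor)) / scale, ref, dist)


def frame_difference_map(frame_t, frame_next):
    """Assumed form: absolute difference between two consecutive frames."""
    return pixelwise(lambda a, b: abs(b - a), frame_t, frame_next)


def temporal_error_map(ref_t, ref_next, dist_t, dist_next):
    """Assumed form: |frame difference of distorted - frame difference of reference|."""
    return pixelwise(lambda fd, fr: abs(fd - fr),
                     frame_difference_map(dist_t, dist_next),
                     frame_difference_map(ref_t, ref_next))


# Toy 2x2 frames: the distortion only hits the top-left pixel of frame t.
ref_t, ref_next = [[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.6], [0.5, 0.5]]
dist_t, dist_next = [[0.9, 0.5], [0.5, 0.5]], [[0.5, 0.6], [0.5, 0.5]]

e_s = spatial_error_map(ref_t, dist_t)
e_t = temporal_error_map(ref_t, ref_next, dist_t, dist_next)
print(e_s, e_t)
```

In this toy case the temporal error map is non-zero only at the distorted pixel: the genuine motion at the top-right pixel appears in both the reference and the distorted frame differences and cancels out.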

  9. Framework-Step 1
     • Intermediate output: spatio-temporal sensitivity map

  10. Framework-Step 1
     • Intermediate output: perceptual error map
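
How a sensitivity map and an error map might combine into a perceptual error map, and then into a single frame score, can be sketched as below. Both steps are assumptions, since the transcript gives no equation: the perceptual error map is taken as the element-wise product of the two maps, and the frame score as its spatial mean. The map values and function names are made up for illustration.

```python
def perceptual_error_map(sensitivity, error):
    """Assumed form: element-wise product of the sensitivity and error maps."""
    return [[s * e for s, e in zip(s_row, e_row)]
            for s_row, e_row in zip(sensitivity, error)]


def frame_score(perceptual_error):
    """Assumed pooling: mean of the perceptual error map over all pixels."""
    pixels = [p for row in perceptual_error for p in row]
    return sum(pixels) / len(pixels)


# Hypothetical 2x2 maps; in the framework a network would predict `sensitivity`.
sensitivity = [[1.0, 0.5], [0.0, 2.0]]
error = [[0.2, 0.4], [0.9, 0.1]]

p = perceptual_error_map(sensitivity, error)
print(p, frame_score(p))
```

A sensitivity of zero suppresses an error entirely (bottom-left pixel), while a sensitivity above one amplifies it (bottom-right pixel), so the same objective error can weigh differently depending on where it occurs.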

  11. Framework-Step 1

  12. Framework-Step 2

  13. Experiments

  14. Experiments

  15. Experiments

  16. Experiments

  17. Experiments

  18. Experiments

  19. Experiments

  20. Conclusion
     • (+) Shows how to tell a good story
     • (−) Reads like an integration of the authors' previous work
     • (−) Questionable generalization ability
     • (−) Hard to adapt to no-reference VQA (NR-VQA)

  21. References
     • Yang, J., Ren, P., Zhang, D., Chen, D., Wen, F., Li, H., Hua, G., Yang, J., Li, H., Dai, Y., et al.: Neural aggregation network for video face recognition. In: Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2492–2495 (2017)
     • Kim, J., Lee, S.: Deep learning of human visual sensitivity in image quality assessment framework. In: Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR) (2017)
