ACM Multimedia October 20, 2009

Manipulating Lossless Video in the Compressed Domain. William Thies (Microsoft Research India), Steven Hall and Saman Amarasinghe (Massachusetts Institute of Technology).


Presentation Transcript


  1. Manipulating Lossless Video in the Compressed Domain • William Thies (1), Steven Hall (2), Saman Amarasinghe (2) • (1) Microsoft Research India, (2) Massachusetts Institute of Technology • ACM Multimedia, October 20, 2009

  2. Processing in the Compressed Domain • Multimedia archives are growing rapidly • Monsters vs. Aliens production: 100 TB • Facebook photos: 400 TB • YouTube: 600 TB • (annotation: lossless prior to distribution) • How to analyze or modify the data? • Typical practice: Compressed Input → Uncompress → Process → Recompress → Compressed Output • Compressed-domain transformation: Compressed Input → Process → Compressed Output

  3-4. Prior Work: Focus on Lossy Formats • DCT-based spatial compression (JPEG, MPEG stills) • Resizing [Dugad & Ahuja 2001] [Mukherjee & Mitra 2002] • Edge detection [Shen & Sethi 1996] • Image segmentation [Feng & Jiang 2003] • Shearing and rotating inner blocks [Shen & Sethi 1998] • Linear combinations of pixels [Smith & Rowe 1996] • DCT-based temporal compression (MPEG video) • Captioning [Nang, Kwon, & Hong 2000] • Reversal [Vasudev 1998] • Distortion detection [Dorai, Ratha, & Bolle 2000] • Transcoding [Acharya & Smith 1998] • Almost no work on lossless formats • Transpose and rotation of black/white images [Shoji 1995; Misra et al. 1999] • Pattern matching in compressed text [Farach & Thorup 1998; Navarro 2003] • Modifying pitch and playback of audio [Levine 1998] • Our Focus: Regular Processing of LZ77-Compressed Data Streams

  5. Example • Input: O O O O L A L A L A • Operation: to lowercase • Output: o o o o l a l a l a

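  6. Example • Input: O O O O L A L A L A • Compressed Input: O O O O L A L A L A (no repeats identified yet) • Output: o o o o l a l a l a

  7. Example • Input: O O O O L A L A L A • Compressed Input: O O O O L A L A L A, with the last four items marked as a repeat (count 4, distance 2) • Output: o o o o l a l a l a

  8. Example • Input: O O O O L A L A L A • Compressed Input: O O O O L A ⟨distance 2, count 4⟩ • Each "repeat token" stores a count and a distance • Output: o o o o l a l a l a

  9. Example • Input: O O O O L A L A L A • Compressed Input: O O O O L A ⟨distance 2, count 4⟩, with the last three O's marked as a repeat (count 3, distance 1) • Output: o o o o l a l a l a

  10. Example • Input: O O O O L A L A L A • Compressed Input: O ⟨distance 1, count 3⟩ L A ⟨distance 2, count 4⟩ • Each "repeat token" stores a count and a distance • Output: o o o o l a l a l a

  11. Example • Input: O O O O L A L A L A • Compressed Input: O ⟨distance 1, count 3⟩ L A ⟨distance 2, count 4⟩ • Compressed Output: o ⟨distance 1, count 3⟩ l a ⟨distance 2, count 4⟩ • Output: o o o o l a l a l a

  12. Example • Input: O O O O L A L A L A • Compressed Input: O ⟨distance 1, count 3⟩ L A ⟨distance 2, count 4⟩ • Compressed Domain Transformation (compressed input to compressed output) • Compressed Output: o ⟨distance 1, count 3⟩ l a ⟨distance 2, count 4⟩ • Output: o o o o l a l a l a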

  13. Example
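The lowercase example can be reproduced outside StreamIt. The Python sketch below is illustrative only: `Repeat`, `decode`, and `map_compressed` are hypothetical names, not part of the authors' compiler. It assumes an operation that consumes one item and produces one item, in which case literals are transformed and repeat tokens are copied through unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repeat:
    distance: int  # how far back in the decoded output the copy starts
    count: int     # how many items to copy


def decode(stream):
    """Expand an LZ77-style stream of literals and Repeat tokens."""
    out = []
    for tok in stream:
        if isinstance(tok, Repeat):
            for _ in range(tok.count):
                # Copy one item at a time so overlapping repeats work.
                out.append(out[-tok.distance])
        else:
            out.append(tok)
    return out


def map_compressed(f, stream):
    """Apply a 1-in/1-out function to literals; leave repeats untouched."""
    return [tok if isinstance(tok, Repeat) else f(tok) for tok in stream]


compressed_in = ["O", Repeat(1, 3), "L", "A", Repeat(2, 4)]
compressed_out = map_compressed(str.lower, compressed_in)

print("".join(decode(compressed_in)))   # OOOOLALALA
print("".join(decode(compressed_out)))  # oooolalala
```

Only three literals are touched here instead of ten items, which is the intuition behind speedups that scale with the compression ratio.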

  14. Our Contributions • Handle the general case • Produce and consume more than one data item • Split and join data streams • Implement in a compiler • Programmer thinks in terms of uncompressed data • Compiler translates to work on compressed data • Relies on StreamIt programming language • Evaluate on video processing tasks • 12 videos in Apple Animation format • Adjust colors or overlay two videos • Speedups proportional to compression ratio (median 15x)

  15. In This Talk • StreamIt Language • Compressed Domain Transformation • Experimental Evaluation

  16. The StreamIt Language
      void->void pipeline FMRadio(float freq1, float freq2, int N) {
        add AtoD();
        add FMDemod();
        // Duplicate the signal into N band-pass branches.
        add splitjoin {
          split duplicate;
          for (int i=0; i<N; i++) {
            add pipeline {
              add LowPassFilter(freq1 + i*(freq2-freq1)/N);
              add HighPassFilter(freq2 + i*(freq2-freq1)/N);
            }
          }
          join roundrobin();
        }
        add Adder();
        add Speaker();
      }
      Stream graph: AtoD → FMDemod → Duplicate → (LPF1 → HPF1 | LPF2 → HPF2 | LPF3 → HPF3) → RoundRobin → Adder → Speaker

  17. The StreamIt Language • Applications • DES and Serpent [PLDI 05] • MPEG-2 [IPDPS 06] • SAR, DSP benchmarks, JPEG, … • Programmability • StreamIt Language (CC 02) • Teleport Messaging (PPOPP 05) • Programming Environment in Eclipse (P-PHEC 05) • Domain Specific Optimizations • Linear Analysis and Optimization (PLDI 03) • Optimizations for bit streaming (PLDI 05) • Linear State Space Analysis (CASES 05) • Architecture Specific Optimizations • Compiling for Communication-Exposed Architectures (ASPLOS 02 & 06, dasCMP 07) • Phased Scheduling (LCTES 03) • Cache Aware Optimization (LCTES 05) • Load-Balanced Rendering (Graphics Hardware 05) • Migrating Legacy Code to a Stream Representation: Using a Dynamic Analysis (MICRO 07)

  18. Language Primitives • Filter: pop N, push M (for example, pop 2 push 1) • Splitter and Joiner: roundrobin(N,M) (for example, roundrobin(1,1) and roundrobin(2,2)) • Model of computation also known as cyclo-static dataflow

  19. Example: Video Compositing • Source 1 and Source 2 → roundrobin(1,1) joiner → MultiplyPixels (pop 2, push 1) → Output
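The compositing graph can be read as two ordinary functions on uncompressed data. The Python sketch below is only an illustration of that reading; `roundrobin_join` and `multiply_pixels` are hypothetical helpers, and the 8-bit scaling (`// 255`) is an assumption that the slide does not specify.

```python
def roundrobin_join(a, b, w1=1, w2=1):
    """Interleave two streams: w1 items from a, then w2 items from b."""
    out = []
    i = j = 0
    while i + w1 <= len(a) and j + w2 <= len(b):
        out.extend(a[i:i + w1])
        out.extend(b[j:j + w2])
        i += w1
        j += w2
    return out


def multiply_pixels(stream):
    """Filter with pop 2, push 1: multiply each adjacent pair of pixels.

    Assumes 8-bit pixel values, so the product is rescaled by 255.
    """
    return [stream[k] * stream[k + 1] // 255
            for k in range(0, len(stream) - 1, 2)]


source1 = [255, 128, 0]
source2 = [255, 255, 77]
interleaved = roundrobin_join(source1, source2)
output = multiply_pixels(interleaved)
print(interleaved)  # [255, 255, 128, 255, 0, 77]
print(output)       # [255, 128, 0]
```

The programmer writes only this uncompressed view; the point of the talk is that a compiler can derive a version that operates on repeat tokens instead.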

  20. In This Talk • StreamIt Language • Compressed Domain Transformation • Experimental Evaluation

  21-22. Transforming Windows of Data • Input: O O O O L A L A L A • HyphenatePairs (inserts a hyphen after every pair of items) • Output: O O – O O – L A – L A – L A –

  23-24. Transforming Windows of Data • Input: O O O O L A L A L A • Compressed Input: O ⟨distance 1, count 3⟩ L A ⟨distance 2, count 4⟩ • Compressed Output (for the L A portion): L A – ⟨distance 3, count 6⟩ • Output: O O – O O – L A – L A – L A –

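  25. Transforming Windows of Data • Input: O O O O L A L A L A • Compressed Input: O ⟨distance 1, count 3⟩ L A ⟨distance 2, count 4⟩ • Coarsened, Expanded: O O ⟨distance 2, count 2⟩ L A ⟨distance 2, count 4⟩ • Compressed Output: O O – ⟨distance 3, count 3⟩ L A – ⟨distance 3, count 6⟩ • Output: O O – O O – L A – L A – L A –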

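  26. General Case: Filters • A filter pops I items and pushes O items per execution; its input carries a repeat token with distance D and count N • Coarsen: D' = LCM(D, I), N' = N - (D' - D) • Translate: N'' = N' - N' % I; the output receives a repeat token with distance D'·O/I and count N''·O/I, and the remaining N' % I items stay on the filter's input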
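The coarsen and translate steps are plain integer arithmetic, so they can be checked by hand or with a few lines of code. The Python sketch below is an illustration under stated assumptions, not the authors' implementation: `coarsen` and `translate` are hypothetical names, the token is assumed to start on a window boundary of the filter, and the remainder is taken as N' % I. The numbers come from the HyphenatePairs example, read as a filter with pop 2 and push 3.

```python
from math import gcd


def coarsen(distance, count, pop):
    """Make the repeat distance a multiple of the filter's pop rate.

    Returns (expanded, new_distance, new_count), where `expanded` items
    must first be written out as literals.
    """
    new_distance = distance * pop // gcd(distance, pop)  # D' = LCM(D, I)
    expanded = new_distance - distance                   # D' - D
    if expanded > count:
        raise ValueError("repeat token is too short to coarsen")
    return expanded, new_distance, count - expanded      # N' = N - (D' - D)


def translate(distance, count, pop, push):
    """Map a coarsened input token to an output token.

    Returns (out_distance, out_count, leftover), where `leftover` input
    items do not fill a whole window and stay on the filter's input.
    """
    if distance % pop != 0:
        raise ValueError("distance must be a multiple of pop; coarsen first")
    leftover = count % pop               # N' % I
    whole = count - leftover             # N''
    return distance * push // pop, whole * push // pop, leftover


# HyphenatePairs: pop 2, push 3.
print(coarsen(1, 3, 2))       # (1, 2, 2): one extra O literal, then distance 2, count 2
print(translate(2, 2, 2, 3))  # (3, 3, 0)
print(coarsen(2, 4, 2))       # (0, 2, 4): already aligned
print(translate(2, 4, 2, 3))  # (3, 6, 0)
```

The output counts 3 and 6, each at distance 3, add up with the six literals "O O –" and "L A –" to the 15 items of the hyphenated output.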

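  27. Splitting Streams • roundrobin(1,1) splitter • Input: L A L A L A L A L A • Compressed Input: L A ⟨distance 2, count 8⟩ • Compressed Outputs: L ⟨distance 1, count 4⟩ and A ⟨distance 1, count 4⟩

  28. Splitting Streams • roundrobin(2,2) splitter • Input: L A L A L A L A L A • Compressed Input: L A ⟨distance 2, count 8⟩

  29. Splitting Streams • roundrobin(2,2) splitter • Coarsened, Expanded Input: L A L A ⟨distance 4, count 6⟩ • Compressed Outputs: L A ⟨distance 2, count 4⟩ and L A ⟨distance 2, count 2⟩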

  30-34. Splitting and Joining: Transpose [figure, five animation steps: a block of X and O items with repeat tokens is transposed by splitting the stream and joining it again]

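  35. General Case: Joiners • A roundrobin(W1, W2) joiner receives a repeat token with distance D1 and count N1 on its first input and a repeat token with distance D2 and count N2 on its second input • If D1 % W1 = 0 and D2 % W2 = 0 and D1/W1 = D2/W2, the output receives a repeat token with distance D1(W1+W2)/W1 and count N'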
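The joiner condition says that both inputs must repeat after the same number of joiner rounds. The Python sketch below illustrates only the distance part of the rule; `joined_distance` and `roundrobin_join` are hypothetical helpers, and the output count N' is left out because the slide does not define it.

```python
def roundrobin_join(a, b, w1, w2):
    """Interleave two streams: w1 items from a, then w2 items from b."""
    out = []
    i = j = 0
    while i + w1 <= len(a) and j + w2 <= len(b):
        out += a[i:i + w1] + b[j:j + w2]
        i += w1
        j += w2
    return out


def joined_distance(d1, w1, d2, w2):
    """Output repeat distance, or None if the inputs do not line up."""
    if d1 % w1 == 0 and d2 % w2 == 0 and d1 // w1 == d2 // w2:
        return d1 * (w1 + w2) // w1
    return None


a = list("ABABAB")  # repeats with distance D1 = 2
b = list("xxx")     # repeats with distance D2 = 1
joined = roundrobin_join(a, b, 2, 1)
d = joined_distance(2, 2, 1, 1)

print("".join(joined))  # ABxABxABx
print(d)                # 3
# Every item equals the item d positions earlier, as the rule predicts.
print(all(joined[k] == joined[k - d] for k in range(d, len(joined))))  # True
```

When the condition fails, for example `joined_distance(3, 2, 1, 1)`, the sketch returns `None`, and the repeats would have to be handled some other way.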

  36. In This Talk • StreamIt Language • Compressed Domain Transformation • Experimental Evaluation

  37. Implementation • Implemented subset of transformations in StreamIt • User can change graph connectivity + filter functions • Supported file format: Apple Animation (part of .MOV) • Standard format for interchange of lossless video • Compression: Run-length encoding within a line + difference encoding between frames • Emit executable plugins for MEncoder and Blender • Allows integration with standard video editing workflow [figure: supported graphs are a 1-to-1 filter, and a 1-to-1 joiner with a 2-to-1 filter]
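One way to see why a format built on run-length and inter-frame difference encoding fits an LZ77 view is that both kinds of redundancy can be written as repeat tokens. The Python sketch below is an assumption-laden illustration: the `("run", ...)`, `("skip", ...)`, and `("literal", ...)` operations are hypothetical stand-ins, not the actual Apple Animation byte codes, and `frame_to_lz77` is not a function from the authors' system.

```python
def decode(stream):
    """Expand literals and ("repeat", distance, count) tokens."""
    out = []
    for tok in stream:
        if isinstance(tok, tuple):
            _, distance, count = tok
            for _ in range(count):
                out.append(out[-distance])
        else:
            out.append(tok)
    return out


def frame_to_lz77(ops, frame_size):
    """Rewrite hypothetical per-frame operations as LZ77-style tokens."""
    tokens = []
    for op in ops:
        kind = op[0]
        if kind == "run":          # ("run", value, n): n copies of value
            _, value, n = op
            tokens.append(value)
            if n > 1:
                tokens.append(("repeat", 1, n - 1))
        elif kind == "skip":       # ("skip", n): n pixels unchanged since last frame
            tokens.append(("repeat", frame_size, op[1]))
        elif kind == "literal":    # ("literal", [pixels])
            tokens.extend(op[1])
        else:
            raise ValueError(f"unknown operation: {kind}")
    return tokens


FRAME_SIZE = 4
frame1 = [("run", 7, 4)]
frame2 = [("skip", 2), ("literal", [1, 2])]
stream = frame_to_lz77(frame1, FRAME_SIZE) + frame_to_lz77(frame2, FRAME_SIZE)

print(stream)          # [7, ('repeat', 1, 3), ('repeat', 4, 2), 1, 2]
print(decode(stream))  # [7, 7, 7, 7, 7, 7, 1, 2]
```

In this reading a run becomes a repeat at distance 1, and an unchanged region becomes a repeat at a distance of one frame.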

  38. Experimental Methodology • Evaluated on 12 videos drawn from Internet video, computer animation, and stock digital television content • Two classes of transformations: 1. Color adjustment: inverse, brightness, contrast 2. Composite transformations: alpha-under, multiply [figures: alpha-under and multiply compositing examples]

  39. Results: Execution Time • Color adjustment: speedups of 2.5x to 471x (median 17x) • Compositing: speedups of 1.1x to 32x (median 6.6x) • Compression factor was low (≤1.1x) for one of the source videos [plot; axis label: Compression Factor Following Re-compression]

  40. Results: File Bloat • Masked-out areas not re-compressed • Saturated colors not re-compressed [plot; axis label: Compression Factor Following Re-compression]

  41. Opportunity: Ignoring “Dead” Data • Some pixels in composite frames do not depend on both input frames • Example: digital television mask (a low-performance case) • If two data streams are multiplied, and one of them is repeatedly zero, then the repeat can be copied to the output (regardless of the values in the other stream) • We expect this would fix performance of our outlier cases • Requires pattern matching on stream graph
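The zero-run idea can be sketched in a few lines. The slide presents it as an expected improvement rather than a feature that was built, so the Python code below is purely illustrative: `multiply_with_mask` is a hypothetical helper, and representing the mask as `(value, run_length)` pairs is an assumption made to keep the example short.

```python
def decode(stream):
    """Expand literals and ("repeat", distance, count) tokens."""
    out = []
    for tok in stream:
        if isinstance(tok, tuple):
            _, distance, count = tok
            for _ in range(count):
                out.append(out[-distance])
        else:
            out.append(tok)
    return out


def multiply_with_mask(pixels, mask_runs):
    """Multiply pixels by a run-length mask, skipping work on zero runs."""
    out = []
    pos = 0
    for value, run in mask_runs:
        if value == 0:
            # The product is zero whatever the other stream holds, so the
            # run is emitted as one literal plus a repeat; pixels are not read.
            out.append(0)
            if run > 1:
                out.append(("repeat", 1, run - 1))
        else:
            out.extend(p * value for p in pixels[pos:pos + run])
        pos += run
    return out


pixels = [5, 6, 7, 8, 9, 10]
mask_runs = [(0, 4), (1, 2)]  # four masked-out pixels, then two visible ones
compressed = multiply_with_mask(pixels, mask_runs)

print(compressed)          # [0, ('repeat', 1, 3), 9, 10]
print(decode(compressed))  # [0, 0, 0, 0, 9, 10]
```

The masked region costs two tokens regardless of how poorly the other video compresses, which is why this case matters for the low-performing outliers.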

  42. Extension to Other File Formats • High-efficiency mappings • Flic Video • Microsoft RLE • Targa (with run-length encoding) • Medium-efficiency mappings • Open EXR • Planar RGB → Re-arranges data by color or by byte • Low-efficiency mappings • ZIP • GZIP • PNG → Performs Huffman coding prior to LZ77

  43. Conclusions • New method for direct processing of lossless-encoded data streams • Relies on LZ77 compression and stream programming model • Supports operations on windows of data • Supports splitting, joining, and reordering data • Preliminary implementation in an automatic compiler • Write program on uncompressed data, run on compressed data • Good speedups in the context of video processing • 15x speedup (median) on color adjustment and compositing • Across 12 videos in Apple Animation format • May prove useful as more content authored in lossless formats • Scope for extending technique, finding new applications

  44. Extra Slides

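  45. General Case: Splitters • A roundrobin(U, V) splitter receives a repeat token with distance D and count N • Coarsen: D' = LCM(D, U+V), N' = N - (D' - D) • Translate: N'' = N' - N' % (U+V); the V-weighted output receives a repeat token with distance D'·V/(U+V) and count N''·V/(U+V) (and correspondingly with U for the U-weighted output), and the remaining N' % (U+V) items stay on the splitter's input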
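The splitter rule mirrors the filter rule, with U+V playing the role of the pop rate. The Python sketch below is an illustration, not the authors' code: `split_token` is a hypothetical name, the token is assumed to start on a splitter-round boundary, the remainder is taken as N' % (U+V), and the U-weighted output is filled in by symmetry with the V-weighted one.

```python
from math import gcd


def split_token(distance, count, u, v):
    """Translate one input repeat token across a roundrobin(u, v) splitter.

    Returns (expanded, first, second, leftover): `expanded` input items
    must be handled as literals before the token, `first` and `second`
    are (distance, count) pairs for the two outputs, and `leftover`
    input items remain to be routed afterwards.
    """
    period = u + v
    new_distance = distance * period // gcd(distance, period)  # D' = LCM(D, U+V)
    expanded = new_distance - distance                         # D' - D
    if expanded > count:
        raise ValueError("repeat token is too short to coarsen")
    coarse_count = count - expanded                            # N'
    leftover = coarse_count % period                           # N' % (U+V)
    whole = coarse_count - leftover                            # N''
    first = (new_distance * u // period, whole * u // period)
    second = (new_distance * v // period, whole * v // period)
    return expanded, first, second, leftover


# Input "L A" followed by a repeat with distance 2 and count 8.
print(split_token(2, 8, 1, 1))  # (0, (1, 4), (1, 4), 0)
print(split_token(2, 8, 2, 2))  # (2, (2, 2), (2, 2), 2)
```

With roundrobin(1,1) the token splits cleanly into two tokens of distance 1 and count 4. With roundrobin(2,2) two items are expanded first, and two leftover items still have to be routed after the translated tokens.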
