This study shows how combining the Discrete Cosine Transform (DCT) and quantization (Quant) components of a video encoder reduces the cycle count. Three buffer arrangements between the file reader and the combined DCT-Quant component are compared for cycle count and memory use: one large buffer holding both the 8-bit input and the 16-bit intermediate data, 16-bit input data whose variables are reused, and 8-bit input with temporary 16-bit variables. All three run in fewer cycles than the version with separate components, and the comparison shows the trade-off between memory consumption and speed.
Increasing Application Speed by Combining the DCT and Quant Components
Baseline: DCT and Quant as Separate Components

Pipeline: FREAD -> DCT -> Quant -> VLCEnc

FREAD: reads 8-bit input data from file
DCT: Discrete Cosine Transform
Quant: quantizes 6 DCT blocks (4Y + 1U + 1V)
VLCEnc: Variable Length Coding encoder

Fread-DCT buffer
  Y 8-bit (Y_frame_t): 12288 bytes
  U 8-bit (UV_frame_t): 3072 bytes
  V 8-bit (UV_frame_t): 3072 bytes
  Total: 18432 bytes

DCT-Quant buffer
  Y 16-bit (Y_frame16_t): 24576 bytes
  U 16-bit (UV_frame16_t): 6144 bytes
  V 16-bit (UV_frame16_t): 6144 bytes
  Parameter (travel_frame_pars_t): 112 bytes
  Total: 36976 bytes

Quant-VLCEnc buffer
  Macroblock (mb_frame_t): 37728 bytes
  Parameter (travel_frame_pars_t): 112 bytes
  Total: 37840 bytes

exolTry.out Cycles = 10340038
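The buffer totals above can be reproduced with a small C sketch. The type names come from the slides, but their definitions here are assumptions: a 128x96 luma plane with 4:2:0 chroma (64x48) is one layout that matches 12288 and 3072 bytes, and travel_frame_pars_t is modeled as an opaque 112-byte block because its fields are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: 128x96 luma, 64x48 chroma. The slides give only byte counts. */
enum { LUMA_W = 128, LUMA_H = 96, CHROMA_W = 64, CHROMA_H = 48 };

/* Hypothetical definitions for the type names shown on the slides. */
typedef struct { uint8_t pix[LUMA_H][LUMA_W]; }     Y_frame_t;    /* 12288 bytes */
typedef struct { uint8_t pix[CHROMA_H][CHROMA_W]; } UV_frame_t;   /*  3072 bytes */
typedef struct { int16_t pix[LUMA_H][LUMA_W]; }     Y_frame16_t;  /* 24576 bytes */
typedef struct { int16_t pix[CHROMA_H][CHROMA_W]; } UV_frame16_t; /*  6144 bytes */

/* ASSUMPTION: opaque placeholder; only the 112-byte size is known. */
typedef struct { unsigned char opaque[112]; } travel_frame_pars_t;

/* Size of the Fread-DCT buffer: one Y plane plus U and V planes, all 8-bit. */
size_t fread_dct_buffer_bytes(void)
{
    return sizeof(Y_frame_t) + 2 * sizeof(UV_frame_t);
}

/* Size of the DCT-Quant buffer: 16-bit planes plus the parameter block. */
size_t dct_quant_buffer_bytes(void)
{
    return sizeof(Y_frame16_t) + 2 * sizeof(UV_frame16_t)
         + sizeof(travel_frame_pars_t);
}

/* Size of a buffer that carries both sets of planes and the parameters. */
size_t combined_buffer_bytes(void)
{
    assert(sizeof(Y_frame16_t) == 2 * sizeof(Y_frame_t));
    return fread_dct_buffer_bytes() + dct_quant_buffer_bytes();
}
```

Under these assumptions the three functions return 18432, 36976, and 55408 bytes, matching the totals quoted on the slides.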
Solution 1: DCT-Quant with a Big Buffer

Pipeline: FREAD -> DCT-Quant -> VLCEnc

FREAD: reads 8-bit input data from file
DCT-Quant: Discrete Cosine Transform, then quantizing the DCT blocks
VLCEnc: Variable Length Coding encoder

Fread-DCTQuant buffer (Inpacket)
  Y 8-bit (Y_frame_t): 12288 bytes
  U 8-bit (UV_frame_t): 3072 bytes
  V 8-bit (UV_frame_t): 3072 bytes
  Parameter (travel_frame_pars_t): 112 bytes
  Y 16-bit (Y_frame16_t): 24576 bytes
  U 16-bit (UV_frame16_t): 6144 bytes
  V 16-bit (UV_frame16_t): 6144 bytes
  Total: 55408 bytes

DCTQuant-VLCEnc buffer (Outpacket)
  Parameter: 112 bytes
  Macroblock: 37728 bytes
  Total: 37840 bytes

Inside DCT-Quant: Discrete Cosine Transform processing, then the quantizing process.

Advantage:
1. Combined component -> header process removed -> faster

Disadvantages:
1. Big buffer -> wasting memory
2. Same variable structure in buffer -> no speed reduction in declaring the buffer

Baseline Fread-DCT buffer (separate components): 18432 bytes
exolTry.out Cycles (separate components) = 10340038
exolTry.out Cycles (Solution 1) = 10281109
Solution 2: DCT-Quant with 16-bit Input

Pipeline: FREAD -> DCT-Quant -> VLCEnc

FREAD: reads 16-bit input data from file
DCT-Quant: Discrete Cosine Transform, then quantizing the DCT blocks
VLCEnc: Variable Length Coding encoder

Fread-DCTQuant buffer (Inpacket)
  Y 16-bit (Y_frame16_t): 24576 bytes
  U 16-bit (UV_frame16_t): 6144 bytes
  V 16-bit (UV_frame16_t): 6144 bytes
  Parameter (travel_frame_pars_t): 112 bytes
  Total: 36976 bytes

DCTQuant-VLCEnc buffer (Outpacket)
  Parameter: 112 bytes
  Macroblock: 37728 bytes
  Total: 37840 bytes

Inside DCT-Quant: Discrete Cosine Transform processing, then the quantizing process.

Advantages:
1. Combined component -> header process removed -> faster
2. Reusing the input variables -> saving memory in the buffer

Disadvantage:
1. Input data must be 16-bit -> more processing needed if the input is 8-bit

Baseline Fread-DCT buffer (separate components): 18432 bytes
exolTry.out Cycles (separate components) = 10340038
exolTry.out Cycles (Solution 2) = 10231500
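The idea of reusing the input variables in Solution 2 can be illustrated with an in-place step on a 16-bit buffer: the values are overwritten where they sit, so no second buffer is declared. This is only a sketch. The function name and the uniform divide-by-step rule are assumptions, since the slides do not show the quantizer actually used, and the DCT step that would run first on the same buffer is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Quantize 16-bit coefficients in place.
 * ASSUMPTION: a plain uniform quantizer (integer division by qstep) stands in
 * for the real quantization rule, which the slides do not specify.
 */
void quantize_in_place(int16_t *coef, size_t count, int16_t qstep)
{
    assert(coef != NULL && qstep > 0);
    for (size_t i = 0; i < count; ++i) {
        /* C integer division truncates toward zero. */
        coef[i] = (int16_t)(coef[i] / qstep);
    }
}
```

Because the result overwrites the input, the memory cost stays at the size of the 16-bit planes, which is the saving the slide attributes to this solution.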
Solution 3: DCT-Quant with Temporary Variables

Pipeline: FREAD -> DCT-Quant -> VLCEnc

FREAD: reads 8-bit input data from file
DCT-Quant: Discrete Cosine Transform, then quantizing the DCT blocks
VLCEnc: Variable Length Coding encoder

Fread-DCT buffer
  Y 8-bit (Y_frame_t): 12288 bytes
  U 8-bit (UV_frame_t): 3072 bytes
  V 8-bit (UV_frame_t): 3072 bytes
  Total: 18432 bytes

Inpacket: Y 8-bit (12288), U 8-bit (3072), V 8-bit (3072), Parameter (112)

Temporary variables inside DCT-Quant
  Y 16-bit: 24576 bytes
  U 16-bit: 6144 bytes
  V 16-bit: 6144 bytes

DCTQuant-VLCEnc buffer (Outpacket)
  Macroblock: 37728 bytes
  Parameter: 112 bytes
  Total: 37840 bytes

Inside DCT-Quant: Discrete Cosine Transform processing, then the quantizing process.

Advantages:
1. Less buffer needed
2. Combined component -> header process removed -> faster

Disadvantage:
1. Declaring and freeing memory for temporary variables -> needs more time

exolTry.out Cycles (separate components) = 10340038
exolTry.out Cycles (Solution 3) = 10289104
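The temporary-variable approach can be sketched as follows: the 8-bit input is widened into a 16-bit scratch area that is allocated for the call and freed afterwards, which is where the extra time noted in the disadvantage comes from. All names here are hypothetical, and halve_stage is only a placeholder standing in for the real DCT and quantization work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* A processing stage that works in place on 16-bit data. */
typedef void (*stage_fn)(int16_t *data, size_t count);

/* PLACEHOLDER for DCT + Quant: halves each value so the flow can be tested. */
void halve_stage(int16_t *data, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        data[i] = (int16_t)(data[i] / 2);
    }
}

/*
 * Widen 8-bit input into a temporary 16-bit array, run the stage on it,
 * copy the result to out, then release the temporary.
 * Returns 0 on success, -1 if the temporary could not be allocated.
 */
int run_with_temp(const uint8_t *src, int16_t *out, size_t count, stage_fn stage)
{
    assert(src != NULL && out != NULL && stage != NULL);
    if (count == 0) {
        return 0;
    }

    int16_t *temp = malloc(count * sizeof *temp);  /* per-call allocation cost */
    if (temp == NULL) {
        return -1;
    }

    for (size_t i = 0; i < count; ++i) {
        temp[i] = (int16_t)src[i];                 /* 8-bit to 16-bit widening */
    }
    stage(temp, count);
    for (size_t i = 0; i < count; ++i) {
        out[i] = temp[i];
    }

    free(temp);                                    /* per-call release cost */
    return 0;
}
```

The input buffer stays at 8-bit size, and the 16-bit storage exists only while the stage runs. The allocation and release on every call are the price paid for that.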
Comparison

                   Solution 1: Big Buffer   Solution 2: 16-bit Input   Solution 3: Temp Vars
Buffer             55408 bytes              36976 bytes                18432 bytes
Cycles             10281109                 10231500                   10289104

Solution 1: Big Buffer
1. Big buffer -> wasting memory
2. No speed reduction in declaring the buffer

Solution 2: 16-bit Input
1. Reusing the input variables -> saving memory in the buffer
2. Input data must be 16-bit -> more processing needed if the input is 8-bit

Solution 3: Temp Vars
1. Less buffer needed
2. Declaring and freeing memory for temporary variables -> needs more time

Solution 2 has the lowest cycle count; Solution 3 has the smallest input buffer.
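To put the cycle counts in proportion, the saving of each solution relative to the separate-component run (10340038 cycles) can be computed with a short C sketch. The function names are illustrative, not taken from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute number of cycles saved relative to the baseline run. */
int64_t cycles_saved(int64_t baseline, int64_t solution)
{
    return baseline - solution;
}

/* Saving expressed as a percentage of the baseline cycle count. */
double percent_saved(int64_t baseline, int64_t solution)
{
    assert(baseline > 0);
    return 100.0 * (double)(baseline - solution) / (double)baseline;
}
```

With the figures from the slides, the savings are 58929 cycles for Solution 1, 108538 for Solution 2, and 50934 for Solution 3, or roughly 0.57%, 1.05%, and 0.49% of the baseline.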