Speeding up VirtualDub

Presented by: Shmuel Habari. Advisor: Zvika Guz. Software Systems Lab, Technion.


Presentation Transcript


  1. Speeding up VirtualDub • Presented by: Shmuel Habari • Advisor: Zvika Guz • Software Systems Lab, Technion

  2. What is VirtualDub? • VirtualDub is an incredibly popular open source video processing tool. • It is capable of merging videos, cutting scenes, adding subtitles and applying a wide variety of filters. It also supports third-party video compression (e.g. DivX). • VirtualDub is constantly being refined, expanded and adapted by its original creator, Avery Lee. http://www.virtualdub.org/

  3. VirtualDub’s Benchmark • The chosen benchmark applies the Resize filter, a frequently used, CPU-heavy filter. • The input is a vibrantly colored animation video, so that any flaw in the output would be visible. • The output video was produced without audio filtering and without third-party compression utilities.

  4. VTune Performance Analyzer • Analyzing the benchmark using VTune: • First step - VDFastMemcpyPartialMMX2

  5. Fast Memory Copy • This function copies large quantities of data from a source memory address to a destination memory address. • Each loop iteration loads data into the registers and then stores it to the destination address. • The loop then moves on to the next 64 bytes and continues until all the data has been copied. • In the observed runs, the function was called on 2048 bytes every time.
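
The copy loop described on this slide can be sketched in portable C++. This is only an illustration of the load-then-store pattern, not VirtualDub's MMX assembly: the function name `copy_in_blocks` and the `staging` buffer (which stands in for the registers) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the MMX routine: copies `bytes` bytes from `src`
// to `dst`, 64 bytes per loop iteration. The two regions must not overlap.
void copy_in_blocks(void* dst, const void* src, std::size_t bytes) {
    assert(dst != nullptr && src != nullptr);
    constexpr std::size_t kBlock = 64;  // bytes moved per iteration, as on the slide

    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);

    while (bytes >= kBlock) {
        unsigned char staging[kBlock];    // plays the role of the registers
        std::memcpy(staging, s, kBlock);  // load: source -> "registers"
        std::memcpy(d, staging, kBlock);  // store: "registers" -> destination
        s += kBlock;
        d += kBlock;
        bytes -= kBlock;
    }
    if (bytes > 0) {
        std::memcpy(d, s, bytes);  // tail shorter than one block
    }
}
```

With the 2048-byte calls observed in the benchmark, the loop body would run 32 times per call.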

  6. Clockticks Samples • VTune’s clocktick samples showed that, predictably, most clockticks were spent reading from memory.

  7. Dummy Loop • Given that, the solution was to fill the cache before beginning to copy the data. • I added a dummy loop, labeled @mainloop, that reads 1024 bytes ahead before blastloop runs. • When the cached data is used up and the end of the source data has not been reached, another 1024 bytes are read. • With the dummy loop, a speedup of 4.21% was gained.
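
The dummy-loop idea can be sketched in C++ as follows. This is a rough approximation under stated assumptions, not the actual assembly change: the function name `copy_with_dummy_loop` is hypothetical, the 64-byte cache-line size is assumed, and `std::memcpy` stands in for blastloop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical sketch: touch the next 1024 bytes of the source so they are
// pulled into the cache, then copy them, and repeat until the data runs out.
void copy_with_dummy_loop(void* dst, const void* src, std::size_t bytes) {
    assert(dst != nullptr && src != nullptr);
    constexpr std::size_t kAhead = 1024;  // bytes read ahead, per the slide
    constexpr std::size_t kLine = 64;     // assumed cache-line size

    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    volatile unsigned char sink = 0;  // stops the compiler from deleting the dummy reads

    while (bytes > 0) {
        const std::size_t chunk = std::min(bytes, kAhead);

        // Dummy loop: read one byte from each cache line of the chunk.
        for (std::size_t i = 0; i < chunk; i += kLine) {
            sink = s[i];
        }

        // Copy loop (stand-in for blastloop): reads should now hit the cache.
        std::memcpy(d, s, chunk);
        s += chunk;
        d += chunk;
        bytes -= chunk;
    }
    (void)sink;
}
```

Whether such a loop helps depends on the processor; modern hardware prefetchers may already hide much of this latency, so the 4.21% figure applies only to the setup measured here.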

  8. Threads • As stated before, the original VirtualDub is a project under active development. • The original creator had access to code optimization tools, VTune included, allowing him to improve the code himself and remove many pitfalls and errors common to non-optimized code. • Also, VirtualDub proved to be multithreaded, to a point:

  9. Threads • The first thread is the processing thread. The second thread is the audio thread; since audio was specifically disabled, it showed almost no activity. • Therefore, in theory, multithreading the processing thread was still possible.

  10. Threads • At first I had high hopes for multithreading VirtualDub. Studying the code, I concluded that it processes the video frame by frame, and scans each frame line by line. • The two approaches I decided to try were: • Processing two frames in parallel. • Cutting a frame in half and processing the top and bottom halves in parallel.
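
The second approach, splitting a frame into top and bottom halves, can be sketched with `std::thread` on a toy filter. Everything here is hypothetical: the frame is assumed to be a flat 8-bit buffer, and `invert_rows` and `invert_frame_in_halves` are names invented for the example. It works only because each row is independent, which is exactly the property that did not hold inside VirtualDub.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in filter: inverts every pixel in rows [row_begin, row_end).
void invert_rows(std::vector<unsigned char>& frame, std::size_t width,
                 std::size_t row_begin, std::size_t row_end) {
    for (std::size_t y = row_begin; y < row_end; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const std::size_t idx = y * width + x;
            frame[idx] = static_cast<unsigned char>(255 - frame[idx]);
        }
    }
}

// Processes the top half on a second thread and the bottom half on the caller.
void invert_frame_in_halves(std::vector<unsigned char>& frame,
                            std::size_t width, std::size_t height) {
    assert(frame.size() == width * height);
    const std::size_t middle = height / 2;

    std::thread top(invert_rows, std::ref(frame), width, std::size_t{0}, middle);
    invert_rows(frame, width, middle, height);  // bottom half
    top.join();                                 // wait for the top half
}
```

The two halves write to disjoint elements of the buffer, so no locking is needed in this sketch.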

  12. Threads • However, all my attempts at multithreading VirtualDub’s processing failed. • At first I believed the code was accessing global variables, but I discovered that they were private members of a much higher-level class. • Attempts to duplicate that class in order to split the workload failed.

  13. Threads • Lastly, I turned to OpenMP, hoping to use its built-in ability to duplicate variables into each thread. • VirtualDub’s complexity made it impossible for me to convert it to the Intel Compiler: every change resulted in a staggering number of errors, each requiring many small code changes, and still more that could not be solved. • Limiting the use of the Intel Compiler to only the necessary projects did not show an improvement.
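
The OpenMP capability referred to here is its set of data-sharing clauses, such as `private`, which give each thread its own copy of a variable. Below is a minimal sketch on a toy filter; `brighten_frame` and its parameters are hypothetical and unrelated to VirtualDub's code. OpenMP has to be enabled in the compiler (for example `-fopenmp` on GCC); without it the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical filter: adds `delta` to every pixel, clamped to [0, 255].
void brighten_frame(std::vector<unsigned char>& frame, int width, int height,
                    int delta) {
    assert(width >= 0 && height >= 0);
    assert(frame.size() ==
           static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    int value = 0;  // working variable that the serial code would share

    // private(value) gives every thread its own copy, so rows processed in
    // parallel do not race on it.
    #pragma omp parallel for private(value)
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t idx = static_cast<std::size_t>(y) * width + x;
            value = frame[idx] + delta;
            if (value > 255) value = 255;
            if (value < 0) value = 0;
            frame[idx] = static_cast<unsigned char>(value);
        }
    }
}
```

This only covers variables visible at the loop; state held as private members of a large class, as encountered in the previous slide, is not duplicated by such a clause.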

  14. Conclusion • A lot of time and effort was put into this project. • To my dismay, this is evident not in the percentage of speedup, but in error messages and various versions of the code, each a bit closer to a working version but never quite there. • The bottom line is that, despite the promise VirtualDub initially showed, too much optimization had already been done in it, leaving it optimized and too monstrously big and intricate for my optimization attempts.
