User Benefits of Non-Linear Time Compression

Liwei He and Anoop Gupta

Microsoft Research

Introduction
  • Time compression: key to browsing AV content
  • We focus on informational content
  • Audio time compression algorithms
    • Linear: speed up audio uniformly
    • Non-linear: exploit fine-grain structure of human speech (e.g. pause, phonemes)
  • How much more do users gain from more complex algorithms?
Methodology
  • Conduct user listening test
    • One Linear TC algorithm
    • Two Non-linear TC algorithms
      • Simple: Pause-removal followed by Linear TC
      • Sophisticated: Adaptive TC
  • Compare objective and subjective measurements
Linear Time Compression
  • Classic algorithms
    • Overlap Add (OLA) and Synchronized OLA (SOLA)
    • We use SOLA
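To make the SOLA idea concrete, here is a minimal pure-Python sketch, not the implementation used in the study. It reads frames from the input at a larger hop than it writes them to the output, shifts each frame within a small search window so that it lines up with the tail of the output, and cross-fades the overlap. The function name and the `frame`, `overlap` and `search` defaults are assumptions chosen for illustration.

```python
import math

def sola_compress(samples, speed, frame=400, overlap=100, search=50):
    """Shorten `samples` by roughly `speed` (> 1) with a basic SOLA loop.

    All parameter defaults are illustrative assumptions, in samples.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if len(samples) <= frame:
        return list(samples)
    hop_out = frame - overlap              # how far the output advances per frame
    hop_in = int(round(hop_out * speed))   # how far the input advances per frame
    out = list(samples[:frame])
    pos = hop_in
    while pos + frame + search <= len(samples):
        tail = out[-overlap:]
        # Pick the shift whose first `overlap` samples best match the output tail
        # (cross-correlation normalised by the candidate segment's energy).
        best_k, best_score = 0, -math.inf
        for k in range(search + 1):
            seg = samples[pos + k : pos + k + overlap]
            num = sum(a * b for a, b in zip(tail, seg))
            den = math.sqrt(sum(b * b for b in seg)) or 1.0
            if num / den > best_score:
                best_k, best_score = k, num / den
        start = pos + best_k
        # Linear cross-fade between the old tail and the aligned new frame.
        for i in range(overlap):
            w = i / overlap
            out[-overlap + i] = (1 - w) * tail[i] + w * samples[start + i]
        out.extend(samples[start + overlap : start + frame])
        pos += hop_in
    return out
```

Calling `sola_compress(signal, 1.5)` on a list of float samples returns a list about two thirds as long. Plain OLA corresponds to skipping the search and always using a shift of zero.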
Non-Linear Time Compression
  • Algorithm 1: Pause removal plus TC
    • Energy and Zero Crossing Rate analysis
    • Leave pauses of 150 ms or shorter untouched
    • Shorten pauses longer than 150 ms to 150 ms
    • Apply SOLA algorithm
    • PR shortens speech by 10-25%
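The pause-removal step can be sketched as follows. This is a simplified illustration, not the detector used in the study: a frame counts as a pause when both its mean energy and its zero-crossing rate are low, and any run of pause frames is capped at 150 ms. The 10 ms frame size and both thresholds are assumed values; only the 150 ms limit comes from the slide.

```python
def frame_features(frame):
    """Return (mean energy, zero-crossing rate) for one frame of samples."""
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return energy, crossings / (len(frame) - 1)

def remove_pauses(samples, rate, frame_ms=10, max_pause_ms=150,
                  energy_thresh=1e-4, zcr_thresh=0.25):
    """Cap every detected pause at `max_pause_ms`; leave shorter pauses alone.

    `frame_ms`, `energy_thresh` and `zcr_thresh` are illustrative assumptions.
    """
    frame_len = rate * frame_ms // 1000
    max_frames = max_pause_ms // frame_ms
    out, pause_run = [], 0
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        if len(frame) < 2:
            out.extend(frame)        # too short to analyse, keep as-is
            continue
        energy, zcr = frame_features(frame)
        # Low energy alone could be a quiet unvoiced consonant, which has a
        # high zero-crossing rate, so both measures must be low.
        is_pause = energy < energy_thresh and zcr < zcr_thresh
        pause_run = pause_run + 1 if is_pause else 0
        if pause_run <= max_frames:
            out.extend(frame)        # speech, or still within the 150 ms cap
    return out
```

The output of `remove_pauses` would then be passed to the SOLA stage to obtain PR-Lin.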
Non-Linear Time Compression (cont.)
  • Algorithm 2: Adaptive TC
    • Mimics people when talking fast
    • Pauses and silences are compressed the most
    • Stressed vowels are compressed the least
    • Consonants are compressed more than vowels
    • Consonants are compressed based on neighboring vowels
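The ordering above (pauses most, stressed vowels least) can be illustrated with a toy rate allocator. This is not the Adaptive TC algorithm from the talk: the segment labels are assumed to be given, the weights are hypothetical, and the influence of neighboring vowels on consonants is not modeled. The sketch only shows how per-segment speed-ups can differ while still hitting an overall target speed.

```python
# Hypothetical relative compressibility per segment class (not from the talk).
WEIGHTS = {"pause": 3.0, "consonant": 1.5, "vowel": 1.0, "stressed_vowel": 0.5}

def adaptive_rates(segments, target_speed):
    """Return one local speed-up per (label, duration_s) segment.

    Each segment gets speed 1 + a * weight; `a` is found by bisection so that
    the total compressed duration equals total / target_speed.
    """
    if target_speed < 1:
        raise ValueError("target_speed must be at least 1")
    total = sum(d for _, d in segments)
    goal = total / target_speed

    def compressed(a):
        return sum(d / (1 + a * WEIGHTS[label]) for label, d in segments)

    lo, hi = 0.0, 1.0
    while compressed(hi) > goal:     # grow the bracket until it contains the goal
        hi *= 2
    for _ in range(60):              # compressed() decreases as `a` grows
        mid = (lo + hi) / 2
        if compressed(mid) > goal:
            lo = mid
        else:
            hi = mid
    return [1 + hi * WEIGHTS[label] for label, _ in segments]
```

For example, `adaptive_rates([("pause", 0.3), ("consonant", 0.1), ("stressed_vowel", 0.2), ("vowel", 0.15)], 2.0)` gives the pause the largest speed-up and the stressed vowel the smallest, with the four compressed durations summing to half the original.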
System Implications
  • Computational complexity
    • Adaptive TC 10x more costly than Linear TC
  • Complexity in client-server implementation
    • Buffer management required for non-linear TC
  • Audio-video synchronization quality
User Study Goals
  • Highest intelligible speed
  • Comprehension
  • Subjective preference
  • Sustainable speed
Experiment Method
  • 24 subjects
  • 4 tasks for each subject
  • 3 time compression algorithms
    • Linear TC using SOLA (Linear)
    • Pause removal plus Linear TC (PR-Lin)
    • Adaptive TC (Adapt)
  • Each test takes approximately 30 minutes
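Since pause removal shortens speech by 10-25% on its own, the linear stage of PR-Lin needs a smaller factor than plain Linear to reach the same overall speed. The helper below works through that arithmetic; it assumes the speeds quoted in the study are overall speed-ups for the combined pipeline, which the slides do not state explicitly, and the function name is hypothetical.

```python
def linear_factor(overall_speed, removed_fraction):
    """Linear TC factor needed after pause removal to hit `overall_speed`.

    If pause removal leaves (1 - removed_fraction) of the original duration,
    the linear stage must supply overall_speed * (1 - removed_fraction).
    """
    if not 0 <= removed_fraction < 1:
        raise ValueError("removed_fraction must be in [0, 1)")
    return overall_speed * (1 - removed_fraction)

for speed in (1.5, 2.5):
    low = linear_factor(speed, 0.25)
    high = linear_factor(speed, 0.10)
    print(f"{speed}x overall -> linear stage between {low:.3f}x and {high:.3f}x")
```

Under that assumption, a 1.5x clip needs only about 1.125x to 1.35x from the linear stage, and a 2.5x clip about 1.875x to 2.25x.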
Highest Intelligible Speed Task
  • 3 clips from technical talks
  • Find the highest speed at which most of the words are understandable
Comprehension Task
  • 3 clips at 1.5x and 3 clips at 2.5x
  • Clips from TOEFL listening test
  • Answer 4 multiple choice questions
Subjective Preference Task
  • 3 pairs of clips at 1.5x
  • 3 pairs of clips at 2.5x
  • Each pair contains the same clip compressed with 2 of the 3 TC algorithms
  • Indicate preference on 3-point scale
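Three algorithms taken two at a time give exactly three pairings, which matches the three pairs per speed. The short sketch below enumerates them; that each speed covers every pairing once is an inference from the counts, not something the slides state.

```python
from itertools import combinations

ALGORITHMS = ["Linear", "PR-Lin", "Adapt"]

# Every unordered pair of distinct algorithms.
pairs = list(combinations(ALGORITHMS, 2))
for first, second in pairs:
    print(f"{first} vs {second}")
```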
Sustainable Speed Task
  • 3 clips, each 8 minutes long
  • Clips from a CD audio book
  • Find the maximum comfortable speed
  • Write a 4-5 sentence summary at the end
Highest Intelligible Speed Task
  • PR-Lin is significantly better than Adapt (p<.01)
Comprehension Task
  • Adapt is better than PR-Lin (p=.083) at 2.5x
Preference Task at 1.5x
  • Slight preference for PR-Lin (p=.093)
Preference Task at 2.5x
  • PR-Lin and Adapt do significantly better than Linear
Previous Works
  • Mach1 (Covell et al., ICASSP 98)
    • Comprehension and preference tasks
    • Comparing Linear and Mach1 (Adapt) at 2.6-4.2x
    • Comprehension scores 17% better w/ Mach1
    • 95% prefer Mach1 to Linear
    • No data below 2.0x
  • Other works (Harrigan, Omoigui, Li, Foulke)
    • 1.2-1.7x is the sustainable listening speed
Conclusions
  • Trade-off in TC algorithms is task-related
    • Listening: Linear TC is sufficient
    • Fast Forwarding: Non-linear TC is more suitable
  • Adapt TC is close to the way people talk fast
    • The limit lies in human listening and comprehension