
Topic 2



  1. Topic 2 Signal Processing Review (Some slides are adapted from Bryan Pardo’s course slides on Machine Perception of Music)

  2. Recording Sound
     • Mechanical vibration
     • Pressure waves
     • Motion-to-voltage transducer
     • Voltage over time
     ECE 492 - Computer Audition and Its Applications in Music, Zhiyao Duan 2013

  3. Microphones http://www.mediacollege.com/audio/microphones/how-microphones-work.html

  4. Pure Tone = Sine Wave
     $x(t) = A \sin(2\pi f t + \phi)$, where $t$ is time, $A$ is the amplitude, $f$ is the frequency, and $\phi$ is the initial phase.
     [Figure: a 440 Hz sine wave with its period $T$ marked; axes: Time (ms) vs. Amplitude]

  5. Reminders
     • Frequency, $f$, is measured in cycles per second, a.k.a. Hertz (Hz).
     • One cycle contains $2\pi$ radians.
     • Angular frequency, $\Omega$, is measured in radians per second and is related to frequency by $\Omega = 2\pi f$.
     • So we can rewrite the sine wave as $x(t) = A \sin(\Omega t + \phi)$.
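As a minimal sketch of the relation between frequency and angular frequency, the snippet below samples a pure tone. The function name `sine_wave`, the 44,100 Hz sample rate, and the 10 ms duration are illustrative assumptions, not values from the slides.

```python
import math

def sine_wave(amplitude, freq_hz, phase, sample_rate, num_samples):
    """Sample x(t) = A*sin(2*pi*f*t + phi) at the instants t = n / sample_rate."""
    angular_freq = 2 * math.pi * freq_hz  # radians per second
    return [amplitude * math.sin(angular_freq * n / sample_rate + phase)
            for n in range(num_samples)]

# 10 ms of a 440 Hz pure tone (assumed sample rate: 44,100 Hz)
tone = sine_wave(1.0, 440.0, 0.0, 44100, 441)
period_ms = 1000.0 / 440.0  # one period T = 1/f, expressed in milliseconds
print(len(tone), period_ms)
```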

  6. Fourier Transform
     $X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt$
     [Figure: a 440 Hz sine wave (Time (ms) vs. Amplitude) and its spectrum, with peaks at -440 Hz and 440 Hz (Frequency (Hz) vs. Amplitude)]

  7. We can also write
     $X(\Omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j \Omega t}\, dt$
     [Figure: the same sine wave (Time (ms) vs. Amplitude) and its spectrum plotted against angular frequency (Angular Frequency vs. Amplitude)]

  8. Complex Tone = Sine Waves
     [Figure: sine waves at 220 Hz + 660 Hz + 1100 Hz summed into one complex tone]

  9. Frequency Domain
     [Figure: the complex tone (Time (ms) vs. Amplitude) and its spectrum, with peaks at 220, 660, and 1100 Hz (Frequency (Hz) vs. Amplitude)]

  10. Harmonic Sound
     • 1 or more sine waves
     • Strong components at integer multiples of a fundamental frequency (F0) in the range of human hearing (20 Hz ~ 20,000 Hz)
     • Examples
        • 220 + 660 + 1100 is harmonic
        • 220 + 375 + 770 is not harmonic

  11. Noise
     • Lots of sines at random freqs. = NOISE
     • Example: 100 sines with random frequencies, such that .

  12. How strong is the signal?
     • Instantaneous value?
     • Average value?
     • Something else?

  13. Acoustical or Electrical
     • Acoustical: average intensity $I = \frac{1}{T}\int_0^T \frac{x^2(t)}{\rho c}\, dt$. View $x(t)$ as sound pressure; $c$ is the sound speed and $\rho$ is the density.
     • Electrical: average power $P = \frac{1}{T}\int_0^T \frac{x^2(t)}{R}\, dt$. View $x(t)$ as electric voltage; $R$ is the resistance.

  14. Root-Mean-Square (RMS)
     $x_{RMS} = \sqrt{\frac{1}{T}\int_0^T x^2(t)\, dt}$
     • $T$ should be long enough.
     • $x(t)$ should have 0 mean, otherwise the DC component will be integrated.
     • For sinusoids with amplitude $A$, $x_{RMS} = A/\sqrt{2}$.
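The RMS definition can be checked numerically on sampled data. This is a minimal sketch: the helper name `rms`, the choice to subtract the mean before squaring (one way to honor the zero-mean requirement), and the test-signal parameters are all assumptions made for illustration.

```python
import math

def rms(samples):
    """Root-mean-square of a sequence after removing its mean (DC component)."""
    mean = sum(samples) / len(samples)
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))

# One second of a 440 Hz sinusoid with amplitude 0.5, sampled at 8000 Hz (assumed values)
A, f, fs = 0.5, 440.0, 8000
x = [A * math.sin(2 * math.pi * f * n / fs) for n in range(fs)]
x_with_dc = [v + 0.3 for v in x]  # same signal with a DC offset added

print(rms(x), A / math.sqrt(2))  # the two values should agree closely
print(rms(x_with_dc))            # the offset is removed before squaring
```

Because the one-second window holds a whole number of cycles, the measured value matches the sinusoid rule very closely; a window that is too short or cuts a cycle partway would not.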

  15. Sound Pressure Level (SPL)
     • Softest audible sound intensity: $10^{-12}$ watt/m² (0.000000000001 watt/m²)
     • Threshold of pain is around 1 watt/m²
     • 12 orders of magnitude difference
     • A log scale helps with this
     • The decibel (dB) scale is a log scale, with respect to a reference value

  16. The Decibel
     • A logarithmic measurement that expresses the magnitude of a physical quantity (e.g. power or intensity) relative to a specified reference level.
     • Since it expresses a ratio of two (same unit) quantities, it is dimensionless.

  17. Lots of references!
     • dB SPL – A measure of sound pressure level. 0 dB SPL is approximately the quietest sound a human can hear, roughly the sound of a mosquito flying 3 meters away.
     • dBFS – relative to digital full-scale. 0 dBFS is the maximum allowable signal. Values typically negative.
     • dBV – relative to 1 Volt RMS. 0 dBV = 1 V.
     • dBu – relative to 0.775 Volts RMS with an unloaded, open circuit.
     • dBmV – relative to 1 millivolt across 75 Ω. Widely used in cable television networks.
     • ……
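A short sketch of how a decibel value is computed from a ratio. It assumes the usual conventions of 10·log10 for power-like quantities (power, intensity) and 20·log10 for amplitude-like quantities (voltage, pressure); the function names are made up for this example.

```python
import math

def power_db(value, reference):
    """Decibels for power-like quantities (power, intensity)."""
    return 10 * math.log10(value / reference)

def amplitude_db(value, reference):
    """Decibels for amplitude-like quantities (voltage, sound pressure)."""
    return 20 * math.log10(value / reference)

# Intensity of 1 watt/m^2 relative to the softest audible intensity, 1e-12 watt/m^2
print(power_db(1.0, 1e-12))

# 1 V RMS expressed in dBV (reference 1 V) and in dBu (reference 0.775 V)
print(amplitude_db(1.0, 1.0), amplitude_db(1.0, 0.775))
```

The 12 orders of magnitude between the softest audible intensity and 1 watt/m² collapse to a span of 120 dB on this scale.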

  18. Typical Values
     • Jet engine at 3 m: 140 dB SPL
     • Pain threshold: 130 dB SPL
     • Loud motorcycle, 5 m: 110 dB SPL
     • Vacuum cleaner: 80 dB SPL
     • Quiet restaurant: 50 dB SPL
     • Rustling leaves: 20 dB SPL
     • Human breathing, 3 m: 10 dB SPL
     • Hearing threshold: 0 dB SPL

  19. Digital Sampling
     [Figure: a waveform sampled at a fixed sample interval and quantized to 3-bit codes, with the quantization increment marked; axes: TIME vs. RECONSTRUCTION AMPLITUDE]

  20. More quantization levels = more dynamic range
     [Figure: the same waveform quantized to 4-bit codes, with a smaller quantization increment; axes: TIME vs. AMPLITUDE]

  21. Bit Depth and Dynamics
     • More bits = more quantization levels = better sound
     • Compact Disc: 16 bits = 65,536 levels
     • POTS (plain old telephone service): 8 bits = 256 levels
     • Signal-to-quantization-noise ratio (SQNR), if the signal is uniformly distributed in the whole range: $\mathrm{SQNR} = 20 \log_{10}(2^Q) \approx 6.02\,Q$ dB for a bit depth of $Q$ bits
     • E.g. 16 bits depth gives about 96 dB SQNR.
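The SQNR rule of thumb can be compared against a small simulation. This is only a sketch under stated assumptions: a full-scale signal uniformly distributed in [-1, 1), a uniform quantizer that reconstructs at the center of each step, and a fixed random seed; the function names are invented for the example.

```python
import math
import random

def sqnr_db(bits):
    """Theoretical SQNR for a uniformly distributed full-range signal."""
    return 20 * math.log10(2 ** bits)

def measured_sqnr_db(bits, num_samples=100000, seed=0):
    """Quantize a uniform random signal in [-1, 1) and measure the SQNR."""
    rng = random.Random(seed)
    step = 2.0 / (2 ** bits)  # quantization increment
    signal_power = 0.0
    noise_power = 0.0
    for _ in range(num_samples):
        x = rng.uniform(-1.0, 1.0)
        q = (math.floor(x / step) + 0.5) * step  # center of the quantization cell
        signal_power += x * x
        noise_power += (x - q) ** 2
    return 10 * math.log10(signal_power / noise_power)

for bits in (8, 16):
    print(bits, sqnr_db(bits), measured_sqnr_db(bits))
```

Signals that do not fill the whole range, such as quiet passages, reach a lower SQNR than this formula suggests.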

  22. RMS
     The red dots form the discrete signal $x[n]$: $x_{RMS} = \sqrt{\frac{1}{N}\sum_{n=0}^{N-1} x^2[n]}$
     [Figure: a waveform with its samples marked as red dots; vertical axis: Amplitude]

  23. Aliasing and Nyquist
     [Figure: a sampled waveform; axes: TIME (with the sample interval marked) vs. AMPLITUDE]

  24. Aliasing and Nyquist
     [Figure: continuation of the previous slide; same axes]

  25. Aliasing and Nyquist
     [Figure: continuation of the previous slide; same axes]

  26. Nyquist-Shannon Sampling Theorem
     • You can’t reproduce the signal if your sample rate isn’t faster than twice the highest frequency in the signal.
     • Nyquist rate: twice the highest frequency in the signal.
        • A property of the continuous-time signal.
     • Nyquist frequency: half of the sampling rate.
        • A property of the discrete-time system.

  27. Discrete-Time Fourier Transform (DTFT)
     The red dots form the discrete signal $x[n]$.
     $X(\omega) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j \omega n}$
     • $X(\omega)$ is periodic with period $2\pi$. We often only show $\omega \in [-\pi, \pi]$.
     • $\omega$ is a continuous variable.
     [Figure: the sampled signal (Amplitude) and its DTFT (Angular frequency vs. Amplitude)]

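  28. Relation between FT and DTFT
     Sampling: $x[n] = x_c(nT)$, where $x_c(t)$ is the continuous-time signal and $T = 1/f_s$ is the sample interval
     FT: $X_c(\Omega) = \int_{-\infty}^{\infty} x_c(t)\, e^{-j \Omega t}\, dt$
     DTFT: $X(\omega) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j \omega n}$
     • Scaling: $\omega = \Omega T$, i.e. $\omega = 2\pi$ corresponds to $\Omega = 2\pi / T$, which corresponds to $f = f_s$.
     • Repetition: $X(\omega)$ contains infinite copies of $X_c$, spaced by $2\pi$.
     [Figure: the sampled signal; axes: Time (ms) vs. Amplitude]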

  29. Aliasing
     [Figure: spectra of a complex tone (900 Hz + 1800 Hz). At a sampling rate of 8000 Hz both components appear at their true frequencies. At a sampling rate of 2000 Hz the 1800 Hz component aliases to 200 Hz.]
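The 200 Hz alias in this example can be reproduced with a few lines of arithmetic. The helper `alias_frequency` is a hypothetical name; it folds a real sinusoid's frequency into the range from 0 to half the sampling rate.

```python
import math

def alias_frequency(freq_hz, sample_rate):
    """Apparent frequency, in [0, sample_rate / 2], of a real sinusoid after sampling."""
    folded = freq_hz % sample_rate
    return min(folded, sample_rate - folded)

for fs in (8000, 2000):
    print(fs, [alias_frequency(f, fs) for f in (900, 1800)])

# At fs = 2000 Hz, samples of an 1800 Hz cosine coincide with those of a 200 Hz cosine
fs = 2000
high = [math.cos(2 * math.pi * 1800 * n / fs) for n in range(50)]
low = [math.cos(2 * math.pi * 200 * n / fs) for n in range(50)]
print(max(abs(a - b) for a, b in zip(high, low)))  # difference is at rounding-error level
```

Once sampled, the two cosines are indistinguishable, which is why content above the Nyquist frequency has to be removed before sampling rather than after.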

  30. Fourier Series
     • FT and DTFT do not require the signal to be periodic, i.e. the signal may contain arbitrary frequencies, which is why the frequency domain is continuous.
     • Now, if the signal is periodic: $x(t) = x(t + T)$
     • It can be reproduced by a series of sine and cosine functions: $x(t) = a_0 + \sum_{k=1}^{\infty} \left[ a_k \cos(2\pi k t / T) + b_k \sin(2\pi k t / T) \right]$
     • In other words, the frequency domain is discrete.

  31. Discrete Fourier Transform (DFT)
     • FT and DTFT are great, but the infinite integral or summations are hard to deal with.
     • In digital computers, everything is discrete, including both the signal and its spectrum.
     $X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}$, where $N$ is the length of the signal (i.e. the length of the DFT), $k$ is the frequency domain index, and $n$ is the time domain index.

  32. DFT and IDFT
     • Both $x[n]$ and $X[k]$ are discrete and of length $N$.
     • Treats $x[n]$ as if it were infinite and periodic.
     • Treats $X[k]$ as if it were infinite and periodic.
     • Only one period is involved in calculation.
     DFT: $X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}$
     IDFT: $x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{j 2\pi k n / N}$
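A direct, unoptimized sketch of the DFT and IDFT sums may help make the indices concrete. It assumes the common convention of placing the 1/N factor on the inverse transform; the function names and the 8-sample test signal are illustrative.

```python
import cmath
import math

def dft(x):
    """X[k] = sum over n of x[n] * exp(-j*2*pi*k*n/N), computed directly in O(N^2)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """x[n] = (1/N) * sum over k of X[k] * exp(+j*2*pi*k*n/N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

# A real signal: one cycle of a cosine plus a constant, 8 samples long
x = [1.0 + math.cos(2 * math.pi * n / 8) for n in range(8)]
X = dft(x)
x_back = idft(X)
print([round(abs(v), 6) for v in X])       # energy at k = 0, 1 and 7
print([round(v.real, 6) for v in x_back])  # matches x up to rounding
```

For this real input, the bins at k = 1 and k = N-1 = 7 are complex conjugates of each other, which is the conjugate symmetry expected for real signals.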

  33. Discrete Fourier Transform
     [Figure: the DFT maps the real and imaginary portions of the time-domain signal (indices 0 to N-1) to the real and imaginary portions of the frequency-domain signal (indices 0 to N-1), and the IDFT maps them back. Index 0 is DC and index N/2 corresponds to fs/2.]
     • If the time-domain signal has no imaginary part (like an audio signal) then the frequency-domain signal is conjugate symmetric around N/2.

  34. Kinds of Fourier Transforms
     • Fourier Transform. Signals: continuous, aperiodic. Spectrum: aperiodic, continuous.
     • Fourier Series. Signals: continuous, periodic. Spectrum: aperiodic, discrete.
     • Discrete Time Fourier Transform. Signals: discrete, aperiodic. Spectrum: periodic, continuous.
     • Discrete Fourier Transform. Signals: discrete, periodic. Spectrum: periodic, discrete.

  35. The FFT
     • Fast Fourier Transform
     • A much, much faster way to do the DFT
     • Introduced by Carl F. Gauss in 1805
     • Rediscovered by J.W. Cooley and John Tukey in 1965
     • The Cooley-Tukey algorithm is the one we use today (mostly)
     • Big O notation for this is O(N log N)
     • MATLAB functions fft and ifft are standard.
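To illustrate where the O(N log N) cost comes from, here is a minimal recursive radix-2 decimation-in-time sketch in the Cooley-Tukey style. It assumes the input length is a power of two and is written for clarity, not speed; production code would call a library routine. The naive DFT is included only for comparison.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # DFT of the even-indexed samples
    odd = fft(x[1::2])   # DFT of the odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + twiddle
        out[k + N // 2] = even[k] - twiddle
    return out

def naive_dft(x):
    """Direct O(N^2) evaluation of the DFT sum, for comparison."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [0.0, 1.0, 0.5, -0.25, 2.0, -1.0, 0.0, 0.75]
fast, slow = fft(x), naive_dft(x)
print(max(abs(a - b) for a, b in zip(fast, slow)))  # difference is at rounding-error level
```

Each level of recursion does O(N) work combining two half-size transforms, and there are log2(N) levels.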

  36. Windowing
     • A window is a function that is zero-valued outside of some chosen interval.
     • When a signal (data) is multiplied by a window function, the product is zero-valued outside the interval: all that is left is the "view" through the window.
     • Example: windowing x[n] with a rectangular window: x[n] × w[n] = z[n]

  37. Some famous windows
     • Rectangular
     • Triangular (Bartlett)
     • Hann
     Note: we assume w[n] = 0 outside some range [0,N]
     [Figure: one plot per window; axes: sample vs. amplitude]
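The window shapes named above can be generated as follows. The slide's formulas did not survive, so the expressions here are the common textbook definitions on the range [0, N], which is an assumption; the Hamming window, with its usual 0.54 and 0.46 coefficients, is added for comparison.

```python
import math

def rectangular(N):
    """All-ones window on n = 0..N."""
    return [1.0] * (N + 1)

def bartlett(N):
    """Triangular window: 0 at both ends, 1 in the middle."""
    return [1.0 - abs(2.0 * n / N - 1.0) for n in range(N + 1)]

def hann(N):
    """Raised cosine that reaches 0 at both ends."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / N)) for n in range(N + 1)]

def hamming(N):
    """Raised cosine on a pedestal: the ends stay at 0.08 instead of 0."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / N) for n in range(N + 1)]

N = 8
for name, w in (("rectangular", rectangular(N)), ("bartlett", bartlett(N)),
                ("hann", hann(N)), ("hamming", hamming(N))):
    print(name, [round(v, 3) for v in w])
```

Multiplying a frame by one of these sequences, sample by sample, is the windowing operation.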

  38. Why window shape matters
     • Don’t forget that a DFT assumes the signal in the window is periodic
     • The boundary conditions mess things up…unless you manage to have a window whose length is exactly 1 period of your signal
     • Making the edges of the window less prominent helps suppress undesirable artifacts

  39. Fourier Transform of Windows
     • We want
        • Narrow main lobe
        • Low sidelobes
     [Figure: spectrum of a window, with the main lobe and the sidelobes labeled]

  40. Which window is better?
     [Figure: spectra of the Hann window and the Hamming window]

  41. Multiplication vs. Convolution
     • Windowing is multiplication in time domain, so the spectrum will be a convolution between the signal’s spectrum and the window’s spectrum
     • Convolution in time domain takes $O(N^2)$, but if we perform it in the frequency domain…
        • FFT takes $O(N \log N)$
        • Multiplication takes $O(N)$
        • IFFT takes $O(N \log N)$
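The FFT, multiply, IFFT route can be compared with direct convolution in a small sketch. It assumes a simple recursive radix-2 FFT (so both inputs are zero-padded to a power-of-two length of at least len(a) + len(b) - 1) and computes the inverse with the conjugation trick; all names are illustrative.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k], out[k + N // 2] = even[k] + t, even[k] - t
    return out

def ifft(X):
    """Inverse FFT using ifft(X) = conj(fft(conj(X))) / N."""
    N = len(X)
    return [v.conjugate() / N for v in fft([c.conjugate() for c in X])]

def direct_convolution(a, b):
    """Time-domain convolution with a double loop, O(N^2)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def fft_convolution(a, b):
    """Convolution via zero-pad, FFT, pointwise multiply, inverse FFT."""
    length = len(a) + len(b) - 1
    size = 1
    while size < length:
        size *= 2
    A = fft(list(a) + [0.0] * (size - len(a)))
    B = fft(list(b) + [0.0] * (size - len(b)))
    y = ifft([p * q for p, q in zip(A, B)])
    return [v.real for v in y[:length]]

a, b = [1.0, 2.0, 3.0], [0.0, 1.0, 0.5]
print(direct_convolution(a, b))
print([round(v, 9) for v in fft_convolution(a, b)])
```

The zero padding matters: without it the product of the two DFTs gives a circular convolution, in which the tail wraps around onto the beginning.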

  42. Windowed Signal
     [Figure: the windowed signal]

  43. Spectrum of Windowed Signal
     • Two sinusoids: 1000 Hz + 1500 Hz
     • Sampling rate: 10 kHz
     • Window length: 100 (i.e. 100/10K = 0.01 s)
     • FFT length: 400 (i.e. 4 times zero padding)

  44. Zero Padding
     • Add zeros after (or before) the signal to make it longer
     • Perform DFT on the padded signal
     [Figure: the windowed signal followed by the padded zeros]

  45. Why Zero Padding?
     • Zero padding in time domain gives the ideal interpolation in the frequency domain.
     • It doesn’t increase (the real) frequency resolution!
     • 4 times is generally enough
     • Here the resolution is always fs/L = 100 Hz
     [Figure panels: no zero padding, 4 times zero padding, 8 times zero padding]
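The interpolation claim can be checked numerically with the parameters of the windowed-signal example (1000 Hz + 1500 Hz, 10 kHz sampling rate, window length 100, FFT length 400). This is a sketch under assumptions: a Hann window is used, and the DFT is evaluated directly rather than with an FFT; the function name `zero_padded_dft` is hypothetical.

```python
import cmath
import math

def zero_padded_dft(x, nfft):
    """DFT of x zero-padded to nfft points; the appended zeros add nothing to the sum."""
    return [sum(v * cmath.exp(-2j * cmath.pi * k * n / nfft) for n, v in enumerate(x))
            for k in range(nfft)]

fs, L = 10000, 100
window = [0.5 * (1 - math.cos(2 * math.pi * n / L)) for n in range(L)]
x = [(math.sin(2 * math.pi * 1000 * n / fs) + math.sin(2 * math.pi * 1500 * n / fs)) * window[n]
     for n in range(L)]

X1 = zero_padded_dft(x, L)      # no zero padding: bins every fs / 100 = 100 Hz
X4 = zero_padded_dft(x, 4 * L)  # 4 times zero padding: bins every fs / 400 = 25 Hz

# Every 4th bin of the padded spectrum coincides with a bin of the unpadded one
print(max(abs(X4[4 * k] - X1[k]) for k in range(L)))
print(abs(X1[10]), abs(X4[40]))  # both sit at 1000 Hz
```

The padded spectrum only fills in values between the original bins. The main-lobe width is set by the 100-sample window, which is why the real resolution does not improve.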

  46. How to increase frequency resolution?
     • Time-frequency resolution tradeoff: time resolution (second) vs. frequency resolution (Hz)
     [Figure panels: window length 10 ms, window length 20 ms, window length 40 ms]

  47. Short-Time Fourier Transform
     • Break signal into windows
     • Calculate DFT of each window
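The two steps above can be sketched directly. The frame length of 64, hop of 32, Hann window, and the test signal (a 500 Hz tone followed by a 1500 Hz tone at an 8000 Hz sampling rate) are assumptions chosen for illustration, and the DFT is the naive O(N^2) sum.

```python
import cmath
import math

def dft(x):
    """Direct evaluation of the DFT sum."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def stft(signal, frame_len, hop, window):
    """Break the signal into overlapping windowed frames and take the DFT of each."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append(dft(frame))
    return frames

fs, frame_len, hop = 8000, 64, 32
window = [0.5 * (1 - math.cos(2 * math.pi * n / frame_len)) for n in range(frame_len)]
# 256 samples at 500 Hz followed by 256 samples at 1500 Hz
signal = [math.sin(2 * math.pi * (500 if n < 256 else 1500) * n / fs) for n in range(512)]

spectra = stft(signal, frame_len, hop, window)

def peak_hz(X):
    """Frequency of the largest-magnitude bin between 0 and fs / 2."""
    half = len(X) // 2
    k = max(range(half + 1), key=lambda i: abs(X[i]))
    return k * fs / len(X)

print(len(spectra), peak_hz(spectra[0]), peak_hz(spectra[-1]))
```

Stacking the magnitudes of these frame spectra side by side over time gives a spectrogram.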

  48. The Spectrogram
     • There is a “spectrogram” function in MATLAB, but you can’t do zero padding using it.

  49. A Fun Example (Thanks to Robert Remez)

  50. Overlap-Add Synthesis
     • IDFT on each spectrum
        • The complex, full spectrum
        • Don’t forget the phase (often using the original phase).
        • If you do it right, the time signal you get is real.
     • Multiply with a synthesis window (e.g. Hamming)
        • Not dividing by the analysis window
     • Overlap and add different frames together.
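A round trip through analysis and overlap-add synthesis can be sketched as follows. The window choice here is an assumption and differs from the Hamming example above: a square-root Hann window is used for both analysis and synthesis with 50% overlap, because the product of the two windows then sums to exactly 1 across overlapping frames. The first and last half frames are covered by only one frame and are not reconstructed.

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def stft(signal, frame_len, hop, window):
    """Analysis: window each frame and take its DFT."""
    return [dft([signal[s + n] * window[n] for n in range(frame_len)])
            for s in range(0, len(signal) - frame_len + 1, hop)]

def overlap_add(spectra, frame_len, hop, window):
    """Synthesis: IDFT each full complex spectrum, apply the window, overlap and add."""
    out = [0.0] * (hop * (len(spectra) - 1) + frame_len)
    for i, X in enumerate(spectra):
        frame = idft(X)
        for n in range(frame_len):
            # The imaginary part is only rounding error when the spectrum is intact
            out[i * hop + n] += frame[n].real * window[n]
    return out

frame_len, hop = 32, 16
window = [math.sqrt(0.5 * (1 - math.cos(2 * math.pi * n / frame_len)))
          for n in range(frame_len)]
x = [math.sin(2 * math.pi * 3 * n / 64) + 0.5 * math.cos(2 * math.pi * 7 * n / 64)
     for n in range(128)]

y = overlap_add(stft(x, frame_len, hop, window), frame_len, hop, window)
interior = range(hop, len(x) - hop)
print(max(abs(y[n] - x[n]) for n in interior))  # error is at rounding-error level
```

If the spectra are modified between analysis and synthesis, the synthesis window also smooths the frame edges, which is one reason to multiply by a window rather than divide by the analysis window.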
