This guide introduces biometric technologies, with a focus on speaker verification, and uses MATLAB for signal analysis. Speech signals are examined through their waveforms, pitch, and spectra using tools such as Praat. The physical characteristics of speech are covered, including variations in air pressure, intensity, frequency, and fundamental frequency (F0). Differences between male and female speech, spectral tilt, and the role of vocal tract configuration in sound production are also discussed. The material gives a foundation for readers interested in speech analysis and biometric verification.
Lab Preparation • Initial focus on Speaker Verification • Tools • Expertise • Good example • “Biometric technologies are automated methods of verifying or recognising the identity of a living person based on a physical or behavioural characteristic”
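MATLAB
function sig = makesine(f, fs, timelen)
% MAKESINE  Generate and plot a sine wave.
%   f: frequency (Hz), fs: sampling rate (Hz), timelen: duration (s)
t = 0:(1/fs):timelen-(1/fs);   % sample times, spaced 1/fs apart
sig = sin(2*pi*f*t);           % sine wave at frequency f
plot(t, sig); grid on;         % waveform against time, with grid lines
end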
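The time axis in makesine can also be written in terms of sample indices, which makes the number of samples explicit. The sketch below is one possible way to check the relationship between sampling rate, duration, and sample count; the values of f, fs, and timelen are illustrative assumptions, not values from the lab.

```matlab
% Sketch: index-based form of the time axis used in makesine.
f = 100;          % tone frequency in Hz (illustrative value)
fs = 8000;        % sampling rate in Hz (illustrative value)
timelen = 0.02;   % duration in seconds (illustrative value)

N = round(fs * timelen);   % number of samples in the signal
t = (0:N-1) / fs;          % sample times: 0, 1/fs, ..., (N-1)/fs
sig = sin(2*pi*f*t);       % same sine as in makesine

nCycles = f * timelen;     % number of full cycles in the signal
fprintf('%d samples, %g cycles\n', N, nCycles);
```

The index-based form avoids relying on how the colon operator rounds a fractional step, which may be convenient when the exact length of sig matters.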
Speech Signals • Praat • Waveforms • F0/Pitch • Spectra • Time domain measurements & analysis • Frequency domain measurements & analysis • Male vs female speech
Sounds and Speech • Words contain sequences of sounds • Each sound (phone) is produced by sending signals from the brain to the vocal articulators • The vocal articulators produce variations in air pressure • These variations are transmitted through the air as complex waves • These waves are received by the ear and signals are sent to the brain
Praat: Speech Analysis Tool • Waveform • Spectrogram • Pitch • Formants
Waveforms • Plot of change in air pressure with time • Amplitude • Compression/Rarefaction • Speech: intensity/loudness • Frequency • Cycles per second (Hz) • Speed • Metres per second (m s⁻¹) • Wavelength • Metres (m) / Microns / Angstroms (Å) • Related by speed = frequency × wavelength (v = fλ) • Won’t concern us for now
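Frequency, speed, and wavelength are linked by speed = frequency × wavelength. As a small worked example, the sketch below computes the period and wavelength of a pure tone in air. The speed of sound (343 m/s, roughly room temperature) and the 100 Hz tone are assumed values chosen for illustration.

```matlab
% Sketch: period and wavelength of a pure tone in air.
c = 343;          % assumed speed of sound in air, m/s (about 20 degrees C)
f = 100;          % tone frequency in Hz (illustrative value)

T = 1 / f;        % period in seconds (time for one cycle)
lambda = c / f;   % wavelength in metres, from c = f * lambda

fprintf('Period = %g s, wavelength = %g m\n', T, lambda);
```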
Fundamental Frequency • F0 (pron. F-zero) • Rate of opening/closing of glottis • Vocal folds do not vibrate like strings but F0 is dependent on similar factors • Perceptual correlate is pitch • Do not confuse with formant frequencies F1, F2,…!!!
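One common way to estimate F0 from a waveform is to look for the lag at which the signal best matches a shifted copy of itself. The sketch below applies a plain (biased) autocorrelation to a synthetic periodic signal. It is a simplified illustration, not Praat's pitch algorithm; the test signal, the sampling rate, and the 50 to 400 Hz search range are all assumptions.

```matlab
% Sketch: F0 estimation by autocorrelation on a synthetic signal.
fs = 8000;                 % sampling rate in Hz (illustrative value)
f0 = 125;                  % true F0 of the test signal in Hz (assumed)
N = 512;                   % number of samples (64 ms at 8 kHz)
t = (0:N-1) / fs;

% Crude stand-in for voiced speech: a fundamental plus two weaker harmonics
x = sin(2*pi*f0*t) + 0.5*sin(2*pi*2*f0*t) + 0.25*sin(2*pi*3*f0*t);

% Candidate lags covering an assumed F0 search range of 50 to 400 Hz
minLag = floor(fs / 400);
maxLag = ceil(fs / 50);
lags = minLag:maxLag;

% Biased autocorrelation: sum of products of the signal with a shifted copy
r = zeros(size(lags));
for k = 1:numel(lags)
    L = lags(k);
    r(k) = sum(x(1:N-L) .* x(1+L:N));
end

% The lag with the strongest self-similarity is taken as one pitch period
[~, idx] = max(r);
f0Est = fs / lags(idx);

fprintf('Estimated F0 = %g Hz\n', f0Est);
```

Because the sum is not divided by the overlap length, longer lags are slightly penalised, which helps the estimate settle on one period rather than two. Real speech is only quasi-periodic, so practical pitch trackers add windowing, voicing decisions, and smoothing on top of this idea.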
Spectra • Think of a graphic equalizer • Speech made from waves of many frequencies • Spectrum plots (log) power against frequency • Peaks related to resonant frequencies in the vocal tract (VT) • Formants • Centre frequency • Bandwidth • Spectral slice • Spectrogram • Overhead view of slices against time • Darkness related to power
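A single spectral slice can be sketched with the built-in fft function. The example below builds a signal from two sine components, plots log power against frequency, and reads off the strongest peak. The test signal (a strong 500 Hz component and a weaker 1500 Hz component) is an assumption chosen so that both components fall exactly on FFT bins; it is not a model of real formants.

```matlab
% Sketch: one spectral slice (log power against frequency) using fft.
fs = 8000;                 % sampling rate in Hz (illustrative value)
N = 800;                   % 100 ms of signal, so bin spacing is fs/N = 10 Hz
t = (0:N-1) / fs;

% Assumed test signal: strong 500 Hz component plus weaker 1500 Hz component
x = sin(2*pi*500*t) + 0.1*sin(2*pi*1500*t);

X = fft(x);
half = 1:(N/2 + 1);                    % bins from 0 Hz up to fs/2
freqs = (half - 1) * fs / N;           % frequency of each bin in Hz
powerDb = 10*log10(abs(X(half)).^2 / N^2 + eps);   % eps avoids log10(0)

% Strongest peak in the slice
[~, iPeak] = max(powerDb);
peakHz = freqs(iPeak);

% Level difference between the two components, in dB
i500 = round(500 * N / fs) + 1;        % bin index of the 500 Hz component
i1500 = round(1500 * N / fs) + 1;      % bin index of the 1500 Hz component
diffDb = powerDb(i500) - powerDb(i1500);

plot(freqs, powerDb); grid on;
xlabel('Frequency (Hz)'); ylabel('Power (dB)');
fprintf('Peak at %g Hz, level difference %.1f dB\n', peakHz, diffDb);
```

Repeating this on successive short frames of a recording and stacking the slices side by side against time gives the overhead view that a spectrogram displays.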
Spectral Tilt • Refers to general slope of spectrum • Higher formants are weaker than lower formants • Phonation is most significant factor • Greater spectral tilt in female speech • Ratio of lower formant amplitude to higher formant amplitude greater in males
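Spectral tilt can be expressed as a single number by fitting a straight line to level (in dB) against frequency on a log scale, which gives a slope in dB per octave. The sketch below does this for a synthetic set of harmonics. The 1/k^2 amplitude roll-off and the 100 Hz fundamental are assumptions made purely for illustration, not measurements of male or female speech.

```matlab
% Sketch: spectral tilt as the slope of a line fitted to harmonic levels.
f0 = 100;                    % assumed fundamental frequency in Hz
k = 1:16;                    % harmonic numbers
harmHz = k * f0;             % harmonic frequencies in Hz
amp = 1 ./ k.^2;             % assumed roll-off: amplitude falls as 1/k^2
levelDb = 20*log10(amp);     % harmonic levels in dB relative to the first

% Least-squares line: level (dB) against log2(frequency).
% The slope is then the change in level per doubling of frequency.
p = polyfit(log2(harmHz), levelDb, 1);
tiltDbPerOctave = p(1);

fprintf('Spectral tilt = %.2f dB/octave\n', tiltDbPerOctave);
```

A more negative slope means the higher-frequency components are weaker relative to the lower ones. On real speech the same fit could be applied to measured harmonic or formant peak levels, with the caveat that formant resonances make the spectrum far less regular than in this idealised case.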