
Microcomputer Systems 2

Microcomputer Systems 2. Analysis and Synthesis of Pole-Zero Speech Models. Introduction. Deterministic: speech sounds with periodic or impulse sources. Stochastic: speech sounds with noise sources. The goal is to derive a vocal tract model for each class of sound source.



  1. Microcomputer Systems 2 Analysis and Synthesis of Pole-Zero Speech Models

  2. Introduction • Deterministic: • Speech sounds with periodic or impulse sources • Stochastic: • Speech sounds with noise sources • Goal is to derive a vocal tract model for each class of sound source. • It will be shown that the solution equations for the two classes are similar in structure. • The solution approach is referred to as linear prediction analysis. • Linear prediction analysis leads to a method of speech synthesis based on the all-pole model. • Note that the all-pole model is intimately associated with the concatenated lossless tube model of the previous chapter (i.e., Chapter 4). Veton Këpuska

  3. All-Pole Modeling of Deterministic Signals • Consider the vocal tract transfer function during a voiced source. [Block diagram: an impulse train of period T (the pitch period), scaled by gain A, drives the cascade of the glottal model G(z), the vocal tract model V(z), and the radiation model R(z), producing the speech output s[n].]

  4. All-Pole Modeling of Deterministic Signals • What about the fact that R(z) is a zero model? • A single zero can be expressed as an infinite set of poles. Note that for |b| < 1: 1/(1 − bz^-1) = Σk=0..∞ b^k z^-k • From the above expression one can derive: 1 − bz^-1 = 1 / (Σk=0..∞ b^k z^-k)

  5. All-Pole Modeling of Deterministic Signals • In practice, the infinite number of poles is approximated with a finite set of poles, since ak → 0 as k → ∞. • H(z) can then be considered an all-pole representation: H(z) = A / (1 − Σk=1..p ak z^-k) • Representing a zero with a large number of poles ⇒ inefficient. • Estimating zeros directly is a more efficient approach (covered later in this chapter).

  6. Model Estimation • Goal – estimate: • the filter coefficients {a1, a2, …, ap} for a particular order p, and • the gain A, over a short time span of the speech signal (typically 20 ms) for which the signal is considered quasi-stationary. • Use the linear prediction method: • Each speech sample is approximated as a linear combination of past speech samples ⇒ • a set of analysis techniques for estimating the parameters of the all-pole model.

  7. Model Estimation • Consider the z-transform of the vocal tract model: H(z) = S(z)/Ug(z) = A / (1 − Σk=1..p ak z^-k) • which can be transformed into: S(z) = Σk=1..p ak S(z) z^-k + A Ug(z) • In the time domain it can be written as: s[n] = Σk=1..p ak s[n−k] + A ug[n] where s[n] is the current sample, the s[n−k] are the past samples, the ak are the linear prediction coefficients, A is a scaling factor, and ug[n] is the input. • Referred to as an autoregressive (AR) model.
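The AR difference equation above can be exercised numerically. A minimal Python sketch (not part of the original slides; the function name and signal values are illustrative):

```python
import numpy as np

def ar_synthesize(a, excitation, gain=1.0):
    """All-pole (AR) synthesis: s[n] = sum_k a[k]*s[n-1-k] + gain*ug[n]."""
    s = np.zeros(len(excitation))
    for n in range(len(excitation)):
        past = sum(a[k] * s[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        s[n] = past + gain * excitation[n]
    return s

# Unit-sample input through a single-pole model H(z) = 1 / (1 - 0.5 z^-1):
# the output is the impulse response 0.5^n.
imp = np.zeros(8)
imp[0] = 1.0
s = ar_synthesize([0.5], imp)
```

With an impulse excitation, the recursion simply plays out the model's impulse response, which is the deterministic case discussed on these slides.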

  8. Model Estimation • The method used to predict the current sample from a linear combination of past samples is called linear prediction analysis. • LPC – quantization of the linear prediction coefficients, or of a transformed version of these coefficients, is called linear prediction coding. • For ug[n] = 0: s[n] = Σk=1..p ak s[n−k] • This observation motivates the analysis technique of linear prediction.

  9. Model Estimation: Definitions • A linear predictor of order p is defined by: s̃[n] = Σk=1..p αk s[n−k] where αk denotes the estimate of ak, and the predictor system function is P(z) = Σk=1..p αk z^-k.

  10. Model Estimation: Definitions • The prediction error sequence is given as the difference of the original sequence and its prediction: e[n] = s[n] − s̃[n] = s[n] − Σk=1..p αk s[n−k] • The associated prediction error filter is defined as: A(z) = 1 − P(z) = 1 − Σk=1..p αk z^-k • If {αk} = {ak}, then e[n] = A ug[n].

  11. Model Estimation: Definitions • Note 1: if {αk} = {ak}, then the prediction error filter is the inverse of the vocal tract model up to the gain, H(z) = A / A(z), and e[n] = A ug[n]. • Recovery of s[n]: passing e[n] through 1/A(z) gives back the original signal exactly: s[n] = e[n] + Σk=1..p αk s[n−k]

  12. Model Estimation: Definitions • Note 2: If • the vocal tract contains a finite number of poles and no zeros, and • the prediction order is correct, then • {αk} = {ak}, and • e[n] is an impulse train for voiced speech; for an impulse source, e[n] will be just an impulse.

  13. Example 5.1 • Consider an exponentially decaying impulse response of the form h[n] = a^n u[n], where u[n] is the unit step. The response to the scaled unit sample Aδ[n] is: s[n] = A a^n u[n] • Consider the prediction of s[n] using a linear predictor of order p = 1: s̃[n] = α1 s[n−1] • It is a good fit since: s[n] = a s[n−1] for n ≥ 1 • The prediction error sequence with α1 = a is: e[n] = s[n] − a s[n−1] = A δ[n] • The prediction of the signal is exact except at the time origin.
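Example 5.1 is easy to check numerically. A small Python sketch (illustrative values a = 0.8, A = 1):

```python
import numpy as np

a, A, N = 0.8, 1.0, 20
n = np.arange(N)
s = A * a**n                    # s[n] = A a^n u[n]

# Order-1 predictor with alpha_1 = a: s_tilde[n] = a * s[n-1]
s_tilde = np.zeros(N)
s_tilde[1:] = a * s[:-1]

e = s - s_tilde                 # prediction error e[n] = s[n] - a s[n-1]
# e[n] reduces to A*delta[n]: the prediction is exact except at the origin
```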

  14. Covariance Method of Linear Prediction

  15. Error Minimization • An important question is how to derive an estimate of the prediction coefficients αk, for a particular order p, that is optimal in some sense. • Optimality is measured with respect to a criterion. An appropriate measure of optimality is the mean-squared error (MSE). • The goal is to minimize the mean-squared prediction error E, defined as: E = Σn e²[n] • In reality, a model must be valid over some short-time interval, say M samples on either side of n: En = Σm=n−M..n+M en²[m]

  16. Error Minimization • Thus in practice the MSE is time-dependent and is formed over a finite interval, as depicted in the previous figure. • [n−M, n+M] – prediction error interval. • Alternatively: En = Σm=n−M..n+M (sn[m] − Σk=1..p αk sn[m−k])² where sn[m] denotes the short-time speech segment around time n.

  17. Error Minimization • Determine the {αk} for which En is minimal: ∂En/∂αi = 0, for i = 1, 2, …, p • which results in: Σm sn[m] sn[m−i] = Σk=1..p αk Σm sn[m−k] sn[m−i], for i = 1, 2, …, p

  18. Error Minimization • The last equation can be rewritten by multiplying through and rearranging the sums. • Define the function: φn[i,k] = Σm=n−M..n+M sn[m−i] sn[m−k] • which gives the following: Σk=1..p αk φn[i,k] = φn[i,0], for i = 1, 2, …, p • Referred to as the normal equations, given in matrix form below: Φn α = ψn, where Φn has elements φn[i,k] and ψn is the vector of values φn[i,0].
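The normal equations above can be formed and solved directly. A minimal Python sketch of the covariance method (the function name and test signal are illustrative; start/end play the role of the interval [n−M, n+M]):

```python
import numpy as np

def covariance_lpc(s, p, start, end):
    """Solve sum_k alpha_k * phi[i,k] = phi[i,0] for i = 1..p, where
    phi[i,k] = sum_{m=start..end} s[m-i]*s[m-k]  (covariance method).
    Note s[m-i] for m near `start` reaches outside the interval."""
    phi = np.empty((p + 1, p + 1))
    for i in range(p + 1):
        for k in range(p + 1):
            phi[i, k] = sum(s[m - i] * s[m - k] for m in range(start, end + 1))
    return np.linalg.solve(phi[1:, 1:], phi[1:, 0])

# For s[n] = 0.9^n the order-1 covariance solution is exact: alpha_1 = 0.9.
s = 0.9 ** np.arange(30)
alpha = covariance_lpc(s, p=1, start=1, end=29)
```

Because the method uses true samples from outside the interval rather than zeros, the single-pole signal is matched exactly, illustrating the optimality remark above.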

  19. Error Minimization • The minimum error for the optimal solution can be derived by expanding En = Σm en²[m]: En = φn[0,0] − 2 Σk=1..p αk φn[0,k] + Σi Σk αi αk φn[i,k] • The last term in the equation above can be rewritten, using the normal equations, as: Σi Σk αi αk φn[i,k] = Σk=1..p αk φn[0,k]

  20. Error Minimization • Thus the minimum error can be expressed as: En = φn[0,0] − Σk=1..p αk φn[0,k]

  21. Error Minimization • Remarks: • The order p of the actual underlying all-pole transfer function is not known. • The order can be estimated by observing that the minimum error of a pth-order predictor in theory equals that of a (p+1)th-order predictor once p reaches the true order. • Also, predictor coefficients for k > p equal zero (or, in practice, are close to zero and model only random noise effects). • The prediction error en[m] is nonzero only "in the vicinity" of time n: [n−M, n+M]. • In predicting values of the short-time sequence sn[m], p values outside of the prediction error interval [n−M, n+M] are required. • Covariance method – uses values outside the interval to predict values inside the interval. • Autocorrelation method – assumes that speech samples are zero outside the interval.

  22. Error Minimization • Matrix formulation • Projection Theorem: • Columns of Sn – basis vectors • The error vector en is orthogonal to each basis vector: SnT en = 0 • Orthogonality leads to the normal equations in matrix form: SnT Sn α = SnT sn

  23. Autocorrelation Method of Linear Prediction

  24. Autocorrelation Method • In the previous section we described a general method of linear prediction that uses samples outside the prediction error interval, referred to as the covariance method. • An alternative approach that does not consider samples outside the analysis interval, referred to as the autocorrelation method, is presented next. • This method is: • suboptimal; however, it • leads to an efficient and stable solution of the normal equations.

  25. Autocorrelation Method • Assumes that the samples outside the time interval [n−M, n+M] are all zero, and • extends the prediction error interval, i.e., the range over which we minimize the mean-squared error, to ±∞. • Conventions: • Short-time interval: [n, n+Nw−1] where Nw = 2M+1 (note: it is not centered around sample n as in the previous derivation). • The segment is shifted to the left by n samples so that the first nonzero sample falls at m = 0. This operation is equivalent to: • shifting the speech sequence s[m] by n samples to the left, and • windowing by an Nw-point rectangular window: w[m] = 1 for 0 ≤ m ≤ Nw−1, and 0 otherwise.

The windowed sequence can be expressed as: sn[m] = s[m+n] w[m]. This operation is depicted in the figure presented on the right. Autocorrelation Method

  27. Autocorrelation Method • Important observations that are a consequence of zeroing the signal outside the interval: • The prediction error is nonzero only in the interval [0, Nw+p−1] • Nw – window length • p – the predictor order • The prediction error is largest at the left and right ends of the segment. This is due to edge effects caused by the way the prediction is done: • from zeros – at the left end of the window • to zeros – at the right end of the window

  28. Autocorrelation Method • To compensate for edge effects, a tapered window (e.g., Hamming) is typically used. • It removes the possibility that the mean-squared error is dominated by end (edge) effects. • The data becomes distorted, however, biasing the estimates αk. • Let the mean-squared prediction error be given by: En = Σm=0..Nw+p−1 en²[m] • The limits of summation refer to the new time origin, and • the prediction error outside this interval is zero.

  29. Autocorrelation Method • The normal equations take the following form (Exercise 5.1): Σk=1..p αk φn[i,k] = φn[i,0], for i = 1, 2, …, p • where φn[i,k] = Σm=0..Nw+p−1 sn[m−i] sn[m−k]

  30. Autocorrelation Method • Due to the summation limits depicted in the figure on the right, the function φn[i,k] can be written as: φn[i,k] = Σm=i..k+Nw−1 sn[m−i] sn[m−k], for i ≥ k • recognizing that only samples in the interval [i, k+Nw−1] contribute to the sum, and • changing variable m ⇒ m−i: φn[i,k] = Σm=0..Nw−1−(i−k) sn[m] sn[m+i−k]

  31. Autocorrelation Method • Since the above expression is a function only of the difference i−k, we denote it as: φn[i,k] = rn[i−k] • Letting τ = i−k, referred to as the correlation "lag", leads to the short-time autocorrelation function: rn[τ] = Σm=0..Nw−1−τ sn[m] sn[m+τ]

  32. Autocorrelation Method • rn[τ] = sn[τ] * sn[−τ] • The autocorrelation method leads to computation of the short-time sequence sn[m] convolved with itself flipped in time. • The autocorrelation function is a measure of the "self-similarity" of the signal at different lags τ. • When rn[τ] is large, signal samples spaced by τ are said to be highly correlated.

  33. Autocorrelation Method • Properties of rn[τ]: • For an N-point sequence, rn[τ] is zero outside the interval [−(N−1), N−1]. • rn[τ] is an even function of τ. • rn[0] ≥ |rn[τ]| • rn[0] – energy of sn[m] ⇒ rn[0] = Σm sn²[m] • If sn[m] is a segment of a periodic sequence, then rn[τ] is periodic-like with the same period. • Because sn[m] is short-time, the overlapping data in the correlation decreases as τ increases ⇒ • the amplitude of rn[τ] decreases as τ increases; • with a rectangular window the envelope of rn[τ] decreases linearly. • If sn[m] is a random white-noise sequence, then rn[τ] is impulse-like, reflecting self-similarity only within a small neighborhood.
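Several of these properties are easy to verify numerically. A short Python sketch (the test segment, a Hamming-windowed sinusoid of period 8, is illustrative):

```python
import numpy as np

N = 64
m = np.arange(N)
s = np.hamming(N) * np.sin(2 * np.pi * m / 8)   # windowed periodic segment

# Autocorrelation for all lags -(N-1)..(N-1); np.correlate with mode="full"
# computes exactly the sum over the overlapping samples at each lag.
r = np.correlate(s, s, mode="full")
lag0 = N - 1                                    # index of lag tau = 0

# r is even, is maximized at lag 0, and r[0] equals the segment energy;
# for a periodic segment, r shows a peak again one period (8 samples) out.
```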

  34. Autocorrelation Method

  35. Autocorrelation Method • Letting φn[i,k] = rn[i−k], the normal equations take the form: Σk=1..p αk rn[|i−k|] = rn[i], for i = 1, 2, …, p • The expression represents p linear equations with p unknowns, αk for 1 ≤ k ≤ p. • Using the normal-equation solution, it can be shown that the corresponding minimum mean-squared prediction error is given by: En = rn[0] − Σk=1..p αk rn[k] • Matrix form representation of the normal equations: Rn α = rn.
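The system Rn α = rn can be formed and solved with a general linear solver. A minimal sketch (the signal and order are illustrative; the Levinson recursion of the later slides exploits the Toeplitz structure instead):

```python
import numpy as np

def autocorr_lpc(s, p):
    """Autocorrelation method: solve R alpha = r, where R[i,k] = r[|i-k|]."""
    N = len(s)
    r = np.array([np.dot(s[:N - tau], s[tau:]) for tau in range(p + 1)])
    R = np.array([[r[abs(i - k)] for k in range(p)] for i in range(p)])  # Toeplitz
    alpha = np.linalg.solve(R, r[1:])
    E = r[0] - alpha @ r[1:]      # minimum mean-squared prediction error
    return alpha, E

# Long windowed exponential a^n: the solution approaches the true pole a = 0.9,
# and the minimum error approaches the unit energy of the input impulse.
s = 0.9 ** np.arange(200)
alpha, E = autocorr_lpc(s, p=1)
```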

  36. Autocorrelation Method • Expanded form: Rn is the p×p matrix with elements [Rn]ik = rn[|i−k|], α = [α1, …, αp]T, and rn = [rn[1], …, rn[p]]T. • The Rn matrix is Toeplitz: • symmetric about the diagonal, • all elements of each diagonal are equal, • the matrix is invertible ⇒ • implies an efficient solution.

  37. Example 5.3 • Consider a system with an exponentially decaying impulse response of the form h[n] = a^n u[n], with u[n] being the unit step function, driven by the scaled unit sample Aδ[n] to produce s[n]. • Estimate a using the autocorrelation method of linear prediction.

  38. Example 5.3 • Apply an N-point rectangular window [0, N−1] at n = 0, so s0[m] = A a^m for 0 ≤ m ≤ N−1. • Compute r0[0] and r0[1]: r0[0] = A² Σm=0..N−1 a^{2m} = A² (1 − a^{2N}) / (1 − a²), r0[1] = A² Σm=0..N−2 a^m a^{m+1} = A² a (1 − a^{2(N−1)}) / (1 − a²) • Using the normal equations (p = 1): α1 = r0[1] / r0[0] = a (1 − a^{2(N−1)}) / (1 − a^{2N})

  39. Example 5.3 • The minimum squared error (from slide 35) is thus (Exercise 5.5): E = r0[0] − α1 r0[1] • For a 1st-order predictor, as in this example, the prediction error sequence for the true predictor (i.e., α1 = a) is given by: • e[n] = s[n] − a s[n−1] = Aδ[n] (see Example 5.1 presented earlier). Thus the prediction of the signal is exact except at the time origin. • This example illustrates that with enough data the autocorrelation method yields a solution close to the true single-pole model for an impulse input.
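The Example 5.3 estimate α1 = r0[1]/r0[0] can be computed directly and compared against its geometric-series closed form, α1 = a(1 − a^(2(N−1)))/(1 − a^(2N)), which tends to a as N grows. A Python sketch (a and N are illustrative; A = 1, which cancels in the ratio):

```python
import numpy as np

a, N = 0.8, 25
s = a ** np.arange(N)             # N-point rectangular window of a^n u[n]

r00 = np.dot(s, s)                # r0[0]
r01 = np.dot(s[:-1], s[1:])       # r0[1]
alpha1 = r01 / r00                # order-1 normal equation solution

# Closed form via geometric series; approaches the true pole a for large N
closed = a * (1 - a ** (2 * (N - 1))) / (1 - a ** (2 * N))
```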

  40. Limitations of the linear prediction model • When the underlying measured sequence is the impulse response of an arbitrary all-pole system, the autocorrelation method yields the correct result. • There are a number of speech sounds for which, even with an arbitrarily long data sequence, a true solution cannot be obtained. • Consider a periodic sequence simulating a steady voiced sound, formed by convolving a periodic impulse train p[n] with an all-pole impulse response h[n]. • The z-transform of h[n] is given by: H(z) = A / (1 − Σk=1..p ak z^-k)

  41. Limitations of the linear prediction model • Thus s[n] = p[n] * h[n]. • The normal equations of this system are given by (see Exercise 5.7): Σk=1..p ak rh[i−k] = rh[i], for i = 1, 2, …, p • where the autocorrelation of h[n] is denoted by rh[τ] = h[τ] * h[−τ]. • Suppose now that the system is excited with an impulse train of period P: p[n] = Σk δ[n − kP]

  42. Limitations of the linear prediction model • The normal equations associated with s[n] (windowed over multiple pitch periods) for an order-p predictor are given by: Σk=1..p αk rn[i−k] = rn[i], for i = 1, 2, …, p • It can be shown that rn[τ] is equal to periodically repeated replicas of rh[τ], i.e., copies rh[τ − qP], but with decreasing amplitude due to the windowing (Exercise 5.7).

  43. Limitations of the linear prediction model • The autocorrelation function rn[τ] of the windowed signal s[n] can be thought of as an "aliased" version of rh[τ], due to overlap, which introduces distortion: • When aliasing is minor, the two solutions are approximately equal. • The accuracy of this approximation decreases as the pitch period decreases (e.g., high pitch) due to the increase in overlap of the autocorrelation replicas repeated every P samples.

  44. Limitations of the linear prediction model • Sources of error: • Aliasing increases with high-pitched speakers (smaller pitch period P). • The signal is not truly periodic. • Speech is not always all-pole. • The autocorrelation method is a suboptimal solution. • The covariance method is capable of giving the optimal solution; however, it is not guaranteed to converge when the underlying signal does not follow an all-pole model.

  45. The Levinson Recursion of the Autocorrelation Method • Direct inversion method (Gaussian elimination): requires on the order of p³ multiplies and additions. • Levinson Recursion (1947): • requires on the order of p² multiplies and additions, • links directly to the concatenated lossless tube model (Chapter 4), and thus provides a mechanism for estimating the vocal tract area function from an all-pole model estimate.

  46. The Levinson Recursion of the Autocorrelation Method • Step 1: Initialize E0 = rn[0]. • Step 2: For i = 1, 2, …, p compute ki = ( rn[i] − Σj=1..i−1 αj^(i−1) rn[i−j] ) / Ei−1 • Step 3: Update the coefficients and error: αi^(i) = ki; αj^(i) = αj^(i−1) − ki αi−j^(i−1) for j = 1, …, i−1; Ei = (1 − ki²) Ei−1 • Step 4: end; the final coefficients are αk = αk^(p). • ki – partial correlation coefficients (PARCOR)
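The recursion above can be sketched directly in Python; alpha holds the coefficients αj^(i) in place and k the PARCOR coefficients (the autocorrelation values at the end are illustrative):

```python
import numpy as np

def levinson_durbin(r, p):
    """Levinson recursion: solve the autocorrelation normal equations in O(p^2).
    Returns predictor coefficients alpha, PARCOR coefficients k, and error E_p."""
    alpha = np.zeros(p)
    k = np.zeros(p)
    E = r[0]                                      # Step 1: E_0 = r[0]
    for i in range(1, p + 1):
        # Step 2: partial correlation coefficient k_i
        k[i - 1] = (r[i] - np.dot(alpha[:i - 1], r[1:i][::-1])) / E
        # Step 3: alpha_j <- alpha_j - k_i * alpha_{i-j}, then append alpha_i = k_i
        alpha[:i - 1] -= k[i - 1] * alpha[:i - 1][::-1].copy()
        alpha[i - 1] = k[i - 1]
        E *= 1.0 - k[i - 1] ** 2                  # E_i = (1 - k_i^2) E_{i-1}
    return alpha, k, E                            # Step 4: end

r = np.array([4.0, 2.4, 1.0, 0.3])                # illustrative autocorrelation values
alpha, k, E = levinson_durbin(r, p=3)
```

The returned alpha satisfies the full Toeplitz system, all |ki| < 1 for a valid autocorrelation sequence, and E equals r[0]·Π(1 − ki²).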

  47. The Levinson Recursion of the Autocorrelation Method • It can be shown that on each iteration the predictor coefficients αk can be written solely as functions of the autocorrelation coefficients (Exercise 5.11). • The desired transfer function is given by: H(z) = A / (1 − Σk=1..p αk z^-k) • The gain A has yet to be determined.

  48. Properties of the Levinson Recursion of the Autocorrelation Method • The magnitude of the partial correlation coefficients is less than 1: |ki| < 1 for all i. • This condition is sufficient for stability: if all |ki| < 1, then all roots of A(z) are inside the unit circle. • The autocorrelation method gives a minimum-phase solution even when the actual system is mixed-phase.

  49. Properties of the Levinson Recursion of the Autocorrelation Method • Reverse Levinson recursion: how to obtain a lower-order model from a higher-order one? • Autocorrelation matching: let rn[τ] be the autocorrelation of the speech signal s[n+m]w[m] and rh[τ] the autocorrelation of h[n] = Z^-1{H(z)}; then: rn[τ] = rh[τ] for |τ| ≤ p

  50. Autocorrelation Method • Gain computation: En is the average minimum prediction error for the pth-order predictor: En = rn[0] − Σk=1..p αk rn[k] • If the energy in the all-pole impulse response h[m] equals the energy in the measurement sn[m] ⇒ • the squared gain equals the minimum prediction error: A² = En.
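Under this energy-matching convention the gain follows directly from the minimum error. A Python sketch (a single-pole example whose true gain is 1; values illustrative):

```python
import numpy as np

# "Measurement": long windowed impulse response of H(z) = 1 / (1 - 0.9 z^-1)
a = 0.9
s = a ** np.arange(400)

# Order-1 autocorrelation solution and its minimum prediction error
r0 = np.dot(s, s)
r1 = np.dot(s[:-1], s[1:])
alpha1 = r1 / r0
E1 = r0 - alpha1 * r1          # E_p = r[0] - sum_k alpha_k r[k]

A = np.sqrt(E1)                # squared gain equals the minimum error: A^2 = E_p
# The true model has gain 1; A recovers it up to negligible windowing effects.
```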
