Lectures 12&13: Persistent Excitation for Off-line and On-line Parameter Estimation

Dr Martin Brown
Room: E1k
Email: martin.brown@manchester.ac.uk
Telephone: 0161 306 4672
http://www.eee.manchester.ac.uk/intranet/pg/coursematerial/

Outline 13&14
• Persistent excitation and identifiability
  • Structure of $X^T X$
  • Role of signal magnitude
  • Role of signal correlation
  • Types of system identification signals (for experimental design)
• On-line estimation and persistent excitation
  • On-line persistent excitation
  • Time-varying parameters
  • Exponential Recursive Least Squares (RLS)

Resources 13&14
• Core reading
  • Ljung, Chapter 13
  • On-line notes, Chapter 5
  • Norton, Chapter 8

Central Question: Experimental Design
• An important part of system identification is experimental design
• Experimental design addresses how experiments should be constructed to maximise the information collected for the minimum amount of effort/cost
• For system identification, this means choosing the input/control signal injected into the plant so as to best identify the parameters
• N.B. This is relative to the model structure (i.e. different model structures will have different optimal experimental designs).

What is Persistent Excitation
• Persistent excitation refers to the design of a signal, u(t), that produces estimation data D = {X, y} which is rich enough to satisfactorily identify the parameters
• The parameter accuracy/covariance is determined by: $\mathrm{cov}(\hat{\theta}) = \sigma_y^2 (X^T X)^{-1}$
• Ideally, $X^T X$ is diagonal and $E(x_i^2) \gg \sigma_y^2$
• The variance/covariance can be made smaller (better) by:
  • Reducing the measurement error variance (hard)
  • Collecting more data (but this often costs money)
  • Making the signals larger (but there are physical limits)
  • Making the signals independent (difficult for dynamics)

Review: $X^T X$ Matrix
• The variance/covariance matrix, $X^T X$, (and its inverse) is central in many system identification/parameter estimation tasks
• Consider a model
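As an illustration of this structure, assume a two-regressor model that is linear in its parameters, $y(t) = \theta_1 x_1(t) + \theta_2 x_2(t) + e(t)$, observed for $N$ samples. The model form and the symbol $N$ are assumptions chosen for illustration, not taken from the slide. Under that assumption, $X^T X$ collects the sums of squares and cross-products of the regressors:

```latex
X^T X =
\begin{bmatrix}
\sum_{t=1}^{N} x_1^2(t) & \sum_{t=1}^{N} x_1(t)\,x_2(t) \\
\sum_{t=1}^{N} x_1(t)\,x_2(t) & \sum_{t=1}^{N} x_2^2(t)
\end{bmatrix}
\approx N
\begin{bmatrix}
E(x_1^2) & E(x_1 x_2) \\
E(x_1 x_2) & E(x_2^2)
\end{bmatrix}
```

The diagonal entries grow with the signal magnitudes and with the number of samples, while the off-diagonal entries measure the correlation between the regressors.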

Identifying Parameters
• For a set of measured exemplars D = {X, y}, there are several (related) concepts that determine how well the parameters can be estimated (off-line, in batch mode)
  • The size of $(X^T X)^{-1}$, i.e. how well can the parameters be identified or, equivalently, what is the region of uncertainty about the estimated values $\hat{\theta}$?
  • Is $X^T X$ non-singular, i.e. can the normal equations be solved uniquely?
  • Are the parameter estimates significantly non-zero?
• All of these are related and influenced/determined by how the input data X is generated/collected.

Example: Signal Magnitude & Noise
• Consider feeding steps of magnitude 0.01, 0.1 and 1 into the first-order electrical circuit
• The magnitude of the signals strongly influences the identifiability of the parameters. Typically, the signals should be of similar magnitude to each other and large in relation to the measurement noise.
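The effect of step magnitude can be sketched numerically. The model below, $y(t) = a\,y(t-1) + k\,u(t-1) + e(t)$ with $a = 0.8$, $k = 0.5$ and noise standard deviation 0.1, is an assumed stand-in for the electrical circuit, and the function names are hypothetical. The sketch repeats the least squares fit over many noise realisations and reports the spread of the estimates for each step size.

```python
import numpy as np

# Assumed first-order discrete-time model (not given in the slides):
#   y(t) = a*y(t-1) + k*u(t-1) + e(t),  e(t) ~ N(0, noise_std^2)
A_TRUE, K_TRUE = 0.8, 0.5  # hypothetical parameter values


def simulate_step(magnitude, n=100, noise_std=0.1, rng=None):
    """Simulate the response to a step of the given magnitude applied at t = 0."""
    rng = np.random.default_rng() if rng is None else rng
    u = np.full(n, float(magnitude))
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = A_TRUE * y[t - 1] + K_TRUE * u[t - 1] + noise_std * rng.standard_normal()
    return u, y


def estimate(u, y):
    """Batch least squares estimate of [a, k] from regressors [y(t-1), u(t-1)]."""
    X = np.column_stack([y[:-1], u[:-1]])
    theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return theta


rng = np.random.default_rng(0)
spread = {}
for magnitude in (0.01, 0.1, 1.0):
    # Repeat the experiment to see how much the estimates scatter.
    estimates = np.array(
        [estimate(*simulate_step(magnitude, rng=rng)) for _ in range(200)]
    )
    spread[magnitude] = estimates.std(axis=0)
    print(f"step {magnitude:>5}: std(a_hat)={spread[magnitude][0]:.3f}, "
          f"std(k_hat)={spread[magnitude][1]:.3f}")
```

Under these assumptions the scatter of the gain estimate shrinks as the step grows, because the input column of X becomes large relative to the noise.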

Example: Signal Interactions
• Consider collecting data from a model of the form:
• Each input is $u_i(t) = \sin(0.5t)$, 20 samples:
• Note that $X = [u_1\ u_2]$ is rank deficient, so $X^T X$ is singular
• Now consider $u_1(t) = \sin(0.5t)$, $u_2(t) = \cos(0.5t)$, so that $E(u_1 u_2) \approx 0$ and $E(u_i^2) = c$
• The input signals are ~orthogonal
• This is difficult with feedback …
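The two cases can be checked with a few lines of NumPy. This is a minimal sketch that assumes the 20 samples are taken at t = 0, 1, ..., 19; the helper name `gram` is hypothetical.

```python
import numpy as np

t = np.arange(20)  # assumed sample instants t = 0, 1, ..., 19


def gram(u1, u2):
    """Return X^T X for the two-column regressor matrix X = [u1 u2]."""
    X = np.column_stack([u1, u2])
    return X.T @ X


# Case 1: both inputs are the same sinusoid, so the columns of X are identical.
G_same = gram(np.sin(0.5 * t), np.sin(0.5 * t))
rank_same = np.linalg.matrix_rank(G_same)

# Case 2: sine and cosine inputs are approximately orthogonal over the record.
G_orth = gram(np.sin(0.5 * t), np.cos(0.5 * t))
cond_orth = np.linalg.cond(G_orth)

print("identical inputs: rank of X^T X =", rank_same)
print("sin/cos inputs:   cond of X^T X =", round(float(cond_orth), 3))
```

With identical inputs $X^T X$ has rank 1 and cannot be inverted, whereas the sine/cosine pair gives a condition number close to 1.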

Good and Bad Covariance Matrices
• Ideal structure of $(X^T X)^{-1}$ is diagonal with equal entries, which means that:
  • Each parameter has the same variance, and the estimates are uncorrelated. In addition, if $E(x_i^2) \gg \sigma_y^2$, the parameter variances are small.
  • Each parameter can be identified to the same accuracy
• For modelling and control, we want to feed in an input signal that produces a matrix with these properties.

How to Measure Goodness?
(Figure labels: well determined, $\lambda = 12.8$; poorly determined, $\lambda = 0.52$)
• There are several ways to assess/compare how good a particular signal is:
• $\mathrm{Cond}(X^T X) = \lambda_{\max}/\lambda_{\min}$
  • This measures the ratio of the maximum to the minimum signal correlations
  • Smaller $\mathrm{Cond}(X^T X)$ is better
  • $\mathrm{Cond}(I) = 1$
  • Choose u to achieve $\min_u \mathrm{Cond}(X^T X)$
  • Insensitive to the signal magnitude, just measures the degree of correlation
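The eigenvalue ratio is easy to compute directly. The sketch below uses an arbitrary random regressor matrix (an assumption for illustration only) and a hypothetical helper `cond_gram`. It also shows one caveat: the measure is unchanged when all signals are scaled by the same factor, but not when one signal is scaled relative to another.

```python
import numpy as np


def cond_gram(X):
    """Condition number of X^T X as the ratio of its largest to smallest eigenvalue."""
    eigenvalues = np.linalg.eigvalsh(X.T @ X)  # returned in ascending order
    return eigenvalues[-1] / eigenvalues[0]


rng = np.random.default_rng(1)
X = rng.standard_normal((50, 2))  # illustrative data, two regressors

cond_original = cond_gram(X)
cond_scaled = cond_gram(10.0 * X)                       # common scaling of all signals
cond_unbalanced = cond_gram(X * np.array([1.0, 0.01]))  # one signal much smaller

print("original:        ", cond_original)
print("all signals x10: ", cond_scaled)
print("one signal x0.01:", cond_unbalanced)
```

The common scaling leaves the ratio unchanged, while shrinking a single column inflates it, which is one reason to keep the signals at similar magnitudes.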

Signal Correlation and Dynamics
(Figure: input u(t) and output y(t))
• So far, we have just discussed choosing input signals that are uncorrelated/orthogonal
• However, dynamics/feedback introduce correlation between individual signals (i.e. between u(t) and u(t-1), and between y(t-1) and y(t)):
  • $E(y(t-1)\,u(t)) \neq 0$
  • This is because y(t) is related to u(t), especially when they change slowly
  • A stable plant will track (correlate with) the input signal
  • The condition number will be worse
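The correlation introduced by the dynamics can be illustrated with a small simulation. The noise-free plant $y(t) = 0.8\,y(t-1) + 0.2\,u(t-1)$, the two input signals and the function names are all assumptions made for this sketch.

```python
import numpy as np


def simulate(u, a=0.8, k=0.2):
    """Assumed stable first-order plant: y(t) = a*y(t-1) + k*u(t-1)."""
    y = np.zeros(len(u))
    for t in range(1, len(u)):
        y[t] = a * y[t - 1] + k * u[t - 1]
    return y


def regressor_stats(u):
    """Correlation of y(t-1) with u(t), and Cond(X^T X) for X = [y(t-1), u(t-1)]."""
    y = simulate(u)
    corr = np.corrcoef(y[:-1], u[1:])[0, 1]
    X = np.column_stack([y[:-1], u[:-1]])
    return corr, np.linalg.cond(X.T @ X)


n = 500
rng = np.random.default_rng(2)
u_slow = np.sin(0.05 * np.arange(n))  # slowly varying input that the plant tracks
u_rand = rng.standard_normal(n)       # white random input

corr_slow, cond_slow = regressor_stats(u_slow)
corr_rand, cond_rand = regressor_stats(u_rand)

print(f"slow sinusoid: corr={corr_slow:.3f}, cond={cond_slow:.1f}")
print(f"random input:  corr={corr_rand:.3f}, cond={cond_rand:.1f}")
```

For the slow input the output follows the input closely, so the regressors are strongly correlated and the condition number is larger than for the random input.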

Example 1: Impulse/Step Signal
(Figure: three input signals u(t) plotted against t)
• Any linear system is completely identified by its impulse (or step) response, because convolution can be used to calculate the output.
• However, as shown in Slide 8, there are several aspects that may make this identification difficult:
  • Magnitude of the step/impulse signal (relative to the noise)
  • Length of the transient period, relative to the steady state
  • Generation of the impulse/step signal, which may be infeasible due to control magnitude and/or actuator dynamics limits
  • High correlation between u(t) and u(t-k); the steady state adds little
• Note that if the plant model is non-linear, an impulse/step only collects information at one operating point, so if the aim is to reject non-linear components, step/impulse trains of different amplitudes must be used

Example 2: Sinusoidal Signal
• A sinusoid may look to be a rich enough signal to identify linear models
• It can be used to identify the gain and phase shift at one particular frequency
• However, it can only be used when the maximum control delay is 1, because
  • $u(t) = \theta_1 u(t-1) + \theta_2 u(t-2)$
• The same applies to the output feedback delay (because in the steady state, the output is also sinusoidal).
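The linear dependence between a sinusoid and its own delayed copies can be checked numerically without deriving the coefficients. The sketch below assumes a frequency of 0.5 rad/sample and 50 samples, and compares the rank of the lagged regressor matrix for a sinusoid with that for a random signal.

```python
import numpy as np

t = np.arange(2, 52)  # assumed sample instants, leaving room for two delays
w = 0.5               # assumed frequency in rad/sample

# Columns are u(t), u(t-1), u(t-2) for a pure sinusoid.
X_sin = np.column_stack([np.sin(w * t), np.sin(w * (t - 1)), np.sin(w * (t - 2))])
rank_sin = np.linalg.matrix_rank(X_sin)

# The same lag structure for a white random signal.
rng = np.random.default_rng(3)
noise = rng.standard_normal(52)
X_rand = np.column_stack([noise[2:52], noise[1:51], noise[0:50]])
rank_rand = np.linalg.matrix_rank(X_rand)

print("sinusoid: rank =", rank_sin, "of", X_sin.shape[1])
print("random:   rank =", rank_rand, "of", X_rand.shape[1])
```

The sinusoid gives only two independent columns out of three, so a model with three such terms cannot be identified uniquely from it, whereas the random signal gives full rank.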

Example 3: Random Signal
• A random signal is persistently exciting for a linear model of any order
• It involves a range of amplitudes and so can be used for non-linear terms as well. However,
  • It is a bit of a “scatter gun” approach
  • It can be wasteful when the model structure is reasonably well-known
  • There may be limits on the actuator dynamics
  • It is difficult to use on-line, where the control action is “smooth”

On-Line Parameter Estimation
• So far, it has been assumed that the parameter estimation is being performed off-line
  • Collect a fixed-size data set
  • Estimate the parameters
  • Issues of parameter identifiability are related to a fixed data set
• On-line parameter estimation is more complex
  • Typically a plant is controlled to a set-point for a long period of time
  • The recursive calculation is often re-set after fixed intervals (to clear accumulated floating-point errors)
  • Sometimes there is a need to track time-varying parameters

Time-Varying Parameters
• One reason for considering on-line/recursive parameter estimation is to model systems where the linear parameters vary slowly with time
• Common parameter changes are step changes or slow drifts
• The aim is to treat the system as slowly changing, and the model must be kept “plastic enough” to respond to changes in the parameters
• Note that, strictly speaking, this is now a non-linear system where the dynamics of the parameters are much slower than the dynamics of the system’s states.

Long-Term Convergence & Plasticity
• Using either the normal equations or the equivalent on-line, recursive version, as the amount of data increases, the parameter estimates tend to the true values and the effect of a new datum is close to zero.
• To model parametric drifts, the parameter estimates must include a term that makes the model more dependent on recent large residuals
• This can be achieved by defining a modified performance function where the residuals are weighted by a time decay factor
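One common way to write such a performance function, assumed here for illustration, uses an exponential forgetting factor $\lambda$ with $0 < \lambda \le 1$, where $x(i)$ is the regressor vector and $y(i)$ the measured output at time $i$:

```latex
J_t(\theta) = \sum_{i=1}^{t} \lambda^{\,t-i} \left( y(i) - x^T(i)\,\theta \right)^2,
\qquad 0 < \lambda \le 1
```

The most recent residual has weight 1, a residual that is $j$ samples old has weight $\lambda^{j}$, and setting $\lambda = 1$ recovers the ordinary least squares cost.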

Exponential RLS
• Form the new input vector x(t+1) using the new data
• Form the prediction error e(t+1) from the model
• Form P(t+1)
• Update the least squares estimate
• Proceed with the next time step
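The steps above can be written out in one standard textbook form of recursive least squares with exponential forgetting. This is offered as a sketch: the arrangement of the update and the symbols $\lambda$ (forgetting factor) and $\hat{\theta}(t)$ (current estimate) are assumptions and may differ from the form used on the original slide.

```latex
\begin{aligned}
e(t+1) &= y(t+1) - x^T(t+1)\,\hat{\theta}(t) \\
P(t+1) &= \frac{1}{\lambda}\left[ P(t) -
  \frac{P(t)\,x(t+1)\,x^T(t+1)\,P(t)}{\lambda + x^T(t+1)\,P(t)\,x(t+1)} \right] \\
\hat{\theta}(t+1) &= \hat{\theta}(t) + P(t+1)\,x(t+1)\,e(t+1)
\end{aligned}
```

With $\lambda = 1$ these reduce to the ordinary RLS recursion, and $P(0)$ is typically initialised to a large multiple of the identity matrix.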

Example: Exponential RLS
• Consider the first-order electrical circuit example
• Here a and k are functions of time and both vary linearly between 1 and 2 over the length of the simulation
• The input signal is sinusoidal and noise N(0, 0.01) is added
• There is a balance between noise filtering and model/parameter plasticity
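An experiment of this kind can be sketched as follows. Several details are assumptions rather than the original simulation: the circuit is taken to be dy/dt = -a y + k u, discretised with an Euler step dt = 0.2 so that the discrete parameters are [1 - a dt, k dt]; the noise enters as equation error with variance 0.01; the input is sin(0.5 t); and the function name `exponential_rls` is hypothetical.

```python
import numpy as np


def exponential_rls(X, y, lam, p0=1000.0):
    """Recursive least squares with forgetting factor lam; returns all estimates."""
    n, d = X.shape
    theta = np.zeros(d)
    P = p0 * np.eye(d)  # large initial P, i.e. little confidence in theta = 0
    history = np.empty((n, d))
    for t in range(n):
        x = X[t]
        e = y[t] - x @ theta                               # prediction error
        Px = P @ x
        P = (P - np.outer(Px, Px) / (lam + x @ Px)) / lam  # P update (P is symmetric)
        theta = theta + P @ x * e                          # parameter update
        history[t] = theta
    return history


rng = np.random.default_rng(0)
n, dt = 2000, 0.2                     # assumed record length and Euler step
a = np.linspace(1.0, 2.0, n)          # a(t) varies linearly from 1 to 2
k = np.linspace(1.0, 2.0, n)          # k(t) varies linearly from 1 to 2
theta_true = np.column_stack([1.0 - a * dt, k * dt])  # discrete-time parameters

u = np.sin(0.5 * np.arange(n + 1))    # sinusoidal input
y = np.zeros(n + 1)
for t in range(n):
    noise = 0.1 * rng.standard_normal()  # variance 0.01
    y[t + 1] = theta_true[t, 0] * y[t] + theta_true[t, 1] * u[t] + noise

X = np.column_stack([y[:-1], u[:-1]])
targets = y[1:]

est_forget = exponential_rls(X, targets, lam=0.99)
est_standard = exponential_rls(X, targets, lam=1.0)

# Mean parameter error over the second half of the run.
half = n // 2
err_forget = np.linalg.norm(est_forget - theta_true, axis=1)[half:].mean()
err_standard = np.linalg.norm(est_standard - theta_true, axis=1)[half:].mean()
print(f"lambda=0.99: mean error {err_forget:.4f}")
print(f"lambda=1.00: mean error {err_standard:.4f}")
```

Under these assumptions the estimator with forgetting follows the drifting parameters, while the standard estimator settles near an average of their past values. Lowering the forgetting factor further improves tracking at the cost of noisier estimates.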

Parameter Convergence & Persistent Excitation
• While this algorithm is relatively simple, it has two important, related aspects that must be considered
  • What is the value of $\lambda$?
  • What form of persistently exciting input is needed?
• When $\lambda$ is 1, this is just standard RLS estimation.
• When $\lambda < 0.9$, the model is extremely adaptive and the parameters will not generally converge when the measurement noise is significant
• As the model becomes more plastic, the input signal must be sufficiently persistently exciting over every significant time window to stop random parameter drift/premature convergence
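A useful rule of thumb for interpreting $\lambda$ follows from summing the exponential weights, assuming a residual that is $j$ samples old is weighted by $\lambda^{j}$:

```latex
\sum_{j=0}^{\infty} \lambda^{j} = \frac{1}{1-\lambda}, \qquad 0 < \lambda < 1
```

The estimator therefore behaves roughly as if it remembered the last $1/(1-\lambda)$ samples: about 100 samples for $\lambda = 0.99$ and only about 10 for $\lambda = 0.9$. The input has to be persistently exciting within a window of that length.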

Summary 13&14
• The engineer’s aim is to minimise the amount of data collected to identify the parameters sufficiently accurately
• Signal magnitude should be as large as possible to improve the signal/noise ratio and to minimise the parameter covariances. However, the signal should not be so large that it violates any system constraints or makes the unknown system significantly non-linear
• Signal type & frequency must be smooth enough not to exceed any dynamic constraints; however, the signal must excite any potential dynamics.
• When parameter estimation is on-line, this imposes additional constraints as the signals must be sufficiently exciting for each time period
• Exponential forgetting can be used to track time-varying parameters, but the previous comments must still hold

Laboratory 13&14
1. Prove the Slide 14 relationship for a sin function. What are $\theta_1$ and $\theta_2$?
2. Measure $\mathrm{Cond}(X^T X)$ and the parameter estimates for the electrical simulation with each of the following inputs. Try varying the magnitude of the step signal as well.
  • Step
  • Sin
  • Random
3. Implement the exponential RLS for the electrical simulation with the time-varying parameters on Slide 20. Try changing the input/control signals and compare the responses.
