Process and Disturbance Models



  1. Process and Disturbance Models J. McLellan - Fall 2005

  2. Outline
  • Types of Models
  • Model Estimation Methods
  • Identifying Model Structure
  • Model Diagnostics

  3. The Task of Dynamic Model Building
  Partitioning process data into a deterministic component (the process, captured by a transfer function model) and a stochastic component (the disturbance, captured by a time series model).

  4. Process Model Types
  • non-parametric
    • impulse response
    • step response
    • spectrum
  • parametric
    • transfer function models (numerator and denominator)
    • difference equation models - equivalent to transfer function models written with the backshift operator
  Note: impulse and step response models become technically “parametric” when truncated to finite form (e.g., FIR).

  5. Impulse and Step Process Models
  Described as a set of weights:
  • impulse model: y(t) = h1 u(t-1) + h2 u(t-2) + ... + hN u(t-N)
  • step model: y(t) = s1 Δu(t-1) + s2 Δu(t-2) + ... + sN Δu(t-N), where the step weights are the running sums sk = h1 + ... + hk
  Note - the input before the data window is typically treated as a step from 0, i.e., Δu(t-N) = u(t-N).
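The relationship between impulse weights and step weights above can be sketched numerically; the weights below are invented for illustration, not taken from a real process:

```python
import numpy as np

# Hypothetical impulse-response weights h1..h4 of a truncated (FIR) model.
h = np.array([0.5, 0.3, 0.15, 0.05])

# Step-response weights are running sums of the impulse weights:
# sk = h1 + ... + hk.
s = np.cumsum(h)

# FIR simulation y(t) = sum_i h_i * u(t-i): for a unit step input,
# the output traces out the step-response weights.
u = np.ones(10)                      # unit step input
y = np.convolve(u, h)[:len(u)]
```

For a truncated model the final step weight is the process steady-state gain (here 1.0 by construction).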

  6. Process Spectrum Model
  Represented as a set of frequency response values, or graphically as a plot of amplitude ratio versus frequency (rad/s).

  7. Process Transfer Function Models
  • numerator and denominator dynamics plus time delay
  • an extra one-step delay is introduced by the zero-order hold and sampling; f is the pure time delay
  • zeros come from the numerator, poles from the denominator
  • q-1 is the backward shift operator: q-1 y(t) = y(t-1)
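A minimal simulation of such a model, with made-up coefficients, shows the backshift/difference-equation equivalence and the effect of the pure time delay:

```python
import numpy as np

# First-order transfer function model with pure time delay f, written
# with the backshift operator: (1 + a1*q^-1) y(t) = b1 * q^-(1+f) u(t),
# i.e. the difference equation y(t) = -a1*y(t-1) + b1*u(t-1-f).
# Coefficients are illustrative only.
a1, b1, f = -0.8, 0.2, 2

u = np.ones(30)            # unit step input from t = 0
y = np.zeros(30)
for t in range(1, 30):
    u_del = u[t - 1 - f] if t - 1 - f >= 0 else 0.0   # delayed input
    y[t] = -a1 * y[t - 1] + b1 * u_del

# The output stays at zero for the first 1 + f = 3 samples (the delay),
# then rises toward the steady-state gain b1 / (1 + a1) = 1.0.
```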

  8. Model Types for Disturbances
  • non-parametric
    • “impulse response” - infinite moving average
    • spectrum
  • parametric
    • “transfer function” form
      • autoregressive (denominator)
      • moving average (numerator)

  9. ARIMA Models for Disturbances
  AutoRegressive Integrated Moving Average model, driven by a random shock sequence. In time series notation, an ARIMA(p,d,q) model has:
  • pth-order denominator - the autoregressive (AR) component
  • qth-order numerator - the moving average (MA) component
  • d integrating poles (on the unit circle)

  10. ARMA Models for Disturbances
  ARMA models simply have no integrating component: an autoregressive component and a moving average component, driven by a random shock sequence.
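A short simulation (parameters invented for illustration) shows an ARMA(1,1) disturbance generated from random shocks; accumulating the result adds one integrating pole, giving the ARIMA case of the previous slide:

```python
import numpy as np

# ARMA(1,1) disturbance: (1 + a1*q^-1) d(t) = (1 + c1*q^-1) e(t),
# driven by a white-noise random shock sequence e(t).
rng = np.random.default_rng(0)
a1, c1 = -0.7, 0.4            # illustrative AR and MA coefficients

N = 5000
e = rng.normal(0.0, 1.0, N)   # random shocks
d = np.zeros(N)
for t in range(1, N):
    d[t] = -a1 * d[t - 1] + e[t] + c1 * e[t - 1]

# One integrating pole (ARIMA with d = 1): accumulate the ARMA series.
w = np.cumsum(d)
```

The AR term makes the disturbance variance larger than the shock variance, and the integrated series w wanders without returning to a fixed mean.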

  11. Typical Model Combinations
  • model predictive control
    • impulse/step process model + ARMA disturbance model
    • typically a step disturbance model, which can be considered as a pure integrator driven by a single pulse
  • single-loop control
    • transfer function process model + ARMA disturbance model

  12. Classification of Models in Identification (per Ljung’s terminology)
  • AutoRegressive with eXogenous inputs (ARX)
  • Output Error (OE)
  • AutoRegressive Moving Average with eXogenous inputs (ARMAX)
  • Box-Jenkins (BJ)

  13. ARX Models
  • u(t) is the exogenous input
  • same autoregressive component for process and disturbance
  • numerator term for the process, no moving average in the disturbance
  • physical interpretation - the disturbance passes through the entire process dynamics
    • e.g., a feed disturbance

  14. Output Error Models
  • no disturbance dynamics
  • numerator and denominator process dynamics
  • physical interpretation - a process subject to a white noise disturbance (is this ever true?)

  15. ARMAX Models
  • process and disturbance have the same denominator dynamics
  • disturbance has moving average dynamics
  • physical interpretation - a disturbance passing through the process, entering at a point away from the input
    • except if C(q-1) = B(q-1)

  16. Box-Jenkins Model
  • input and disturbance paths can have different dynamics
  • an autoregressive component A(q-1) represents dynamic elements common to both process and disturbance
  • physical interpretation - the disturbance passes through other dynamic elements before entering the process

  17. Range of Model Types
  From least general to most general: Output Error, ARX, ARMAX, Box-Jenkins.

  18. Outline
  • Types of Models
  • Model Estimation Methods
  • Identifying Model Structure
  • Model Diagnostics

  19. Model Estimation - General Philosophy
  Form a “loss function” which is minimized to obtain the “best” parameter estimates.
  • “loss” can be considered as missed trend or information
  • e.g., in linear regression:
    • loss represents left-over trends in the residuals which could be explained by a model
    • if we picked up all of the trend, only the random noise e(t) would be left
    • additional trends drive up the variation of the residuals
    • the loss function is the sum of squares of the residuals (related to the variance of the residuals)

  20. Linear Regression - Types of Loss Functions
  First, consider the linear regression model Yi = β0 + β1 xi + ei. The Least Squares estimation criterion minimizes the sum over all points “i” of the squared prediction error (Yi - β0 - β1 xi)².

  21. Linear Regression - Types of Loss Functions
  The model describes how the mean of Y varies: E{Yi} = β0 + β1 xi, and the variance of Y is Var{Yi} = σ², because the random component in Y comes from the additive noise “e”. The probability density function at point “i” is
  f(Yi) = (1/√(2πσ²)) exp( -(Yi - β0 - β1 xi)² / (2σ²) )
  where ei = Yi - β0 - β1 xi is the noise at point “i”.

  22. Linear Regression - Types of Loss Functions
  Since the noise terms are independent, the joint probability density function for all N observations in the data set is the product of the individual densities:
  f(Y1, ..., YN) = (2πσ²)^(-N/2) exp( -Σi (Yi - β0 - β1 xi)² / (2σ²) )

  23. Linear Regression - Types of Loss Functions
  Given the parameters, we can use the joint density to determine the probability that a given range of observations will occur. What if we have observations but don’t know the parameters?
  • assume that we have the most common, or “likely”, observations - i.e., observations that have the greatest probability of occurrence
  • find the parameter values that maximize the probability of the observed values occurring
  • the joint density function becomes a “likelihood function”
  • the parameter estimates are “maximum likelihood estimates”

  24. Linear Regression - Types of Loss Functions
  Maximum Likelihood Parameter Estimation Criterion - choose β0, β1 to maximize the likelihood of the observed data:
  L(β0, β1; Y) = (2πσ²)^(-N/2) exp( -Σi (Yi - β0 - β1 xi)² / (2σ²) )

  25. Linear Regression - Types of Loss Functions
  Given the form of the likelihood function, maximizing it is equivalent to minimizing the argument of the exponential, i.e., minimizing Σi (Yi - β0 - β1 xi)². For the linear regression case, the maximum likelihood parameter estimates are therefore equivalent to the least squares parameter estimates.

  26. Linear Regression - Types of Loss Functions
  • Least Squares Estimation - loss function is the sum of squared residuals = sum of squared prediction errors
  • Maximum Likelihood - loss function is the likelihood function, which in the linear regression case is equivalent to the sum of squared prediction errors
  Prediction error = observation - predicted value

  27. Loss Functions for Identification
  Least Squares - “minimize the sum of squared prediction errors”. The loss function is
  V(θ) = Σ(t=1..N) ε(t)²
  where ε(t) is the prediction error at time t and N is the number of points in the data record.

  28. Least Squares Identification Example
  Given an ARX(1) process+disturbance model
  y(t) = -a y(t-1) + b u(t-1) + e(t)
  the loss function can be written as
  V(a, b) = Σt ( y(t) + a y(t-1) - b u(t-1) )²

  29. Least Squares Identification Example
  In matrix form, Y = Φθ + E, where the rows of the regressor matrix Φ are [-y(t-1)  u(t-1)] and θ = [a  b]ᵀ, and the sum of squared prediction errors is
  V(θ) = (Y - Φθ)ᵀ (Y - Φθ)

  30. Least Squares Identification Example
  The least squares parameter estimates are
  θ̂ = (ΦᵀΦ)⁻¹ ΦᵀY
  Note that the disturbance structure in the ARX model is such that the disturbance contribution appears in the formulation as a white noise additive error --> it satisfies the assumptions for this formulation.
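The normal-equation solution can be sketched end to end on simulated ARX(1) data; the model parameters, noise level, and record length below are all invented for illustration:

```python
import numpy as np

# Simulate an ARX(1) system  y(t) = -a*y(t-1) + b*u(t-1) + e(t).
rng = np.random.default_rng(1)
a_true, b_true = -0.8, 0.5

N = 2000
u = rng.normal(size=N)              # exciting white-noise input
e = rng.normal(scale=0.1, size=N)   # white-noise disturbance
y = np.zeros(N)
for t in range(1, N):
    y[t] = -a_true * y[t - 1] + b_true * u[t - 1] + e[t]

# Regressor matrix Phi (rows [-y(t-1), u(t-1)]) and the least squares
# solution theta_hat = (Phi' Phi)^-1 Phi' Y.
Phi = np.column_stack([-y[:-1], u[:-1]])
Y = y[1:]
theta_hat, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
a_hat, b_hat = theta_hat
```

Because the ARX disturbance enters as white additive noise in the regression, the estimates converge to the true values as N grows.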

  31. Least Squares Identification
  • ARX models fit into this framework
  • Output Error models, written in difference equation form, have a correlated additive error term, which violates the least squares assumption of independent errors

  32. Least Squares Identification
  Any process+disturbance model other than the ARX model will not satisfy the structural requirements. Implications?
  • estimators are not consistent - they don’t asymptotically tend to the true values of the parameters
  • potential for bias

  33. Prediction Error Methods
  Choose the parameter estimates to minimize some function of the prediction errors ε(t) = y(t) - ŷ(t|t-1). For example, for the Output Error model the prediction is simulated from the input alone, and a numerical optimization routine is used to obtain the “best” estimates.
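As a sketch of this idea, the output-error predictor below is simulated from the input alone (no measured outputs enter the recursion), and a coarse grid search stands in for the numerical optimization routine; the model, parameters, and grid are invented for illustration:

```python
import numpy as np

# Output-error system  y(t) = x(t) + e(t),  x(t) = -a*x(t-1) + b*u(t-1).
rng = np.random.default_rng(2)
a_true, b_true = -0.7, 1.0

N = 500
u = rng.normal(size=N)
x = np.zeros(N)
for t in range(1, N):
    x[t] = -a_true * x[t - 1] + b_true * u[t - 1]
y = x + rng.normal(scale=0.05, size=N)   # white measurement noise

def sse(a, b):
    """Sum of squared prediction errors for the simulated OE predictor."""
    yhat = np.zeros(N)
    for t in range(1, N):
        yhat[t] = -a * yhat[t - 1] + b * u[t - 1]   # uses yhat, not y
    return np.sum((y - yhat) ** 2)

best = min(((sse(a, b), a, b)
            for a in np.arange(-0.9, 0.0, 0.05)
            for b in np.arange(0.5, 1.5, 0.05)),
           key=lambda r: r[0])
a_hat, b_hat = best[1], best[2]
```

A grid search scales poorly with the number of parameters; in practice a gradient-based optimizer minimizes the same loss.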

  34. Prediction Error Methods
  AR(1) Example - use the model to predict one step ahead given past values:
  ŷ(t|t-1) = -a y(t-1)
  This “one step ahead predictor” is optimal when e(t) is normally distributed, and can be obtained by taking the conditional expectation of y(t) given information up to and including time t-1. e(t) disappears because it has zero mean and adds no information on average.

  35. Prediction Error Methods
  Prediction error for the one step ahead predictor:
  ε(t) = y(t) - ŷ(t|t-1)
  We could obtain parameter estimates that minimize the sum of squared prediction errors - the same as the Least Squares estimates for this ARX example.

  36. Prediction Error Methods
  What happens if we have an ARMAX(1,1) model? The one step ahead predictor is
  ŷ(t|t-1) = -a y(t-1) + b u(t-1) + c e(t-1)
  But what is e(t-1)?
  • estimate it as the previous prediction error, using the measured y(t-1) and its prediction ŷ(t-1|t-2)

  37. Prediction Error Methods
  Note that the estimate of e(t-1) depends on e(t-2), which depends on e(t-3), and so forth.
  • we eventually end up with a dependence on e(0), which is typically assumed to be zero
  • “conditional” estimates - conditional on the assumed initial values
  • we can also formulate in a way that avoids conditional estimates
  • the impact is typically negligible for large data sets
  • during computation, it isn’t necessary to solve recursively all the way back to the initial condition - use the previous prediction to estimate the previous prediction error

  38. Prediction Error Methods
  Formulation for the general case - given a process plus disturbance model
  y(t) = G(q-1) u(t) + H(q-1) e(t)
  we can write e(t) = H(q-1)⁻¹ [ y(t) - G(q-1) u(t) ], so that the prediction is
  ŷ(t|t-1) = H(q-1)⁻¹ G(q-1) u(t) + [ 1 - H(q-1)⁻¹ ] y(t)
  The random shocks are estimated as ê(t) = H(q-1)⁻¹ [ y(t) - G(q-1) u(t) ].

  39. Prediction Error Methods
  Putting these expressions together yields a predictor which is of the form of a filter acting on past inputs and outputs. The prediction error for use in the estimation loss function is
  ε(t) = y(t) - ŷ(t|t-1) = H(q-1)⁻¹ [ y(t) - G(q-1) u(t) ]

  40. Prediction Error Methods
  How does this look for a general ARMAX model A(q-1) y(t) = B(q-1) u(t) + C(q-1) e(t)? Here G = B/A and H = C/A, so getting ready for the prediction we obtain
  ŷ(t|t-1) = [ B(q-1)/C(q-1) ] u(t) + [ 1 - A(q-1)/C(q-1) ] y(t)

  41. Prediction Error Methods
  Note that the ability to estimate the random shocks depends on the ability to invert C(q-1):
  • invertibility was discussed for moving average disturbances
  • it is the ability to express the shocks in terms of present and past outputs - converting to an infinite autoregressive sum
  Note that the moving average parameters appear in the denominator of the prediction:
  • the model is nonlinear in the moving average parameters, and conditionally linear in the others

  42. Likelihood Function Methods
  Conditional Likelihood Function
  • assume initial conditions for the outputs and random shocks
    • e.g., for ARX(1), a value for y(0)
    • e.g., for ARMAX(1,1), values for y(0) and e(0)
  General argument -
  • the random shocks are assumed normally distributed, with zero mean and known variance
  • form the joint distribution for the shocks over all times
  • find the parameter values that maximize the likelihood

  43. Likelihood Function Methods
  Exact Likelihood Function - note that we can also form an exact likelihood function which includes the initial conditions:
  • the maximum likelihood estimation procedure estimates the parameters AND the initial conditions
  • the exact likelihood function is more complex
  In either case, we use a numerical optimization procedure to solve for the maximum likelihood estimates.

  44. Likelihood Function Methods
  Final comment - the derivation of the likelihood function requires convergence of the moving average and autoregressive elements:
  • moving average --> invertibility
  • autoregressive --> stability
  Example - a Box-Jenkins model can be re-arranged to yield the random shock in terms of an inverted AR component and an inverted MA component.

  45. Outline
  • Types of Models
  • Model Estimation Methods
  • Identifying Model Structure
  • Model Diagnostics

  46. Model-Building Strategy
  • graphical pre-screening
  • select initial model structure
  • estimate parameters
  • examine model diagnostics
  • examine structural diagnostics
  • validate model using an additional data set
  Modify the model and re-estimate as required.

  47. Example - Debutanizer
  Objective - fit a transfer function + disturbance model describing changes in bottoms RVP in response to changes in internal reflux.
  Data
  • step data
  • slow PRBS (switch down, switch up, switch down)

  48. Graphical Pre-Screening
  • examine time traces of outputs, inputs, and secondary variables
    • are there any outliers or major shifts in operation?
    • could there be a model in this data?
  • engineering assessment
    • should there be a model in this data?

  49. Selecting Initial Model Structure
  • examine auto- and cross-correlations of the output and input
    • look for autoregressive and moving average components
  • examine the spectrum of the output
    • indication of the order of the process
      • first-order
      • second-order underdamped - resonance
      • second or higher order overdamped
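As a small illustration of the first bullet, the sample autocorrelation of a simulated AR(1) output decays geometrically, which is the signature that points toward an autoregressive component (the series and coefficient are invented):

```python
import numpy as np

def autocorr(x, nlags):
    """Sample autocorrelation rho(0..nlags) of a series."""
    x = x - np.mean(x)
    acov = np.correlate(x, x, mode="full")[len(x) - 1:]
    return acov[:nlags + 1] / acov[0]

# Simulated AR(1) output: rho(k) should decay roughly as 0.9**k.
rng = np.random.default_rng(4)
N = 4000
e = rng.normal(size=N)
z = np.zeros(N)
for t in range(1, N):
    z[t] = 0.9 * z[t - 1] + e[t]

rho = autocorr(z, 5)
```

A moving average component would show the opposite signature: a sharp cutoff in the autocorrelation after lag q rather than a slow geometric decay.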

  50. Selecting Initial Model Structure...
  • examine the correlation estimate of the impulse or step response
    • available if the input is not a step
  • what order is the process?
    • 1st order, 2nd order over/underdamped
  • size of the time delay
