
Making the Most of Process Information via Multiscale and Bayesian Methods

Bhavik R. Bakshi

Department of Chemical Engineering

Ohio State University

Columbus, OH 43210

CPACT Conference, Edinburgh, April 25-26, 2002


Overview of Research Group

  • Goal: Develop tools and techniques for efficient and sustainable process engineering

  • Projects focus on process and global scales

  • Process scale

    • Multiscale and Bayesian methods for extracting knowledge from process data

  • Global scale

    • Economically and ecologically conscious process engineering

  • Develop rigorous and systematic methods and explore their applications


Motivation for Multiscale and Bayesian Methods

  • Processes and data are usually multiscale in nature

    • Events and features at multiple scales

    • Multirate measurements

    • Autocorrelated stochastic processes

  • Variety of process knowledge and information available

    • Measured data

    • Fundamental, empirical or heuristic knowledge

  • Single-scale and non-Bayesian methods lead to

    • Inferior analysis and modeling

    • Inefficient computation and use of available information

    • Disintegrated operation

  • Multiscale and Bayesian methods can perform better


Multiphase Flow

  • Flow regimes in fluidized bed

  • Partial models and data are available for each regime

[Figure: intensity vs. time traces for the homogeneous, heterogeneous, and slug flow regimes of a fluidized bed]


Sheet and Film Manufacturing

  • Different sampling interval in each channel

  • Dynamic models are also available

[Figure: sheet/film schematic showing the sensor direction and the machine direction]


Chemical Process Operation

[Figure: process operation hierarchy - Planning, Scheduling, Supervisory Control, Monitoring and Diagnosis, Regulatory Control, Data Acquisition, Process]

  • Efficient operation requires reasoning at different scales

  • Process data and knowledge are available


Objectives

  • Develop methods for efficient process operation that can exploit

    • Multiscale nature of processes

    • All available process data and knowledge

  • Focus on the following tasks

    • Process Monitoring

    • Fault Diagnosis

    • Empirical Modeling

    • Data Rectification and Estimation

    • Analysis of complex chemical and biological systems

  • Integrate process operation tasks


Outline

  • Introduction to

    • Bayesian methods

    • Wavelet analysis

  • General Approach for Multiscale Methods

  • Fault Detection and Diagnosis

    • MSPCA, MSART

  • Empirical Modeling

    • Bayesian PCA, Bayesian Latent Variable Regression

  • Dynamic Data Rectification

    • Linear systems with and without accurate models

    • Nonlinear systems

  • Approaches are general and broadly applicable to a variety of modeling and analysis tasks


Bayesian Estimation

[Figure: Bayesian updating loop - prior knowledge P(H) (current belief) is combined with information from data P(D|H) (new information) to give the posterior P(H|D) (new belief); a loss function selects the Bayesian estimate Ĥ from the posterior. Portrait: Rev. Thomas Bayes, 1702-1761]

  • Statistical framework for combining priorknowledge with empirical observations

  • Posterior becomes prior at next time

  • Bayes Rule: P(H | D) = P(D | H) P(H) / P(D)


Illustration of Bayesian Estimation

  • P(H|D) → 1 as t → ∞

[Figure: sequential Bayesian updating - at each time step t = 1, 2, 3, ..., the prior is combined with new data and the resulting posterior becomes the next prior]

  • A newly born baby sees the sun setting and wonders, “Will it be back?” (Malakoff, 1999)

  • Prior knowledge: sun may or may not rise, P(H) = 0.5

  • Data obtained everyday = Sun rises

  • Posterior at t=k becomes prior at t=k+1
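This updating loop is easy to express in code. A minimal sketch of the sunrise example, assuming illustrative likelihoods P(D|H) = 1 and P(D|not H) = 0.5 (these values are not from the talk):

```python
# Sequential Bayesian updating for the sunrise example.
# H = "the sun will rise"; each day's data D = "the sun rose today".
p_h = 0.5                                   # prior: sun may or may not rise
p_d_given_h, p_d_given_not_h = 1.0, 0.5     # assumed likelihoods (illustrative)

for day in range(1, 11):
    # Bayes rule: P(H | D) = P(D | H) P(H) / P(D)
    p_d = p_d_given_h * p_h + p_d_given_not_h * (1.0 - p_h)
    p_h = p_d_given_h * p_h / p_d           # today's posterior ...
    print(f"day {day}: P(H | data so far) = {p_h:.4f}")
    # ... becomes tomorrow's prior, and P(H | data) -> 1 as days accumulate
```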


Challenges in Bayesian Analysis

  • Need distributions for prior and likelihood

  • Bad prior can give slow convergence and misleading answer

  • Gaussian densities are mathematically convenient but may not represent reality

  • Can be computationally expensive, particularly for non-Gaussian densities

  • Potential solutions

    • Use Empirical Bayes methods - estimate prior from measured data

    • Combine Bayesian analysis with Multiscale analysis

    • Markov Chain Monte Carlo methods


Multiscale Nature of Variables

[Figure: a process signal over time (t = 0 to 100) containing features at multiple scales - equipment degradation, disturbances, noise, sensor failures, and equipment failure - and the regions of the time-frequency (t, ω) plane covered by different basis functions]

  • Delta functions

  • Fourier Transform

  • Linear Filters

  • Wavelet Transform


Wavelets

[Figure: Haar and Daubechies-6 scaling functions φ(x) and wavelets ψ(x), including translated and dilated copies at m=1, k=0 and m=2, k=4]

  • Family of basis functions of fixed shape

  • Translations and dilations of mother wavelet

    ψmk(x) = 2^(-m/2) ψ(2^(-m) x - k), where m, k are integers
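A minimal sketch of this definition for the Haar mother wavelet (the function names are illustrative, not from the talk):

```python
import numpy as np

def haar_mother(x):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    return np.where((x >= 0) & (x < 0.5), 1.0,
                    np.where((x >= 0.5) & (x < 1.0), -1.0, 0.0))

def psi_mk(x, m, k):
    """Translated and dilated wavelet: psi_mk(x) = 2^(-m/2) psi(2^(-m) x - k)."""
    return 2.0 ** (-m / 2.0) * haar_mother(2.0 ** (-m) * x - k)

x = np.linspace(0.0, 32.0, 2000)
fine = psi_mk(x, m=1, k=0)      # narrow copy supported on [0, 2)
coarse = psi_mk(x, m=2, k=4)    # wider, lower-amplitude copy supported on [16, 20)
```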


Wavelet Decomposition

[Figure: wavelet filter bank - the original signal (m=0) is passed recursively through a low-pass filter H to give scaled signals y_m and a high-pass filter G to give detail (wavelet) signals d_m at scales m=1, 2, ...]
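A minimal sketch of such a decomposition using the PyWavelets package; the synthetic signal and the choice of the Haar filter are illustrative:

```python
import numpy as np
import pywt

# Synthetic signal: slow drift, a step change, and measurement noise.
t = np.arange(256)
signal = 0.01 * t + 2.0 * (t > 128) + np.random.normal(0.0, 0.5, t.size)

# Two-level decomposition with the Haar filter bank:
# coeffs = [scaled signal at m=2, detail signal d2, detail signal d1]
coeffs = pywt.wavedec(signal, 'haar', level=2)
y2, d2, d1 = coeffs
```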


Properties of Wavelets

  • Represents signals and functions as

    • y(t) = Σm Σk dmk ψmk(t) + Σk yLk φLk(t)

  • Localized in time and frequency

    • Deterministic features are captured by few large coefficients

  • Approximate eigenfunctions

    • Stochastic processes are approximately decorrelated

  • Can be orthonormal

    • Fast computation, O(N)

  • Extended to libraries of basis functions

    • Wavelet packets, cosine packets, etc.


Multiscale Feature Extraction

[Figure: multiscale feature extraction - the original signal is decomposed into wavelet coefficients at m=1, 2, 3 and scaled coefficients at m=3, which are thresholded and reconstructed]
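A minimal sketch of the threshold-and-reconstruct step with PyWavelets; the universal threshold and Daubechies-6 filter are common choices here, not necessarily the ones used in the talk:

```python
import numpy as np
import pywt

def extract_features(signal, wavelet='db6', level=3):
    """Keep only large wavelet coefficients, then reconstruct the signal."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Robust noise estimate from the finest-scale detail coefficients.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2.0 * np.log(len(signal)))   # universal threshold
    # Keep the coarsest scaled signal; soft-threshold every detail level.
    kept = [coeffs[0]] + [pywt.threshold(d, thresh, mode='soft') for d in coeffs[1:]]
    return pywt.waverec(kept, wavelet)[:len(signal)]
```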


Analysis of Stochastic Processes

  • Wavelet coefficients are approximately uncorrelated and Gaussian

    [Figure: for an ARIMA process, autocorrelation functions (ACF) and probability densities (PDF) of the original signal y0, the wavelet coefficients d1 and d2, and the last scaled signal y2]


Process Operation Tasks

  • Process Monitoring / Fault Detection

    • Detect abnormal operation from measured data

  • Empirical Modeling

    • Determine relationship between variables based on measured data

  • Data Rectification

    • Clean measured data by removing errors and satisfying process models


General Multiscale Methodology

[Figure: general multiscale methodology - the data X are decomposed by the wavelet transform W; an operation is applied to the detail coefficients G1X, ..., GmX, ..., GLX (fine to coarse) and to the scaled coefficients HLX; the results are reconstructed to give the estimates X̂, θ̂]

  • Convert traditional to multiscale methods (Bakshi, 1999)

  • Can use models at each scale and across scales
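A minimal sketch of this recipe applied to monitoring: decompose, apply an operation (here, per-scale 3σ limits, an illustrative choice) to the coefficients at each scale, and reconstruct from the coefficients that signal a change. This follows the general pattern only; it is not the exact MSSPC algorithm.

```python
import numpy as np
import pywt

def multiscale_operate(x, wavelet='haar', level=3, nsigma=3.0):
    """Flag samples whose wavelet coefficients violate per-scale control limits."""
    coeffs = pywt.wavedec(x, wavelet, level=level)
    kept = []
    for c in coeffs:                            # scaled signal H_L x, then details G_m x
        center, limit = np.mean(c), nsigma * np.std(c)   # per-scale limit (illustrative)
        kept.append(np.where(np.abs(c - center) > limit, c, 0.0))
    reconstructed = pywt.waverec(kept, wavelet)[:len(x)]
    return np.abs(reconstructed) > 0.0          # True where an abnormal feature survives
```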


Multiscale Statistical Process Control(Bakshi, 1998, Aradhye et al., 2000a, b)

  • SPC detects abnormal behavior from measured data

  • Lacks generality, best for certain types of changes

    • Shewhart charts for large shifts

    • CUSUM, EWMA for small shifts

  • Assumes uncorrelated measurements

  • Multivariate SPC reduces dimensionality by linear or nonlinear modeling

  • Normal and abnormal behavior usually occur at different scales

    • MSSPC should perform better


Detecting Mean Shift by MSSPC

[Figure: a signal with a mean shift, its wavelet transform (W, WT), and the resulting coefficients and detection limits at each scale over time]

  • Uncorrelated data with a mean shift of 2σ

  • First shift detection at scale m=2

  • Current shift detection in last scaled signal


Example of Univariate MSSPC

[Figure: detection of the shift by conventional SPC and by MSSPC]

  • Mean shift of size 5 in iid Gaussian measurements

  • MSSPC detection limits adapt to signal features


General Framework for SPC

  • Existing SPC filters operate at different fixed scales

  • MSSPC subsumes existing methods

[Figure: existing SPC charts (CUSUM, Shewhart, MA, EWMA) viewed as fixed-scale filters, alongside Haar and boundary-corrected Daubechies-4 wavelet filters]


Library of MSSPC Filters

[Figure: library of MSSPC filters - moving average, CUSUM, and Shewhart]


Multivariate SPC

[Figure: normal (*) and abnormal (+) observations in the (x1, x2) plane with principal components PC1 and PC2]

  • Univariate charts are inconvenient for multivariate tasks

  • Multivariate modeling reduces dimensionality

    • Linear modeling (PCA, PLS)

    • Nonlinear (clustering, NLPCA)

  • Detect changes in transformed space


Clustering with ART


  • Features of Adaptive Resonance Theory (ART)

    • Adaptive clustering

    • Inspired by neural networks (Carpenter and Grossberg)

  • Useful for change detection and diagnosis

[Figure: typical process data - normal observations (*) and a known operational event (+) in the (X1, X2) plane]


MSSPC - Industrial Validation

  • Case Studies

    • Change in Furnace Feed

    • Valve Leak Malfunction

    • Cold Weather Malfunction

    • Feed Malfunction

  • Event start and end determined with operator input

  • Cannot perform ARL analysis

  • Plot “Missed Alarm Rate” versus “False Alarm Rate” for different detection parameters

  • Better method has smaller missed alarm rate for same number of false alarms


Data - Valve Leak Malfunction

  • Three redundant sensors


Performance - Valve Leak

  • Multiscale methods do better

[Figure: missed alarm rate vs. false alarm rate for ART, PCA, MSART, and MSPCA]


MSART vs. Operator - Valve Leak

  • MSART detects leak ~ 200 minutes before operator

[Figure: normal/abnormal status flagged by MSART and by the operator vs. time step (minutes)]


Data - Cold Weather Event

  • Valve failure due to low ambient temperature

  • Single measured variable


Performance - Cold Weather Event

  • Approximately stationary and Gaussian data

  • MSPCA does best

[Figure: missed alarm rate vs. false alarm rate for ART, MSART, PCA, and MSPCA]


MSSPC - Summary

  • MSSPC provides better average performance for a variety of types and magnitudes of faults

  • Recommended when nature of features representing process change is unknown

  • If type of feature to be detected is known a priori, better to use traditional methods

  • Extensions to reduce the number of user-defined parameters and to use a larger library of basis functions are in progress

  • Bayesian MSSPC can do better, but requires probability of faults


Linear Regression

  • All methods determine a model of the form

    Y = Zb

  • Inputs, Z, may be combined to form latent variables, T, in reduced dimension space (PCA, PLS)

    T = ZP

  • Latent Variable Regression (LVR) model (see the sketch after this list)

    Y = ZPb

  • Ideal method

    • Handles collinear variables

    • Accounts for errors in both input and output variables

    • Integrates regression and filtering

    • Incorporates external information and multiscale behavior
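As a concrete (non-Bayesian) baseline for the LVR form above, a minimal principal component regression in NumPy; the two-component truncation is an illustrative assumption:

```python
import numpy as np

def pcr_fit(Z, Y, n_components=2):
    """Principal component regression: form T = Z P, then regress Y on T."""
    Zc, Yc = Z - Z.mean(axis=0), Y - Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
    P = Vt[:n_components].T                  # loadings (columns span the latent space)
    T = Zc @ P                               # latent variables / scores
    b, *_ = np.linalg.lstsq(T, Yc, rcond=None)
    return P, b                              # prediction: (Z - Z mean) @ P @ b + Y mean
```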



Bayesian PCA and LVR

  • Maximize posterior

    P(T, P, r, b | Z, Y) ∝ P(Z, Y | T, P, b, r) P(T, P, r, b)

  • Approach (a simplified sketch follows this list)

    • Solve conventional regression problem

    • Estimate prior from conventional solution

    • Solve Bayesian regression problem by iterating between

      • Rectification to estimate T, P

      • Parameter estimation to obtain b

  • Assumptions

    • Noise and underlying measurements are Gaussian

    • Regression parameters are Gaussian

    • Rank is known
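For orientation, a minimal sketch of the simplest case of this idea: a MAP estimate for ordinary linear regression with a Gaussian prior on the coefficients. This is a simplification, not the BPCA/BLVR algorithm itself; the prior mean and variances stand in for what the empirical Bayes step would supply.

```python
import numpy as np

def map_regression(Z, y, prior_mean, prior_var, noise_var):
    """MAP estimate of b for y = Z b + e, with prior b ~ N(prior_mean, prior_var * I)."""
    n_vars = Z.shape[1]
    A = Z.T @ Z / noise_var + np.eye(n_vars) / prior_var
    rhs = Z.T @ y / noise_var + prior_mean / prior_var
    return np.linalg.solve(A, rhs)

# Empirical-Bayes flavor: take prior_mean from a conventional least-squares fit,
# then re-estimate the coefficients with the prior included.
```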



BPCA - Example

  • Three correlated variables

    u3 = u1 + u2; u1 ~ N(3,1); u2 ~ N(1,4)

  • Measurements corrupted by additive Gaussian noise

    Z = U + e

  • MSE for 100 realizations

  • Smaller coeffs. MSE for higher dimensional problems

Method   Prior      Input MSE   Coeff. MSE
PCA      uniform    2.72        0.187
MLPCA    uniform    2.11        0.093
BPCA     empirical  1.40        0.092
BPCA     exact      1.22        0.000



BLVR - Example

  • Three correlated variables

    u3 = u1 + u2; u1 ~ N(3,2); u2 ~ N(1,4)

  • Noise-free output

    x = 0.8u1 + 0.8u2

  • Measurements corrupted by additive Gaussian noise

    y = x + ex; Z = U + eu

  • MSE for 100 realizations

Method   Prior      Input MSE   Output MSE   Coeff. MSE
OLS      uniform    1.32        0.66         0.010
PLS      uniform    1.18        0.71         0.012
BLVR     empirical  0.69        0.60         0.007
BLVR     exact      0.66        0.55         0.000


Bayesian Regression - Summary

  • Bayesian approach can improve PCA and LVR without additional data

  • Can deal with

    • Errors in all variables

    • Correlated variables

    • External information

  • Prior knowledge may be obtained from

    • Data being modeled, via empirical Bayes approach

    • Historical data

  • Many opportunities for further work


Data Rectification and Estimation

  • Estimate measured variables and unknown quantities

  • Bayesian problem formulation (a sequential Monte Carlo sketch follows this list)

    Given y1:k = {y1, y2, ..., yk}

    maximize P(xk|y1:k)

    subject to xk = fk-1(xk-1, wk-1) state eqn.

    yk = hk(xk, vk) measurement eqn.

    g1(xk) = 0 equality constr.

    g2(xk) ≥ 0 inequality constr.

  • Existing methods rely on many assumptions
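A minimal sequential Monte Carlo (bootstrap particle filter) sketch of approximating P(xk | y1:k) for a scalar system; the random-walk model and noise levels in the usage comment are illustrative, and the equality/inequality constraints are not handled here:

```python
import numpy as np

def bootstrap_filter(y, f, h, particles, w_std, v_std):
    """Approximate P(x_k | y_1:k) with a resampled cloud of particles."""
    estimates = []
    for yk in y:
        # State equation: propagate particles with process noise w.
        particles = f(particles) + np.random.normal(0.0, w_std, particles.shape)
        # Measurement equation: weight particles by the likelihood of y_k.
        weights = np.exp(-0.5 * ((yk - h(particles)) / v_std) ** 2) + 1e-300
        weights /= weights.sum()
        estimates.append(np.sum(weights * particles))        # posterior mean
        # Resample so that high-likelihood particles survive to the next step.
        idx = np.random.choice(particles.size, particles.size, p=weights)
        particles = particles[idx]
    return np.array(estimates)

# Illustrative use on a random walk observed in noise:
# x_hat = bootstrap_filter(y, f=lambda x: x, h=lambda x: x,
#                          particles=np.random.normal(0.0, 1.0, 500),
#                          w_std=0.1, v_std=0.5)
```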



Existing Methods for NDDR

  • Extended Kalman Filtering (Jazwinski, 1970)

    • Assumes fixed Gaussian distributions,

    • Uses linearized models

    • Cannot satisfy constraints

  • Moving Horizon Estimation (Robertson, Lee and Rawlings, 1996; Rao and Rawlings, 2002)

    • Satisfies constraints

    • Assumes fixed Gaussian distributions

    • Computationally expensive due to non-recursive solution

  • Existing methods solve the convenient NDDR problem, not the real one

  • Actual probability distributions are infinite dimensional and change in size and shape


Evolution of Probability Distributions

  • Evolution of the posterior for a popular adiabatic CSTR benchmark

  • Gaussian approximation is even more inaccurate with constraints


Results of CSTR Example

  • Perfect initial guess

  • 100 realizations, 1600 measurements/realization, 500 samples/realization

  • Work in progress

  • Relevant to model predictive control, Bayesian neural networks, etc.


Rectification without Accurate Models

  • Most processes are dynamic but lack accurate models

  • Wavelet representation captures dynamics in the variation of variance across scales

    w(m) ~ N(0, Pdm); Ax(m) = Hm; Bx(m) = Gm

  • Rectify coefficients at each scale (Bakshi et al., 2001); a simplified sketch follows this list

    d̂m = Km (C^T Rm^(-1) dm + Pdm^(-1) mdm)

  • Features of multiscale approach

    • More accurate than single-scale approaches

    • More computationally efficient, since scales with little information can be identified before rectification
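A minimal sketch of the scale-by-scale idea: shrink the wavelet coefficients of a measured variable toward a per-scale Gaussian prior and reconstruct. The per-scale prior means and variances are placeholders, and this is a simplification of, not a reproduction of, the Bakshi et al. (2001) algorithm.

```python
import numpy as np
import pywt

def multiscale_rectify(y, prior_means, prior_vars, noise_var, wavelet='haar', level=3):
    """Bayesian shrinkage of wavelet coefficients at each scale, then reconstruction."""
    coeffs = pywt.wavedec(y, wavelet, level=level)   # [scaled, d_level, ..., d_1]
    rectified = []
    for c, mu, pv in zip(coeffs, prior_means, prior_vars):
        gain = pv / (pv + noise_var)                 # MAP combination of prior and data
        rectified.append(mu + gain * (c - mu))
    return pywt.waverec(rectified, wavelet)[:len(y)]
```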


Example

  • Level control process (Bellingham and Lees, 1977); simulated in the sketch after the table below

    [hk+1]   [0.995  -0.1373] [hk]   [0.00012  0] [F3k]
    [xk+1] = [  0       1   ] [xk] + [   0     1] [ek ]

  • F3k and ek are iid Gaussian

  • [hk xk F3k ek] are corrupted by iid Gaussian noise

Method              Model         MSE
None                None          1.00
Max. Likelihood     Steady state  0.67
Single scale Bayes  Steady state  0.40
Multiscale Bayes    Steady state  0.06
Single scale Bayes  Dynamic       0.05
Multiscale Bayes    Dynamic       0.03
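A short simulation of this state-space model; the input and measurement noise standard deviations are assumed for illustration, not taken from the talk:

```python
import numpy as np

A = np.array([[0.995, -0.1373],
              [0.0,    1.0   ]])
B = np.array([[0.00012, 0.0],
              [0.0,     1.0]])

n_steps, state = 500, np.zeros(2)            # state = [h, x]
levels = np.empty(n_steps)
for k in range(n_steps):
    u = np.random.normal(0.0, 1.0, 2)        # [F3_k, e_k], iid Gaussian (assumed std)
    state = A @ state + B @ u
    levels[k] = state[0]
measured = levels + np.random.normal(0.0, 0.05, n_steps)   # assumed measurement noise
```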


Data Rectification - Summary

  • Existing approaches to nonlinear estimation and rectification require assumptions

    • Gaussian noise, prior

    • Non-time varying distributions

  • Assumptions are readily violated

  • Proposed approach relies on Monte Carlo sampling

    • More accurate than existing methods

    • Computationally less expensive than MHE

  • Many opportunities for further work


Summary

  • Large amounts of measured data and process knowledge are available

  • Existing methods do not make the most of available data and knowledge

    • Processes are multiscale, but methods are single-scale

    • Fundamental models and partial knowledge are underutilized

  • Developed new multiscale and Bayesian methods for,

    • Fault detection and diagnosis,

    • Dynamic data rectification, and

    • Empirical modeling

  • Significant opportunities for future research and applications


Future Work

  • Nonlinear dynamic data rectification

  • Bayesian nonlinear regression/neural networks

  • Estimation of multirate systems and missing data

  • Integrated rectification, monitoring, diagnosis, and supervision

  • Bioinformatics and genomics

  • Process scale-up


Acknowledgments

  • Graduate students and post-docs

    • Prof. Sridhar Ungarala

    • Dr. Hrishikesh Aradhye

    • Dr. Mohamed Nounou

    • Mr. Wen-Shiang Chen

  • Collaborators

    • Prof. Prem K. Goel

    • Dr. Manabu Kano

    • Prof. Xiaotong Shen

  • Financial Support

    • National Science Foundation (CTS 9733627)

    • Abnormal Situation Management Consortium

    • Du Pont Education Fund

    • Technical Association of Pulp and Paper Industry

    • American Chemical Society - Petroleum Research Fund
