
Making the Most of Process Information via Multiscale and Bayesian Methods

Bhavik R. Bakshi

Department of Chemical Engineering

Ohio State University

Columbus, OH 43210

CPACT Conference, Edinburgh, April 25-26, 2002

Overview of Research Group
  • Goal: Develop tools and techniques for efficient and sustainable process engineering
  • Projects focus on process and global scales
  • Process scale
    • Multiscale and Bayesian methods for extracting knowledge from process data
  • Global scale
    • Economically and ecologically conscious process engineering
  • Develop rigorous and systematic methods and explore their applications
Motivation for Multiscale and Bayesian Methods
  • Processes and data are usually multiscale in nature
    • Events and features at multiple scales
    • Multirate measurements
    • Autocorrelated stochastic processes
  • Variety of process knowledge and information available
    • Measured data
    • Fundamental, empirical or heuristic knowledge
  • Single-scale and non-Bayesian methods lead to
    • Inferior analysis and modeling
    • Inefficient computation and use of available information
    • Disintegrated operation
  • Multiscale and Bayesian methods can perform better
Multiphase Flow
  • Flow regimes in fluidized bed
  • Partial models and data are available for each regime

[Figure: intensity vs. time traces for homogeneous, heterogeneous, and slug flow regimes in a fluidized bed]

Sheet and Film Manufacturing
  • Different sampling interval in each channel
  • Dynamic models are also available

[Figure: scanning sensor traversing the sheet in the cross direction as the sheet moves in the machine direction]

Chemical Process Operation

[Figure: process operation hierarchy - Planning, Scheduling, Supervisory Control, Monitoring and Diagnosis, Regulatory Control, Data Acquisition, Process]

  • Efficient operation requires reasoning at different scales
  • Process data and knowledge are available
Objectives
  • Develop methods for efficient process operation that can exploit
    • Multiscale nature of processes
    • All available process data and knowledge
  • Focus on the following tasks
    • Process Monitoring
    • Fault Diagnosis
    • Empirical Modeling
    • Data Rectification and Estimation
    • Analysis of complex chemical and biological systems
  • Integrate process operation tasks
Outline
  • Introduction to
    • Bayesian methods
    • Wavelet analysis
  • General Approach for Multiscale Methods
  • Fault Detection and Diagnosis
    • MSPCA, MSART
  • Empirical Modeling
    • Bayesian PCA, Bayesian Latent Variable Regression
  • Dynamic Data Rectification
    • Linear systems with and without accurate models
    • Nonlinear systems
  • Approaches are general and broadly applicable to a variety of modeling and analysis tasks
Bayesian Estimation

[Figure: Bayesian estimation cycle - prior knowledge P(H) (current belief) is combined with information from the data P(D|H) (new information) to give the posterior P(H|D) (new belief); a loss function selects the Bayesian estimate Ĥ from the posterior. Inset: Rev. Thomas Bayes, 1702-1761]

  • Statistical framework for combining prior knowledge with empirical observations
  • Posterior becomes prior at next time
  • Bayes Rule: P(H | D) = P(D | H) P(H) / P(D)
Illustration of Bayesian Estimation
  • P(H|D) → 1 as t → ∞

[Figure: repeated Bayesian updating - at each time t = 1, 2, 3, ... the data combine with the prior to give the posterior, which becomes the prior at the next time]

  • A newborn baby sees the sun setting and wonders, “Will it be back?” (Malakoff, 1999)
  • Prior knowledge: sun may or may not rise, P(H) = 0.5
  • Data obtained everyday = Sun rises
  • Posterior at t=k becomes prior at t=k+1
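The updating loop above can be sketched in a few lines of Python. A Beta-Bernoulli model is assumed here (the slide does not specify the form of the prior); Beta(1, 1) encodes the initial P(H) = 0.5, and each observed sunrise moves the posterior mean toward 1.

```python
# Minimal sketch of the sunrise illustration, assuming a Beta-Bernoulli model.
# Beta(1, 1) is a uniform prior: "the sun may or may not rise", P(H) = 0.5.
alpha, beta = 1.0, 1.0

for day in range(1, 11):
    sun_rose = True                      # data obtained every day: the sun rises
    if sun_rose:
        alpha += 1.0                     # conjugate update of the Beta posterior
    else:
        beta += 1.0
    # Yesterday's posterior is today's prior; its mean approaches 1 as t grows.
    print(f"day {day}: P(sun rises tomorrow) = {alpha / (alpha + beta):.3f}")
```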
Challenges in Bayesian Analysis
  • Need distributions for prior and likelihood
  • Bad prior can give slow convergence and misleading answer
  • Gaussian densities are mathematically convenient but may not represent reality
  • Can be computationally expensive, particularly for non-Gaussian densities
  • Potential solutions
    • Use Empirical Bayes methods - estimate prior from measured data
    • Combine Bayesian analysis with Multiscale analysis
    • Markov Chain Monte Carlo methods
Multiscale Nature of Variables

[Figure: a process signal (time 0-100) containing equipment degradation, sensor failures, noise, a disturbance, and an equipment failure, alongside the time-frequency (t, w) localization of four representations: delta functions, the Fourier transform, linear filters, and the wavelet transform]

Wavelets

[Figure: Haar scaling function φ(x) and Haar wavelet ψ(x), and Daubechies-6 scaling function and wavelet, shown at dilations and translations m=1, k=0 and m=2, k=4]

  • Family of basis functions of fixed shape
  • Translations and dilations of mother wavelet

ψmk(x) = 2^(-m/2) ψ(2^(-m) x - k), where m and k are integers
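A small sketch of this formula, using the Haar mother wavelet and the scale convention on the slide (larger m gives a wider, coarser wavelet); numpy is assumed.

```python
import numpy as np

def haar_mother(x):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    return np.where((x >= 0) & (x < 0.5), 1.0,
                    np.where((x >= 0.5) & (x < 1.0), -1.0, 0.0))

def psi_mk(x, m, k):
    """Dilated and translated wavelet: psi_mk(x) = 2^(-m/2) psi(2^(-m) x - k)."""
    return 2.0 ** (-m / 2.0) * haar_mother(2.0 ** (-m) * x - k)

x = np.linspace(0.0, 24.0, 2048)
w_m1_k0 = psi_mk(x, m=1, k=0)    # supported on [0, 2), as on the slide
w_m2_k4 = psi_mk(x, m=2, k=4)    # wider and shifted: supported on [16, 20)
```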

Wavelet Decomposition

[Figure: wavelet filter bank - the original signal (m=0) is passed repeatedly through the low-pass filter H and high-pass filter G, producing scaled signals ym and wavelet/detail signals dm at scales m=1, 2, ...]
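The H/G filter cascade in the figure corresponds to a standard discrete wavelet decomposition. A minimal sketch using the PyWavelets package (an assumption; any filter-bank implementation would do):

```python
import numpy as np
import pywt  # PyWavelets

# A synthetic "original signal" (m = 0): drift plus noise plus a step change.
t = np.arange(256)
y0 = 0.01 * t + np.random.normal(0.0, 0.5, 256)
y0[128:] += 3.0

# Two passes through the low-pass (H) and high-pass (G) filters.
# wavedec returns [last scaled signal y_2, detail signal d_2, detail signal d_1].
y2, d2, d1 = pywt.wavedec(y0, 'haar', level=2)
print(len(y0), len(d1), len(d2), len(y2))    # each level halves the length
```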

Properties of Wavelets
  • Represents signals and functions as
    • y(t) = Σm Σk dmk ψmk(t) + Σk yLk φLk(t)
  • Localized in time and frequency
    • Deterministic features are captured by few large coefficients
  • Approximate eigenfunctions
    • Stochastic processes are approximately decorrelated
  • Can be orthonormal
    • Fast computation, O(N)
  • Extended to libraries of basis functions
    • Wavelet packets, cosine packets, etc.
Multiscale Feature Extraction

[Figure: original signal, wavelet coefficients at scales m=1, 2, 3, scaled coefficients at m=3, and the signal reconstructed after thresholding]
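A sketch of the threshold-and-reconstruct step, again using PyWavelets; the universal (median-based) threshold is an assumption, since the slide does not specify a thresholding rule.

```python
import numpy as np
import pywt

def extract_features(signal, wavelet='haar', level=3):
    """Keep only the few large wavelet coefficients and reconstruct the signal."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Noise scale estimated from the finest detail coefficients (assumed rule).
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2.0 * np.log(len(signal)))
    kept = [coeffs[0]] + [pywt.threshold(d, thresh, mode='hard') for d in coeffs[1:]]
    return pywt.waverec(kept, wavelet)
```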

Analysis of Stochastic Processes
  • Wavelet coefficients are approximately uncorrelated and Gaussian

[Figure: an ARIMA signal, its autocorrelation function (ACF), and probability density (PDF) shown for the original signal y0, the wavelet coefficients d1 and d2, and the last scaled signal y2]

Process Operation Tasks
  • Process Monitoring / Fault Detection
    • Detect abnormal operation from measured data
  • Empirical Modeling
    • Determine relationship between variables based on measured data
  • Data Rectification
    • Clean measured data by removing errors and satisfying process models
General Multiscale Methodology

[Figure: the data X (and parameters θ) are decomposed by the wavelet transform W into detail signals G1X (fine) through GmX and the last scaled signal HLX (coarse); the operation is applied to the coefficients at each scale, and the results are recombined by the inverse transform to give X̂ and θ̂]

  • Convert traditional to multiscale methods (Bakshi, 1999)
  • Can use models at each scale and across scales
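A minimal sketch of this recipe: decompose, apply the single-scale operation to the coefficients at every scale, and recombine. The `operate` callable is a placeholder for whichever traditional method is being converted (filtering, control limits, rectification, ...); cross-scale models are not shown.

```python
import pywt

def multiscale_apply(x, operate, wavelet='haar', level=3):
    """Convert a traditional single-scale method into a multiscale one:
    wavelet-decompose x, operate on each scale, and reconstruct the result.
    `operate(coeffs, scale)` must return an array of the same shape."""
    coeffs = pywt.wavedec(x, wavelet, level=level)           # [H_L x, G_L x, ..., G_1 x]
    processed = [operate(c, m) for m, c in enumerate(coeffs)]
    return pywt.waverec(processed, wavelet)                   # recombine across scales

# Example: the identity operation returns (approximately) the original signal.
# x_hat = multiscale_apply(x, lambda c, m: c)
```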
Multiscale Statistical Process Control (Bakshi, 1998; Aradhye et al., 2000a, b)
  • SPC detects abnormal behavior from measured data
  • Lacks generality, best for certain types of changes
    • Shewhart charts for large shifts
    • CUSUM, EWMA for small shifts
  • Assumes uncorrelated measurements
  • Multivariate SPC reduces dimensionality by linear or nonlinear modeling
  • Normal and abnormal behavior usually occur at different scales
    • MSSPC should perform better
Detecting Mean Shift by MSSPC

[Figure: uncorrelated data (time 0-140) with a mean shift, its wavelet decomposition (W), detection limits at each scale, and the reconstructed signal (WT); coefficients that cross the limits flag the shift]

  • Uncorrelated data with mean shift of 2σ
  • First shift detection at scale m=2
  • Current shift detection in last scaled signal
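A simplified sketch of the detection step: wavelet-decompose the data and apply Shewhart-type limits, estimated from in-control reference data, to the coefficients at each scale. The 3-sigma limit width and the omission of the reconstruction step are simplifications.

```python
import numpy as np
import pywt

def msspc_alarms(signal, reference, wavelet='haar', level=2, k=3.0):
    """Flag wavelet coefficients that violate +/- k sigma limits at each scale."""
    ref = pywt.wavedec(reference, wavelet, level=level)   # in-control data
    new = pywt.wavedec(signal, wavelet, level=level)      # data being monitored
    alarms = {}
    for m, (rc, nc) in enumerate(zip(ref, new)):
        center, limit = np.mean(rc), k * np.std(rc)
        alarms[m] = np.where(np.abs(nc - center) > limit)[0]
    return alarms   # dict: scale index -> indices of violating coefficients
```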
Example of Univariate MSSPC

[Figure: control charts for the same data - conventional SPC with fixed limits vs. MSSPC with limits that adapt to signal features]

  • Mean shift of size 5 in iid Gaussian measurements
  • MSSPC detection limits adapt to signal features
General Framework for SPC
  • Existing SPC filters operate at different fixed scales
  • MSSPC subsumes existing methods

[Figure: time-scale view of SPC filters - Shewhart, moving average (MA), EWMA, and CUSUM each operate at a fixed scale, spanned by Haar and boundary-corrected Daubechies-4 wavelets]

Library of MSSPC Filters

[Figure: library of MSSPC filters, including moving average, CUSUM, and Shewhart filters]

Multivariate SPC

[Figure: scatter plot of x1 vs. x2 showing a normal-operation cluster (*) with principal components PC1 and PC2, and an abnormal cluster (+)]

  • Univariate charts are inconvenient for multivariate tasks
  • Multivariate modeling reduces dimensionality
    • Linear modeling (PCA, PLS)
    • Nonlinear (clustering, NLPCA)
  • Detect changes in transformed space
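A sketch of detection in the transformed space using PCA and Hotelling's T² (the Q/SPE statistic and formal control limits from F or chi-square distributions are omitted); numpy is assumed.

```python
import numpy as np

def fit_pca_monitor(X_normal, n_pc=2):
    """Fit PCA to normal operating data; return mean, loadings, score variances."""
    mu = X_normal.mean(axis=0)
    _, s, Vt = np.linalg.svd(X_normal - mu, full_matrices=False)
    P = Vt[:n_pc].T                                    # loadings (PC1, PC2, ...)
    score_var = s[:n_pc] ** 2 / (len(X_normal) - 1)    # variance of each score
    return mu, P, score_var

def t2_statistic(x_new, mu, P, score_var):
    """Hotelling's T^2 of a new sample in the reduced space; large => abnormal."""
    t = (x_new - mu) @ P
    return float(np.sum(t ** 2 / score_var))
```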
Clustering with ART


  • Features of Adaptive Resonance Theory (ART)
    • Adaptive clustering
    • Inspired by neural networks (Carpenter and Grossberg)
  • Useful for change detection and diagnosis

[Figure: typical process data - scatter plot of X1 vs. X2 with a normal cluster (*) and a cluster (+) corresponding to a known operational event]

MSSPC - Industrial Validation
  • Case Studies
    • Change in Furnace Feed
    • Valve Leak Malfunction
    • Cold Weather Malfunction
    • Feed Malfunction
  • Event start and end determined with operator input
  • Cannot perform ARL analysis
  • Plot “Missed Alarm Rate” versus “False Alarm Rate” for different detection parameters
  • Better method has smaller missed alarm rate for same number of false alarms
Data - Valve Leak Malfunction
  • Three redundant sensors
Performance - Valve Leak
  • Multiscale methods do better

[Figure: missed alarm rate vs. false alarm rate for the valve leak; the MSART and MSPCA curves lie below the ART and PCA curves]

MSART vs. Operator - Valve Leak
  • MSART detects leak ~ 200 minutes before operator

[Figure: normal/abnormal classification vs. time step (minutes); MSART flags the abnormality well before the operator does]

Data - Cold Weather Event
  • Valve failure due to low ambient temperature
  • Single measured variable
Performance - Cold Weather Event
  • Approximately stationary and Gaussian data
  • MSPCA does best

[Figure: missed alarm rate vs. false alarm rate for the cold weather event for ART, MSART, PCA, and MSPCA; MSPCA gives the lowest curve]

MSSPC - Summary
  • MSSPC provides better average performance for a variety of types and magnitudes of faults
  • Recommended when nature of features representing process change is unknown
  • If type of feature to be detected is known a priori, better to use traditional methods
  • Extensions to reduce user-defined parameters and to use a larger library of basis functions are in progress
  • Bayesian MSSPC can do better, but requires prior probabilities of faults
Linear Regression
  • All methods determine a model of the form

Ŷ = Z b̂

  • Inputs, Z, may be combined to form latent variables, T, in a reduced-dimension space (PCA, PLS)

T̂ = Z P̂

  • Latent Variable Regression (LVR) model

Ŷ = Z P̂ b̂

  • Ideal method
    • Handles collinear variables
    • Accounts for errors in both input and output variables
    • Integrates regression and filtering
    • Incorporates external information and multiscale behavior
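Principal component regression is one concrete instance of the LVR model above (PLS chooses the projection P differently). A minimal numpy sketch, assuming mean-centered treatment of Z and Y:

```python
import numpy as np

def lvr_fit(Z, Y, n_latent=2):
    """Latent variable regression via principal components of the inputs:
    T = Z P (scores), then least squares of Y on T, so Y_hat = Z P b."""
    z_mean, y_mean = Z.mean(axis=0), Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Z - z_mean, full_matrices=False)
    P = Vt[:n_latent].T                              # projection to latent space
    T = (Z - z_mean) @ P                             # latent variables
    b, *_ = np.linalg.lstsq(T, Y - y_mean, rcond=None)
    return z_mean, y_mean, P, b

def lvr_predict(Z_new, z_mean, y_mean, P, b):
    return (Z_new - z_mean) @ P @ b + y_mean
```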

Bayesian PCA and LVR
  • Maximize posterior

P(T, P, r, b | Z, Y) ∝ P(Z, Y | T, P, b, r) P(T, P, r, b)

  • Approach
    • Solve conventional regression problem
    • Estimate prior from conventional solution
    • Solve Bayesian regression problem by iterating between
      • Rectification to estimate T, P
      • Parameter estimation to obtain b
  • Assumptions
    • Noise and underlying measurements are Gaussian
    • Regression parameters are Gaussian
    • Rank is known
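The full BPCA/BLVR iteration is more involved, but the "maximize posterior" idea behind the parameter-estimation step can be illustrated with a MAP estimate of regression coefficients under the Gaussian assumptions above. The closed form below is only that simplified piece, not the method itself.

```python
import numpy as np

def map_coefficients(Z, y, prior_mean, prior_var, noise_var):
    """MAP estimate of b for y ~ N(Z b, noise_var I), b ~ N(prior_mean, prior_var I).
    With Gaussian prior and likelihood the posterior mode has a closed form."""
    d = Z.shape[1]
    A = Z.T @ Z / noise_var + np.eye(d) / prior_var          # posterior precision
    rhs = Z.T @ y / noise_var + np.asarray(prior_mean) / prior_var
    return np.linalg.solve(A, rhs)
```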


BPCA - Example
  • Three correlated variables

u3 = u1 + u2; u1 ~ N(3,1); u2 ~ N(1,4)

  • Measurements corrupted by additive Gaussian noise

Z = U + e

  • MSE for 100 realizations
  • Smaller coeffs. MSE for higher dimensional problems

Method   Prior      Inputs   Coeffs.
PCA      uniform    2.72     0.187
MLPCA    uniform    2.11     0.093
BPCA     empirical  1.40     0.092
BPCA     exact      1.22     0.000


BLVR - Example

  • Three correlated variables

u3 = u1 + u2; u1 ~ N(3,2); u2 ~ N(1,4)

  • Noise-free output

x = 0.8u1 + 0.8u2

  • Measurements corrupted by additive Gaussian noise

y = x + ex; Z = U + eu

  • MSE for 100 realizations

Method   Prior      Inputs   Outputs   Coeffs.
OLS      uniform    1.32     0.66      0.010
PLS      uniform    1.18     0.71      0.012
BLVR     empirical  0.69     0.60      0.007
BLVR     exact      0.66     0.55      0.000

Bayesian Regression - Summary
  • Bayesian approach can improve PCA and LVR without additional data
  • Can deal with
    • Errors in all variables
    • Correlated variables
    • External information
  • Prior knowledge may be obtained from
    • Data being modeled, via empirical Bayes approach
    • Historical data
  • Many opportunities for further work
Data Rectification and Estimation
  • Estimate measured variables and unknown quantities
  • Bayesian problem formulation

Given y1:k = {y1, y2, ..., yk}

maximize P(xk|y1:k)

subject to xk = fk-1(xk-1, wk-1) state eqn.

yk = hk(xk, vk) measurement eqn.

g1(xk) = 0 equality constr.

g2(xk) ≥ 0 inequality constr.

  • Existing methods rely on many assumptions
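One way to attack this formulation without Gaussian assumptions is sequential Monte Carlo sampling, which represents the evolving posterior by samples. Below is a minimal bootstrap particle filter sketch for a scalar state that ignores the equality and inequality constraints; the model functions f, h, sample_w, and likelihood are user-supplied assumptions.

```python
import numpy as np

def particle_filter(y_seq, f, h, sample_w, likelihood, x0_particles):
    """Represent P(x_k | y_1:k) by weighted samples, so the posterior may be
    non-Gaussian and change shape over time (scalar state, unconstrained)."""
    particles = np.asarray(x0_particles, dtype=float)
    estimates = []
    for y in y_seq:
        particles = np.array([f(x, sample_w()) for x in particles])    # state eqn.
        weights = np.array([likelihood(y, h(x)) for x in particles])   # measurement eqn.
        weights = weights / weights.sum()
        estimates.append(float(np.sum(weights * particles)))           # posterior mean
        idx = np.random.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]                                     # resample
    return np.array(estimates)
```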


Existing Methods for NDDR
  • Extended Kalman Filtering (Jazwinski, 1970)
    • Assumes fixed Gaussian distributions
    • Uses linearized models
    • Cannot satisfy constraints
  • Moving Horizon Estimation (Robertson, Lee and Rawlings, 1996; Rao and Rawlings, 2002)
    • Satisfies constraints
    • Assumes fixed Gaussian distributions
    • Computationally expensive due to non-recursive solution
  • Existing methods solve the convenient NDDR problem, not the real one
  • Actual probability distributions are infinite dimensional and change in size and shape
Evolution of Probability Distributions
  • Evolution of the posterior for a widely studied adiabatic CSTR
  • Gaussian approximation is even more inaccurate with constraints
Results of CSTR Example
  • Perfect initial guess
  • 100 realizations, 1600 measurements per realization, 500 samples per realization

  • Work in progress
  • Relevant to model predictive control, Bayesian neural networks, etc.
Rectification without Accurate Models

  • Most processes are dynamic but lack accurate models
  • Wavelet representation captures dynamics in the variation of variance across scales

w(m) ~ N(0, P_d,m);  A x(m) = H_m;  B x(m) = G_m

  • Rectify coefficients at each scale (Bakshi et al., 2001)

d̂_m = K_m (C^T R_m^-1 d_m + P_d^-1 m_d)

  • Features of multiscale approach
    • More accurate than single-scale approaches
    • More computationally efficient since scales with less information can be identified before rectification
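A heavily simplified sketch of scale-by-scale rectification when no accurate dynamic model is available: shrink the wavelet coefficients at each scale by a Bayesian factor built from an empirical prior variance. The diagonal shrinkage below is only a stand-in for the full gain K_m above, and the zero prior mean and known noise variance are assumptions.

```python
import numpy as np
import pywt

def multiscale_rectify(y, wavelet='haar', level=3, noise_var=1.0):
    """Rectify a measured signal scale by scale via Bayesian shrinkage of
    its wavelet coefficients toward a zero prior mean."""
    coeffs = pywt.wavedec(y, wavelet, level=level)
    rectified = [coeffs[0]]                            # keep the last scaled signal
    for d in coeffs[1:]:
        prior_var = max(np.var(d) - noise_var, 0.0)    # empirical prior variance at this scale
        rectified.append(d * prior_var / (prior_var + noise_var))
    return pywt.waverec(rectified, wavelet)
```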

Example
  • Level control process (Bellingham and Lees, 1977)

[h_{k+1}; x_{k+1}] = [0.995, -0.1373; 0, 1] [h_k; x_k] + [0.00012, 0; 0, 1] [F3_k; e_k]

  • F3k and ek are iid Gaussian
  • [hk xk F3k ek] are corrupted by iid Gaussian noise
Method              Model         MSE
None                None          1.00
Max. Likelihood     Steady state  0.67
Single-scale Bayes  Steady state  0.40
Multiscale Bayes    Steady state  0.06
Single-scale Bayes  Dynamic       0.05
Multiscale Bayes    Dynamic       0.03
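For reference, the level-control model above can be simulated in a few lines to generate data like that used in the comparison; the measurement noise level and the random seed are assumptions.

```python
import numpy as np

A = np.array([[0.995, -0.1373], [0.0, 1.0]])    # state matrix from the model above
B = np.array([[0.00012, 0.0], [0.0, 1.0]])      # input matrix for [F3_k, e_k]
rng = np.random.default_rng(0)

state = np.zeros(2)                              # [h_k, x_k]
levels = []
for k in range(500):
    inputs = rng.normal(0.0, 1.0, 2)             # iid Gaussian F3_k and e_k
    state = A @ state + B @ inputs
    levels.append(state[0])
measured = np.array(levels) + rng.normal(0.0, 0.1, 500)   # corrupted by iid noise
```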

Data Rectification - Summary
  • Existing approaches to nonlinear estimation and rectification require assumptions
    • Gaussian noise, prior
    • Non-time varying distributions
  • Assumptions are readily violated
  • Proposed approach relies on Monte Carlo sampling
    • More accurate than existing methods
    • Computationally less expensive than MHE
  • Many opportunities for further work
Summary
  • Large amounts of measured data and process knowledge are available
  • Existing methods do not make the most of available data and knowledge
    • Processes are multiscale, but methods are single-scale
    • Fundamental models and partial knowledge are underutilized
  • Developed new multiscale and Bayesian methods for:
    • Fault detection and diagnosis
    • Dynamic data rectification
    • Empirical modeling
  • Significant opportunities for future research and applications
Future Work
  • Nonlinear dynamic data rectification
  • Bayesian nonlinear regression/neural networks
  • Estimation of multirate systems and missing data
  • Integrated rectification, monitoring, diagnosis, and supervision
  • Bioinformatics and genomics
  • Process scale-up
Acknowledgments
  • Graduate students and post-docs
    • Prof. Sridhar Ungarala
    • Dr. Hrishikesh Aradhye
    • Dr. Mohamed Nounou
    • Mr. Wen-Shiang Chen
  • Collaborators
    • Prof. Prem K. Goel
    • Prof. Xiaotong Shen
    • Dr. Manabu Kano
  • Financial Support
    • National Science Foundation (CTS 9733627)
    • Abnormal Situation Management Consortium
    • Du Pont Education Fund
    • Technical Association of Pulp and Paper Industry
    • American Chemical Society - Petroleum Research Fund