
Process Monitoring with Supervised Learning and Artificial Contrasts

Wookyeon Hwang, Univ. of South Carolina; George Runger, Industrial, Systems, and Operations Engineering, School of Computing, Informatics, and Decision Systems Engineering, Arizona State University; Eugene Tuv, Intel.


Presentation Transcript


  1. Process Monitoring with Supervised Learning and Artificial Contrasts. Wookyeon Hwang, Univ. of South Carolina; George Runger, Industrial, Systems, and Operations Engineering, School of Computing, Informatics, and Decision Systems Engineering, Arizona State University; Eugene Tuv, Intel. runger@asu.edu

  2. Statistical Process Control / Anomaly Detection. The objective is to detect change in a system: transportation, environmental, security, health, manufacturing processes, etc. The modern approach leverages massive data: continuous and categorical variables, missing values, outliers, nonlinear relationships. The goal is a widely applicable, flexible method: normal conditions and fault type unknown; capture relationships between multiple variables; learn patterns, exploit patterns. Traditional Hotelling's T² captures structure, provides a control region (boundary), and quantifies false alarms.

  3. Traditional Monitoring. The traditional approach is Hotelling's (1947) T² chart: numerical measurements, based on multivariate normality; a simple elliptical pattern (Mahalanobis distance). Time-weighted extensions, the exponentially weighted moving average (EWMA) and the cumulative sum (CUSUM), are more efficient but monitor the same elliptical patterns.
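For concreteness, a minimal sketch of the T² statistic in Python, assuming the in-control mean and covariance are estimated from a reference sample and using the large-sample chi-square control limit:

```python
import numpy as np
from scipy import stats

def hotelling_t2(X_ref, X_new, alpha=0.05):
    """Hotelling's T^2 for each new observation, with a chi-square
    control limit (large-sample approximation)."""
    mu = X_ref.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X_ref, rowvar=False))
    d = X_new - mu
    t2 = np.einsum('ij,jk,ik->i', d, S_inv, d)  # row-wise d' S^-1 d
    limit = stats.chi2.ppf(1 - alpha, df=X_ref.shape[1])
    return t2, t2 > limit  # statistic and out-of-control flags
```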

  4. Transform to Supervised Learning. • Process monitoring can be transformed into a supervised learning problem. • One approach: supplement the process data with artificial, contrasting data. • Any one of multiple learners can be used, without pre-specified faults. • The results generalize monitoring in several directions, such as arbitrary (nonlinear) in-control conditions, fault knowledge, and categorical variables. • High-dimensional problems can be handled with an appropriate learner.

  5. Learn Process Patterns. Learn the pattern compared to a "structureless" alternative: generate noise, artificial data without structure to differentiate. For example, take $f(x) = f_1(x_1) f_2(x_2) \cdots f_p(x_p)$, the joint distribution as the product of its marginals (enforcing independence), or $f(x)$ as a product of uniforms. Define and assign $y = \pm 1$ to the "actual" and "artificial" data (the artificial contrast), then use a supervised (classification) learner to distinguish the two data sets. Only simple examples are used here; a sketch of the contrast generation follows.
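A minimal sketch of both generation schemes, assuming independent column permutation as a stand-in for the product of empirical marginals (the function name is illustrative):

```python
import numpy as np

def artificial_contrast(X, method="permute", rng=None):
    """Generate structureless contrast data for X.
    'permute' shuffles each column independently (product of empirical
    marginals); 'uniform' samples each column uniformly over its range."""
    rng = np.random.default_rng(rng)
    if method == "permute":
        Z = np.column_stack([rng.permutation(col) for col in X.T])
    else:
        Z = rng.uniform(X.min(axis=0), X.max(axis=0), size=X.shape)
    data = np.vstack([X, Z])
    labels = np.r_[np.ones(len(X)), -np.ones(len(Z))]  # +1 actual, -1 artificial
    return data, labels
```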

  6. Learn Pattern from Artificial Contrast. [figure]

  7. Regularized Least Squares (Kernel Ridge) Classifier with Radial Basis Functions. Model with a linear combination of basis functions; a smoothness penalty controls complexity. Closely related to support vector machines (SVMs): regularized least squares gives up sparsity in exchange for a closed-form solution (a trade one may not want to make!). The previous example is a challenge for a generalized learner: multivariate normal data!

  8. RLS Classifier. Minimize $\frac{1}{n}\sum_{i=1}^{n}\big(y_i - f(x_i)\big)^2 + \gamma\,\lVert f\rVert_K^2$, where $f(x) = \sum_{i=1}^{n} c_i K(x_i, x)$ with Gaussian kernel $K(x_i, x) = \exp(-\lVert x_i - x\rVert^2/\sigma^2)$ and parameters $\gamma, \sigma$. Solution: $c = (K + \gamma n I)^{-1} y$.
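A minimal sketch of the closed-form solution, assuming the formulation above:

```python
import numpy as np

def rlsc_fit(X, y, gamma=1e-3, sigma2=5.0):
    """Closed-form RLS classifier: solve (K + gamma*n*I) c = y
    with Gaussian kernel K(x, x') = exp(-||x - x'||^2 / sigma2)."""
    n = len(X)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    K = np.exp(-sq / sigma2)
    return np.linalg.solve(K + gamma * n * np.eye(n), y)

def rlsc_predict(X_train, c, X_new, sigma2=5.0):
    """Sign of f(x) = sum_i c_i K(x_i, x) assigns +/-1."""
    sq = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    return np.sign(np.exp(-sq / sigma2) @ c)
```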

  9. Patterns Learned from Artificial Contrast (RLSC). The true Hotelling's 95% probability bound is shown; red: the learned contour function used to assign ±1. Actual: n = 1000; artificial: n = 2000; complexity: 4/3000; σ² = 5.

  10. More Challenging Example with Hotelling's Contour.

  11. Patterns Learned from Artificial Contrast (RLSC). Actual: n = 1000; artificial: n = 2000; complexity: 4/3000; σ² = 5.

  12. Patterns Learned from Artificial Contrast (RLSC). Actual: n = 1000; artificial: n = 1000; complexity: 4/2000; σ² = 5.

  13. RLSC for p = 10 Dimensions. [figure]

  14. Tree-Based Ensembles, p = 10. • An alternative learner that: works with mixed data; elegantly handles missing data; is scale invariant; is outlier resistant; is insensitive to extraneous predictors. • Provides an implicit ability to select key variables (see the sketch below).
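A minimal sketch of the ensemble version, assuming scikit-learn's RandomForestClassifier and an illustrative 10-dimensional example (two correlated variables plus eight noise variables; the 0.8 correlation is an assumption):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# 10-d in-control data: x1, x2 correlated, eight noise variables
cov = np.eye(10); cov[0, 1] = cov[1, 0] = 0.8
X = rng.multivariate_normal(np.zeros(10), cov, size=1000)

# Artificial contrast: permute each column to break the structure
Z = np.column_stack([rng.permutation(col) for col in X.T])
data = np.vstack([X, Z])
labels = np.r_[np.ones(1000), -np.ones(1000)]

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(data, labels)
# Implicit variable selection: importances concentrate on x1, x2
print(forest.feature_importances_)
```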

  15. Nonlinear Patterns. • Hotelling's boundary is not a good solution when the patterns are nonlinear. • Control boundaries from supervised learning capture the normal operating condition.

  16. Tuned Control. • Extend to incorporate specific process knowledge of faults. • Artificial contrasts are generated from the specified fault distribution, or from a mixture of samples from different fault distributions. • Numerical optimization to design a control statistic that maximizes the likelihood function under a specified fault (alternative) can be very complicated.

  17. Tuned Control. Fault: the means of both variables x1 and x2 are known to increase. Artificial data (black) are sampled from 12 independent normal distributions whose mean vectors are selected from a grid over the area [0, 3] × [0, 3]. The learned control region (right panel) approximately matches the theoretical result in Testik et al. (2004). A sampling sketch follows.
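A minimal sketch of this tuned contrast generation; the particular 12-point lattice below is an assumption, since the slide specifies only a grid over [0, 3] × [0, 3]:

```python
import numpy as np

def tuned_contrast(n_per_fault, rng=None):
    """Sample tuned artificial data from a mixture of fault distributions:
    independent normals N(mu, I) with mean vectors mu on a 12-point grid
    over [0, 3] x [0, 3] (the exact grid is an assumption)."""
    rng = np.random.default_rng(rng)
    grid = [(a, b) for a in np.linspace(0, 3, 4) for b in np.linspace(0, 3, 3)]
    return np.vstack([rng.normal(loc=mu, scale=1.0, size=(n_per_fault, 2))
                      for mu in grid])
```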

  18. Incorporate Time-Weighted Rules. • What form of statistic should be filtered and monitored? The log likelihood ratio. • Some learners provide class probability estimates $\hat p(y \mid x_t)$. • Bayes' theorem (for equal sample sizes) equates the density ratio to the posterior ratio, so the log likelihood ratio for an observation $x_t$ is estimated as $l_t = \log\!\big[\hat p(y = +1 \mid x_t)/(1 - \hat p(y = +1 \mid x_t))\big]$. • Apply an EWMA (or CUSUM, etc.) to $l_t$.
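A minimal sketch of the filter, assuming the classifier's estimated P(y = +1 | x_t) values are available (e.g., from predict_proba in the forest sketch above):

```python
import numpy as np

def ewma_llr(prob_actual, lam=0.1, eps=1e-6):
    """EWMA of the estimated log likelihood ratio l_t.
    prob_actual: estimated P(y = +1 | x_t) for each observation."""
    p = np.clip(np.asarray(prob_actual, dtype=float), eps, 1 - eps)
    l = np.log(p / (1 - p))          # estimated log likelihood ratio
    z = np.empty_like(l)
    z[0] = l[0]
    for t in range(1, len(l)):       # z_t = lam*l_t + (1-lam)*z_{t-1}
        z[t] = lam * l[t] + (1 - lam) * z[t - 1]
    return z
```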

  19. Time-Weighted ARLs. • Average run lengths (ARLs) for selected schemes applied to the $l_t$ statistic. • 10-dimensional, independent normal data.

  20. Example: 50 Dimensions. [figure]

  21. Example: 50 Dimensions. Hotelling's: left; artificial contrast: right.

  22. Example: Credit Data (UCI). 20 attributes: 7 numerical and 13 categorical, with an associated class label of "good" or "bad" credit risk. Artificial data are generated independently for each attribute, from continuous and discrete uniform distributions for the numerical and categorical attributes, respectively. The data are ordered as 300 "good" instances followed by 300 "bad". A generation sketch follows.
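A minimal sketch of this mixed-data contrast generation, assuming the attributes arrive as a pandas DataFrame:

```python
import numpy as np
import pandas as pd

def mixed_contrast(df, rng=None):
    """Structureless contrast for mixed data: continuous uniform over each
    numerical attribute's range, discrete uniform over each categorical
    attribute's observed levels, independently per column."""
    rng = np.random.default_rng(rng)
    out = {}
    for col in df.columns:
        s = df[col]
        if pd.api.types.is_numeric_dtype(s):
            out[col] = rng.uniform(s.min(), s.max(), size=len(s))
        else:
            out[col] = rng.choice(s.unique(), size=len(s))
    return pd.DataFrame(out)
```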

  23. Artificial Contrasts for Credit Data. Plot of $l_t$ over time.

  24. Diagnostics: Contribution Plots. • 50 dimensions: 2 contributors, 48 noise variables (scatter plot projections onto the contributor variables).

  25. Contributor Plots from PCA T². [figure]

  26. Contributor Plots from PCA SPE. [figure]

  27. Contributor Plots from Artificial Contrast Ensemble (ACE). • Impurity importance weighted by the Δ-means (change in means) of the split variable. A sketch of one interpretation follows.
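A minimal sketch of one possible reading of that weighting; the standardized mean-shift factor here is an assumption for illustration, not the authors' exact formula:

```python
import numpy as np

def ace_contributions(forest, X_ref, x_new):
    """Hypothetical contributor score: impurity-based importance of each
    variable, weighted by the standardized shift of the new observation
    from the reference mean (one reading of 'impurity importance weighted
    by the change in means of the split variable')."""
    shift = np.abs(x_new - X_ref.mean(axis=0)) / X_ref.std(axis=0)
    return forest.feature_importances_ * shift
```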

  28. Contributor Plots for Nonlinear System. • Contributor plots from SPE, T², and ACE (left, center, right, respectively).

  29. Conclusions. One can (and must) leverage the automated, ubiquitous data and computational environment, or risk professional obsolescence. Employ a flexible, powerful control solution for broad applications: environment, health, security, etc., as well as manufacturing, where "normal" sensor behavior is not obvious and patterns are not known. Include automated diagnosis: tools that filter to identify contributors, with computational feasibility in embedded software. This material is based upon work supported by the National Science Foundation under Grant No. 0355575.
