The Automatic Explanation of Multivariate Time Series (MTS)

Allan Tucker


The Problem - Data

  • Datasets which are Characteristically:

    • High Dimensional MTS

    • Large Time Lags

    • Changing Dependencies

    • Little or No Available Expert Knowledge


The Problem - Requirement

  • Lack of Algorithms to Assist Users in Explaining Events, where Such Algorithms Must:

    • Model Complex MTS Data

    • Be Learnable from Data with Little or No User Intervention

    • Remain Transparent Throughout the Learning and Explaining Process


Contribution to Knowledge

  • Using a Combination of Evolutionary Programming (EP) and Bayesian Networks (BNs) to Overcome Issues Outlined

  • Extending Learning Algorithms for BNs to Dynamic Bayesian Networks (DBNs) with Comparison of Efficiency

  • Introduction of an Algorithm for Decomposing High Dimensional MTS into Several Lower Dimensional MTS


Contribution to Knowledge (Continued)

  • Introduction of New EP-Seeded GA Algorithm

  • Incorporating Changing Dependencies

  • Application to Synthetic and Real-World Chemical Process Data

  • Transparency Retained Throughout Each Stage


Framework

  • Pre-processing: Data Preparation, Variable Groupings

  • Model Building: Search Methods

  • Evaluation: Synthetic Data, Real Data

  • Changing Dependencies

  • Explanation


Key Technical Points 1: Comparing Adapted Algorithms

  • New Representation

  • K2/K3 [Cooper and Herskovits]

  • Genetic Algorithm [Larranaga]

  • Evolutionary Algorithm [Wong]

  • Branch and Bound [Bouckaert]

  • Log Likelihood / Description Length

  • Publications:

    • International Journal of Intelligent Systems, 2001
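The two scoring metrics named above can be illustrated with a minimal sketch. It assumes discrete data stored as a list of rows, parents written as hypothetical (variable, lag) pairs, and the textbook penalised form of description length; the exact metrics compared in the talk are not given on the slide, so every name here is illustrative.

```python
import math
from collections import Counter


def node_log_likelihood(data, child, parents):
    """Maximum-likelihood log-likelihood of child(t) given lagged parents.

    `parents` is a list of (variable, lag) pairs (an assumed encoding).
    """
    max_lag = max((lag for _, lag in parents), default=0)
    joint, marginal = Counter(), Counter()
    for t in range(max_lag, len(data)):
        # Parent configuration observed at the lagged time points.
        config = tuple(data[t - lag][a] for a, lag in parents)
        joint[(config, data[t][child])] += 1
        marginal[config] += 1
    return sum(n * math.log(n / marginal[config])
               for (config, _), n in joint.items())


def description_length(data, child, parents, arity=2):
    """Negative log-likelihood plus a (log N / 2) penalty per free parameter."""
    max_lag = max((lag for _, lag in parents), default=0)
    n_rows = len(data) - max_lag
    free_params = (arity - 1) * arity ** len(parents)
    penalty = 0.5 * math.log(n_rows) * free_params
    return -node_log_likelihood(data, child, parents) + penalty
```

A higher log-likelihood or a lower description length would favour a candidate parent set; the penalty term is what discourages adding parents that do not improve the fit.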


Key Technical Points 2: Grouping

  • A Number of Correlation Searches

  • A Number of Grouping Algorithms

  • Designed Metrics

  • Comparison of All Combinations

  • Synthetic and Real Data

  • Publications:

    • IDA99

    • IEEE Transactions on Systems, Man, and Cybernetics, 2001

    • Expert Systems, 2000


Key Technical Points 3: EP-Seeded GA

  • Approximate Correlation Search Based on the One Used in Grouping Strategy

  • Results Used to Seed Initial Population of GA

  • Uniform Crossover

  • Specific Lag Mutation

  • Publications:

    • Genetic Algorithms and Evolutionary Computation Conference 1999 (GECCO99)

    • International Journal of Intelligent Systems, 2001

    • IDA2001
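The operators listed on this slide can be sketched as follows. This is an illustration under stated assumptions, not the published algorithm: a chromosome is taken to be a list of (a, b, lag) triples, lags are assumed to range from 1 to a hypothetical `max_lag`, and the function names are invented for the example.

```python
import random


def seed_population(ep_list, pop_size, links_per_individual, rng):
    """Build each individual by sampling triples from the EP result list."""
    return [rng.sample(ep_list, links_per_individual) for _ in range(pop_size)]


def uniform_crossover(parent_a, parent_b, rng):
    """Swap each aligned triple between the parents with probability 0.5."""
    child_a, child_b = list(parent_a), list(parent_b)
    for i in range(min(len(child_a), len(child_b))):
        if rng.random() < 0.5:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


def lag_mutation(chromosome, max_lag, rate, rng):
    """With probability `rate` per triple, resample only the lag component."""
    mutated = []
    for a, b, lag in chromosome:
        if rng.random() < rate:
            lag = rng.randint(1, max_lag)  # assumed lag range 1..max_lag
        mutated.append((a, b, lag))
    return mutated
```

Mutating only the lag keeps a promising variable pair intact while exploring nearby time offsets, which is one plausible reading of "Specific Lag Mutation".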


Key Technical Points 4: Changing Dependencies

  • Dynamic Cross Correlation Function for Analysing MTS

  • Extend Representation

  • Introduce a Heuristic Search - Hidden Controller Hill Climb (HCHC)

    • Hidden Variables to Model State of the System

    • Search for Structure and Hidden States Iteratively


Future Work

  • Parameter Estimation

  • Discretisation

  • Changing Dependencies

  • Efficiency

  • New Datasets

    • Gene Expression Data

    • Visual Field Data


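DBN Representation

Figure: an example DBN over variables a0 to a4, drawn across time slices t-4, t-3, t-2, t-1 and t.

  • Nodes shown: a0(t), a1(t), a2(t-2), a2(t), a3(t-4), a3(t-2), a3(t), a4(t-3), a4(t)

  • Triples listed beside the network: (3,1,4), (4,2,3), (2,3,2), (3,0,2), (3,4,2)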
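The triples on this slide are consistent with the (a, b, lag) notation used on the grouping slides. Assuming each triple denotes a link from variable a at time t-lag to variable b at time t (an interpretation, since the slide does not spell it out), a short sketch can decode them; the function names are hypothetical.

```python
from collections import defaultdict

# Triples copied from the slide; the (a, b, lag) reading is an assumption.
DBN_TRIPLES = [(3, 1, 4), (4, 2, 3), (2, 3, 2), (3, 0, 2), (3, 4, 2)]


def parents_by_child(triples):
    """Map each child variable b to its lagged parents [(a, lag), ...]."""
    parents = defaultdict(list)
    for a, b, lag in triples:
        parents[b].append((a, lag))
    return dict(parents)


def describe(triples):
    """Render each triple as a readable 'a3(t-4) -> a1(t)' style string."""
    return [f"a{a}(t-{lag}) -> a{b}(t)" for a, b, lag in triples]


if __name__ == "__main__":
    for line in describe(DBN_TRIPLES):
        print(line)
```

Under this reading the lagged nodes produced by the triples, a3(t-4), a4(t-3), a2(t-2) and a3(t-2), are the same lagged nodes that appear in the figure.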


Sample DBN Search Results

N = 5, MaxT = 10

N = 10, MaxT = 60


Grouping

  • Input: One High Dimensional MTS (A)

  • 1. Correlation Search (EP), producing a List of R triples:

    • 1: (a, b, lag)

    • 2: (a, b, lag)

    • R: (a, b, lag)

  • 2. Grouping Algorithm (GGA), producing groups G, for example {0,3}, {1,4,5}, {2}

  • Output: Several Lower Dimensional MTS
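The first stage of this pipeline, a search that returns a list of R strongly correlated (a, b, lag) triples, can be sketched with plain random sampling. This deliberately swaps in random search for the EP described in the talk, uses Pearson correlation as the measure, and assumes lags start at 1; `calls` and `list_size` are hypothetical names standing in for the slide's c and R.

```python
import math
import random


def lagged_correlation(x, y, lag):
    """Pearson correlation between x(t-lag) and y(t)."""
    xs, ys = x[:len(x) - lag], y[lag:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(xs, ys))
    vx = sum((p - mx) ** 2 for p in xs)
    vy = sum((q - my) ** 2 for q in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(vx * vy)


def random_correlation_search(series, max_lag, calls, list_size, rng):
    """Sample `calls` random triples and keep the `list_size` strongest."""
    n_vars = len(series)
    best = {}
    for _ in range(calls):
        a, b = rng.sample(range(n_vars), 2)
        lag = rng.randint(1, max_lag)
        best[(a, b, lag)] = abs(lagged_correlation(series[a], series[b], lag))
    ranked = sorted(best, key=best.get, reverse=True)
    return ranked[:list_size]
```

The returned list plays the role of the List fed to the grouping stage; an evolutionary search would aim to reach a comparable list with fewer correlation evaluations.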


Sample Grouping Results

Original Synthetic MTS Groupings

  • 0 1 2

  • 3 4 5 6 7

  • 8 9 10 11 12

  • 13 14 15 16 17 18 19 20 21 22

  • 23 24 25 26 27 28 29 30 31 32

  • 33 34 35 36 37 38 39 40 41 42

  • 43 44 45 46 47 48 49

  • 50 51 52 53 54 55

  • 56 57 58

  • 59 60

Groupings Discovered from Synthetic Data

  • 0 6

  • 1

  • 2

  • 3 4 5 7

  • 8

  • 9 10

  • 11 12

  • 13

  • 14 15 20 21 22

  • 16 17 18 19

  • 23 24 25 26 27 28 29 30 31 32

  • 33 34 35 36 37 38 39 40 41 42

  • 43 44 45 46 47 48 49

  • 50 51 52 53 54 55

  • 56 57 58

  • 59 60

Figure: Sample of Variables from a Discovered Oil Refinery Data Group


Parameter Estimation

  • Simulate Random Bag (Vary R, s and c, e)

  • Calculate Mean and SD for Each Distribution (the Probability of Selecting e from s)

  • Test for Normality (Lilliefors’ Test)

  • Symbolic Regression (GP) to Determine the Function for Mean and SD from R, s and c (e will be Unknown)

  • Place Confidence Limits on the P(Number of Correlations Found e)
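The first two steps can be illustrated with a small simulation. The slide does not define its symbols, so this sketch assumes that s is the size of the search space, e the number of genuinely correlated candidates in it, and c the number of candidates drawn without replacement; R is left out because its role in the simulation is not stated. All names are hypothetical.

```python
import random
import statistics


def simulate_bag(s, e, c, trials, rng):
    """Draw c of s items without replacement and count marked items found.

    Items 0..e-1 are treated as the 'marked' (truly correlated) ones.
    Returns the mean and sample standard deviation of the count.
    """
    counts = []
    for _ in range(trials):
        drawn = rng.sample(range(s), c)
        counts.append(sum(1 for item in drawn if item < e))
    return statistics.mean(counts), statistics.stdev(counts)
```

Under these assumptions the count follows a hypergeometric distribution with mean c·e/s, which gives a check on the simulated mean before any regression is fitted to it.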


EP-Seeded GA

  • EP produces the Final EPList:

    • 0: (a,b,l)

    • 1: (a,b,l)

    • 2: (a,b,l)

    • EPListSize: (a,b,l)

  • The EPList seeds the Initial GAPopulation:

    • 0: ((a,b,l),(a,b,l)…(a,b,l))

    • 1: ((a,b,l),(a,b,l)…(a,b,l))

    • 2: ((a,b,l),(a,b,l)…(a,b,l))

    • GAPopsize: ((a,b,l) … (a,b,l))

  • GA produces the DBN


EP-Seeded GA Results

N = 10, MaxT = 60

N = 20, MaxT = 60


Varying the value of c


Time / Explanation

Time points on the axis: t, t-1, t-11, t-13, t-16, t-20, t-60

  • P(TT in state_0) = 1.0

  • P(TGF in state_0) = 1.0

  • P(BPF in state_3) = 1.0

  • P(TGF in state_3) = 1.0

  • P(TT in state_1) = 0.446

  • P(SOT in state_0) = 0.314

  • P(C2% in state_0) = 0.279

  • P(T6T in state_0) = 0.347

  • P(RinT in state_0) = 0.565


Changing Dependencies

Figure: Variable Magnitude of A/M_GB and TGF plotted against Time (Minutes), from minute 1 to 3501; the two vertical axes run from 20 to 50 and from 7 to 10.5.


Dynamic Cross-Correlation Function
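The slide gives no formula, so the following is a sketch of one common way to make a cross-correlation function "dynamic": compute it inside a sliding window so that changes in the dominant lag become visible over time. The window length and jump parameters are assumptions, named after the Winlen and Winjump entries on the parameters slide.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(xs, ys))
    vx = sum((p - mx) ** 2 for p in xs)
    vy = sum((q - my) ** 2 for q in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def dynamic_cross_correlation(x, y, max_lag, win_len, win_jump):
    """Return one row per window; row[k] is corr(x(t-k), y(t)) inside it."""
    rows = []
    for start in range(0, len(x) - win_len + 1, win_jump):
        wx = x[start:start + win_len]
        wy = y[start:start + win_len]
        rows.append([pearson(wx[:win_len - lag], wy[lag:])
                     for lag in range(max_lag + 1)])
    return rows
```

Reading down a column of the result shows how the strength of one particular lag evolves; a shift of the peak from one column to another between windows is the kind of changing dependency the talk targets.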


Hidden Variable - OpState

Figure: a DBN fragment across time slices t-4, t-3, t-2, t-1 and t, with nodes a0(t-4), a2(t-1), a3(t-2), a2(t) and the hidden node OpState2.


Hidden Controller Hill Climb

Iterative loop over < DBN_List > and < Segment_Lists >:

  • Update Segment_Lists through Op_State Parameter Estimation

  • Score

  • Update DBN_List through DBN Structure Search


HCHC Results - Oil Refinery Data


HCHC Results - Synthetic Data

  • Generate Data from Several DBNs

  • Append each Section of Data Together to Form One MTS with Changing Dependencies

  • Run HCHC
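The first two steps can be sketched as follows. The generator is a simplified stand-in: each "DBN" is a list of (a, b, lag) triples, the variables are binary, and a child copies its first usable parent with a hypothetical probability `fidelity` instead of sampling from a full conditional probability table.

```python
import random


def generate_segment(triples, n_vars, length, history, rng, fidelity=0.9):
    """Generate `length` rows of binary data driven by (a, b, lag) links.

    `history` holds earlier rows (possibly none) so that lagged parents
    can reach back across a segment boundary.
    """
    rows = list(history)
    parents = {}
    for a, b, lag in triples:
        parents.setdefault(b, []).append((a, lag))
    for _ in range(length):
        t = len(rows)
        row = []
        for b in range(n_vars):
            value = rng.randint(0, 1)  # default: independent noise
            usable = [(a, lag) for a, lag in parents.get(b, []) if t - lag >= 0]
            if usable and rng.random() < fidelity:
                a, lag = usable[0]  # simplification: first parent only
                value = rows[t - lag][a]
            row.append(value)
        rows.append(row)
    return rows[len(history):]


def generate_changing_mts(dbns, n_vars, segment_length, rng):
    """Concatenate one segment per DBN; also return the true segment labels."""
    data, labels = [], []
    for index, triples in enumerate(dbns):
        data.extend(generate_segment(triples, n_vars, segment_length, data, rng))
        labels.extend([index] * segment_length)
    return data, labels
```

The returned labels record which generating network was active at each time point, which is the ground truth a hidden operating-state variable would be expected to recover.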


Time / Explanation

Time points on the axis: t, t-1, t-3, t-5, t-6, t-9

  • P(OpState1 is 0) = 1.0

  • P(a1 is 0) = 1.0

  • P(a0 is 0) = 1.0

  • P(a2 is 1) = 1.0

  • P(OpState1 is 0) = 1.0

  • P(a1 is 1) = 1.0

  • P(a0 is 0) = 1.0

  • P(a2 is 1) = 1.0

  • P(a2 is 0) = 0.758

  • P(OpState0 is 0) = 0.519

  • P(a0 is 0) = 0.968

  • P(OpState0 is 0) = 0.720

  • P(a0 is 1) = 0.778

  • P(a2 is 0) = 0.545

  • P(a0 is 1) = 0.517


Time / Explanation

Time points on the axis: t, t-1, t-3, t-5, t-6, t-7, t-9

  • P(OpState1 is 4) = 1.0

  • P(a1 is 0) = 1.0

  • P(a0 is 0) = 1.0

  • P(a2 is 1) = 1.0

  • P(OpState1 is 4) = 1.0

  • P(a1 is 1) = 1.0

  • P(a0 is 0) = 1.0

  • P(a2 is 1) = 1.0

  • P(a2 is 1) = 0.570

  • P(a0 is 0) = 0.506

  • P(OpState2 is 3) = 0.210

  • P(a2 is 1) = 0.974

  • P(OpState2 is 4) = 0.222

  • P(a2 is 0) = 0.882

  • P(a0 is 1) = 0.549


Process Diagram

Figure: process diagram labelled with the variables TGF, %C3, TT, T6T, PGM, PGB, SOTT11, SOFT13, RINT, C11/3T, T36T, AFT, FF, RBT, BPF and %C2.


Typical Discovered Relationships

Figure: the process diagram variables (TGF, %C3, PGM, TT, T6T, PGB, SOTT11, SOFT13, RINT, C11/3T, T36T, AFT, FF, RBT, BPF, %C2) with the discovered relationships drawn between them.


Parameters

DBN Search     GA             EP
PopSize        100            10
MR             0.1            0.8
CR             0.8            ---
Gen            Based on FC    Based on FC

Correlation Search

  • c - Approx. 20% of s

  • R - Approx. 2.5% of s

Grouping GA    Synth. 1    Synth. 2-6            Oil
PopSize        150         100                   150
CR             0.8         0.8                   0.8
MR             0.1         0.1                   0.1
Gen            150         100 (1000 for GPV)    150


Parameters (Continued)

EP-Seeded GA

  • c - Approx. 20% of s

  • EPListSize - Approx. 2.5% of s

  • GAPopSize - 10

  • MR - 0.1

  • CR - 0.8

  • LMR - 0.1

  • Gen - Based on FC

HCHC              Oil       Synthetic
DBN_Iterations    1×10^6    5000
Winlen            1000      200
Winjump           500       50

