
The Automatic Explanation of Multivariate Time Series (MTS)



Allan Tucker

- Datasets which are Characteristically:
- High Dimensional MTS
- Large Time Lags
- Changing Dependencies
- Little or No Available Expert Knowledge

- Lack of Algorithms to Assist Users in Explaining Events, where the Algorithms Must:
- Model Complex MTS Data
- Be Learnable from Data with Little or No User Intervention
- Remain Transparent Throughout the Learning and Explaining Process

- Using a Combination of Evolutionary Programming (EP) and Bayesian Networks (BNs) to Overcome Issues Outlined
- Extending Learning Algorithms for BNs to Dynamic Bayesian Networks (DBNs) with Comparison of Efficiency
- Introduction of an Algorithm for Decomposing High Dimensional MTS into Several Lower Dimensional MTS

- Introduction of New EP-Seeded GA Algorithm
- Incorporating Changing Dependencies
- Application to Synthetic and Real-World Chemical Process Data
- Transparency Retained Throughout Each Stage

[Diagram: framework overview with the labels Pre-processing, Data Preparation, Variable Groupings, Model Building, Search Methods, Synthetic Data, Evaluation, Real Data, Changing Dependencies, Explanation]

- New Representation
- K2/K3 [Cooper and Herskovits]
- Genetic Algorithm [Larrañaga]
- Evolutionary Algorithm [Wong]
- Branch and Bound [Bouckaert]
- Log Likelihood / Description Length
- Publications:
- International Journal of Intelligent Systems, 2001
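A minimal sketch of the log likelihood score named above, assuming discrete data and a DBN written as a list of (parent, child, lag) triples. The function name `log_likelihood`, the `max_lag` argument, and the toy data are hypothetical, and no description length penalty is included.

```python
import math
from collections import Counter

def log_likelihood(data, triples, max_lag):
    """Maximum-likelihood log score of a lagged structure on discrete data.

    data[t][i] is the state of variable i at time t; each triple
    (parent, child, lag) says parent(t - lag) is a parent of child(t).
    """
    n_vars = len(data[0])
    total = 0.0
    for child in range(n_vars):
        pars = [(p, lag) for p, c, lag in triples if c == child]
        joint, marg = Counter(), Counter()
        for t in range(max_lag, len(data)):
            config = tuple(data[t - lag][p] for p, lag in pars)
            joint[(config, data[t][child])] += 1
            marg[config] += 1
        # Sum of N_ijk * log(N_ijk / N_ij) over observed configurations.
        total += sum(n * math.log(n / marg[cfg]) for (cfg, _), n in joint.items())
    return total

# Toy series (assumed): a1(t) copies a0(t-1).
a0 = [0, 1, 1, 0, 1, 0, 0, 1]
a1 = [0] + a0[:-1]
data = list(zip(a0, a1))
ll_with = log_likelihood(data, [(0, 1, 1)], max_lag=1)
ll_without = log_likelihood(data, [], max_lag=1)
print(ll_with > ll_without)
```

Scoring both structures over the same time range (from `max_lag` onward) keeps the comparison fair; a search method would then prefer the structure with the higher score.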

- A Number of Correlation Searches
- A Number of Grouping Algorithms
- Designed Metrics
- Comparison of All Combinations
- Synthetic and Real Data
- Publications:
- IDA99
- IEEE Transactions on Systems, Man, and Cybernetics, 2001
- Expert Systems 2000

- Approximate Correlation Search Based on the One Used in Grouping Strategy
- Results Used to Seed Initial Population of GA
- Uniform Crossover
- Specific Lag Mutation
- Publications:
- Genetic Algorithms and Evolutionary Computation Conference 1999 (GECCO99)
- International Journal of Intelligent Systems, 2001
- IDA2001
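The two operators listed above can be sketched as follows, assuming each GA individual is a fixed-length list of (a, b, lag) triples. The function names, the 0.5 swap probability, and the choice to redraw a lag uniformly from 1 to `max_lag` are assumptions for illustration, not details given in the slides.

```python
import random

def uniform_crossover(mum, dad, rng):
    """Swap genes position by position with probability 0.5."""
    child1, child2 = [], []
    for g1, g2 in zip(mum, dad):
        if rng.random() < 0.5:
            g1, g2 = g2, g1
        child1.append(g1)
        child2.append(g2)
    return child1, child2

def lag_mutation(individual, lmr, max_lag, rng):
    """With probability lmr per triple, change only the lag, keeping (a, b)."""
    mutated = []
    for a, b, lag in individual:
        if rng.random() < lmr:
            lag = rng.randint(1, max_lag)
        mutated.append((a, b, lag))
    return mutated

rng = random.Random(0)
mum = [(0, 1, 5), (2, 3, 7), (4, 0, 2)]
dad = [(1, 2, 9), (3, 4, 1), (0, 4, 6)]
kid1, kid2 = uniform_crossover(mum, dad, rng)
print(lag_mutation(kid1, lmr=0.1, max_lag=60, rng=rng))
```

Mutating only the lag keeps a promising variable pair in the individual while the search adjusts the delay at which the pair is linked.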

- Dynamic Cross Correlation Function for Analysing MTS
- Extend Representation
- Introduce a Heuristic Search: Hidden Controller Hill Climb (HCHC)
- Hidden Variables to Model State of the System
- Search for Structure and Hidden States Iteratively
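One way to picture a dynamic cross correlation function is a lagged correlation recomputed over a sliding window, so that a change in dependency shows up as a change in the coefficient. The sketch below assumes that reading; the function names are hypothetical and the `winlen` and `winjump` arguments are borrowed from the parameter names used elsewhere in the slides.

```python
import numpy as np

def lagged_corr(x, y, lag):
    """Pearson correlation between x(t - lag) and y(t)."""
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    return float(np.corrcoef(x, y)[0, 1])

def dynamic_ccf(x, y, lag, winlen, winjump):
    """Lagged correlation in windows of length winlen, stepped by winjump."""
    starts = range(0, len(x) - winlen + 1, winjump)
    return [lagged_corr(x[s:s + winlen], y[s:s + winlen], lag) for s in starts]

# Toy series (assumed): y follows x at lag 2, then the sign of the link flips.
rng = np.random.default_rng(0)
x = rng.normal(size=400)
y = np.zeros(400)
y[2:200] = x[0:198]
y[200:] = -x[198:398]
print(dynamic_ccf(x, y, lag=2, winlen=100, winjump=100))
```

On this toy series the first two windows give a coefficient near 1 and the last two near -1, which is the kind of shift a search for changing dependencies would need to detect.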

- Parameter Estimation
- Discretisation
- Changing Dependencies
- Efficiency
- New Datasets
- Gene Expression Data
- Visual Field Data

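[Figure: example DBN over variables a0 to a4 on time slices t-4 to t, with nodes a0(t), a1(t), a2(t-2), a2(t), a3(t-4), a3(t-2), a3(t), a4(t-3), a4(t), represented by the triples (3,1,4), (4,2,3), (2,3,2), (3,0,2), (3,4,2)]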
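A short sketch of the triple representation, assuming each triple (a, b, lag) reads "variable a at time t - lag is a parent of variable b at time t". That reading matches the node labels in the example (for instance (3,1,4) alongside a3(t-4) and a1(t)), but it is an inference from the slide rather than a stated definition; the function names are hypothetical.

```python
from collections import defaultdict

# The five triples from the example, read here as (parent, child, lag).
dbn_triples = [(3, 1, 4), (4, 2, 3), (2, 3, 2), (3, 0, 2), (3, 4, 2)]

def parents_from_triples(triples):
    """Map each child variable to its list of (parent, lag) pairs."""
    parents = defaultdict(list)
    for parent, child, lag in triples:
        parents[child].append((parent, lag))
    return dict(parents)

def describe(triples):
    """Render each triple as a lagged link between named nodes."""
    return [f"a{p}(t-{lag}) -> a{c}(t)" for p, c, lag in triples]

parent_sets = parents_from_triples(dbn_triples)
for line in describe(dbn_triples):
    print(line)
```

Storing a network as a flat list of triples makes it straightforward to treat a structure as a chromosome for evolutionary search.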

[Figure panels: N = 5, MaxT = 10; N = 10, MaxT = 60]

[Diagram: grouping pipeline. One High Dimensional MTS (A) → 1. Correlation Search (EP) → List of triples (a, b, lag), numbered 1 to R → 2. Grouping Algorithm (GGA) → Grouping G, for example {0,3}, {1,4,5}, {2} → Several Lower Dimensional MTS]
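To illustrate how a list of correlated pairs can induce a grouping, the sketch below merges variables that share a correlation, using connected components. This is a deliberately simple stand-in and not the grouping genetic algorithm (GGA) named in the slides; the function name and the lag values in the example list are hypothetical.

```python
def group_variables(n_vars, triples):
    """Group variables linked by any (a, b, lag) triple (union-find)."""
    parent = list(range(n_vars))

    def find(i):
        # Follow parent pointers to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b, _lag in triples:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    groups = {}
    for i in range(n_vars):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Assumed correlation list for six variables.
R = [(0, 3, 2), (1, 4, 1), (4, 5, 3)]
G = group_variables(6, R)
print(G)
```

With this assumed list the result is the grouping {0,3}, {1,4,5}, {2} used as the example above. A metric-driven method such as a GGA can also split weakly connected groups, which connected components cannot do.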

Original Synthetic MTS Groupings

- 0 1 2
- 3 4 5 6 7
- 8 9 10 11 12
- 13 14 15 16 17 18 19 20 21 22
- 23 24 25 26 27 28 29 30 31 32
- 33 34 35 36 37 38 39 40 41 42
- 43 44 45 46 47 48 49
- 50 51 52 53 54 55
- 56 57 58
- 59 60

Groupings Discovered from Synthetic Data

- 0 6
- 1
- 2
- 3 4 5 7
- 8
- 9 10
- 11 12
- 13
- 14 15 20 21 22
- 16 17 18 19
- 23 24 25 26 27 28 29 30 31 32
- 33 34 35 36 37 38 39 40 41 42
- 43 44 45 46 47 48 49
- 50 51 52 53 54 55
- 56 57 58
- 59 60

[Figure: Sample of Variables from a Discovered Oil Refinery Data Group]

- Simulate Random Bag (Vary R, s and c, e)
- Calculate Mean and SD for Each Distribution (the Probability of Selecting e from s)
- Test for Normality (Lilliefors’ Test)
- Symbolic Regression (GP) to Determine the Function for Mean and SD from R, s and c (e will be Unknown)
- Place Confidence Limits on the P(Number of Correlations Found e)

[Diagram: EP-Seeded GA. The EP produces the Final EPList, with entries 0 to EPListSize each of the form (a,b,l). These seed the Initial GAPopulation, with individuals 0 to GAPopsize each of the form ((a,b,l),(a,b,l)…(a,b,l)). The GA then produces the DBN.]
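A minimal sketch of the seeding step, assuming each initial GA individual is built by sampling triples from the final EP list. The slides do not say how individuals are assembled from the list, so the sampling rule, the function name `seed_population`, and the `links_per_individual` parameter are all assumptions.

```python
import random

def seed_population(ep_list, pop_size, links_per_individual, rng):
    """Build each individual from triples drawn without replacement from ep_list."""
    return [rng.sample(ep_list, links_per_individual) for _ in range(pop_size)]

# Assumed EP output: (a, b, l) triples judged promising by the correlation search.
ep_list = [(0, 1, 3), (2, 1, 7), (3, 4, 1), (4, 0, 12), (1, 3, 5), (2, 4, 9)]
population = seed_population(ep_list, pop_size=10, links_per_individual=4,
                             rng=random.Random(0))
print(len(population), len(population[0]))
```

Starting the GA from links the EP has already rated highly, rather than from random triples, is the point of the seeding; the GA operators then recombine and adjust those links.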

[Figure panels: N = 10, MaxT = 60; N = 20, MaxT = 60]

Time / Explanation

Time points: t, t-1, t-11, t-13, t-16, t-20, t-60

- P(TT in state_0) = 1.0
- P(TGF in state_0) = 1.0
- P(BPF in state_3) = 1.0
- P(TGF in state_3) = 1.0
- P(TT in state_1) = 0.446
- P(SOT in state_0) = 0.314
- P(C2% in state_0) = 0.279
- P(T6T in state_0) = 0.347
- P(RinT in state_0) = 0.565

[Figure: A/M_GB and TGF, plotted as Variable Magnitude against Time (Minutes)]

[Figure: DBN with operating state node OpState2 and nodes a0(t-4), a2(t-1), a2(t), a3(t-2) on time slices t-4 to t]

[Diagram: HCHC loop over <DBN_List> and <Segment_Lists>. Update Segment_Lists through Op_State Parameter Estimation; Score; Update DBN_List through DBN Structure Search]

Generate Data from Several DBNs

Append each Section of Data Together to Form One MTS with Changing Dependencies

Run HCHC
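The first two steps above can be sketched as follows, assuming binary variables and DBNs written as (parent, child, lag) triples with lag of at least 1. The generating rule (a child copies its lagged parent except with a small noise probability) and all names are assumptions for illustration; the HCHC run itself is not shown.

```python
import random

def generate_segment(triples, n_vars, length, history, rng, noise=0.1):
    """Extend history by `length` rows of binary states under one DBN."""
    rows = list(history)
    for _ in range(length):
        row = [rng.randint(0, 1) for _ in range(n_vars)]  # default: random state
        for parent, child, lag in triples:
            if len(rows) >= lag and rng.random() > noise:
                row[child] = rows[-lag][parent]  # copy the lagged parent
        rows.append(row)
    return rows[len(history):]

def generate_changing_mts(dbns, n_vars, segment_length, seed=0):
    """Append one segment per DBN to form a single MTS."""
    rng = random.Random(seed)
    mts = []
    for triples in dbns:
        mts.extend(generate_segment(triples, n_vars, segment_length, mts, rng))
    return mts

# Two assumed DBNs: a0(t-1) drives a1, then a2(t-3) drives a1.
dbns = [[(0, 1, 1)], [(2, 1, 3)]]
mts = generate_changing_mts(dbns, n_vars=3, segment_length=200)
print(len(mts))
```

Because each segment is generated with the earlier rows as history, lagged parents are available across the boundary where the dependency changes.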

Time / Explanation

Time points: t, t-1, t-3, t-5, t-6, t-9

- P(OpState1 is 0) = 1.0
- P(a1 is 0) = 1.0
- P(a0 is 0) = 1.0
- P(a2 is 1) = 1.0
- P(OpState1 is 0) = 1.0
- P(a1 is 1) = 1.0
- P(a0 is 0) = 1.0
- P(a2 is 1) = 1.0
- P(a2 is 0) = 0.758
- P(OpState0 is 0) = 0.519
- P(a0 is 0) = 0.968
- P(OpState0 is 0) = 0.720
- P(a0 is 1) = 0.778
- P(a2 is 0) = 0.545
- P(a0 is 1) = 0.517

Time / Explanation

Time points: t, t-1, t-3, t-5, t-6, t-7, t-9

- P(OpState1 is 4) = 1.0
- P(a1 is 0) = 1.0
- P(a0 is 0) = 1.0
- P(a2 is 1) = 1.0
- P(OpState1 is 4) = 1.0
- P(a1 is 1) = 1.0
- P(a0 is 0) = 1.0
- P(a2 is 1) = 1.0
- P(a2 is 1) = 0.570
- P(a0 is 0) = 0.506
- P(OpState2 is 3) = 0.210
- P(a2 is 1) = 0.974
- P(OpState2 is 4) = 0.222
- P(a2 is 0) = 0.882
- P(a0 is 1) = 0.549

[Figures: two network diagrams over the oil refinery variables TGF, %C3, TT, T6T, PGM, PGB, SOTT11, SOFT13, RINT, C11/3T, T36T, AFT, FF, RBT, BPF, %C2]

| DBN Search | GA | EP |
|---|---|---|
| PopSize | 100 | 10 |
| MR | 0.1 | 0.8 |
| CR | 0.8 | --- |
| Gen | Based on FC | Based on FC |

Correlation Search

c - Approx. 20% of s

R - Approx. 2.5% of s

| Grouping GA | Synth. 1 | Synth. 2-6 | Oil |
|---|---|---|---|
| PopSize | 150 | 100 | 150 |
| CR | 0.8 | 0.8 | 0.8 |
| MR | 0.1 | 0.1 | 0.1 |
| Gen | 150 | 100 (1000 for GPV) | 150 |

EP-Seeded GA

c - Approx. 20% of s

EPListSize - Approx. 2.5% of s

GAPopSize - 10

MR - 0.1

CR - 0.8

LMR - 0.1

Gen - Based on FC

| HCHC | Oil | Synthetic |
|---|---|---|
| DBN_Iterations | 1×10^6 | 5000 |
| Winlen | 1000 | 200 |
| Winjump | 500 | 50 |