The Automatic Explanation of Multivariate Time Series (MTS)

Allan Tucker

Presentation Transcript
The Problem - Data
  • Datasets which are Characteristically:
    • High Dimensional MTS
    • Large Time Lags
    • Changing Dependencies
    • Little or No Available Expert Knowledge
The Problem - Requirement
  • Lack of Algorithms to Assist Users in Explaining Events. Such Algorithms Should:
    • Model Complex MTS Data
    • Be Learnable from Data with Little or No User Intervention
    • Remain Transparent Throughout the Learning and Explaining Process (Vital)
Contribution to Knowledge
  • Using a Combination of Evolutionary Programming (EP) and Bayesian Networks (BNs) to Overcome Issues Outlined
  • Extending Learning Algorithms for BNs to Dynamic Bayesian Networks (DBNs) with Comparison of Efficiency
  • Introduction of an Algorithm for Decomposing High Dimensional MTS into Several Lower Dimensional MTS
Contribution to Knowledge (Continued)
  • Introduction of New EP-Seeded GA Algorithm
  • Incorporating Changing Dependencies
  • Application to Synthetic and Real-World Chemical Process Data
  • Transparency Retained Throughout Each Stage
Framework

[Diagram: framework stages and components: Pre-processing, Data Preparation, Variable Groupings, Model Building, Search Methods, Evaluation, Synthetic Data, Real Data, Changing Dependencies, Explanation]

Key Technical Points 1: Comparing Adapted Algorithms
  • New Representation
  • K2/K3 [Cooper and Herskovits]
  • Genetic Algorithm [Larrañaga]
  • Evolutionary Algorithm [Wong]
  • Branch and Bound [Bouckaert]
  • Log Likelihood / Description Length
  • Publications:
    • International Journal of Intelligent Systems, 2001
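
As a rough illustration of the two scoring metrics named above (Log Likelihood / Description Length), the sketch below scores one discrete variable given a set of lagged parents. The function names, the `(variable, lag)` parent encoding, and the penalty term of 0.5 · log N per free parameter are assumptions made for illustration; the slides do not give the exact formulas used.

```python
import math
from collections import Counter

def node_log_likelihood(series, child, parents, max_lag):
    """Log likelihood of series[child] at time t given lagged parents.

    series  : dict mapping variable index -> list of discrete states
    parents : list of (variable, lag) pairs (hypothetical encoding)
    """
    joint = Counter()  # counts of (parent configuration, child state)
    marg = Counter()   # counts of parent configuration alone
    for t in range(max_lag, len(series[child])):
        config = tuple(series[p][t - lag] for p, lag in parents)
        joint[(config, series[child][t])] += 1
        marg[config] += 1
    # Maximum-likelihood estimate: sum of count * log(conditional frequency)
    return sum(c * math.log(c / marg[config])
               for (config, _state), c in joint.items())

def description_length(series, child, parents, max_lag, arity):
    """Penalised score (lower is better): model cost minus log likelihood."""
    n_cases = len(series[child]) - max_lag
    n_params = (arity - 1) * arity ** len(parents)
    ll = node_log_likelihood(series, child, parents, max_lag)
    return 0.5 * math.log(n_cases) * n_params - ll

# Toy data in which a1(t) always equals a0(t-1)
toy = {0: [0, 1, 1, 0, 1, 0, 0, 1],
       1: [0, 0, 1, 1, 0, 1, 0, 0]}
print(description_length(toy, child=1, parents=[(0, 1)], max_lag=1, arity=2))
print(description_length(toy, child=1, parents=[], max_lag=1, arity=2))
```

On this toy series the candidate with the true lagged parent receives the lower description length, which is the behaviour a structure search would rely on.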
Key Technical Points 2: Grouping
  • A Number of Correlation Searches
  • A Number of Grouping Algorithms
  • Designed Metrics
  • Comparison of All Combinations
  • Synthetic and Real Data
  • Publications:
    • IDA99
    • IEEE Transactions on Systems, Man, and Cybernetics, 2001
    • Expert Systems 2000
Key Technical Points 3: EP-Seeded GA
  • Approximate Correlation Search Based on the One Used in Grouping Strategy
  • Results Used to Seed Initial Population of GA
  • Uniform Crossover
  • Specific Lag Mutation
  • Publications:
    • Genetic Algorithms and Evolutionary Computation Conference 1999 (GECCO99)
    • International Journal of Intelligent Systems, 2001
    • IDA2001
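
The seeding, uniform crossover and lag mutation steps listed above can be sketched roughly as follows. This is a minimal illustration, assuming each individual is a fixed-length list of `(a, b, lag)` triples and that lag mutation redraws the lag uniformly; the function names and those details are assumptions, not the published algorithm.

```python
import random

def seed_population(ep_list, pop_size, links_per_individual, rng):
    """Build each individual by sampling triples from the EP result list."""
    return [[rng.choice(ep_list) for _ in range(links_per_individual)]
            for _ in range(pop_size)]

def uniform_crossover(parent_a, parent_b, rng):
    """Swap each aligned pair of links between parents with probability 0.5.

    Assumes both parents hold the same number of links.
    """
    child_a, child_b = [], []
    for link_a, link_b in zip(parent_a, parent_b):
        if rng.random() < 0.5:
            link_a, link_b = link_b, link_a
        child_a.append(link_a)
        child_b.append(link_b)
    return child_a, child_b

def lag_mutation(individual, max_lag, rate, rng):
    """With probability `rate`, redraw only the lag of a link; keep (a, b)."""
    return [(a, b, rng.randint(1, max_lag)) if rng.random() < rate else (a, b, lag)
            for a, b, lag in individual]

rng = random.Random(0)
ep_list = [(3, 1, 4), (4, 2, 3), (2, 3, 2)]   # hypothetical EP output
population = seed_population(ep_list, pop_size=10, links_per_individual=3, rng=rng)
child_a, child_b = uniform_crossover(population[0], population[1], rng)
print(lag_mutation(child_a, max_lag=60, rate=0.1, rng=rng))
```

Mutating only the lag keeps the variable pair found by the correlation search while letting the GA refine when the dependency acts.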
Key Technical Points 4: Changing Dependencies
  • Dynamic Cross Correlation Function for Analysing MTS
  • Extend Representation
  • Introduce a Heuristic Search: Hidden Controller Hill Climb (HCHC)
    • Hidden Variables to Model State of the System
    • Search for Structure and Hidden States Iteratively
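
The slides do not define the Dynamic Cross Correlation Function, so the following is only one plausible reading: a lagged Pearson correlation recomputed over sliding windows, so that a change in dependency shows up as a change in correlation over time. The names `win_len` and `win_jump` are hypothetical and merely echo the Winlen and Winjump entries on the parameters slide.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def windowed_cross_correlation(x, y, lag, win_len, win_jump):
    """Correlation between x(t - lag) and y(t) inside each sliding window.

    Returns a list of (window_start, correlation) pairs.
    """
    results = []
    for start in range(lag, len(y) - win_len + 1, win_jump):
        ys = y[start:start + win_len]
        xs = x[start - lag:start - lag + win_len]
        results.append((start, pearson(xs, ys)))
    return results

# Toy series: y follows x at lag 2 for the first half, then follows -x
x = [(i * 7) % 11 for i in range(60)]
y = [0, 0] + [x[t - 2] if t < 30 else -x[t - 2] for t in range(2, 60)]
for start, r in windowed_cross_correlation(x, y, lag=2, win_len=10, win_jump=10):
    print(start, round(r, 3))
```

In the toy series the correlation at lag 2 starts strongly positive and ends strongly negative, which is the kind of shift a changing-dependency analysis would look for.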
Future Work
  • Parameter Estimation
  • Discretisation
  • Changing Dependencies
  • Efficiency
  • New Datasets
    • Gene Expression Data
    • Visual Field Data
DBN Representation

[Figure: a DBN drawn over time slices t-4, t-3, t-2, t-1, t, with nodes a0(t), a1(t), a2(t-2), a2(t), a3(t-4), a3(t-2), a3(t), a4(t-3), a4(t) and the link triples (3,1,4), (4,2,3), (2,3,2), (3,0,2), (3,4,2)]

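
One reading that is consistent with the node labels in the figure is that each triple (a, b, lag) stands for a link from variable a at time t-lag to variable b at time t. Under that assumption (the slides do not state it explicitly), the five triples decode as shown in this short sketch; `decode_links` is a hypothetical helper name.

```python
def decode_links(triples):
    """Turn (a, b, lag) triples into readable lagged edges."""
    return [f"a{a}(t-{lag}) -> a{b}(t)" if lag else f"a{a}(t) -> a{b}(t)"
            for a, b, lag in triples]

links = [(3, 1, 4), (4, 2, 3), (2, 3, 2), (3, 0, 2), (3, 4, 2)]
for edge in decode_links(links):
    print(edge)
# a3(t-4) -> a1(t)
# a4(t-3) -> a2(t)
# a2(t-2) -> a3(t)
# a3(t-2) -> a0(t)
# a3(t-2) -> a4(t)
```

A flat list of triples like this is convenient for evolutionary search, since adding, removing or re-lagging a link is a single-element change.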

Sample DBN Search Results

[Figure: search results for N = 5, MaxT = 10 and for N = 10, MaxT = 60]


Grouping

[Diagram: One High Dimensional MTS (A) is passed to 1. Correlation Search (EP), which produces a List of R entries (1, 2, ..., R), each a triple (a, b, lag); 2. Grouping Algorithm (GGA) then produces the groups G, for example {0,3}, {1,4,5}, {2}, giving Several Lower Dimensional MTS]

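
To make the step from a correlation list to groups concrete, the sketch below merges any two variables that share a correlation triple, using union-find. This is a deliberately simpler stand-in and not the Grouping Algorithm (GGA) named in the diagram; the input triples are hypothetical values chosen to reproduce the example groups.

```python
def group_by_correlation(n_vars, correlations):
    """Merge variables linked by any (a, b, lag) triple into one group."""
    parent = list(range(n_vars))

    def find(v):
        # Follow parent pointers to the root, halving the path as we go
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b, _lag in correlations:
        parent[find(a)] = find(b)

    groups = {}
    for v in range(n_vars):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())

print(group_by_correlation(6, [(0, 3, 2), (1, 4, 1), (4, 5, 3)]))
# [[0, 3], [1, 4, 5], [2]]
```

A variable with no listed correlation, such as 2 here, ends up in a group of its own.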

Sample Grouping Results

Original Synthetic MTS Groupings:
  • 0 1 2
  • 3 4 5 6 7
  • 8 9 10 11 12
  • 13 14 15 16 17 18 19 20 21 22
  • 23 24 25 26 27 28 29 30 31 32
  • 33 34 35 36 37 38 39 40 41 42
  • 43 44 45 46 47 48 49
  • 50 51 52 53 54 55
  • 56 57 58
  • 59 60

Groupings Discovered from Synthetic Data:
  • 0 6
  • 1
  • 2
  • 3 4 5 7
  • 8
  • 9 10
  • 11 12
  • 13
  • 14 15 20 21 22
  • 16 17 18 19
  • 23 24 25 26 27 28 29 30 31 32
  • 33 34 35 36 37 38 39 40 41 42
  • 43 44 45 46 47 48 49
  • 50 51 52 53 54 55
  • 56 57 58
  • 59 60

[Figure: Sample of Variables from a Discovered Oil Refinery Data Group]
Parameter Estimation
  • Simulate Random Bag (Vary R, s and c, e)
  • Calculate Mean and SD for Each Distribution (the Probability of Selecting e from s)
  • Test for Normality (Lilliefors’ Test)
  • Symbolic Regression (GP) to Determine the Function for Mean and SD from R, s and c (e will be Unknown)
  • Place Confidence Limits on the P(Number of Correlations Found e)
EP-Seeded GA

[Diagram with labels EP, GA and DBN, showing two structures:]
  • Final EPList: entries 0, 1, 2, ..., EPListSize, each a triple (a,b,l)
  • Initial GAPopulation: individuals 0, 1, 2, ..., GAPopsize, each a list of triples ((a,b,l),(a,b,l)…(a,b,l))


EP-Seeded GA Results

[Figure: results for N = 10, MaxT = 60 and for N = 20, MaxT = 60]


Time: t, t-1, t-11, t-13, t-16, t-20, t-60

Explanation:
  • P(TT in state_0) = 1.0
  • P(TGF in state_0) = 1.0
  • P(BPF in state_3) = 1.0
  • P(TGF in state_3) = 1.0
  • P(TT in state_1) = 0.446
  • P(SOT in state_0) = 0.314
  • P(C2% in state_0) = 0.279
  • P(T6T in state_0) = 0.347
  • P(RinT in state_0) = 0.565


Changing Dependencies

[Figure: Variable Magnitude of A/M_GB and TGF plotted against Time (Minutes), from 1 to 3501, on two vertical scales (20 to 50 and 7 to 10.5)]
Hidden Variable - OpState

[Figure: DBN over time slices t-4, t-3, t-2, t-1, t, with nodes a0(t-4), a2(t-1), a2(t), a3(t-2) and the hidden variable OpState2]


Hidden Controller Hill Climb

[Diagram: a loop over < DBN_List > and < Segment_Lists > with the steps: Update Segment_Lists through Op_State Parameter Estimation; Score; Update DBN_List through DBN Structure Search]
HCHC Results - Synthetic Data

Generate Data from Several DBNs

Append each Section of Data Together to Form One MTS with Changing Dependencies

Run HCHC
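
The data-generation steps above can be sketched as follows. This is a toy illustration under stated assumptions: variables are binary, each DBN is given as a list of (a, b, lag) triples with lag of at least 1 and at most one parent per child, and a child is a noisy copy of its lagged parent. None of these choices, nor the function names, come from the slides.

```python
import random

def generate_section(links, n_vars, length, history, rng, noise=0.1):
    """Sample rows where each link (a, b, lag) makes b(t) a noisy copy of a(t-lag).

    `history` holds earlier rows so lags can reach back across a section boundary.
    Assumes every lag is >= 1.
    """
    rows = list(history)
    parent_of = {b: (a, lag) for a, b, lag in links}  # one parent per child
    for _ in range(length):
        t = len(rows)
        row = []
        for v in range(n_vars):
            if v in parent_of and t - parent_of[v][1] >= 0:
                a, lag = parent_of[v]
                value = rows[t - lag][a]
                if rng.random() < noise:
                    value = 1 - value      # flip the copied state
            else:
                value = rng.randint(0, 1)  # no usable parent: random state
            row.append(value)
        rows.append(row)
    return rows[len(history):]

def generate_changing_mts(dbns, n_vars, section_length, seed=0):
    """Append one section per DBN to form a single MTS with changing dependencies."""
    rng = random.Random(seed)
    mts = []
    for links in dbns:
        mts.extend(generate_section(links, n_vars, section_length, mts, rng))
    return mts

# Two hypothetical generating structures: a0 drives a1, then a1 drives a0
mts = generate_changing_mts([[(0, 1, 2)], [(1, 0, 3)]], n_vars=3, section_length=200)
print(len(mts), mts[:3])
```

Because the section boundaries are known by construction, a run of HCHC on such data can be checked against the true change points.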

Time: t, t-1, t-3, t-5, t-6, t-9

Explanation:
  • P(OpState1 is 0) = 1.0
  • P(a1 is 0) = 1.0
  • P(a0 is 0) = 1.0
  • P(a2 is 1) = 1.0
  • P(OpState1 is 0) = 1.0
  • P(a1 is 1) = 1.0
  • P(a0 is 0) = 1.0
  • P(a2 is 1) = 1.0
  • P(a2 is 0) = 0.758
  • P(OpState0 is 0) = 0.519
  • P(a0 is 0) = 0.968
  • P(OpState0 is 0) = 0.720
  • P(a0 is 1) = 0.778
  • P(a2 is 0) = 0.545
  • P(a0 is 1) = 0.517

Time: t, t-1, t-3, t-5, t-6, t-7, t-9

Explanation:
  • P(OpState1 is 4) = 1.0
  • P(a1 is 0) = 1.0
  • P(a0 is 0) = 1.0
  • P(a2 is 1) = 1.0
  • P(OpState1 is 4) = 1.0
  • P(a1 is 1) = 1.0
  • P(a0 is 0) = 1.0
  • P(a2 is 1) = 1.0
  • P(a2 is 1) = 0.570
  • P(a0 is 0) = 0.506
  • P(OpState2 is 3) = 0.210
  • P(a2 is 1) = 0.974
  • P(OpState2 is 4) = 0.222
  • P(a2 is 0) = 0.882
  • P(a0 is 1) = 0.549

Process Diagram

[Figure: process diagram showing the variables TGF, %C3, TT, T6T, PGM, PGB, SOTT11, SOFT13, RINT, C11/3T, T36T, AFT, FF, RBT, BPF, %C2]

Typical Discovered Relationships

[Figure: discovered relationships drawn over the process variables TGF, %C3, PGM, TT, T6T, PGB, SOTT11, SOFT13, RINT, C11/3T, T36T, AFT, FF, RBT, BPF, %C2]

Parameters

DBN Search
            GA            EP
PopSize     100           10
MR          0.1           0.8
CR          0.8           ---
Gen         Based on FC   Based on FC

Correlation Search
  • c - Approx. 20% of s
  • R - Approx. 2.5% of s

Grouping GA
            Synth. 1   Synth. 2-6           Oil
PopSize     150        100                  150
CR          0.8        0.8                  0.8
MR          0.1        0.1                  0.1
Gen         150        100 (1000 for GPV)   150

Parameters (Continued)

EP-Seeded GA
  • c - Approx. 20% of s
  • EPListSize - Approx. 2.5% of s
  • GAPopSize - 10
  • MR - 0.1
  • CR - 0.8
  • LMR - 0.1
  • Gen - Based on FC

HCHC
                  Oil       Synthetic
DBN_Iterations    1×10^6    5000
Winlen            1000      200
Winjump           500       50
