# F4: Large Scale Automated Forecasting Using Fractals

## F4: Large Scale Automated Forecasting Using Fractals

-Deepayan Chakrabarti

-Christos Faloutsos

CIKM 2002

### Outline

• Introduction/Motivation

• Survey and Lag Plots

• Exact Problem Formulation

• Proposed Method

• Fractal Dimensions Background

• Our method

• Results

• Conclusions

### General Problem Definition

Given a time series $\{x_t\}$, predict its future course, that is, $x_{t+1}, x_{t+2}, \ldots$

[Figure: value vs. time plot of the series, with the future marked “?”.]

### Motivation

• Financial data analysis

• Physiological data, elderly care

• Weather, environmental studies

• Sensor networks (MEMS, “SmartDust”)

• Long / “infinite” series

• No human intervention → “black box”


### How to forecast?

• ARIMA but linearity assumption

• Neural Networks  but large number of parameters and long training times [Wan/1993, Mozer/1993]

• Hidden Markov Models  O(N2) in number of nodes N; also fixing N is a problem [Ge+/2000]

• Lag Plots

CIKM 2002

### Lag Plots

[Figure: lag plot of x(t) against x(t-1). The 4 nearest neighbors (4-NN) of the new point are interpolated to get the final prediction.]

• Q0: Interpolation method

• Q1: Lag = ?

• Q2: K = ?

• Interpolation using SVD (state of the art) [Sauer/1993]
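A minimal sketch of forecasting from a lag plot, under stated assumptions: the helper names `delay_vectors` and `knn_forecast` are hypothetical, and a plain average of the neighbors' successors stands in for the SVD-based interpolation cited on the slide. The lag and k are passed in by hand here; choosing them automatically is the subject of the rest of the talk.

```python
import math

def delay_vectors(series, lag):
    """Pair each delay vector (x[t-1], ..., x[t-lag]) with the value x[t] that followed it."""
    return [
        (tuple(series[t - j] for j in range(1, lag + 1)), series[t])
        for t in range(lag, len(series))
    ]

def knn_forecast(series, lag, k):
    """Predict the next value from the k stored delay vectors nearest to the latest one."""
    if len(series) <= lag:
        raise ValueError("series must be longer than the lag")
    history = delay_vectors(series, lag)
    # Latest delay vector: the most recent `lag` values, newest first.
    query = tuple(series[len(series) - j] for j in range(1, lag + 1))
    nearest = sorted(history, key=lambda pair: math.dist(pair[0], query))[:k]
    # Assumption: plain averaging stands in for the interpolation step.
    return sum(value for _, value in nearest) / len(nearest)

# A repeating series, so every stored neighbor of the latest value has the same successor.
print(knn_forecast([0, 1, 2, 3] * 10, lag=1, k=3))
```

Sorting all stored vectors costs O(N log N) per query; it keeps the sketch short but is not a fast nearest-neighbor search.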

### Why Lag Plots?

• Based on Takens’ theorem [Takens/1981], which says that delay vectors can be used for predictive purposes

### Inside Theory

Example: Lotka-Volterra equations

$$\frac{\Delta H}{\Delta t} = rH - aHP, \qquad \frac{\Delta P}{\Delta t} = bHP - mP$$

where H is the density of prey and P is the density of predators.

Suppose only H(t) is observed. The internal state is (H, P).
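To make the example concrete, the sketch below steps the two equations forward with simple Euler updates and then builds delay vectors from H alone, as if P were hidden. The function names, the parameter values, and the step size are illustrative assumptions, not taken from the slides.

```python
def simulate_lotka_volterra(h0, p0, r, a, b, m, dt, steps):
    """Euler steps of dH/dt = rH - aHP and dP/dt = bHP - mP; returns the lists (H, P)."""
    hs, ps = [h0], [p0]
    for _ in range(steps):
        h, p = hs[-1], ps[-1]
        hs.append(h + dt * (r * h - a * h * p))
        ps.append(p + dt * (b * h * p - m * p))
    return hs, ps

def delay_embed(series, lag):
    """Delay vectors (x[t], x[t-1], ..., x[t-lag]) built from a single observed variable."""
    return [
        tuple(series[t - j] for j in range(lag + 1))
        for t in range(lag, len(series))
    ]

# Arbitrary illustrative parameters (assumptions, not values from the talk).
prey, predators = simulate_lotka_volterra(
    h0=10.0, p0=5.0, r=1.0, a=0.1, b=0.02, m=0.5, dt=0.01, steps=2000
)
observed = prey                      # only H(t) is observed; P(t) stays hidden
vectors = delay_embed(observed, lag=1)
print(len(vectors), vectors[0])
```

Plotting `vectors` gives the lag plot of H; the hidden predator density shows up only through the shape of that plot.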


### Problem at hand

• Given $\{x_1, x_2, \ldots, x_N\}$

• Automatically set the parameters L(opt) (from Q1) and k(opt) (from Q2)

• in time linear in N

• to minimise the Normalized Mean Squared Error (NMSE) of forecasting

### Previous work/Alternatives

• Manual setting: BUT infeasible [Sauer/1992]

• Cross-validation: BUT slow; leave-one-out cross-validation is ~ $O(N^2 \log N)$ or more

• “False Nearest Neighbors”: BUT unstable [Abarbanel/1996]


### Intuition

The Logistic Parabola: $x_t = a\,x_{t-1}(1 - x_{t-1}) + \text{noise}$

[Figure: x(t) plotted against time, and the lag plot of x(t) against x(t-1).]

Intrinsic dimensionality ≈ degrees of freedom ≈ information about $x_t$ given $x_{t-1}$
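A small sketch of how such a series and its lag plot points could be generated. The function name, the default value of `a`, the Gaussian noise model, and the clamping to [0, 1] are all assumptions made for illustration; the slides only give the recurrence itself.

```python
import random

def logistic_parabola(n, a=3.8, x0=0.3, noise_sd=0.0, seed=0):
    """Generate n values of x[t] = a * x[t-1] * (1 - x[t-1]) + noise."""
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(n - 1):
        nxt = a * xs[-1] * (1.0 - xs[-1]) + rng.gauss(0.0, noise_sd)
        # Assumption: clamp so that noisy values stay inside [0, 1].
        xs.append(min(1.0, max(0.0, nxt)))
    return xs

series = logistic_parabola(1000)
# Lag plot points (x[t-1], x[t]); without noise they all lie on one parabola.
lag_plot_points = list(zip(series[:-1], series[1:]))
print(lag_plot_points[:3])
```

Although the series looks erratic against time, each lag plot point is determined by its first coordinate, which is the sense in which x(t-1) carries the information about x(t).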

[Figure: lag plots of the series with axes x(t), x(t-1), and x(t-2).]

### Intuition

• To find L(opt):

• Go further back in time (i.e., consider $x_{t-2}$, $x_{t-3}$, and so on)


### Fractal Dimensions

• FD = intrinsic dimensionality

[Figure: example with “embedding” dimensionality = 3 and intrinsic dimensionality = 1.]

### Fractal Dimensions

• FD = intrinsic dimensionality [Belussi/1995]

[Figure: log(# pairs) plotted against log(r).]

• Points to note:

• FD can be a non-integer

• There are fast methods to compute it
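As a rough illustration of the log(# pairs) vs. log(r) plot, the sketch below counts the pairs of points within distance r for a few radii and takes the least-squares slope as the fractal dimension estimate. This is a naive O(N²) pair count written for clarity; it is not one of the fast methods the slide refers to, and the function name and the choice of radii are assumptions.

```python
import math

def correlation_dimension(points, radii):
    """Estimate FD as the least-squares slope of log(#pairs within r) against log(r)."""
    n = len(points)
    # All pairwise distances, computed once: this is the O(N^2) part.
    dists = [math.dist(points[i], points[j]) for i in range(n) for j in range(i + 1, n)]
    log_r, log_pairs = [], []
    for r in radii:
        pairs = sum(1 for d in dists if d <= r)
        if pairs > 0:  # log(0) is undefined, so skip radii that capture no pairs
            log_r.append(math.log(r))
            log_pairs.append(math.log(pairs))
    if len(log_r) < 2:
        raise ValueError("need at least two radii that capture some pairs")
    mean_x = sum(log_r) / len(log_r)
    mean_y = sum(log_pairs) / len(log_pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_r, log_pairs))
    sxx = sum((x - mean_x) ** 2 for x in log_r)
    return sxy / sxx

# A straight line embedded in 3-D: embedding dimensionality 3, intrinsic dimensionality 1.
line = [(t / 299, 2 * t / 299, -t / 299) for t in range(300)]
fd = correlation_dimension(line, radii=[0.05, 0.1, 0.2, 0.4])
print(round(fd, 2))
```

The estimate comes out slightly below 1 for the line because points near the ends have fewer neighbors, which is one reason the result is generally a non-integer even for simple sets.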


### Q1: Finding L(opt)

• Use Fractal Dimensions to find the optimal lag length L(opt)

[Figure: Fractal Dimension plotted against Lag (L), with f, epsilon, and L(opt) marked on the plot.]
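One plausible reading of the plot, stated here as an assumption because the slide does not spell out the rule: f is the height at which the FD vs. lag curve flattens, epsilon is a tolerance, and L(opt) is the smallest lag whose fractal dimension is within epsilon of f. The function name `choose_lag` and the FD values in the example are hypothetical and are not results from the talk.

```python
def choose_lag(fd_by_lag, epsilon):
    """Return (L_opt, f), where fd_by_lag[i] is the fractal dimension at lag i + 1.

    L_opt is the smallest lag whose FD is within epsilon of the plateau value f.
    """
    if not fd_by_lag:
        raise ValueError("need at least one FD value")
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    f = max(fd_by_lag)  # assumption: the plateau height is the largest FD observed
    for lag, fd in enumerate(fd_by_lag, start=1):
        if f - fd <= epsilon:
            return lag, f

# Made-up FD values for lags 1..6, for illustration only.
lag_opt, f = choose_lag([0.9, 1.6, 1.95, 2.0, 2.01, 2.01], epsilon=0.1)
print(lag_opt, f)
```

The loop always returns, because the lag that attains the maximum trivially satisfies the tolerance.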

### Q2: Finding k(opt)

• To find k(opt):

• Conjecture: k(opt) ~ O(f)

• We choose k(opt) = 2*f + 1
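A worked instance of the formula, assuming f is the fractal dimension read off the FD vs. lag plot. Since f can be a non-integer and a neighbor count cannot, the sketch rounds 2*f + 1 up; that rounding rule and the name `choose_k` are assumptions, not something the slide specifies.

```python
import math

def choose_k(f):
    """k(opt) = 2*f + 1, rounded up so the neighbor count is a whole number."""
    return math.ceil(2 * f + 1)

# A hypothetical fractal dimension of 2.06 gives 2 * 2.06 + 1 = 5.12, rounded up.
print(choose_k(2.06))
```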


### Datasets

• Logistic Parabola: $x_t = a\,x_{t-1}(1 - x_{t-1}) + \text{noise}$. Models population of flies [R. May/1976]

• LORENZ: models convection currents in the air

• LASER: fluctuations in a laser over time (from the Santa Fe Time Series Competition, 1992)

[Figure: value vs. time plot for each dataset.]

Error: $\text{NMSE} = \sum (\text{predicted} - \text{true})^2 / \sigma^2$
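A minimal sketch of the error measure. Two details are assumptions: σ² is taken to be the variance of the true values, and the sum of squared errors is averaged over the number of points, as the word "mean" in NMSE suggests even though the formula on the slide omits the 1/N factor. The function name `nmse` is hypothetical.

```python
def nmse(predicted, true):
    """Mean squared prediction error divided by the variance of the true values."""
    if len(predicted) != len(true) or not true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(true)
    mean_true = sum(true) / n
    variance = sum((x - mean_true) ** 2 for x in true) / n
    if variance == 0:
        raise ValueError("true values are constant, so the variance is zero")
    mse = sum((p - t) ** 2 for p, t in zip(predicted, true)) / n
    return mse / variance

true_values = [1.0, 2.0, 3.0, 4.0]
print(nmse(true_values, true_values))   # perfect prediction
print(nmse([2.5] * 4, true_values))     # always predicting the mean of the true values
```

Under this normalization, a forecaster that always predicts the mean of the true values scores 1, so values well below 1 indicate that the forecast captures real structure.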

### Logistic Parabola

[Figure: value vs. timesteps for the series, and FD vs. lag.]

• FD vs L plot flattens out

• L(opt) = 1

### Logistic Parabola

[Figure: value vs. timesteps, with the point marked “Our prediction from here”.]

### Logistic Parabola

[Figure: comparison of prediction to correct values (value vs. timesteps).]

### Logistic Parabola

[Figure: FD vs. lag and NMSE vs. lag.]

Our L(opt) = 1, which exactly minimizes NMSE

### LORENZ

[Figure: value vs. timesteps for the series, and FD vs. lag.]

• L(opt) = 5

### LORENZ

[Figure: value vs. timesteps, with the point marked “Our prediction from here”.]

### LORENZ

[Figure: comparison of prediction to correct values (value vs. timesteps).]

### LORENZ

[Figure: FD vs. lag and NMSE vs. lag.]

L(opt) = 5. NMSE is also optimal at lag = 5.

### Laser

[Figure: value vs. timesteps for the series, and FD vs. lag.]

• L(opt) = 7

### Laser

[Figure: value vs. timesteps, with the point marked “Our prediction starts here”.]

### Laser

[Figure: comparison of prediction to correct values (value vs. timesteps).]

### Laser

[Figure: FD vs. lag and NMSE vs. lag.]

L(opt) = 7. The corresponding NMSE is close to optimal.

### Speed and Scalability

• Preprocessing is linear in N

• Proportional to time taken to calculate FD


### Conclusions

Our Method:

• Automatically set parameters

• In time linear in N

### Conclusions

• Black-box non-linear time series forecasting

• Fractal Dimensions give a fast, automated method to set all parameters

• So, given any time series, we can automatically build a prediction system

• Useful in a sensor network setting


### Snapshot

http://snapdragon.cald.cs.cmu.edu/TSP


### Future Work

• Feature Selection

• Multi-sequence prediction


### Discussion – Some other problems

How to forecast?

Given:

• $x_1, x_2, \ldots, x_N$

• L(opt)

• k(opt)

How to find the k(opt) nearest neighbors quickly?


### Motivation

• Forecasting also allows us to:

• Find outliers: anything that doesn’t match our prediction!

• Find patterns: if different circumstances lead to similar predictions, they may be related.

### Motivation (Examples)

• EEGs: patterns of electromagnetic impulses in the brain

• Intensity variations of white dwarf stars

• Highway usage over time

Sensors:

• “Active Disks” for forecasting / prefetching / buffering

• “Smart House”: sensors monitor the situation in a house

• Volcano monitoring

• Store all the delay vectors $\{x_{t-1}, \ldots, x_{t-L(\text{opt})}\}$ and the corresponding prediction $x_t$

• Find the latest delay vector

• Find nearest neighbors

• Interpolate

[Figure: lag plot of x(t) against x(t-1), illustrating the interpolation step.]

Open parameters: L(opt) = ?, K(opt) = ?

### Intuition

[Figure: fractal dimension vs. lag.]

• The FD vs L plot does flatten out

• L(opt) = 1

### Inside Theory

• Internal state may be unobserved

• But the delay vector space is a faithful reconstruction of the internal system state

• So prediction in delay vector space is as good as prediction in state space


### Fractal Dimensions

• Many real-world datasets have fractional intrinsic dimension

• There exist fast (O(N)) methods to calculate the fractal dimension of a cloud of points [Belussi/1995]


### Speed and Scalability

• Preprocessing varies as $L(\text{opt})^2$