### F4: Large Scale Automated Forecasting Using Fractals

- Deepayan Chakrabarti
- Christos Faloutsos

CIKM 2002

Outline

- Introduction/Motivation
- Survey and Lag Plots
- Exact Problem Formulation
- Proposed Method
  - Fractal Dimensions Background
  - Our method
- Results
- Conclusions

General Problem Definition

[Figure: a time series, plotted as Value against Time]

Given a time series $\{x_t\}$, predict its future course, that is, $x_{t+1}, x_{t+2}, \ldots$

Motivation

Traditional fields

- Financial data analysis
- Physiological data, elderly care
- Weather, environmental studies

Sensor Networks (MEMS, “SmartDust”)

- Long / “infinite” series
- No human intervention: “black box”

How to forecast?

- ARIMA, but it assumes linearity
- Neural Networks, but they have a large number of parameters and long training times [Wan/1993, Mozer/1993]
- Hidden Markov Models, but they are $O(N^2)$ in the number of nodes N; fixing N is also a problem [Ge+/2000]
- Lag Plots

Lag Plots

[Figure: lag plot of $x_t$ against $x_{t-1}$. The 4 nearest neighbors (4-NN) of the new point are interpolated to get the final prediction.]

Q1: Lag = ?

Q2: K = ?

Why Lag Plots?

- Based on Takens' Theorem [Takens/1981], which says that delay vectors can be used for predictive purposes
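
A delay vector is just a window of the most recent values. As a minimal sketch (the function name `delay_vectors` and the toy series are illustrative assumptions, not part of the slides), the pairs of delay vector and following value can be built like this:

```python
def delay_vectors(x, lag):
    """Return (vectors, targets) for a series x.

    vectors[i] is (x[t-1], ..., x[t-lag]) and targets[i] is x[t],
    for t = lag, ..., len(x) - 1.
    """
    if lag < 1 or lag >= len(x):
        raise ValueError("lag must satisfy 1 <= lag < len(x)")
    vectors = [tuple(x[t - j] for j in range(1, lag + 1))
               for t in range(lag, len(x))]
    targets = [x[t] for t in range(lag, len(x))]
    return vectors, targets


# Toy series (hypothetical data, only to show the layout).
series = [1.0, 2.0, 3.0, 4.0, 5.0]
vecs, tgts = delay_vectors(series, lag=2)
# vecs == [(2.0, 1.0), (3.0, 2.0), (4.0, 3.0)], tgts == [3.0, 4.0, 5.0]
```

With lag 1 the vectors and targets are exactly the points of the lag plot of x(t) against x(t-1).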

Inside Theory

Example: Lotka-Volterra equations

$$\frac{\Delta H}{\Delta t} = rH - aHP, \qquad \frac{\Delta P}{\Delta t} = bHP - mP$$

H is the density of prey; P is the density of predators.

Suppose only H(t) is observed. The internal state is (H, P).
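
To make the "observed versus internal state" point concrete, here is a small sketch that steps the two equations forward and records only H. The forward-Euler stepping, the step size, and all parameter values are assumptions chosen for illustration; the slides give none of them.

```python
def lotka_volterra(h0, p0, r, a, b, m, dt, steps):
    """Step dH/dt = rH - aHP, dP/dt = bHP - mP with forward Euler.

    Returns only the prey series H(t); the predator density P is the
    hidden part of the internal state (H, P).
    """
    h, p = h0, p0
    observed = [h]
    for _ in range(steps):
        dh = (r * h - a * h * p) * dt
        dp = (b * h * p - m * p) * dt
        h, p = h + dh, p + dp
        observed.append(h)
    return observed


# Illustrative parameters (assumptions, not from the presentation).
prey = lotka_volterra(h0=10.0, p0=5.0, r=1.0, a=0.1, b=0.02, m=0.5,
                      dt=0.01, steps=2000)
```

A forecaster sees only `prey`; delay vectors built from it stand in for the unobserved pair (H, P).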

Problem at hand

- Given $\{x_1, x_2, \ldots, x_N\}$
- Automatically set the parameters
  - L(opt) (from Q1)
  - k(opt) (from Q2)
- in time linear in N
- to minimise the Normalized Mean Squared Error (NMSE) of forecasting
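
The error measure named above can be sketched as follows. This assumes the squared errors are averaged over the test points, as the word "Mean" suggests, and normalised by the variance of the true values; the function name `nmse` is illustrative.

```python
def nmse(predicted, true):
    """Mean squared error divided by the variance of the true values."""
    n = len(true)
    if n == 0 or len(predicted) != n:
        raise ValueError("predicted and true must be non-empty and equal length")
    mean_true = sum(true) / n
    variance = sum((v - mean_true) ** 2 for v in true) / n
    if variance == 0:
        raise ValueError("true values have zero variance")
    mse = sum((p - t) ** 2 for p, t in zip(predicted, true)) / n
    return mse / variance


truth = [1.0, 2.0, 3.0, 4.0]
perfect = nmse(truth, truth)           # 0.0: no error at all
baseline = nmse([2.5] * 4, truth)      # always predicting the mean
```

Under this normalisation, always predicting the mean of the series scores 1, so a useful forecaster should score well below 1.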

Previous work/Alternatives

- Manual setting: BUT infeasible [Sauer/1992]
- Cross-validation: BUT slow; leave-one-out cross-validation is ~ $O(N^2 \log N)$ or more
- “False Nearest Neighbors”: BUT unstable [Abarbanel/1996]

Intuition

[Figures: the series x(t) against time, and its lag plot of x(t) against x(t-1)]

Intrinsic Dimensionality

≈ Degrees of Freedom

≈ Information about $x_t$ given $x_{t-1}$

The Logistic Parabola: $x_t = a x_{t-1}(1 - x_{t-1}) + \text{noise}$
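
A minimal sketch of generating the logistic parabola and the points of its lag plot. The values of `a`, `x0`, `noise_std`, the seed, and the clipping step are all illustrative assumptions; the slides do not state them.

```python
import random


def logistic_parabola(n, a=3.8, x0=0.3, noise_std=0.0, seed=0):
    """Generate n values of x_t = a * x_{t-1} * (1 - x_{t-1}) + noise.

    All default values are assumptions made for this example.
    """
    rng = random.Random(seed)
    x = [x0]
    for _ in range(n - 1):
        nxt = a * x[-1] * (1.0 - x[-1]) + rng.gauss(0.0, noise_std)
        # Assumption: clip so that noise cannot push the series out of [0, 1].
        x.append(min(max(nxt, 0.0), 1.0))
    return x


series = logistic_parabola(200)
# Lag plot with lag 1: pairs (x_{t-1}, x_t).
lag_pairs = list(zip(series[:-1], series[1:]))
```

Without noise every pair lies exactly on the parabola, so knowing x(t-1) pins down x(t) even though the series looks erratic when plotted against time.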

Intuition

- To find L(opt):
  - Go further back in time (i.e., consider $x_{t-2}$, $x_{t-3}$ and so on)
  - until no more information is gained about $x_t$

Fractal Dimensions

- FD = intrinsic dimensionality

“Embedding” dimensionality = 3

Intrinsic dimensionality = 1

Fractal Dimensions

FD = intrinsic dimensionality [Belussi/1995]

[Figure: log(# pairs) plotted against log(r)]

- Points to note:
  - FD can be a non-integer
  - There are fast methods to compute it
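
One hedged reading of the log(# pairs) versus log(r) plot, assuming it shows the pair-count (correlation) form of the fractal dimension: let $P(r)$ be the number of pairs of points lying within distance $r$ of each other. Over the range of scales where the point set is self-similar, $P(r)$ grows as a power of $r$, and the exponent is the fractal dimension, that is, the slope of the log-log plot. The symbols $P$, $D$ and $p_i$ are notation introduced here, not taken from the slides.

$$
P(r) = \#\{(i, j) : i < j,\ \lVert p_i - p_j \rVert \le r\} \;\propto\; r^{D}
\quad\Longrightarrow\quad
D = \frac{d \log P(r)}{d \log r}
$$

For points spread evenly along a line, doubling $r$ roughly doubles $P(r)$, so $D \approx 1$ whatever the embedding dimensionality.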

Q1: Finding L(opt)

- Use Fractal Dimensions to find the optimal lag length L(opt)

[Figure: Fractal Dimension plotted against Lag (L), with f and L(opt) marked]
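
The idea of reading L(opt) off the fractal-dimension curve can be sketched as below. Several things here are assumptions made for illustration: the fractal dimension is estimated with a naive O(N²) pair count rather than the fast methods the slides refer to, the lag-plot points include $x_t$ itself, and the flattening rule in `choose_lag` (with its tolerance `tol`) is a hypothetical stand-in for whatever criterion the authors use.

```python
import math


def lag_points(x, lag):
    """Points (x[t], x[t-1], ..., x[t-lag]) of the lag plot for one lag."""
    return [tuple(x[t - j] for j in range(lag + 1)) for t in range(lag, len(x))]


def correlation_dimension(points, radii):
    """Least-squares slope of log(#pairs within r) against log(r)."""
    # Naive O(N^2) pairwise distances (L-infinity norm); demo only.
    dists = [max(abs(a - b) for a, b in zip(p, q))
             for i, p in enumerate(points) for q in points[i + 1:]]
    logs = []
    for r in radii:
        count = sum(1 for d in dists if d <= r)
        if count > 0:
            logs.append((math.log(r), math.log(count)))
    if len(logs) < 2:
        raise ValueError("need at least two radii with a non-zero pair count")
    mean_x = sum(lx for lx, _ in logs) / len(logs)
    mean_y = sum(ly for _, ly in logs) / len(logs)
    num = sum((lx - mean_x) * (ly - mean_y) for lx, ly in logs)
    den = sum((lx - mean_x) ** 2 for lx, _ in logs)
    return num / den


def fd_versus_lag(x, max_lag, radii):
    """Fractal dimension of the lag plot for lag = 1 .. max_lag."""
    return [correlation_dimension(lag_points(x, lag), radii)
            for lag in range(1, max_lag + 1)]


def choose_lag(fd_by_lag, tol=0.1):
    """Smallest lag after which the curve rises by less than tol.

    Hypothetical flattening rule; fd_by_lag[0] belongs to lag 1.
    """
    for i in range(len(fd_by_lag) - 1):
        if fd_by_lag[i + 1] - fd_by_lag[i] < tol:
            return i + 1
    return len(fd_by_lag)


# Demo on a noise-free logistic parabola (a = 3.8 is an assumption).
x = [0.3]
for _ in range(299):
    x.append(3.8 * x[-1] * (1.0 - x[-1]))
fd_curve = fd_versus_lag(x, max_lag=3, radii=[0.02, 0.04, 0.08, 0.16])
best_lag = choose_lag(fd_curve)
```

The choice of radii matters in practice: the slope should be fitted only over the range where the log-log plot is roughly straight.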

Datasets

- Logistic Parabola: $x_t = a x_{t-1}(1 - x_{t-1}) + \text{noise}$. Models population of flies [R. May/1976]
- LORENZ: Models convection currents in the air
- LASER: Fluctuations in a laser over time (from the Santa Fe Time Series Competition, 1992)

Error: $\mathrm{NMSE} = \sum (\text{predicted} - \text{true})^2 / \sigma^2$

Speed and Scalability

- Preprocessing is linear in N
- Proportional to time taken to calculate FD

Conclusions

Our Method:

- Automatically sets the parameters
  - L(opt) (answers Q1)
  - k(opt) (answers Q2)
- in time linear in N

Conclusions

- Black-box non-linear time series forecasting
- Fractal Dimensions give a fast, automated method to set all parameters
- So, given any time series, we can automatically build a prediction system
- Useful in a sensor network setting

Discussion – Some other problems

How to forecast?

Given:

- x1, x2, …, xN
- L(opt)
- k(opt)

How to find the k(opt) nearest neighbors quickly?

Motivation

- Forecasting also allows us to
  - Find outliers: anything that doesn’t match our prediction!
  - Find patterns: if different circumstances lead to similar predictions, they may be related.

Motivation (Examples)

Traditional

- EEGs : Patterns of electromagnetic impulses in the brain
- Intensity variations of white dwarf stars
- Highway usage over time

Sensors

- “Active Disks” for forecasting / prefetching / buffering
- “Smart House”: sensors monitor the situation in a house
- Volcano monitoring

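General Method

- Store all the delay vectors $\{x_{t-1}, \ldots, x_{t-L(\text{opt})}\}$ and the corresponding prediction $x_t$
- Find the latest delay vector
- Find its nearest neighbors
- Interpolate

[Figure: lag plot of $x_t$ against $x_{t-1}$; the nearest neighbors of the latest delay vector are interpolated]

L(opt) = ?

K(opt) = ?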
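
The four steps of the general method can be sketched as a brute-force nearest-neighbor forecaster. This is an illustration under stated assumptions: `lag` and `k` are passed in by hand rather than set automatically, the neighbor search is a plain sort rather than a fast index, and "interpolate" is taken to mean a simple average of the neighbors' following values, which the slides do not specify.

```python
def knn_forecast(x, lag, k):
    """Predict the value that follows series x from its k nearest delay vectors."""
    if lag < 1 or lag >= len(x) or k < 1:
        raise ValueError("need 1 <= lag < len(x) and k >= 1")

    # 1. Store every delay vector together with the value that followed it.
    history = [(tuple(x[t - j] for j in range(1, lag + 1)), x[t])
               for t in range(lag, len(x))]

    # 2. The latest delay vector: the last `lag` values, most recent first.
    query = tuple(x[len(x) - j] for j in range(1, lag + 1))

    # 3. Find the k nearest stored vectors (squared Euclidean distance).
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    nearest = sorted(history, key=lambda item: sq_dist(item[0], query))[:k]

    # 4. Interpolate. Assumption: plain average of the neighbors' successors.
    return sum(target for _, target in nearest) / len(nearest)


# Toy periodic series (hypothetical data): 0, 1, 2, 3 repeated.
series = [0.0, 1.0, 2.0, 3.0] * 5
next_value = knn_forecast(series, lag=2, k=3)
```

In the toy series every past occurrence of (3, 2) was followed by 0, so the forecast is 0. The sort costs O(N log N) per query, which is why the question of finding the neighbors quickly is raised separately.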

Inside Theory

- Internal state may be unobserved
- But the delay vector space is a faithful reconstruction of the internal system state
- So prediction in delay vector space is as good as prediction in state space

Fractal Dimensions

- Many real-world datasets have fractional intrinsic dimension
- There exist fast (O(N)) methods to calculate the fractal dimension of a cloud of points [Belussi/1995]
