
Data Mining in Finance

Andreas S. Weigend

Leonard N. Stern School of Business, New York University

Nonlinear Models

8 February 1999

RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

The seven steps of model building
  • 1. Task
    • Predict distribution of portfolio returns, understand structure in yield curves, find profitable time scales, discover trade styles, …
  • 2. Data
    • Which data to use, and how to code/ preprocess/ represent them
  • 3. Architecture
  • 4. Objective/ Cost function (in-sample)
  • 5. Search/ Optimization/ Estimation
  • 6. Evaluation
  • 7. Analysis and Interpretation


How to make predictions?
    • “Pattern” = Input + Output Pair
  • Keep all data
    • Nearest neighbor lookup
    • Local constant model
    • Local linear model
  • Throw away data, only keep model
    • Global linear model
    • Global nonlinear model
      • Neural network with hidden units
        • Sigmoids or hyperbolic tangents (tanh)
      • Radial basis functions
  • Keep only a few representative data points
      • Support vector machines


Training data: Inputs and corresponding outputs

[Figure: training patterns plotted as points in 3D, with axes input1, input2, and output]


What is the prediction for a new input?

[Figure: the same plot with a new input marked in the input1–input2 plane; its output is unknown]


Nearest neighbor
  • Use output value of nearest neighbor in input space as prediction

[Figure: the nearest neighbor of the new input is highlighted; its output value is taken as the prediction]
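The lookup described above can be sketched in a few lines. The training patterns and function below are illustrative, not from the original presentation:

```python
import numpy as np

# Hypothetical training "patterns": (input1, input2) pairs with known outputs.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y_train = np.array([0.0, 1.0, 1.0, 2.0])

def nearest_neighbor_predict(x_new, X, y):
    """Return the output of the training point closest to x_new in input space."""
    distances = np.linalg.norm(X - x_new, axis=1)  # Euclidean distance to each pattern
    return y[np.argmin(distances)]

# The new input (0.9, 0.1) lies closest to the pattern (1.0, 0.0).
prediction = nearest_neighbor_predict(np.array([0.9, 0.1]), X_train, y_train)
```

Note that all data must be kept around at prediction time; nothing is summarized into a model.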


Local constant model
  • Use average of the outputs of nearby points in input space

[Figure: the outputs of the points near the new input are averaged to give the prediction]
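Averaging over the neighborhood smooths out noise in any single pattern. A minimal sketch, with illustrative data and an illustrative choice of k:

```python
import numpy as np

X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y_train = np.array([0.0, 1.0, 1.0, 2.0])

def local_constant_predict(x_new, X, y, k=2):
    """Average the outputs of the k nearest training points in input space."""
    distances = np.linalg.norm(X - x_new, axis=1)
    nearest = np.argsort(distances)[:k]  # indices of the k closest patterns
    return y[nearest].mean()

# The two nearest patterns to (0.9, 0.2) are (1.0, 0.0) and (1.0, 1.0).
prediction = local_constant_predict(np.array([0.9, 0.2]), X_train, y_train)
```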


Local linear model
  • Find best-fitting plane (linear model) through nearby points in input space

[Figure: a plane is fitted through the points near the new input; the prediction is read off the plane]
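Fitting the local plane is an ordinary least-squares problem on the neighborhood. A sketch under illustrative assumptions (the toy data here lie exactly on the plane output = input1 + input2, so the local fit recovers it):

```python
import numpy as np

# Toy data lying exactly on the plane output = input1 + input2.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
                    [2.0, 0.0], [0.0, 2.0]])
y_train = np.array([0.0, 1.0, 1.0, 2.0, 2.0, 2.0])

def local_linear_predict(x_new, X, y, k=4):
    """Least-squares plane through the k nearest points, evaluated at x_new."""
    idx = np.argsort(np.linalg.norm(X - x_new, axis=1))[:k]
    A = np.hstack([X[idx], np.ones((k, 1))])  # design matrix with intercept column
    coef, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
    return float(np.append(x_new, 1.0) @ coef)

prediction = local_linear_predict(np.array([0.5, 0.5]), X_train, y_train)
```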


Nonlinear regression surface
  • Minimize “energy” stored in the “springs”

[Figure: a smooth nonlinear surface fitted through the training points, with axes input1, input2, and output]


Throw away the data… just keep the surface!

[Figure: the fitted surface shown on its own, with the training points removed]


Modeling – an iterative process

  • Step 1: Task/ Problem definition
  • Step 2: Data and Representation
  • Step 3: Architecture
  • Step 4: Objective/ Cost function (in-sample)
  • Step 5: Search/ Optimization/ Estimation
  • Step 6: Evaluation (out-of-sample)
  • Step 7: Analysis and Interpretation


Modeling issues

  • Step 1: Task and Problem definition
  • Step 2: Data and Representation
  • Step 3: Architecture
    • What are the “primitives” that make up the surface?
  • Step 4: Objective/ Cost function (in-sample)
    • How flexible should the surface be?
      • Too rigid a model: a stiff board (global linear model)
      • Too flexible a model: cellophane passing through every point
      • Penalize overly flexible models (regularization)
  • Step 5: Search/ Optimization/ Estimation
    • How do we find the surface?
  • Step 6: Evaluation (out-of-sample)
  • Step 7: Analysis and Interpretation


Step 3: Architecture – Example of neural networks
  • Project the input vector x onto a weight vector w
    • w * x
  • This projection is then nonlinearly “squashed” to give a hidden unit activation
    • h = tanh (w * x)
  • Usually, a constant c in the argument allows shifting the location of the squashing
    • h = tanh (w * x + c)
  • There are several such hidden units, each responding to a different projection of the input vector
  • Their activations are combined with weights v to form the output (another constant b can be added)
    • output = v * h + b
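The recipe above can be written out directly as a forward pass through a one-hidden-layer network. The sizes and random weights below are illustrative, not from the presentation:

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden = 3, 4                  # illustrative sizes
W = rng.normal(size=(n_hidden, n_in))  # one weight vector w per hidden unit
c = rng.normal(size=n_hidden)          # constants shifting each squashing function
v = rng.normal(size=n_hidden)          # output weights
b = 0.1                                # output constant

def forward(x):
    h = np.tanh(W @ x + c)  # h = tanh(w * x + c): squashed projections of the input
    return v @ h + b        # output = v * h + b

y = forward(np.array([0.5, -1.0, 2.0]))
```

Because each tanh activation stays between -1 and 1, the output can never be further than the sum of |v| from the constant b.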


Neural networks compared to standard statistics
  • Comparison between neural nets and standard statistics
    • Complexity
      • Statistics: Fix order of interactions
      • Neural nets: Fix number of features
    • Estimation
      • Statistics: Find exact solution
      • Neural nets: Focus on path
  • Dimensionality
    • Number of inputs: Curse of dimensionality
      • Points far away in input space
    • Number of parameters: Blessing of dimensionality
      • Many hidden units make it easier to find a good local minimum
      • But need to control for model complexity


Step 4: Cost function
  • Key problem:
    • Want to be good on new data...
    • ...but we only have data from the past
  • Always
    • observation y = f(input) + noise
  • Assume
    • Large, sudden variations in the output are due to noise
    • Small, systematic variations are signal, expressed as f(input)
  • Flexible models
      • Good news: can fit any signal
      • Bad news: can also fit any noise
  • Requires modeling decisions:
    • Assumptions about model complexity
      • Weight decay, weight elimination, smoothness
    • Assumptions about noise: error model or noise model
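One common way to encode both assumptions at once, a squared-error noise model plus a weight-decay penalty on model complexity, can be sketched as below; the penalty strength lam is an illustrative choice, not a value from the presentation:

```python
import numpy as np

def cost(y, y_hat, weights, lam=0.01):
    """In-sample cost: squared-error noise model plus weight-decay complexity penalty."""
    mse = np.mean((y - y_hat) ** 2)      # assumes Gaussian noise on the output
    decay = lam * np.sum(weights ** 2)   # discourages overly flexible models
    return mse + decay
```

With a perfect fit and zero weights the cost is zero; larger weights raise the cost even when the fit to the data is unchanged.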


Step 5: Determining the parameters
  • Search with gradient descent: iterative
    • Vice to virtue: the path taken through weight space matters
    • Guide network through solution space
      • Hints
      • Weight pruning
      • Early stopping
      • Weight-elimination
      • Pseudo-data
      • Add noise
  • Alternative approaches:
    • Model to match the local noise level of the data
      • Local error bars
      • Gated experts architecture with adaptive variances
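Early stopping, one of the guidance techniques listed above, can be sketched on a toy one-parameter problem: descend the gradient of the training error, but keep the weight that did best on held-out data and stop once validation error stops improving. The data, learning rate, and patience below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy problem: y = 2*x + noise, split into training and validation halves.
x = rng.uniform(-1, 1, 200)
y = 2.0 * x + 0.1 * rng.normal(size=200)
x_tr, y_tr, x_va, y_va = x[:100], y[:100], x[100:], y[100:]

w, lr = 0.0, 0.1
best_w, best_val, bad, patience = 0.0, np.inf, 0, 20

for step in range(2000):
    grad = np.mean(2.0 * (w * x_tr - y_tr) * x_tr)  # gradient of training MSE
    w -= lr * grad                                   # gradient descent step
    val = np.mean((w * x_va - y_va) ** 2)            # out-of-sample check
    if val < best_val:
        best_w, best_val, bad = w, val, 0            # remember the best weight so far
    else:
        bad += 1
        if bad >= patience:                          # early stopping
            break
```

The returned best_w is the weight at the validation minimum, not necessarily the weight that fits the training set best.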
