Data Mining in Finance

Presentation Transcript



Data Mining in Finance

Andreas S. Weigend

Leonard N. Stern School of Business, New York University

Nonlinear Models

8 February 1999

RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU



The seven steps of model building

  • 1. Task

    • Predict distribution of portfolio returns, understand structure in yield curves, find profitable time scales, discover trade styles, …

  • 2. Data

    • Which data to use, and how to code/ preprocess/ represent them

  • 3. Architecture

  • 4. Objective/ Cost function (in-sample)

  • 5. Search/ Optimization/ Estimation

  • 6. Evaluation

  • 7. Analysis and Interpretation




How to make predictions?

  • “Pattern” = Input + Output Pair

  • Keep all data

    • Nearest neighbor lookup

    • Local constant model

    • Local linear model

  • Throw away data, only keep model

    • Global linear model

    • Global nonlinear model

      • Neural network with hidden units

        • Sigmoids or hyperbolic tangents (tanh)

      • Radial basis functions

  • Keep only a few representative data points

    • Support vector machines



Training data: Inputs and corresponding outputs

[Figure: training patterns plotted as points in 3-D, with axes input1 and input2 (inputs) and output]


What is the prediction for a new input?

[Figure: the same training points, with a new input marked in the (input1, input2) plane and its output unknown]


Nearest neighbor

  • Use the output value of the nearest neighbor in input space as the prediction

[Figure: the training point closest to the new input is highlighted; its output becomes the prediction]
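The nearest-neighbor rule can be sketched in a few lines (illustrative code, not from the original slides; the pattern format and function name are assumptions):

```python
import math

# A "pattern" is an (input, output) pair, as defined earlier.
def nearest_neighbor_predict(patterns, new_input):
    """Predict with the output of the training pattern whose input
    lies closest (Euclidean distance) to the new input."""
    def dist(x, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, q)))
    nearest = min(patterns, key=lambda p: dist(p[0], new_input))
    return nearest[1]  # keep all data; just look up the closest one
```

Note that all training data must be kept around at prediction time; nothing is summarized into a model.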


Local constant model

  • Use the average of the outputs of nearby points in input space

[Figure: the outputs of the training points near the new input are averaged into a single prediction]
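Averaging over a neighborhood instead of taking a single neighbor gives the local constant model; a minimal sketch (the choice of k is an illustrative assumption):

```python
import math

def local_constant_predict(patterns, new_input, k=3):
    """Average the outputs of the k training points nearest to the
    new input: a local constant (k-nearest-neighbor) model."""
    def dist(x, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, q)))
    nearby = sorted(patterns, key=lambda p: dist(p[0], new_input))[:k]
    return sum(out for _, out in nearby) / len(nearby)
```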


Local linear model

  • Find the best-fitting plane (linear model) through nearby points in input space

[Figure: a plane fitted through the training points near the new input; the prediction is the plane's height there]
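A local linear model fits a plane to the neighborhood by least squares and evaluates it at the new input. A sketch, assuming array-shaped data (names are illustrative):

```python
import numpy as np

def local_linear_predict(X, y, new_input, k=5):
    """Fit a least-squares plane through the k training points nearest
    to new_input and read the prediction off that plane."""
    X, y, q = np.asarray(X, float), np.asarray(y, float), np.asarray(new_input, float)
    idx = np.argsort(np.linalg.norm(X - q, axis=1))[:k]  # k nearest in input space
    A = np.hstack([X[idx], np.ones((k, 1))])             # inputs plus intercept column
    coef, *_ = np.linalg.lstsq(A, y[idx], rcond=None)    # best-fitting plane
    return float(np.append(q, 1.0) @ coef)
```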


Nonlinear regression surface

  • Minimize the “energy” stored in the “springs”

[Figure: a smooth surface fitted through the training points, with springs connecting each point to the surface]
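The “energy in the springs” is just the sum of squared vertical distances between each data point and the surface, that is, a least-squares cost. A minimal sketch (the function name is illustrative):

```python
def spring_energy(outputs, surface_values):
    """Total squared vertical stretch of the springs connecting each
    training output to the surface value at the same input."""
    return sum((y - f) ** 2 for y, f in zip(outputs, surface_values))
```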


Throw away the data… just keep the surface!

[Figure: the fitted regression surface shown alone, with the training points removed]



    Modeling – an iterative process

  • Step 1: Task/ Problem definition

  • Step 2: Data and Representation

  • Step 3: Architecture

  • Step 4: Objective/ Cost function (in-sample)

  • Step 5: Search/ Optimization/ Estimation

  • Step 6: Evaluation (out-of-sample)

  • Step 7: Analysis and Interpretation



    Modeling issues

  • Step 1: Task and Problem definition

  • Step 2: Data and Representation

  • Step 3: Architecture

    • What are the “primitives” that make up the surface?

  • Step 4: Objective/ Cost function (in-sample)

    • How flexible should the surface be?

      • Too rigid model: stiff board (global linear model)

      • Too flexible model: cellophane going through all points

      • Penalize too flexible models (regularization)

  • Step 5: Search/ Optimization/ Estimation

    • How do we find the surface?

  • Step 6: Evaluation (out-of-sample)

  • Step 7: Analysis and Interpretation
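The stiff-board versus cellophane contrast can be seen with polynomial fits of different flexibility. A sketch on made-up data, with `numpy.polyfit` standing in for the model class:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 8)
noise = np.array([0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05])
y = np.sin(2 * np.pi * x) + noise       # signal plus noise

stiff = np.polyfit(x, y, deg=1)         # "stiff board": global linear model
flexible = np.polyfit(x, y, deg=7)      # "cellophane": passes through all 8 points

sse_stiff = np.sum((np.polyval(stiff, x) - y) ** 2)
sse_flex = np.sum((np.polyval(flexible, x) - y) ** 2)
# The flexible model drives in-sample error to ~0, fitting the noise as well;
# regularization penalizes that flexibility.
```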



    Step 3: Architecture – Example of neural networks

    • Project the input vector x onto a weight vector w

      • w * x

    • This projection is then nonlinearly “squashed” to give a hidden unit activation

      • h = tanh (w * x)

    • Usually, a constant c in the argument allows shifting the location

      • h = tanh (w * x + c)

    • There are several such hidden units, responding to different projections of the input vectors

    • Their activations are combined with weights v to form the output (and another constant b can be added)

      • output = v * h + b

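The architecture described above amounts to a one-hidden-layer network; a minimal forward pass (shapes and names are illustrative assumptions):

```python
import numpy as np

def forward(x, W, c, v, b):
    """One-hidden-layer network: each row of W projects the input x,
    c shifts the argument, tanh squashes it into a hidden activation h,
    and v, b combine the activations linearly into the output."""
    h = np.tanh(W @ x + c)    # hidden unit activations, h = tanh(w*x + c)
    return float(v @ h + b)   # output = v*h + b
```

With all projection weights and shifts at zero, every hidden unit is silent and the output reduces to the constant b.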



    Neural networks compared to standard statistics

    • Comparison between neural nets and standard statistics

      • Complexity

        • Statistics: Fix order of interactions

        • Neural nets: Fix number of features

      • Estimation

        • Statistics: Find exact solution

        • Neural nets: Focus on path

    • Dimensionality

      • Number of inputs: Curse of dimensionality

        • Points far away in input space

      • Number of parameters: Blessing of dimensionality

        • Many hidden units make it easier to find good local minimum

        • But need to control for model complexity




    Step 4: Cost function

    • Key problem:

      • Want to be good on new data...

      • ...but we only have data from the past

    • Always

      observation y = f(input) + noise

    • Assume

      • Large sudden variations in output are due to noise

      • Small, systematic variations are signal, expressed as f(input)

    • Flexible models

      • Good news: can fit any signal

      • Bad news: can also fit any noise

    • Requires modeling decisions:

      • Assumptions about model complexity

        • Weight decay, weight elimination, smoothness

      • Assumptions about noise: error model or noise model

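Weight decay is the simplest such complexity assumption: add the sum of squared weights, scaled by a decay constant, to the in-sample error. A sketch (the name `lam` and the default value are illustrative):

```python
import numpy as np

def cost(y, y_hat, weights, lam=0.01):
    """In-sample squared error plus a weight-decay penalty that
    pushes the model toward less flexible (smaller-weight) surfaces."""
    error = np.sum((np.asarray(y) - np.asarray(y_hat)) ** 2)
    penalty = lam * np.sum(np.asarray(weights) ** 2)
    return float(error + penalty)
```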



    Step 5: Determining the parameters

    • Search with gradient descent: iterative

      • Vice to virtue: path important

      • Guide network through solution space

        • Hints

        • Weight pruning

        • Early stopping

        • Weight-elimination

        • Pseudo-data

        • Add noise

    • Alternative approaches:

      • Model to match the local noise level of the data

        • Local error bars

        • Gated experts architecture with adaptive variances

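Early stopping, one of the techniques listed above, can be sketched for a linear model trained by gradient descent (illustrative code; the patience rule and parameter names are assumptions):

```python
import numpy as np

def fit_early_stopping(X, y, X_val, y_val, lr=0.1, max_steps=2000, patience=20):
    """Gradient descent on in-sample squared error; stop when the
    out-of-sample (validation) error has not improved for `patience`
    steps, and return the weights that were best out-of-sample."""
    w = np.zeros(X.shape[1])
    best_w, best_val, stale = w.copy(), np.inf, 0
    for _ in range(max_steps):
        w -= lr * 2 * X.T @ (X @ w - y) / len(y)   # gradient step on MSE
        val = np.mean((X_val @ w - y_val) ** 2)    # monitor validation error
        if val < best_val - 1e-12:
            best_w, best_val, stale = w.copy(), val, 0
        else:
            stale += 1
            if stale >= patience:                  # early stopping
                break
    return best_w
```

The path through weight space matters here: stopping early keeps the network on a smoother part of that path, which is how the iterative search turns from vice to virtue.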

