Toward a unified approach to fitting loss models

Jacques Rioux and Stuart Klugman, for presentation at the IAC, Feb. 9, 2004


Overview

  • What problem is being addressed?

  • The general idea

  • The specific ideas

    • Models to consider

    • Recording the data

    • Representing the data

    • Testing a model

    • Selecting a model


The problem

  • Too many models

    • Two books – 26 distributions!

    • Can mix or splice to get even more

  • Data can be confusing

    • Deductibles, limits

  • Too many tests and plots

    • Chi-square, K-S, A-D, p-p, q-q, D


The general idea

  • Limited number of distributions

  • Standard way to present data

  • Retain flexibility on testing and selection


Distributions

  • Should be

    • Familiar

    • Few

    • Flexible


A few familiar distributions

  • Exponential

    • Only one parameter

  • Gamma

    • Two parameters, a mode if α > 1.

  • Lognormal

    • Two parameters, a mode

  • Pareto

    • Two parameters, a heavy right tail


Flexible

  • Add by allowing mixtures

  • That is,

    $F(x) = a_1 F_1(x) + a_2 F_2(x) + \cdots + a_k F_k(x)$

    where $a_1 + a_2 + \cdots + a_k = 1$

    and all $a_j > 0$.

  • Some restrictions:

    • Only the exponential can be used more than once.

    • Cannot use both the gamma and lognormal.
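A mixture cdf of this kind can be sketched in a few lines. This is only an illustration: the function names, the parameterizations (exponential with mean theta, Pareto with cdf 1 - (theta/(x + theta))^alpha) and the parameter values are assumptions, not taken from the slides.

```python
import math

def exponential_cdf(x, theta):
    """Exponential cdf with mean theta (assumed parameterization)."""
    return 1.0 - math.exp(-x / theta) if x > 0 else 0.0

def lognormal_cdf(x, mu, sigma):
    """Lognormal cdf, written with the error function."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def pareto_cdf(x, alpha, theta):
    """Two-parameter Pareto cdf: 1 - (theta / (x + theta)) ** alpha (assumed form)."""
    return 1.0 - (theta / (x + theta)) ** alpha if x > 0 else 0.0

def mixture_cdf(x, weights, cdfs):
    """Weighted sum of component cdfs; weights must be positive and sum to 1."""
    if any(w <= 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be positive and sum to 1")
    return sum(w * cdf(x) for w, cdf in zip(weights, cdfs))

# Hypothetical lognormal/Pareto mixture (illustrative parameter values only).
components = [
    lambda x: lognormal_cdf(x, 6.0, 1.0),
    lambda x: pareto_cdf(x, 2.0, 1000.0),
]
value_at_1000 = mixture_cdf(1000.0, [0.7, 0.3], components)
```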


Why mixtures?

  • Allows different shape at beginning and end (e.g. mode from lognormal, tail from Pareto).

  • By using several exponentials, one can get almost any tail weight (see Keatinge).


Estimating parameters

  • Use only maximum likelihood

    • Asymptotically optimal

    • Can be applied in all settings, regardless of the nature of the data

    • Likelihood value can be used to compare different models
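To illustrate how maximum likelihood copes with deductibles and limits, here is a minimal sketch for a single exponential model. It assumes each record is (deductible d, loss amount x, censored flag), where an uncensored loss contributes f(x)/S(d) to the likelihood and a loss censored at x contributes S(x)/S(d). The record layout and the data values are made up for illustration.

```python
import math

def exp_loglik(theta, records):
    """Exponential loglikelihood with left truncation at d and right censoring at x."""
    total = 0.0
    for d, x, censored in records:
        log_surv_d = -d / theta                      # ln S(d)
        if censored:
            total += -x / theta - log_surv_d         # ln S(x) - ln S(d)
        else:
            total += -math.log(theta) - x / theta - log_surv_d  # ln f(x) - ln S(d)
    return total

def exp_mle(records):
    """Closed-form exponential mle: total (x - d) divided by the number of uncensored losses."""
    exposure = sum(x - d for d, x, _ in records)
    uncensored = sum(1 for _, _, censored in records if not censored)
    return exposure / uncensored

# Hypothetical records: (deductible, loss amount, censored at the limit?)
records = [
    (100, 350, False),
    (100, 1100, True),
    (250, 900, False),
    (500, 5500, True),
    (250, 400, False),
]
theta_hat = exp_mle(records)
```

For other distributions, and for mixtures, the same loglikelihood structure applies, but the maximum generally has to be found numerically.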


Representing the data

  • Why do we care?

    • Graphical tests require a graph of the empirical density or distribution function.

    • Hypothesis tests require the functions themselves.


What is the issue?

  • None, if:

    • All observations are discrete or grouped

    • No truncation or censoring

  • But if there is truncation or censoring:

    • For discrete data the Kaplan-Meier product-limit estimator provides the empirical distribution function (and is the nonparametric MLE as well).
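A minimal sketch of the product-limit estimator with left truncation (deductibles) and right censoring (limits) follows. The record layout (d, x, censored) and the data are assumptions for illustration. The risk set at an observed value y is taken to be every record with d < y <= x.

```python
def kaplan_meier(records):
    """Return [(y, S(y))] at each distinct uncensored value y.

    records: iterable of (d, x, censored), where d is the truncation point and
    x is the observed value (the censoring point when censored is True).
    """
    records = list(records)
    event_values = sorted({x for _, x, censored in records if not censored})
    survival = 1.0
    steps = []
    for y in event_values:
        # Number of uncensored observations equal to y.
        s = sum(1 for _, x, censored in records if not censored and x == y)
        # Risk set: entered observation before y and not yet removed.
        r = sum(1 for d, x, _ in records if d < y <= x)
        survival *= 1.0 - s / r
        steps.append((y, survival))
    return steps

# Hypothetical data: one record is truncated at 2, one is censored at 3.
steps = kaplan_meier([(0, 1, False), (2, 3, False), (0, 3, True), (0, 4, False)])
```

With truncation, the estimate is conditional on exceeding the smallest truncation point, so the empirical distribution function is 1 minus the survival values returned here, starting from that point.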


Issue – grouped data

  • For grouped data,

    • If completely grouped, the histogram represents the pdf, the ogive the cdf.

    • If some data are grouped and some are not, or there are multiple deductibles or limits, our suggestion is to replace the observations in each interval with that many equally spaced points.
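Both representations of grouped data can be sketched briefly. The ogive below interpolates linearly between group boundaries. The exact spacing rule for the replacement points is not given in the slides, so placing each point at the midpoint of an equal-width subinterval is an assumption, as are the function names and the data.

```python
def ogive(x, boundaries, counts):
    """Empirical cdf for grouped data: linear interpolation between boundaries."""
    total = sum(counts)
    cumulative = 0
    for lower, upper, count in zip(boundaries, boundaries[1:], counts):
        if x >= upper:
            cumulative += count
        elif x > lower:
            return (cumulative + count * (x - lower) / (upper - lower)) / total
        else:
            break
    return cumulative / total

def spread_points(lower, upper, count):
    """Replace `count` grouped observations in (lower, upper) by equally spaced points.

    Assumed rule: midpoints of `count` equal-width subintervals.
    """
    width = (upper - lower) / count
    return [lower + width * (i + 0.5) for i in range(count)]

# Hypothetical grouped data: 20, 50 and 30 observations in three intervals.
boundaries = [0, 100, 500, 1000]
counts = [20, 50, 30]
cdf_at_300 = ogive(300, boundaries, counts)   # (20 + 50 * 0.5) / 100
points = spread_points(0, 100, 4)             # [12.5, 37.5, 62.5, 87.5]
```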


Review

  • Given a data set, we have the following:

    • A way to represent the data.

    • A limited set of models to consider.

    • Parameter estimates for each model.

  • The remaining tasks are:

    • Decide which models are acceptable.

    • Decide which model to use.


Example

  • The paper has two examples; we will look only at the second one.

  • Data are individual payments, but the policies that produced them had different deductibles (100, 250, 500) and different maximum payments (1,000, 3,000, 5,000).

  • There are 100 observations.



Distribution function plot

  • Plot the empirical and model cdfs together. Note, because in this example the smallest deductible is 100, the empirical cdf begins there.

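  • To be comparable, the model cdf is calculated as $F^*(x) = \dfrac{F(x) - F(100)}{1 - F(100)}$ for $x \ge 100$, that is, conditional on the loss exceeding the smallest deductible.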
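As a sketch, the model cdf can be put on the same footing as the empirical cdf by conditioning on the loss exceeding the smallest deductible. The helper name and the exponential check below are illustrative assumptions.

```python
import math

def truncated_cdf(model_cdf, x, d=100.0):
    """Model cdf conditional on the loss exceeding d: (F(x) - F(d)) / (1 - F(d))."""
    if x <= d:
        return 0.0
    f_d = model_cdf(d)
    return (model_cdf(x) - f_d) / (1.0 - f_d)

# Check with a hypothetical exponential model (mean 1000): by the memoryless
# property the conditional cdf at x equals the unconditional cdf at x - d.
exp_cdf = lambda x: 1.0 - math.exp(-x / 1000.0)
conditional_at_600 = truncated_cdf(exp_cdf, 600.0)
```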


Example model

  • All plots and tests that follow are for a mixture of a lognormal and exponential distribution. The parameters are



Confidence bands

  • It is possible to create 95% confidence bands. That is, we are 95% confident that the true distribution is completely within these bands.

  • Formulas adapted from Klein and Moeschberger with a modification for multiple truncation points (their formula allows only multiple censoring points).



Other CDF pictures

  • Any function of the cdf, such as the limited expected value, could be plotted.

  • The only one shown here is the difference plot – magnify the previous plot by plotting the difference of the two distribution functions.



Histogram plot

  • Plot a histogram of the data against the density function of the model.

  • For data that were not grouped, can use the empirical cdf to get cell probabilities.
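One way to get histogram heights from ungrouped data is to difference the empirical cdf at the cell boundaries and divide by the cell width. This sketch assumes complete data (no truncation or censoring); the boundaries and data values are made up.

```python
import bisect

def make_ecdf(data):
    """Return the empirical cdf of complete individual data as a function."""
    ordered = sorted(data)
    n = len(ordered)
    return lambda x: bisect.bisect_right(ordered, x) / n

def histogram_heights(empirical_cdf, boundaries):
    """Cell probability F_n(upper) - F_n(lower), divided by the cell width."""
    heights = []
    for lower, upper in zip(boundaries, boundaries[1:]):
        probability = empirical_cdf(upper) - empirical_cdf(lower)
        heights.append(probability / (upper - lower))
    return heights

# Hypothetical data and cell boundaries.
ecdf = make_ecdf([120, 180, 260, 400, 900])
cells = [100, 200, 500, 1000]
heights = histogram_heights(ecdf, cells)
```

The heights are on the density scale, so the histogram can be drawn directly against the model pdf.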



Hypothesis tests

  • Null hypothesis: the model fits

  • Alternative hypothesis: it doesn’t

  • Three tests

    • Kolmogorov-Smirnov

    • Anderson-Darling

    • Chi-square


Kolmogorov-Smirnov

  • Test statistic is maximum difference between the empirical and model cdfs. Each difference is multiplied by a scaling factor related to the sample size at that point.

  • Critical values are way off when parameters are estimated from the data.
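For orientation, here is a sketch of the classical, unscaled K-S statistic for complete individual data. It deliberately omits the sample-size scaling factor and the truncation and censoring adjustments described above, so it is a simplification rather than the statistic used in the paper.

```python
def ks_statistic(data, model_cdf):
    """Largest gap between the empirical step function and the model cdf.

    Both sides of each jump of the empirical cdf are checked.
    """
    ordered = sorted(data)
    n = len(ordered)
    largest = 0.0
    for i, x in enumerate(ordered):
        model = model_cdf(x)
        largest = max(largest, abs(i / n - model), abs((i + 1) / n - model))
    return largest

# Hypothetical check against a uniform(0, 1) model.
d_stat = ks_statistic([0.1, 0.5, 0.9], lambda x: x)
```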


Anderson-Darling

  • Test statistic looks complex:

  • where e is empirical and m is model.

  • The paper shows how to turn this into a sum.

  • More emphasis on fit in tails than for K-S test.
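As an illustration of turning the integral into a sum, the textbook Anderson-Darling statistic for complete individual data, $A^2 = n \int \frac{[F_e(x) - F_m(x)]^2}{F_m(x)[1 - F_m(x)]} f_m(x)\,dx$, reduces to the sum computed below. This is the classical form only; the version in the paper, which handles truncation and censoring, may differ.

```python
import math

def anderson_darling(data, model_cdf):
    """Classical A-D statistic for complete data, using the standard sum formula:

    A^2 = -n - (1/n) * sum_{i=1..n} (2i - 1) * [ln F(x_(i)) + ln(1 - F(x_(n+1-i)))]
    """
    ordered = sorted(data)
    n = len(ordered)
    total = 0.0
    for i in range(1, n + 1):
        low = model_cdf(ordered[i - 1])     # F at the i-th smallest value
        high = model_cdf(ordered[n - i])    # F at the i-th largest value
        total += (2 * i - 1) * (math.log(low) + math.log(1.0 - high))
    return -n - total / n

# Hypothetical check against a uniform(0, 1) model.
a_squared = anderson_darling([0.25, 0.5, 0.75], lambda x: x)
```

The division by F_m(1 - F_m) is what weights the tails more heavily than the K-S test does.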


Chi-square test

  • You have seen this one before.

  • It is the only one with an adjustment for estimating parameters.
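The adjustment for estimated parameters is a reduction in the degrees of freedom: with k cells and r estimated parameters, the usual rule is k - 1 - r. A minimal sketch, with made-up cell counts:

```python
def chi_square(observed, expected, n_params):
    """Return the chi-square statistic and its degrees of freedom.

    Degrees of freedom: number of cells, minus 1, minus estimated parameters.
    """
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    degrees_of_freedom = len(observed) - 1 - n_params
    return statistic, degrees_of_freedom

# Hypothetical counts for three cells and a one-parameter model.
statistic, df = chi_square([30, 50, 20], [25.0, 50.0, 25.0], n_params=1)
```

The p-value then comes from the chi-square distribution with that many degrees of freedom.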


Results

  • K-S: 0.5829

  • A-D: 0.2570

  • Chi-square p-value of 0.5608

  • The model is clearly acceptable. A simulation study is needed to get p-values for the K-S and A-D tests; simulation indicates that these p-values are over 0.9.


Comparing models

  • Good picture

  • Better test numbers

  • Likelihood criterion such as the Schwarz Bayesian criterion (SBC). The SBC is the loglikelihood minus (r/2)ln(n), where r is the number of parameters and n is the sample size.
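The SBC is easy to compute once each model has been fitted. In this sketch the loglikelihood values are hypothetical, chosen only to show how the penalty can reverse a ranking based on loglikelihood alone; n = 100 matches the example data set.

```python
import math

def sbc(loglikelihood, n_params, sample_size):
    """Schwarz Bayesian criterion: loglikelihood - (r / 2) * ln(n). Larger is better."""
    return loglikelihood - 0.5 * n_params * math.log(sample_size)

# Hypothetical fits to 100 observations.
simple_model = sbc(-700.0, n_params=1, sample_size=100)   # one-parameter model
rich_model = sbc(-698.0, n_params=5, sample_size=100)     # five-parameter mixture
```

Here the richer model has the higher loglikelihood, but the simpler model has the higher SBC.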



Which is the winner?

  • Referee A – loglikelihood rules – pick gamma/exp/exp mixture

    • This is a world of one big model and the best is the best; simplicity is never an issue.

  • Referee B – SBC rules – pick exponential

    • Parsimony is most important, pay a penalty for extra parameters.

  • Me – lognormal/exp. Great pictures, better numbers than the exponential, but simpler than the three-component mixture.


Can this be automated?

  • We are working on software

  • Test version can be downloaded at www.cbpa.drake.edu/mixfit.

  • MLEs are good. Pictures and test statistics are not quite right.

  • May crash.

  • Here is a quick demo.

