
### Toward a unified approach to fitting loss models

Jacques Rioux and Stuart Klugman, for presentation at the IAC, Feb. 9, 2004

Handout/slides

- E-mail me
- [email protected]

Overview

- What problem is being addressed?
- The general idea
- The specific ideas
- Models to consider
- Recording the data
- Representing the data
- Testing a model
- Selecting a model

The problem

- Too many models
- Two books – 26 distributions!
- Can mix or splice to get even more

- Data can be confusing
- Deductibles, limits

- Too many tests and plots
- Chi-square, K-S, A-D, p-p, q-q, D

The general idea

- Limited number of distributions
- Standard way to present data
- Retain flexibility on testing and selection

Distributions

- Should be
- Familiar
- Few
- Flexible

A few familiar distributions

- Exponential
- Only one parameter

- Gamma
- Two parameters, a mode if α > 1.

- Lognormal
- Two parameters, a mode

- Pareto
- Two parameters, a heavy right tail

Flexible

- Add flexibility by allowing mixtures
- That is, $F(x) = a_1 F_1(x) + \cdots + a_k F_k(x)$, where $a_1 + \cdots + a_k = 1$ and all $a_j > 0$.

- Some restrictions:
- Only the exponential can be used more than once.
- Cannot use both the gamma and lognormal.
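
As a minimal sketch of the idea, a mixture cdf is just a weighted average of component cdfs. The function names, the exponential and Pareto parameterizations (mean `theta`; Pareto cdf `1 - (theta / (x + theta))**alpha`), and the parameter values below are illustrative assumptions, not values from the paper.

```python
import math

def exponential_cdf(x, theta):
    """Exponential cdf with mean theta (assumed parameterization)."""
    return 1.0 - math.exp(-x / theta) if x > 0 else 0.0

def pareto_cdf(x, alpha, theta):
    """Two-parameter Pareto cdf: 1 - (theta / (x + theta))**alpha (assumed form)."""
    return 1.0 - (theta / (x + theta)) ** alpha if x > 0 else 0.0

def mixture_cdf(x, weights, cdfs):
    """Weighted average of component cdfs; weights must be positive and sum to 1."""
    if any(w <= 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be positive and sum to 1")
    return sum(w * cdf(x) for w, cdf in zip(weights, cdfs))

# Hypothetical two-component mixture: exponential body, Pareto tail.
components = [
    lambda x: exponential_cdf(x, 500.0),
    lambda x: pareto_cdf(x, 2.0, 1000.0),
]
weights = [0.6, 0.4]
p = mixture_cdf(1000.0, weights, components)
```

The same function accepts any number of components, so a mixture of several exponentials needs only a longer list.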

Why mixtures?

- Allows different shape at beginning and end (e.g. mode from lognormal, tail from Pareto).
- By using several exponentials, one can have almost any tail weight (see Keatinge).

Estimating parameters

- Use only maximum likelihood
- Asymptotically optimal
- Can be applied in all settings, regardless of the nature of the data
- Likelihood value can be used to compare different models
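
To illustrate why maximum likelihood works regardless of the nature of the data, here is a sketch for the exponential distribution with deductibles (left truncation) and limits (right censoring). The record layout `(x, d, censored)` is an assumption made for this example: `x` is the loss amount, or the censoring point when `censored` is true, and `d` is the deductible. An uncensored loss contributes $f(x)/[1 - F(d)]$ to the likelihood and a censored one contributes $[1 - F(x)]/[1 - F(d)]$.

```python
import math

def exp_loglik(theta, records):
    """Exponential loglikelihood under left truncation and right censoring."""
    ll = 0.0
    for x, d, censored in records:
        if not censored:
            ll -= math.log(theta)      # density term, present only for exact losses
        ll -= (x - d) / theta          # survival from d to x, present for all records
    return ll

def exp_mle(records):
    """Closed-form maximizer: total excess over deductibles / number of exact losses."""
    exposure = sum(x - d for x, d, _ in records)
    n_uncensored = sum(1 for _, _, censored in records if not censored)
    return exposure / n_uncensored

# Made-up records, for illustration only.
records = [
    (350.0, 100.0, False),
    (1100.0, 100.0, True),
    (900.0, 250.0, False),
    (3250.0, 250.0, True),
    (700.0, 500.0, False),
]
theta_hat = exp_mle(records)
```

For distributions without a closed-form solution, the same loglikelihood function would be handed to a numerical optimizer.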

Representing the data

- Why do we care?
- Graphical tests require a graph of the empirical density or distribution function.
- Hypothesis tests require the functions themselves.

What is the issue?

- None, if
- All observations are discrete or grouped
- No truncation or censoring

- But if there is truncation or censoring,
- For discrete data the Kaplan-Meier product-limit estimator provides the empirical distribution function (and is the nonparametric MLE as well).
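
A minimal sketch of the product-limit estimator follows, assuming records of the form `(d, x, censored)` with truncation point `d` and observed value (or censoring point) `x`. At each distinct uncensored value `y`, the survival estimate is multiplied by `1 - s / r`, where `s` counts the uncensored observations equal to `y` and `r` counts the records at risk, taken here as those with `d < y <= x`. The layout and names are illustrative, and the result is conditional on exceeding the smallest truncation point.

```python
def product_limit(records):
    """Return [(y, S(y))] at each distinct uncensored value y."""
    event_points = sorted({x for d, x, censored in records if not censored})
    survival = 1.0
    estimate = []
    for y in event_points:
        s = sum(1 for d, x, censored in records if not censored and x == y)
        r = sum(1 for d, x, censored in records if d < y <= x)  # risk set size
        survival *= 1.0 - s / r
        estimate.append((y, survival))
    return estimate

# Made-up data: truncation at 0 or 1, two censored records.
records = [
    (0, 2, False),
    (0, 3, True),
    (0, 4, False),
    (1, 4, False),
    (1, 5, True),
]
km = product_limit(records)
empirical_cdf = [(y, 1.0 - s) for y, s in km]
```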

Issue – grouped data

- For grouped data,
- If completely grouped, the histogram represents the pdf, the ogive the cdf.
- If some observations are grouped and some are not, or there are multiple deductibles and limits, our suggestion is to replace the observations in each interval with that many equally spaced points.
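
The replacement step could be sketched as below. The slide does not say exactly where the equally spaced points sit, so placing them at the midpoints of equal-width subintervals is an assumption of this example, as is the helper name `spread_points`.

```python
def spread_points(lower, upper, count):
    """Replace `count` grouped observations in (lower, upper] with equally spaced points.

    Assumption: points are the midpoints of `count` equal-width subintervals.
    """
    width = (upper - lower) / count
    return [lower + (k + 0.5) * width for k in range(count)]

# Four observations known only to lie between 1,000 and 2,000.
points = spread_points(1000.0, 2000.0, 4)
```

The resulting points can then be treated like individual observations when building the empirical distribution function.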

Review

- Given a data set, we have the following:
- A way to represent the data.
- A limited set of models to consider.
- Parameter estimates for each model.

- The remaining tasks are:
- Decide which models are acceptable.
- Decide which model to use.

Example

- The paper has two examples; we will look only at the second one.
- Data are individual payments, but the policies that produced them had different deductibles (100, 250, 500) and different maximum payments (1,000, 3,000, 5,000).
- There are 100 observations.

Distribution function plot

- Plot the empirical and model cdfs together. Note, because in this example the smallest deductible is 100, the empirical cdf begins there.
- To be comparable, the model cdf is calculated as $F^*(x) = \dfrac{F(x) - F(100)}{1 - F(100)}$ for $x \ge 100$.
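
One way to put a model cdf on the same footing as an empirical cdf that starts at the smallest deductible is to condition the model on exceeding that point. The sketch below assumes this conditional form and uses an exponential model with a made-up mean purely for illustration.

```python
import math

def exp_cdf(x, theta):
    """Exponential cdf with mean theta."""
    return 1.0 - math.exp(-x / theta) if x > 0 else 0.0

def truncated_cdf(x, cdf, d):
    """Model cdf conditional on the loss exceeding the truncation point d."""
    if x <= d:
        return 0.0
    return (cdf(x) - cdf(d)) / (1.0 - cdf(d))

model = lambda x: exp_cdf(x, 1000.0)   # hypothetical fitted model
value = truncated_cdf(600.0, model, 100.0)
```

For the exponential, the memoryless property means `value` equals the unconditional cdf evaluated at 500, which is a convenient sanity check.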

Example model

- All plots and tests that follow are for a mixture of a lognormal and exponential distribution. The parameters are

Confidence bands

- It is possible to create 95% confidence bands. That is, we are 95% confident that the true distribution is completely within these bands.
- Formulas adapted from Klein and Moeschberger with a modification for multiple truncation points (their formula allows only multiple censoring points).

Other CDF pictures

- Any function of the cdf, such as the limited expected value, could be plotted.
- The only one shown here is the difference plot – magnify the previous plot by plotting the difference of the two distribution functions.

Histogram plot

- Plot a histogram of the data against the density function of the model.
- For data that were not grouped, can use the empirical cdf to get cell probabilities.

Hypothesis tests

- Null: the model fits
- Alternative: it doesn’t
- Three tests
- Kolmogorov-Smirnov
- Anderson-Darling
- Chi-square

Kolmogorov-Smirnov

- Test statistic is maximum difference between the empirical and model cdfs. Each difference is multiplied by a scaling factor related to the sample size at that point.
- Critical values are way off when parameters are estimated from the data.
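
For orientation, the classical K-S statistic for complete individual data can be sketched as follows. This is a simplification: it omits the sample-size scaling factor and the truncation and censoring adjustments described in the talk. Because the empirical cdf jumps at each data point, the model cdf is compared with the empirical value on both sides of the jump.

```python
def ks_statistic(data, cdf):
    """Maximum absolute gap between the empirical cdf and a model cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # empirical cdf is i/n just before x and (i + 1)/n at x
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d

# Toy check against a uniform(0, 1) model.
stat = ks_statistic([0.1, 0.4, 0.7], lambda x: x)
```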

Anderson-Darling

- Test statistic looks complex: $A^2 = n \int \dfrac{[F_e(x) - F_m(x)]^2}{F_m(x)[1 - F_m(x)]} f_m(x)\,dx$, where e is empirical and m is model.
- The paper shows how to turn this into a sum.
- More emphasis on fit in tails than for K-S test.
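
For complete individual data, the classical Anderson-Darling integral reduces to a well-known sum over the ordered model probabilities. The sketch below uses that textbook form, which is not necessarily the version derived in the paper for truncated and censored data.

```python
import math

def ad_statistic(data, cdf):
    """Classical A-D statistic: -n - (1/n) * sum (2i - 1)[ln u_(i) + ln(1 - u_(n+1-i))]."""
    u = sorted(cdf(x) for x in data)
    n = len(u)
    total = sum(
        (2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n

# Toy check against a uniform(0, 1) model.
stat = ad_statistic([0.25, 0.5, 0.75], lambda x: x)
```

The logarithms of values near 0 and 1 are what give the tails their extra weight relative to K-S.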

Chi-square test

- You have seen this one before.
- It is the only one with an adjustment for estimating parameters.
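
The adjustment mentioned above is the usual degrees-of-freedom correction: cells minus one, minus the number of estimated parameters. A small sketch with made-up counts (not the paper's data):

```python
def chi_square(observed, expected, n_estimated_params):
    """Return the chi-square statistic and its degrees of freedom."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - n_estimated_params
    return stat, df

# Hypothetical cell counts for 100 observations, one estimated parameter.
observed = [30, 40, 20, 10]
expected = [25.0, 40.0, 25.0, 10.0]
stat, df = chi_square(observed, expected, n_estimated_params=1)
```

The p-value then comes from the chi-square distribution with `df` degrees of freedom.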

Results

- K-S: 0.5829
- A-D: 0.2570
- Chi-square p-value of 0.5608
- The model is clearly acceptable. A simulation study is needed to get p-values for the K-S and A-D tests; simulation indicates that those p-values are over 0.9.
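
A simulation of this kind can be sketched as a parametric bootstrap: simulate from the fitted model, re-estimate the parameters on each simulated sample, and see how often the simulated statistic is at least as large as the observed one. The example assumes complete data and an exponential model for brevity; the function names, seed, and data are illustrative, not the authors' procedure.

```python
import math
import random

def exp_cdf(x, theta):
    return 1.0 - math.exp(-x / theta)

def ks_statistic(data, theta):
    """Classical K-S statistic against an exponential model with mean theta."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = exp_cdf(x, theta)
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d

def simulated_p_value(data, n_sims=500, seed=2004):
    rng = random.Random(seed)
    n = len(data)
    theta_hat = sum(data) / n                  # exponential MLE for complete data
    observed = ks_statistic(data, theta_hat)
    exceed = 0
    for _ in range(n_sims):
        sample = [rng.expovariate(1.0 / theta_hat) for _ in range(n)]
        theta_sim = sum(sample) / n            # re-estimate, as for the real data
        if ks_statistic(sample, theta_sim) >= observed:
            exceed += 1
    return observed, exceed / n_sims

# Made-up data set of 100 exponential losses.
data_rng = random.Random(1)
data = [data_rng.expovariate(1.0 / 1000.0) for _ in range(100)]
ks, p_value = simulated_p_value(data)
```

Re-estimating inside the loop is what corrects for the fact that the parameters were fitted to the same data being tested.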

Comparing models

- Good picture
- Better test numbers
- A likelihood criterion such as the Schwarz Bayesian criterion (SBC). The SBC is the loglikelihood minus (r/2)ln(n), where r is the number of parameters and n is the sample size.
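
The penalty is easy to see numerically. In the sketch below the loglikelihood values are invented for illustration and are not the paper's results; the parameter counts assume one parameter per exponential, two for the lognormal or gamma, and one free weight for each component beyond the first.

```python
import math

def sbc(loglik, r, n):
    """Schwarz Bayesian criterion: loglikelihood minus (r / 2) * ln(n)."""
    return loglik - 0.5 * r * math.log(n)

n = 100
# (hypothetical loglikelihood, number of parameters)
models = {
    "exponential": (-628.0, 1),
    "lognormal/exp": (-624.5, 4),
    "gamma/exp/exp": (-623.9, 6),
}
scores = {name: sbc(ll, r, n) for name, (ll, r) in models.items()}
best_by_loglik = max(models, key=lambda name: models[name][0])
best_by_sbc = max(scores, key=scores.get)
```

With these made-up numbers the largest model has the best loglikelihood while the one-parameter model has the best SBC, which is the tension between the two criteria.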

Which is the winner?

- Referee A – loglikelihood rules – pick gamma/exp/exp mixture
- This is a world of one big model and the best is the best, simplicity is never an issue.

- Referee B – SBC rules – pick exponential
- Parsimony is most important, pay a penalty for extra parameters.

- Me – lognormal/exp. Great pictures, better numbers than exponential, but simpler than three component mixture.

Can this be automated?

- We are working on software
- Test version can be downloaded at www.cbpa.drake.edu/mixfit.
- MLEs are good. Pictures and test statistics are not quite right.
- May crash.
- Here is a quick demo.
