
On Predictive Modeling for Claim Severity


Presentation Transcript


  1. On Predictive Modeling for Claim Severity Glenn Meyers ISO Innovative Analytics CARe Seminar June 6-7, 2005

  2. Problems with Experience Rating for Excess of Loss Reinsurance
  • Use submission claim severity data
    • Relevant, but
    • Not credible
    • Not developed
  • Use industry distributions
    • Credible, but
    • Not relevant (???)

  3. General Problems with Fitting Claim Severity Distributions
  • Parameter uncertainty: fitted parameters of the chosen model are estimates subject to sampling error.
  • Model uncertainty: we might choose the wrong model. There is no particular reason that the models we choose are appropriate.
  • Loss development: complete claim settlement data is not always available.

  4. Outline of Remainder of Talk
  • Quantifying parameter uncertainty
    • Likelihood ratio test
  • Incorporating model uncertainty
    • Use Bayesian estimation with likelihood functions
    • Uncertainty in excess layer loss estimates
  • Bayesian estimation with prior models based on data reported to a statistical agent
    • Reflects insurer heterogeneity
    • Develops losses

  5. How the Paper is Organized
  • Start with classical hypothesis testing.
    • Likelihood ratio test
  • Calculate a confidence region for parameters.
  • Calculate a confidence interval for a function of the parameters.
    • For example, the expected loss in a layer
  • Introduce a prior distribution of parameters.
  • Calculate the predictive mean for a function of the parameters (written out below).
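In symbols (a standard Bayesian statement, not quoted from the paper): for a function h(θ, α) of the parameters, such as the expected loss in a layer, the predictive mean is

$$ \mathrm{E}\left[h(\theta,\alpha) \mid \text{data}\right] \;=\; \int h(\theta,\alpha)\, p(\theta,\alpha \mid \text{data})\, d\theta\, d\alpha, $$

where p(θ, α | data) is the posterior density of the parameters.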

  6. The Likelihood Ratio Test

  7. The Likelihood Ratio Test
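These two slides carried the statement of the test itself. As a standard reconstruction consistent with the χ² values used on slides 9 and 10 (two free parameters, hence 2 degrees of freedom):

$$ \text{lnLR} \;=\; 2\left[\ln L(\hat{\theta},\hat{\alpha}) - \ln L(\theta_0,\alpha_0)\right] \;\sim\; \chi^2_2 \ \text{under } H_0, $$

where $(\hat{\theta},\hat{\alpha})$ is the maximum likelihood estimate and $(\theta_0,\alpha_0)$ the hypothesized pair; reject $H_0$ when lnLR exceeds the critical value (5.991 at the 5% level).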

  8. An Example – The Pareto Distribution
  • Simulate a random sample of size 1000 with α = 2.000, θ = 10,000
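A minimal sketch of this simulation (my own code, not the paper's), assuming the two-parameter Pareto with survival function S(x) = (θ/(x+θ))^α and using inverse-transform sampling:

```python
import numpy as np

# Sketch of slide 8: simulate 1,000 claims from a two-parameter Pareto,
# S(x) = (theta / (x + theta))**alpha, with alpha = 2.0 and theta = 10,000.
rng = np.random.default_rng(seed=0)
alpha, theta = 2.0, 10_000.0

u = rng.uniform(size=1_000)                       # u plays the role of S(x)
claims = theta * (u ** (-1.0 / alpha) - 1.0)      # invert the survival function

def pareto_loglik(theta, alpha, x):
    """Log-likelihood of the two-parameter Pareto: f(x) = a*t^a / (x+t)^(a+1)."""
    return np.sum(np.log(alpha) + alpha * np.log(theta)
                  - (alpha + 1.0) * np.log(x + theta))

print(pareto_loglik(theta, alpha, claims))        # comparable in scale to the slides' values
```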

  9. Hypothesis Testing Example
  • Significance level = 5%; χ² critical value = 5.991
  • H0: (θ, α) = (10000, 2)
  • H1: (θ, α) ≠ (10000, 2)
  • lnLR = 2(-10034.660 + 10035.623) = 1.926
  • Accept H0

  10. Hypothesis Testing Example
  • Significance level = 5%; χ² critical value = 5.991
  • H0: (θ, α) = (10000, 1.7)
  • H1: (θ, α) ≠ (10000, 1.7)
  • lnLR = 2(-10034.660 + 10045.975) = 22.631
  • Reject H0
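A sketch of the test mechanics (illustrative code, plugging in the log-likelihood values reported on the two slides):

```python
from scipy.stats import chi2

# Likelihood ratio test with 2 degrees of freedom (two free Pareto parameters).
critical = chi2.ppf(0.95, df=2)                   # 5.991 at the 5% level

def lr_test(loglik_mle, loglik_h0):
    lnlr = 2.0 * (loglik_mle - loglik_h0)
    return lnlr, ("Reject H0" if lnlr > critical else "Accept H0")

print(lr_test(-10034.660, -10035.623))            # slide 9: accept H0
print(lr_test(-10034.660, -10045.975))            # slide 10: 22.631, reject H0
```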

  11. Confidence Region
  • The X% confidence region corresponds to the hypothesis test at the (100-X)% significance level.
  • It is the set of all parameters (θ, α) that fail to reject the corresponding H0.
  • For the 95% confidence region:
    • (10000, 2.0) is in.
    • (10000, 1.7) is out.

  12. Confidence Region: Outer Ring 95%, Inner Ring 50%
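One way to trace these rings (an assumed approach, not necessarily the paper's): keep every (θ, α) pair whose likelihood ratio statistic against the grid maximum stays below the χ² critical value.

```python
import numpy as np
from scipy.stats import chi2

def confidence_region(x, thetas, alphas, level=0.95):
    """Boolean mask over a (theta, alpha) grid: True = inside the region."""
    def loglik(theta, alpha):
        return np.sum(np.log(alpha) + alpha * np.log(theta)
                      - (alpha + 1.0) * np.log(x + theta))
    ll = np.array([[loglik(t, a) for a in alphas] for t in thetas])
    lnlr = 2.0 * (ll.max() - ll)                  # grid maximum stands in for the MLE
    return lnlr <= chi2.ppf(level, df=2)

# e.g. confidence_region(claims, np.linspace(6e3, 2e4, 60), np.linspace(1.5, 2.7, 60))
```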

  13. Grouped Data
  • Data grouped into four intervals:
    • 562 under 5,000
    • 181 between 5,000 and 10,000
    • 134 between 10,000 and 20,000
    • 123 over 20,000
  • Same data as before, only less information is given.
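With grouped data the likelihood becomes multinomial, with cell probabilities read off the fitted Pareto CDF. A sketch (my code; the four cells are slide 13's intervals):

```python
import numpy as np

counts = np.array([562, 181, 134, 123])                    # slide 13's four intervals
bounds = np.array([0.0, 5_000.0, 10_000.0, 20_000.0, np.inf])

def grouped_loglik(theta, alpha):
    cdf = 1.0 - (theta / (bounds + theta)) ** alpha        # Pareto CDF at the boundaries
    return np.sum(counts * np.log(np.diff(cdf)))           # multinomial log-likelihood

print(grouped_loglik(10_000.0, 2.0))
```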

  14. Confidence Region for Grouped Data: Outer Ring 95%, Inner Ring 50%

  15. Confidence Region for Ungrouped Data: Outer Ring 95%, Inner Ring 50%

  16. Estimation with Model Uncertainty: COTOR Challenge – November 2004
  • COTOR published 250 claims.
  • The distributional form was not revealed to participants.
  • Participants were challenged to estimate the cost of a $5M x $5M layer.
  • Estimate a confidence interval for the pure premium.
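For reference, the expected loss per ground-up claim in a layer is the integral of the survival function across the layer. A sketch with an illustrative (not the challenge's) severity distribution:

```python
from scipy.integrate import quad

# Expected loss per claim in a layer of width `limit` attaching at `attachment`:
# E[min(X, a+l)] - E[min(X, a)] = integral of S(x) from a to a+l.
def layer_pure_premium(survival, attachment=5e6, limit=5e6):
    value, _ = quad(survival, attachment, attachment + limit)
    return value

# Illustrative Pareto severity (parameters chosen arbitrarily):
alpha, theta = 1.5, 50_000.0
print(layer_pure_premium(lambda x: (theta / (x + theta)) ** alpha))
```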

  17. You want to fit a distribution to 250 claims • Knee-jerk first reaction: plot a histogram.

  18. This will not do! Take logs, and fit some standard distributions.

  19. Still looks skewed. Take double logs, and fit some standard distributions.

  20. Still looks skewed. Take triple logs. • Still some skewness. • The lognormal and gamma fits look somewhat better (see the sketch below).
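A sketch of the transform-and-fit step as I read it (illustrative data standing in for the 250 COTOR claims, which are not reproduced here):

```python
import numpy as np
from scipy import stats

# Illustrative claims only: a triple-lognormal sample of size 250.
rng = np.random.default_rng(seed=1)
claims = np.exp(np.exp(np.exp(rng.normal(0.8, 0.05, size=250))))

def repeated_log(x, times):
    """Apply np.log `times` times ("double log", "triple log", ...)."""
    for _ in range(times):
        x = np.log(x)
    return x

transformed = repeated_log(claims, times=3)       # "triple log"
mu, sigma = stats.norm.fit(transformed)           # triple lognormal candidate
a, loc, scale = stats.gamma.fit(transformed)      # triple loggamma candidate
```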

  21. Candidate #1: Quadruple lognormal

  22. Candidate #2: Triple loggamma

  23. Candidate #3: Triple lognormal

  24. All three CDFs are within the confidence interval for the quadruple lognormal.

  25. Elements of Solution
  • Three candidate models:
    • Quadruple lognormal
    • Triple loggamma
    • Triple lognormal
  • Parameter uncertainty within each model
  • Construct a series of models, each consisting of:
    • One of the three candidate models
    • Parameters within a broad confidence interval for that model
  • 7803 possible models in all

  26. Steps in Solution
  • Calculate the likelihood (given the data) for each model.
  • Use Bayes' Theorem to calculate the posterior probability of each model.
    • Each model has equal prior probability.

  27. Steps in Solution
  • Calculate the layer pure premium of the 5 x 5 layer for each model.
  • The expected pure premium is the posterior-probability-weighted average of the model layer pure premiums.
  • The second moment of the pure premium is the posterior-probability-weighted average of the squared model layer pure premiums.
  • (A sketch of these two steps appears below.)
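A minimal sketch of these steps (assumed implementation; `logliks` and `layer_pps` are hypothetical arrays holding each candidate model's log-likelihood and layer pure premium):

```python
import numpy as np

def predictive_moments(logliks, layer_pps):
    """Posterior-weighted mean and standard deviation of the layer pure premium."""
    logliks = np.asarray(logliks, dtype=float)
    layer_pps = np.asarray(layer_pps, dtype=float)
    weights = np.exp(logliks - logliks.max())     # stabilized; equal priors cancel
    posterior = weights / weights.sum()           # Bayes' Theorem with equal priors
    mean = np.sum(posterior * layer_pps)
    second_moment = np.sum(posterior * layer_pps ** 2)
    return mean, np.sqrt(second_moment - mean ** 2)
```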

  28. CDF of Layer Pure Premium • The probability that the layer pure premium is ≤ x equals the sum of the posterior probabilities of the models whose layer pure premium is ≤ x (in symbols below).
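In symbols, with $PP_m$ denoting model m's layer pure premium:

$$ \Pr\{PP \le x\} \;=\; \sum_{m:\, PP_m \le x} \Pr\{\text{model } m \mid \text{data}\} $$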

  29. Numerical Results

  30. Histogram of Predictive Pure Premium

  31. Example with Insurance Data
  • Continue with Bayesian estimation.
  • Liability insurance claim severity data.
  • Prior distributions derived from models based on individual insurer data.
  • Prior models reflect the maturity of the claim data used in the estimation.

  32. Initial Insurer Models
  • Selected 20 insurers
    • Claim counts in the thousands
  • Fit a mixed exponential distribution to the data of each insurer (density written out below)
  • Initial fits had volatile tails
  • Truncation issues
    • Do small claims predict the likelihood of large claims?
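For reference, the mixed exponential density in its standard textbook form (the slide does not spell out the parametrization):

$$ f(x) \;=\; \sum_{i=1}^{k} w_i \, \frac{1}{\mu_i}\, e^{-x/\mu_i}, \qquad w_i \ge 0, \quad \sum_{i=1}^{k} w_i = 1 $$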

  33. Initial Insurer Models

  34. Low Truncation Point

  35. High Truncation Point

  36. Selections Made
  • Truncation point = $100,000
  • A family of CDFs that has the "correct" behavior
    • Admittedly the definition of "correct" is debatable, but
    • The choices are transparent!

  37. Selected Insurer Models

  38. Selected Insurer Models

  39. Each model consists of:
  1. The claim severity distribution for all claims settled within 1 year
  2. The claim severity distribution for all claims settled within 2 years
  3. The claim severity distribution for all claims settled within 3 years
  4. The ultimate claim severity distribution for all claims
  5. The ultimate limited average severity curve

  40. Three Sample Insurers: Small, Medium and Large
  • Each has three years of data.
  • Calculate likelihood functions:
    • Most recent year against #1 on the prior slide
    • 2nd most recent year against #2 on the prior slide
    • 3rd most recent year against #3 on the prior slide
  • Use Bayes' Theorem to calculate the posterior probability of each model.

  41. Formulas for Posterior Probabilities
  • Model m: cell probabilities $p_i^{(m)}$; number of claims observed in cell i is $n_i$
  • Likelihood: $L(m) = \prod_i \bigl(p_i^{(m)}\bigr)^{n_i}$
  • Using Bayes' Theorem (equal prior probabilities): $\Pr\{m \mid \text{data}\} = L(m) \big/ \sum_k L(k)$
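A minimal numeric sketch of these formulas (my code; `cell_probs_by_model` is a hypothetical list holding each model's cell probabilities at the relevant maturity):

```python
import numpy as np

def posterior_probabilities(cell_probs_by_model, counts):
    """Multinomial log-likelihood per model, then Bayes with equal priors."""
    counts = np.asarray(counts, dtype=float)
    logliks = np.array([np.sum(counts * np.log(np.asarray(p)))
                        for p in cell_probs_by_model])
    weights = np.exp(logliks - logliks.max())     # guard against underflow
    return weights / weights.sum()
```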

  42. Results (taken from the paper).

  43. Formulas for Ultimate Layer Pure Premium • Use item #5 on the model slide (slide 39), the ultimate limited average severity curve, to calculate the ultimate layer pure premium, as written out below.
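Written out (a standard identity; the symbols are mine): if $\mathrm{LAS}_m(x)$ is model m's ultimate limited average severity at limit x, a layer of width l attaching at a has, per claim,

$$ PP_m \;=\; \mathrm{LAS}_m(a+l) - \mathrm{LAS}_m(a), \qquad PP \;=\; \sum_m \Pr\{m \mid \text{data}\}\, PP_m. $$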

  44. Results
  • All insurers were simulated from the same population.
  • The posterior standard deviation decreases with insurer size.

  45. Possible Extensions
  • Obtain a model for individual insurers.
  • Obtain data for the insurer of interest.
  • Calculate the likelihood, Pr{data | model}, for each insurer's model.
  • Use Bayes' Theorem to calculate the posterior probability of each model.
  • Calculate the statistic of choice using the models and posterior probabilities.
    • e.g., loss reserves
