1. Does Credit Score Really Help Explain Insurance Losses?
Cheng-Sheng Peter Wu, FCAS, ASA, MAAA
Jim Guszcza, ACAS, MAAA, Ph.D.
2. Themes
The History
What Does the Question Mean?
Simpson’s Paradox - Need for Multivariate Analysis
What Has Been Done So Far?
Our Large-Scale Data Mining Experience
Going Beyond Credit
Conclusions
3. The History
Pricing/Class Plans
Few factors before World War II
Explosion of class plan factors after the War
Current class plans (Auto) – territory, driver, vehicle, loss and violation, others, tiers/company, etc.
Actuarial techniques – Minimum Bias & GLM
4. The History
Credit
First important factor identified over the past 2 decades
Composite multivariate score vs. raw credit information
Introduced in the late 1980s and early 1990s
Viewed at first as a “secret weapon”
Currently almost everyone is using it
Industry scores vs. proprietary scores
Quiet, confidential, controversial, black-box, etc.
5. What Does the Question Mean?
Can Credit Score Really “Explain” Insurance Losses?
“X explains Y”
Weaker than claiming that X causes Y
Stronger than merely reporting that X is correlated with Y
6. What Does the Question Mean?
Working Definition
We say that “X helps explain Y” if:
X is correlated with Y
The correlation does not go away when other available, measurable information is introduced
7. What Does the Question Mean?
Intuition Behind the Definition
It might be okay for X to be a proxy for a “true” cause of Y
Testosterone level might be a true cause of auto losses, but it is not available as a rating variable
Age/Gender is a reasonable proxy
It might not be okay for X to be a proxy for other available predictive information
8. What Does the Question Mean?
Applying the Definition
Suppose we see that credit score plays an important role in a multivariate regression equation that predicts loss ratio
Then it is fair to say that credit score helps explain insurance losses
A multivariate study is needed
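The working definition can be sketched with a small numerical example. This is an illustration with invented numbers, not data or code from the studies discussed here. The names `credit_band`, `other_factor`, and `fit_two_predictors` are hypothetical, and `credit_band` is assumed to be coded so that a higher value means worse credit. In the first case X is only a proxy for the other variable, so its coefficient vanishes once that variable is included. In the second case the coefficient survives.

```python
def centered_sum(a, b):
    """Sum of cross-products of a and b about their means."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


def univariate_slope(x, y):
    """Least-squares slope of y on x alone."""
    return centered_sum(x, y) / centered_sum(x, x)


def fit_two_predictors(x, z, y):
    """Least-squares coefficients (b_x, b_z) for y ~ intercept + x + z."""
    sxx, szz, sxz = centered_sum(x, x), centered_sum(z, z), centered_sum(x, z)
    sxy, szy = centered_sum(x, y), centered_sum(z, y)
    det = sxx * szz - sxz ** 2  # zero only if x and z are perfectly collinear
    b_x = (szz * sxy - sxz * szy) / det
    b_z = (sxx * szy - sxz * sxy) / det
    return b_x, b_z


# Invented data: two correlated (but not collinear) predictors.
credit_band = [1, 2, 3, 4, 5, 6]    # hypothetical; higher = worse credit
other_factor = [1, 1, 2, 2, 3, 4]   # hypothetical existing rating variable

# Case 1: losses depend only on other_factor; credit_band is a mere proxy.
proxy_only = [2 * o for o in other_factor]
# Case 2: losses depend on both variables.
helps_explain = [2 * o + c for o, c in zip(other_factor, credit_band)]

for label, y in [("proxy only", proxy_only), ("helps explain", helps_explain)]:
    alone = univariate_slope(credit_band, y)
    b_credit, b_other = fit_two_predictors(credit_band, other_factor, y)
    print(f"{label}: univariate slope = {alone:.2f}, "
          f"multivariate coefficient = {b_credit:.2f}")
```

In both cases the univariate slope is clearly non-zero, so a univariate study cannot tell the two situations apart. Only the multivariate fit does.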
9. Simpson’s Paradox – Need for Multivariate Analysis
Statistics can lie
Illustrates how a univariate association can lead to a spurious conclusion
The “true” explanatory factor is masked by the spurious correlation
Famous example: 1973 Berkeley admissions data
10. Simpson’s Paradox – Need for Multivariate Analysis
The Berkeley Example (stylized)
2200 people applied for admission
1100 men; 1100 women
210 men, 120 women were accepted.
Clear-cut case of gender discrimination…
… or is it?
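A minimal sketch of how the aggregate figures can mislead. The department-level split below is hypothetical: it is chosen only so that the totals match the stylized numbers above (1100 applicants per gender, 210 men and 120 women accepted), and it is not the actual Berkeley data. The department names and the helper `acceptance_rate` are likewise assumptions for illustration.

```python
# (department, gender) -> (applied, accepted); hypothetical breakdown
applications = {
    ("Dept A", "men"): (1000, 200),
    ("Dept A", "women"): (100, 20),
    ("Dept B", "men"): (100, 10),
    ("Dept B", "women"): (1000, 100),
}


def acceptance_rate(rows):
    """Accepted / applied over a list of (applied, accepted) pairs."""
    applied = sum(a for a, _ in rows)
    accepted = sum(k for _, k in rows)
    return accepted / applied


# Univariate view: acceptance rate by gender only.
overall = {
    gender: acceptance_rate(
        [v for (_, g), v in applications.items() if g == gender]
    )
    for gender in ("men", "women")
}

# Multivariate view: acceptance rate by gender within each department.
by_dept = {key: acceptance_rate([value]) for key, value in applications.items()}

for gender, rate in overall.items():
    print(f"overall {gender}: {rate:.1%}")
for (dept, gender), rate in by_dept.items():
    print(f"{dept} {gender}: {rate:.1%}")
```

Under this assumed split, men and women are accepted at the same rate within each department. The aggregate gap arises only because women apply mostly to the department with the lower acceptance rate, so department, not gender, is the explanatory factor.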
11. Simpson’s Paradox – Need for Multivariate Analysis
12. Simpson’s Paradox – Need for Multivariate Analysis
13. Simpson’s Paradox – Need for Multivariate Analysis
14. What Has Been Done So Far
We (actuaries) have been quiet
Few published actuarial studies/opinions
NAIC/Tillinghast (1997)
Monaghan’s Study (2000)
Recent/related studies
Virginia State Study (1999)
CAS Sub-Committee (2002)
Washington State Study (2003)
University of Texas Study (2003)
15. What Has Been Done So Far
Relevant Actuarial/Statistical Principles
Pure premium vs. loss ratio
Loss ratio studies go beyond existing rating plans, and are implicitly multivariate
Independence vs. correlation
Most insurance variables are correlated
Univariate vs. multivariate
Correlated variables call for multivariate studies for true answers (Simpson’s Paradox)
Credibility vs. homogeneity
Studies need to be credible and representative
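The pure premium vs. loss ratio principle above can be sketched numerically. Pure premium is losses per unit of exposure; loss ratio is losses divided by premium. Because premium already reflects the existing rating plan, a variable that the plan prices correctly shows no loss ratio difference, while a variable outside the plan does. All figures below are invented, and the segment names and the `Segment` class are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    exposure: float  # e.g. earned car-years
    premium: float   # earned premium under the existing rating plan
    losses: float    # incurred losses

    @property
    def pure_premium(self) -> float:
        return self.losses / self.exposure

    @property
    def loss_ratio(self) -> float:
        return self.losses / self.premium


# Same invented book of business, split two different ways.
# Split by a variable already in the rating plan (driver age):
by_age = [
    Segment("young drivers", 1000, 1_200_000, 900_000),
    Segment("mature drivers", 1000, 600_000, 450_000),
]
# Split by a variable NOT in the rating plan (hypothetical credit band):
by_credit = [
    Segment("low credit score", 1000, 900_000, 800_000),
    Segment("high credit score", 1000, 900_000, 550_000),
]

for split in (by_age, by_credit):
    for s in split:
        print(f"{s.name}: pure premium = {s.pure_premium:.0f}, "
              f"loss ratio = {s.loss_ratio:.3f}")
```

In this sketch driver age separates pure premiums (900 vs. 450) but not loss ratios (0.75 in both segments), because the plan already charges for it. The credit split shows a loss ratio gap, which is the sense in which a loss ratio study looks beyond the existing rating plan.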
16. What Has Been Done So Far
The Tillinghast Study
9 companies’ data, seems representative
Loss ratio study
No other predictive variables included in the study
No detailed information given about the data
Strong correlation with loss ratio, seems credible
This is true, but it doesn’t answer our question and doesn’t quiet the critics
17. What Has Been Done So Far
18. What Has Been Done So Far
Monaghan’s Study
Loss ratio study
Large amount of data – credible analysis
Analyzes individual credit variables as well as score
Multivariate analysis – limited to score + 1 traditional rating variable at a time
Shows strong correlations with loss ratio do not go away in the presence of other variables
Another good step, but we can go further
19. Our Large-Scale Data Mining Experience
Our Work
Loss ratio studies
Multiple studies - representative
Large amounts of data – credible
Hundreds of variables tested along with credit – truly multivariate
Policy, driver, vehicle, coverages, billing, agency, external data, synthetic, etc.
Sound actuarial and statistical model design
Disciplined data mining process
20. Our Large-Scale Data Mining Experience
What Have We Found Out?
Credit score is always one of the top variables selected for the multivariate models
Credit score has among the strongest parameter estimates and statistical measures (t-statistics)
Credit’s predictive power does not go away in the truly multivariate context
Removing credit score dampens the predictive power of the models
21. Our Large-Scale Data Mining Experience
What Do We Conclude?
We conclude that credit score bears an unambiguous relationship to insurance losses, and is not a mere proxy for other kinds of information available to insurance companies.
This does not mean that credit score is the “cause” of insurance losses
22. Our Large-Scale Data Mining Experience
Why Is Credit Score Correlated with Insurance Losses?
Beyond the scope of our work
Emphasis is not causation
Plausible speculations include
Stress/planning & organization
Risk-seeking behavior
??
Analogy: Age/Gender might be a proxy for testosterone
23. Going Beyond Credit
Can We Do Well Without Credit?
YES: non-credit predictive models are
Valuable alternative to credit scores
Flexible
Tailored to individual companies
Comparable predictive power to credit scores
Also possible to build mixed credit/non-credit models
24. Going Beyond Credit
Keys to Building Successful Non-Credit Models:
Fully utilize all sources of information
Leverage company’s internal data sources
Enriched with other external data sources
Use large amount of data
Employ disciplined analytical process
Utilize state-of-the-art modeling tools
Apply multivariate methodology
25. Going Beyond Credit
Advantages of Going Beyond Credit
Next generation of competitive advantage
More variables, more predictive power
Leverages company’s internal data sources
More flexibility
Address regulatory issues and public concerns
Expense savings
Everyone gets a score (less of a “no hit” problem)
More customized – less “plain vanilla” than credit score
26. Conclusions
Credit works… even in a fully multivariate setting
But non-credit models can work well too!
What it means to us – beginning of a new era
Advances in computer technology
Advances in predictive modeling techniques
Large scale multivariate studies now practical
More external and internal info, anything else out there?
Other ways to go beyond credit?
27. Conclusions
Future work on this topic
Multivariate pure premium analysis would provide more insights
Further study of public policy issues
The Washington and Virginia state studies came to opposite conclusions
Comparison of various existing scoring models