Data mining in Health Insurance

Introduction
  • Rob Konijn, rob.konijn@achmea.nl
    • VU University Amsterdam
    • Leiden Institute of Advanced Computer Science (LIACS)
    • Achmea Health Insurance
      • Currently working here
      • Delivering leads for other departments to follow up
        • Fraud, abuse
  • Research topic keywords: data mining / unsupervised learning / fraud detection
Outline
  • Intro Application
    • Health Insurance
    • Fraud detection
  • Part 1: Subgroup discovery
  • Part 2: Anomaly detection (slides partly by Z. Slavik, VU)
Intro Application
  • Health Insurance Data
  • Health Insurance in NL
    • Obligatory
    • Only private insurance companies
    • About 100 euro/month (everyone) + 170 euro (income-dependent)
    • Premium increase of 5-12% each year

Achmea: about 6 million customers

Funding of Health Insurance Costs in the Netherlands

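[Figure: funding flows, amounts in billions of euros (mld)]
  • Risk equalisation fund (vereveningsfonds) receives:
    • State contribution for insured persons under 18 (rijksbijdrage): 2 bn
    • Income-dependent contribution via employers (inkomensafhankelijke bijdrage): 17 bn
  • Equalisation contribution (vereveningsbijdrage) from the fund to the health insurer (zorgverzekeraar): 18 bn
  • Nominal premium paid by insured persons aged 18+ to the health insurer:
    • Calculation premium (rekenpremie, ~€ 947 per insured): 12 bn
    • Surcharge (opslag, ~€ 150 per insured): 2 bn
  • Healthcare expenditure (zorguitgaven): 30 bn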

Risk equalisation model (vereveningsmodel)
  • By population characteristics
    • Age
    • Gender
    • Income, social class
    • Type of work
  • Calculation afterwards
    • High costs compensation (>15.000 euro)

Age group      Men      Women
0 - 4 yr       1,400    1,210
5 - 9 yr       1,026      936
10 - 14 yr       907      918
15 - 17 yr       964    1,062
18 - 24 yr       892    1,214
25 - 29 yr       870    1,768
30 - 34 yr       905    1,876
35 - 39 yr       980    1,476
40 - 44 yr     1,044    1,232
45 - 49 yr     1,183    1,366
50 - 54 yr     1,354    1,532
55 - 59 yr     1,639    1,713
60 - 64 yr     1,885    1,905
65 - 69 yr     2,394    2,201
70 - 74 yr     2,826    2,560
75 - 79 yr     3,244    2,886
80 - 84 yr     3,349    3,018
85 - 89 yr     3,424    3,034
90 yr and over 3,464    3,014

Introduction Application: The Data
  • Transactional data
    • Records of an event
    • Visit to a medical practitioner
  • Charged directly by the medical practitioner
  • Patient is not involved
  • Risk of fraud
Transactional Data
  • Transactions: Facts
    • Achmea: About 200 mln transactions per year
  • Info of customers and practitioners: dimensions
Different levels of hierarchy
  • Records represent events
  • However, for fraud detection, for example, we are interested in customers or medical practitioners
  • See examples next pages
  • Groups of records: Subgroup Discovery
  • Individual patients/practitioners: outlier detection
Different types of fraud hierarchy
  • On a patient level, or on a hospital level:
Handling different hierarchy
  • Creating profiles from transactional data
  • Aggregating costs over a time period
    • Each record: patient
      • Each attribute i =1 to n: cost spent on treatment i
  • Feature construction, for example
    • The ratio of long/short consults (G.P.)
    • The ratio of 3-way and 2-way fillings (Dentist)
    • Usually used for one-way analysis
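
A minimal sketch of the profile construction described above, assuming transactions arrive as (patient, treatment code, cost) tuples. The sample records, the code "C11" and the helper names are hypothetical; the ratio here is a cost ratio, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical transactions: (patient_id, treatment_code, cost_in_euro).
transactions = [
    ("p1", "C11", 20.0),
    ("p1", "V11", 45.0),
    ("p1", "V11", 45.0),
    ("p2", "C11", 20.0),
    ("p2", "V21", 30.0),
]

def build_profiles(records):
    """Aggregate transaction costs into one cost-per-treatment profile per patient."""
    profiles = defaultdict(lambda: defaultdict(float))
    for patient, treatment, cost in records:
        profiles[patient][treatment] += cost
    return {patient: dict(costs) for patient, costs in profiles.items()}

def ratio(profile, numerator, denominator):
    """Constructed feature: cost ratio of two treatments (None if the denominator is absent)."""
    denom = profile.get(denominator, 0.0)
    return profile.get(numerator, 0.0) / denom if denom else None

profiles = build_profiles(transactions)
print(profiles["p1"])                       # {'C11': 20.0, 'V11': 90.0}
print(ratio(profiles["p1"], "V11", "C11"))  # 4.5
```

Each resulting profile is one record per patient, with one attribute per treatment, which is the shape the later subgroup discovery and outlier detection steps work on.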
Different types of fraud detection
  • Supervised
    • A labeled fraud set
    • A labeled non-fraud set
    • Credit cards, debit cards
  • Unsupervised
    • No labels
    • Health Insurance, Cargo, telecom, tax etc.
Unsupervised learning in Health Insurance Data
  • Anomaly Detection (outlier detection)
    • Finding individual deviating points
  • Subgroup Discovery
    • Finding (descriptions of) deviating groups
  • Focus on differences and uncommon behavior
    • In contrast to other unsupervised learning methods
      • Clustering
      • Frequent Pattern mining
Subgroup Discovery
  • Goal: Find differences in claim behavior of medical practitioners
  • To detect inefficient claim behavior
    • Actions:
      • A visit from the account manager
      • To include in contract negotiations
    • In the extreme case: fraud
      • Investigation by the fraud detection department
  • By describing deviations of a practitioner from its peers
    • Subgroups
Patient-level, Subgroup Discovery
  • Subgroup (orange): group of patients
  • Target (red)
    • Indicates whether a patient visited a practitioner (1), or not (0)
Subgroup Discovery: Quality Measures
  • Target Dentist: 1672 patients
    • Compare with peer group, 100.000 patients in total
  • Subgroup V11 > 42 euro : 10347 patients
    • V11: one sided filling
  • Crosstable
The cross table
  • Cross table in data
  • Cross table expected:
  • Assuming independence
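
The expected cross table can be sketched as follows, using the totals from the dentist example (100,000 patients, 10,347 in the subgroup, 1,672 for the target dentist). The function name is an assumption for illustration.

```python
def expected_cross_table(n_total, n_subgroup, n_target):
    """Expected 2x2 counts if subgroup and target membership were independent.

    Rows: in subgroup / not in subgroup. Columns: target / not target.
    """
    rows = (n_subgroup, n_total - n_subgroup)
    cols = (n_target, n_total - n_target)
    return [[r * c / n_total for c in cols] for r in rows]

table = expected_cross_table(100_000, 10_347, 1_672)
for row in table:
    print([round(value, 1) for value in row])
```

The top-left cell is the number of target-dentist patients expected inside the subgroup, about 173 for these totals.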
Calculating Wracc and Lift
  • Size subgroup = P(S) = 0.10347, size target dentist = P(T) = 0.01672
  • Weighted Relative ACCuracy (WRAcc) = P(ST) – P(S)P(T) = (871 – 173)/100000 = 698/100000
  • Lift = P(ST)/(P(S)P(T)) = 871/173 = 5.03
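
A short sketch of both measures computed from raw counts, using the numbers of the dentist example (871 patients in both subgroup and target). The function names are illustrative.

```python
def wracc(n_total, n_subgroup, n_target, n_both):
    """Weighted relative accuracy: P(ST) - P(S) * P(T)."""
    p_s = n_subgroup / n_total
    p_t = n_target / n_total
    p_st = n_both / n_total
    return p_st - p_s * p_t

def lift(n_total, n_subgroup, n_target, n_both):
    """Lift: P(ST) / (P(S) * P(T))."""
    p_s = n_subgroup / n_total
    p_t = n_target / n_total
    p_st = n_both / n_total
    return p_st / (p_s * p_t)

counts = (100_000, 10_347, 1_672, 871)
print(round(wracc(*counts) * 100_000))  # 698 (per 100,000 patients)
print(round(lift(*counts), 2))          # 5.03
```

WRAcc rewards large subgroups with a sizeable absolute surplus, while lift is a pure ratio and can be high for very small subgroups.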
Making SD more useful: adding prior knowledge
  • Adding prior knowledge
    • Background variables patient (age, gender, etc.)
    • Specialism practitioner
    • For dentistry: choice of insurance
  • Adding already known differences
    • Already detected by domain experts themselves
    • Already detected during a previous data mining run
Quality Measures
  • Ratio (Lift)
  • Difference (WRAcc)
  • Squared sum (Chi-square statistic)
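
As an illustration of the third measure, a sketch of the Pearson chi-square statistic on a cross table. The observed counts below are derived from the dentist example (871 in both, 10,347 in the subgroup, 1,672 for the target, 100,000 in total); the function name is an assumption.

```python
def chi_square(observed):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected,
    with expected counts computed under independence of rows and columns."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat

# Rows: subgroup / rest. Columns: target dentist / other dentists.
observed = [[871, 9_476], [801, 88_852]]
print(chi_square(observed))
```

Unlike lift and WRAcc, this statistic is symmetric: it grows for both over- and under-representation of the target in the subgroup.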
Example, iterative approach
  • Idea: add subgroup to prior knowledge iteratively
  • Target = single pharmacy
  • Patients that visited the hospital in last 3 years removed from data
  • Compare with peer group (400,000 patients), 2929 patients of target pharmacy
  • Top subgroup : “B03XA01 (Erythropoietin)>0 euro”

[Cross table of subgroup (B03XA01 > 0 vs. rest) against target (one 'target' pharmacy vs. rest)]

Next iteration
  • Add “B03XA01 (EPO) >0 euro” to prior knowledge
  • Next best subgroup: “N05AX08 (Risperdal)>= 500 euro”
Figure describing subgroup: N05AX08 > 500

Left: target pharmacy, right: other pharmacies

Addition: adding costs to quality measure
    • M55: dental cleaning
    • V11: 1-way filling
    • V21: polishing
  • Cost of treatments in subgroup 370 euro (average)
  • 791 more patients than expected
  • Total quality: 791 × 370 ≈ 292,469 euro (computed with the unrounded average cost)
Iterative approach, top 3 subgroups
    • V12: 2-sided filling
    • V21: polishing
    • V60: indirect pulpa covering
  • V21 and V60 are not allowed on the same day
  • Claim back (from all dentists): 1.3 million euro
Other target types: double binary target
  • Target 1: year: 2009 or 2008
  • Target 2: target practitioner
  • Pattern:
    • M59: extensive (expensive) dental cleaning
    • C12: second consult in one year
  • Crosstable:
Other target types: Multiclass target
  • Subgroup (orange): group of patients
  • Target (red), now is a multi-value column, one value per dentist
Anomaly Detection

The example above contains a contextual anomaly...

Outline Anomaly Detection
  • Anomalies
    • Definition
    • Types
    • Technique categories
    • Examples
  • Lecture based on
    • Chandola et al. (2009). Anomaly Detection: A Survey
    • Paper in BB

Definition
  • “Anomaly detection refers to the problem of finding patterns in data that do not conform to expected behavior”
  • Anomalies, aka.
    • Outliers
    • Discordant observations
    • Exceptions
    • Aberrations
    • Surprises
    • Peculiarities
    • Contaminants
Anomaly types

Point anomalies

  • A data point is anomalous with respect to the rest of the data
Not covered today
  • Other types of anomalies:
    • Collective anomalies
    • Contextual anomalies
  • Other detection approaches:
    • Supervised learning
    • Semi supervised
      • Assume training data is from normal class
      • Use to detect anomalies in the future
We focus on outlier scores
  • Scores
    • You get a ranked list of anomalies
    • “We investigate the top 10”
    • “An anomaly has a score of at least 134”
    • Leads followed by fraud investigators
  • Labels

ANOMALY

Detection method categorisation
  • Model based
  • Depth based
  • Distance Based
  • Information theory related (not covered)
  • Spectral theory related (not covered)
Model based
  • Build a (statistical) model of the data
  • Data instances occur in high probability regions of a stochastic model, while anomalies occur in low probability regions
  • Or: data instances that have a high distance to the model are outliers
  • Or: data instances that have a high influence on the model are outliers
Example: one way outlier detection
  • Pharmacy records
  • Records represent patients
  • One attribute at a time:
    • This example: attribute describing the costs spent on fertility medication (gonadotropin) in a year
  • We could use such one way detection for each attribute in the data
Example, model = non-parametric distribution
  • Left: kernel density estimate
  • Right: boxplot
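
A rough sketch of one-way outlier scoring with the boxplot rule (values beyond 1.5 times the interquartile range from the quartiles). The yearly cost values are made up, and the scoring function is only one possible way to turn the boxplot into a score.

```python
import statistics

def boxplot_scores(values, whisker=1.5):
    """Outlier score = distance beyond the boxplot whiskers (0 for values inside)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower = q1 - whisker * iqr
    upper = q3 + whisker * iqr
    return [max(lower - v, v - upper, 0.0) for v in values]

# Hypothetical yearly costs (euro) for one medication attribute, one value per patient.
costs = [100, 120, 110, 130, 90, 105, 115, 2500]
scores = boxplot_scores(costs)
ranked = sorted(zip(scores, costs), reverse=True)
print(ranked[0])  # the patient with the highest score comes first
```

Running the same function over every attribute in turn gives the one-attribute-at-a-time analysis described above; a kernel density estimate could replace the quartiles when the distribution is multimodal.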
Other models possible
  • Probabilistic
    • Bayesian networks
  • Regression models
    • Regression trees/ random forests
    • Neural networks
  • Outlier score = prediction error (residual)
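
To illustrate "outlier score = prediction error", a minimal sketch with a one-variable least-squares line standing in for the regression model. The data points are invented; a regression tree, random forest or neural network would be used the same way.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residual_scores(xs, ys):
    """Outlier score per record: absolute difference between actual and predicted value."""
    slope, intercept = fit_line(xs, ys)
    return [abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)]

# Hypothetical records: x = predictor attribute, y = claimed cost.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [10, 20, 30, 150, 50, 60, 70]
scores = residual_scores(xs, ys)
print(scores.index(max(scores)))  # 3: the record with y = 150 has the largest residual
```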
Depth based methods
  • Applied on 1-4 dimensional datasets
    • Or 1-4 attributes at a time
  • Objects that have a high distance to the “center of the data” are considered outliers
  • Example Pharmacy:
    • Records represent patients
    • 2 attributes:
      • Costs spent on diabetes medication
      • Costs spent on diabetes testing material
Distance based (nearest neighbor based)
  • Assumption:
    • Normal data instances occur in dense neighbourhoods, while anomalies occur far from their closest neighbours
Similarity/distance
  • You need a similarity measure between two data points
    • Numeric attributes: Euclidean, etc.
    • Nominal: simple match often enough
    • Multivariate:
      • Distance using all attributes
      • Distance between attribute values, then combine
Example, dentistry data
  • Records represent dentists
  • Attributes are 14 cost categories
    • Denote the percentage of patients that received a claim from the category
Option 2: Use relative densities of neighbourhoods
  • Density of neighbourhood estimated for each instance
  • Instances in the low density neighbourhoods are anomalous, others normal
  • Note:
    • Distance to kth neighbour is an estimate for the inverse of density (large distance → low density)
    • But this estimates outliers in varying density neighbourhoods badly
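
The kth-neighbour distance mentioned in the note can be sketched as an outlier score like this, assuming numeric attributes and Euclidean distance. The points are invented for illustration.

```python
import math

def kth_neighbour_distance(points, k):
    """Outlier score per point: Euclidean distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical 2-attribute records: four points close together, one far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = kth_neighbour_distance(points, k=2)
print(scores.index(max(scores)))  # 4: the isolated point gets the highest score
```

Because the score is an absolute distance, a point on the edge of a sparse but normal region can outrank a true outlier next to a dense region, which is the weakness the last bullet refers to.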
LOF
  • Local Outlier Factor (LOF) = (average local density of the k nearest neighbours) / (local density of the instance)
  • Local density:
    • k divided by the volume of the smallest hyper-sphere centred around the instance, containing k neighbours
  • Anomalous instance:
    • Local density will be lower than that of the k nearest neighbours
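
A simplified sketch of the score as defined on this slide: local density is k divided by the volume of the hyper-sphere reaching the k-th neighbour, and the factor is the neighbours' average density divided by the instance's own. This is not the full published LOF algorithm (which uses reachability distances), and it assumes no duplicate points, since a zero radius would give a zero volume.

```python
import math

def knn(points, i, k):
    """(distance, index) pairs of the k nearest neighbours of points[i]."""
    others = sorted((math.dist(points[i], q), j) for j, q in enumerate(points) if j != i)
    return others[:k]

def local_density(points, i, k):
    """k divided by the volume of the smallest hyper-sphere around points[i] holding k neighbours."""
    radius = knn(points, i, k)[-1][0]
    dim = len(points[i])
    volume = math.pi ** (dim / 2) / math.gamma(dim / 2 + 1) * radius ** dim
    return k / volume

def simple_lof(points, k):
    """Average local density of the k nearest neighbours divided by the local density of the instance."""
    dens = [local_density(points, i, k) for i in range(len(points))]
    scores = []
    for i in range(len(points)):
        neighbours = [j for _, j in knn(points, i, k)]
        scores.append(sum(dens[j] for j in neighbours) / k / dens[i])
    return scores

points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = simple_lof(points, k=2)
print([round(s, 2) for s in scores])
```

Points inside a uniformly dense region score close to 1; a point whose own density is much lower than that of its neighbours scores far above 1.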
3. Clustering based a.d. techniques
  • 3 possibilities:

1. Normal data instances belong to a cluster in the data, while anomalies do not belong to any cluster

    • Use clustering methods that do not force all instances to belong to a cluster
      • DBSCAN, ROCK, SNN

2. Distance to the cluster center = outlier score

3. Clusters with too few points are outlying clusters
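
Possibilities 2 and 3 can be sketched as follows, assuming the cluster centers have already been produced by some clustering method such as k-means. The points, centers and the minimum cluster size are invented for illustration.

```python
import math
from collections import Counter

def cluster_outlier_scores(points, centers):
    """Assign each point to its nearest center; the distance to that center is the outlier score."""
    assignments, scores = [], []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        nearest = min(range(len(centers)), key=dists.__getitem__)
        assignments.append(nearest)
        scores.append(dists[nearest])
    return assignments, scores

def small_clusters(assignments, min_size):
    """Clusters with fewer than min_size members are treated as outlying clusters."""
    counts = Counter(assignments)
    return {cluster for cluster, size in counts.items() if size < min_size}

# Hypothetical data: two centers, one point far from both, one tiny cluster.
points = [(0, 0), (0, 1), (1, 0), (4, 4), (10, 10)]
centers = [(0.3, 0.3), (10, 10)]
assignments, scores = cluster_outlier_scores(points, centers)
print(assignments)                     # [0, 0, 0, 0, 1]
print(scores.index(max(scores)))       # 3: the point (4, 4) is farthest from its center
print(small_clusters(assignments, 2))  # {1}: a cluster with a single member
```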

K-means with 6 clusters, centers of the dentistry data set
  • Attributes: percentage of patients that received a claim from the cost category
  • Clusters correspond to specialism
    • Dentist
    • Orthodontist
    • Orthodontist (charged by dentist)
    • Dentist
    • Dentist
    • Dental hygienist
Combining Subgroup Discovery and Outlier Detection
  • Describe regions with outliers using SD
  • Identify suspicious medical practitioners
  • 2 or 3 step approach to describe outliers:
    • Calculate outlier score
    • Use subgroup discovery to describe regions with outliers.
    • (optional) identify the involved medical practitioners
Example output:
  • Look at patients with ‘P30>1050 euro’ for practitioner number 221
  • Left: all data, right: practitioner 221
Descriptions of outliers: LOCI outlier score
  • 1. Calculate outlier score
    • LOCI is a density based outlier score
  • 2. Describe outlying regions
  • Result top subgroup:
    • Orthodontics (dentist) 0.044 ^ Orthodontics 0.78
    • Group of 9 dentists with an average score of 3.9
Conclusions
  • Health insurance: Interesting application domain
    • Very relevant
  • Outlier Detection and Subgroup discovery are useful