Solving Classification Problems for Symptom Validity Tests with Mixed Groups Validation

Richard Frederick, Ph.D., ABPP (Forensic)

US Medical Center for Federal Prisoners

Springfield, Missouri

I am not a neuropsychologist.

My view of brain

Your view of brain

My board certifications:

Forensic Psychology

American Board of Professional Psychology

Assessment Psychology

American Board of Assessment Psychology

My professional goal:

Use tests properly in forensic psychological assessments

Goals of workshop

Participants in this workshop will be able

to employ Excel graphing methods:

--to evaluate classification characteristics of

symptom validity tests

--to adapt symptom validity test scores to their

individual, local, base rates

--to combine information from local base rate

and multiple symptom validity tests

richardfrederick.com

- The SIRS has sensitivity = .485 and specificity = .995.
- The SIRS was administered to 131 criminal defendants
- who were strongly suspected of feigned psychopathology.
- 68% of them were categorized as feigning by the SIRS

What is a classification test?

A structured routine for determining

which individuals belong to which

of two groups.

- (1) There are two groups.
- (2) It’s not easy to determine which group an individual belongs to without the help of the test.

Real World

The distributions represent our

estimations of how the populations of

the two groups score on the test.

We generally estimate the population

distributions by sampling. We notice

that the populations have two separate,

but overlapping distributions. The extent

of the overlap is of concern to us.

- Questions that must be addressed in research before we can continue:
- Are there really two separate groups?
- Can we effectively represent the population distributions by sampling?

Real World

What we notice next.

The mean separation between the

groups is 10 points.

Persons in Population A have a mean

score that is 10 points below persons in

Population B.

The sd for each population is the same. The

mean separation between groups is one sd.

When researchers talk about mean

separation, they often refer to effect size.

Often, Cohen’s d is the statistic used to

refer to standardized mean separation.

Here, Cohen’s d = 1. This is often referred

to as a large, or very large, effect size.

Mean separation = 0

Making tests often means finding those characteristics that best separate the distributions of the two groups.

Two distributions of gender

with respect to:

Intelligence

Moderately large mean separation

Two distributions of gender

with respect to:

Longevity

Large mean separation

Two distributions of gender

with respect to:

Hair Length

Very large mean separation

Two distributions of gender

with respect to:

Body Mass

Real World

- Summary:
- (1) We have two groups.
- (2) We have a test on which the two groups score differently.
- (3) The difference in mean scores represents a very large effect.

More commonly, researchers report

Sensitivity and Specificity. These terms

are common, but not the most helpful.

We are going to use the terms:

True Positive Rate (TPR) and

False Positive Rate (FPR).

TPR = Sensitivity

FPR = 1 - Specificity

What are TPR and FPR?

TPR is the proportion of individuals who do have the condition who generate positive scores. TPR is the rate of scores beyond the cut in the direction that indicates the presence of the condition.

FPR is the proportion of individuals who do NOT have the condition who generate positive scores. FPR is the rate of scores beyond the cut in the direction that indicates the presence of the condition.

The green line represents

the cut score. Scores to the

LEFT of the line are classified

NEGATIVE. Scores to the RIGHT

are classified POSITIVE.

Have nots

Haves

Here, the False Positive Rate is 92.4%.

The True Positive Rate is 100%.

As we move the line to the right, both

rates DECREASE.

To totally eliminate

false positives, we have

to be willing to identify

almost no one as a

positive.

TPR = True Positives/Haves

FPR = False Positives/Have Nots

Haves

Have nots

A positive score will be one that is

associated with Population A

membership. If we set a point at which

a score will be used to say, “This score

represents Population A,” such a score

will be referred to as a “positive score.”

A positive score can be a true positive

or a false positive: unknown to us.

The True Positive Rate is the proportion of

Population A members who generate a

positive score.

In our figure, the point at which we

begin to identify “positive scores” is at 50, the

mean of population A. Scores at or below 50

are called positive, and a person who

generates a positive score is classified as a

Population A member.

We can pick any value to be our “cut score,” but

it’s hard to pick one that doesn’t result in some

Population B members producing “positive

scores.”

In our figure, 50% of the Population A members

have scores at 50 or below. This is the True

Positive Rate. TPR = .50.

In our figure, 16% of the Population B members

have scores at 50 or below. This is the False

Positive Rate. FPR = .16.

We note that it is not the test that has

a certain TPR and FPR.

It is the chosen test score that has a

certain TPR and FPR.

A different test score will almost

certainly have different TPR and FPR.
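
The numbers in the figure can be checked with a short sketch. It assumes both populations are normally distributed (the slides show bell curves but do not state this), with Population A at mean 50, Population B at mean 60, and sd = 10 for both. The function names are illustrative only.

```python
import math

def normal_cdf(x, mean, sd):
    """Proportion of a normal population scoring at or below x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def rates_at_cut(cut, mean_a, mean_b, sd):
    """Scores at or below the cut are positive (Population A scores lower)."""
    tpr = normal_cdf(cut, mean_a, sd)  # Population A members called positive
    fpr = normal_cdf(cut, mean_b, sd)  # Population B members called positive
    return tpr, fpr

# Assumed parameters: means 50 and 60, common sd of 10 (Cohen's d = 1).
for cut in (40, 50, 60):
    tpr, fpr = rates_at_cut(cut, mean_a=50, mean_b=60, sd=10)
    print(f"cut = {cut}: TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

Each cut score gets its own (TPR, FPR) pair, which is the point of the slide: the rates belong to the chosen score, not to the test.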

We think of a test as a way to characterize a dependency.

As you have more of X, you have more of Y.

Y depends on X.

X predicts Y.

X is some construct. Y is some test score.

There is a relationship that we wish to characterize and

quantify.

Let’s consider feigning.

As you are more likely to feign, you are more likely to

engage in certain behavior.

This behavior might be “providing answers to items on a

test” at a certain rate.

You might choose more items, you might choose fewer

items than “normals.”

We develop the idea that we can identify individuals who

respond at a certain rate as feigners, and we decide to

make a decision point about when we call test takers

feigners and when we don’t.

We call that decision point a cut score.

We call test scores at or beyond the cut score:

positive scores

Some positive scores are correct: true positives

Some positive scores are incorrect: false positives

If our test is any good, and if the relationship between

X and Y is strong, then our rate of true positives is much

higher than our rate of false positives.

Let’s skip to the end. We are now using the test in our

clinic.

We look over our results. We see a number of “positive

scores.”

We know that those “positive scores” are some unknown

mixture of “true positives” and “false positives.”

We’d like to know what the ratio of that mixture is.

Here’s how we do it:

First, we estimate what the true positive rate of the cut

score is.

Then, we estimate what the false positive rate of the

cut score is.

Then, we figure out what percentage of people in our

sample are feigning.

Then we can get the ratio of the mixture of our true

positives and false positives in all the positive scores in our

clinic. (We call this positive predictive power.)
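
The three estimates combine by simple arithmetic. The sketch below is one way to write it out. The function name is illustrative, and the base rate of .30 in the example is a made-up value, not a figure from the workshop.

```python
def predictive_powers(tpr, fpr, base_rate):
    """Return (PPP, NPP) for a cut score used in a sample with the given base rate."""
    true_pos = base_rate * tpr
    false_pos = (1.0 - base_rate) * fpr
    true_neg = (1.0 - base_rate) * (1.0 - fpr)
    false_neg = base_rate * (1.0 - tpr)
    # PPP: share of all positive scores that are true positives.
    ppp = true_pos / (true_pos + false_pos)
    # NPP: share of all negative scores that are true negatives.
    npp = true_neg / (true_neg + false_neg)
    return ppp, npp

# TPR = .50 and FPR = .16 come from the earlier figure; BR = .30 is hypothetical.
ppp, npp = predictive_powers(tpr=0.50, fpr=0.16, base_rate=0.30)
print(f"PPP = {ppp:.3f}, NPP = {npp:.3f}")
```

The function assumes the sample produces at least some positive and some negative scores; otherwise the denominators are zero.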

Getting TPR and FPR:

We depend on researchers to tell us what the estimates

of true positive rate and false positive rate are.

They usually do this through a process called

“criterion groups validation.”

People with more confidence than might be called

for refer to this process as “known groups validation.”

The process is seemingly straightforward.

Identify two groups. One group has the condition.

All the positives in this group are “true positives.”

One group doesn’t have the condition. All the positives

in this group are “false positives.”

The rate of “true positives” is the sensitivity of the test.

TPR = sensitivity.

The rate of “false positives” is the non-specificity of

the test. FPR = 1 – specificity.

There are many problems with this process, but let’s

focus on the main two.

Problem 1

In Study 1, for a given cut score, researchers report the

TPR is .67 and the FPR is .12.

In Study 2, for the same cut score, researchers report

TPR = .58 and FPR = .09.

Which values do you use?

Problem 2:

In Study 1, for a given cut score, researchers report the

TPR is .67 and the FPR is .12.

In Study 2, for a different cut score, researchers report

TPR = .58 and FPR = .09.

Which cut score do you use?

TRUTH

TEST

Because God does not whisper to us anymore, we take this test, our best test, and we say, “This is the best we can do.” Let’s call it our Gold Standard. We will now make criterion groups with this test, and we will call the groups “Known Groups.” We will then validate tests, based on these Known Groups.

TRUTH

TEST

“KNOWN” GROUPS

TRUTH

“KNOWN” GROUPS

Let’s validate a new test, which just happens to be a perfect test. What test diagnostic efficiencies will we assign our new, perfect, test?

“KNOWN” GROUPS

PERFECT TEST

Our belief that

we can make

perfect criterion

groups from

imperfect criteria

has led us to

misunderstand

tremendously

what we are

doing.

PERFECT TEST

TPR = 49/50 = 98%, FPR = 51/150 = 34%
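
The arithmetic behind those two fractions can be sketched as follows. The reading of the counts is an assumption inferred from the fractions themselves: the "known positive" group of 50 is taken to contain 49 people who truly have the condition, and the "known negative" group of 150 is taken to contain 51 who truly have it. A perfect test flags exactly the people who have the condition.

```python
def apparent_rates(haves_in_known_pos, n_known_pos, haves_in_known_neg, n_known_neg):
    """Rates a perfect test appears to have when scored against impure 'known' groups."""
    # Every true 'have' gets a positive score, wherever the gold standard put them.
    apparent_tpr = haves_in_known_pos / n_known_pos
    apparent_fpr = haves_in_known_neg / n_known_neg
    return apparent_tpr, apparent_fpr

# Assumed counts, inferred from the slide's fractions 49/50 and 51/150.
tpr, fpr = apparent_rates(49, 50, 51, 150)
print(f"Apparent TPR = {tpr:.0%}, apparent FPR = {fpr:.0%}")
```

Under this reading, every "false positive" charged to the perfect test is really an error of the gold standard that built the groups.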

Let’s begin to address these problems in a

non-traditional way.

Table for Computation of Test Characteristics

REMINDER: Here is what we are working on—figuring out

which positives in our clinic are true positives.

First, we estimate what the true positive rate of the cut

score is. Then, we estimate what the false positive rate of

the cut score is.

Let’s do that part now.

Then, we figure out what percentage of people in our

sample are feigning.

Then we can get the ratio of the mixture of our true

positives and false positives in all the positive scores in our

clinic.

Table for Computation of Test Characteristics

When BR = 0, 10% of scores

positive, all false positives

When we

say FPR = .16

and TPR = .50,

what we’re

saying is that,

no matter

what samples

we test, we

expect to

see no fewer

than 16%

positive scores

and no more

than 50%

positive

scores.

Movement along this line from left to right

represents increasing rate of Population A and

increasing rate of positive scores.
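
The line described here can be written as a one-line function. This is a sketch of the relationship as the slides describe it, with an illustrative function name: the expected rate of positive scores starts at FPR when the base rate is 0 and rises to TPR when the base rate is 1.

```python
def expected_positive_rate(base_rate, tpr, fpr):
    """Expected proportion of positive scores in a sample with the given base rate."""
    return fpr + base_rate * (tpr - fpr)

# TPR = .50 and FPR = .16 are the values used on the slide.
for br in (0.0, 0.25, 0.50, 0.75, 1.0):
    rate = expected_positive_rate(br, tpr=0.50, fpr=0.16)
    print(f"BR = {br:.2f}: expected proportion positive = {rate:.3f}")
```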

FPR = .10

TPR = .80

Table for Computation of Test Characteristics

TOMM

No simulation

studies

FPR = .056,

SE = .025

TPR = .742,

SE = .093

For any

imperfect

test,

PPP ranges

from 0 to 1

as base rate

ranges from

0 to 1

NPP ranges

from 0 to 1

as base rate

ranges from

1 to 0

NPP

PPP

Using MGV to estimate test diagnostic efficiencies of the Reliable Digit Span

Laurie Ragatz, PhD

Richard Frederick, PhD

RDS is a symptom validity measure derived from Digit Span. The value of RDS is the sum of the longest forward string and the longest backward string for which both trials were passed.

Researched cut scores include 5 or lower, 6 or lower, 7 or lower, or 8 or lower.

Directions: Examinee recalls numbers in the same order they were provided by the examiner

Directions: Examinee recalls numbers in the reverse order they were provided by the examiner

Reliable Digit Span: 4 + 3 =7
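
A minimal sketch of the scoring rule follows. The protocol in the example is hypothetical, chosen only so that it reproduces the 4 + 3 = 7 shown above, and the data layout (string length mapped to pass/fail on two trials) is an assumption made for illustration.

```python
def longest_reliable_span(trials_by_length):
    """Longest string length at which BOTH trials were passed (0 if none)."""
    reliable = [length for length, (first, second) in trials_by_length.items()
                if first and second]
    return max(reliable, default=0)

def reliable_digit_span(forward, backward):
    """RDS = longest reliable forward span + longest reliable backward span."""
    return longest_reliable_span(forward) + longest_reliable_span(backward)

# Hypothetical examinee: string length -> (trial 1 passed, trial 2 passed).
forward = {3: (True, True), 4: (True, True), 5: (True, False)}
backward = {2: (True, True), 3: (True, True), 4: (False, False)}
print(reliable_digit_span(forward, backward))
```

Note that the forward string of 5 does not count: only one of its two trials was passed.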

(1) We found all available articles dealing

with RDS and identified the cut scores

investigated. We included simulator studies.

(2) Based on the authors’ decision about

criterion group membership, we

calculated the overall base rate of

malingering in the study.

(3) We observed the overall rate of

positive scores in the study at the

identified cut score.

(4) We did not include any data for persons

with mental retardation. The rate of

positive scores among persons with mental

retardation was exceedingly high for all cut

scores.

Criterion group

Test outcome

We have 63 malingerers in a sample of 203. BR = 63/203 = 0.31.

We have 57 positive scores. Proportion positive scores (PPS) is 57/203 = .28. For this study, we plot (BR, PPS) = (.31, .28)

x = .31, y = .28. Our n for WLS = 203.

Using weighted least squares regression (with N as the weight),

we regressed Proportion Positive Scores (PPS) on Base Rate (BR)

to generate the Proportion Positive Score Line.

We obtained y-intercept of -.015 (all negative values are

truncated to 0), and slope of .265.
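
The regression step can be sketched in plain Python. Two things here are assumptions. First, the study points are synthetic, generated to lie exactly on a line with FPR = .10 and TPR = .80, because the per-study data are not reproduced in these slides. Second, TPR is taken as the truncated intercept plus the slope, which is how the reported values (intercept truncated to 0, slope .265, TPR = .265) appear to have been obtained.

```python
def weighted_line(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x."""
    total_w = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mgv_rates(base_rates, positive_rates, sample_sizes):
    """Estimate (FPR, TPR) from study-level base rates and positive-score rates."""
    intercept, slope = weighted_line(base_rates, positive_rates, sample_sizes)
    fpr = max(intercept, 0.0)      # negative intercepts are truncated to 0
    tpr = min(fpr + slope, 1.0)    # assumed reading: truncated intercept + slope
    return fpr, tpr

# Synthetic studies (NOT real data): each lies exactly on PPS = .10 + .70 * BR.
base_rates = [0.10, 0.31, 0.50, 0.90]
sample_sizes = [120, 203, 80, 60]
positive_rates = [0.10 + 0.70 * br for br in base_rates]

fpr, tpr = mgv_rates(base_rates, positive_rates, sample_sizes)
print(f"FPR = {fpr:.3f}, TPR = {tpr:.3f}")
```

A statistics package would also return the standard errors of the intercept and slope, which this sketch leaves out.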

RDS = 5 or lower

put these data in WLS

to obtain regression

line characteristics

scatterplot

RDS: 5 or lower, FPR = 0, TPR = .265

RDS = 6 or lower

y-intercept = .015, slope = .419

RDS: 6 or lower, FPR = .015, TPR = .434

RDS = 7 or lower

y-intercept = .187, slope = .39

RDS: 7 or lower, FPR = .187, TPR = .618

RDS = 8 or lower

y-intercept = .236, slope = .824

RDS: 8 or lower, FPR = .236, TPR = .824

As we move

from a cut score

of 5 or lower to 6

or lower, we

obtain

substantial

improvement in

TPR estimate

with little cost

in FPR increase.

Our choice for best cut score for RDS

RDS: 6 or lower, FPR = .015, TPR = .434

By using WLS regression, we can obtain standard errors of

our estimates of FPR and TPR.

So, new researchers can test hypotheses about parametric

values of FPR and TPR.

- Summary:
- The TVS and MGV allow powerful research into existing published data sets. Summary data are used.
- Understanding of parametric values of TPR and FPR is facilitated when researchers publish results on a variety of cut scores that should be considered. A frequency
- distribution would be ideal, for example,

- Combining studies in this way allows us to generate
- stable values of TPR and FPR with SE’s so that new research
- can test those values.
- Researchers should focus on the basis for estimating BR’s
- in their research groups. All research estimating FPR and TPR
- is vulnerable to error when the purity of research groups is
- overestimated. Working towards a reliable estimate of
- mixed group base rate will facilitate better validation studies.

- How can the Test Validation Summary help me determine my local BR?
- 1. Get the best estimate of the test FPR and TPR for a certain test score.
- 2. Find the proportion of test scores in your sample that are positive scores.

From a sample, observe

rate of positive scores.

Use TVS to estimate

condition BR in that

sample, PPP and NPP

for that BR.

527 criminal

defendants

who took

RMT and VIP

concurrently

Rate of positive

scores in this

sample was .113

PPP = .814

1 – NPP = .077

TOMM

No simulation

studies

FPR = .056,

SE = .025

TPR = .742,

SE = .093

Beth A. Caillouet, Bernice A. Marcopulos, Jesse G. Brand,

Julie Ann Kent, & Richard I. Frederick

Question: What are the BRs of malingering in the two samples?

Information needed:

Estimates of TOMM FPR and TPR. From TOMM TVS, we

get FPR = .056, TPR = .742.

Sample 1: Secondary gain present. Proportion positive

scores = 55/220 = .25.

Sample 2: Secondary gain absent. Proportion positive

scores = 34/299 = .11.

Use TOMM TVS to estimate BR of each sample.

When PPS = .25,

BR = .28.

When PPS = .11,

BR = .08.
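
Reading the BR off the TVS graph is the same as solving the proportion-positive-score line for BR. A minimal sketch, with an illustrative function name; clamping the result to the 0 to 1 range is an added assumption for samples whose positive rate falls outside the FPR to TPR interval.

```python
def estimate_base_rate(positive_rate, tpr, fpr):
    """Solve PPS = FPR + BR * (TPR - FPR) for BR, clamped to [0, 1]."""
    base_rate = (positive_rate - fpr) / (tpr - fpr)
    return min(max(base_rate, 0.0), 1.0)

# TOMM estimates from the TVS: FPR = .056, TPR = .742.
for label, pps in (("Secondary gain present", 0.25), ("Secondary gain absent", 0.11)):
    br = estimate_base_rate(pps, tpr=0.742, fpr=0.056)
    print(f"{label}: PPS = {pps:.2f}, estimated BR = {br:.2f}")
```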

TPR = .93, FPR = .17

BR malingering = 35%, N = 86

TPR = 28/30 = .93, FPR = 10/56 = .18

BR malingering = .35: PPP = 28/38 = .737, NPP = .958

BR malingering = .05: PPP = 28/131 = .213, NPP = .996

Plotted: NPP, PPP, 1 – FPR, TPR

Test validation

summary for

M-FAST cut

score

recommended

by test manual.

PPP does not even reach 50%

correct decisions until BR > .16

M-FAST > 5

FPR = .17

TPR = .93
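
The base rate at which PPP crosses a target value can be found by rearranging the PPP formula. The sketch below does that; the function name is illustrative. With the rounded FPR = .17 it lands slightly under .16, and with FPR = 10/56 slightly above, so either way close to the figure on the slide.

```python
def base_rate_for_ppp(target_ppp, tpr, fpr):
    """Smallest base rate at which PPP reaches target_ppp for a given cut score."""
    # From PPP = BR*TPR / (BR*TPR + (1 - BR)*FPR), solved for BR.
    return target_ppp * fpr / (target_ppp * fpr + (1.0 - target_ppp) * tpr)

# M-FAST > 5 as reported on the slide: FPR = .17, TPR = .93.
threshold = base_rate_for_ppp(0.50, tpr=0.93, fpr=0.17)
print(f"PPP reaches .50 at BR of about {threshold:.3f}")
```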

At recommended cut score FPR very high

At BR = .05, PPP does not exceed

.50 until cut score adjusted to

> 9 on M-FAST

Combining information from local base

rate and multiple symptom validity tests

You can get estimates of PPP and NPP

for the sample you work with—IF you

can reliably estimate the BR.

737 defendants were administered:

Rey 15 Item Memory Test (RMT)—memorize and

reproduce 15 items—very easy test.

Score is items reproduced (0 to 15)

Word Recognition Test (WRT)—memorize 15 words,

identify those 15 and correctly reject 15 from a list of 30.

Score is number of hits and correct rejections (0 to 30)

RMT validating

using MGV

with clinical

probability

judgments.

FPR = .025

TPR = .574

Frederick & Bowden,

2009

RMT < 9

FPR = .025

TPR = .574

We found 726 defendants who completed BOTH RMT and WRT.

81/726 failed the RMT, a proportion of positive scores of .111.

By observation of TVS, then BR = .16, PPP = .814, NPP = .923

If PPP = .814, then in this sample, the probability of feigning if RMT

is positive, is .814.

If NPP = .923, then in this sample, the probability of feigning if RMT

is negative is .077,

or 1 - .923.

- To conduct MGV, we sampled from two groups:
- The 645 individuals who passed the RMT—had a negative score.
- The 81 individuals who failed the RMT—had a positive score.

Example of sampling

645 individuals with

negative scores, p(mal) = .077

81 individuals with

positive scores, p(mal) = .814

Sample n = 360

Sample n = 40

400 cases, 10% failures, 90% passes

Overall p(mal) = (40*.814 + 360*.077)/400 = .151

Sample 25 times, plot x = .151, y = observed rate of

positive WRT scores, n for WLS = 400

For each sample, BR was pre-estimated. Then we observed

rate of positive WRT scores at each potential cut score.
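
The pre-estimated BR of a mixed sample is a weighted average of the two probabilities. A minimal sketch, with an illustrative function name; the second and third mixes in the loop are hypothetical and are there only to show how changing the mix moves the BR.

```python
def mixed_group_base_rate(n_positive, p_if_positive, n_negative, p_if_negative):
    """Average probability of malingering across everyone drawn into the sample."""
    total = n_positive + n_negative
    return (n_positive * p_if_positive + n_negative * p_if_negative) / total

# PPP = .814 for RMT failures and 1 - NPP = .077 for RMT passes, from the slides.
# The 40/360 mix is the slide's example; the other two mixes are hypothetical.
for n_fail, n_pass in ((40, 360), (60, 340), (20, 380)):
    br = mixed_group_base_rate(n_fail, 0.814, n_pass, 0.077)
    print(f"{n_fail} failures + {n_pass} passes: BR = {br:.3f}")
```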

Word Recognition Test (WRT)

Range 4 to 30, Mean = 23.2

Within group of RMT < 9, mean = 18.7

Within group of RMT > 8, mean = 23.8

Word Recognition Test (WRT)

For every potential cut score of WRT (4 -30), we plotted all x, y pairs

obtained from sampling

We performed WLS to obtain the FPR and TPR estimates

at every potential cut score.

We plotted the FPR and TPR estimates at every potential cut

score to generate the ROC curve.

AUC = .905, SE = .012, 95% CI for AUC = .881-.930.

Best cut scores:

LTE 18 (TPR = .563, FPR = .034)

LTE 19 (TPR = .620, FPR = .066)
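
Once an (FPR, TPR) pair exists for every cut score, the area under the curve can be approximated with the trapezoid rule. The sketch below uses that rule as a stand-in; the slides do not say how the AUC and its SE were computed. Only the two cut scores reported above are entered, so the printed area will understate the .905 obtained from the full set of cut scores.

```python
def auc_trapezoid(points):
    """Trapezoid-rule area under an ROC curve given (FPR, TPR) pairs."""
    # Anchor the curve at (0, 0) and (1, 1), then order by FPR (ties by TPR).
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Only the two WRT cut scores reported on the slide (LTE 18 and LTE 19).
wrt_points = [(0.034, 0.563), (0.066, 0.620)]
print(f"Partial-curve AUC from two points = {auc_trapezoid(wrt_points):.3f}")
```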

Figure: ROC curve for the Word Recognition Test (WRT), TPR plotted against FPR.

- Summary:
- We can use tests to form mixed groups for validation.
- The best estimates of FPR and TPR for a test cut score
- allow us to estimate PPP and NPP at our sample BR.
- Unlike the “known groups” design (which is misleading),
- we do not presume to know (or care) about the status of
- any individual. We assign individuals “probabilities of
- having the condition” based on their test score.
- Mixed groups have an overall “probability of having the
- condition,” which is the average of the individual probabilities.
- We do not need to be certain about group memberships.
- We gain much flexibility by working with probabilities of having
- the condition vs. certainties of having the condition.

Dawes 1967 showed that valid probability judgments are

excellent base rate indicators. His work was substantiated in Frederick 2000 and Frederick and Bowden 2009.

To conduct MGV, we formed groups of defendants for whom individual ratings of likelihood of malingering psychosis were generated by forensic psychologists, before any testing took place.

The BR of malingered psychosis for each group was then the mean of the probability rating. If each member of the group had been rated as 10% likely to feign psychosis, then the BR of the group was estimated to be 10%.

We then observed the hit rate (proportion positive scores) for

the groups for a variety of F-family indicators of feigning on the

MMPI-2 and MMPI-2-RF.

We formed 15 groups of 30 individuals. For each group, we

had a static base rate, which was the mean of the probability

judgments assigned before testing.

Within each group, we iteratively observed the hit rate of positive

F-family indicators at each potential cut score. Using the BR

estimate and the proportion positive scores at each potential cut

score, we performed WLS to generate estimates of FPR and TPR.

From these estimates, we generated ROC curves.

15 groups, 30 defendants in each group, 450 defendants

Each defendant rated from 0 to 100 before testing, with respect

to likelihood he would feign psychosis.

Groups were formed after first sorting individuals by ratings, from

lowest to highest.
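
The grouping step can be sketched as follows: sort the ratings, cut them into consecutive groups, and take each group's mean rating as its base rate. The ratings in the example are invented for illustration, and the function name is not from the study.

```python
def group_base_rates(ratings, group_size):
    """Sort 0-100 ratings, chunk into groups, return each group's mean as a proportion."""
    ordered = sorted(ratings)
    groups = [ordered[i:i + group_size] for i in range(0, len(ordered), group_size)]
    return [sum(group) / len(group) / 100.0 for group in groups]

# Hypothetical ratings for six defendants, grouped three at a time.
ratings = [5, 0, 80, 10, 45, 0]
print(group_base_rates(ratings, group_size=3))
```

With 450 defendants and a group size of 30, the same call would return 15 base rates, one per group.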

Mean ratings of groups (each group, n = 30):

0, 0, 1.2, 4.2, 5.0

5.0, 5.0, 5.0, 8.1, 10

14.5, 22.2, 30.3, 45.7, 72.3

Rates of positive F-family scores at each potential cut observed.

Estimates by Nicholson, Mouton, Bagby, Buis, Peterson,

and Buigas (1998):

AUC’s and SE: F (.929, .021) Fp (.885, .027)

- Summary:
- Using the estimates of likelihood of feigning based only
- on clinician judgment prior to testing did not result in
- random results. We can assume that mean probability
- judgments were effective base rate estimates.
- Our estimates of F and Fp are consistent with estimates
- in large, well-validated analysis.
- In this study, MMPI-2-RF indicators have higher mean
- AUC and lower SE than their MMPI-2 counterparts.

Combine information about F with the SIRS-2

f

F score            Frequency   Percent   Valid Percent   Cumulative Percent
4                      4         2.7          3.1               3.1
6                      1          .7           .8               3.8
9                      1          .7           .8               4.6
10                     2         1.3          1.5               6.1
11                     3         2.0          2.3               8.4
12                     2         1.3          1.5               9.9
13                     3         2.0          2.3              12.2
14                     4         2.7          3.1              15.3
15                     2         1.3          1.5              16.8
16                     3         2.0          2.3              19.1
17                     3         2.0          2.3              21.4
18                     2         1.3          1.5              22.9
19                     4         2.7          3.1              26.0
20                     5         3.4          3.8              29.8
21                     2         1.3          1.5              31.3
22                     7         4.7          5.3              36.6
23                     3         2.0          2.3              38.9
24                     1          .7           .8              39.7
25                     5         3.4          3.8              43.5
26                     6         4.0          4.6              48.1
27                     6         4.0          4.6              52.7
28                     7         4.7          5.3              58.0
29                     2         1.3          1.5              59.5
30                     2         1.3          1.5              61.1
31                     6         4.0          4.6              65.6
32                     5         3.4          3.8              69.5
33                     3         2.0          2.3              71.8
34                     4         2.7          3.1              74.8
35                     4         2.7          3.1              77.9
36                     3         2.0          2.3              80.2
37                     7         4.7          5.3              85.5
38                     4         2.7          3.1              88.5
39                     2         1.3          1.5              90.1
40                     3         2.0          2.3              92.4
41                     3         2.0          2.3              94.7
43                     1          .7           .8              95.4
45                     1          .7           .8              96.2
46                     1          .7           .8              96.9
47                     1          .7           .8              97.7
48                     2         1.3          1.5              99.2
52                     1          .7           .8             100.0
Total (valid)        131        87.9        100.0
Missing (System)      18        12.1
Total                149       100.0

131 defendants

who took MMPI

and SIRS

52.7% of cases are 27 or lower

47.3% of cases are 28 or higher

What is the base rate of feigned psychopathology?

BR TPR FPR NPP PPP

What we say:

Within our sample of 131 defendants, the BR of feigned

psychopathology is .73 (NOT .475)

At BR = .73, the PPP of F GTE 28 is .976.

At BR = .73, the NPP of F GTE 28 is .492, so

p(feigning if LTE 27) is still .508. (Remember, they’re

being given the SIRS for a reason.)

F < 28

NPP about .66

F > 27

Application of MGV to a CGV estimation

of FPR and TPR

Greve, Bianchini, Love, Brennan, & Heinly (2006) articulated six separate

groups with increasing base rate of malingering based on formal criteria

for malingering (the Slick criteria) to validate the MMPI-2 Fake Bad Scale

- No incentive (no evidence of external incentive and no test
- performance suggestive of malingering; n = 18, mean FBS = 15.4)
- Incentive (external incentive, but no test performance suggestive
- of malingering; n = 79, mean FBS = 19.5)
- Suspect (external incentive and at least one indicator suggestive
- of malingering; n = 66, mean FBS = 22.7)
- Statistically Likely (external incentive; at least two indicators
- suggestive of malingering; n = 51, mean FBS = 22.8)
- Probable (external incentive; strong indicators of malingering;
- n = 31, mean FBS = 26.9)
- Definite (external incentive; very strong indicators of
- malingering; n = 14, mean FBS = 29.8)

- Even though it is clear that
- BR Definite > BR Probable > BR Statistically Likely >
- BR Suspect > BR Incentive Only > BR No Incentive
- They were required, to conduct “Known” groups
- validation, to ignore this obvious circumstance and to define
- BR No Incentive = BR Incentive Only = 0
- BR Statistically Likely = BR Probable = BR Definite = 1.0
- And drop all participants defined as Suspect
- to yield the following ROC

FBS ROC

generated

by “Known”

groups

validation by

Greve & Bianchini

If we had estimates of the BR for each of the subgroups

formed by Greve and Bianchini, we could use MGV to

estimate FPR and TPR for each potential cut score.

We have our stable estimate of

TOMM FPR and TPR

TOMM

No simulation

studies

FPR = .056,

SE = .025

TPR = .742,

SE = .093

We can get estimates of BRs for those groups

from other work by Greve & Bianchini.

They formed similar groups using the Slick

criteria to investigate the TOMM.

We can use the proportion of positive TOMMs

in each of these subgroups to estimate the BRs

in each of them.

From Greve, Bianchini, Doane (2006)

We take these BR estimates and reapply them

to the Greve & Bianchini FBS data.

Example of MGV for FBS based on BR estimates for

Greve & Bianchini groups established by Slick criteria

At FBS > 27

For FBS > 27, using WLS Regression, FPR = .091, TPR = .773

(For WLS, n is the weighting variable)

10 clinical

studies

using Rey

15-Item Test

No simulators

All clinical data

RMT validating

using MGV

with clinical

probability

judgments.

FPR = .025

TPR = .574

Frederick & Bowden,

2009

CI, TPR = .574,

SE = .044

We will

generate TVS

based on these

values and

find PPP and

1 – NPP to

estimate

probability of

bad intent

represented

by RMT score.

Easy: Baby, Drink, Infant

Moderate: People, Ally, Folk

Difficult: Nimiety, Conceit, Surfeit

Suppression

Not guessing,

knowledgeable

responding

Guessing

Guessing

is imminent

Easy items

Difficult items

Inconsistent curves

527 criminal

defendants

who took

RMT and VIP

concurrently

Rate of positive

scores in this

sample was .113

PPP = .814

1 – NPP = .077

Here we

are matching

VIP categories

to the construct

most likely

captured by

the VIP.

Points in

scatterplot

represent

groups of 25

individuals.

Sorted defendants

by clinical ratings

of malingering,

then took 20

groups of 25 and

one group of 27,

for 527 defendants.

BR of .42 estimated

for this group is mean

of PPP for positive RMT

scores in this group and

(1 – NPP) for negative

RMT scores in the group

Same 21

subgroups,

N = 527

defendants

527 criminal

defendants

VRIN was converted

to “probability

of invalid

responding” by

dividing VRIN

raw score by 12.

VRIN raw scores

>12 were

assigned p = 1.
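
The conversion described above amounts to a capped division. A minimal sketch, with an illustrative function name:

```python
def vrin_to_probability(vrin_raw):
    """Convert a VRIN raw score to a 'probability of invalid responding'."""
    # Divide by 12; raw scores above 12 are assigned p = 1.
    return min(vrin_raw / 12.0, 1.0)

for raw in (0, 3, 6, 12, 15):
    print(f"VRIN raw = {raw}: p(invalid) = {vrin_to_probability(raw):.2f}")
```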

We are interested

in FPR and TPR for

“Invalid”