Tampere August 28, 2009. On Some Statistical Aspects of Agreement Among Measurements BIKAS K SINHA [ISI, Kolkata] . Part II : Statistical Assessment of Agreement. Understanding Agreement among Raters involving Continuous Measurements…. Theory & Applications…. Continuous Measurements.

August 28, 2009

On Some Statistical Aspects of Agreement Among Measurements

BIKAS K SINHA [ISI, Kolkata]

Part II : Statistical Assessment of Agreement

Understanding Agreement among Raters

involving Continuous Measurements….

• Theory & Applications…..

Evaluation of agreement when the data are measured on a continuous scale……

Pearson correlation coefficient, regression analysis, paired t-tests, least squares analysis for slope and intercept, within-subject coefficient of variation, and intra-class correlation coefficient…..

• 1. Comparison of a Gold Standard or Reference Method with one (or more) New or Test Method(s)

If the two agree fairly well, we can use them interchangeably, or use the New one, which is possibly cheaper or more convenient, in place of the Gold Standard !

• 2. Calibration : Establish a mathematical relationship between the two sets of measurements.

• 3. Conversion : Compare two approximate methods measuring the same underlying quantity.

Goal : Interpret the results of one in terms of the other

Example : Temperature recorded by two instruments, one in °F and the other in °C.

Talk focuses on # 1 : Comparison of GS & TM

GS : Gold Standard & TM : Test Method

• Bland & Altman :Limits Of Agreement [LOA] Approach [with over 6000 citations in the Institute for Scientific Information Database]

• Lawrence Lin : Use of Concordance Correlation Coefficient

• Lin & Collaborators…..serious in-depth study with pharmaceutical applications

| Subjects | Rater 1 | Rater 2 |
|---|---|---|
| 1 | x_1 | y_1 |
| 2 | x_2 | y_2 |
| … | … | … |
| n | x_n | y_n |

Model : x_j = S_j + Beta_1 + e_1j

y_j = S_j + Beta_2 + e_2j

S_j : True unobservable measurement for the j-th subject, randomly distributed as N(Mu, sigma^2_s)

• Beta_1 & Beta_2 : Fixed Raters’ Bias Terms

• e_1j : iid N(0, sigma^2_e1)

• e_2j : iid N(0, sigma^2_e2)

• S_j, e_1j, e_2j : all independent

• This is Grubbs’ Model

• sigma^2_s : Between-subject variance

• sigma^2_e1, sigma^2_e2 : Measurement error variances of Raters 1 and 2

• E(X) = Mu + Beta_1,

V(X) = sigma^2_s + sigma^2_e1 = sigma^2_x

• E(Y) = Mu + Beta_2

V(Y) = sigma^2_s + sigma^2_e2 = sigma^2_y

• Cov(X, Y) = sigma^2_s

• Rho = sigma^2_s / (sigma_x . sigma_y)

• Rho_x = sigma^2_s / sigma^2_x

• = Reliability Coeff. for Rater 1

• Rho_y = sigma^2_s / sigma^2_y

• = Reliability Coeff. for Rater 2

• Rho^2 = Rho_x . Rho_y
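As a quick numerical check of the moment formulas above, the following minimal sketch computes Rho, Rho_x and Rho_y from the three variance components of Grubbs' model and confirms that Rho^2 equals the product Rho_x . Rho_y. The function name `grubbs_moments` and the variance values are hypothetical, chosen only for illustration.

```python
from math import sqrt, isclose


def grubbs_moments(sigma2_s, sigma2_e1, sigma2_e2):
    """Return (rho, rho_x, rho_y) implied by the variance components."""
    sigma2_x = sigma2_s + sigma2_e1  # V(X) = sigma^2_s + sigma^2_e1
    sigma2_y = sigma2_s + sigma2_e2  # V(Y) = sigma^2_s + sigma^2_e2
    rho = sigma2_s / sqrt(sigma2_x * sigma2_y)  # Cov(X, Y) = sigma^2_s
    rho_x = sigma2_s / sigma2_x  # reliability coefficient, Rater 1
    rho_y = sigma2_s / sigma2_y  # reliability coefficient, Rater 2
    return rho, rho_x, rho_y


# Hypothetical variance components (not taken from the slides).
rho, rho_x, rho_y = grubbs_moments(sigma2_s=9.0, sigma2_e1=1.0, sigma2_e2=3.0)
print(rho, rho_x, rho_y)
print(isclose(rho ** 2, rho_x * rho_y))
```

Because both raters share the same S_j, the correlation can reach 1 only when both error variances vanish.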

• Notion of Perfect Agreement :

• All paired observations (x_ j, y_ j) lie on the 45^o line through Origin

• Equivalent Conditions : Same means, same variances and Rho = 1

• Leads to Testing Issues……

• $\mu_D = E(X - Y) = \mu_1 - \mu_2$

• $\sigma_D^2 = \mathrm{Var}(X - Y) = \sigma_1^2 + \sigma_2^2 - 2\sigma_{12}$

• Estimates are based on paired data

• LOA has 2 components :

• (i) 95% LOA, defined by $\hat{\mu}_D \pm 1.96\,\hat{\sigma}_D$

• (ii) Plot of mean (x+y)/2 vs D = x – y, with LOA superimposed....Bland-Altman Plot...SAS JMP produces Plot

If a large proportion of the paired differences [D’s] are sufficiently close to zero, the two methods have satisfactory agreement.

Step I : Estimate the limits $\hat{\mu}_D \pm 1.96\,\hat{\sigma}_D$, where $\hat{\mu}_D$ and $\hat{\sigma}_D$ are the sample mean and standard deviation of the differences D.

Step II : Declare ‘sufficient’ agreement if differences within these limits are not clinically important, as judged against an investigator-specified threshold delta_o chosen on clinical grounds.
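The two steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paired readings are made up, the helper names are hypothetical, and Step II is coded as one possible reading of the rule, namely that both limits must fall inside the band from -delta_0 to +delta_0.

```python
from statistics import mean, stdev


def limits_of_agreement(x, y, z=1.96):
    """Step I: 95% limits of agreement for the paired differences D = x - y."""
    d = [xi - yi for xi, yi in zip(x, y)]
    m = mean(d)   # estimated mean difference
    s = stdev(d)  # sample standard deviation (divisor n - 1)
    return m - z * s, m + z * s


def sufficient_agreement(limits, delta_0):
    """Step II (one reading): both limits lie within [-delta_0, +delta_0]."""
    lower, upper = limits
    return -delta_0 <= lower and upper <= delta_0


# Hypothetical paired measurements from two methods.
x = [10.1, 12.3, 9.8, 11.5, 10.9, 12.0]
y = [10.4, 12.0, 10.1, 11.9, 10.6, 12.2]

loa = limits_of_agreement(x, y)
print(loa, sufficient_agreement(loa, delta_0=1.0))
```

The threshold delta_0 is a clinical choice, so the same limits can pass for one application and fail for another.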

• Lin et al, in a series of papers, made an in-depth study of agreement using such notions as concordance correlation coefficient, total deviation index, coverage probability etc.

We will now elaborate on these concepts.

• Two raters – n units for measurement

• Data : [{xi, yi}; 1 ≤ i ≤ n]

• Scatter Plot : Visual Checking

• Product Moment Corr. Coeff.:

High +ve : What does it mean ?

• Squared Deviation : $D^2 = (X - Y)^2$

MSD : $E[D^2] = (\mu_1 - \mu_2)^2 + (\sigma_1^2 + \sigma_2^2 - 2\sigma_{12})$

Carotid Stenosis Screening Study, Emory Univ., 1994-1996

• Gold Standard : Invasive Intra-arterial Angiogram [IA] Method

• Non-invasive Magnetic Resonance Angiography [MRA] Method

Two Measurements under MRA:

2D & 3D Time of Flight

Three Technicians : Each on Left & Right Arteries for 55 Patients by IA & MRA [2D & 3D] : 3 x 3 x 2 = 18 Obs. / patient

• Between Technicians : No Difference

• Left vs Right : Difference

• 2D vs 3D : Difference

Q. Agreement between IA & 2D ? 3D ?

Barnhart & Williamson (2001, 2002) :

Biometrics papers …..no indication of any strong agreement

Descriptive Statistics : Carotid Stenosis Screening Study

Sample Means

Methods IA, MRA-2D & MRA-3D by Sides

| Method | N | Left Artery | Right Artery |
|---|---|---|---|
| IA | 55 | 4.99 | 4.71 |
| MRA-2D | 55 | 5.36 | 5.73 |
| MRA-3D | 55 | 5.80 | 5.52 |

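Sample Variance – Covariance Matrix (upper triangle; L = Left, R = Right)

| | IA-L | IA-R | 2D-L | 2D-R | 3D-L | 3D-R |
|---|---|---|---|---|---|---|
| IA-L | 11.86 | 1.40 | 8.18 | 1.18 | 6.80 | 1.08 |
| IA-R | | 10.61 | 2.67 | 7.53 | 1.78 | 7.17 |
| 2D-L | | | 10.98 | 2.70 | 8.69 | 1.74 |
| 2D-R | | | | 8.95 | 2.19 | 7.69 |
| 3D-L | | | | | 11.02 | 2.65 |
| 3D-R | | | | | | 10.24 |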

• Recall MSD = $E[(X - Y)^2]$ : Normed? No !

• Lin (1989) : Converted MSD to Corr. Coeff.

• Concordance Corr. Coeff. [CCC]

• CCC = $1 - \mathrm{MSD} / \mathrm{MSD}_{\mathrm{Ind.}}$

= $2\sigma_{12} / [(\mu_1 - \mu_2)^2 + \sigma_1^2 + \sigma_2^2]$

Properties : Perfect Agreement [CCC = 1], Perfect Disagreement [CCC = -1], No Agreement [CCC = 0]

• CCC = 212 /[(m1–m2)2 + (12 +22)]

= . a

 = Accuracy Coefficient

a = Precision Coeff. [<=1]

a = [2 / { + 1/ + 2}]

where  = 1/2 and

2 = (m1–m2)2 / 12

CCC = 1 iff  = 1 & a = 1

a = 1 iff [ m1 = m2 ] & [1 = 2 ]

hold simultaneously !!
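To make the decomposition concrete, here is a minimal sketch that plugs the sample moments for IA and MRA-2D on the left artery (means 4.99 and 5.36, variances 11.86 and 10.98, covariance 8.18, as reported in the descriptive tables) into the formulas for MSD, CCC and its two factors. The function name `ccc_components` is hypothetical. The published moments are used as given, although Lin's estimator may use divisor n rather than n - 1, so this is an approximation and not the study's own estimate.

```python
from math import sqrt, isclose


def ccc_components(m1, m2, v1, v2, c12):
    """Return (msd, ccc, rho, chi_a) from means, variances and covariance."""
    msd = (m1 - m2) ** 2 + v1 + v2 - 2 * c12  # E[(X - Y)^2]
    msd_ind = (m1 - m2) ** 2 + v1 + v2        # MSD when X and Y are uncorrelated
    ccc = 1 - msd / msd_ind
    rho = c12 / sqrt(v1 * v2)                 # Pearson correlation
    omega = sqrt(v1 / v2)                     # scale shift sigma_1 / sigma_2
    nu2 = (m1 - m2) ** 2 / sqrt(v1 * v2)      # squared location shift
    chi_a = 2 / (omega + 1 / omega + nu2)
    return msd, ccc, rho, chi_a


# IA (left) vs MRA-2D (left), moments taken from the slide tables.
msd, ccc, rho, chi_a = ccc_components(4.99, 5.36, 11.86, 10.98, 8.18)
print(msd, ccc, rho, chi_a)
print(isclose(ccc, rho * chi_a))
```

Here the factor chi_a is close to 1, so nearly all of the shortfall of CCC from 1 comes from the correlation term.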

• Identity of Marginals : Max. Precision

• High value of $\rho$ : High Accuracy

• Needed BOTH for Agreement

• Simultaneous Inference on

H0 : $\rho$ … $\rho_0$, [$\mu_1 = \mu_2$] & [$\sigma_1 = \sigma_2$]

LRT & Other Tests based on CCC

Pornpis / Montip / Bimal Sinha (2006)

Tarmo Pukkila Volume…..

Lin (1991) & Lin et al (JASA, 2002)

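Assume BVN distribution of (X,Y)

$\pi = P[\,|Y - X| < k\,]$

= $P[D^2 < k^2]$; D = Y - X

= $F_{\chi^2}(k^2/\sigma_D^2;\ 1,\ \mu_D^2/\sigma_D^2)$, the distribution function of a non-central $\chi^2$ with 1 df and non-centrality $\mu_D^2/\sigma_D^2$

TDI = Value of k for given $\pi$

Inference based on TDI

Choice of $\pi$ : 90 % or more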
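A minimal sketch of how the TDI could be computed under the normality assumption: since D is normal, P[|D| < k] can be written with the standard normal distribution function, and k is then found by bisection. This avoids the non-central chi-square routine but gives the same probability. The names `coverage` and `tdi`, and the values of the mean and SD of D, are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def coverage(k, mu_d, sigma_d):
    """P[|D| < k] when D ~ N(mu_d, sigma_d^2)."""
    upper = STD_NORMAL.cdf((k - mu_d) / sigma_d)
    lower = STD_NORMAL.cdf((-k - mu_d) / sigma_d)
    return upper - lower


def tdi(pi, mu_d, sigma_d, tol=1e-10):
    """Smallest k with P[|D| < k] >= pi, located by bisection."""
    lo, hi = 0.0, abs(mu_d) + 10 * sigma_d  # coverage(hi) is essentially 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if coverage(mid, mu_d, sigma_d) < pi:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical mean and SD of the differences.
k = tdi(0.90, mu_d=0.5, sigma_d=2.0)
print(k, coverage(k, 0.5, 2.0))
```

With zero mean difference and unit SD, `tdi(0.95, 0.0, 1.0)` reduces to the familiar two-sided normal quantile.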

Lin et al (JASA, 2002) : BVN distribution

CP(d) = $P[\,|Y - X| < d\,]$

= $\Phi[(d - \mu_D) / \sigma_D] - \Phi[(-d - \mu_D) / \sigma_D]$

Emphasis is on given d and high CP.

CP^ : Plug-in Estimator using sample means, variances & corr. coeff.

Var[CP^] : Large-Sample Approximation [LSA]

V^[CP^] : Plug-in Estimator

• Carotid Stenosis Screening Study, Emory Univ., 1994-1996

• GS : Method IA

• Competitors : 2D & 3D Methods

• Left & Right Arteries : Different

• Range of readings : 0 – 100 %

• Choice of d : 2%
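With d fixed at 2, the plug-in coverage probability can be sketched directly from the sample means, variances and covariances reported for the study. The example assumes that method 1 is IA, method 2 is MRA-2D and method 3 is MRA-3D, and that the bivariate normal formula for CP(d) applies. The function name `coverage_probability` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def coverage_probability(d, m_x, m_y, v_x, v_y, c_xy):
    """Plug-in CP(d) = P[|Y - X| < d] under bivariate normality."""
    mu_d = m_y - m_x                       # mean of D = Y - X
    sigma_d = sqrt(v_x + v_y - 2 * c_xy)   # SD of D
    phi = NormalDist().cdf
    return phi((d - mu_d) / sigma_d) - phi((-d - mu_d) / sigma_d)


# Left artery: IA vs MRA-2D and IA vs MRA-3D (moments from the slide tables).
cp12_left = coverage_probability(2.0, 4.99, 5.36, 11.86, 10.98, 8.18)
cp13_left = coverage_probability(2.0, 4.99, 5.80, 11.86, 11.02, 6.80)
print(round(cp12_left, 2), round(cp13_left, 2))
```

Rounded to two decimals, these should agree with the left-side CP estimates of 0.56 and 0.47 listed in the results table.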

Robieson, W. Z. (1999) : On the Weighted Kappa and Concordance Correlation Coefficient. Ph.D. Thesis, University of Illinois at Chicago, USA

Lou, Congrong (2006) : Assessment of Agreement : Multi-Rater Case. Ph.D. Thesis, University of Illinois at Chicago, USA

• Lou (2006) derived expressions for CP(d)^, V^(CP^(d)), COV^(…,…)

where $CP_{ij} = P[\,|X_i - X_j| < d\,]$

Data Analysis : CP12, CP13 & CP23

• Estimated Coverage Probability [CP] & Estimated Var. & Cov. for Screening Study

| Side | Pairwise CP^ | V^(CP^) | COV^ |
|---|---|---|---|
| Left | CP12(L)^ = 0.56 | 0.0019 | 0.0009 |
| Left | CP13(L)^ = 0.47 | 0.0015 | |
| Right | CP12(R)^ = 0.60 | 0.0021 | 0.0010 |
| Right | CP13(R)^ = 0.54 | 0.0019 | |
| Left | CP23(L)^ = 0.64 | 0.0021 | |
| Right | CP23(R)^ = 0.69 | 0.0021 | |

• Left Side

• CP12(L)^=0.56 95% Lower CL = 0.48

• CP13(L)^=0.47 95% Lower CL = 0.40

• Right Side

• CP12(R)^=0.60 95% Lower CL = 0.51

• CP13(R)^=0.54 95% Lower CL = 0.46

Conclusion : Poor Agreement in all cases

| Testing Hyp. | Statistic | p-value |
|---|---|---|
| H0L : CP12(L) = CP13(L) | Z-score | 0.0366 |
| H0R : CP12(R) = CP13(R) | Z-score | 0.1393 |

[against two-sided alternatives]

Conclusions : For “Left Side”, CP for [IA vs 2D] & for [IA vs 3D] are likely to be different, while for “Right Side” these are likely to be equal.
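The slide only names the statistic as a Z-score. A minimal sketch, assuming the usual form (difference of the two estimates divided by the standard error of that difference, with the covariance term included), is given below. The function name `z_test_equal_cp` is hypothetical. The inputs are the two-decimal CP estimates and four-decimal variances from the table, so the p-values will not reproduce the reported 0.0366 and 0.1393 exactly.

```python
from math import sqrt
from statistics import NormalDist


def z_test_equal_cp(cp_a, cp_b, var_a, var_b, cov_ab):
    """Two-sided large-sample Z test of H0: CP_a = CP_b for correlated estimates."""
    se = sqrt(var_a + var_b - 2 * cov_ab)  # SE of the difference
    z = (cp_a - cp_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Rounded estimates from the results table (assumed inputs).
z_left, p_left = z_test_equal_cp(0.56, 0.47, 0.0019, 0.0015, 0.0009)
z_right, p_right = z_test_equal_cp(0.60, 0.54, 0.0021, 0.0019, 0.0010)
print(z_left, p_left)
print(z_right, p_right)
```

Even with rounded inputs, the qualitative conclusion at the 5% level is the same: a difference on the left side, none detected on the right.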

• For “K” alternatives [1, 2, …, K] to the Gold Standard [0], interest lies in

H0L : CP01(L) = CP02(L) = … = CP0K(L)

H0R : CP01(R) = CP02(R) = … = CP0K(R)

This is accomplished by performing a Large Sample Chi-Square Test [Rao (1973)].

Set, for “Left Side”,

$\mathbf{L} = (\widehat{CP}_{01}(L),\ \widehat{CP}_{02}(L),\ \ldots,\ \widehat{CP}_{0K}(L))'$

Chi-Sq. Test Statistic

$\mathbf{L}' \hat{W}^{-1} \mathbf{L} - [\mathbf{L}' \hat{W}^{-1} \mathbf{1}]^2 / [\mathbf{1}' \hat{W}^{-1} \mathbf{1}]$

where

$W_{tt}$ = Var(CP0t(L)^); t = 1, 2, …, K

$W_{st}$ = Cov(CP0s(L)^, CP0t(L)^); s ≠ t

Asymptotic Chi-Sq. with K-1 df

Similarly for the “Right Side” hypothesis.
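The test statistic can be sketched without any matrix library by solving two small linear systems. The helper names `solve` and `homogeneity_statistic` are hypothetical, and the example uses the rounded left-side estimates for K = 2, where the statistic should reduce to the square of the two-sample Z-score. The p-value (chi-square with K-1 df) is not computed here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def homogeneity_statistic(cp_hat, w):
    """L' W^-1 L - (L' W^-1 1)^2 / (1' W^-1 1) for estimates cp_hat, covariance w."""
    ones = [1.0] * len(cp_hat)
    w_inv_l = solve(w, cp_hat)  # W^-1 L
    w_inv_1 = solve(w, ones)    # W^-1 1
    quad = sum(l * v for l, v in zip(cp_hat, w_inv_l))  # L' W^-1 L
    cross = sum(w_inv_l)        # 1' W^-1 L, equal to L' W^-1 1 for symmetric W
    total = sum(w_inv_1)        # 1' W^-1 1
    return quad - cross ** 2 / total


# K = 2, left side: rounded estimates and (co)variances from the results table.
stat_left = homogeneity_statistic(
    [0.56, 0.47],
    [[0.0019, 0.0009], [0.0009, 0.0015]],
)
print(stat_left)
```

For larger K the same function applies unchanged, provided the estimated covariance matrix is symmetric and non-singular.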

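Simultaneous Lower Confidence Limits

$\Pr[CP_{01} \ge L_1,\ CP_{02} \ge L_2,\ \ldots,\ CP_{0K} \ge L_K] \ge 95\%$

Set $Z_t = [\widehat{CP}_{0t} - CP_{0t}] / \sqrt{\widehat{\mathrm{Var}}(\widehat{CP}_{0t})}$

Assume : $Z_t$’s jointly follow Multivariate Normal Dist.

Work out estimated Correlation Matrix as usual.

Solve for “z” such that

$\Pr[Z_1 \le z,\ Z_2 \le z,\ Z_3 \le z,\ \ldots,\ Z_K \le z] \ge 95\%$

Then $L_t = \widehat{CP}_{0t} - z \sqrt{\widehat{\mathrm{Var}}(\widehat{CP}_{0t})}$

t = 1, 2, .., K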

Stat Package : Available with Lou (2006).
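The package mentioned above is not reproduced here. As a rough illustration for the special case K = 2 only, the sketch below finds the common critical value z by numerically integrating the bivariate normal probability (Simpson's rule) and bisecting on z, then forms the lower limits. All function names are hypothetical, the correlation must be strictly between -1 and 1, and the inputs are the rounded left-side estimates from the results table.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()


def joint_prob(z, rho, steps=2000):
    """P[Z1 <= z, Z2 <= z] for a standard bivariate normal with correlation rho."""
    lo = -8.0  # effectively minus infinity for the standard normal
    h = (z - lo) / steps
    scale = sqrt(1 - rho * rho)

    def f(u):
        # density of Z1 at u times P[Z2 <= z | Z1 = u]
        return STD.pdf(u) * STD.cdf((z - rho * u) / scale)

    total = f(lo) + f(z)
    for i in range(1, steps):  # composite Simpson's rule (steps is even)
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def critical_z(rho, level=0.95):
    """Bisection for z with P[Z1 <= z, Z2 <= z] = level."""
    lo, hi = 0.0, 5.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if joint_prob(mid, rho) < level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def simultaneous_lower_limits(cp_hat, var_hat, cov_hat, level=0.95):
    """Lower limits L_t = CP_t^ - z * sqrt(Var^) for two correlated estimates."""
    rho = cov_hat / sqrt(var_hat[0] * var_hat[1])
    z = critical_z(rho, level)
    return [c - z * sqrt(v) for c, v in zip(cp_hat, var_hat)]


limits_left = simultaneous_lower_limits([0.56, 0.47], [0.0019, 0.0015], 0.0009)
print(limits_left)
```

With these rounded inputs the limits should land near the left-side lower confidence limits reported earlier. For K > 2 a multivariate normal routine would be needed in place of `joint_prob`.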

• Union-Intersection Principle….

• $H_0 : [\,|\mu| > \delta_\mu\,] \cup [\,\sigma > \delta_\sigma\,]$

$H_0 : |\mu| > \delta_\mu \ \cup\ \delta_1 < \sigma_2/\sigma_1 < \delta_2 \ \cup\ \rho < \delta_\rho$

Excellent Review Paper by Choudhary & Nagaraja :

Journal of Stat Planning & Inference .....

• Useful Statistical Concepts

• Sound Technical Tools

• Diverse Application Areas

• Scope for Further Research on Combining Evidences from Multi-Location Experiments...Meta Analysis !

Thanks !