
### A Unified Approach for Assessing Agreement

Lawrence Lin, Baxter Healthcare

A. S. Hedayat, University of Illinois at Chicago

Wenting Wu, Mayo Clinic

Outline

- Introduction
- Existing approaches
- A unified approach
- Simulation studies
- Examples

Introduction

- Different situations for agreement
- Two raters, each with single reading
- More than two raters, each with single reading
- More than two raters, each with multiple readings
- Agreement within a rater
- Agreement among raters based on means
- Agreement among raters based on individual readings

Existing Approaches (1)

- Agreement between two raters, each with single reading
- Categorical data:
- Kappa and weighted kappa
- Continuous data:
- Concordance Correlation Coefficient (CCC)
- Intraclass Correlation Coefficient (ICC)
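As a point of reference for the two-rater, single-reading case, the CCC and its split into precision and accuracy can be sketched as follows. The function name `ccc_components` and the use of moments with divisor n are choices made for this illustration, not something stated on the slide.

```python
from statistics import fmean


def ccc_components(x, y):
    """Return (ccc, precision, accuracy) for paired readings x and y.

    Illustrative sketch: moments use divisor n, and accuracy is taken
    as the ratio ccc / precision.
    """
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # CCC penalizes both scatter and a shift between the two means
    ccc = 2 * sxy / (sxx + syy + (mx - my) ** 2)
    # Precision is the Pearson correlation
    precision = sxy / (sxx * syy) ** 0.5
    accuracy = ccc / precision
    return ccc, precision, accuracy
```

For example, if one rater reads exactly one unit higher than the other on every subject, precision stays at 1 while the CCC and the accuracy both drop below 1.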

Existing Approaches (2)

- Agreement among more than two raters, each with single reading
- Lin (1989): no inference
- Barnhart, Haber and Song (2001, 2002): GEE
- King and Chinchilli (2001, 2001): U-statistics
- Carrasco and Jover (2003): variance components

Existing Approaches (3)

- Agreement among more than two raters, each with multiple readings
- Barnhart (2005)
- Intra-rater / inter-rater (based on means) / total (based on individual observations) agreement
- GEE method to model the first and second moments

Unified Approach

- Agreement among k (k≥2) raters, where each rater measures each of the n subjects m times.
- Separate intra-rater agreement from inter-rater agreement
- Measure relative agreement (CCC, precision, accuracy) and absolute agreement (Total Deviation Index (TDI) and Coverage Probability (CP))

Unified Approach - summary

- The GEE method is used to estimate all agreement indices and to draw inferences on them
- All agreement indices are expressed as functions of variance components
- Data: continuous/binary/ordinal
- Most currently popular methods become special cases of this approach

Unified Approach - model

- Set up
- subject effect
- subject by rater effect
- error effect
- rater effect
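The model equation itself did not survive in the transcript. A two-way layout consistent with the four effects listed above would read as follows; the symbols are assumptions chosen for illustration.

```latex
% Assumed notation: y_{ijl} is reading l of rater j on subject i
y_{ijl} = \mu + \alpha_i + \beta_j + \gamma_{ij} + e_{ijl},
\qquad i = 1,\dots,n,\quad j = 1,\dots,k,\quad l = 1,\dots,m
% \alpha_i    : subject effect
% \beta_j     : rater effect
% \gamma_{ij} : subject by rater effect
% e_{ijl}     : error effect
```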

Unified Approach - targets

- Intra-rater agreement:
- overall, are the k raters consistent with themselves?
- Inter-rater agreement:
- Inter-rater agreement (agreement based on means): overall, do the k raters agree with each other based on the average of the m readings?
- Total agreement (agreement based on individual readings): overall, do the k raters agree with each other based on any individual one of the m readings?
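To make the three targets concrete, the sketch below expresses intra-rater, inter-rater, and total CCC, precision, and accuracy as functions of variance components. It assumes an additive model with independent subject, rater, subject-by-rater, and error effects, with the rater variance defined as the sum of squared fixed rater effects divided by k − 1. The formulas are derived here under those assumptions and are not copied from the slides; the function and argument names are hypothetical.

```python
def agreement_indices(var_subject, var_rater, var_subject_rater, var_error, m):
    """Agreement indices from variance components (illustrative sketch)."""
    # Intra-rater: replicates from one rater share subject and interaction
    signal_intra = var_subject + var_subject_rater
    ccc_intra = signal_intra / (signal_intra + var_error)

    # Inter-rater: compare averages of m readings, so error shrinks by m
    noise_mean = var_subject_rater + var_error / m
    precision_inter = var_subject / (var_subject + noise_mean)
    ccc_inter = var_subject / (var_subject + noise_mean + var_rater)

    # Total: compare individual readings from different raters
    noise_single = var_subject_rater + var_error
    precision_total = var_subject / (var_subject + noise_single)
    ccc_total = var_subject / (var_subject + noise_single + var_rater)

    return {
        "ccc_intra": ccc_intra,
        "ccc_inter": ccc_inter,
        "precision_inter": precision_inter,
        "accuracy_inter": ccc_inter / precision_inter,
        "ccc_total": ccc_total,
        "precision_total": precision_total,
        "accuracy_total": ccc_total / precision_total,
    }
```

Under these assumptions, the inter-rater CCC is never smaller than the total CCC, because averaging m readings reduces the error contribution.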

Unified Approach – agreement(intra)

- Intra-rater CCC: over all k raters, how well does each rater reproduce his or her own readings?

Unified Approach – precision(intra) and MSD

- Intra-rater precision: for any rater j, the proportion of the variance that is attributable to the subjects (the same as the intra-rater CCC)
- MSD (mean squared deviation): examines the absolute agreement independent of the total data range

Unified Approach – TDI(intra)

- Intra-rater TDI(π): for each rater j, 100π% of observations are within TDI(π) units of their replicated readings from the same rater.

Φ is the cumulative normal distribution

|·| is the absolute value

Unified Approach – CP(intra)

- Intra-rater CP(δ): for each rater j, a proportion CP(δ) of observations are within δ units of their replicated readings from the same rater

Unified Approach – agreement(inter)

- Inter-rater CCC: over all k raters, how well do the raters reproduce each other's readings based on the average of the multiple readings?

Unified Approach – precision(inter)

- Inter-rater precision: for any two raters, the proportion of the variance that is attributable to the subjects based on the average of the m readings

Unified Approach – accuracy(inter)

- Inter-rater accuracy: how close are the means of different raters?

Unified Approach – TDI(inter)

- Inter-rater TDI(π): over all k raters, 100π% of the averaged readings are within TDI(π) units of the averaged readings from another rater.

Unified Approach – CP(inter)

- Inter-rater CP(δ): for each rater j, a proportion CP(δ) of averaged readings are within δ units of the averaged readings from another rater

Unified Approach – agreement(total)

- Total CCC: over all k raters, how well do the raters reproduce each other's readings based on the individual readings?

Unified Approach – precision(total)

- Total precision: for any two raters, the proportion of the variance that is attributable to the subjects based on the individual readings

Unified Approach – accuracy(total)

- Total accuracy: how close are the means of different raters?

Unified Approach – TDI(total)

- Total TDI(π): over all k raters, 100π% of the readings are within TDI(π) units of the readings from another rater.

Unified Approach – CP(total)

- Total CP(δ): for each rater j, a proportion CP(δ) of readings are within δ units of the readings from another rater

Unified Approach

Φ⁻¹ is the inverse cumulative normal distribution

χ²(1) is a central chi-square distribution with df = 1
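The two distributions named here are what a normal-theory approximation of TDI and CP needs. A minimal sketch, assuming the paired differences are roughly normal with a mean shift that is small relative to their spread, so that the squared difference divided by the MSD behaves like a central chi-square with 1 df. The function names and the `msd` argument are illustrative.

```python
from math import erf, sqrt
from statistics import NormalDist


def tdi(pi, msd):
    """Approximate boundary that captures a proportion pi of |differences|."""
    return NormalDist().inv_cdf((1 + pi) / 2) * sqrt(msd)


def cp(delta, msd):
    """Approximate proportion of |differences| that fall within delta."""
    # The chi-square(1) CDF at x equals erf(sqrt(x / 2))
    x = delta ** 2 / msd
    return erf(sqrt(x / 2))
```

The two functions are inverses of each other in this approximation: plugging `tdi(pi, msd)` into `cp` as `delta` returns `pi`.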

Estimation and Inference

- Estimate all means, variance components, and their variances and covariances by the GEE method

- Estimate all indices using above estimates
- Estimate variances of all indices using above estimates and delta method

Estimation and Inference (2)

Inter-rater covariance: the covariance of two replications, one coming from rater j and the other coming from rater j′

Estimation and Inference (3)

Cell variance: the variance within each combination of (i, j), i.e., each cell. Thus the error variance is the average of all cells’ variances.
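The averaging over cells can be illustrated with a small helper. This is a plain moment calculation for illustration only, not the GEE estimator itself; the nested-list layout `data[i][j]` and the function name are assumptions, and each cell needs at least two replicates.

```python
from statistics import fmean, variance


def average_cell_variance(data):
    """Average the within-cell sample variances.

    data[i][j] is the list of m replicate readings for subject i, rater j.
    """
    cell_variances = [variance(cell) for row in data for cell in row]
    return fmean(cell_variances)
```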

Estimation and Inference (4)

Intra-rater variance: the variance of a replication of rater j

Intra-rater covariance: the covariance of two replications, both of them coming from rater j.

Estimation and Inference (5)

- The GEE method is used to estimate all indices by estimating the means and all variance components.

Estimation and Inference (8)

- The working variance-covariance structure of the response vector is used; “working” means it is derived as if the data followed a normal distribution
- The derivative matrix is the derivative of the expectation of the response vector with respect to all the parameters

Estimation and Inference (9)

- GEE method provides:
- estimates of all means
- estimates of all variance components
- estimates of variances for all variance components
- estimates of covariances between any two variance components

Estimation and Inference (10)

- The delta method is used to estimate the variances of all indices

Estimation and Inference (18)

- Transformations for variances
- Z-transformation: CCC-indices and precision indices
- Logit-transformation: accuracy and CP indices
- Log-transformation: TDI indices
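The three transformations combine with the delta method in the usual way: the standard error on the transformed scale is the original standard error times the derivative of the transformation, and the limit is then transformed back. A hedged sketch, assuming one-sided 95% limits (lower for CCC, precision, accuracy, and CP; upper for TDI); the function names are illustrative.

```python
from math import atanh, exp, log, tanh
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.95)  # one-sided 95% normal quantile


def lower_limit_z(rho, se):
    """Lower limit for a CCC or precision index via Fisher's Z."""
    se_z = se / (1 - rho ** 2)  # derivative of atanh(rho)
    return tanh(atanh(rho) - Z95 * se_z)


def lower_limit_logit(p, se):
    """Lower limit for an accuracy or CP index via the logit."""
    se_logit = se / (p * (1 - p))  # derivative of log(p / (1 - p))
    bound = log(p / (1 - p)) - Z95 * se_logit
    return 1 / (1 + exp(-bound))


def upper_limit_log(tdi, se):
    """Upper limit for a TDI index via the natural log."""
    se_log = se / tdi  # derivative of log(tdi)
    return exp(log(tdi) + Z95 * se_log)
```

Working on the transformed scale keeps the limits inside the admissible range of each index, for example below 1 for a CCC.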

Simulation Study

- three types of data: binary/ordinal/normal
- three cases for each type of data
- k=2, m=1 / k=4, m=1 / k=2, m=3
- for each case: 1000 random samples with sample size n=20
- for binary and ordinal data: inferences obtained through transformation vs. no transformation
- for normal data: transformation
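The simplest design in the list above (normal data, k=2, m=1, 1000 samples of size n=20) can be mimicked with a short simulation. The standard deviations, the seed, and the function names are assumptions for illustration; they are not the settings used in the study.

```python
import random
from statistics import fmean


def sample_ccc(x, y):
    """Sample CCC with divisor-n moments."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


def simulate_mean_ccc(n_samples=1000, n=20, sd_subject=1.0, sd_error=0.5, seed=0):
    """Average the CCC estimate over repeated samples of two raters."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_samples):
        subjects = [rng.gauss(0, sd_subject) for _ in range(n)]
        # Each rater reads the true subject value plus independent error
        x = [s + rng.gauss(0, sd_error) for s in subjects]
        y = [s + rng.gauss(0, sd_error) for s in subjects]
        estimates.append(sample_ccc(x, y))
    return fmean(estimates)
```

With these illustrative settings the population CCC is 1 / (1 + 0.25) = 0.8, so the averaged estimate gives a rough check on bias at n=20.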

Simulation Study (2)

- Conclusions:
- The algorithm works well for all three types of data, both in estimates and in inferences
- For binary and ordinal data: no need for transformation
- For normal data, Carrasco’s method is superior to ours, but for categorical data, ours is superior.
- For ordinal data, Carrasco’s method and ours perform similarly.

Example One

- Sigma method vs. HemoCue method in measuring the DCLHb level in patients’ serum
- 299 samples: each sample collected twice by each method
- Range: 50-2000 mg/dL

Example One – HemoCue method

HemoCue method first readings vs. second readings

Example One – Sigma method

Sigma method first readings vs. second readings

Example One – HemoCue vs. Sigma

HemoCue’s averages vs. Sigma’s averages

Example One – analysis result (2)

*: for all CCC, precision, accuracy and CP indices, the 95% lower limits are reported. For all TDI indices, the 95% upper limits are reported.

Example Two

- Hemagglutinin Inhibition (HAI) assay for antibody to Influenza A (H3N2) in rabbit serum samples from two labs
- 64 rabbit serum samples: measured twice by each lab
- Antibody level: negative/positive/highly positive

Conclusions (1)

- When data are continuous and m goes to ∞:
- agreement indices are the same as those proposed by Barnhart (2005), both in estimates and inferences
- Improvements:
- Precision indices, accuracy indices, TDIs and CPs
- Variance components

Conclusions (2)

- When m=1:
- agreement index degenerates into the OCCC as proposed by King (2002) and Carrasco (2003) for continuous data
- Improvements:
- For categorical data:
- King’s method approximates kappa and weighted kappa; our estimates (without transformation) are exactly the same as kappa and weighted kappa, both in estimate and in inference.
- Our estimates are superior to Carrasco’s estimates when precision and accuracy are high
- Covariate adjustment becomes available

Conclusions (3)

- When data are continuous, k=2 and m=1:
- agreement index degenerates to the original CCC by Lin (1989)
- When data are binary, k=2 and m=1:
- agreement index degenerates into kappa, both in estimate and inference
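The binary special case is easy to check numerically: with 0/1 readings from two raters and moments computed with divisor n, the CCC coincides with Cohen's kappa. The sketch below covers the point estimate only, not the inference; the function names are illustrative.

```python
from statistics import fmean


def cohen_kappa(x, y):
    """Cohen's kappa for two raters with categorical readings."""
    n = len(x)
    p_observed = sum(a == b for a, b in zip(x, y)) / n
    categories = set(x) | set(y)
    p_expected = sum((x.count(c) / n) * (y.count(c) / n) for c in categories)
    return (p_observed - p_expected) / (1 - p_expected)


def ccc(x, y):
    """Concordance correlation coefficient with divisor-n moments."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


rater_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
rater_2 = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
print(cohen_kappa(rater_1, rater_2), ccc(rater_1, rater_2))
```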

Conclusions (4)

- When data are ordinal, k=2 and m=1:
- agreement index degenerates into weighted kappa with the following weight set, both in estimate and in inference.
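The weight set itself is not preserved in the transcript. Assuming it is the quadratic set, 1 − (i − j)² / (t − 1)² for t ordered categories scored 1 to t, the equality with the CCC can be checked numerically. In the sketch below the factor (t − 1)² cancels, so weighted kappa reduces to one minus the ratio of observed to chance-expected squared disagreement. Function names and data are illustrative.

```python
from statistics import fmean


def weighted_kappa_quadratic(x, y):
    """Weighted kappa under the assumed quadratic weights."""
    n = len(x)
    # Observed mean squared disagreement between paired readings
    observed = sum((a - b) ** 2 for a, b in zip(x, y)) / n
    # Expected mean squared disagreement if the raters were independent
    expected = sum((a - b) ** 2 for a in x for b in y) / n ** 2
    return 1 - observed / expected


def ccc(x, y):
    """Concordance correlation coefficient with divisor-n moments."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


lab_1 = [1, 2, 3, 3, 2, 1, 2, 3]
lab_2 = [1, 2, 3, 2, 2, 1, 3, 3]
print(weighted_kappa_quadratic(lab_1, lab_2), ccc(lab_1, lab_2))
```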

Conclusions (5)

- Unified approach
- Relative agreement indices: CCC with precision and accuracy (depend on the data range)
- Absolute agreement: Total Deviation Indices and Coverage Probabilities (rely on the normality assumption)
- Link functions need more work
- Requires balanced data

References

- Bartko, John J. (1966): The intraclass correlation coefficient as a measure of reliability. Psychological Reports 19, 3-11.
- Barnhart, H. X. and Williamson, J. M. (2001). Modeling concordance correlation via GEE to evaluate reproducibility. Biometrics 57, 931-940.
- Barnhart, H. X., Song, Jingli and Haber, Michael J. (2005): Assessing intra, inter and total agreement with replicated readings. Statistics in Medicine 24: 1371-1384.
- Carrasco, J. L. and Jover, L. (2003). Estimating the generalized concordance correlation coefficient through variance components. Biometrics 59, 849-858.
- Fleiss, J., Cohen, J. and Everitt, B (1969). Large sample standard errors of kappa and weighted kappa. Psychological Bulletin 72, 323-327.
- King, Tonya S. and Chinchilli, Vernon M. (2001): A generalized concordance correlation coefficient for continuous and categorical data. Statistics in Medicine 20: 2131-2147.
- Lin, L. I. (1989). A concordance correlation coefficient to evaluate reproducibility. Biometrics 45, 255-268.
- Lin, L. I., Hedayat, A. S., Sinha, B., and Yang, M. (2002). Statistical methods in assessing agreement: models, issues & tools. Journal of the American Statistical Association 97(457), 257-270.
- Wu, Wenting. A unified approach for assessing agreement. Ph.D. thesis, UIC, 2006
