Statistical Assessment of Agreement


Presentation Transcript


  1. Statistical Assessment of Agreement • Bikas K Sinha [Retd. Professor] • Indian Statistical Institute, Kolkata ******************* • RUDS September 13-14, 2017

  2. Quotes of the Day • “I now tend to believe …so long…I was completely wrong.” “Ah! That’s good. You and I finally agree!” • *************** • “When two men of science disagree, they do not invoke the secular arm; they wait for further evidence to decide the issue, because, as men of science, they know that neither is infallible.”

  3. Latest Book on Agreement

  4. A Statistician’s Call….. • In God We Trust…. • All Others : Must Bring Data …….

  5. Today’s Talk..... Agreement for Categorical Data [55 minutes] Discussion [5 minutes]

  6. Agreement : Categorical Data A Revealing Study was conducted in a Specialist EYE Hospital in Bangkok 600+ Diabetic Patients All : In-house & confined to Hospital Beds All under Treatment for Diabetic Retinopathy ...something to do with the eye ...needed regular monitoring..... Doctors in the study group ?

  7. Rajavithi Hospital, Bangkok Dr. Paisan Ruamviboonsuk, MD Dr. Khemawan Teerasuwanajak, MD Dr. Kanokwan Yuttitham, MD Affiliations : Thai Screening for Diabetic Retinopathy Study Group, Department of Ophthalmology, Rajavithi Hospital, Bangkok, Thailand Statistician : Dr. Montip Tiensuwan, Dept. of Mathematics, Mahidol University

  8. Description of Study Material 400/600+ Diabetic Patients Selected randomly from the hospital One Good Single-Field Digital Fundus Image was taken from each patient with Signed Consent Approved by Ethical Committee on Research with Human Subjects Q. What did they do with the 400 images ? Purpose : Extract information ......on what ? Why ?

  9. THREE Major Features #1. Diabetic Retinopathy Severity [6 options] No Retinopathy / Mild / Moderate / Severe / Critical / Ungradable #2. Macular Edema [2 options] Presence / Absence #3. Referral to Ophthalmologists [2 options] Referrals / Non-Referrals

  10. Who Extracted the Features ? • Retina Specialists • General Ophthalmologists • Photographers • Nurses • All of them attached to the Hospital • AND 3 from each Group !!! • Altogether 12 ‘RATERS’ collected data on each of the 3 features…..from each of the 400 images…..loaded with data…..

  11. Measurements : Provided by Experts / Observers / Raters • Rater....Generic Term • Could be two or more systems, assessors, chemists, psychologists, radiologists, clinicians, nurses, rating systems or raters, diagnosis or treatments, instruments or methods, processes or techniques or formulae……

  12. Retina Specialists’ Ratings [Macular Edema]
      CODES       RS1    RS2    RS3
      Presence    330    326    332
      Absence      70     74     68
      Total       400    400    400
      Remarks : Remarkable Agreement ! Too good to be valid !
      Q. Is there any inside story – yet to be revealed ?
      Called upon a Statistician : Dr. Montip Tiensuwan, PhD [Statistics] from Western Australia; Mathematics & Statistics, Mahidol University, Bangkok
      Had already studied the literature on Statistical Assessment of Agreement... Successfully collaborated with the Medical Doctors......

  13. Bring out the Inside Story….
      RS1 \ RS2    yes    no   Total
      yes          270    60    330
      no            56    14     70
      Total        326    74    400

      RS1 \ RS3    yes    no   Total
      yes          280    50    330
      no            52    18     70
      Total        332    68    400

      RS2 \ RS3    yes    no   Total
      yes          270    56    326
      no            62    12     74
      Total        332    68    400

      Agreement…..not strong at all…..more than 25% disagreement upfront between any two raters

  14. Cohen’s Kappa for 2x2 Rating • Rater I vs Rater II : 2 x 2 Case • Categories : Yes & No • (Y,Y) & (N,N) : Agreement Prop. along the main diagonal • (Y,N) & (N,Y) : Disagreement Prop. along the anti-diagonal • π0 = P(Y,Y) + P(N,N) = Prop. Agreement • Chancy Agreement [CA] ? • πe = P(Y,.) P(.,Y) + P(N,.) P(.,N) = Prop. CA • κ = [π0 − πe] / [1 − πe] x 100 % : Kappa • Chance-corrected Agreement Index

  15. Study of Agreement [RS-ME] 2 x 2 Table : Cohen’s Kappa (κ) Coefficient
      Retina Specialist 1 \ Retina Specialist 2
                 Presence   Absence   Subtotal
      Presence      270        60        330
      Absence        56        14         70
      Subtotal      326        74        400
      % Agreement : (270 + 14) / 400 = 71% = π0 [Observed]
      % Chancy Agreement : %Yes x %Yes + %No x %No = (330/400)(326/400) + (70/400)(74/400) = 0.825 x 0.815 + 0.175 x 0.185 = 70.48% = πe [expected by chance]
      κ = [π0 − πe] / [1 − πe] = 1.8% ....very poor agreement.....
      Net Agreement Standardized Agreement Index
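      The slide’s arithmetic can be checked with a small script. The sketch below is illustrative only (the function name cohen_kappa and the table layout are my own, not part of the presentation); it reproduces the π0, πe and κ figures worked out above for the RS1-vs-RS2 Macular Edema table.

```python
# Illustrative helper: Cohen's kappa for a K x K contingency table
def cohen_kappa(table):
    n = sum(sum(row) for row in table)                 # total number of subjects
    k = len(table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p0 = sum(table[i][i] for i in range(k)) / n        # observed agreement
    pe = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2  # chance agreement
    return (p0 - pe) / (1 - pe)

# RS1 (rows) vs RS2 (columns), Macular Edema : Presence / Absence
me_table = [[270, 60],
            [ 56, 14]]
print(round(100 * cohen_kappa(me_table), 1))           # ~1.8 (percent), as above
```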

  16. Marginal Agreement vs Overall Agreement …. • Up front : Case of Marginal Agreement • Should not be judged by Marginal Agreement alone • Must look into all the 400 images and verify agreement case by case to decide on the extent of overall agreement….. • Pairwise κ-Index for Macular Edema • RS1 vs RS2….1.8 % • RS1 vs RS3….10.68 % • RS2 vs RS3…..0 % • No or very poor overall agreement…..

  17. Other Features….. #1. Diabetic Retinopathy Severity [6 options] No Retinopathy / Mild / Moderate / Severe / Critical / Ungradable A bit tricky.....6 options.... #2. Macular Edema [2 options]....done Presence / Absence #3. Referral to Ophthalmologists [2 options] Referrals / Non-Referrals #3 is similar to #2 : 2 x 2 Table [R vs NR]

  18. Marginal Summary of Data : RS • Diab. Ret. Status : Classification of Patients by RS1 / RS2 / RS3
      Status        RS1    RS2    RS3
      Nil           252    286    287
      Mild           38     30     33
      Moderate       81     53     49
      Severe          7     13     12
      Critical       10     11     12
      Ungradable     12      7      7
      Total         400    400    400
      Remark : Reasonably good agreement ….very good agreement between RS2 & RS3 indeed…..
      Inside story ? Chance-Corrected Kappa Index ?

  19. Retina Specialists’ Ratings [DR]
      RS1 \ RS2
      CODES     1     2     3     4     5     6   Total
      1       247     2     2     1     0     0     252
      2        12    18     7     1     0     0      38
      3        22    10    40     8     0     1      81
      4         0     0     3     2     2     0       7
      5         0     0     0     1     9     0      10
      6         5     0     1     0     0     6      12
      Total   286    30    53    13    11     7     400

  20. Retina Specialists’ Ratings [DR]
      RS1 \ RS3
      CODES     0     1     2     3     4     9   Total
      0       249     2     0     1     0     0     252
      1        23     8     7     0     0     0      38
      2        31     4    44     2     0     0      81
      3         0     0     7     0     0     0       7
      4         0     0     0     0    10     0      10
      9         9     1     0     0     0     2      12
      Total   312    15    58     3    10     2     400

  21. Retina Specialists’ Ratings [DR]
      RS2 \ RS3
      CODES     0     1     2     3     4     9   Total
      0       274     5     6     1     0     0     286
      1        16     5     8     1     0     0      30
      2        15     2    35     0     0     1      53
      3         2     2     7     1     1     0      13
      4         0     0     2     0     9     0      11
      9         5     1     0     0     0     1       7
      Total   312    15    58     3    10     2     400

  22. Retina Specialists’ Consensus Rating [DR]
      RS1 \ RSCR
      CODES     0     1     2     3     4     9   Total
      0       252     0     0     0     0     0     252
      1        17    19     2     0     0     0      38
      2        15    19    43     2     1     1      81
      3         0     0     2     4     1     0       7
      4         0     0     0     0    10     0      10
      9         8     0     0     0     0     4      12
      Total   292    38    47     6    12     5     400

  23. Understanding the 6x6 Table....
      Retina Specialist 1 \ Retina Specialist 2
      CODES     1     2     3     4     5     6   Total
      1       247     2     2     1     0     0     252
      2        12    18     7     1     0     0      38
      3        22    10    40     8     0     1      81
      4         0     0     3     2     2     0       7
      5         0     0     0     1     9     0      10
      6         5     0     1     0     0     6      12
      Total   286    30    53    13    11     7     400

  24. κ-Computation……
      % Agreement = (247 + 18 + 40 + 2 + 9 + 6)/400 = 322/400 = 0.8050 = 80.50 % = π0
      % Chancy Agreement = (252/400)(286/400) + …. + (12/400)(7/400) = 0.4860 = 48.60 % = πe
      κ = [π0 − πe] / [1 − πe] = 62% !
      Note : 100% Credit for ’Hit’ & No Credit for ’Miss’.
      Criticism : Heavy Penalty for narrowly missed !
      Concept of Weighted Kappa
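      As a usage note (again illustrative, not from the slides), the cohen_kappa helper sketched after slide 15 reproduces this 62% figure when applied to the 6x6 RS1-vs-RS2 table of slide 19:

```python
# 6x6 RS1 (rows) vs RS2 (columns) DR table from slide 19
dr_table = [[247,  2,  2, 1, 0, 0],
            [ 12, 18,  7, 1, 0, 0],
            [ 22, 10, 40, 8, 0, 1],
            [  0,  0,  3, 2, 2, 0],
            [  0,  0,  0, 1, 9, 0],
            [  5,  0,  1, 0, 0, 6]]
print(round(cohen_kappa(dr_table), 2))   # ~0.62, i.e. 62%
```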

  25. Hit or Miss….. 100% credit for a ’hit’ along the diagonal
      Retina Specialist 1 \ Retina Specialist 2
      CODES     1     2     3     4     5     6   Total
      1       247     2     2     1     0     0     252
      2        12    18     7     1     0     0      38
      3        22    10    40     8     0     1      81
      4         0     0     3     2     2     0       7
      5         0     0     0     1     9     0      10
      6         5     0     1     0     0     6      12
      Total   286    30    53    13    11     7     400

  26. Table of Weights for 6x6 Ratings
      Ratings     1       2       3       4       5       6
      1           1     24/25   21/25   16/25    9/25     0
      2         24/25     1     24/25   21/25   16/25    9/25
      3         21/25   24/25     1     24/25   21/25   16/25
      4         16/25   21/25   24/25     1     24/25   21/25
      5          9/25   16/25   21/25   24/25     1     24/25
      6           0      9/25   16/25   21/25   24/25     1
      Formula : w_ij = 1 − [(i − j)^2 / (6 − 1)^2]

  27. Formula for Weighted Kappa • π0(w) = ∑∑ w_ij f_ij / n • πe(w) = ∑∑ w_ij (f_i. /n)(f_.j /n) • κ(w) = [π0(w) − πe(w)] / [1 − πe(w)] • These ∑∑ are over ALL cells, with f_ij as the frequency in the (i,j)th cell • For unweighted Kappa : we take into account only the cell frequencies along the main diagonal, with 100% weight
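      A short illustrative sketch of these two formulas, assuming the quadratic weights of slide 26 and reusing the dr_table defined in the sketch after slide 24; the name weighted_kappa is my own, not from the presentation.

```python
# Weighted kappa with quadratic weights w_ij = 1 - (i - j)^2 / (K - 1)^2
def weighted_kappa(table):
    k = len(table)
    n = sum(sum(row) for row in table)
    w = [[1 - (i - j) ** 2 / (k - 1) ** 2 for j in range(k)] for i in range(k)]
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p0w = sum(w[i][j] * table[i][j] for i in range(k) for j in range(k)) / n
    pew = sum(w[i][j] * row_tot[i] * col_tot[j]
              for i in range(k) for j in range(k)) / n ** 2
    return (p0w - pew) / (1 - pew)

print(round(weighted_kappa(dr_table), 2))   # roughly 0.75 for the RS1-vs-RS2 DR table
```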

  28. κ-statistics for Pairs of Raters
      Retina Specialists     DR      ME    Referral
      1 vs 2                0.63    0.58    0.65
      1 vs 3                0.55    0.64    0.65
      2 vs 3                0.56    0.51    0.59
      κ-coeff. Interpretation : Usually 70 % or more...sign of satisfactory agreement....
      Not a very exciting form of agreement...

  29. κ for Multiple Raters’ Agreement • How to judge agreement among • Retina Specialists vs Ophthalmologists • Retina Specialists vs Photographers • Retina Specialists vs Nurses and so on..... • Needed computational formulae for a single Index of Agreement for each Category of Raters....for Category-wise Comparisons... • Research Papers and Books.....

  30. κ-statistic for Multiple Raters…
      CATEGORIES             DR      ME    Referral
      Retina Specialists    0.58    0.58    0.63
      Gen. Ophthalmol.      0.36    0.19    0.24
      Photographers         0.37    0.38    0.30
      Nurses                0.26    0.20    0.20
      All Raters            0.34    0.27    0.28
      Except for the Retina Specialists, no other expert group shows good agreement on any feature
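      The slides do not state which multi-rater formula produced these figures. Purely as an illustration, the sketch below implements Fleiss’ kappa, one standard chance-corrected index for m raters classifying N subjects; the function name and the toy data are hypothetical, not taken from the study.

```python
# Fleiss' kappa: counts[i][j] = number of raters placing subject i in category j,
# with every subject rated by the same number of raters m.
def fleiss_kappa(counts):
    N = len(counts)                    # number of subjects (e.g. images)
    m = sum(counts[0])                 # raters per subject
    k = len(counts[0])                 # number of categories
    # per-subject agreement, then its average
    P_i = [(sum(c * c for c in row) - m) / (m * (m - 1)) for row in counts]
    P_bar = sum(P_i) / N
    # chance agreement from the overall category proportions
    p_j = [sum(row[j] for row in counts) / (N * m) for j in range(k)]
    P_e = sum(p * p for p in p_j)
    return (P_bar - P_e) / (1 - P_e)

# Toy data (hypothetical): 4 images, 3 raters, categories = Presence / Absence
toy = [[3, 0], [2, 1], [0, 3], [1, 2]]
print(round(fleiss_kappa(toy), 2))     # 0.33 for this toy table
```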

  31. Conclusion based on the κ-Study • Of all 400 cases….. • 44 warranted Referral to Ophthalmologists due to Retinopathy Severity • 5 warranted Referral to Ophthalmologists due to uncertainty in diagnosis • A fourth Retina Specialist carried out a Dilated Fundus Exam of these 44 patients and substantial agreement [κ = 0.68] was noticed for DR severity…… • The Exam confirmed Referral of 38 / 44 cases.

  32. Discussion on the Study • Retina Specialists : All in active clinical practice : Most reliable for digital image interpretation • Individual Rater’s background and experience play roles in digital image interpretation • Unusually high % of ungradable images among nonphysician raters, though only 5 out of 400 were declared as ’ungradable’ by consensus of the Retina Specialists’ Group. • Lack of Confidence of Nonphysicians, rather than true image ambiguity ! • For this study, other factors [blood pressure, blood sugar, cholesterol etc] not taken into account……

  33. References….. Cohen, J. (1960). A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, 20(1): 37-46. [Famous for Cohen’s Kappa] Cohen, J. (1968). Weighted Kappa: Nominal Scale Agreement with Provision for Scaled Disagreement or Partial Credit. Psychological Bulletin, 70(4): 213-220.

  34. References…. Lin, L. I. (2000). Total Deviation Index for Measuring Individual Agreement: With Application in Lab Performance and Bioequivalence. Statistics in Medicine, 19: 255-270. Lin, L. I., Hedayat, A. S., Sinha, Bikas & Yang, Min (2002). Statistical Methods in Assessing Agreement: Models, Issues, and Tools. Journal of the American Statistical Association, 97(457): 257-270.

  35. References…. Banerjee, M., Capozzoli, M., McSweeney, L. & Sinha, D. (1999). Beyond Kappa: A Review of Interrater Agreement Measures. Canadian Journal of Statistics, 27(1): 3-23. Sinha, B. K., Tiensuwan, M. & Pharita (2007). Cohen’s Kappa Statistic: A Critical Appraisal and Some Modifications. Calcutta Statistical Association Bulletin, 58: 151-169.

  36. The End • Thank you for your attention ! • Bikas K Sinha • Sept. 13-14, 2017
