
Reliability analysis of binary outcomes: sample size and calculations of kappa statistic

https://doi.org/10.22328/2413-5747-2023-9-3-102-112

Abstract

Reliability analysis is an important methodological tool in medical research for assessing the degree of agreement between measurements obtained by different methods or by different investigators. This article gives an accessible overview of the basic concepts of reliability analysis and of the statistical criteria used to apply it in biomedical research. It also outlines the similarities and differences between validity analysis and reliability analysis. The calculation of Cohen’s kappa is demonstrated for the simplest case of two researchers and a binary variable, both manually from the formulas and in SPSS. The advantages and disadvantages of the kappa statistic are discussed. The article is intended for novice researchers and young scientists and will be useful for planning research projects and training data collectors.
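For readers who want to see the manual calculation in executable form, the following is a minimal sketch of Cohen's kappa for two raters and a binary outcome, computed from the four cells of a two-by-two table. The function name `cohen_kappa_2x2`, the cell labels a, b, c, d, and the example counts are assumptions chosen for illustration; they are not taken from the article's Table 1 or figures.

```python
def cohen_kappa_2x2(a, b, c, d):
    """Cohen's kappa for two raters and a binary outcome.

    Cell layout (hypothetical labels, not the article's notation):
      a: both raters positive
      b: rater 1 positive, rater 2 negative
      c: rater 1 negative, rater 2 positive
      d: both raters negative
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table is empty")

    # Observed agreement: share of subjects on which the raters agree.
    p_o = (a + d) / n

    # Agreement expected by chance, from the marginal totals.
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2

    if p_e == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")

    return (p_o - p_e) / (1 - p_e)


# Illustrative counts only (assumed, not data from the article).
kappa = cohen_kappa_2x2(40, 10, 5, 45)
print(round(kappa, 3))  # 0.7
```

With these assumed counts the observed agreement is 0.85 and the chance-expected agreement is 0.50, so kappa = (0.85 - 0.50) / (1 - 0.50) = 0.70. SPSS reports the same quantity under Crosstabs with the kappa option selected.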

About the Authors

Ekaterina A. Mitkina
Northern State Medical University
Russian Federation

Student, Faculty of Dentistry



Yulia G. Kozlova
Northern State Medical University
Russian Federation

Student, Faculty of Dentistry



Maria A. Gorbatova
Northern State Medical University
Russian Federation

Cand. of Sci. (Med.), MPH, Associate Professor at the Department of Pediatric Dentistry



Andrej M. Grjibovski
Northern State Medical University; North-Eastern Federal University
Russian Federation

Dr. of Sci. (Med.), Master of International Community Health, Head of the Directorate for Research and Innovations, Director of the Central Scientific Research Laboratory; Professor at the Department of Public Health, Public Health, General Hygiene and Bioethics




Supplementary files

Fig. 1. Two-by-two table for calculating coefficients used in reliability analysis.
Fig. 2. Two-by-two table for manual calculation of kappa statistic.
Fig. 3. Example from Table 1 in SPSS data window.
Fig. 4. Dialog box for crosstabulation.
Fig. 5. Dialog box Crosstabs: Statistics with selection of kappa statistic.
Fig. 6. Contingency table in SPSS with the data from Table 1.
Fig. 7. SPSS output with the results of kappa statistic calculation.
Fig. 8. Sample size for kappa statistic calculation.

Review

For citations:


Mitkina E.A., Kozlova Yu.G., Gorbatova M.A., Grjibovski A.M. Reliability analysis of binary outcomes: sample size and calculations of kappa statistic. Marine Medicine. 2023;9(3):102-112. (In Russ.) https://doi.org/10.22328/2413-5747-2023-9-3-102-112



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2413-5747 (Print)
ISSN 2587-7828 (Online)