Use of generalizability theory for exploring reliability of and sources of variance in assessment of technical skills: A systematic review and meta-analysis

Research output: Contribution to journal › Review › Research › peer-review

Standard

Use of generalizability theory for exploring reliability of and sources of variance in assessment of technical skills: A systematic review and meta-analysis. / Andersen, Steven Arild Wuyts; Nayahangan, Leizl Joy; Park, Yoon Soo; Konge, Lars.

In: Academic Medicine, Vol. 96, No. 11, 2021, pp. 1609-1619.

Research output: Contribution to journal › Review › Research › peer-review

Harvard

Andersen, SAW, Nayahangan, LJ, Park, YS & Konge, L 2021, 'Use of generalizability theory for exploring reliability of and sources of variance in assessment of technical skills: A systematic review and meta-analysis', Academic Medicine, vol. 96, no. 11, pp. 1609-1619. https://doi.org/10.1097/ACM.0000000000004150

APA

Andersen, S. A. W., Nayahangan, L. J., Park, Y. S., & Konge, L. (2021). Use of generalizability theory for exploring reliability of and sources of variance in assessment of technical skills: A systematic review and meta-analysis. Academic Medicine, 96(11), 1609-1619. https://doi.org/10.1097/ACM.0000000000004150

Vancouver

Andersen SAW, Nayahangan LJ, Park YS, Konge L. Use of generalizability theory for exploring reliability of and sources of variance in assessment of technical skills: A systematic review and meta-analysis. Academic Medicine. 2021;96(11):1609-1619. https://doi.org/10.1097/ACM.0000000000004150

Author

Andersen, Steven Arild Wuyts ; Nayahangan, Leizl Joy ; Park, Yoon Soo ; Konge, Lars. / Use of generalizability theory for exploring reliability of and sources of variance in assessment of technical skills: A systematic review and meta-analysis. In: Academic Medicine. 2021 ; Vol. 96, No. 11. pp. 1609-1619.

BibTeX

@article{5d8b4c5e9b5a4612bab7b31ef005862a,
title = "Use of generalizability theory for exploring reliability of and sources of variance in assessment of technical skills: A systematic review and meta-analysis",
abstract = "Purpose Competency-based education relies on the validity and reliability of assessment scores. Generalizability (G) theory is well suited to explore the reliability of assessment tools in medical education but has only been applied to a limited extent. This study aimed to systematically review the literature using G-theory to explore the reliability of structured assessment of medical and surgical technical skills and to assess the relative contributions of different factors to variance. Method In June 2020, 11 databases, including PubMed, were searched from inception through May 31, 2020. Eligible studies included the use of G-theory to explore reliability in the context of assessment of medical and surgical technical skills. Descriptive information on study, assessment context, assessment protocol, participants being assessed, and G-analyses was extracted. Data were used to map G-theory and explore variance components analyses. A meta-analysis was conducted to synthesize the extracted data on the sources of variance and reliability. Results Forty-four studies were included; of these, 39 had sufficient data for metaanalysis. The total pool included 35,284 unique assessments of 31,496 unique performances of 4,154 participants. Person variance had a pooled effect of 44.2% (95% confidence interval [CI], 36.8%-51.5%). Only assessment tool type (Objective Structured Assessment of Technical Skills-type vs task-based checklist-type) had a significant effect on person variance. The pooled reliability (G-coefficient) was 0.65 (95% CI,.59-.70). Most studies included decision studies (39, 88.6%) and generally seemed to have higher ratios of performances to assessors to achieve a sufficiently reliable assessment. Conclusions G-theory is increasingly being used to examine reliability of technical skills assessment in medical education, but more rigor in reporting is warranted. Contextual factors can potentially affect variance components and thereby reliability estimates and should be considered, especially in high-stakes assessment. Reliability analysis should be a best practice when developing assessment of technical skills.",
author = "Andersen, {Steven Arild Wuyts} and Nayahangan, {Leizl Joy} and Park, {Yoon Soo} and Lars Konge",
note = "Publisher Copyright: {\textcopyright} 2021 Lippincott Williams and Wilkins. All rights reserved.",
year = "2021",
doi = "10.1097/ACM.0000000000004150",
language = "English",
volume = "96",
pages = "1609--1619",
journal = "Academic Medicine",
issn = "1040-2446",
publisher = "Lippincott Williams & Wilkins",
number = "11",

}

RIS

TY - JOUR

T1 - Use of generalizability theory for exploring reliability of and sources of variance in assessment of technical skills

T2 - A systematic review and meta-analysis

AU - Andersen, Steven Arild Wuyts

AU - Nayahangan, Leizl Joy

AU - Park, Yoon Soo

AU - Konge, Lars

N1 - Publisher Copyright: © 2021 Lippincott Williams and Wilkins. All rights reserved.

PY - 2021

Y1 - 2021

N2 - Purpose: Competency-based education relies on the validity and reliability of assessment scores. Generalizability (G) theory is well suited to explore the reliability of assessment tools in medical education but has only been applied to a limited extent. This study aimed to systematically review the literature using G-theory to explore the reliability of structured assessment of medical and surgical technical skills and to assess the relative contributions of different factors to variance. Method: In June 2020, 11 databases, including PubMed, were searched from inception through May 31, 2020. Eligible studies included the use of G-theory to explore reliability in the context of assessment of medical and surgical technical skills. Descriptive information on study, assessment context, assessment protocol, participants being assessed, and G-analyses was extracted. Data were used to map G-theory and explore variance components analyses. A meta-analysis was conducted to synthesize the extracted data on the sources of variance and reliability. Results: Forty-four studies were included; of these, 39 had sufficient data for meta-analysis. The total pool included 35,284 unique assessments of 31,496 unique performances of 4,154 participants. Person variance had a pooled effect of 44.2% (95% confidence interval [CI], 36.8%-51.5%). Only assessment tool type (Objective Structured Assessment of Technical Skills-type vs task-based checklist-type) had a significant effect on person variance. The pooled reliability (G-coefficient) was 0.65 (95% CI, .59-.70). Most studies included decision studies (39, 88.6%), and these generally indicated that higher ratios of performances to assessors were needed to achieve a sufficiently reliable assessment. Conclusions: G-theory is increasingly being used to examine reliability of technical skills assessment in medical education, but more rigor in reporting is warranted. Contextual factors can potentially affect variance components and thereby reliability estimates and should be considered, especially in high-stakes assessment. Reliability analysis should be a best practice when developing assessment of technical skills.

AB - Purpose: Competency-based education relies on the validity and reliability of assessment scores. Generalizability (G) theory is well suited to explore the reliability of assessment tools in medical education but has only been applied to a limited extent. This study aimed to systematically review the literature using G-theory to explore the reliability of structured assessment of medical and surgical technical skills and to assess the relative contributions of different factors to variance. Method: In June 2020, 11 databases, including PubMed, were searched from inception through May 31, 2020. Eligible studies included the use of G-theory to explore reliability in the context of assessment of medical and surgical technical skills. Descriptive information on study, assessment context, assessment protocol, participants being assessed, and G-analyses was extracted. Data were used to map G-theory and explore variance components analyses. A meta-analysis was conducted to synthesize the extracted data on the sources of variance and reliability. Results: Forty-four studies were included; of these, 39 had sufficient data for meta-analysis. The total pool included 35,284 unique assessments of 31,496 unique performances of 4,154 participants. Person variance had a pooled effect of 44.2% (95% confidence interval [CI], 36.8%-51.5%). Only assessment tool type (Objective Structured Assessment of Technical Skills-type vs task-based checklist-type) had a significant effect on person variance. The pooled reliability (G-coefficient) was 0.65 (95% CI, .59-.70). Most studies included decision studies (39, 88.6%), and these generally indicated that higher ratios of performances to assessors were needed to achieve a sufficiently reliable assessment. Conclusions: G-theory is increasingly being used to examine reliability of technical skills assessment in medical education, but more rigor in reporting is warranted. Contextual factors can potentially affect variance components and thereby reliability estimates and should be considered, especially in high-stakes assessment. Reliability analysis should be a best practice when developing assessment of technical skills.

U2 - 10.1097/ACM.0000000000004150

DO - 10.1097/ACM.0000000000004150

M3 - Review

C2 - 33951677

AN - SCOPUS:85116816314

VL - 96

SP - 1609

EP - 1619

JO - Academic Medicine

JF - Academic Medicine

SN - 1040-2446

IS - 11

ER -

ID: 301021839
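
Illustrative note

The abstract describes the two computations at the core of G-theory: a generalizability (G) study estimates variance components, and a decision (D) study projects reliability for different numbers of raters or performances. As a minimal sketch of these computations for a fully crossed persons-by-raters design, the Python snippet below applies the standard relative G-coefficient formula, G = var_person / (var_person + var_residual / n_raters). All variance values are hypothetical assumptions chosen for illustration; they are not data from the review, although the person-variance share of 0.442 deliberately echoes the pooled 44.2% reported above.

# Minimal G-theory sketch for a fully crossed persons-by-raters (p x r) design.
# Variance components are hypothetical illustration values, not study data.

def g_coefficient(var_person, var_residual, n_raters):
    # Relative G-coefficient: person (true-score) variance divided by
    # person variance plus relative error (residual averaged over raters).
    return var_person / (var_person + var_residual / n_raters)

var_p = 0.442     # hypothetical person variance component
var_pr_e = 0.558  # hypothetical person-by-rater interaction + error component

# D-study: project reliability as the number of raters increases.
for n_r in (1, 2, 3, 5):
    print(f"raters = {n_r}: G = {g_coefficient(var_p, var_pr_e, n_r):.2f}")

With one rater the projected G-coefficient equals the person-variance share (0.44); averaging over more raters shrinks the error term (0.61 with two, 0.70 with three, 0.80 with five). This is the mechanism behind the decision studies the abstract mentions, in which the number of performances and assessors is varied until reliability is sufficient for the intended stakes.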