Inter-expert and intra-expert reliability in sleep spindle scoring

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Inter-expert and intra-expert reliability in sleep spindle scoring. / Wendt, Sabrina L; Welinder, Peter; Sorensen, Helge B D; Peppard, Paul E; Jennum, Poul; Perona, Pietro; Mignot, Emmanuel; Warby, Simon C.

In: Clinical Neurophysiology, Vol. 126, No. 8, 08.2015, p. 1548-56.

Harvard

Wendt, SL, Welinder, P, Sorensen, HBD, Peppard, PE, Jennum, P, Perona, P, Mignot, E & Warby, SC 2015, 'Inter-expert and intra-expert reliability in sleep spindle scoring', Clinical Neurophysiology, vol. 126, no. 8, pp. 1548-56. https://doi.org/10.1016/j.clinph.2014.10.158

APA

Wendt, S. L., Welinder, P., Sorensen, H. B. D., Peppard, P. E., Jennum, P., Perona, P., Mignot, E., & Warby, S. C. (2015). Inter-expert and intra-expert reliability in sleep spindle scoring. Clinical Neurophysiology, 126(8), 1548-56. https://doi.org/10.1016/j.clinph.2014.10.158

Vancouver

Wendt SL, Welinder P, Sorensen HBD, Peppard PE, Jennum P, Perona P et al. Inter-expert and intra-expert reliability in sleep spindle scoring. Clinical Neurophysiology. 2015 Aug;126(8):1548-56. https://doi.org/10.1016/j.clinph.2014.10.158

Author

Wendt, Sabrina L ; Welinder, Peter ; Sorensen, Helge B D ; Peppard, Paul E ; Jennum, Poul ; Perona, Pietro ; Mignot, Emmanuel ; Warby, Simon C. / Inter-expert and intra-expert reliability in sleep spindle scoring. In: Clinical Neurophysiology. 2015 ; Vol. 126, No. 8. pp. 1548-56.

Bibtex

@article{8fb619545ea44f03b81c8b7d0c6e0603,
title = "Inter-expert and intra-expert reliability in sleep spindle scoring",
abstract = "OBJECTIVES: To measure the inter-expert and intra-expert agreement in sleep spindle scoring, and to quantify how many experts are needed to build a reliable dataset of sleep spindle scorings. METHODS: The EEG dataset was comprised of 400 randomly selected 115s segments of stage 2 sleep from 110 sleeping subjects in the general population (57±8, range: 42-72 years). To assess expert agreement, a total of 24 Registered Polysomnographic Technologists (RPSGTs) scored spindles in a subset of the EEG dataset at a single electrode location (C3-M2). Intra-expert and inter-expert agreements were calculated as F1-scores, Cohen's kappa (κ), and intra-class correlation coefficient (ICC). RESULTS: We found an average intra-expert F1-score agreement of 72±7% (κ: 0.66±0.07). The average inter-expert agreement was 61±6% (κ: 0.52±0.07). Amplitude and frequency of discrete spindles were calculated with higher reliability than the estimation of spindle duration. Reliability of sleep spindle scoring can be improved by using qualitative confidence scores, rather than a dichotomous yes/no scoring system. CONCLUSIONS: We estimate that 2-3 experts are needed to build a spindle scoring dataset with 'substantial' reliability (κ: 0.61-0.8), and 4 or more experts are needed to build a dataset with 'almost perfect' reliability (κ: 0.81-1). SIGNIFICANCE: Spindle scoring is a critical part of sleep staging, and spindles are believed to play an important role in development, aging, and diseases of the nervous system.",
keywords = "Adult, Aged, Arousal, Electroencephalography, Female, Humans, Male, Middle Aged, Observer Variation, Polysomnography, Reproducibility of Results, Sleep Stages",
author = "Wendt, {Sabrina L} and Peter Welinder and Sorensen, {Helge B D} and Peppard, {Paul E} and Poul Jennum and Pietro Perona and Emmanuel Mignot and Warby, {Simon C}",
note = "Copyright {\textcopyright} 2014 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.",
year = "2015",
month = aug,
doi = "10.1016/j.clinph.2014.10.158",
language = "English",
volume = "126",
pages = "1548--1556",
journal = "Clinical Neurophysiology",
issn = "1388-2457",
publisher = "Elsevier Ireland Ltd",
number = "8",

}

RIS

TY - JOUR

T1 - Inter-expert and intra-expert reliability in sleep spindle scoring

AU - Wendt, Sabrina L

AU - Welinder, Peter

AU - Sorensen, Helge B D

AU - Peppard, Paul E

AU - Jennum, Poul

AU - Perona, Pietro

AU - Mignot, Emmanuel

AU - Warby, Simon C

N1 - Copyright © 2014 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

PY - 2015/8

Y1 - 2015/8

N2 - OBJECTIVES: To measure the inter-expert and intra-expert agreement in sleep spindle scoring, and to quantify how many experts are needed to build a reliable dataset of sleep spindle scorings. METHODS: The EEG dataset was comprised of 400 randomly selected 115s segments of stage 2 sleep from 110 sleeping subjects in the general population (57±8, range: 42-72 years). To assess expert agreement, a total of 24 Registered Polysomnographic Technologists (RPSGTs) scored spindles in a subset of the EEG dataset at a single electrode location (C3-M2). Intra-expert and inter-expert agreements were calculated as F1-scores, Cohen's kappa (κ), and intra-class correlation coefficient (ICC). RESULTS: We found an average intra-expert F1-score agreement of 72±7% (κ: 0.66±0.07). The average inter-expert agreement was 61±6% (κ: 0.52±0.07). Amplitude and frequency of discrete spindles were calculated with higher reliability than the estimation of spindle duration. Reliability of sleep spindle scoring can be improved by using qualitative confidence scores, rather than a dichotomous yes/no scoring system. CONCLUSIONS: We estimate that 2-3 experts are needed to build a spindle scoring dataset with 'substantial' reliability (κ: 0.61-0.8), and 4 or more experts are needed to build a dataset with 'almost perfect' reliability (κ: 0.81-1). SIGNIFICANCE: Spindle scoring is a critical part of sleep staging, and spindles are believed to play an important role in development, aging, and diseases of the nervous system.

AB - OBJECTIVES: To measure the inter-expert and intra-expert agreement in sleep spindle scoring, and to quantify how many experts are needed to build a reliable dataset of sleep spindle scorings. METHODS: The EEG dataset was comprised of 400 randomly selected 115s segments of stage 2 sleep from 110 sleeping subjects in the general population (57±8, range: 42-72 years). To assess expert agreement, a total of 24 Registered Polysomnographic Technologists (RPSGTs) scored spindles in a subset of the EEG dataset at a single electrode location (C3-M2). Intra-expert and inter-expert agreements were calculated as F1-scores, Cohen's kappa (κ), and intra-class correlation coefficient (ICC). RESULTS: We found an average intra-expert F1-score agreement of 72±7% (κ: 0.66±0.07). The average inter-expert agreement was 61±6% (κ: 0.52±0.07). Amplitude and frequency of discrete spindles were calculated with higher reliability than the estimation of spindle duration. Reliability of sleep spindle scoring can be improved by using qualitative confidence scores, rather than a dichotomous yes/no scoring system. CONCLUSIONS: We estimate that 2-3 experts are needed to build a spindle scoring dataset with 'substantial' reliability (κ: 0.61-0.8), and 4 or more experts are needed to build a dataset with 'almost perfect' reliability (κ: 0.81-1). SIGNIFICANCE: Spindle scoring is a critical part of sleep staging, and spindles are believed to play an important role in development, aging, and diseases of the nervous system.

KW - Adult

KW - Aged

KW - Arousal

KW - Electroencephalography

KW - Female

KW - Humans

KW - Male

KW - Middle Aged

KW - Observer Variation

KW - Polysomnography

KW - Reproducibility of Results

KW - Sleep Stages

U2 - 10.1016/j.clinph.2014.10.158

DO - 10.1016/j.clinph.2014.10.158

M3 - Journal article

C2 - 25434753

VL - 126

SP - 1548

EP - 1556

JO - Clinical Neurophysiology

JF - Clinical Neurophysiology

SN - 1388-2457

IS - 8

ER -
