The International Workshop on Osteoarthritis Imaging Knee MRI Segmentation Challenge: A Multi-Institute Evaluation and Analysis Framework on a Standardized Dataset

Publication: Contribution to journal › Journal article › Research › Peer-reviewed

Standard

The International Workshop on Osteoarthritis Imaging Knee MRI Segmentation Challenge: A Multi-Institute Evaluation and Analysis Framework on a Standardized Dataset. / IWOAI Segmentation Challenge Writing Group.

In: Radiology. Artificial intelligence, Vol. 3, No. 3, e200078, 2021.

Harvard

IWOAI Segmentation Challenge Writing Group 2021, 'The International Workshop on Osteoarthritis Imaging Knee MRI Segmentation Challenge: A Multi-Institute Evaluation and Analysis Framework on a Standardized Dataset', Radiology. Artificial intelligence, vol. 3, no. 3, e200078. https://doi.org/10.1148/ryai.2021200078

APA

IWOAI Segmentation Challenge Writing Group (2021). The International Workshop on Osteoarthritis Imaging Knee MRI Segmentation Challenge: A Multi-Institute Evaluation and Analysis Framework on a Standardized Dataset. Radiology. Artificial intelligence, 3(3), [e200078]. https://doi.org/10.1148/ryai.2021200078

Vancouver

IWOAI Segmentation Challenge Writing Group. The International Workshop on Osteoarthritis Imaging Knee MRI Segmentation Challenge: A Multi-Institute Evaluation and Analysis Framework on a Standardized Dataset. Radiology. Artificial intelligence. 2021;3(3):e200078. https://doi.org/10.1148/ryai.2021200078

Author

IWOAI Segmentation Challenge Writing Group. / The International Workshop on Osteoarthritis Imaging Knee MRI Segmentation Challenge: A Multi-Institute Evaluation and Analysis Framework on a Standardized Dataset. In: Radiology. Artificial intelligence. 2021; Vol. 3, No. 3.

Bibtex

@article{26d4cf5456b342b0bccc6d9e5c74c205,
title = "The International Workshop on Osteoarthritis Imaging Knee MRI Segmentation Challenge: A Multi-Institute Evaluation and Analysis Framework on a Standardized Dataset",
abstract = "Purpose: To organize a multi-institute knee MRI segmentation challenge for characterizing the semantic and clinical efficacy of automatic segmentation methods relevant for monitoring osteoarthritis progression. Materials and Methods: A dataset partition consisting of three-dimensional knee MRI from 88 retrospective patients at two time points (baseline and 1-year follow-up) with ground truth articular (femoral, tibial, and patellar) cartilage and meniscus segmentations was standardized. Challenge submissions and a majority-vote ensemble were evaluated against ground truth segmentations using Dice score, average symmetric surface distance, volumetric overlap error, and coefficient of variation on a holdout test set. Similarities in automated segmentations were measured using pairwise Dice coefficient correlations. Articular cartilage thickness was computed longitudinally and with scans. Correlation between thickness error and segmentation metrics was measured using the Pearson correlation coefficient. Two empirical upper bounds for ensemble performance were computed using combinations of model outputs that consolidated true positives and true negatives. Results: Six teams (T1-T6) submitted entries for the challenge. No differences were observed across any segmentation metrics for any tissues (P = .99) among the four top-performing networks (T2, T3, T4, T6). Dice coefficient correlations between network pairs were high (> 0.85). Per-scan thickness errors were negligible among networks T1-T4 (P = .99), and longitudinal changes showed minimal bias (< 0.03 mm). Low correlations (ρ < 0.41) were observed between segmentation metrics and thickness error. The majority-vote ensemble was comparable to top-performing networks (P = .99). Empirical upper-bound performances were similar for both combinations (P = .99). Conclusion: Diverse networks learned to segment the knee similarly, where high segmentation accuracy did not correlate with cartilage thickness accuracy and voting ensembles did not exceed individual network performance. See also the commentary by Elhalawani and Mak in this issue. Keywords: Cartilage, Knee, MR-Imaging, Segmentation. {\textcopyright} RSNA, 2020. Supplemental material is available for this article.",
author = "Desai, {Arjun D} and Francesco Caliva and Claudia Iriondo and Aliasghar Mortazi and Sachin Jambawalikar and Ulas Bagci and Mathias Perslev and Christian Igel and Dam, {Erik B} and Sibaji Gaj and Mingrui Yang and Xiaojuan Li and Deniz, {Cem M} and Vladimir Juras and Ravinder Regatte and Gold, {Garry E} and Hargreaves, {Brian A} and Valentina Pedoia and Chaudhari, {Akshay S} and {IWOAI Segmentation Challenge Writing Group}",
year = "2021",
doi = "10.1148/ryai.2021200078",
language = "English",
volume = "3",
journal = "Radiology. Artificial intelligence",
issn = "2638-6100",
publisher = "Radiological Society of North America",
number = "3",

}

RIS

TY - JOUR

T1 - The International Workshop on Osteoarthritis Imaging Knee MRI Segmentation Challenge

T2 - A Multi-Institute Evaluation and Analysis Framework on a Standardized Dataset

AU - Desai, Arjun D

AU - Caliva, Francesco

AU - Iriondo, Claudia

AU - Mortazi, Aliasghar

AU - Jambawalikar, Sachin

AU - Bagci, Ulas

AU - Perslev, Mathias

AU - Igel, Christian

AU - Dam, Erik B

AU - Gaj, Sibaji

AU - Yang, Mingrui

AU - Li, Xiaojuan

AU - Deniz, Cem M

AU - Juras, Vladimir

AU - Regatte, Ravinder

AU - Gold, Garry E

AU - Hargreaves, Brian A

AU - Pedoia, Valentina

AU - Chaudhari, Akshay S

AU - IWOAI Segmentation Challenge Writing Group

PY - 2021

Y1 - 2021

N2 - Purpose: To organize a multi-institute knee MRI segmentation challenge for characterizing the semantic and clinical efficacy of automatic segmentation methods relevant for monitoring osteoarthritis progression. Materials and Methods: A dataset partition consisting of three-dimensional knee MRI from 88 retrospective patients at two time points (baseline and 1-year follow-up) with ground truth articular (femoral, tibial, and patellar) cartilage and meniscus segmentations was standardized. Challenge submissions and a majority-vote ensemble were evaluated against ground truth segmentations using Dice score, average symmetric surface distance, volumetric overlap error, and coefficient of variation on a holdout test set. Similarities in automated segmentations were measured using pairwise Dice coefficient correlations. Articular cartilage thickness was computed longitudinally and with scans. Correlation between thickness error and segmentation metrics was measured using the Pearson correlation coefficient. Two empirical upper bounds for ensemble performance were computed using combinations of model outputs that consolidated true positives and true negatives. Results: Six teams (T1-T6) submitted entries for the challenge. No differences were observed across any segmentation metrics for any tissues (P = .99) among the four top-performing networks (T2, T3, T4, T6). Dice coefficient correlations between network pairs were high (> 0.85). Per-scan thickness errors were negligible among networks T1-T4 (P = .99), and longitudinal changes showed minimal bias (< 0.03 mm). Low correlations (ρ < 0.41) were observed between segmentation metrics and thickness error. The majority-vote ensemble was comparable to top-performing networks (P = .99). Empirical upper-bound performances were similar for both combinations (P = .99). Conclusion: Diverse networks learned to segment the knee similarly, where high segmentation accuracy did not correlate with cartilage thickness accuracy and voting ensembles did not exceed individual network performance. See also the commentary by Elhalawani and Mak in this issue. Keywords: Cartilage, Knee, MR-Imaging, Segmentation. © RSNA, 2020. Supplemental material is available for this article.

AB - Purpose: To organize a multi-institute knee MRI segmentation challenge for characterizing the semantic and clinical efficacy of automatic segmentation methods relevant for monitoring osteoarthritis progression. Materials and Methods: A dataset partition consisting of three-dimensional knee MRI from 88 retrospective patients at two time points (baseline and 1-year follow-up) with ground truth articular (femoral, tibial, and patellar) cartilage and meniscus segmentations was standardized. Challenge submissions and a majority-vote ensemble were evaluated against ground truth segmentations using Dice score, average symmetric surface distance, volumetric overlap error, and coefficient of variation on a holdout test set. Similarities in automated segmentations were measured using pairwise Dice coefficient correlations. Articular cartilage thickness was computed longitudinally and with scans. Correlation between thickness error and segmentation metrics was measured using the Pearson correlation coefficient. Two empirical upper bounds for ensemble performance were computed using combinations of model outputs that consolidated true positives and true negatives. Results: Six teams (T1-T6) submitted entries for the challenge. No differences were observed across any segmentation metrics for any tissues (P = .99) among the four top-performing networks (T2, T3, T4, T6). Dice coefficient correlations between network pairs were high (> 0.85). Per-scan thickness errors were negligible among networks T1-T4 (P = .99), and longitudinal changes showed minimal bias (< 0.03 mm). Low correlations (ρ < 0.41) were observed between segmentation metrics and thickness error. The majority-vote ensemble was comparable to top-performing networks (P = .99). Empirical upper-bound performances were similar for both combinations (P = .99). Conclusion: Diverse networks learned to segment the knee similarly, where high segmentation accuracy did not correlate with cartilage thickness accuracy and voting ensembles did not exceed individual network performance. See also the commentary by Elhalawani and Mak in this issue. Keywords: Cartilage, Knee, MR-Imaging, Segmentation. © RSNA, 2020. Supplemental material is available for this article.

U2 - 10.1148/ryai.2021200078

DO - 10.1148/ryai.2021200078

M3 - Journal article

C2 - 34235438

VL - 3

JO - Radiology. Artificial intelligence

JF - Radiology. Artificial intelligence

SN - 2638-6100

IS - 3

M1 - e200078

ER -

ID: 274865087