Generalizability and usefulness of artificial intelligence for skin cancer diagnostics: An algorithm validation study

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Generalizability and usefulness of artificial intelligence for skin cancer diagnostics: An algorithm validation study. / Ternov, Niels K.; Christensen, Anders N.; Kampen, Peter J. T.; Als, Gustav; Vestergaard, Tine; Konge, Lars; Tolsgaard, Martin; Hölmich, Lisbet R.; Guitera, Pascale; Chakera, Annette H.; Hannemose, Morten R.

In: JEADV Clinical Practice, Vol. 1, No. 4, 2022, p. 344-354.

Harvard

Ternov, NK, Christensen, AN, Kampen, PJT, Als, G, Vestergaard, T, Konge, L, Tolsgaard, M, Hölmich, LR, Guitera, P, Chakera, AH & Hannemose, MR 2022, 'Generalizability and usefulness of artificial intelligence for skin cancer diagnostics: An algorithm validation study', JEADV Clinical Practice, vol. 1, no. 4, pp. 344-354. https://doi.org/10.1002/jvc2.59

APA

Ternov, N. K., Christensen, A. N., Kampen, P. J. T., Als, G., Vestergaard, T., Konge, L., Tolsgaard, M., Hölmich, L. R., Guitera, P., Chakera, A. H., & Hannemose, M. R. (2022). Generalizability and usefulness of artificial intelligence for skin cancer diagnostics: An algorithm validation study. JEADV Clinical Practice, 1(4), 344-354. https://doi.org/10.1002/jvc2.59

Vancouver

Ternov NK, Christensen AN, Kampen PJT, Als G, Vestergaard T, Konge L et al. Generalizability and usefulness of artificial intelligence for skin cancer diagnostics: An algorithm validation study. JEADV Clinical Practice. 2022;1(4):344-354. https://doi.org/10.1002/jvc2.59

Author

Ternov, Niels K. ; Christensen, Anders N. ; Kampen, Peter J. T. ; Als, Gustav ; Vestergaard, Tine ; Konge, Lars ; Tolsgaard, Martin ; Hölmich, Lisbet R. ; Guitera, Pascale ; Chakera, Annette H. ; Hannemose, Morten R. / Generalizability and usefulness of artificial intelligence for skin cancer diagnostics: An algorithm validation study. In: JEADV Clinical Practice. 2022 ; Vol. 1, No. 4. pp. 344-354.

Bibtex

@article{be762e7b79ba4afba5c300e567daa883,
title = "Generalizability and usefulness of artificial intelligence for skin cancer diagnostics: An algorithm validation study",
abstract = "Background: Artificial intelligence can be trained to outperform dermatologists in image-based skin cancer diagnostics. However, the networks' sensitivity to biases and overfitting may hamper their clinical applicability. Objectives: The aim of this study was to explain the potential consequences of implementing convolutional neural networks for stand-alone melanoma diagnostics and skin lesion triage. Methods: In this algorithm validation study on retrospective data, we reproduced and evaluated the performance of state-of-the-art artificial intelligence (convolutional neural networks) for skin cancer diagnostics. The networks were trained on 25,331 annotated dermoscopic skin lesion images from an open-source data set (ISIC-2019) and tested using a novel data set (AISC-2021) consisting of 26,591 annotated dermoscopic skin lesion images. We tested the trained algorithms' ability to generalize to new data and their diagnostic performance in two simulations (melanoma diagnostics and skin lesion triage). Results: The trained algorithms performed significantly less accurate diagnostics on images of nevi, melanomas and actinic keratoses from the AISC-2021 data set than the ISIC-2019 data set (p < 0.003). Almost one-third (31.1%) of the melanomas were misclassified during the melanoma diagnostics simulation, irrespective of their Breslow thickness. Furthermore, the algorithms marked 92.7% of the lesions {\textquoteleft}suspicious{\textquoteright} during the triage simulation, which yielded a triage sensitivity and specificity of 99.7% and 8.2%, respectively. Conclusions: Although state-of-the-art artificial intelligence outperforms dermatologists on image-based skin lesion classification within an artificial setting, additional data and technological advances are needed before clinical implementation.",
author = "Ternov, {Niels K.} and Christensen, {Anders N.} and Kampen, {Peter J. T.} and Gustav Als and Tine Vestergaard and Lars Konge and Martin Tolsgaard and H{\"o}lmich, {Lisbet R.} and Pascale Guitera and Chakera, {Annette H.} and Hannemose, {Morten R.}",
year = "2022",
doi = "10.1002/jvc2.59",
language = "English",
volume = "1",
pages = "344--354",
journal = "JEADV Clinical Practice",
issn = "2768-6566",
publisher = "Wiley Open Access",
number = "4",

}

RIS

TY - JOUR

T1 - Generalizability and usefulness of artificial intelligence for skin cancer diagnostics: An algorithm validation study

AU - Ternov, Niels K.

AU - Christensen, Anders N.

AU - Kampen, Peter J. T.

AU - Als, Gustav

AU - Vestergaard, Tine

AU - Konge, Lars

AU - Tolsgaard, Martin

AU - Hölmich, Lisbet R.

AU - Guitera, Pascale

AU - Chakera, Annette H.

AU - Hannemose, Morten R.

PY - 2022

Y1 - 2022

N2 - Background: Artificial intelligence can be trained to outperform dermatologists in image-based skin cancer diagnostics. However, the networks' sensitivity to biases and overfitting may hamper their clinical applicability. Objectives: The aim of this study was to explain the potential consequences of implementing convolutional neural networks for stand-alone melanoma diagnostics and skin lesion triage. Methods: In this algorithm validation study on retrospective data, we reproduced and evaluated the performance of state-of-the-art artificial intelligence (convolutional neural networks) for skin cancer diagnostics. The networks were trained on 25,331 annotated dermoscopic skin lesion images from an open-source data set (ISIC-2019) and tested using a novel data set (AISC-2021) consisting of 26,591 annotated dermoscopic skin lesion images. We tested the trained algorithms' ability to generalize to new data and their diagnostic performance in two simulations (melanoma diagnostics and skin lesion triage). Results: The trained algorithms performed significantly less accurate diagnostics on images of nevi, melanomas and actinic keratoses from the AISC-2021 data set than the ISIC-2019 data set (p < 0.003). Almost one-third (31.1%) of the melanomas were misclassified during the melanoma diagnostics simulation, irrespective of their Breslow thickness. Furthermore, the algorithms marked 92.7% of the lesions ‘suspicious’ during the triage simulation, which yielded a triage sensitivity and specificity of 99.7% and 8.2%, respectively. Conclusions: Although state-of-the-art artificial intelligence outperforms dermatologists on image-based skin lesion classification within an artificial setting, additional data and technological advances are needed before clinical implementation.

AB - Background: Artificial intelligence can be trained to outperform dermatologists in image-based skin cancer diagnostics. However, the networks' sensitivity to biases and overfitting may hamper their clinical applicability. Objectives: The aim of this study was to explain the potential consequences of implementing convolutional neural networks for stand-alone melanoma diagnostics and skin lesion triage. Methods: In this algorithm validation study on retrospective data, we reproduced and evaluated the performance of state-of-the-art artificial intelligence (convolutional neural networks) for skin cancer diagnostics. The networks were trained on 25,331 annotated dermoscopic skin lesion images from an open-source data set (ISIC-2019) and tested using a novel data set (AISC-2021) consisting of 26,591 annotated dermoscopic skin lesion images. We tested the trained algorithms' ability to generalize to new data and their diagnostic performance in two simulations (melanoma diagnostics and skin lesion triage). Results: The trained algorithms performed significantly less accurate diagnostics on images of nevi, melanomas and actinic keratoses from the AISC-2021 data set than the ISIC-2019 data set (p < 0.003). Almost one-third (31.1%) of the melanomas were misclassified during the melanoma diagnostics simulation, irrespective of their Breslow thickness. Furthermore, the algorithms marked 92.7% of the lesions ‘suspicious’ during the triage simulation, which yielded a triage sensitivity and specificity of 99.7% and 8.2%, respectively. Conclusions: Although state-of-the-art artificial intelligence outperforms dermatologists on image-based skin lesion classification within an artificial setting, additional data and technological advances are needed before clinical implementation.

U2 - 10.1002/jvc2.59

DO - 10.1002/jvc2.59

M3 - Journal article

VL - 1

SP - 344

EP - 354

JO - JEADV Clinical Practice

JF - JEADV Clinical Practice

SN - 2768-6566

IS - 4

ER -

ID: 346603187