Should artificial intelligence have lower acceptable error rates than humans?

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Should artificial intelligence have lower acceptable error rates than humans? / Lenskjold, Anders; Nybing, Janus Uhd; Trampedach, Charlotte; Galsgaard, Astrid; Brejnebøl, Mathias Willadsen; Raaschou, Henriette; Rose, Martin Høyer; Boesen, Mikael.

In: BJR open, Vol. 5, No. 1, 20220053, 2023.

Research output: Contribution to journal › Journal article › Research › peer-review

Harvard

Lenskjold, A, Nybing, JU, Trampedach, C, Galsgaard, A, Brejnebøl, MW, Raaschou, H, Rose, MH & Boesen, M 2023, 'Should artificial intelligence have lower acceptable error rates than humans?', BJR open, vol. 5, no. 1, 20220053. https://doi.org/10.1259/bjro.20220053

APA

Lenskjold, A., Nybing, J. U., Trampedach, C., Galsgaard, A., Brejnebøl, M. W., Raaschou, H., Rose, M. H., & Boesen, M. (2023). Should artificial intelligence have lower acceptable error rates than humans? BJR open, 5(1), [20220053]. https://doi.org/10.1259/bjro.20220053

Vancouver

Lenskjold A, Nybing JU, Trampedach C, Galsgaard A, Brejnebøl MW, Raaschou H et al. Should artificial intelligence have lower acceptable error rates than humans? BJR open. 2023;5(1). 20220053. https://doi.org/10.1259/bjro.20220053

Author

Lenskjold, Anders ; Nybing, Janus Uhd ; Trampedach, Charlotte ; Galsgaard, Astrid ; Brejnebøl, Mathias Willadsen ; Raaschou, Henriette ; Rose, Martin Høyer ; Boesen, Mikael. / Should artificial intelligence have lower acceptable error rates than humans? In: BJR open. 2023 ; Vol. 5, No. 1.

Bibtex

@article{d72acbb9aadf417c9fdbfbcd69ea6680,
title = "Should artificial intelligence have lower acceptable error rates than humans?",
abstract = "The first patient was misclassified in the diagnostic conclusion according to a local clinical expert opinion in a new clinical implementation of a knee osteoarthritis artificial intelligence (AI) algorithm at Bispebjerg-Frederiksberg University Hospital, Copenhagen, Denmark. In preparation for the evaluation of the AI algorithm, the implementation team collaborated with internal and external partners to plan workflows, and the algorithm was externally validated. After the misclassification, the team was left wondering: what is an acceptable error rate for a low-risk AI diagnostic algorithm? A survey among employees at the Department of Radiology showed significantly lower acceptable error rates for AI (6.8 %) than humans (11.3 %). A general mistrust of AI could cause the discrepancy in acceptable errors. AI may have the disadvantage of limited social capital and likeability compared to human co-workers, and therefore, less potential for forgiveness. Future AI development and implementation require further investigation of the fear of AI's unknown errors to enhance the trustworthiness of perceiving AI as a co-worker. Benchmark tools, transparency, and explainability are also needed to evaluate AI algorithms in clinical implementations to ensure acceptable performance.",
author = "Anders Lenskjold and Nybing, {Janus Uhd} and Charlotte Trampedach and Astrid Galsgaard and Brejneb{\o}l, {Mathias Willadsen} and Henriette Raaschou and Rose, {Martin H{\o}yer} and Mikael Boesen",
note = "{\textcopyright} 2023 The Authors. Published by the British Institute of Radiology.",
year = "2023",
doi = "10.1259/bjro.20220053",
language = "English",
volume = "5",
journal = "BJR open",
issn = "2513-9878",
publisher = "British Institute of Radiology",
number = "1",
}
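
The entry above is plain field = "value" pairs keyed by the citation key d72acbb9aadf417c9fdbfbcd69ea6680, so it can be inspected programmatically without a dedicated BibTeX library. Below is a minimal Python sketch using a trimmed copy of the record (the long abstract, author, and note fields are dropped here for brevity); the quoted-value regex is a simplifying assumption that holds for this record but not for BibTeX in general.

import re

# The BibTeX record above, trimmed for the sketch; key and field values are verbatim.
BIBTEX = '''
@article{d72acbb9aadf417c9fdbfbcd69ea6680,
title = "Should artificial intelligence have lower acceptable error rates than humans?",
year = "2023",
doi = "10.1259/bjro.20220053",
volume = "5",
journal = "BJR open",
issn = "2513-9878",
publisher = "British Institute of Radiology",
number = "1",
}
'''

# Entry type and citation key, then each field = "value" pair. This is a
# deliberately naive parser: it assumes double-quoted values with no embedded
# double quotes, which holds for this record but not for BibTeX generally.
head = re.search(r"@(\w+)\{([^,]+),", BIBTEX)
fields = dict(re.findall(r'(\w+)\s*=\s*"([^"]*)"', BIBTEX))

print(head.group(1), head.group(2))  # -> article d72acbb9aadf417c9fdbfbcd69ea6680
print(fields["journal"], fields["volume"] + "(" + fields["number"] + ")", "doi:" + fields["doi"])

In a real LaTeX workflow the entry would simply be saved to a .bib file and cited by its key with \cite; a sketch like this is only useful for quick scripted checks of the exported metadata.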

RIS

TY - JOUR

T1 - Should artificial intelligence have lower acceptable error rates than humans?

AU - Lenskjold, Anders

AU - Nybing, Janus Uhd

AU - Trampedach, Charlotte

AU - Galsgaard, Astrid

AU - Brejnebøl, Mathias Willadsen

AU - Raaschou, Henriette

AU - Rose, Martin Høyer

AU - Boesen, Mikael

N1 - © 2023 The Authors. Published by the British Institute of Radiology.

PY - 2023

Y1 - 2023

N2 - The first patient was misclassified in the diagnostic conclusion according to a local clinical expert opinion in a new clinical implementation of a knee osteoarthritis artificial intelligence (AI) algorithm at Bispebjerg-Frederiksberg University Hospital, Copenhagen, Denmark. In preparation for the evaluation of the AI algorithm, the implementation team collaborated with internal and external partners to plan workflows, and the algorithm was externally validated. After the misclassification, the team was left wondering: what is an acceptable error rate for a low-risk AI diagnostic algorithm? A survey among employees at the Department of Radiology showed significantly lower acceptable error rates for AI (6.8 %) than humans (11.3 %). A general mistrust of AI could cause the discrepancy in acceptable errors. AI may have the disadvantage of limited social capital and likeability compared to human co-workers, and therefore, less potential for forgiveness. Future AI development and implementation require further investigation of the fear of AI's unknown errors to enhance the trustworthiness of perceiving AI as a co-worker. Benchmark tools, transparency, and explainability are also needed to evaluate AI algorithms in clinical implementations to ensure acceptable performance.

AB - The first patient was misclassified in the diagnostic conclusion according to a local clinical expert opinion in a new clinical implementation of a knee osteoarthritis artificial intelligence (AI) algorithm at Bispebjerg-Frederiksberg University Hospital, Copenhagen, Denmark. In preparation for the evaluation of the AI algorithm, the implementation team collaborated with internal and external partners to plan workflows, and the algorithm was externally validated. After the misclassification, the team was left wondering: what is an acceptable error rate for a low-risk AI diagnostic algorithm? A survey among employees at the Department of Radiology showed significantly lower acceptable error rates for AI (6.8 %) than humans (11.3 %). A general mistrust of AI could cause the discrepancy in acceptable errors. AI may have the disadvantage of limited social capital and likeability compared to human co-workers, and therefore, less potential for forgiveness. Future AI development and implementation require further investigation of the fear of AI's unknown errors to enhance the trustworthiness of perceiving AI as a co-worker. Benchmark tools, transparency, and explainability are also needed to evaluate AI algorithms in clinical implementations to ensure acceptable performance.

U2 - 10.1259/bjro.20220053

DO - 10.1259/bjro.20220053

M3 - Journal article

C2 - 37389001

VL - 5

JO - BJR open

JF - BJR open

SN - 2513-9878

IS - 1

M1 - 20220053

ER -
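
RIS is line-oriented: each line is a two-character tag, a hyphen separator, and a value, with repeated tags (the eight AU lines above) expressing multi-valued fields and ER closing the record. Below is a minimal Python sketch of reading such a record, again using a trimmed copy of the export above (five of the eight AU lines and the abstract are omitted); standard RIS files put two spaces before the hyphen, and the pattern accepts both that and the single-space rendering shown on this page.

import re
from collections import defaultdict

# The RIS record above, trimmed for the sketch.
RIS = """\
TY - JOUR
T1 - Should artificial intelligence have lower acceptable error rates than humans?
AU - Lenskjold, Anders
AU - Nybing, Janus Uhd
AU - Boesen, Mikael
PY - 2023
DO - 10.1259/bjro.20220053
VL - 5
JO - BJR open
IS - 1
M1 - 20220053
ER -
"""

# Standard RIS layout is "TAG  - value" (two-letter tag, two spaces, hyphen,
# space); this page renders a single space, so the pattern accepts either.
TAG = re.compile(r"^([A-Z][A-Z0-9])\s+-\s*(.*)$")

record = defaultdict(list)            # repeated tags (e.g. AU) accumulate in order
for line in RIS.splitlines():
    m = TAG.match(line)
    if m and m.group(1) != "ER":      # ER marks the end of the record
        record[m.group(1)].append(m.group(2))

print(record["T1"][0])
print("; ".join(record["AU"]))        # authors in export order
print(record["DO"][0])                # the DOI

Reference managers such as Zotero or EndNote import the full RIS export directly; a hand-rolled reader like this is only for quick scripted inspection of the metadata.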
