Evaluation of Artificial Intelligence–Assisted Diagnosis of Skin Neoplasms: A Single-Center, Paralleled, Unmasked, Randomized Controlled Trial

Published: February 17, 2022; DOI: https://doi.org/10.1016/j.jid.2022.02.003

      Trial design

      This was a single-center, unmasked, paralleled, randomized controlled trial.

      Methods

      A randomized trial was conducted at a tertiary care institute in South Korea to test whether artificial intelligence (AI) could augment the diagnostic accuracy of nonexpert physicians in real-world settings, which include diverse out-of-distribution conditions. Consecutive patients aged >19 years with one or more skin lesions suspicious for skin cancer, detected by either the patient or a physician, were randomly allocated to four nondermatology trainees and four dermatology residents. After simple randomization of the patients, the attending dermatologists examined them with (AI-assisted group) or without (unaided group) the real-time assistance of an AI algorithm (https://b2020.modelderm.com#world; convolutional neural networks; unmasked design).
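As a minimal sketch of what simple randomization means here, the snippet below assigns each patient to an arm by an independent fair coin flip. The function name `simple_randomize`, the seed, and the 1:1 allocation ratio are illustrative assumptions; the abstract does not describe the actual allocation procedure or software.

```python
import random
from collections import Counter

# Arm labels taken from the abstract; everything else is hypothetical.
ARMS = ("AI-assisted", "unaided")


def simple_randomize(n_patients, seed=None):
    """Assign each patient to an arm by an independent fair coin flip.

    Simple randomization uses no blocking or stratification, so the two
    arms are not forced to end up the same size.
    """
    rng = random.Random(seed)
    return [rng.choice(ARMS) for _ in range(n_patients)]


if __name__ == "__main__":
    allocation = simple_randomize(576, seed=0)
    # Group sizes typically differ slightly between arms.
    print(Counter(allocation))
```

Because each assignment is independent, unequal arm sizes such as the reported 295 and 281 are an expected outcome of this design rather than a sign of a problem.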

      Results

      Of the 603 patients initially recruited, 576 consecutive cases (Fitzpatrick skin phototypes III or IV) with suspicious lesions were analyzed. The accuracy of the AI-assisted group (n = 295, 53.9%) was significantly higher than that of the unaided group (n = 281, 43.8%; P = 0.019). The augmentation was most pronounced in the nondermatology trainees, who had the least experience in dermatology (54.7% with AI assistance, n = 150, vs. 30.7% unaided, n = 138; P < 0.0001), whereas it was not significant in the dermatology residents. The algorithm helped trainees in the AI-assisted group include more differential diagnoses than those in the unaided group (2.09 vs. 1.95 diagnoses; P = 0.0005). However, a 12.2% drop in the Top-1 accuracy of the trainees was observed in cases in which all Top-3 predictions given by the algorithm were incorrect.
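The overall comparison can be checked roughly with a two-by-two test of proportions. The sketch below rests on two assumptions not stated in the abstract: that the counts of correct diagnoses can be back-calculated from the rounded percentages (159 of 295 and 123 of 281), and that a chi-square test with Yates continuity correction is a reasonable stand-in for whatever test the authors used. The function name `yates_chi2_2x2` is hypothetical.

```python
import math


def yates_chi2_2x2(a, b, c, d):
    """Chi-square test with Yates correction for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p_value). For 1 degree of freedom the chi-square
    survival function equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0.0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Assumed counts, back-calculated from the rounded percentages.
    ai_correct = round(0.539 * 295)       # AI-assisted group, correct
    unaided_correct = round(0.438 * 281)  # unaided group, correct
    chi2, p = yates_chi2_2x2(
        ai_correct, 295 - ai_correct,
        unaided_correct, 281 - unaided_correct,
    )
    print(f"chi2 = {chi2:.3f}, P = {p:.4f}")
```

Under these assumptions the P value comes out at roughly 0.019, in line with the reported figure. The trainee subgroup is left out because its rounded percentages do not map cleanly onto whole-number counts for the stated group sizes.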

      Conclusions

      The multiclass AI algorithm augmented the diagnostic accuracy of nonexpert physicians in dermatology.

      Abbreviations:

      AI (artificial intelligence), GP (general practitioner), RCT (randomized controlled trial)