Background: Malignant melanoma can most successfully be cured when diagnosed at an early stage in the natural history. However, there is controversy over screening programs and many advocate screening only for high-risk individuals.
Objectives: This study aimed to evaluate the accuracy of an artificial intelligence neural network (Deep Ensemble for Recognition of Melanoma [DERM]) to identify malignant melanoma from dermoscopic images of pigmented skin lesions and to show how this compared to doctors’ performance assessed by meta-analysis.
Methods: DERM was trained and tested using 7,102 dermoscopic images of both histologically confirmed melanoma (24%) and benign pigmented lesions (76%). A meta-analysis was conducted of studies examining the accuracy of naked-eye examination, with or without dermoscopy, by specialist and general physicians whose clinical diagnosis was compared to histopathology. The meta-analysis was based on evaluation of 32,226 pigmented lesions including 3,277 histopathology-confirmed malignant melanoma cases. The receiver operating characteristic (ROC) curve was used to examine and compare the diagnostic accuracy.
Results: DERM achieved a ROC area under the curve (AUC) of 0.93 (95% confidence interval: 0.92-0.94), and sensitivity and specificity of 85.0% and 85.3%, respectively. Avoidance of false-negative results is essential, so different decision thresholds were examined. At 95% sensitivity DERM achieved a specificity of 64.1% and at 95% specificity the sensitivity was 67%. The meta-analysis showed primary care physicians (10 studies) achieve an AUC of 0.83 (95% confidence interval: 0.79-0.86), with sensitivity and specificity of 79.9% and 70.9%; and dermatologists (92 studies) 0.91 (0.88-0.93), 87.5%, and 81.4%, respectively.
Conclusions: DERM has the potential to be used as a decision support tool in primary care, by providing dermatologist-grade recommendation on the likelihood of malignant melanoma.
Malignant melanoma (MM) is less common than basal and squamous cell skin cancer; however, the incidence of MM is increasing faster than that of other forms of cancer and it is responsible for the majority of skin cancer deaths [ 1 ] . Early diagnosis of MM (stage 1) has more than 95% five-year relative survival rate compared with 8% to 25% for MM diagnosed at later stages [ 2 ] .
Current practice guidelines in the United Kingdom recommend appropriately trained health care professionals assess all suspect pigmented lesions using dermoscopy [ 1 , 3 ] . Diagnosis is confirmed with biopsy, histological examination, and specialist pathological interpretation. Pressure to diagnose MM early leads to a high proportion of benign pigmented lesions being referred from primary care to specialist care, and a large proportion of biopsied lesions are found to be benign [ 4 , 5 ] . This creates increased demands on overburdened secondary care and pathology service resources [ 6 ] . Improved accuracy of pigmented lesion review in primary care would help reduce this pressure. Techniques such as dermoscopy with classification algorithms, reflectance confocal microscopy, and teledermatology have been reported to improve diagnostic accuracy of MM [ 7 – 15 ] . However, the diagnostic accuracy is still dependent on the degree of experience of the examiners and the equipment required is costly [ 16 ] .
A large number of smartphone applications for MM detection have been released recently. However, there is little evidence of clinical validation. Kassianos et al reviewed 39 apps that addressed skin cancer issues; 19 involved smartphone photography and 4 provided an estimate of the probability of malignancy. None of these apps had been assessed for diagnostic accuracy [ 17 ] . Understandably there is concern about the possible harm to patients that poorly designed, inaccurate, and/or misleading consumer apps may cause [ 18 – 20 ] . However, with appropriate development and suitable evaluation there is no reason why modern electronic technology could not improve diagnostic accuracy. Recently, an artificial intelligence (AI) algorithm categorizing photographs of pigmented lesions has been shown to be capable of classifying MM with a level of competence comparable to that of dermatologists [ 21 ] . As Obermeyer and Emanuel state in a recent review, “Machine learning has become ubiquitous and indispensable for solving complex problems in most sciences. The same methods will open up vast new possibilities in medicine” [ 22 ] . However, there are ethical issues associated with the clinical applications of AI in medicine that do not apply to current business applications, astronomy, or chemistry, and these cannot be ignored [ 23 ] .
The primary aim of this study was to evaluate the diagnostic accuracy of an AI algorithm (Deep Ensemble for Recognition of Melanoma [DERM]) developed by Skin Analytics Limited. The secondary aim was to improve the methodology for evaluating an AI diagnostic tool by comparing DERM’s performance with clinical examination by physicians, stratified by level of expertise and use of dermoscopy, using a meta-analysis of diagnostic studies. Note that the meta-analysis was not designed to be a systematic review of the kind represented by the recent Cochrane reviews of skin cancer.
DERM was designed and developed using deep learning techniques that identify and assess features of pigmented lesions that are associated with MM [ 23 – 28 ] . Deep learning differs from earlier machine learning methods by learning features that are associated with MM directly from the data, rather than using features predetermined by a researcher. The algorithm was trained and validated against a dataset of archived dermoscopic images of skin lesions, using 10-fold cross-validation. This approach allows every image to be tested once, while ensuring the same image does not appear in the training and test datasets. Cross-validation is performed by splitting the dataset into several (10) “folds” (datasets). The algorithm is tested against each fold, with the remainder used for training. The results for each fold are then averaged so that the overall performance can be assessed.
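The cross-validation procedure described above can be sketched as follows. This is a minimal illustration and not the DERM training code: the function names `k_fold_splits` and `cross_validate`, and the `fit` and `score` callables, are hypothetical placeholders. The sketch also assumes that duplicate and near-identical images have already been removed, so that splitting by image index is enough to keep the training and test sets disjoint.

```python
import random


def k_fold_splits(n_items, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # Training set is the union of all folds except the held-out one.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validate(n_items, fit, score, k=10, seed=0):
    """Train on k-1 folds, score on the held-out fold, and average the scores.

    `fit(train_idx)` returns a model; `score(model, test_idx)` returns a number.
    Both are supplied by the caller (hypothetical interface).
    """
    fold_scores = []
    for train_idx, test_idx in k_fold_splits(n_items, k, seed):
        model = fit(train_idx)
        fold_scores.append(score(model, test_idx))
    return sum(fold_scores) / len(fold_scores)
```

With 7,102 images and 10 folds, each image is held out exactly once, in a test fold of 710 or 711 images.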
The image dataset was collated from several different sources including the PH2 dataset [ 29 ] , Interactive Atlas of Dermoscopy [ 30 ] , and ISIC archive [ 31 ] . An additional 672 dermoscopic lesion images were collected from a variety of other sources. The ISIC archive contains a large number of images obtained from children, which are easy to classify as benign. Their inclusion in the dataset was found to optimistically bias results so they were excluded from the development work. The ISIC archive also contains a large number of identical and near-identical images which were removed from the dataset. The final dataset consists of a total of 7,102 unique pigmented lesion images, 24% being confirmed as MM by histopathology, though subtype information was not available, the rest being made up of benign and nonbenign lesions.
DERM generates a continuous response to an image with limits of 0 and 1, which reflects its “confidence” that the lesion is MM: a value close to 1 indicates MM and near 0 indicates a benign lesion. A nonparametric receiver operating characteristic (ROC) curve analysis was used to examine the overall diagnostic accuracy of the result using Pepe’s nonparametric methods with bootstrapped estimation [ 32 ] . The gold standard for MM was histopathology. We examined different cut-points used by DERM to categorize lesions as positive or negative, ie, illustrating alternative diagnostic rules from the diagnostic model [ 33 ] . The methods of Youden [ 34 ] and Liu [ 35 ] were used, as well as the cut-points that maximized the ROC area, that gave a sensitivity of 95%, that gave a specificity of 95%, and that generated fewer than 1% false negatives. The area under the curve (AUC) of the ROC curve, specificity/sensitivity, and diagnostic odds ratios were calculated for each of these cut-points.
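As an illustration of how such cut-points can be derived from a continuous score, the sketch below uses the standard definitions: the Youden index is sensitivity + specificity − 1, the Liu index is the product sensitivity × specificity, and the diagnostic odds ratio is sens × spec / [(1 − sens)(1 − spec)]. The function names and the toy data are hypothetical and are not taken from the study. For a single dichotomized rule, the area under the two-segment ROC curve is (sensitivity + specificity) / 2, so maximizing it selects the same cut-point as the Youden index.

```python
def rates_at_cut(scores, labels, cut):
    """Return (sensitivity, specificity) when score >= cut is called MM."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cut)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cut)
    return tp / (tp + fn), tn / (tn + fp)


def youden_index(sens, spec):
    return sens + spec - 1.0


def liu_index(sens, spec):
    return sens * spec


def diagnostic_odds_ratio(sens, spec):
    """Return the DOR, or None when it is undefined (sens or spec equal to 1)."""
    denom = (1.0 - sens) * (1.0 - spec)
    return None if denom == 0 else (sens * spec) / denom


def best_cut(scores, labels, index):
    """Cut-point among the observed scores that maximizes the given index."""
    cuts = sorted(set(scores))
    return max(cuts, key=lambda c: index(*rates_at_cut(scores, labels, c)))


def cut_for_sensitivity(scores, labels, target):
    """Largest observed cut-point whose sensitivity is at least `target`."""
    cuts = sorted(set(scores))
    return max(c for c in cuts if rates_at_cut(scores, labels, c)[0] >= target)


# Toy data (hypothetical): 1 = MM, 0 = benign.
scores = [0.02, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.45, 0.60, 0.70, 0.85, 0.95]
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

youden_cut = best_cut(scores, labels, youden_index)
liu_cut = best_cut(scores, labels, liu_index)
high_sens_cut = cut_for_sensitivity(scores, labels, 0.95)
```

Fixing the sensitivity first and then reading off the specificity, as in `cut_for_sensitivity`, corresponds to the decision rules reported at 95% sensitivity.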
The ROC AUC is not a perfect assessment measure for diagnostic methods when the standard error of the estimator is quite different for the diagnostic alternatives (benign pigmented lesions vs MM), as is the case for DERM (see Figure 1 ) [ 36 ] . This issue was addressed by constructing the Lorenz curve (a mirror image of the ROC curve) with the associated Gini index [ 37 ] .
Figure 1 .
Level of confidence of Deep Ensemble for Recognition of Melanoma (DERM) algorithm by lesion type.
To compare the accuracy of DERM with that of current diagnostic practices, we decided to conduct a meta-analysis of studies of diagnostic accuracy for MM rather than have a limited panel of dermatologists conduct parallel assessments, as has been done in other studies [ 21 , 38 ] . We chose this approach because biopsy-based histopathology provides the gold standard for MM diagnosis, and a meta-analysis enables comparison to a variety of different clinician experiences and evaluation techniques. This analysis was not intended to be a systematic review, but the PRISMA guidelines were followed when appropriate.
A literature search was conducted for studies reporting diagnostic accuracy data of naked-eye clinical examination, with or without dermoscopy, compared with histologically confirmed diagnosis. MEDLINE (413), Web of Science (707), and EMBASE (322) were searched for the period from January 1, 1990, to September 30, 2017, using terms “accuracy pigmented lesions PLUS melanoma pigmented lesions PLUS detection,” “dermoscopy pigmented lesions PLUS melanoma pigmented lesions PLUS accuracy,” and “melanoma pigmented lesions PLUS diagnosis pigmented lesions PLUS primary care.” Studies included in previous systematic reviews were also included [ 2 , 15 , 39 – 41 ] . The PRISMA flow diagram is shown in Figure 2 . One author (M.P.) conducted the literature search and extracted counts of true negative; true positive; false negative; false positive; or estimates of sensitivity, specificity, number of lesions examined, and number of MM diagnoses confirmed by histology, from which the counts could be derived. The reports were also examined for information concerning physician experience (general vs specialist physician) and context of use (primary care, secondary care). A meta-analysis of these data was conducted. The Stata user-written packages METANDI [ 42 ] and MIDAS [ 43 ] were used, and a meta-regression was used to examine associations between diagnostic accuracy and year of study report, level of care, and expertise of the practitioner. Many of the dermoscopy studies reported multiple results for each lesion using different dermoscopic algorithms (eg, ABCD, 7-point checklist, etc. [ 44 ] ); all of these results were included in the dataset. Since this produces a clustered dataset, violating the statistical assumption of the independence of observations, we conducted a sensitivity analysis. Multiple datasets were generated in which 1 estimate only was randomly included for each study where there were multiple estimates. The results indicated that the initial estimates were not sensitive to the clustering (details of this analysis are not reported here).
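Two of the data-handling steps described above lend themselves to a short sketch: recovering the 2 × 2 counts from reported sensitivity, specificity, and totals, and drawing one estimate per study for the clustering sensitivity analysis. The analysis itself was run in Stata; the Python below is only an illustrative reconstruction under stated assumptions, and the function names, the rounding rule, and the input layout are hypothetical.

```python
import random


def counts_from_rates(sens, spec, n_lesions, n_melanoma):
    """Recover approximate 2x2 counts from reported rates and totals.

    Assumes counts are obtained by rounding to the nearest integer.
    """
    n_benign = n_lesions - n_melanoma
    tp = round(sens * n_melanoma)
    tn = round(spec * n_benign)
    return {"tp": tp, "fn": n_melanoma - tp, "tn": tn, "fp": n_benign - tn}


def one_estimate_per_study(estimates, seed):
    """Randomly keep a single estimate for each study.

    `estimates` is a list of (study_id, estimate) pairs; studies that report
    several dermoscopic algorithms contribute several pairs.
    """
    rng = random.Random(seed)
    by_study = {}
    for study_id, estimate in estimates:
        by_study.setdefault(study_id, []).append(estimate)
    return {study_id: rng.choice(ests) for study_id, ests in by_study.items()}


def resampled_datasets(estimates, n_datasets):
    """Generate several one-estimate-per-study datasets, one per seed."""
    return [one_estimate_per_study(estimates, seed) for seed in range(n_datasets)]
```

Each resampled dataset would then be passed to the same meta-analysis model, and the spread of the pooled estimates across datasets indicates how much the clustering matters.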
Figure 2 .
PRISMA flow diagram of publications searched for the meta-analysis.
All analysis was conducted by M.P. using the Stata statistical package (StataCorp. 2015. Stata Statistical Software: Release 15. College Station, TX: StataCorp LP).
Most of the data used to create the algorithm were based on anonymous, publicly available images, and an additional 672 anonymized dermoscopic lesion images were generously made available by clinical dermatologists. The meta-analysis data were derived from published papers that did not include individual patient data. There was no requirement for ethics approval, but the Ethics Committee of Royal Perth Hospital was informed of the study as a courtesy.
Histograms showing the distribution of the DERM value for MM and for benign lesions are shown in Figure 1 . The histograms show that the value does not follow a normal distribution and there is a different dispersion of data for the 2 types of lesion. DERM estimated the median level of confidence as 0.059 (interquartile range: 0.016–0.171) when the lesion was a benign pigmented lesion and 0.651 (interquartile range: 0.417–0.849) when the lesion was MM. The equality of the 2 medians was compared by Fisher exact test and found to be significantly different (P < 0.0001).
The empirical ROC curve analysis showed that DERM has a high level of accuracy with an AUC of 0.928 (95% confidence interval: 0.922–0.935) and an acceptable goodness of fit (χ² = 6,078, P = 0.98) ( Figure 3 ). The Lorenz curve analysis gave a Gini index of 0.857. The Gini index has an upper limit of 1 and the high value is indicative of high inequality between MM and benign lesions, which supports the ROC analysis.
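The link between the two summary measures can be illustrated with a short sketch. Under the common convention in which the Gini index is twice the area between the ROC curve and the chance diagonal, Gini = 2 × AUC − 1; with the reported AUC of 0.928 this gives approximately 0.856, which agrees with the reported 0.857 to within rounding. Whether the published index was computed in exactly this way is an assumption. The bootstrap shown is a generic percentile bootstrap that resamples MM and benign scores separately, and is not necessarily the procedure of Pepe [ 32 ] ; all function names are hypothetical.

```python
import random


def empirical_auc(pos, neg):
    """Mann-Whitney estimate: P(pos score > neg score) + 0.5 * P(tie)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini_from_auc(auc):
    """Gini index as twice the area between the ROC curve and the diagonal."""
    return 2.0 * auc - 1.0


def bootstrap_auc_ci(pos, neg, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC, resampling within each class."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        p = [rng.choice(pos) for _ in pos]
        n = [rng.choice(neg) for _ in neg]
        stats.append(empirical_auc(p, n))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

The pairwise AUC computation is quadratic in the number of lesions, so a rank-based implementation would be preferable for a dataset of several thousand images.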
Figure 3 .
The receiver operating characteristic curve of Deep Ensemble for Recognition of Melanoma (DERM) results. Shaded area shows 95% confidence interval.
The Youden, Liu, and maximum AUC methods estimated the same optimum cut-point at a value of 0.272 (95% confidence interval: 0.232–0.313) ( Table 1 ). As the sensitivity increases, the expected loss of specificity occurs, but when the sensitivity is fixed at 95%, specificity is still 64%.
Table 1. Indices of Diagnostic Accuracy (±95% CI) at Different Cut-Points of the DERM Confidence Value
The summary of 82 studies that investigated the diagnostic accuracy of naked-eye examination (n = 29) or dermoscopy (n = 53) for pigmented lesions and MM between 1990 and 2017 is shown in Table 2 . A visual guide to the study accuracy is provided in the forest plots in Figures 4 and 5 . Table 3 shows the pooled and weighted values of sensitivity, specificity, and diagnostic odds ratio for the studies. The pooled results for all studies are as follows: AUC = 0.90, sensitivity = 85%, and specificity = 82%. The beta value (an indicator of asymmetry of the summary ROC curve) is statistically significant (β = 0.263, P = 0.022), indicating that the diagnostic odds ratio shows variation across the summary ROC curve. For naked-eye examination the pooled results are as follows: AUC = 0.88, sensitivity = 79%, specificity = 83%, β = 0.048, P = 0.81; and for dermoscopy the pooled results are as follows: AUC = 0.91, sensitivity = 86%, specificity = 81%, β = 0.397, P = 0.005.
Figure 4 .
Forest plot for naked-eye examination.
Figure 5 .
Forest plot for dermoscopy.
Table 2. Studies for Meta-analysis of Diagnostic Accuracy
Meta-regression showed no significant association between year of publication and diagnostic accuracy, assessed by the combination of sensitivity and specificity, for either naked-eye clinical examination (P = 0.25) or dermoscopy (P = 0.18). There was a significant difference between experts and nonexperts for both naked-eye clinical examination (P < 0.001) and dermoscopy (P < 0.001). This is reflected in the estimates shown in Table 3 , where experts have higher sensitivity and specificity than nonexperts; the difference is most marked for specificity with both methods, and for sensitivity with dermoscopy only ( Figure 6 ). The contrast in accuracy is most obvious for primary vs secondary care (P < 0.0001), with the AUC differing by 0.08 (0.83 vs 0.91) ( Figure 7 ). There was also no association between the AUC and year of study publication (P = 0.63), suggesting that diagnostic accuracy is not improving over time.
Figure 6 .
Summary receiver operating characteristic curves for naked eye and dermoscopic diagnosis overlaid with the Deep Ensemble for Recognition of Melanoma (DERM) sensitivity and specificity at cut-points from Table 1 (the shaded rectangle shows the summary point from the meta-analysis). AUC = area under the curve.
Figure 7 .
Summary receiver operating characteristic curves for primary and secondary care overlaid with the Deep Ensemble for Recognition of Melanoma (DERM) sensitivity and specificity at cut-points from Table 1 (the shaded rectangle shows the summary point from the meta-analysis).
Here we present an extensive evaluation of the ability of DERM to identify MM from dermoscopic images of skin lesions. This preliminary analysis demonstrates the ability of an AI-based system to learn features of a skin lesion that are associated with MM, which can then be applied to the identification of MM. We conducted a meta-analysis of MM diagnostic accuracy to generate comparative values from current primary care and specialist dermatologist practices. These results confirm that clinician experience and use of dermoscopy improve accuracy. DERM achieves an AUC of 0.93 and a sensitivity and specificity of 85% each when using the estimated optimum value of 0.28. This is higher than naked-eye visual assessment (0.88, 80% and 71%), and similar to findings for dermatologists with dermoscopy (0.91, 85% and 82%). This is illustrated by plotting a ROC curve of the data from studies in the meta-analysis, and superimposing the DERM data from 4 cut-points ( Figures 6 and 7 ).
A recent comprehensive series of Cochrane reviews concluded that visual inspection alone had a specificity of 42% at a fixed sensitivity of 80% and a sensitivity of 76% at a fixed specificity of 80%, whereas dermoscopy plus visual inspection had a specificity of 92% at a fixed sensitivity of 80% and a sensitivity of 82% at a fixed specificity of 80% [ 45 ] . Our meta-analysis showed, for visual inspection alone, a specificity of 83% when sensitivity was 80% and a sensitivity of 78% when specificity was 80%; and, for dermoscopy, a specificity of 86% when sensitivity was 80% and a sensitivity of 87% when specificity was 80%. DERM gave comparable indices of specificity of 89% at sensitivity of 80% and a sensitivity of 90% at specificity of 80%.
Strengths and Limitations
We trained our algorithm using archived images that have been published to train clinicians. It is likely that biases exist in the datasets (eg, patient demographics, MM subtypes, image capture methods), but it is very difficult to determine whether such biases exist and thus have been introduced into DERM during its development. In addition, it must be emphasized that the algorithm was trained predominantly using images of images rather than images created in a clinical setting. We are currently collecting such images during a clinical trial and plan to report the results in the near future.
By using postbiopsy histology as the gold standard for both DERM and the inclusion criteria for our meta-analysis, images of nonsuspicious lesions have not been included when training or evaluating DERM. We have therefore not shown the ability of DERM (or clinicians) to accurately classify nonsuspicious lesions, which could lead to verification bias as was observed by a study of cancer registry data during a prospective follow-up [ 46 ] . However, this bias will apply to both the evaluation of DERM and the meta-analysis results, so it seems unlikely that the comparison of the 2 would be affected, but it remains a possibility.
A strength of our study is that the meta-analysis of naked-eye examination and dermoscopy, the most common current diagnostic methods for MM used in primary care, is based on evaluation of 32,226 pigmented lesions, including 3,277 histopathology-confirmed MM.
Comparison With Existing Literature
Recently, 2 other groups who retooled versions of Google’s Inception network for the identification of melanoma showed accuracy equivalent to or better than that of a panel of dermatologists [ 22 , 23 ] . However, this approach is likely to generate issues such as overfitting (because of the small size of the review panel) and a lack of generalization (because of the selected nature of the voluntary reviewers).
A recent addition to the literature was the publication of an extensive systematic review by the Cochrane Collaboration skin group [ 45 ] . Four studies were conducted on melanoma diagnosis in adults by visual inspection, dermoscopy with and without visual inspection, reflectance confocal microscopy, and smartphone applications for triaging suspicious lesions. The dates of publication were slightly different from our study dates (up to August 2016 compared with September 2017), they searched more databases, and they did not limit themselves to histology-confirmed pathology as the diagnostic outcome but also included clinical follow-up of benign-appearing lesions, cancer registry follow-up, and “expert opinion with no histology or follow-up.” Despite these differences, the number of studies is very similar. We identified 108 studies (29 visual and 79 dermoscopy) and they identified 104 (24 visual and 86 dermoscopy).
Implications for Research and Practice
Using different cut-points at which DERM defines a lesion as MM, the sensitivity ranged from 85.0% to 98.6% and the specificity from 85.3% to 62.9%. The cut-points calculated by the Youden and Liu methods assume that false-negative and false-positive results have equal importance. This is not the case when dealing with a life-threatening disease such as MM, where a cut-point that maximizes sensitivity, and thus reduces the number of false-negative cases, should be adopted. However, this results in a higher false-positive rate, which has health care and patient costs associated with further investigations. The most appropriate cut-point for use in a clinical setting will need to be determined by consensus agreement, taking into account both clinical and economic factors, and is likely to be different for different clinical settings and levels of care.
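One way to make the unequal importance of false-negative and false-positive results explicit is to choose the cut-point that minimizes a weighted misclassification cost. The sketch below is illustrative only: the cost weights, the toy scores, and the function names are hypothetical, and no cost values were estimated in this study.

```python
def expected_cost(scores, labels, cut, cost_fn, cost_fp):
    """Average cost per lesion when score >= cut is called MM."""
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cut)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cut)
    return (cost_fn * fn + cost_fp * fp) / len(scores)


def min_cost_cut(scores, labels, cost_fn, cost_fp):
    """Observed cut-point with the lowest weighted misclassification cost."""
    cuts = sorted(set(scores))
    return min(cuts, key=lambda c: expected_cost(scores, labels, c, cost_fn, cost_fp))


# Toy data (hypothetical): 1 = MM, 0 = benign; one MM lesion has a low score.
scores = [0.02, 0.05, 0.10, 0.12, 0.15, 0.20, 0.25, 0.45, 0.60, 0.70, 0.85, 0.95]
labels = [0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1]

equal_cost_cut = min_cost_cut(scores, labels, cost_fn=1, cost_fp=1)
fn_averse_cut = min_cost_cut(scores, labels, cost_fn=10, cost_fp=1)
```

In this toy example, weighting a missed MM ten times as heavily as an unnecessary referral moves the selected cut-point from 0.60 down to 0.12, trading specificity for sensitivity.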
At high levels of sensitivity, DERM offers comparable specificity to dermatologists with dermatoscopes. DERM could therefore provide dermatologist-grade advice on likelihood of MM to general practitioners without the cost and training requirements of dermoscopy. While diagnostic accuracy plays a pivotal role in the clinical evaluation of diagnostic tests, it does not prove that the test improves outcomes in relevant patient populations or that it enhances health care quality, efficiency, and cost-effectiveness. The only way to truly determine a test’s utility in the real-life decision-making setting of clinics is by conducting prospective clinical trials. We are currently conducting clinical validation studies of DERM. To our knowledge, no other AI-based MM diagnostic test is undergoing such extensive clinical utility testing [ 23 , 46 , 47 ] .
Our study demonstrates the ability of an AI-based system to learn features of a skin lesion photograph that are associated with MM. DERM has the potential to be used in primary care to provide dermatologist-grade decision support. It is too early to say whether deployment of DERM would reduce onward referral, but such clinical validation is ongoing.
- Revised U.K. guidelines for the management of cutaneous melanoma 2010 Marsden JR, Newton-Bishop JA, Burrows L, et al. Br J Dermatol.2010;163(2):238-256.
- Screening for Skin Cancer in Adults: An Updated Systematic Evidence Review for the US Preventive Services Task Force [Internet] Wernli KJ, Henrikson NB, Morrison CC, Nguyen M, Pocobelli G, Whitlock EP. U.S. Preventive Services Task Force Evidence Syntheses, formerly Systematic Evidence Reviews.2016.
- Melanoma: Assessment and Management National Institute for Health and Care Excellence: Clinical Guidelines.2015.
- Skin biopsy rates and incidence of melanoma: population based ecological study Welch HG, Woloshin S, Schwartz LM. BMJ.2005;331(7515):481.
- Diagnosing and managing cutaneous pigmented lesions: primary care physicians versus dermatologists Chen SC, Pennie ML, Kolm P, et al. J Gen Intern Med.2006;21(7):678-682.
- Strategies for early melanoma detection: approaches to the patient with nevi Goodson AG, Grossman D. J Am Acad Dermatol.2009;60(5):719-735.
- Dermoscopy compared with naked eye examination for the diagnosis of primary melanoma: a meta-analysis of studies performed in a clinical setting Vestergaard ME, Macaskill P, Holt PE, Menzies SW. Br J Dermatol.2008;159(3):669-676.
- Evidence-based dermoscopy Menzies SW. Dermatol Clin.2013;31(4):vii-524.
- Diagnostic accuracy of dermatoscopy for melanocytic and nonmelanocytic pigmented lesions Rosendahl C, Tschandl P, Cameron A, Kittler H. J Am Acad Dermatol.2011;64(6):1068-1073.
- Dermoscopy for melanoma detection in family practice Herschorn A. Can Fam Physician.2012;58(7):e372-e378.
- Incidence of melanoma and keratinocytic carcinomas in patients evaluated by store-and-forward teledermatology vs. dermatology clinic Creighton-Smith M, Murgia RD, Konnikov N, Dornelles A, Garber C, Nguyen BT. Int J Dermatol.2017;56(10):1026-1031.
- Clinical indications for use of reflectance confocal microscopy for skin cancer diagnosis Borsari S, Pampena R, Lallas A, et al. JAMA Dermatol.2016;152(10):1093-1098.
- Comparison of dermoscopy and reflectance confocal microscopy for the diagnosis of malignant skin tumours: a meta-analysis Xiong YQ, Ma SJ, Mo Y, Huo ST, Wen YQ, Chen Q. J Cancer Res Clin Oncol.2017;143(9):1627-1635.
- Modern non-invasive diagnostic techniques in the detection of early cutaneous melanoma Kardynal A, Olszewska M. J Dermatol Case Rep.2014;8(1):1-8.
- Diagnostic accuracy of dermoscopy Kittler H, Pehamberger H, Wolff K, Binder M. Lancet Oncol.2002;3(3):159-165.
- Mobile applications in dermatology Brewer AC, Endly DC, Henley J, et al. JAMA Dermatol.2013;149(11):1300-1304.
- Smartphone applications for melanoma detection by community, patient and generalist clinician users: a review Kassianos AP, Emery JD, Murchie P, Walter FM. Br J Dermatol.2015;172(6):1507-1518.
- Skin scan: a demonstration of the need for FDA regulation of medical apps on iPhone Ferrero NA, Morrell DS, Burkhart CN. J Am Acad Dermatol.2013;68(3):515-516.
- Diagnostic inaccuracy of smartphone applications for melanoma detection: reply Wolfe JA, Ferris LK. JAMA Dermatol.2013;149(7):885.
- Diagnostic inaccuracy of smartphone applications for melanoma detection: representative lesion sets and the role for adjunctive technologies Stoecker WV, Rader RK, Halpern A. JAMA Dermatol.2013;149(7):884.
- Dermatologist-level classification of skin cancer with deep neural networks Esteva A, Kuprel B, Novoa RA, et al. Nature.2017;542(7639):115-118.
- Predicting the future—big data, machine learning, and clinical medicine Obermeyer Z, Emanuel EJ. N Engl J Med.2016;375(13):1216-1219.
- Implementing machine learning in health care—addressing ethical challenges Char DS, Shah NH. N Engl J Med.2018;378(11):981-983.
- Very deep convolutional networks Simonyan K, Zisserman A. 2015.
- Inception-v4, Inception-ResNet and the impact of residual connections on learning Szegedy C, Ioffe S, Vanhoucke V, Alemi A. 2016.
- U-Net: convolutional networks for biomedical image segmentation Ronneberger O, Fischer P, Brox T, Navab N, et al. MICCAI 2015, Part III, LNCS 9351.2015:234-241.
- Deep learning ensembles for melanoma recognition in dermoscopy images Codella N, Nguyen Q-B, Pankanti S, et al. IBM J Res Dev.2017;61(4/5):5:1-5:15.
- Fast and accurate deep network learning by exponential linear units (ELUs) Clevert D, Unterthiner T, Hochreiter S. 2016.
- PH2—a dermoscopic image database for research and benchmarking Mendonca T, Ferreira P, Marques J, Marcal A, Rozeira J.
- Interactive Atlas of Dermoscopy Argenziano G, Soyer P, De Giorgio V, et al. Milan, Italy: Edra Medical Publishing and New Media; 2000.
- ADDI Project 2012 PH2 Database.
- The Statistical Evaluation of Medical Tests for Classification and Prediction Pepe MS. Oxford: Oxford University Press; 2003.
- Assessing the incremental value of diagnostic and prognostic markers: a review and illustration Steyerberg EW, Pencina MJ, Lingsma HF, Kattan MW, Vickers AJ, Van Calster B. Eur J Clin Invest.2012;42(2):216-228.
- Index for rating diagnostic tests Youden WJ. Cancer.1950;3(1):32-35.
- Classification accuracy and cut point selection Liu X. Stat Med.2012;31(23):2676-2686.
- Probabilistic analysis of global performances of diagnostic tests: interpreting the Lorenz curve-based summary measures Lee WC. Stat Med.1999;18(4):455-471.
- Lognormal Lorenz and normal receiver operating characteristic curves as mirror images Irwin JR, Hautus MJ. R Soc Open Sci.2015;2(2):140280.
- Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists Haenssle HA, Fink C, Schneiderbauer R, et al. Ann Oncol.2018;29(8):1836-1842.
- Systematic review of dermoscopy and digital dermoscopy/artificial intelligence for the diagnosis of melanoma Rajpara SM, Botello AP, Townend J, Ormerod AD. Br J Dermatol.2009;161(3):591-604.
- Diagnosing malignant melanoma in ambulatory care: a systematic review of clinical prediction rules Harrington E, Clyne B, Wesseling N, et al. BMJ Open.2017;7(3):e014096.
- Is dermoscopy (epiluminescence microscopy) useful for the diagnosis of melanoma? Results of a meta-analysis using techniques adapted to the evaluation of diagnostic tests Bafounta ML, Beauchet A, Aegerter P, Saiag P. Arch Dermatol.2001;137(10):1343-1350.
- metandi: meta-analysis of diagnostic accuracy using hierarchical logistic regression Harbord RM, Whiting P. Stata Journal.2009;9(2):211-229.
- Stata module for meta-analytical integration of diagnostic test accuracy studies Dwamena B. Statistical Software Components S456880.
- Dermatoscopy: facts and controversies Lee JB, Hirokawa D. Clin Dermatol.2010;28(3):303-310.
- Dermoscopy, with and without visual inspection, for diagnosing melanoma in adults Dinnes J, Deeks JJ, Chuchu N, et al. Cochrane Database Syst Rev.2018;12:CD011901.
- Assessment of diagnostic tests when disease verification is subject to selection bias Begg C, Greenes R. Biometrics.1983;39(1):207-215.
- Sensitivity, specificity, and diagnostic accuracy of three dermoscopic algorithmic methods in the diagnosis of doubtful melanocytic lesions: the importance of light brown structureless areas in differentiating atypical melanocytic nevi from thin melanomas Annessi G, Bono R, Sampogna F, Faraggiana T, Abeni D. J Am Acad Dermatol.2007;56(5):759-767.
- Epiluminescence microscopy for the diagnosis of doubtful melanocytic skin lesions: comparison of the ABCD rule of dermatoscopy and a new 7-point checklist based on pattern analysis Argenziano G, Fabbrocini G, Carli P, De Giorgi V, Sammarco E, Delfino M. Arch Dermatol.1998;134(12):1563-1570.
- Dermoscopy improves accuracy of primary care physicians to triage lesions suggestive of skin cancer Argenziano G, Puig S, Zalaudek I, et al. J Clin Oncol.2006;24(12):1877-1882.
- Blue-black rule: a simple dermoscopic clue to recognize pigmented nodular melanoma Argenziano G, Longo C, Cameron A, et al. Br J Dermatol.2011;165(6):1251-1255.
- The role of spectrophotometry in the diagnosis of melanoma Ascierto PA, Palla M, Ayala F, et al. BMC Dermatol.2010;10:5.
- Computer-aided dermoscopy for diagnosis of melanoma Barzegari M, Ghaninezhad H, Mansoori P, Taheri A, Naraghi ZS, Asgari M. BMC Dermatol.2005;5:8.
- The dermoscopic versus the clinical diagnosis of melanoma Benelli C, Roscetti E, Dal Pozzo V, Gasparini G, Cavicchini S. Eur J Dermatol.1999;9(6):470-476.
- The dermoscopic (7FFM) versus the clinical (ABCDE) diagnosis of small diameter melanoma Benelli C, Roscetti E, Pozzo VD. Eur J Dermatol.2000;10(4):282-287.
- Epiluminescence microscopy: a useful tool for the diagnosis of pigmented skin lesions for formally trained dermatologists Binder M, Schwarz M, Winkler A, et al. Arch Dermatol.1995;131(3):286-291.
- Epiluminescence microscopy of small pigmented skin lesions: short-term formal training improves the diagnostic performance of dermatologists Binder M, Puespoeck-Schwarz M, Steiner A, et al. J Am Acad Dermatol.1997;36:197-202.
- Three-colour test in dermoscopy: a re-evaluation Blum A, Clemens J, Argenziano G. Br J Dermatol.2004;150(5):1040.
- Melanoma detection Bono A, Bartoli C, Cascinelli N, et al. Dermatology.2002;205(4):362-366.
- Micro-melanoma detection: a clinical study on 206 consecutive cases of pigmented skin lesions with a diameter ≤3 mm Bono A, Tolomio E, Trincone S, et al. Br J Dermatol.2006;155(3):570-573.
- Reliability and inter-observer agreement of dermoscopic diagnosis of melanoma and melanocytic naevi: Dermoscopy Panel Carli P, De Giorgi V, Naldi L, Dosi G. Eur J Cancer Prev.1998;7(5):397-402.
- Pattern analysis, not simplified algorithms, is the most reliable method for teaching dermoscopy for melanoma diagnosis to residents in dermatology Carli P, Quercioli E, Sestini S, et al. Br J Dermatol.2003;148(5):981-984.
- The problem of false-positive diagnosis in melanoma screening: the impact of dermoscopy Carli P, Mannone F, de Giorgi V, Nardini P, Chiarugi A, Giannotti B. Melanoma Res.2003;13(2):179-182.
- Dermatoscopy: usefulness in the differential diagnosis of cutaneous pigmentary lesions Cristofolini M, Zumiani G, Bauer P, Cristofolini P, Boi S, Micciolo R. Melanoma Res.1994;4(6):391-394.
- The seven features for melanoma: a new dermoscopic algorithm for the diagnosis of malignant melanoma Dal Pozzo V, Benelli C, Roscetti E. Eur J Dermatol.1999;9(4):303-308.
- Comparative performance of 4 dermoscopic algorithms by nonexperts for the diagnosis of melanocytic lesions Dolianitis C, Kelly J, Wolfe R, Simpson P. Arch Dermatol.2005;141(8):1008-1014.
- Computer versus human diagnosis of melanoma: evaluation of the feasibility of an automated diagnostic system in a prospective clinical trial Dreiseitl S, Binder M, Hable K, Kittler H. Melanoma Res.2009;19(3):180-184.
- Videomicroscopy in differential diagnosis of skin tumors and secondary prevention of malignant melanoma [in German] Dummer W, Doehnel KA, Remy W. Hautarzt.1993;44(12):772-776.
- The ABCD rule in dermatoscopy: analysis of 500 melanocytic lesions [in German] Feldmann R, Fellenz C, Gschnait F. Hautarzt.1998;49(6):473-476.
- Evaluation of a program for the automatic dermoscopic diagnosis of melanoma in a general dermatology setting Fueyo-Casado A, Vázquez-Lopez F, Sanchez-Martin J, Garcia-Garcia B, Pérez-Oliva N. Dermatol Surg.2009;35(2):257-262.
- Comparison of two dermoscopic techniques in the diagnosis of clinically atypical pigmented skin lesions and melanoma: seven-point and three-point checklists Gereli MC, Onsun N, Atilganoglu U, Demirkesen C. Int J Dermatol.2010;49(1):33-38.
- Spectrophotometric intracutaneous analysis versus dermoscopy for the diagnosis of pigmented skin lesions: prospective, double-blind study in a secondary reference centre Glud M, Gniadecki R, Drzewiecki KT. Melanoma Res.2009;19(3):176-179.
- Seven-point checklist for dermatoscopy: performance during 10 years of prospective surveillance of patients at increased melanoma risk Haenssle HA, Korpas B, Hansen-Hagge C, et al. J Am Acad Dermatol.2010;62(5):785-793.
- Electrical impedance scanning for melanoma diagnosis: a validation study Har-Shai Y, Glickman YA, Siller G, et al. Plast Reconstr Surg.2005;116(3):782-790.
- CASH algorithm for dermoscopy revisited Henning JS, Stein JA, Yeung J, et al. Arch Dermatol.2008;144(4):554-555.
- A study of the value of the seven-point checklist in distinguishing benign pigmented lesions from melanoma Keefe M, Dick DC, Waleel RA. Clin Exp Dermatol.1990;15(3):167-171.
- Dermatoscopy and high frequency sonography: two useful non-invasive methods to increase preoperative diagnostic accuracy in pigmented skin lesions Krähn G, Gottlöber P, Sander C, Peter RU. Pigment Cell Res.1998;11(3):151-154.
- Epiluminescent microscopy: a score of morphological features to identify malignant melanoma Kreusch J, Rassner G, Trahn C, Pietsch-Breitfeld B, Henke D, Selbmann HK. Pigment Cell Res.1992;:295-298.
- Clinical and dermatoscopic diagnosis of malignant melanoma: assessed by expert and non-expert groups Lorentzen H, Weismann K, Petersen CS, Larsen FG, Secher L, Skødt V. Acta Derm Venereol.1999;79(4):301-304.
- Comparison of dermatoscopic ABCD rule and risk stratification in the diagnosis of malignant melanoma Lorentzen H, Weismann K, Kenet RO, Secher L, Larsen FG. Acta Derm Venereol.2000;80(2):122-126.
- Laypersons’ sensitivity for melanoma identification is higher with dermoscopy images than clinical photographs Luttrell MJ, McClenahan P, Hofmann-Wellenhof R, Fink-Puches R, Soyer HP. Br J Dermatol.2012;167(5):1037-1041.
- The use of the dermatoscope to identify early melanoma using the three-colour test Mackie RM, Fleming C, McMahon AD, Jarrett P. Br J Dermatol.2002;146(3):481-484.
- Clinical predictors of malignant pigmented lesions: a comparison of the Glasgow seven-point checklist and the American Cancer Society’s ABCDs of pigmented lesions McGovern TW, Litaker MS. J Dermatol Surg Oncol.1992;18(1):22-26.
- Frequency and morphologic characteristics of invasive melanomas lacking specific surface microscopic features Menzies SW, Ingvar C, Crotty KA, McCarthy WH. Arch Dermatol.1996;132(10):1178-1182.
- Dermoscopic evaluation of amelanotic and hypomelanotic melanoma Menzies SW, Kreusch J, Byth K, et al. Arch Dermatol.2008;144(9):1120-1127.
- Dermoscopic evaluation of nodular melanoma Menzies SW, Moloney FJ, Byth K, et al. JAMA Dermatol.2013;149(6):699-709.
- The ABCD rule of dermatoscopy: high prospective value in the diagnosis of doubtful melanocytic skin lesions Nachbar F, Stolz W, Merkle T, et al. J Am Acad Dermatol.1994;30(4):551-559.
- Surface microscopy of naevi and melanomas—clues to melanoma Nilles M, Boedeker RH, Schill WB. Br J Dermatol.1994;130(3):349-355.
- Can automated dermoscopy image analysis instruments provide added benefit for the dermatologist? A study comparing the results of three systems Perrinaud A, Gaide O, French LE, Saurat J-H, Marghoob AA, Braun RP. Br J Dermatol.2007;157(5):926-933.
- Computer-automated ABCD versus dermatologists with different degrees of experience in dermoscopy Piccolo D, Crisman G, Schoinas S, Altamura D, Peris K. Eur J Dermatol.2014;24(4):477-481.
- Can early malignant melanoma be differentiated from atypical melanocytic nevi by in vivo techniques? Rao BK, Marghoob AA, Stolz W, et al. Skin Res Technol.1997;3(1):8-14.
- Limitations of dermoscopy in the recognition of melanoma Skvara H, Teban L, Fiebiger M, Binder M, Kittler H. Arch Dermatol.2005;141(2):155-160.
- Diagnostic reliability of dermoscopic criteria for detecting malignant melanoma Soyer HP, Smolle J, Leitinger G, Rieger E, Kerl H. Dermatology.1995;190(1):25-30.
- Three-point checklist of dermoscopy: a new screening method for early detection of melanoma Soyer HP, Argenziano G, Zalaudek I, et al. Dermatology.2004;208(1):27-31.
- A cancer-registry-assisted evaluation of the accuracy of digital epiluminescence microscopy associated with clinical examination of pigmented skin lesions Stanganelli I, Serafini M, Bucchi L. Dermatology.2000;200(1):11-16.
- Comparison of dermatoscopic diagnostic algorithms based on calculation: the ABCD rule of dermatoscopy, the seven-point checklist, the three-point checklist and the CASH algorithm in dermatoscopic evaluation of melanocytic lesions Unlu E, Akay BN, Erdem C. J Dermatol.2014;41(7):598-603.
- Using the 7-point checklist as a diagnostic aid for pigmented skin lesions in general practice: a diagnostic validation study Walter FM, Prevost AT, Vasconcelos J, et al. Br J Gen Pract.2013;63(610):e345-e353.
- Increase in the sensitivity for melanoma diagnosis by primary care physicians using skin surface microscopy Westerhoff K, McCarthy WH, Menzies SW. Br J Dermatol.2000;143(5):1016-1020.
- Three-point checklist of dermoscopy: an open internet study Zalaudek I, Argenziano G, Soyer HP, et al. Br J Dermatol.2006;154(3):431-437.
- Diagnosing skin cancer in primary care: how do mainstream general practitioners compare with primary care skin cancer clinic doctors? Youl PH, Baade PD, Janda M, Del Mar CB, Whiteman DC, Aitken JF. Med J Aust.2007;187(4):215-220.