Population Health Department, QIMR Berghofer Medical Research Institute, Herston, Queensland, Australia; Cancer Research UK Manchester Institute and Institute of Inflammation and Repair, University of Manchester, Manchester, UK
Basal cell and squamous cell carcinomas of the skin are the commonest cancers in humans, yet no validated tools exist to estimate future risks of developing keratinocyte carcinomas. To develop a prediction tool, we used baseline data from a prospective cohort study (n = 38,726) in Queensland, Australia, and used data linkage to capture all surgically excised keratinocyte carcinomas arising within the cohort. Predictive factors were identified through stepwise logistic regression models. In secondary analyses, we derived separate models within strata of prior skin cancer history, age, and sex. The primary model included terms for 10 items. Factors with the strongest effects were >20 prior skin cancers excised (odds ratio 8.57, 95% confidence interval [95% CI] 6.73–10.91), >50 skin lesions destroyed (odds ratio 3.37, 95% CI 2.85–3.99), age ≥ 70 years (odds ratio 3.47, 95% CI 2.53–4.77), and fair skin color (odds ratio 1.75, 95% CI 1.42–2.15). Discrimination in the validation dataset was high (area under the receiver operator characteristic curve 0.80, 95% CI 0.79–0.81) and the model appeared well calibrated. Among those reporting no prior history of skin cancer, a similar model with 10 factors predicted keratinocyte carcinoma events with reasonable discrimination (area under the receiver operator characteristic curve 0.72, 95% CI 0.70–0.75). Algorithms using self-reported patient data have high accuracy for predicting risks of keratinocyte carcinomas.
Keratinocyte carcinomas (KCs) (specifically, basal cell carcinomas (BCC) and squamous cell carcinomas (SCC) of the skin) are the most common cancers in humans. Although often regarded as unimportant cancers, KCs cause considerable morbidity. Each year, 1.9% of the US adult population is estimated to receive treatment for KC, rising to 6.9% of the population aged >65 years (
). Globally, the highest rates of BCC (>1000 × 10⁻⁵ person-years) and SCC (387 × 10⁻⁵ person-years) are observed in Australia, although KC incidence rates exceed 100 × 10⁻⁵ person-years in most fair-skinned populations around the world (
). Reducing hazardous exposure to solar ultraviolet radiation is accepted as the mainstay of primary prevention in the general population, but there is less certainty about how to deploy medical services for skin cancer control. At one end of the early detection spectrum is formal population-based screening. To date, only Germany has embarked upon a nationwide program of screening for skin cancer (
). In other jurisdictions, guidelines recommend that patients at high risk for skin cancer undergo periodic surveillance, with the implied but unstated advice that those at low risk receive usual care (
), reliable risk stratification tools are needed to identify patients most likely to benefit from clinical intervention. Presently, clinicians must rely on their own experience to estimate a patient’s future risk of skin cancer, because unlike for melanoma (
), no prediction tools for these cancers have been developed or validated. Thus, we aimed to develop and validate a risk prediction model for quantifying the probability of being treated for a KC, regardless of histological subtype. Our approach was to use self-reported information of the type that can be elicited remotely, so that patients might be assessed and triaged before consulting a physician.
The eligible cohort comprised 38,726 participants with no prior history of melanoma, of whom 56% were women and the mean age was 56.2 years (standard deviation 8.1). Most participants reported white European ancestry (n = 34,579, 93%) and most were born in Australia (30,054, 81%). Median follow-up was 36 months (min 20; max 44). Distributions of key factors in the derivation and validation datasets are presented in Table 1.
Table 1. Characteristics of study participants in derivation and validation datasets
There were 4,237 (16%) participants in the derivation dataset with at least one surgical excision for a confirmed BCC or SCC during the follow-up period. In univariate analyses, 23 items were significantly associated with the occurrence of KC (Supplementary Table S1 online). After conducting backward stepwise regression within the imputed derivation datasets (Supplementary Table S2 online), and then testing whether adding further candidate terms significantly improved the fit of the model, we derived a final model with terms for 10 predictors (age, sex, smoking status, ethnicity, skin color, tanning ability, freckling tendency, number of sunburns <10 years, number of previous skin cancers excised, and number of previous skin lesions destroyed; Table 2). None of the pairwise interaction terms was statistically significant at the P < 0.05 level, nor did they substantially modify the Akaike Information Criterion or the area under the receiver operator characteristic curve (AUROC); hence they were not retained in the final models. Strongest effects were observed for >20 prior skin cancers excised, >50 skin lesions destroyed, age >70 years, and fair skin color. Discrimination in the validation dataset was high (AUROC 0.80, 95% confidence interval 0.79–0.81; Figure 1). Although the model appeared well calibrated overall, there was some evidence that true risks were underestimated at the low end of the scale, and that risks were marginally overestimated in the highest categories (Figure 2).
Table 2. Specification and performance of models to predict risk of keratinocyte carcinoma within 3 years: full derivation dataset (N = 25,842)
Cumulative incidence plots demonstrated that the risk of KC events was strongly predicted by the self-reported history of prior skin cancer excisions (Figure 3). Thus, we derived separate models according to the absence or presence of a past skin cancer history. Among those in the derivation dataset with no prior history of skin cancer excisions (n = 16,021), we derived a predictive model with good discrimination (AUROC 0.72, 95% confidence interval 0.70–0.75; Supplementary Figures S1 and S2 online) that comprised 10 statistically significant terms, namely age, sex, ethnicity, skin color, tanning ability, freckling, sunburns < 10 years, sunburns > 20 years, number of previous skin lesions destroyed, and family history of melanoma (Table 3). Among those with a self-reported history of at least one prior skin cancer excision (n = 9,650), the most parsimonious prediction model comprised eight terms (namely number of previous skin cancers excised, number of previous skin lesions destroyed, age, sex, smoking status, skin color, freckling tendency, and sunburns < 10 years) (Table 3). Discrimination in the validation cohort was 0.72 (95% confidence interval 0.70–0.73) (Supplementary Figures S3 and S4 online).
Table 3. Specification and performance of models to predict risk of keratinocyte carcinoma within 3 years, by strata of self-reported prior skin cancer history (N = 25,671)
In supplementary analyses, we derived prediction models within strata of age and sex (Supplementary Table S3 online). Discrimination of the models in the validation datasets was similarly high across all models (AUROC approximately 0.80), although the terms included in the prediction models differed. Models restricted to older people (≥60 years) and women had fewer terms (seven each) than models restricted to younger people (10 terms) and men (11 terms). Across all four strata, terms were retained for age, ethnicity, skin color, numbers of skin cancers excised, and numbers of skin lesions destroyed.
Finally, to assess the utility of using the prediction model to guide clinical management, we calculated the proportion of patients in the source population that would be considered for clinical action (e.g., chemoprevention, surveillance program, etc.) at different risk score thresholds as predicted by the model (Table 4). Assuming a stringent scenario in which clinical action would be taken only for patients with a very high risk of developing skin cancer (e.g., 3-year risk score ≥0.7), then only approximately 1.3% of patients would be affected. Such a regimen would have extremely high specificity (99.5%) but very low sensitivity (5.4%), missing almost 95% of future cases of skin cancer. In contrast, selecting patients with 3-year risk score threshold ≥0.2 would permit 72.5% of patients to avoid clinical action, while yielding 64.5% of future cases with reasonable specificity (79.7%). The Youden Index was optimized (J = 0.46) at a predicted 3-year risk score of 0.13.
Table 4. Sensitivity and specificity of the risk prediction model and proportion of “actionable” patients at incremental thresholds of risk scores for keratinocyte carcinoma within 3 years
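The threshold figures quoted above can be checked with a short calculation. The sketch below assumes the standard definition of the Youden index, J = sensitivity + specificity − 1; the function name and the dictionary layout are illustrative choices, and only the two sensitivity and specificity pairs stated in the text are used.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# (sensitivity, specificity) pairs quoted in the text, keyed by
# the 3-year risk score threshold they refer to.
reported = {
    0.7: (0.054, 0.995),
    0.2: (0.645, 0.797),
}

j_by_threshold = {
    threshold: youden_index(se, sp) for threshold, (se, sp) in reported.items()
}

for threshold, j in sorted(j_by_threshold.items()):
    print(f"threshold {threshold:.2f}: J = {j:.3f}")
```

Both values (about 0.44 at a threshold of 0.2 and about 0.05 at a threshold of 0.7) fall below the reported optimum of J = 0.46 at a threshold of 0.13, as expected.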
We used data from a large cohort in the high incidence setting of Queensland, Australia, to develop a tool for predicting the 3-year risk of developing KC. We found that information on 10 items yielded a risk prediction index for KCs with very high discrimination (AUROC 0.80). This level of discrimination is among the highest reported in a validation dataset for a cancer prediction index. Although statistical techniques are not yet available for testing the calibration performance of prediction tools in very large datasets (
), visual inspection of the plots suggests that the predicted probabilities accord favorably with the observed risk of KC events. Overall, the strongest predictors of future risk were a past history of an excised skin cancer or a destroyed actinic skin lesion, and age. Smoking status and a number of pigmentary factors including skin color, tanning ability, and freckling tendency also independently and significantly improved the fit of the model. In separate analyses restricted to participants with no self-reported skin cancer history, the predictive factors were largely the same, although smoking status was dropped from that model and three other factors (sunburns in childhood, sunburns in adulthood, and family history of melanoma) were retained. Interestingly, apart from sunburns, measures of past or current sun exposure were not retained in any models, despite their strong associations with KC events in univariate analyses. Several nonexclusive explanations are likely. First, a past history of skin cancer is essentially a proxy measure for high-level cumulative sun exposure. Secondly, in populations exposed to high levels of ambient solar ultraviolet radiation (such as Australia or southern USA), the role of pigmentation factors in determining risk is likely accentuated because these factors strongly determine the actual dose of ultraviolet radiation received at the level of the target cell. Thirdly, because phenotypic measures are reported with higher reproducibility than sun exposure (both in this sample (
)), they are less prone to misclassification and thus, on average, more likely to discriminate cases from controls.
The models described here are conceived to be clinical aids for stratifying patients according to their probability of developing skin cancer requiring surgical excision. As such, we used self-reported data of the type that can be collected remotely before consultation, and we did not discriminate between the development of BCC or SCC because initial management is similar for both. In addition to the pragmatic design, other strengths include the very large sample size (allowing separate datasets for deriving and validating models) and the use of a validated survey instrument for collecting baseline data that was specifically intended to assess skin cancer risk factors. We used health administration data ensuring virtually complete follow-up for all such events within the cohort. Moreover, we have shown that the eight items for excision of histologically confirmed KC have very high concordance (approximately 97%) with histopathologic diagnoses obtained independently from pathology records (
). We consider it unlikely that any controls would be misclassified as cases, although it is possible that some study participants who did develop invasive KCs during follow-up may have been misclassified as controls in these analyses. This could happen if KCs were treated destructively or were removed by simple biopsies that did not attract one of the eight billing codes reserved for excisions of histologically confirmed KCs. The magnitude of bias of this type is difficult to estimate, although we note that approximately 15% of controls had at least one skin biopsy during follow-up. Such bias is most likely conservative because it would tend to diminish the differences between cases and controls, and thereby reduce the discrimination of the model. A potential limitation of the present analysis was the absence of histology data with which to separately predict risks of BCC and SCC. We aim to explore histology-specific prediction tools when such data become available.
Although population screening for melanoma and KCs was introduced in Germany in 2008, recent reports suggest that the program has not delivered the anticipated benefits (
). Based on these findings, and the limited likelihood that any randomized trials for skin cancer screening will be conducted in the foreseeable future, there is growing interest in developing prediction tools to identify patients who may benefit from clinical interventions such as chemoprevention (
). An additional benefit of such tools, especially in high incidence populations or in settings where access to specialist care may be rationed, is to be able to triage people at low risk of disease back to their routine care providers. Prediction models for cancers of the breast (
) have been developed, but we are not aware of any published models that predict risk of KCs in the general population.
We therefore developed a prototype web-based application (see www.qskin.qimrberghofer.edu.au) to calculate a personal risk score. From a drop-down menu, a person selects the most appropriate response for each of the 10 items that were found to significantly predict risk of KC. The algorithm sums the beta-coefficients of the selected response items to generate a risk score, and then determines where that person’s score lies relative to the distribution of all risk scores in the QSkin cohort. For clinical utility, and to avoid perceptions of spurious precision, the tool reports a risk category rather than the actual score, as follows: (i) very much below average risk (bottom 20% of the risk distribution), (ii) below average risk (21st to 40th percentile of the risk distribution), (iii) about average risk (41st to 60th percentile of the risk distribution), (iv) above average risk (61st to 80th percentile of the risk distribution), (v) very much above average risk (top 20% of the risk distribution). Although the subsequent management of each patient will depend on his or her own particular circumstances, we believe this tool will aid clinicians and their patients in quantifying risk and deciding on an appropriate course of action. Although the external validity of this prediction tool remains to be determined, especially in settings where the incidence of KCs is lower than in Australia, it is notable that the factors conferring the greatest contribution to the risk score were not specific to this particular population. Moreover, the items to measure the factors were scaled across the full ranges of exposure, and are readily elicited by self-report either in a clinical encounter or remotely. In our future research, we aim to assess the impact of applying these prediction tools in diverse clinical settings.
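The scoring procedure described above (sum the beta-coefficients of the selected responses, locate the score within the cohort distribution, report a quintile category) can be sketched as follows. This is a minimal illustration only: the coefficients, item names, response labels, and toy cohort are hypothetical placeholders, not the published values, and the percentile is computed as the share of cohort scores at or below the individual's score, which is one of several reasonable conventions.

```python
from bisect import bisect_right

# Hypothetical beta-coefficients (log odds ratios) for three illustrative
# items. These are placeholders, NOT the coefficients of the published model.
HYPOTHETICAL_BETAS = {
    "age": {"40-49": 0.0, "50-59": 0.5, "60-69": 0.9},
    "skin_color": {"olive": 0.0, "medium": 0.3, "fair": 0.6},
    "prior_excisions": {"0": 0.0, "1-5": 0.8, "6+": 1.5},
}

# Category labels as listed in the text, from lowest to highest quintile.
CATEGORIES = [
    "very much below average risk",
    "below average risk",
    "about average risk",
    "above average risk",
    "very much above average risk",
]


def risk_score(responses):
    """Sum the coefficients of the selected response for each item."""
    return sum(HYPOTHETICAL_BETAS[item][answer] for item, answer in responses.items())


def risk_category(score, cohort_scores):
    """Map a score to a quintile category of the cohort score distribution."""
    ordered = sorted(cohort_scores)
    # Percentile rank: share of cohort scores at or below this score.
    percentile = 100.0 * bisect_right(ordered, score) / len(ordered)
    # (0, 20] -> 0, (20, 40] -> 1, ..., (80, 100] -> 4; clamp the extremes.
    index = min(4, max(0, int((percentile - 1e-9) // 20)))
    return CATEGORIES[index]


# Toy cohort distribution (hypothetical): scores 0.0, 0.1, ..., 3.0.
toy_cohort = [i / 10 for i in range(31)]

low = risk_score({"age": "40-49", "skin_color": "olive", "prior_excisions": "0"})
mid = risk_score({"age": "50-59", "skin_color": "medium", "prior_excisions": "1-5"})
high = risk_score({"age": "60-69", "skin_color": "fair", "prior_excisions": "6+"})

for label, score in [("low", low), ("mid", mid), ("high", high)]:
    print(label, round(score, 2), "->", risk_category(score, toy_cohort))
```

Reporting the category rather than the score mirrors the stated aim of avoiding spurious precision; the quintile boundaries themselves depend entirely on the reference cohort supplied.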
Material and Methods
The QSkin Study comprises a cohort of 43,794 men and women aged 40–69 years randomly sampled from the population of Queensland, Australia, in 2011 (overall participation fraction 23%). Full details of recruitment and baseline characteristics of the cohort have been published previously (
). The Human Research Ethics Committee at the QIMR Berghofer Medical Research Institute approved the study, and all participants gave their informed and written consent to take part.
At baseline, participants were asked to self-report information about demographic items, ethnicity, general medical history, pigmentary characteristics, history of sun exposure, use of tanning beds and past history of treatments for skin cancer, and other skin lesions. The characteristics of the questionnaire have been published, including measures of repeatability for study survey items (
The primary outcome for this analysis was excision of a histologically confirmed BCC or SCC (“KC event”). To identify KC events in the cohort, we obtained administrative claims data from Medicare Australia for all medical services provided to all participants who gave express written consent for data linkage (n = 40,383) between the date of consent (from November 2010) and censor date (30 June 2014). Medicare is Australia’s universal health care system that subsidizes virtually all medical services outside of the public hospital system for citizens and permanent residents, regardless of age, private health insurance status, or other factors (
). Deterministic linkage to Medicare was conducted using Medicare number, name, address, and date of birth as identifiers. We defined KC cases as participants who received a medical service for one of eight item numbers (31255, 31260, 31265, 31270, 31275, 31280, 31285, 31290) for first surgical excision of BCC or SCC, for which the diagnosis must be confirmed by histology before the claim is lodged. We excluded 1,657 participants with confirmed pre-enrolment melanoma identified through record linkage to the population-based Queensland Cancer Registry; thus the final dataset for analysis comprised 38,726 participants.
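The case definition above amounts to a simple rule applied to each participant's claims history. A minimal sketch follows, assuming claims are available as (item number, service date) pairs; that record layout, the function name, and the example claims are hypothetical, while the eight item numbers and the censor date are those given in the text.

```python
from datetime import date

# The eight item numbers for first surgical excision of a histologically
# confirmed BCC or SCC, as listed in the text.
KC_EXCISION_ITEMS = frozenset(
    {31255, 31260, 31265, 31270, 31275, 31280, 31285, 31290}
)

CENSOR_DATE = date(2014, 6, 30)  # censor date stated in the text


def first_kc_event(claims, consent_date, censor_date=CENSOR_DATE):
    """Return the date of the first qualifying excision, or None for a control.

    `claims` is an iterable of (item_number, service_date) pairs; this
    layout is an assumption made for illustration.
    """
    dates = [
        service_date
        for item_number, service_date in claims
        if item_number in KC_EXCISION_ITEMS
        and consent_date <= service_date <= censor_date
    ]
    return min(dates) if dates else None


# Hypothetical participant: 99999 stands in for any non-qualifying item.
example_claims = [
    (99999, date(2011, 3, 1)),
    (31265, date(2012, 8, 14)),
    (31255, date(2013, 1, 9)),
    (31270, date(2014, 9, 2)),  # after the censor date, so ignored
]
event = first_kc_event(example_claims, consent_date=date(2011, 1, 15))
print(event)
```

A participant with no qualifying claim in the follow-up window returns `None` and is treated as a control in this sketch.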
Candidate predictor variables
Candidate predictor variables were selected a priori from the literature and practitioner input and included terms for demographic characteristics including age at enrollment, sex, place of birth, ethnicity, and place of residence as a child. We tested numerous self-reported measures of phenotype (including unexposed skin color, skin burning tendency, skin tanning tendency, eye color, hair color, categorical freckling density on the face at age 21, and categorical nevus burden at age 21), self-reported measures of sun exposure (including number of sunburns at ages <10 years, 10–20 years, and >20 years, numbers of hours spent outdoors on weekdays and weekend days in the past year), and frequency of use of tanning beds. Terms related to prior medical history included self-reported numbers of skin cancers excised surgically, self-reported numbers of skin lesions treated destructively, history of melanoma in close blood relatives, and frequency of aspirin use in the past year (see Supplementary Table S1).
Imputation of missing data
Missing values for most candidate predictor items occurred at a prevalence of <1%; missingness was highest for educational attainment (7%). To avoid potential bias, we imputed missing values for candidate predictors using the fully conditional specification method in PROC MI in SAS v9.4 (SAS Institute, Cary, NC), under the assumption that data were missing at random. We included all predictor variables and the outcome variable in the imputation step, specifying logistic regression to impute ordinal variables and linear regression for continuous variables. Imputation was run over five imputation cycles to generate five imputed datasets. We then performed the backward stepwise regression analyses on the five imputed datasets and combined the regression coefficients of the retained predictor variables using a modified version of “Rubin’s rules” suitable for categorical variables (
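For orientation, the pooling step can be illustrated with the standard scalar form of Rubin's rules. This is not the modified version for categorical variables used in the analysis, which is not specified here; the sketch only shows how a single coefficient and its variance are combined across m imputed datasets, and the numerical inputs are invented for illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across imputed datasets (standard Rubin's rules).

    estimates: the coefficient estimated in each imputed dataset
    variances: the squared standard error from each imputed dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    pooled = mean(estimates)         # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Invented values for one coefficient across five imputed datasets.
betas = [0.50, 0.54, 0.52, 0.48, 0.56]
squared_ses = [0.010, 0.010, 0.010, 0.010, 0.010]

pooled_beta, total_var = pool_rubin(betas, squared_ses)
print(round(pooled_beta, 4), round(total_var, 4))
```

The (1 + 1/m) factor inflates the between-imputation component to account for using a finite number of imputations, here m = 5.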
Our overall approach was to derive risk prediction models in a randomly selected sample of the dataset (two-thirds subset n = 25,842; hereafter “derivation sample”), and then to test the performance of models in the remaining sample (“validation sample,” n = 12,884; see Figure 1). We compared KC cases and controls using χ² tests for categorical variables.
Risk model development
We used logistic regression for analysis, because the pragmatic consideration in the clinical setting is to determine the probability that a person with a given set of characteristics will require a KC excision within the next 3 years (i.e., become a “case”), or not. Although the logistic regression approach does not accommodate an analysis of competing risks of death, this is very unlikely to bias the estimates in our models because there were very few deaths during follow-up (>98.5% survival at 3 years). The very low rate of deaths was similar among cases and controls in both the development dataset and the validation dataset (development dataset: 1.49% vs. 1.29% in cases and controls, respectively, P = 0.30; validation dataset: 0.99% vs. 1.08% in cases and controls, respectively, P = 0.74). In the first phase, we included in the multivariate model those variables that were statistically significantly associated with KC at the 0.05 level in univariate analyses. We performed backward stepwise regression whereby those factors losing their significance at the 0.1 level in the multivariate analysis were dropped. In the second phase, those factors not significant in the univariate analyses were subsequently fitted to the multivariate model to identify any effects detectable only after adjusting for major risk factors.
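For reference, the general form of the logistic regression model described above can be written as follows. The notation is introduced here for illustration and is not taken from the analysis itself: p is the probability of a KC excision within 3 years, x_1, ..., x_k are indicator variables for the response categories of the retained predictors, and β_0, ..., β_k are the fitted coefficients.

```latex
\[
  \log\frac{p}{1-p} \;=\; \beta_0 + \sum_{j=1}^{k} \beta_j x_j ,
  \qquad
  p \;=\; \frac{1}{1 + \exp\!\Bigl\{-\bigl(\beta_0 + \sum_{j=1}^{k} \beta_j x_j\bigr)\Bigr\}} ,
  \qquad
  \mathrm{OR}_j \;=\; e^{\beta_j} .
\]
```

Under this form, each reported odds ratio corresponds to a coefficient on the log-odds scale; for example, an odds ratio of 8.57 corresponds to β = ln(8.57) ≈ 2.15.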
Our analysis protocol stipulated that we were to perform analyses stratified by prior skin cancer history (yes vs. no), sex, and age (<60 years vs. 60+ years), on the a priori assumption that these factors would likely interact with predictor variables. Because of the substantial computational resources required to test all possible pairwise combinations of predictive factors, as well as the implications for multiple testing and the very low likelihood of meaningful gains in predictive value, we tested only for potential interactions between terms considered plausible and clinically relevant (age*sex; age*sunburns as a child; age*number of skin cancers excised; age*number of skin lesions destroyed; skin color*sunburns as a child; skin color*number of skin lesions destroyed). We included each pairwise interaction term in the primary analysis models and re-ran the logistic regression in the development dataset.
We assessed the performance of derived prediction models using tests for discrimination and calibration. We evaluated discrimination using the AUROC (also known as the c-statistic) and its 95% confidence interval. The AUROC can be interpreted as the probability that the model will assign a higher probability of developing KC to a randomly chosen participant who developed KC than to a randomly chosen participant who did not develop KC during follow-up. An AUROC of 0.5 indicates that the model discriminates no better than chance, whereas an AUROC of 1.0 indicates a perfectly discriminating model. We also assessed calibration, which compares the observed proportions of KC cases versus controls within equally sized groups categorized according to their predicted probability from the model. When the average predicted risk within specified categories matches the proportion observed, the model is well calibrated. We plotted bootstrap-corrected calibration curves (averaged over 500 replications) to illustrate the model’s fit across the range of predicted risk for KC compared with the observed outcome. We did not use the Hosmer-Lemeshow goodness-of-fit statistic because this measure is known to be overly sensitive with large sample sizes and, as yet, no correction algorithms have been validated for samples >25,000 (
). We calculated the Youden index (J) for all data points in the sample to identify the “optimal” cutoff point at which both sensitivity and specificity are maximized.
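The pairwise interpretation of the AUROC given above can be made concrete with a small calculation. The sketch below computes the statistic directly from its definition as the proportion of case-control pairs in which the case receives the higher predicted probability, counting ties as one half; the function name and the predicted probabilities are invented for illustration, and this brute-force approach is not the routine used in the analysis.

```python
def auroc(case_probs, control_probs):
    """AUROC as the proportion of concordant case-control pairs.

    A pair is concordant when the case has the higher predicted
    probability; tied pairs contribute one half.
    """
    concordant = 0.0
    for p_case in case_probs:
        for p_control in control_probs:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5
    return concordant / (len(case_probs) * len(control_probs))


# Invented predicted probabilities for three cases and four controls.
cases = [0.9, 0.6, 0.4]
controls = [0.5, 0.3, 0.2, 0.1]

# 11 of the 12 pairs are concordant (only 0.4 vs 0.5 is discordant).
print(round(auroc(cases, controls), 4))
```

With identical predictions for everyone, every pair is tied and the function returns 0.5, matching the "no better than chance" interpretation in the text.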
In secondary analyses, we sought to establish whether predictive factors differed according to the past history of excision of skin cancer (self-reported), sex, and age (<60 years, ≥60 years), and so we repeated the process within strata of these variables. We excluded from the analyses stratified by self-reported prior history of skin cancer those participants with missing data for that item (n = 171, 0.6%). We used multiple imputation to complete missing values for all other variables.
Statistical computations were performed using SAS software (version 9.4; SAS Institute), and all tests for statistical significance were two-sided at α = 0.05. Calibration was conducted using the rms package in R v3.0.1.
An online version of the KC risk prediction calculator will be made available on our study website (www.qskin.qimrberghofer.edu.au) after publication of the article.
We acknowledge the assistance of Medicare Australia for data linkage and the Australian Electoral Commission for establishing the initial sampling frame. We also acknowledge the contributions of the research assistants who helped to collate the data and manage the process, and we would like to thank the many thousands of Queensland residents who have participated in the QSkin Study (QSkin Investigators: David C. Whiteman, MBBS, PhD; Adele C. Green, MBBS, PhD; Catherine M. Olsen, PhD; Rachel E. Neale, B Vet Sc, PhD; Penelope M. Webb, D Phil. QSkin Study Team: Lea M. Jackman, Barbara A. Ranieri, Bridie S. Thompson, Rebekah A. Cicero). This work was supported by program grants from the National Health and Medical Research Council (NHMRC) of Australia (grant numbers 1073898 and 552429). DCW and REN are supported by NHMRC Research Fellowships. The funding body had no role in the design and conduct of the study, the collection, management, analysis, and interpretation of the data, or the preparation, review, or approval of the manuscript.
DCW was responsible for conception and design, interpretation of data, and drafting and revising the manuscript critically for important intellectual and statistical content. BST and APT conducted the statistical analyses and contributed to interpretation of data and drafting and reviewing the manuscript. M-CH and CM contributed to the statistical analyses. ACG, REN, and CMO contributed to the conception and design, interpretation of data, and critical revision of the manuscript for important intellectual content. All authors read and approved the final version of the manuscript.