If you don't remember your password, you can reset it by entering your email address and clicking the Reset Password button. You will then receive an email that contains a secure link for resetting your password
If the address matches a valid account an email will be sent to __email__ with instructions for resetting your password
Biologic therapies have shown high efficacy in psoriasis, but individual response varies and is poorly understood. To inform biomarker discovery in the Psoriasis Stratification to Optimise Relevant Therapy (i.e., PSORT) study, we evaluated a comprehensive array of omics platforms across three time points and multiple tissues in a pilot investigation of 10 patients with severe psoriasis, treated with the tumor necrosis factor (TNF) inhibitor, etanercept. We used RNA sequencing to analyze mRNA and small RNA transcriptome in blood, lesional and nonlesional skin, and the SOMAscan platform to investigate the serum proteome. Using an integrative systems biology approach, we identified signals of treatment response in genes and pathways associated with TNF signaling, psoriasis pathology, and the major histocompatibility complex region. We found association between clinical response and TNF-regulated genes in blood and skin. Using a combination of differential expression testing, upstream regulator analysis, clustering techniques, and predictive modeling, we show that baseline samples are indicative of patient response to biologic therapies, including signals in blood, which have traditionally been considered unreliable for inference in dermatology. In conclusion, our pilot study provides both an analytical framework and empirical basis to estimate power for larger studies, specifically the ongoing PSORT study, which we show as powered for biomarker discovery and patient stratification.
The introduction of biologic therapies into clinical practice has led to major improvements for patients with severe psoriasis. However, optimal, cost-effective provision of these therapies in a resource-limited health care system will necessitate a stratified approach (
Translational research has been revolutionized by the availability of technologies to measure features of the genome, transcriptome, and proteome (so called “omics”), primarily facilitated by high-throughput sequencing (formerly next-generation sequencing). It is likely that data generated by these new technologies will inaugurate an era of stratified care founded on comprehensive cellular profiles, rather than individual biomarker molecules (
). Well-designed and adequately powered prospective studies are required to identify clinically robust biomarkers. Psoriasis Stratification to Optimise Relevant Therapy (PSORT) is an academic-industrial UK stratified medicine consortium funded by the Medical Research Council and devoted to developing a stratifier of response prediction to biologics, scalable for clinical use for those with moderate to severe disease (
To inform the analytical strategy of PSORT, we conducted a pilot study to evaluate response to a biologic in psoriasis patients using lesional (PP) and non-lesional (PN) skin and blood and a range of omic platforms and different analysis pipelines. In addition to comparing the performance of each platform and tissue, we used our preliminary data to obtain an empirical estimate of the required sample size to adequately power the full PSORT study. Here we report the results of a comprehensive multi-omic pilot study (Figure 1), including RNA sequencing (RNA-seq) of mRNA PP and PN skin, RNA-seq from mRNA and from micro RNA (miRNA) from blood, and SOMAscan (SomaLogic, Boulder, CA) proteomic data from blood.
Our tightly phenotyped, rigorously controlled cohort of patients with severe chronic plaque psoriasis commenced biologic therapy with the tumor necrosis factor inhibitor (TNFi) etanercept. We evaluated the relative merits of each platform and show a workflow for scaled use on large data sets. We provide not only open data but open access to our complete analysis scripts and a fully executable R Markdown document for colleagues to evaluate and exactly reproduce the workflow themselves (
). Multi-omic analysis is a very resource-intensive process, particularly with the breadth of approaches described here, which are beyond the resources of most projects. We use this pilot study to comment on the relative merits of multi-omic approaches and highlight platforms that show particular promise in predicting response to therapy.
Patient characteristics and analysis of clinical response
Ten patients commencing etanercept therapy were recruited from a prospective clinical observational study entitled “Pharmacogenomic Signatures of Treatment Response in Psoriasis.” Figure 1 provides an overview of the study, and patient characteristics for included participants are shown in Table 1. Participants were assessed at baseline, week 1, and week 12 of therapy, and response to therapy was determined using the Psoriasis Area and Severity Index (PASI), with response defined as a reduction of PASI by at least 75% from baseline, and nonresponse defined as failure to achieve a reduction of at least 50% from baseline. Supplementary Figure S1 online shows a scatterplot of PASI at baseline versus PASI at week 12.
Table 1Summary of Clinical Characteristics of Included Participants
Patients (n = 10)
Age in years, mean
F 2, M 8
Weight in kg, mean ± SD
94.3 ± 17.7
BMI in kg/m2, mean ± SD
30.6 ± 5.5
Age at onset of psoriasis in years, mean ± SD
17 ± 11
Baseline PASI, mean ± SD
20.3 ± 8.8
PASI at week 12, mean ± SD
6.8 ± 3.9
Baseline DLQI, mean ± SD
20.1 ± 9.3
DLQI at week 12, mean ± SD
4.5 ± 3.4
BMI, body mass index; DLQI, Dermatology Life Quality Index; F, female; M, male; PASI, Psoriasis Area and Severity Index; SD, standard deviation.
Using samples from each participant at each time-point (one biopsy sample per library), we performed RNA-seq on mRNA from PP skin (60 million paired reads/sample), PN skin (60 million paired reads/sample), and blood (30 million paired reads/sample). We additionally performed RNA-seq on miRNA from blood (10 million single reads/sample) and SOMAlogic proteomic assessment on serum samples. Because exploratory data analysis is a key first step in multi-omics, we first constructed a sample similarity matrix to compare mRNA transcriptome across tissues, by calculating the pairwise Euclidean distance between all mRNA samples (see Supplementary Figure S2). Samples were clearly separated by tissue, although less distinction was seen between PP and PN skin samples, in part reflecting strong intrasubject effects and treatment effects between baseline and 12 weeks. Next, we examined transcriptome structure on a tissue-by-tissue basis using two different projection methods. Principal component analysis showed clear separation between skin and blood along the first principal component, as expected (see Supplementary Figure S3a online). The second principal component separated PP from PN samples, albeit with one data point corresponding to a participant’s week 12 observation. This patient showed good response to therapy, suggesting putative detection of remission at the mRNA transcriptome level. Similar although less distinct tissue separation was seen by another projection method, t-distributed stochastic neighbor embedding (see Supplementary Figure S3b). Tissue-wise projection plots across all platforms were dominated by intrasubject signatures (see Supplementary Figures S4 and S5 online). These unsupervised methods do not appear to separate patients by treatment response, indicating that supervised techniques may be required to detect a response signal in these data.
Response differential expression analysis by platform
Differential expression analyses to investigate the effects of etanercept treatment over time were performed for each platform (mRNA-seq, miRNA-seq, and SOMAscan proteomic assessment) and tissue type (PP skin, PN skin, and blood) using a common limma analytical framework (
). Access to our complete analysis script and fully executable R Markdown (RStudio, Boston, MA) document allows reproduction of this workflow with evaluation of these results (see Materials and Methods section and the Supplementary Materials online). We imposed a 10% false discovery rate threshold. We selected this cutoff because power calculations suggest the modest sample size of the study will impede our sensitivity to detect differential expression. A 10% false discovery rate threshold is therefore likely to underestimate the true number of differentially expressed genes or proteins in our data set.
A summary of differential expression of mRNA, miRNA, and protein across time and across tissue types may be found in Figure 2, and all differentially expressed molecules are summarized in Supplementary Table S1 online. Heatmaps of the top 1% gene expression changes from PP skin, PN skin, and blood are shown in Supplementary Figure S6 online. The top 1% of genes cluster by response to treatment across PP and PN skin and blood. Similar results were seen with supervised and unsupervised cluster assignments.
Upstream regulator analysis
Acknowledging that our study is not powered for discovery, we used the Upstream Regulator Analysis function in Ingenuity Pathway Analysis (IPA) (Qiagen Silicon Valley, Redwood, CA) to evaluate upstream regulator signals at the systems level that may be responsible for the observed gene expression changes. Upstream regulators are defined as any molecule that can affect the expression of another molecule, including transcription factors, cytokines, miRNAs, and drugs. The activation state for each regulator was predicted based on global direction of changes in the differential expression analyses for previously published targets of this regulator. The predicted top 30 regulators across all tissues and time points are shown in a hierarchically clustered heatmap in Figure 3. Results show a range of proinflammatory signaling and drug pathways, including a highly conserved, pan-tissue TNF signature, strongest at baseline in blood and at week 1 in PP and PN skin, and substantially diminished at week 12 across all tissues. A similar pattern is also seen in Figure 3, hierarchically clustered with TNF in IFNα2 and IFNγ signaling, in addition to NF-κβ signaling. This is an interesting proof of concept of the ability to detect a biologic drug response at a systems level, which we will discuss further.
Baseline omic platform concordance
We performed supervised and unsupervised partitioning around medoids (i.e., PAM) clustering on baseline samples for each tissue across all platforms, relating the differentially expressed genes from each platform to response and informing where drivers of prediction to response have commonalities. Cross-platform concordance was evaluated using the mutual information between cluster assignments, indicating a wide range of concordance values among supervised clusters (Figure 4a). Lesional mRNA and blood mRNA concordance was highest at 0.88 bits.
Machine learning models
We built a series of random forest models to predict continuous response using baseline data from each tissue type and platform (Figure 4b). Predictive power was detected across platforms using this methodology, showing additional signal to the differential expression analyses. The proteomics assay, in which we found no significant differentially expressed proteins at baseline using traditional marginal techniques (i.e., looking at each feature separately) proved the most predictive platform for response when modeled using random forests; however, differences between data types were generally insignificant. The recursive feature elimination algorithm that we used for these models (see Methods) may provide an alternative approach to biomarker discovery, offering insight into omic signatures of response. Our top-performing model achieved a root-mean-square error of 0.123, which is just less than 75% of the standard deviation of our (winsorized) delta PASI distribution.
Power calculations for a prospective observational study
and parameters derived from this pilot study, we calculated the requisite sample size to achieve 90% power to detect differential expression associated with response. Using the pilot data presented here as a guide, we project that 17,000 genes are likely to pass a reasonable expression filter and that some 1% of these genes will prove prognostic in a sufficiently large cohort. The top 1% of genes in our baseline measures had an average read count of approximately 100 before normalization; a minimum log fold change of approximately 0.72 after modeling; and a global dispersion estimate of 0.137, as estimated by the empirical Bayes procedure of
. Imposing a 5% false discovery rate threshold and a target log fold change of 1.5, we found that a study would require 41 participants to achieve 90% power to identify transcriptomic markers of biologic response for patients with chronic plaque psoriasis. Relaxing the number of differentially expressed genes to 5%, we can maintain 90% power with 34 participants. We present power curves projected across an expected range of fold changes at 1% and 5% differential expression in Supplementary Figure S7 online.
These results were presented at the 66th annual Montagna Symposium on the Biology of Skin. In this study, we present a framework for multi-omic analysis of biologic response. Our results are transparent and fully reproducible via companion markdown documents. This makes our analysis framework suitable for larger studies of a similar nature, such as the PSORT program. We emphasize that this proof of concept study is not powered for discovery; however, our results do suggest that signals of response to therapy in patients with severe psoriasis treated with the TNFi etanercept may be systemically detectable in lesional skin, nonlesional skin, and blood at baseline, before commencement of therapy. Evidence of differential expression correlated with treatment response was observed across all tissue types and time points but differed across omic platforms.
The choice of the TNFi etanercept related to the timing of study design and the observed rates of etanercept response were within the range observed in studies of larger cohorts (
). Prior pharmacogenomic evaluations of patient cohorts have centered on the use of genetic or genomic techniques, predominantly using skin biopsy samples, although several studies have used skin and blood (
), with consideration of detection of response early in treatment. Although no prospective biomarkers have yet been validated in adequately powered cohorts, there has been substantial progress, with the creation of predictors or classifiers of response (
Our focus was RNA-seq technology as the criterion standard for gene expression profiling. RNA-seq provides counts of all the genes expressed in a sample, including miRNAs and other potentially important noncoding RNA species. Use of high-quality RNA inputs (RNA integrity number > 8) ensured high-quality libraries, which passed relatively stringent quality control thresholds.
We selected an RNA-seq platform to enable direct comparison with other open access research data and to data from a future larger validation cohort. RNA-seq is now becoming the platform of choice for transcriptome analysis, especially as costs of high-throughput sequencing techniques reduce over time. Our use of RNA-seq allowed for the same technique to be applied across evaluation of tissue types, directly comparing samples from lesional skin, nonlesional skin, and blood, in addition to proteomics assessment. We have evaluated a range of exploratory visualizations of our pilot data, which showed differing performance: for example, a comparison of principal component analysis and t-distributed stochastic neighbor embedding visualization of lesional versus nonlesional skin highlighted the former method’s greater sensitivity to local effects.
The individual genes identified in differential expression analyses were not further evaluated, because our study is not powered for discovery, and this approach has been comprehensively reported elsewhere (
). However, at a systems level, upstream regulator analysis (IPA) of differentially expressed genes associated with clinical response across tissues and time points indicated that changes in genes controlled by the target of the drug, TNF, were the most predictive of response. Although this might seem intuitive, previous reports have linked etanercept response to IL-17 signaling rather than TNF early response genes (
). In blood, in addition to TNF regulation, we also saw a strong interferon signature associated with response to etanercept, which has previously been reported in association with etanercept response in skin (
). Comparison of TNF and interferon signatures across time points in association with response also shows an interesting pattern, with strong signals in blood at baseline and in skin at 1 week, potentially indicating the genomic response to TNFi therapy (Figure 3). Concordance of baseline omic platforms in prediction of response showed the strongest association between lesional skin mRNA and blood mRNA. Few response-associated genes were seen in common across tissues and time points; all genes associated in more than one tissue were located in the major histocompatibility complex (see Supplementary Table S1 online). This correlates with previous genetic findings (
Although it is difficult to either identify or validate stable subgroups within small cohorts, we are confident that this approach will be more informative in a larger study, and preliminary evidence here suggests that blood biomarkers may be an informative and less invasive predictor of response. We used our data set to empirically inform a power calculation for the prospective study PSORT, where 80 participants per group are being recruited for assessment of adalimumab and ustekinumab. This shows that the PSORT study is adequately powered to detect moderate to large treatment effects in most scenarios.
We encourage researchers to access our data in ArrayExpress (accession numbers E-MTAB-6428, E-MTAB-6555 and E-MTAB-6556) and review our supplementary R Markdown documents on GitHub to learn more about our pipeline and to fully reproduce our results. Data sharing and open source analytics are the obvious solution to the reproducibility crisis that plagues clinical and omic research today and is becoming more commonplace in fields that are advancing stratified medicine (
Our study went beyond analysis of a single technology appraisal of treatment prediction in one cohort to provide a scalable framework for predictive and inferential analysis of multi-omic data for clinical dermatology. Despite our small sample size, we were able to detect consistent signals of differential expression and build machine-learning models that in adequately powered studies may offer complementary information to clinical factors in the prediction of outcome. We suggest that this ability to detect signals is in part due to the use of a single clinician for cohort ascertainment and sample processing, thereby minimizing clinical confounders and batch issues and allowing bioinformatics expertise to synergize with clinical research strategy from conception through analysis. These results have implications for ongoing studies. Our exploration has provided the framework for the generation of a large-scale omics data set from PSORT. The signals we have detected will be examined for validity using the same robust analytical pipeline in PSORT, which we show is substantially powered to detect true biomarkers of response to therapy. Likewise, as omics techniques are applied to other dermatological diseases such as atopic eczema (
Long-term management of moderate-to-severe atopic dermatitis with dupilumab and concomitant topical corticosteroids (LIBERTY AD CHRONOS): a 1-year, randomised, double-blinded, placebo-controlled, phase 3 trial.
), genomic approaches to personalization and stratification of therapies may have broad applicability.
Materials and Methods
Prospective observational study
Ten participants commencing etanercept therapy (50 mg by subcutaneous injection administered once weekly) were recruited (providing written, informed consent) to take part in a prospective clinical observational study entitled “Pharmacogenomic Signatures of Treatment Response in Psoriasis” (UK Research Ethics Committee reference 11/NW/0500; protocol available in the Supplementary Materials). Patients had a diagnosis of chronic plaque psoriasis of early-onset (≤40 years) disease, were white of European ancestry (to the third generation), and had not received prior systemic or biologic treatments in at least 2 weeks (or four half-lives of last treatment, whichever was longer). Of the 10 participants, 9 were naïve to biologic therapy. Patients completed detailed demographic questioning, including reporting information on comorbidities and concomitant medication. Disease severity and response to therapy were assessed using the PASI, Physician Global Assessment, and Dermatology Life Quality Index. Clinical samples, including blood and skin biopsy samples, were collected at baseline and at 1 week (after the second injection of etanercept), and 12 weeks of treatment. Adherence to therapy was assessed, including witnessed/administered injections at the initial visit, self-reporting of timings of injections between visits, and monitored drug levels at the final visit. The same physician and research nurse conducted all research visits (ACF and JH).
Executable R scripts and R Markdown documents are available to allow complete reproduction of our analysis workflow. All analyses were conducted in R, version 3.4.0 (R Core Team, Vienna, Austria).
Upstream regulator analysis
Functional analysis of systems-level upstream regulators responsible for observed differential gene expression related to response was performed using the Upstream Regulator function in IPA, using all genes with nominal response of P less than or equal to 0.05 as input. For all gene set enrichment analyses, a right-tailed Fisher exact test was used to calculate a pathway P-value determining the probability that each biological function assigned to that data set was due to chance alone. All enrichment scores were calculated in IPA using all transcripts that passed quality control as the background data set. Upstream regulator analysis is based on prior knowledge of expected effects between regulators and their known target genes according to the IPA database. The prediction of activation state is based on the global direction of changes of differentially expressed genes; a z-score is calculated and determines whether gene expression changes for known targets of each regulator are correlated with what is expected from the literature for an activation of this pathway. In this exploratory analysis, we emphasized power over type 1 error, using a nominal z-score threshold of z greater than 2 to indicate activation or z less than –2 to indicate inhibition. For definition of response and differential expression analyses, please see the Supplementary Materials and Methods.
Supervised and unsupervised clusters differ with respect to how genes were filtered across the two groupings. For our supervised analysis, we filtered out the bottom half of probes by association with biologic response, as determined by moderated t tests. With unsupervised clusters, we filtered by the leading fold change between each sample pair, as implemented in limma (
). However, given our limited sample size, we chose to fix k equal to 2, separating samples into two groups that would ideally correspond to responders and nonresponders. Cross-platform concordance was evaluated using the mutual information between cluster assignments, a dependency metric that ranges from 0 to 1 bit when k equals 2.
We built and evaluated a series of random forest models using continuous response measures to compare the predictive power associated with different platforms. To do so, we created a pipeline using tools from the caret package for classification and regression training (
Continuous models, designed to predict a patient’s percent change in PASI, were tuned using the root-mean-square error loss function, which is standard for linear regressions. Response was defined by a winsorized delta PASI distribution, as explained. We selected variables using the two-loop recursive feature elimination algorithm outlined in
. For each platform, we tested 20 different subsets of probes, with dimensionality determined by an exponential function so that relatively low-dimensional subsets of the feature space were explored more closely than high-dimensional subsets. Performance was evaluated using 10-fold cross-validation. Lower root-mean-square error values indicate more predictive models.
ACF has received educational support to attend conferences from or acted as a consultant or speaker for AbbVie, Almirall, Eli Lilly, Leo Pharma, Novartis, Pfizer, Janssen, and UCB. NJR reports grants from PSORT industrial partners as listed in manuscript during the conduct of the study; other research grants from AstraZeneca and Stiefel GSK; and other income to Newcastle University from Almiral, Amgen, Janssen and Novartis for lectures/attendance at advisory boards. CEMG has acted as a consultant and/or speaker for AbbVie, Almirall, Janssen, Novartis, Sandoz, Rock Creek Pharma, Pfizer, Eli Lilly, Sun Pharmaceuticals, UCB, Leo Pharma, Galderma, and Celgene. RBW has acted as a consultant and/or speaker for AbbVie, Amgen, Almirall, Boehringer, Medac, Eli Lilly, Janssen, Leo Pharma, Pfizer, Novartis, Sun Pharma, Valeant, Schering-Plough (now MSD), and Xenoport.
We would like to thank M. Kulesz-Martin and J. Tolar, Symposium Director and Program Chair of the Montagna Symposium on the Biology of Skin for their assistance. We acknowledge the support of the National Institute of Arthritis and Musculoskeletal and Skin Diseases and all co-funding support provided the National Institute on Aging in grant 5R13AR009431-52. We would like to thank the medical and nursing staff at Salford Royal NHS Foundation Trust, in particular J. Howe, and at Broadgreen Hospital Liverpool, in particular A. Al-Sharqi and B. Dever for their assistance in recruitment. We would like to thank Carole Todd, Newcastle University for her help in optimizing RNA extraction from skin, K. Maratou at GSK Stevenage for performance of RNA-seq (skin samples), and M. Hughes and A. Lucaci of the Centre for Genomic Research Liverpool for performance of RNA-seq (blood samples). We thank I. Strickland at MedImmune for his assistance. We thank M. Jani for performing week 12 serum etanercept levels. CEMG and MP are National Institute for Health Research (NIHR) Senior Investigators. SA is funded by DFG SFB 1036. NJR’s research/laboratory is funded in part by the Newcastle NIHR Biomedical Research Centre (BRC), Newcastle’s NIHR Medtech and In vitro diagnostics Co-operative and the Newcastle MRC/EPSRC Molecular Pathology Node. CEMG and RBW are funded in part by the Manchester NIHR BRC. MRB and DW were funded by the NIHR as part of the portfolio of translational research of the NIHR BRC at Barts and The London School of Medicine and Dentistry. This project was enabled through access to the MRC eMedLab Medical Bioinformatics infrastructure supported by the Medical Research Council (grant number MR/L016311/1). The investigators are partially funded by the PSORT Medical Research Council, grant MR/1011808/1.
Long-term management of moderate-to-severe atopic dermatitis with dupilumab and concomitant topical corticosteroids (LIBERTY AD CHRONOS): a 1-year, randomised, double-blinded, placebo-controlled, phase 3 trial.