Quantitative Evaluation of Biologic Therapy Options for Psoriasis: A Systematic Review and Network Meta-Analysis

Multiple biologic treatments are licensed for psoriasis. The lack of head-to-head randomized controlled trials makes choosing between them difficult for patients, clinicians, and guideline developers. To establish their relative efficacy and tolerability, we searched MEDLINE, PubMed, Embase, and Cochrane for randomized controlled trials of licensed biologic treatments for skin psoriasis. We performed a network meta-analysis to identify direct and indirect evidence comparing biologics with one another, methotrexate, or placebo. We combined this with hierarchical cluster analysis to consider multiple outcomes related to efficacy and tolerability in combination for each treatment. Study quality, heterogeneity, and inconsistency were evaluated. Direct comparisons from 41 randomized controlled trials (20,561 participants) were included. All included biologics were efficacious compared with placebo or methotrexate at 3–4 months. Overall, cluster analysis showed adalimumab, secukinumab, and ustekinumab were comparable in terms of high efficacy and tolerability. Ixekizumab and infliximab were differentiated by very high efficacy but poorer tolerability. The lack of longer term controlled data limited our analysis to short-term outcomes. Trial performance may not equate to real-world performance, and so results need to be considered alongside real-world, long-term safety and effectiveness data. These data suggest that it is possible to discriminate between biologics to inform clinical practice and decision making (PROSPERO 2015:CRD42015017538).


Population
All people with psoriasis with moderate to severe disease being treated primarily for their skin disease

Strata
The following groups will be considered separately if data are available:
• Children (up to 12 yrs) & young people (12-18 yrs)
• Different psoriasis phenotypes, i.e. plaque, guttate, pustular (generalized pustular psoriasis, localized forms i.e. palmoplantar pustulosis and acrodermatitis continua of Hallopeau) and nail psoriasis
• People receiving a second biologic (after the failure of the first)

Subgroups
The following factors will be considered for subgroup analysis if heterogeneity is present:
• RCTs or systematic reviews
• Cohort studies for long-term efficacy/safety data

Population size and directness
• Sample size >50 (i.e. 25 in each arm)
• Studies with indirect populations will not be considered
• Studies in populations where the proportion being treated primarily for psoriatic arthritis was greater than 50% will be considered indirect

Setting
• Secondary care
• Tertiary care
• Community settings in which NHS care is received

Supplementary Figures S7-S10: Network forest plots
The figures summarize the evidence base for each comparison in the network meta-analysis. The blue squares represent the summary log odds ratio for each study. The blue lines represent the 95% confidence intervals for each study log odds ratio. The green squares and lines summarize the pooled random effects estimate of direct evidence (pooled within design) for each comparison and its 95% confidence interval. The red squares and lines summarize the random effects estimate of mixed direct and indirect evidence (pooled overall) for each comparison and its 95% confidence interval. The size of the markers representing each point estimate is proportional to the inverse square of the standard error.

[Figure panels: forest plots by comparison, with rows labeled by study (first author, year) and summary rows labeled "All" for each design, e.g. PBO UST, PBO ETA IXE, PBO IXE.]

Inconsistency between direct and indirect estimates was estimated as the inconsistency factor (IF), the logarithm of the ratio of the direct and indirect odds ratios in each triangular or quadratic closed loop. IF values close to zero signify agreement between the direct and indirect estimates within a loop. The lower end of the 95% confidence interval is truncated at zero. Where the lower limit of the 95% CI is greater than zero, there is evidence of statistically significant inconsistency in that loop.
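The inconsistency factor described above can be illustrated with a minimal sketch. It assumes that the direct and indirect estimates in a loop are independent (so their variances add) and that IF is taken as an absolute value, which is what truncating the interval at zero implies. The function name and the input numbers are hypothetical and are not taken from the review.

```python
import math


def inconsistency_factor(log_or_direct, se_direct, log_or_indirect, se_indirect, z=1.96):
    """Return (IF, lower, upper) for one closed loop.

    IF is the absolute difference between the direct and indirect log odds
    ratios, i.e. the absolute log of the ratio of the two odds ratios.
    """
    if_value = abs(log_or_direct - log_or_indirect)
    # Assumption: direct and indirect estimates are independent, so variances add.
    se_if = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    lower = max(0.0, if_value - z * se_if)  # lower limit truncated at zero
    upper = if_value + z * se_if
    return if_value, lower, upper


# Hypothetical loop: direct OR 2.0 (SE 0.20), indirect OR 1.5 (SE 0.25).
if_value, lower, upper = inconsistency_factor(math.log(2.0), 0.20, math.log(1.5), 0.25)
# A lower limit above zero would indicate statistically significant inconsistency.
inconsistent = lower > 0.0
```

In this hypothetical loop the lower limit is truncated to zero, so the sketch would not flag inconsistency.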

Supplementary Appendix S2: Supplementary Methods
We conducted a systematic review to examine the efficacy and tolerability of biologic therapies for psoriasis in accordance with the PRISMA-NMA statement (Hutton et al., 2015). The review protocol was registered on the PROSPERO international prospective register of systematic reviews (2015:CRD42015017538). The protocol was amended to incorporate data on ixekizumab as it became a licensed treatment for psoriasis during the process of this review.

Search and study selection
The patient population included all people with psoriasis of any severity being treated primarily for their skin disease.
RCTs were considered for inclusion if the intervention consisted of one or more of the following: adalimumab; etanercept; infliximab; ixekizumab; ustekinumab; and secukinumab. The comparison arm could consist of any of the biologic therapies listed above, placebo or methotrexate. Studies were excluded if there were <50 participants.
Studies with >50% of participants with psoriatic arthritis were considered indirect and therefore excluded.
The systematic literature search was conducted in PubMed, MEDLINE, Embase and Cochrane databases from inception to 09/29/2015, with top-up searches on 10/05/16 and an additional search for ixekizumab on 10/17/16.
Search results were de-duplicated, titles reviewed and irrelevant studies excluded (LE). The search terms and strategy are presented above in Appendix S1 (Supplementary Material). All studies reported in a language other than English were excluded. Titles and abstracts of studies were screened in a two-step process, initially by two assessors (ZY and ZJL), with any disagreement reviewed by a third assessor (CS). The full-text articles were obtained, read and rechecked against the protocol, and those that did not meet it were excluded (LE). Systematic reviews and meta-analyses were screened for additional papers (LE). The RCTs were distributed amongst the co-authors for detailed appraisal and extraction of data using a standardized data extraction tool. The extracted data were checked by another author (LE).

Outcomes of interest
Outcomes of interest were decided through simple majority voting by the guideline development group, including patient representatives. The 'critical' outcomes were those of efficacy: clear/nearly clear (minimal residual activity/PASI>90/0 or 1 on PGA) and mean change in Dermatology Life Quality Index (DLQI). PASI 75 was considered 'important' rather than 'critical'. The primary safety outcome was tolerability, measured by withdrawal due to adverse events (AEs), and this was also considered 'important'. Withdrawal due to adverse events is an accepted proxy for tolerability; it was used, for example, in an NMA on the comparative efficacy and tolerability of antidepressants for major depressive disorder in children published in The Lancet (Cipriani et al., 2016). We intended to report the specific AEs leading to withdrawal; however, the reasons were not reported in sufficient detail in the published papers to allow this. Serious infection was also considered to be an 'important' outcome. However, based on our previous systematic review (Yiu et al., 2016), there were deemed to be insufficient events with which to produce a stable network. RCTs of any duration beyond 12 weeks were included. Outcomes were extracted at 3-4 months, 1 year and 3 years.
As there was a significant gap in the availability of standardized DLQI outcomes for secukinumab, the relevant pharmaceutical company was contacted for supplementary information for published studies. Data were provided in this way for the following referenced studies (Langley et al., 2014; Thaci et al., 2015). The data extraction and appraisal were then repeated by one assessor for all eligible articles (ZJL). Where studies presented the mean and SD only for individual doses of a drug, a weighted average of both the mean and the SD across the different doses was taken so that these could be analyzed consistently with the other treatment data.
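The dose-pooling step can be sketched as follows. The text does not state which weights were used, so this sketch assumes weighting by the number of participants in each dose arm; the function name and the example values are hypothetical.

```python
def pool_dose_arms(arms):
    """Combine dose arms of one drug into a single arm.

    arms: list of (n, mean, sd) tuples, one per dose.
    Returns (total_n, weighted_mean, weighted_sd).
    Assumption: weights are the arm sample sizes (not stated in the source).
    """
    total_n = sum(n for n, _, _ in arms)
    if total_n <= 0:
        raise ValueError("total sample size must be positive")
    weighted_mean = sum(n * mean for n, mean, _ in arms) / total_n
    # Weighted average of the SDs, as described in the text. This ignores
    # any difference between the arm means.
    weighted_sd = sum(n * sd for n, _, sd in arms) / total_n
    return total_n, weighted_mean, weighted_sd


# Hypothetical mean change in DLQI for two doses of one drug.
total_n, pooled_mean, pooled_sd = pool_dose_arms([(100, -10.0, 6.0), (300, -12.0, 7.0)])
```

A simple weighted average of SDs is not the same as the SD of the combined group, which would also reflect the spread between the arm means; the sketch follows the description in the text rather than that alternative.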

Data analysis and quality assessment of evidence
NMA was performed using a random-effects model within a frequentist approach in Stata 13 (Stata Corp) using the network suite of commands based on the mvmeta multivariate meta-analysis program (Chaimani et al., 2014; White, 2011). NMA synthesizes direct and indirect evidence in a network of trials that compare multiple interventions (Mills et al., 2013). Equal heterogeneity across all comparators was assumed and correlations due to multi-arm studies were accounted for.
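To illustrate what indirect evidence means, the sketch below shows the simplest adjusted indirect comparison of two treatments through a common comparator. This is not the multivariate random-effects model fitted in Stata for this review; it is a simplified stand-in that assumes the two direct estimates come from independent sets of trials. The function name and the numbers are hypothetical.

```python
import math


def indirect_comparison(log_or_a_vs_c, se_a_vs_c, log_or_b_vs_c, se_b_vs_c, z=1.96):
    """Indirect odds ratio of A versus B via common comparator C.

    Returns (odds_ratio, lower, upper) with a 95% confidence interval by default.
    """
    # On the log scale the indirect effect is the difference of the direct effects.
    log_or_a_vs_b = log_or_a_vs_c - log_or_b_vs_c
    # Assumption: the two direct estimates are independent, so variances add.
    se_a_vs_b = math.sqrt(se_a_vs_c ** 2 + se_b_vs_c ** 2)
    lower = math.exp(log_or_a_vs_b - z * se_a_vs_b)
    upper = math.exp(log_or_a_vs_b + z * se_a_vs_b)
    return math.exp(log_or_a_vs_b), lower, upper


# Hypothetical inputs: A vs placebo OR 20 (SE 0.3), B vs placebo OR 10 (SE 0.4).
or_ab, or_ab_lower, or_ab_upper = indirect_comparison(math.log(20.0), 0.3, math.log(10.0), 0.4)
```

The indirect interval is wider than either direct interval because the uncertainty of both direct comparisons is carried into it.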
NMA increases the precision in the estimates and produces a relative ranking of all treatments for the studied outcome (Bucher et al., 1997; Salanti et al., 2011). Geometry of the networks was assessed through visual inspection of network maps. Multi-arm trials were decomposed into their constituent pairwise comparisons. Summary results were presented as an odds ratio (OR), or mean, with a 95% confidence interval. Predictive intervals were calculated to provide an interval within which the estimate of a future study would be expected to lie. Cumulative ranking probability plots were used to represent the ranking probabilities of the various treatments with a visual estimation of their uncertainty. Rankings were quantified by the Surface Under the Cumulative Ranking Curve (SUCRA), which expresses the percentage of effectiveness/safety each treatment has compared to an ideal treatment always ranked first without uncertainty (Salanti et al., 2011). The larger the SUCRA value, the better the rank of the treatment.
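As a minimal sketch of how a SUCRA value follows from ranking probabilities, the function below averages the cumulative probabilities of being ranked among the best 1, 2, ..., a-1 of a treatments. The function name and the example probabilities are hypothetical; in the review these quantities came from the Stata network commands.

```python
def sucra(rank_probs):
    """SUCRA for one treatment.

    rank_probs[j] is the probability that the treatment is ranked j + 1,
    where rank 1 is best. Returns a value between 0 and 1.
    """
    n_treatments = len(rank_probs)
    if n_treatments < 2:
        raise ValueError("need at least two treatments")
    if abs(sum(rank_probs) - 1.0) > 1e-6:
        raise ValueError("rank probabilities must sum to 1")
    cumulative = 0.0
    area = 0.0
    # Sum the cumulative ranking probabilities over the first a - 1 ranks.
    for p in rank_probs[:-1]:
        cumulative += p
        area += cumulative
    return area / (n_treatments - 1)


# Hypothetical treatment in a three-treatment network.
example_sucra = sucra([0.5, 0.3, 0.2])
```

A treatment that is certain to rank first has SUCRA 1, and one that is certain to rank last has SUCRA 0; multiplying by 100 gives the percentage scale used in the text.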
Outcomes were jointly ranked using hierarchical cluster analysis of the SUCRA values of each outcome using the clusterank command. Cluster analysis is an exploratory data mining technique for grouping objects based on their features so that the degree of association is high between members of the same group and low between members of different groups. The appropriate clustering metric and linkage method were chosen based on the cophenetic correlation coefficient. The optimal number of clusters was chosen based on optimization of clustering gain (Chaimani et al., 2013). Absolute effects were calculated from multiplication of the NMA-derived relative effects estimates by an assumed control risk based on the pooled event rate across all studies of that comparator using GRADEPro GDT (McMaster University). Numbers needed to treat or harm (NNT/H) were calculated as the reciprocal of the corresponding absolute risk difference.
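The conversion from a relative to an absolute effect can be sketched as below. The exact calculation performed inside GRADEPro GDT is not given in the text, so this sketch assumes the standard conversion of an odds ratio and an assumed control risk into a risk on treatment, and takes NNT/H as the reciprocal of the absolute risk difference. Function names and numbers are hypothetical.

```python
import math


def corresponding_risk(odds_ratio, control_risk):
    """Risk on treatment implied by an odds ratio and an assumed control risk."""
    if not 0.0 < control_risk < 1.0:
        raise ValueError("control_risk must be strictly between 0 and 1")
    control_odds = control_risk / (1.0 - control_risk)
    treated_odds = odds_ratio * control_odds
    return treated_odds / (1.0 + treated_odds)


def number_needed(odds_ratio, control_risk):
    """NNT or NNH as the reciprocal of the absolute risk difference (an assumption)."""
    risk_difference = corresponding_risk(odds_ratio, control_risk) - control_risk
    if risk_difference == 0.0:
        return math.inf  # no difference in risk, so no finite NNT/H
    return 1.0 / abs(risk_difference)


# Hypothetical example: OR 9 for response, assumed placebo response rate 10%.
treated_risk = corresponding_risk(9.0, 0.10)
nnt = number_needed(9.0, 0.10)
```

With these hypothetical inputs the risk on treatment is about 0.5 and the NNT about 2.5. Whether the result is read as an NNT or an NNH depends on whether the outcome is beneficial (e.g. PASI 75) or harmful (e.g. withdrawal due to adverse events).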
Study quality was evaluated. Individual studies were assessed for selection bias, lack of blinding, attrition bias, and measurement and outcome reporting bias using the criteria outlined in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins et al., 2011), based on information reported in the published paper. Heterogeneity and inconsistency (differences between direct and indirect effect estimates for the same comparison) were evaluated using visual inspection of the forest plots. Inconsistency was also tested formally using an overall Chi-squared test of inconsistency and through loop-specific inconsistency plots and calculation of an inconsistency factor (IF). IF is the logarithm of the ratio of the two odds ratios from direct and indirect evidence in the loop: values close to 0 (that is, a ratio of odds ratios close to 1) suggest the two sources are in agreement (Chaimani et al., 2013). Additional subgroup analysis was performed to evaluate the effect of considering just data on licensed biologic doses. Publication bias was assessed with the aid of comparison-adjusted funnel plots, which show the difference between each study's estimate of ln(OR) and the direct summary effect for the respective comparison in terms of newer versus older treatments. In the absence of small-study effects,