Chimeric antigen receptor (CAR)–modified T cells with anti-CD19 specificity are a highly effective novel immune therapy for relapsed/refractory acute lymphoblastic leukemia. Cytokine release syndrome (CRS) is the most significant and life-threatening toxicity. To improve understanding of CRS, we measured cytokines and clinical biomarkers in 51 CTL019-treated patients. Peak levels of 24 cytokines, including IFNγ, IL6, sgp130, and sIL6R, in the first month after infusion were highly associated with severe CRS. Using regression modeling, we could accurately predict which patients would develop severe CRS with a signature composed of three cytokines. Results were validated in an independent cohort. Changes in serum biochemical markers, including C-reactive protein and ferritin, were associated with CRS but failed to predict development of severe CRS. These comprehensive profiling data provide novel insights into CRS biology and, importantly, represent the first data that can accurately predict which patients have a high probability of becoming critically ill.
Significance: CRS is the most common severe toxicity seen after CAR T-cell treatment. We developed models that can accurately predict which patients are likely to develop severe CRS before they become critically ill, which improves understanding of CRS biology and may guide future cytokine-directed therapy. Cancer Discov; 6(6); 664–79. ©2016 AACR.
See related commentary by Rouce and Heslop, p. 579.
This article is highlighted in the In This Issue feature, p. 561.
Chimeric antigen receptor (CAR)–modified T cells with specificity against CD19 have demonstrated considerable promise against highly refractory hematologic malignancies. Dramatic clinical responses with complete remission (CR) rates as high as 90% have been reported in children and adults with relapsed/refractory acute lymphoblastic leukemia (ALL; ref. 1–4). At our center, patients have been treated with CTL019, engineered T cells composed of an anti-CD19 single-chain variable fragment (scFv), CD3ζ activation domain, and 41BB costimulatory domain. Marked in vivo CAR T-cell proliferation (100–100,000×) leads to efficacy, but can lead to toxicity, including cytokine release syndrome (CRS; ref. 2). CRS is the most common potentially severe toxicity associated with CAR T cells (1–5). CRS is not unique to CAR T cells and occurs with other therapies that engage T cells to kill cancer cells, including bispecific T-cell–engaging (BiTE) antibodies such as blinatumomab (6, 7).
Despite the frequency of CRS after infusion of CAR T cells, relatively little is known about the underlying biology of the syndrome. Improved understanding of CRS may lead to better recognition, improved treatment, and perhaps the ability to prevent or abrogate the most serious complications of CRS. The ability to predict which patients will become critically ill with severe CRS is vital to the development of CAR T-cell therapy, yet there are no published accurate predictors for severe CRS. Our group previously demonstrated that CRS can be successfully ameliorated with the IL6R inhibitor tocilizumab, and its use has become commonplace after T-cell–engaging therapies by our group and others (1–4). Despite its efficacy, the mechanism of tocilizumab in alleviating CRS remains poorly defined. Currently, tocilizumab is used to treat CRS after symptoms become severe. It is unknown whether tocilizumab can prevent CRS or, if used too early, could decrease the efficacy of the CAR T cells.
To better characterize and potentially predict CRS, we evaluated data from 39 children and 12 adults with refractory/relapsed ALL treated with CTL019. We obtained clinical and comprehensive biomarker data, measuring 43 different cytokines, chemokines, and soluble receptors (hereafter collectively called “cytokines”) as well as a number of other laboratory markers. Serial measurements from these patients allowed us to make a number of novel observations that improve our understanding of the biology of CRS and will directly affect clinical practice.
Key results to be discussed herein include: (i) a prediction model for severe CRS; (ii) an overall description of the timing and pattern of cytokine rise and fall after treatment with CAR T cells; (iii) a comprehensive comparison of cytokine profiles between patients who develop severe CRS versus not, which reveals significant details of the underlying biology of severe CRS; (iv) analysis showing that patients who develop severe CRS develop clinical, laboratory, and cytokine profiles that mirror hemophagocytic lymphohistiocytosis (HLH)/macrophage activation syndrome (MAS); and (v) a characterization of the effects of tocilizumab on CRS, establishing that the toxicity of CRS is mediated by trans-IL6 signaling that is rapidly abrogated after tocilizumab treatment in the majority of patients.
Clinical Description of Patients
A total of 51 patients with ALL (39 patients in the pediatric cohort, ages 5 to 23, and 12 in the adult cohort, ages 25 to 72) were treated at The Children's Hospital of Philadelphia (CHOP) and the Hospital of the University of Pennsylvania (PENN), respectively (Supplementary Table S5). The two cohorts were defined based on the clinical trials and treating institutions (see Supplementary Methods). Forty-seven patients (37 pediatric; 10 adults) had B-cell acute lymphoblastic leukemia (B-ALL) in first to fourth relapse, 1 child had relapsed T-cell acute lymphoblastic leukemia (T-ALL) with aberrant CD19 expression, and 3 patients (1 pediatric; 2 adults) had primary refractory B-ALL. Thirty-one patients (27 pediatric; 4 adults; 61%) had relapsed after prior allogeneic hematopoietic stem cell transplant (SCT). Four patients (all pediatric) had previously been treated with blinatumomab, a CD19 BiTE antibody. No patient was treated with any other CD19-directed therapy prior to CTL019. Data on response to CTL019 in the first 30 patients (25 children and 5 adults) were recently published, demonstrating a 90% CR rate and a 6-month event-free survival (EFS) rate of 67% (2).
Clinical Description of CRS
Forty-eight of 51 patients (94%) developed CRS; the three that did not were children. Patients with CRS typically presented with flu-like illness. The majority of patients developed mild (grade 1–2; 18/51; 35%) to moderate (grade 3; 16/51; 31%) CRS, and 14 patients (27%) developed severe (grade 4–5) CRS (12 grade 4 and 2 grade 5; Table 1). For patients who developed fever, start of CRS was defined as the day with the first fever ≥ 38.0°C (100.5°F) relative to infusion of CTL019. Stop of CRS was defined as 24 hours without fever or vasoactive medications, indicating recovery from shock. Four patients developed CRS without fever: start and stop of CRS were defined based on the first day with flu-like symptoms and the first 24-hour period without symptoms, respectively. Additional details are included in Table 1 and Supplementary Results.
Nine patients required mechanical ventilation and 20 patients required vasoactive medications for either distributive (19/20) or cardiogenic shock (1/20). Fourteen patients required high-dose vasoactives as defined in Supplementary Table S1B. Only 6 patients developed a documented comorbid infection and only 2 of these infections, both in adults, were clinically consistent with sepsis (Supplementary Results and Supplementary Table S6). Clinical factors related to CRS are summarized in Table 1. Three adults died in the first 30 days after CTL019 treatment. Some children with severe CRS developed organomegaly, including hepatomegaly and splenomegaly, and a number of patients developed encephalopathy. Additional details are provided in the Supplementary Results and Supplementary Table S4.
Laboratory Description of CRS
We serially evaluated laboratory markers of inflammation and organ failure in patients treated with CTL019 (see Supplementary Table S2 for time points). Baseline ferritins (N = 48) were elevated in the majority of patients (median: 1,580 mg/dL; range, 232–14,673) as a consequence of systemic inflammation and/or iron overload. Only 6 children out of the 37 measured and no adults had baseline ferritins <500 mg/dL. Peak ferritins (defined as highest value in the first month after CTL019 infusion) were very high in all patients regardless of grade, but the median was significantly higher in patients with grade 4–5 CRS (P < 0.001): grade 0–3 CRS (median 8,290 mg/dL; range, 280–411,936) and grade 4–5 CRS (median 130,000 mg/dL; range, 11,200–299,000). Similar trends were seen in adults and children (Table 2). All patients with grade 4–5 CRS had a peak ferritin >10,000 mg/dL, a value that is considered sensitive and specific for macrophage activation/HLH syndrome in children (8, 9). Thirty patients, including 20 of 39 (51%) children and 10 of 12 (83%) adults with grade 0–3 CRS, had a peak ferritin >10,000 mg/dL.
Baseline C-reactive protein (CRP) was elevated in a majority of patients (median 1.20 mg/dL; range, 0.12–29.4). Three children and 2 adults did not have CRP tested at baseline. Twenty-four of 36 children (67%) and 1 of 10 adults had a baseline CRP >1 mg/dL. Similar to ferritin, 1-month peak CRP was very high in the majority of patients with grade 4–5 CRS (median, 22.9; range, 16.0–37.1) and grade 0–3 CRS (median, 16.2; range, 0.7–56.5), with a statistically significant median difference in grade 4–5 versus grade 0–3 CRS (P = 0.010). Similar trends were seen in adults and children (Table 2). Consistent with generalized inflammation and hypotension, alanine aminotransferase (ALT), aspartate aminotransferase (AST), blood urea nitrogen (BUN), lactate dehydrogenase (LDH), and creatinine (Cr) markedly increased in the majority of patients with CRS, with a statistically significant increase in grade 4–5 versus 0–3 CRS (Table 2). Although peak values of these clinical labs correlated with severity of CRS, none of these labs were helpful in predicting CRS in the first 3 days. Early CRP elevation was associated with grade 4–5 CRS (P = 0.02), but, contrary to another published report (3), we did not find that early assessment of CRP in the first 3 days following CTL019 infusion was useful in predicting severity of CRS (AUC = 0.73). For example, considering CRP as a screen for high-risk cases, a CRP >6.8 mg/dL would have identified only 72% of the cases and had a positive predictive value (PPV) of 43%. Similarly, early ferritin elevation was associated with grade 4–5 CRS, but it was not useful in predicting CRS. Additional details on CRP and ferritin are included in Supplementary Fig. S1 and Supplementary Results.
Fibrinogen <150 mg/dL is used in the diagnostic criteria for HLH, as it is a sensitive marker of the syndrome (10). We found a strong association between low fibrinogen and grade 4 CRS in the pediatric cohort but not in adults (Table 2). Children became mildly coagulopathic, with more significant coagulopathy in severe CRS (Table 2). Adults also developed hypofibrinogenemia and mild coagulopathy; however, this was seen across CRS grades (Table 2). Additional details are provided in the Supplementary Results. Although bleeding was rare, understanding the coagulopathy has direct clinical implications, as many of the patients required cryoprecipitate in addition to fresh frozen plasma to maintain hemostasis.
In order to understand the biology of CRS after CAR T-cell therapy, we performed serial cytokine assessment on the 51 patients. We compared median baseline values from 50 patients with ALL (1 subject did not have a baseline value) with a 10-patient normal donor cohort (Supplementary Table S7). Of note, we found that a number of cytokines, including sIL2Rα and MCP1, were consistently elevated in most patients with ALL compared with the normal donors and significant by Holm's adjusted P value. We determined if certain cytokines were associated with baseline disease burden in children (bone marrows were not collected at the time of infusion in many adults). Only EGF, IL12, and IL13 were associated with higher disease burden at baseline by Holm's P value (Supplementary Table S8).
We compared cytokine profiles in patients who had severe CRS with patients who did not. We found peak levels of 24 cytokines, including IFNγ, IL6, IL8, sIL2Rα, sgp130, sIL6R, MCP1, MIP1α, MIP1β, and GM-CSF sent in the first month after CTL019 were associated with grade 4–5 CRS compared with grade 0–3 CRS and significant by the Holm's-adjusted P value (Supplementary Table S9 and Fig. 1A and B). As a sensitivity analysis, we repeated this analysis with a reduced set of cytokine measures, keeping only one measure per target assessment window specified in the protocol, to equalize the number of measurements between subjects (see Supplementary Methods and Supplementary Table S10). Results were similar: 23 cytokines were significant, all amongst the previously found 24. Only IL1RA, the weakest significant result of the original 24, did not remain significant. Supplementary Table S9 also presents the median peak cytokine values for the pediatric and adult cohorts, separately.
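The Holm adjustment referred to throughout these comparisons can be illustrated with a minimal sketch. The P values below are invented for illustration and are not data from this study; the function name `holm_adjust` is hypothetical, and the statistical software actually used is described only in the Supplementary Methods.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted P values, in the original order.

    The smallest raw P value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone, and each is capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Illustrative raw P values for four hypothetical cytokine comparisons.
example_p = [0.001, 0.04, 0.03, 0.20]
adjusted_p = holm_adjust(example_p)
significant = [p <= 0.05 for p in adjusted_p]
```

In this toy example only the first comparison stays significant after adjustment, which shows how a raw P value below 0.05 (here 0.03 and 0.04) can lose significance once multiple comparisons are accounted for.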
Certain cytokines peaked earlier than others in patients with severe CRS. Understanding the timing of the rise and fall not only improves understanding of the underlying biology but also has potential therapeutic relevance. IFNγ and sgp130, for example, rise very early. After adjustment for multiple comparisons, these two cytokines were the only ones differentially elevated for severe versus nonsevere CRS in the first 3 days after infusion, before patients became critically ill (Holm's; Supplementary Fig. S2; Supplementary Table S11). In contrast, although IL6 is the cytokine most strongly associated with severe CRS over the first month, early IL6 levels (days 0–3) did not differ by CRS severity after adjustment for multiple comparisons. Supplementary Table S11 summarizes the peak value of all cytokines in the first 3 days after infusion by CRS severity.
We found severity of CRS was weakly associated with the peak CAR T-cell expansion by copies/microgram qPCR over 1 month (P = 0.058); however, the peak in the first 3 days after infusion was not associated with CRS severity (Supplementary Fig. S3A and B).
We developed and analyzed 16 predictive models (8 from combined cohort and 8 from pediatric-only cohort) fit based on data from the main (discovery) cohort of 51 patients (Supplementary Table S12). Table 3 lists the best overall regression models and decision tree models for both the combined and pediatric-only cohorts. With the forward-selected logistic regression model, we accurately predicted which patients developed grade 4–5 CRS using IFNγ, sgp130, and sIL1RA with sensitivity 86% (95% CI, 57–98), specificity 89% (95% CI, 73–97), and AUC = 0.93 (95% CI, 0.86–1.0; Fig. 2A and Table 3). Using a decision tree, a combination of sgp130, MCP1, and eotaxin had sensitivity 86% (95% CI, 57–98) and specificity 97% (95% CI, 85–100; Fig. 2B and Table 3). For the pediatric cohort, the modeling was even more accurate; the forward-selected logistic regression model, including IFNγ, IL13, and MIP1α, had sensitivity 100% (95% CI, 72–100), specificity 96% (95% CI, 81–100; AUC = 0.98; 95% CI, 0.93–1.0; Fig. 2C and Table 3). Our group and others have previously published that disease burden prior to infusion can predict severe CRS (2, 3). In the pediatric cohort only, a bone marrow aspirate was collected immediately prior to infusion. We found disease burden did not improve the predictive accuracy of the models over the cytokines alone using regression modeling; however, it was identified as an important predictive variable in the pediatric cohort using the decision tree modeling. A combination of IL10 and disease burden had sensitivity 91% (95% CI, 59–100) and specificity 96% (95% CI, 81–100; Fig. 2D and Table 3). As many trials are not measuring disease burden, we note that a classifier built on predictors from our top candidate logistic model included a combination of IFNγ and MIP1α and had sensitivity 82% (95% CI, 48–98) and specificity 93% (95% CI, 76–99; Fig. 2E and Table 3). 
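To make the sensitivity and specificity intervals above easier to interpret, the following is a minimal sketch of an exact (Clopper-Pearson) binomial confidence interval. Two assumptions are made here that the text does not state: that the reported intervals are exact binomial intervals, and that the underlying counts are those implied by the percentages (for example, 12 of 14 severe cases for 86% sensitivity in the combined cohort, and 11 of 11 for 100% sensitivity in the pediatric cohort). The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for x successes in n trials."""

    def solve(f):
        # Bisection for the root of a function that increases with p on [0, 1].
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) < 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else solve(lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else solve(lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


# Assumed counts, inferred from the reported percentages (not stated in the text).
sens_combined = clopper_pearson(12, 14)   # roughly (0.57, 0.98)
sens_pediatric = clopper_pearson(11, 11)  # lower bound is 0.025 ** (1 / 11), upper is 1.0
```

With only 11 to 14 severe cases, even perfect observed sensitivity leaves a lower confidence bound near 72%, which is why the intervals are wide despite the high point estimates.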
The Supplementary Results include examples of application of the models and further discussion of the strengths and weaknesses of the different models. Supplementary Fig. S4 (A–D) depicts the best single and two variable regression models for CRS prediction.
We then tested the accuracy of the models using our validation cohort of 12 additional pediatric patients. Clinical details on the 12 patients in the validation cohort are listed in Supplementary Table S13. We found all of the models performed extremely well in the validation cohort (Supplementary Table S12). Thus, the validation cohort did not change the model rank order.
Hemophagocytic Syndrome/Macrophage Activation Syndrome as a Consequence of CTL019
Nineteen cytokines studied in our patients have been studied in children with HLH (11–14). We found a nearly identical pattern: cytokines that are differentially elevated in patients with HLH also had 1-month peak values that were elevated in patients with severe CRS compared with patients without severe CRS (Supplementary Table S9, Fig. 3A and B). There were statistically significant differences in IFNγ, IL10, sIL2Rα, IL6, IL8, IP10, MCP1, MIG, and MIP1β in CRS grades 4–5 versus CRS grades 0–3. All of these cytokines are expected to be elevated in patients with HLH (11–14). No significant differences by Holm's adjusted P value were seen in IL1β, IL2, IL5, IL7, IL12, IL13, and IL17, cytokines expected to be normal in HLH based on published work (11–14). GM-CSF and TNFα were differentially elevated in our study in those with severe CRS. GM-CSF and TNFα have been demonstrated to be elevated with HLH in some studies but normal in others (13, 15–17). IL4 is typically normal in patients with HLH (11, 12). It was differentially elevated in our study in the patients with severe CRS, although the levels were low in both groups.
IL6 Signaling and IL6-Directed Therapy
Twenty-one patients were treated with tocilizumab for CRS. Seven of 15 subjects with grade 3 CRS and all 14 subjects with grade 4–5 CRS received tocilizumab. Ten of the 21 subjects (4 pediatric; 6 adults) received more than one dose. Twelve patients were also treated with corticosteroids and two patients received etanercept. No patient received siltuximab (see Supplementary Table S14). Tocilizumab was given a median of 5 days after infusion with CTL019 (range, 2–12 days). Supplementary Table S15 details the time to initiation of tocilizumab relative to infusion, first fever, use of vasoactives, and intubation, if applicable. Response to tocilizumab was rapid. Many patients became afebrile immediately after the first dose. Most patients were able to wean vasoactives over the 24 to 36 hours after receiving tocilizumab, and they were stopped a median of 4.5 days after tocilizumab was given (Supplementary Table S15). Although all of the children with CRS survived and responded to tocilizumab, 3 adults treated with CTL019 died.
Figure 4 depicts the levels of four cytokines (IFNγ, IL6, sIL6R, and sgp130) over time in the 14 grade 4–5 subjects treated with tocilizumab. After the first dose of tocilizumab, there was generally a transient rise in IL6 levels, followed by a rapid decrease. sIL6R generally increased and continued to remain elevated for at least 2 to 3 weeks after tocilizumab, and sgp130 appeared to increase in some patients but not in others after tocilizumab (see Fig. 4 for additional details).
In this study, we have made a number of novel observations. First, we developed models that can predict which patients treated with CAR T cells are likely to become critically ill before they become ill, potentially allowing us to make early interventions that could reduce morbidity or mortality. Second, we established that concentrations of sIL6R and sgp130 are likely clinically and biologically relevant, as this is the first work that has systematically evaluated sIL6R and sgp130 after CAR T cells. Third, we identified 24 distinct cytokines that are differentially expressed in patients with versus without severe CRS, adding new insight into the biology underlying severe CRS. Finally, we confirmed our previously published but untested hypothesis that patients who develop severe CRS develop a clinical, laboratory, and biomarker profile consistent with secondary HLH.
The most common and potentially severe toxicity seen across trials using CAR-modified T-cell therapy is CRS. Data from our group and others suggest a correlation between development of CRS and response to CAR T cells (3, 4). Nevertheless, there does not appear to be a strong association between the degree of CRS and outcome (2). Similar to data with other T-cell–engaging therapies, including blinatumomab, our group and others have previously found that the severity of CRS may be associated with disease burden at the time of treatment (2, 3). Although this association exists, as we have demonstrated herein, disease burden alone is not sufficient to predict which patients will develop severe CRS. The PPV of high disease burden alone was poor, as only 10 of 23 patients with an M3 marrow (>25% blasts) developed severe CRS. However, low disease burden does have a strong negative predictive value (NPV). In our pediatric cohort, only 1 of 15 patients whose marrow demonstrated <5% blasts at the time of CTL019 infusion developed severe CRS. Baseline disease burden was not obtained in most adults, and our pediatric trials moving forward are no longer assessing disease burden at the time of infusion. Our data demonstrate that the risk of CRS can be predicted accurately without the need for assessment of disease burden at the time of infusion.
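The predictive values implied by the disease-burden counts above can be worked out directly. This is a small illustrative calculation using only the counts given in the text; note that the two counts come from different marrow thresholds (>25% and <5% blasts), so they do not form a single 2 × 2 table. The variable names are hypothetical.

```python
from fractions import Fraction

# High disease burden (M3 marrow, >25% blasts): 10 of 23 developed severe CRS.
ppv_high_burden = Fraction(10, 23)

# Low disease burden (<5% blasts): 1 of 15 developed severe CRS,
# so 14 of 15 did not, which is the negative predictive value.
npv_low_burden = 1 - Fraction(1, 15)

ppv_percent = round(float(ppv_high_burden) * 100)  # about 43%
npv_percent = round(float(npv_low_burden) * 100)   # about 93%
```

A PPV of about 43% means that more than half of patients with high disease burden did not go on to develop severe CRS, whereas an NPV of about 93% means low disease burden was rarely followed by severe CRS.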
Severe CRS is a potentially life-threatening toxicity. Indeed, two adults in our series died as a consequence of CRS. The ability to predict which patients may develop severe CRS prior to its development may be helpful in mitigating toxicity, as cytokine-directed therapy could be instituted before a patient becomes critically ill. Patients predicted to develop severe CRS could be more closely monitored to allow early initiation of aggressive supportive care. In contrast, the ability to predict which patients are unlikely to develop severe CRS can prevent unnecessary early hospitalization and/or exposure to unneeded cytokine-directed therapy. Accordingly, the models we have developed using a small number of cytokines to predict severity of CRS with both high sensitivity and specificity have direct clinical and therapeutic relevance. It is not known if early intervention or prevention of CRS will limit efficacy. Prospective trials initiating early intervention based on cytokine profile models will need to be carried out carefully.
We did not find any standard clinical laboratory tests were helpful in predicting CRS severity as many (ferritin, CRP, LDH, AST, ALT, BUN, and Cr) peaked after patients became ill. Unlike prior reports by another group, we did not find early assessment of CRP could accurately predict severity of CRS (3). Future work will determine if the early predictive cytokine profiles we have identified are also relevant to other T-cell–engaging therapies such as BiTEs, as well as after cytotoxic T lymphocytes targeted at viruses (7, 18).
In addition to developing accurate predictive models, we have gained a number of key insights into the biology of CRS. Analyzing cytokines sent before patients developed severe CRS, we demonstrated that sgp130 and IFNγ were strongly associated with the later development of severe CRS. We confirmed our earlier observation that IFNγ, IL6, and sIL2Rα show a marked differential increase in patients with severe CRS as compared with patients without severe CRS. We found marked differences in a number of additional cytokines not previously studied after CAR T-cell therapy. Generally, cytokines that were differentially elevated based on CRS grade included either cytokines released from activated T cells (sIL2Rα, IFNγ, IL6, sIL6R, GM-CSF) or activated monocytes/macrophages (IL1RA, IL10, IL6, IP10, MIG, IFNα, MIP1α, MIP1β, sIL6R), as well as chemokines that are chemotactic for monocytes/macrophages (MCP1 and MIP1β), and cytokines that are often elevated after tissue damage and inflammation (IL8, G-CSF, GM-CSF, VEGF, IL6, and sRAGE; refs. 13–15, 19, 20).
We found that patients who develop severe CRS develop a clinical phenotype that resembles MAS/HLH, as well as laboratory evidence of abnormal macrophage activation, including elevated ferritin, low fibrinogen, and a cytokine profile that mirrors that seen in genetic forms of HLH. Xu and colleagues demonstrated that the combination of an IFNγ level >75 pg/mL and an IL10 level >60 pg/mL has 98.9% specificity and 93% sensitivity for HLH when measuring cytokine levels in critically ill children (with and without malignancies) with either sepsis or HLH (11). IFNγ is not expected to be elevated in patients with sepsis. All patients with CRS 4–5 in our series had an IFNγ >75 pg/mL and IL10 >60 pg/mL. IL4 is the only cytokine tested that was an outlier from our a priori hypothesis; however, the absolute values were very small in all patients. Thus, we hypothesize that IL4 is likely not clinically or biologically relevant in our cohort. Future work will determine if there is any genotype–phenotype association between the development of MAS/HLH after CAR T cells and mutations in genes that predispose to the development of HLH, including PRF1.
IL6-directed therapy is the cornerstone of cytokine-based therapy after treatment with CAR T cells. It has been shown to be effective and, importantly, does not appear to decrease efficacy of the CAR T cells (2–4). That said, as IL6 does not appreciably rise prior to the development of CRS, clinical assessment of IL6 in the first few days after infusion will not help determine which patients will develop severe CRS or require IL6-directed therapy. It is unknown whether early treatment with tocilizumab prior to development of CRS would be of benefit. Tocilizumab has a very long half-life (11–14 days; ref. 21). Thus, if given early, the drug would be present at the time of IL6 peak and could in theory prevent severe CRS.
This study demonstrates the importance of trans-IL6 signaling in CRS. IL6 signals through two mechanisms, either via the membrane-bound or soluble IL6 receptor (sIL6R; refs. 20, 22). In classic IL6 signaling, IL6 binds to its membrane-bound receptor. Most cells do not express IL6R and are not responsive to classic IL6 signaling (23). In trans-IL6 signaling, sIL6R binds IL6, and the complex is associated with membrane-bound gp130 (24). Membrane-bound gp130 is associated with JAK1, JAK2, and TYK2 (20, 22). Accordingly, IL6-trans signaling activates the JAK/STAT pathway. IL6-trans signaling occurs in cells that do not express IL6R. Normally, high levels of soluble gp130 (sgp130) and sIL6R in the blood serve as a buffer, blocking IL6-trans signaling. In healthy persons, IL6 levels are typically on the order of pg/mL, yet sIL6R and sgp130 levels are typically 1,000× higher at ng/mL levels (20, 22). Consequently, IL6-trans signaling occurs only when IL6 levels rise from pg/mL to ng/mL levels. IL6-trans signaling can be blocked either by lowering IL6 levels, blocking the interaction of IL6 with IL6R, raising sIL6R levels, raising sgp130 levels, or blocking the interaction of IL6–IL6R with sgp130 (20, 22). Tocilizumab is an anti-IL6R monoclonal antibody. In other IL6-mediated diseases, IL6 levels often go up and sIL6R levels either increase or decrease after treatment as the interaction between IL6R and IL6 is blocked (21, 23, 24). After treatment with tocilizumab, there appeared to be a transient rise in IL6 followed by a rapid decrease in our CRS cohort. sIL6R levels also appeared to increase significantly after tocilizumab, because the complex of sIL6R and tocilizumab cannot be cleared by the kidney due to its size. These data suggest tocilizumab is blocking IL6-trans signaling through multiple mechanisms: blocking the interaction of IL6R with IL6, raising sIL6R to increase the IL6 buffer, and eventually lowering IL6 levels. 
Of note, collection of samples was not uniform between patients before and after treatment with tocilizumab, as tocilizumab was given on different days relative to time of infusion and some patients received more than one dose. Uno and colleagues recently published data that suggest that patients with rheumatoid arthritis who have elevated sgp130 levels are more likely to respond to tocilizumab, as higher sgp130 levels will neutralize more IL6/sIL6R complexes, leaving fewer complexes that need to be neutralized by tocilizumab. Thus, we believe the high levels of sgp130 that are seen in our patients prior to treatment with tocilizumab are clinically and biologically relevant; however, future work investigating the importance and function of trans-IL6 signaling is needed (25).
Other agents that target IL6 signaling are either commercially available or in clinical development, including direct IL6 inhibitors such as siltuximab and the IL6-trans signaling blocker sgp130Fc (20, 21, 23, 24). These agents have the potential to be effective for CRS, but future studies are needed. Our extensive cytokine profiling does not support the use of TNFα blockade after CAR T cells. Although some of the soluble TNF receptors were markedly elevated in patients with severe CRS, peak levels of TNFα were quite low. Interestingly, it has recently been shown that induction of shedding of TNF receptors leads to complete unresponsiveness of TNFα target cells (26). Published studies demonstrating efficacy of TNFα blockade used inhibitors in diseases with elevated serum or tissue TNFα levels. Although some TNFα-blocking agents such as etanercept also target TNF receptors, it is unknown whether targeting TNF receptors in patients with low levels of TNFα is efficacious. For patients who become critically ill after CAR T cells and do not respond to IL6 blockade, JAK/STAT, IFNγ, or sIL2Rα inhibitors could potentially be effective in ameliorating CRS symptoms. Unfortunately, these would likely affect the function of the CAR T cells.
Confounding variables that can affect cytokine production should always be considered when interpreting cytokine patterns to understand disease biology or develop predictive models. Mild differences in baseline cytokine values can occur in healthy normal subjects based on age, gender, and ethnic background (27). Disease-related factors, including the type of malignancy or disease burden, can also affect cytokine production. It is also important to evaluate both relative and absolute changes in cytokine production. Differences between populations are sometimes reported as fold changes without consideration of the absolute values, but this can be misleading, as statistically significant differences may not be biologically or clinically meaningful. We considered values in the context of the degree of variation seen in healthy populations and the levels reported in patients with inflammatory diseases and/or infection. We considered both the absolute and relative differences between groups.
Despite several important observations, our study has several limitations. Although we describe CRS after CAR T cells in the largest cohort of patients to date, the total number of patients with grade 4–5 CRS was relatively small. Nevertheless, our findings were adjusted for multiple comparisons, and our prediction models remained accurate in an independent validation cohort. Our data reflect patients treated at two centers with CAR T-cell products generated using the same manufacturing process, and it is unknown if our models will be generalizable. The only laboratory biomarkers that were robust for CRS prediction were cytokines, and testing for cytokines is not available with rapid turnaround in many clinical laboratories. Our data allow the design of a focused panel of analytes that can be used to predict and track CRS. Common Terminology Criteria for Adverse Events (CTCAE) grading scales do not adequately or accurately define CRS after T-cell–engaging therapies. Thus, different sites and different publications use different grading scales, which can make comparisons between studies challenging. Lee and colleagues and Davila and colleagues also published CRS grading scales for patients treated with CAR T cells (see Supplementary Tables S16 and S17; refs. 3, 28). We have included a comparison of our CRS grading scale to other published grading scales in the Supplementary Discussion. The grading systems are similar enough that the predictive models we have developed are relevant in the other grading systems. Regardless of “numerical grade,” our models identify patients who develop life-threatening complications of CRS (mechanical ventilation and/or decompensated shock).
Predictive models are needed to distinguish patients who become critically ill from CRS from those who do not. We made the demarcation based on patients who developed life-threatening complications of CRS, including decompensated shock or respiratory failure (grade 4–5). Patients with grade 0–2 CRS developed only mild illness. Grade 3 CRS, in contrast, represents a heterogeneous clinical spectrum, as some patients required only i.v. fluids or minimal supplemental oxygen, whereas others required low-dose vasoactive medications or developed more significant hypoxemia. Moving forward with certain trials, it may be important to loosen our definition of “severe CRS” and include patients who became very ill but did not develop life-threatening CRS. We performed additional analyses subdividing patients with grade 3 CRS into two groups (3a and 3b) based on the need for any vasoactive medications or a significant oxygen requirement (≥40% FiO2), and resplit our cohort into severe and not severe, defined as CRS 0–3a versus CRS 3b–5. We performed logistic regression and decision tree modeling to develop new models with this alternate categorization and also tested the accuracy of our “original” models using the alternate categorization. These additional models and analyses are included in the Supplementary Results and as Supplementary Tables S18 and S19 and Supplementary Figs. S5A–S5E.
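The alternate categorization described above can be sketched in a few lines. This is a minimal illustration, not the study's analysis code: the `Patient` record, its field names, and the function names are hypothetical, and only the grade 3 split rule (any vasoactive medication or an oxygen requirement of ≥40% FiO2) and the two severe/not-severe cut points are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record layout, for illustration only.
    crs_grade: int           # 0-5 on the study's CRS grading scale
    any_vasoactive: bool     # received any vasoactive medication
    max_fio2_percent: float  # highest FiO2 required, in percent


def alternate_grade(p: Patient) -> str:
    """Split grade 3 into 3a/3b; all other grades are returned unchanged."""
    if p.crs_grade != 3:
        return str(p.crs_grade)
    needs_support = p.any_vasoactive or p.max_fio2_percent >= 40
    return "3b" if needs_support else "3a"


def is_severe_original(p: Patient) -> bool:
    """Original definition: severe CRS is grade 4-5."""
    return p.crs_grade >= 4


def is_severe_alternate(p: Patient) -> bool:
    """Alternate definition: severe CRS is grade 3b-5."""
    return alternate_grade(p) in {"3b", "4", "5"}


# Two hypothetical grade 3 patients who differ only in supportive care.
fluids_only = Patient(crs_grade=3, any_vasoactive=False, max_fio2_percent=30)
low_dose_pressor = Patient(crs_grade=3, any_vasoactive=True, max_fio2_percent=30)
print(alternate_grade(fluids_only), is_severe_alternate(fluids_only))
print(alternate_grade(low_dose_pressor), is_severe_alternate(low_dose_pressor))
```

Under the original definition both example patients are "not severe"; only the alternate definition separates them.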
We studied the biology of CRS and investigated whether certain cytokines measured early could predict CRS severity. Additional variables not studied herein may predict severe CRS, and these will be investigated in future work. These include T-cell phenotype of the product, T-cell function of the product, CD19 polymorphisms that may differentially activate CTL019, tumor expression of CD19 or PD-L1, and immune gene polymorphisms. We have previously published that products generated from the majority of patients show high cytolytic activity and produce very similar in vitro levels of most cytokines (29). Because there is little variability in the ex vivo composition and cytokine production of the CTL019 product, but considerable heterogeneity in CRS in patients, we hypothesize that differences in the CTL019 product are unlikely to correlate with severity of CRS.
In conclusion, these data represent the largest and most comprehensive analysis to date of the clinical and biologic manifestations of CRS after CAR T-cell therapy. We have identified and characterized cytokines that are associated with severe CRS and cytokines that can predict which patients will likely develop severe CRS before it happens. Early prediction will allow trials to determine if early intervention will mitigate toxicity without affecting efficacy. Based on the exciting efficacy seen with CAR T cells in early-phase trials, their use is rapidly expanding from a select number of tertiary care institutions to a larger number of centers. Accordingly, understanding and reducing toxicity is paramount, and these data provide significant novel information that may help achieve that goal.
We collected clinical and laboratory data on 39 patients with ALL treated consecutively with CTL019 at CHOP from April 2012 through September 2014 on a phase I/IIa clinical trial (NCT01626495). We collected clinical and laboratory data on 12 adults treated with CTL019 at PENN from March 2013 through August 2014 on two trials (NCT02030847 and NCT01029366). Additional details on the trial design and CTL019 product are provided in the Supplementary Methods. Written informed consent was obtained from all subjects or their legal guardians according to the Declaration of Helsinki, and all protocols were approved by the respective institutional review boards at CHOP and PENN. We also collected a limited set of clinical and laboratory data on an additional 12 consecutive patients treated with CTL019 at CHOP from October 2014 to May 2015 on NCT01626495. This data set provided a validation cohort for our predictive models. CRS was graded as previously described and as defined in Supplementary Tables S1A and S1B (2, 5, 29). This grading scale was developed a priori and before any data analysis. Cytokine markers were measured on serum samples from 10 healthy volunteers (see Supplementary Material for details).
All data were decoded and maintained in secure databases. Forty-three unique cytokines and a panel of clinical laboratory tests, including chemistries, ferritin, and CRP, were serially monitored. CTL019 cells were serially measured in peripheral blood by quantitative PCR (2). Analysis was restricted to the first month after infusion of CTL019. A detailed description of the laboratory tests, cytokines, and collection time points is included in the Supplementary Methods and Supplementary Table S2. Baseline bone marrow aspirates and biopsies were collected in the pediatric cohort (see Supplementary Methods). Minimal residual disease testing was performed in the Clinical Laboratory Improvement Amendments– and College of American Pathologists–approved Children's Oncology Group Western Flow Cytometry Reference Laboratory at the University of Washington (Seattle, WA) as previously described (30, 31).
Clinical, laboratory, and cytokine markers associated with CRS are summarized overall and by occurrence of severe CRS (grade 4–5) for the pediatric, adult, and combined cohorts. For markers measured serially, values were summarized both as the peak over the first 3 days and over the month postinfusion in order to capture early and overall peak values of these biomarkers during the period when patients experienced CRS. In addition, relative changes from baseline (fold changes) were evaluated during the first 3 days after infusion. The month was defined as the first 35 days, allowing for a 1-week window beyond the expected 28-day evaluation. Between-group comparisons were performed using the Fisher exact test for discrete factors and the exact Wilcoxon rank-sum test for continuous factors. Fibrinogen and CRP, due to a few values recorded as exceeding a limit of detection that was within range of other observed values, were analyzed with the generalized Wilcoxon test for right-censored data. Values less than the lower limit of detection were recorded as half the lower limit. Further details are provided in the Supplementary Methods. All statistical tests were two-sided and generally done at the 0.05 level. Consideration for multiple comparisons was given when examining hypotheses for the 43 cytokine biomarkers, as described below. Statistical analyses were performed using R (version 3.2.1; R Development Core Team, Vienna, Austria) and SAS (version 9.4; SAS Institute Inc.) software.
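The summary measures described above (3-day peak, 1-month peak over the first 35 days, fold change from baseline, and half-limit substitution for values below the lower limit of detection) can be illustrated with a short sketch. This is not the study's code, which used R and SAS. The function names, the data layout of `(day, value)` pairs, the choice to count windows from day 1 after infusion, and the example numbers are all assumptions made for illustration.

```python
def impute_below_limit(value, lower_limit):
    """Record a value below the lower limit of detection as half that limit."""
    return lower_limit / 2 if value < lower_limit else value


def peak_in_window(samples, first_day, last_day):
    """Largest value among (day, value) samples with first_day <= day <= last_day."""
    values = [v for day, v in samples if first_day <= day <= last_day]
    return max(values) if values else None


def summarize_marker(samples, baseline, lower_limit):
    """Return the 3-day peak, 35-day peak, and 3-day fold change from baseline."""
    cleaned = [(day, impute_below_limit(v, lower_limit)) for day, v in samples]
    base = impute_below_limit(baseline, lower_limit)
    peak_3_day = peak_in_window(cleaned, 1, 3)      # assumed window: days 1-3
    peak_1_month = peak_in_window(cleaned, 1, 35)   # 28 days plus a 1-week window
    fold_change = None if peak_3_day is None else peak_3_day / base
    return {"peak_3_day": peak_3_day,
            "peak_1_month": peak_1_month,
            "fold_change_3_day": fold_change}


# Hypothetical serial measurements for one marker (arbitrary units).
samples = [(1, 2.0), (2, 40.0), (3, 25.0), (10, 300.0), (40, 900.0)]
summary = summarize_marker(samples, baseline=1.0, lower_limit=4.0)
print(summary)
# → {'peak_3_day': 40.0, 'peak_1_month': 300.0, 'fold_change_3_day': 20.0}
```

In this example the baseline (1.0) falls below the hypothetical limit of 4.0 and is therefore recorded as 2.0, which is why the fold change is 20 rather than 40. The day 40 sample is ignored because it lies outside the 35-day window.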
Several hypotheses regarding cytokine levels were investigated, and statistical associations were declared significant only if they remained significant at the 0.05 level after Holm's adjustment for the 43 multiple comparisons (32). All hypotheses that were tested were developed a priori. Comparisons of cytokine levels analyzed in this manner included (i) 1-month peak and 3-day peak between those with versus those without severe CRS, (ii) patient baseline between those with high versus low disease burden, and (iii) patient baseline versus healthy adult subjects. The 1-month and 3-day peak analyses were performed for the combined, adult, and pediatric cohorts separately. Multiple published studies have demonstrated that although cytokines can vary between children and adults with disease or after antigen stimulation, baseline values in normal healthy children and adults are similar (33–35). Thus, we did not include a separate healthy pediatric cohort. To rule out bias from variation between patients in the sampling frequency of cytokine levels, 1-month analyses were repeated using measurements from a reduced, common sampling schedule that was shared by nearly all subjects (details provided in the Supplementary Methods and Results).
We hypothesized that patients treated with T-cell–engaging therapies, including CTL019, who experience severe CRS develop abnormal macrophage activation with secondary HLH. We made this hypothesis based on clinical symptomatology and marked hyperferritinemia. Numerous studies have shown that ferritin >10,000 ng/mL is highly sensitive and specific for HLH in children (33, 34). To establish whether the patients with severe CRS were manifesting HLH (diagnostic criteria in Supplementary Table S3), we compared the cytokine profiles from patients who developed severe CRS with published reports of cytokine profiles in patients with primary HLH associated with a genetic predisposition.
Of the 43 tested cytokines, 19 have been previously studied in children with HLH (14–17). We reconsidered this subset of cytokines for the association with severe CRS to compare the HLH pattern, with Holm's adjustment for 19 multiple comparisons.
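Holm's adjustment, used here for families of 43 and 19 comparisons, is a step-down procedure: the smallest P value is multiplied by the number of tests m, the next smallest by m − 1, and so on, with the adjusted values forced to be nondecreasing and capped at 1. The following is a minimal sketch of that procedure, not the study's code; the function name and the example P values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted P values in the same order as the input."""
    m = len(p_values)
    # Indices of the P values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so a larger raw P value never gets a smaller
        # adjusted value than a smaller raw P value.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical unadjusted P values for four comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = holm_adjust(raw)
significant = [p <= 0.05 for p in adjusted]
print(significant)
# → [True, False, False, True]
```

In this example all four raw P values are below 0.05, but only two remain significant after adjustment, which is the behavior the procedure is meant to enforce when many markers are tested at once.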
In order to understand which factors may be most intrinsically involved with CRS and the immune system's initial response, we sought to develop a prediction model for severe CRS that considered clinical and laboratory factors measured within the first 3 days after infusion. Models were kept small due to the limited number of severe CRS cases (14 overall and 11 in the pediatric cohort). Candidate variables included those factors for which the 3-day peak was missing in no more than 2 of 14 cases and 10% overall: ALT, AST, BUN, Cr, ferritin, qPCR, LDH, the 43 cytokine markers, as well as CRS-defining symptoms in the first 2 days after infusion (yes/no) and age at infusion. Three-day peak fold change was also considered for the 43 cytokines, and baseline disease burden was an additional candidate variable for the pediatric cohort. Two patients (both in pediatric cohort) developed severe CRS on day 3; however, all data included in the models were collected at least 12 hours prior to the development of severe CRS (see Supplementary Appendix and Supplementary Table S4 for additional details). Logistic regression and classification tree models were fit in the combined and pediatric cohorts, hereafter referred to as the discovery cohort. The adult cohort was too small to model separately. For the logistic regression models, forward selection using the Akaike information criterion was used to select the final models. The deviance statistic was used to select the tree models. Further details are provided in the Supplementary Methods. Models were validated in an independent cohort of 12 pediatric patients, referred to as the validation cohort.
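Two steps in this paragraph lend themselves to a brief sketch: screening candidate variables by missingness, and forward selection by the Akaike information criterion (AIC). The code below is an illustration under stated assumptions and does not reproduce the study's models. The function names, the `max_terms` cap of three, and the toy AIC table are hypothetical, and fitting the logistic regression itself is left to a caller-supplied `aic_of` function so that no particular fitting routine is implied.

```python
def is_candidate(peaks, severe, max_missing_cases=2, max_missing_fraction=0.10):
    """Screen one variable by missingness of its 3-day peak.

    peaks:  3-day peak per patient, with None marking a missing value
    severe: True for patients who developed severe CRS (the "cases")
    """
    missing_in_cases = sum(1 for v, s in zip(peaks, severe) if s and v is None)
    missing_overall = sum(1 for v in peaks if v is None)
    return (missing_in_cases <= max_missing_cases
            and missing_overall <= max_missing_fraction * len(peaks))


def forward_select(candidates, aic_of, max_terms=3):
    """Greedy forward selection: add the variable that lowers AIC the most.

    aic_of(variables) must return the AIC of a model fit with that list of
    variables; an empty list stands for the intercept-only model.
    """
    selected = []
    best_aic = aic_of(selected)
    while len(selected) < max_terms:
        remaining = [c for c in candidates if c not in selected]
        if not remaining:
            break
        step_aic, step_var = min((aic_of(selected + [c]), c) for c in remaining)
        if step_aic >= best_aic:
            break  # no remaining variable improves the model
        selected.append(step_var)
        best_aic = step_aic
    return selected, best_aic


# Toy AIC values for placeholder variables x1-x3 (made up for illustration).
toy_aic = {
    frozenset(): 60.0,
    frozenset({"x1"}): 40.0,
    frozenset({"x2"}): 55.0,
    frozenset({"x3"}): 58.0,
    frozenset({"x1", "x2"}): 38.0,
    frozenset({"x1", "x3"}): 41.5,
    frozenset({"x1", "x2", "x3"}): 39.0,
}
chosen, chosen_aic = forward_select(["x1", "x2", "x3"],
                                    lambda vs: toy_aic[frozenset(vs)])
print(chosen, chosen_aic)
# → ['x1', 'x2'] 38.0
```

Selection stops at two variables in the toy example because adding the third raises the AIC, which mirrors the stated aim of keeping models small when the number of severe cases is limited.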
Disclosure of Potential Conflicts of Interest
D.T. Teachey reports receiving commercial research support from Novartis. S.F. Lacey reports receiving a commercial research grant from Novartis. P.A. Shaw reports receiving a commercial research grant from Novartis. J.J. Melenhorst reports receiving a commercial research grant from Novartis. S.L. Maude is a consultant/advisory board member for Novartis. S.L. Weiss has received speakers bureau honoraria from ThermoFisher Scientific. B.L. Levine reports receiving a commercial research grant and speakers bureau honoraria from Novartis, has ownership interest (including patents) in Novartis, and is a consultant/advisory board member for GE. C.H. June reports receiving commercial research support from Novartis and has ownership interest (including patents) in the same. D.L. Porter is Regional Program Manager at Genentech, reports receiving a commercial research grant from Novartis, and is a consultant/advisory board member for Novartis. S.A. Grupp reports receiving a commercial research grant from Novartis and is a consultant/advisory board member for the same. No potential conflicts of interest were disclosed by the other authors.
Conception and design: D.T. Teachey, S.F. Lacey, P.A. Shaw, J.J. Melenhorst, S.L. Maude, N. Frey, D.M. Barrett, J.C. Fitzgerald, D.L. Porter, S.A. Grupp
Development of methodology: D.T. Teachey, S.F. Lacey, P.A. Shaw, J.J. Melenhorst, F. Chen
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): D.T. Teachey, S.F. Lacey, J.J. Melenhorst, S.L. Maude, N. Frey, F. Chen, J. Finklestein, R.A. Berg, R. Aplenc, C. Callahan, S.R. Rheingold, F. Nazimuddin, D.L. Porter
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): D.T. Teachey, S.F. Lacey, P.A. Shaw, J.J. Melenhorst, S.L. Maude, E. Pequignot, J. Finklestein, S.L. Weiss, J.C. Fitzgerald, S.R. Rheingold, S. Rose-John, J.C. White, F. Nazimuddin, G. Wertheim, B.L. Levine, S.A. Grupp
Writing, review, and/or revision of the manuscript: D.T. Teachey, S.F. Lacey, P.A. Shaw, J.J. Melenhorst, S.L. Maude, N. Frey, E. Pequignot, D.M. Barrett, S.L. Weiss, J.C. Fitzgerald, R.A. Berg, R. Aplenc, S.R. Rheingold, S. Rose-John, J.C. White, F. Nazimuddin, G. Wertheim, B.L. Levine, C.H. June, D.L. Porter, S.A. Grupp
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): D.T. Teachey, E. Pequignot, V.E. Gonzalez, J. Finklestein, Z. Zheng, J.C. White, F. Nazimuddin, B.L. Levine, S.A. Grupp
Study supervision: D.T. Teachey, S.A. Grupp
Other (semiautomated and standardized the method in obtaining a massive amount of data for this article): F. Chen
Financial support was provided by a grant from Novartis (to C.H. June), grants from the NIH (R01CA165206 to C.H. June; R01CA193776 to D.T. Teachey; R01CA102646 and R01CA116660 to S.A. Grupp; and K23GM110496 to S.L. Weiss), the Pennsylvania Department of Health (to S.A. Grupp), the Leukemia and Lymphoma Society (to S.A. Grupp and D.T. Teachey), the Jeffrey Jay Weinberg Memorial Foundation (to S.A. Grupp), the Children's Hospital of Philadelphia Hematologic Malignancy Research Fund (to D.T. Teachey), a Stand Up To Cancer–St. Baldrick's Pediatric Dream Team translational research grant (SU2C-AACR-DT1113 to S.A. Grupp, D.T. Teachey, D.M. Barrett, and S.L. Maude), St. Baldrick's Foundation Scholar Awards (to S.L. Maude and D.M. Barrett), and a Research Scholar Grant from the American Cancer Society (RSG-14-022-01-CDD to D.T. Teachey). Stand Up To Cancer is a program of the Entertainment Industry Foundation administered by the American Association for Cancer Research.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Cancer Discovery Online (http://cancerdiscovery.aacrjournals.org/).
- Received January 8, 2016.
- Revision received April 5, 2016.
- Accepted April 6, 2016.
- ©2016 American Association for Cancer Research.