Abstract No. P1-21-08
2019 San Antonio Breast Cancer Symposium, December 10-14, 2019

Application of Machine Learning to elucidate the biology predicting response in the I‑SPY 2 neoadjuvant breast cancer trial

Sayaman RW, Wolf DM, Yau C, Wulfkuhle J, Petricoin E, Brown-Swigart L, Asare SM, Hirst G, Sit L, O'Grady N, Heditsian D, Albain KS, Chien A, Clark AS, Edmiston KK, Elias AD, Ellis ED, Euhus DM, Han HS, Isaacs C, Khan QJ, Lang JE, Lu J, Meisel JL, Mitri Z, Nanda R, Northfelt DW, Sanft T, Stringer-Reasor E, Viscusi RK, Wallace AM, Yee D, Yung R, Hylton NM, Liu MC, Park JW, Pohlmann PR, Symmans W, DeMichele A, Berry DA, Esserman LJ, LaBarge MA, van 't Veer L

Background

Machine learning relies on algorithms that learn patterns in large, complex datasets to predict outcomes. The adaptive, neoadjuvant I‑SPY 2 TRIAL evaluates novel agents added to standard therapy and identifies the subtype most responsive to each agent. While previously proposed genes/signatures reflecting an agent’s mechanism of action predicted pathologic complete response (pCR) in some treatment arms/subtypes, not all arms had strong predictive biomarkers. We use machine learning to explore the limitations of relying only on known mechanisms of action to predict pCR, and the extent to which biology outside known drug action improves response prediction, in the first 10 arms of the trial.

Methods

Our study involves 986 patients with pre‑treatment gene expression and pCR data across 10 treatment arms, including inhibitors of HER2: neratinib (N), pertuzumab (P), TDM1/P; AKT (MK‑2206; M); IGF1R (ganitumab); HSP90 (ganetespib); PARP/DNA repair (veliparib/carboplatin; VC); ANG1/2 (AMG386); immune checkpoints (pembrolizumab; Pembro); and a shared control arm (Ctr). Each arm/receptor-subtype group with at least 20 patients (n=19 groups) was evaluated independently, with 25% of its data held out as an independent test set. Models were built with a Random Forest ensemble algorithm with recursive feature elimination, using 3‑fold cross-validation with 10 repeats. In combination with clinical data, a three‑pronged feature‑selection approach was employed, where m denotes the number of genes: (1) restricted to mechanism-of-action genes: AKT/PI3K/HER (m=10), IGF1 (m=11), HSP90 (m=88), DNA repair (m=79), TIE1/2 (m=11), and immune (m=61), as well as HER2 amplicon genes; (2) expanded to include targeted pathways for all 10 agents/combinations plus ESR1 and proliferation genes (m=339); (3) an unbiased whole-genome approach (m=17,990). Models were considered predictive if AUROC ≥ 0.75, sensitivity ≥ 0.6, and specificity ≥ 0.6 in both cross-validation and the independent test sets.
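The predictive criterion stated above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' code: the function names, the 0.5 probability cutoff used to compute sensitivity and specificity, and the toy labels and scores are all assumptions made for illustration, since the abstract does not report how class calls were thresholded.

```python
from typing import Sequence, Tuple

# Thresholds taken from the Methods text.
MIN_AUROC, MIN_SENS, MIN_SPEC = 0.75, 0.6, 0.6


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (pCR, non-pCR) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one pCR and one non-pCR case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    y_true: Sequence[int], scores: Sequence[float], cutoff: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= cutoff are called pCR.

    The cutoff of 0.5 is an assumption; the abstract does not specify one.
    """
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < cutoff)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < cutoff)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def meets_thresholds(
    y_true: Sequence[int], scores: Sequence[float], cutoff: float = 0.5
) -> bool:
    """True if one evaluation set satisfies all three thresholds."""
    sens, spec = sensitivity_specificity(y_true, scores, cutoff)
    return auroc(y_true, scores) >= MIN_AUROC and sens >= MIN_SENS and spec >= MIN_SPEC


def is_predictive(cv_set, test_set, cutoff: float = 0.5) -> bool:
    """A model counts as predictive only if both the cross-validation
    predictions and the held-out test predictions pass."""
    return meets_thresholds(*cv_set, cutoff) and meets_thresholds(*test_set, cutoff)


# Toy example (made-up numbers): 1 = pCR, 0 = no pCR.
labels = [1, 1, 1, 0, 0, 0]
good_scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
poor_scores = [0.9, 0.3, 0.4, 0.6, 0.7, 0.1]
print(is_predictive((labels, good_scores), (labels, good_scores)))
print(is_predictive((labels, good_scores), (labels, poor_scores)))
```

Requiring the thresholds to hold on the held-out set as well as in cross-validation guards against models that only appear predictive because of the feature elimination performed during training.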

Results

Table 1 summarizes the results of our analysis (Yes=predictive; NA=no/insufficient data). Prediction of pCR using only genes reflecting the known mechanism of the drug succeeded in 5 subgroups, with DNA repair genes predicting VC response and immune genes predicting Pembro response in HR+HER2‑ and HR‑HER2‑ subsets, and AKT/PI3K/HER + HER2 amplicon genes predicting P response in HR+HER2+ patients. Expanding the feature set to include genes associated with the mechanisms of action of all drugs was sufficient to produce predictive models in 8 of 19 subgroups. Examples include DNA repair + immune genes predicting response to ganitumab in HR+HER2‑ and to N in HR+HER2+. The unbiased whole-genome approach yielded predictive models in 8 of 19 subgroups, including 5 with no predictive models from the first two approaches. Examples include HR‑HER2‑ N predictors enriched for metabolic, cell division and membrane protein proteolytic processes; HR+HER2+ TDM1/P predictors enriched for metabolic, stress response and cell cycle processes; and HR‑HER2‑ MK‑2206 predictors containing Ser/Thr kinases. In total, we identified predictive biomarkers in 14 of 19 subgroups across the three feature-selection approaches.

Conclusion

Our results suggest that hypothesis-driven analysis restricted to the assumed mechanisms of action of the experimental agents may be insufficient, and that exploration of possible off-target effects may be needed to understand the underlying biology of response or resistance.
