The distribution of mutations in cancer is highly non-uniform. Mutations in oncogenes and tumour suppressors are enriched across cancers, and specific sites known as hotspots are more frequently mutated, leading to the hypothesis that hotspot mutations offer a selective advantage1. A paradigmatic example is the tumour suppressor p53. Although TP53 is mutated in more than 50% of cancers, only eight hotspot mutations make up approximately one-third of all missense TP53 mutations3. Several hypotheses have been offered to explain the mechanisms behind this skewed distribution, including biased generative mutational processes during tumour evolution2,3, degree of functional alteration3,4,5, structural stability3,6 and immune editing7,8. However, these hypotheses are not mutually exclusive. Mutations and subsequent selection can lead to substantial alterations in the concentration of oncogenic proteins9,10,11, a factor that has not been quantified as a contributor to the predominance of hotspot mutations. Generally, mutant p53 is present at a higher concentration than wild-type protein, depending on the tissue, copy-number alteration and mutation12,13,14. Yet, divergence from self and overexpression can contribute to mutant p53 neoantigen immunogenicity, constraining the ability of mutant p53 to avoid immune surveillance. Because neoantigens from mutations in tumour driver genes that are shared across patients and tumour types represent attractive immunotherapeutic targets15,16, understanding this issue is of critical importance. Here we examine the relationship between oncogenicity and immunogenicity for tumour driver mutations, using p53 as a primary example, to develop a model for predicting therapeutic targeting strategies, such as for neoantigen-based immune therapies.
We found that mutation frequency distributions for commonly mutated driver genes were conserved across multiple cancer mutation databases (Fig. 1a, b) and that innate mutation rates based on trinucleotide context significantly correlated with mutation frequencies for several genes (Supplementary Information). We next quantified amino acid conservation over homologous proteins, a proxy for functional phenotype (Fig. 1c), and in silico–predicted reduced neoantigen presentation by major histocompatibility complex class I (MHC-I) molecules (Fig. 1d) across driver genes7. Several genes have hotspots at conserved sites and are poorly presented (Fig. 1e), implying that the fitness advantages the mutations confer may be driven by both features. We focused on TP53 because it is widely mutated in tumours, with well-established, order-conserved pan-cancer hotspots (Fig. 1b and Supplementary Table 1) and broadly available functional phenotypic data5. We quantified the altered transcription factor function of mutant p53 across eight principal transcriptional targets with a quantitative yeast assay5 (Fig. 1f and Extended Data Fig. 1). We found that, although loss of transactivation was present for hotspot mutations, many non-hotspot mutations had comparatively low transactivation capacity. Moreover, we predicted MHC-I molecule presentation for the set of nonamer neopeptides surrounding p53 hotspot mutations to be worse than for non-hotspot peptides in The Cancer Genome Atlas (TCGA; P = 4.748 × 10–7, two-sided Welch’s t-test; Fig. 1g). Mutant p53 loss of transcriptional activity and neoantigen presentation of derived neopeptides showed only weak rank correlation (Fig. 1h), leading us to conclude that all of the mechanisms proposed to underlie mutant p53 fitness are likely to provide some predictive information.
We therefore sought to harmonize this proposed feature set within a mechanistic mathematical model of mutant p53 fitness17,18,19,20,21. A model based on background mutation rates alone was insufficient to separate the hotspots from other mutations (Fig. 2a). We further looked to capture variation in mutant p53 concentration, which affects both the transcription factor function and neoantigen presentation. We assigned TCGA samples a normalized p53 protein concentration and effective MDM2 promoter affinity to infer typical per-allele mutant-specific concentrations22,23. We consistently found a significant inverse relationship between these two variables across tumour types (Fig. 2b and Extended Data Fig. 2a) and a significant correlation between our concentration estimates and immunohistochemistry data (Extended Data Fig. 2b, c). We constructed a nonlinear, two-parameter model that separates mutant p53 fitness onto a positive pro-oncogenic probability and a negative immunogenic probability (Supplementary Methods) coupled to mutant p53 concentration. Each component is given an appropriate weight by maximum-likelihood fitting with respect to TCGA mutation frequencies. Our fitness model successfully predicts the distribution of mutation frequencies, both per mutation and per codon (Fig. 2c and Supplementary Information), and accurately predicts the increase or decrease in each mutant frequency with respect to background frequency (Extended Data Fig. 3a, b). We found that predicting the distribution of TP53 mutations requires both functional and immune components through determining the relative likelihoods of the models (Supplementary Table 2 and Supplementary Methods). Model optimization depended strongly on the sampled MHC-I haplotype and all mutant phenotypes (Extended Data Fig. 3c, d and Supplementary Information). We optimized and applied similar models to other driver genes, with conservation used as a proxy for function (Extended Data Fig. 4a and Supplementary Methods). Combined models were more predictive for mutation distributions with larger frequency variance across all database mutations, which implies that increased mutation frequency variance relates to increased selection, as expected from Fisher’s theorem24 (Extended Data Fig. 4b), such as for PTEN (Extended Data Fig. 4c). To build a predictive model for KRAS, we were able to include measured binding affinities to the downstream Raf effector protein for a limited set of hotspot mutations25 (Supplementary Methods), in addition to inferences in conservation and immunogenicity (Extended Data Fig. 4d).
To represent the landscape of mutant p53 fitness, we defined a ‘free fitness’ function of each mutation as the sum of the positive functional fitness, the negative immune fitness and the logarithm of the background frequency (Supplementary Methods), analogous to a free energy in statistical physics with the multiplicity of states derived from the background mutation rate. We plotted the free fitness landscape (Fig. 2d) and observed a general trade-off between intrinsic fitness (logarithm of the background frequency and functional fitness; Supplementary Methods) and extrinsic immune fitness. The trade-off observed in TP53 is reminiscent of other evolutionary trade-offs, and we theorized that TP53 hotspots were Pareto optimal26,27. We computed the Pareto front and identified the optimal fitness coordinate constrained by the front when using our model (Fig. 2d and Supplementary Methods). We found that hotspots had statistically higher free fitness (Fig. 2e) and occupied an optimal regime in which they successfully trade off between the pro-tumorigenic benefit of functional loss and the cost of presenting immunogenic neoantigens. However, there was substantial variation among the hotspot mutations. For instance, R175H is functionally the most wild-type-like hotspot but typically has the poorest MHC-I binding capacity. By contrast, the R248Q and R248W (R248Q/W) mutations have nearly complete loss of transcriptional function and therefore can more often afford to generate potentially immunogenic neoantigens, because the proliferative competitive advantage induced by mutation would offset the cost of immunogenicity. For KRAS, under more restrictive assumptions, we observed evidence for a trade-off between functional and immune fitness for hotspot mutations in pancreatic adenocarcinoma, where KRAS is typically mutated (Extended Data Fig. 4e and Supplementary Methods).
One possible explanation for the inverse relationship is that mutations that alter protein function are generally more likely to generate differentially immunogenic peptides. We therefore compared non-pathogenic and pathogenic mutations in a curated set of non-cancerous disease driver genes and found that both types of mutation generated comparably predicted immunogenic peptides (Extended Data Fig. 5), implying that the trade-off observed is not to be expected a priori. Moreover, because our functional predictions for mutant TP53 are based on precision yeast assays, we checked for evidence of an oncogenic–immunogenic trade-off using independent TCGA assay for transposase-accessible chromatin with sequencing (ATAC-seq) and RNA sequencing assay to develop a score for the lack of mutant p53 binding site occupancy (Supplementary Methods). We found that the functional component of our fitness model correlated significantly with lack of binding (Extended Data Fig. 6a) and that samples with increased lack of p53 binding consistently showed decreases in p53 target gene RNA expression (Extended Data Fig. 6b). We independently re-derived the oncogenicity–immunogenicity trade-off by comparing the inferred immunogenicity to our scores for lack of binding (Extended Data Fig. 6c). Finally, as a further control, we found a correlation between the yeast assay-derived probability of DNA binding and median target gene RNA expression conditioned on chromatin accessibility (Extended Data Fig. 6d).
We tested our immunogenicity predictions for mutant p53 using peptides from hotspot mutations predicted to be presented on human leukocyte antigen (HLA)-A*02:01 (Supplementary Table 3 and Supplementary Methods), which is the most frequent MHC-I allele in TCGA. First, we asked whether these peptides had differential ability to bind and stabilize HLA on the cell surface, using the TAP2-deficient human lymphoblastoid T2 cell line (Supplementary Methods). We found that R248Q/W peptides but not R175H peptide could significantly stabilize HLA-A*02:01 expression on T2 cells in a dose-dependent manner in comparison with the respective wild-type peptide sequence (Extended Data Fig. 7a and Supplementary Table 3). We next asked whether R175H and R248Q/W TP53 hotspot mutations elicit differential immune responses in vivo in patients with cancer. We identified seven HLA-A*02:01-positive patients with either bladder or ovarian tumours with these mutations and available peripheral blood mononuclear cell (PBMC) samples at Memorial Sloan Kettering Cancer Center (MSKCC). In total, three samples were from patients with R175H-mutant tumours (07E, 38A and 72J) and five samples were from patients with R248Q-mutant tumours (72J, 01A, 39A, 82A and 105A) (Supplementary Table 4). One patient’s tumour (72J) had both mutations, although the R175H clonal fraction was far lower (Supplementary Table 4). All but two patients (72J and 07E) were immunotherapy naive at the time of sample collection. Patient 72J, who had a tumour with both hotspot mutations, had an ongoing complete response to nivolumab (anti-programmed death (PD)-1) treatment with no disease detectable at the time of PBMC collection. Patient 07E, who harboured the R175H mutation, was on atezolizumab (anti-PD-L1) treatment at the time of PBMC collection. All other samples were collected before treatment initiation. We stimulated the PBMCs with peptides harbouring the R175H or R248Q mutations or with a CEF (cytomegalovirus, Epstein–Barr virus, and influenza virus) peptide pool or DMSO as positive and negative controls, respectively (Supplementary Table 3). We then measured the interferon-γ (IFNγ) and tumour necrosis factor-α (TNFα) production in CD8+ T cells by flow cytometry (Fig. 3a, b and Extended Data Fig. 7b). We found responses in three of the five R248Q samples, with the response proportional to the size of the CD8+ T cell population (Fig. 3a, b and Extended Data Fig. 7c, d). This indicates responses might correlate with the frequency of CD8+ T cell precursors recognizing the neopeptides. By contrast, only one of the three patients with R175H-mutant tumours had neopeptide reactivity; this patient (07E) had one of the largest expansions for the mutant TP53 allele and a concomitant increase in protein abundance as well as a positive response to anti-PD-L1 treatment (Fig. 3a and Extended Data Fig. 7e). This finding in combination with the lack of T cell reactivity in the immunotherapy-naive patient (38A) with four mutant R175H alleles indicates despite expansion of the mutant allele, R175H tends to be less immunogenic than R248Q/W, but anti-R175H T cell responses may be unleashed by immune checkpoint blockade therapy. Consistent with this, we found no reactivity in patient 72J, who harboured both hotspot mutations at lower abundance (Extended Data Fig. 7e) and had a complete response to immune checkpoint blockade therapy. This indicates that, in cancer, expansion and/or persistence of cognate T cell pools depends on the levels of the mutant protein.
We next asked whether differential immunogenicity of TP53 hotspots was a broad phenomenon in the healthy population and therefore potentially linked to the frequency of T cell precursors recognizing a mutant peptide. We compared the capacity of R175H and R248Q/W peptides when loaded onto autologous antigen-presenting cells to prime and expand specific T cells in two healthy donors with the HLA-A*02:01 allele (Extended Data Fig. 7b, Supplementary Table 3 and Supplementary Methods). We consistently noted greater IFNγ and Ki67 expression in T cells stimulated with R248Q/W peptides than in those stimulated with R175H peptides in both donors (Fig. 3c, d and Extended Data Fig. 7f). Furthermore, we assessed the yield of TP53 hotspot-specific T cell clones by multiplex identification of T cell receptor (TCR) antigen specificity (MIRA) assay (Adaptive Biotechnologies) in PBMC samples from 107 healthy donors representing a set of distinct HLA alleles, including 25 HLA-A, 46 HLA-B and 20 HLA-C alleles (Supplementary Methods). Forty mutant epitopes from R175, R282, R273 and R248 loci covering the top six p53 hotspots were screened for multiple peptide lengths. The distribution of normalized TCR yield per antigen peptide per donor, indicative of specific clonal expansion, was plotted for each hotspot position (Fig. 3e). Notably, we found that the R175 hotspot yielded statistically lower TCR reactivity per peptide as compared with all other hotspots, having a median value of zero reacting TCRs per peptide. Moreover, we found that hotspot reactivity corresponded to fitness model predictions (Fig. 3f). These results indicate that the MHC-I haplotype and TCR repertoire distributions of the healthy population may be more likely to react to the R248 locus than the R175 locus.
Validating the link between increased immunogenicity and immune response to mutant p53, we found that the protein abundance of the CTLA-4, PD-1 and PD-L1 immune checkpoint proteins was higher in TCGA samples with TP53 mutations that were predicted to be more immunogenic (Extended Data Fig. 8). Our results suggest increased immune activation and concurrent establishment of adaptive immune resistance. When we segregated survival on the basis of functional, immune and combined fitness in TCGA and a cohort of patients with non-small-cell lung cancer (NSCLC) treated with anti-PD-1 at MSKCC (Extended Data Fig. 9), we found that functional and immune fitness components were required to achieve significant survival separation in TCGA, whereas immune fitness on its own significantly separated immunotherapy-treated patients with NSCLC by survival. For robustness, we retrained our models across a range of relative weights between functional and immune fitness (Supplementary Methods). We demonstrated that both components contributed to a model optimized for survival separation across TCGA, with the functional component carrying greater weight, whereas the immune component was the main determinant for an equivalent model in the immunotherapy-treated NSCLC cohort (Fig. 4e).
Because germline TP53 mutations are the primary cause of Li–Fraumeni syndrome (LFS), which is a highly cancer-prone autosomal dominant disorder28, we theorized that mutant p53 fitness relates to the time to first tumour formation in patients with LFS. We plotted Kaplan–Meier curves showing the age of tumour onset for persons with germline missense TP53 mutations in the International Agency for Research on Cancer (IARC) R20 germline dataset and for an independent LFS cohort coordinated by the National Cancer Institute (NCI)29, stratified on the basis of mutant p53 fitness (Supplementary Methods). We found that functional and immune components were required for significant separation of patients based on time to onset, with the immune component required across a range of relative weights (Fig. 4a, b and Extended Data Fig. 10). These results may seem counterintuitive in that mutant p53 may be interpreted as ‘self’ by the adaptive immune system in patients with LFS. However, increased mutant p53 abundance, compounded by additional somatic mutations, may increase tumour immune surveillance and mutant p53 antigenicity during tumorigenesis. These findings suggest a possible role for immune surveillance and the potential for immune intervention in germline TP53-mutant tumours.
Finally, non-cancerous cells in diverse tissues harbour somatic TP53 mutations that confer a competitive advantage, predisposing the clones containing such mutations to develop into cancer30. We collated mutation data from multiple published works across many mutated tissues (Supplementary Information) and found the same cancer hotspots in non-neoplastic cells (Fig. 4c). Unexpectedly, however, the frequency of the hotspot mutations was different. R175H was markedly under-represented in non-neoplastic cells compared with tumours (P < 0.0001, two-sided binomial test), whereas the potentially more immunogenic R248Q/W mutations were among the most frequent. The addition of an immune component in the non-neoplastic setting improved predictions to a substantially lower degree than in the neoplastic setting (Fig. 4d and Supplementary Table 5), supporting the hypothesis that the difference in hotspot frequency between non-cancerous and cancerous datasets is driven by the hotspot mutation’s immune fitness. We then split the non-neoplastic TP53 mutation dataset into the largest tissue-specific subgroups and found that immune weight depended on the tissue type (Fig. 4d), although the weight was always weaker than the optimal value for fitting the TCGA mutation distribution. Overall, these findings suggest that more functionally fit mutations probably predominate in non-cancerous and precancerous lesions owing to their selective replicative advantage; for cancer to form, however, immune escape becomes critical (Fig. 4f).
We present a general mathematical framework for predicting the fitness of tumour driver mutations. For p53, we used a free fitness model that integrates the background mutation rate, protein concentration, functional fitness advantage and immune fitness cost. Hotspots were predicted to fall on a near-optimal Pareto front, with trade-offs constraining driver mutations from completely evading immune selection, as has been shown for specific hotspot mutations31,32,33. Immune fitness has less of a role in predicting the distribution of non-cancerous TP53 mutations, which is consistent with recent observations that immune editing is less relevant in precancerous lesions34. Our insights therefore help define a window of opportunity for prophylactic immune intervention against mutant p53. Additionally, our model shows that mutant p53 fitness may have a role in determining the age of tumour onset in LFS, implying a benefit in targeting germline TP53 mutations immunotherapeutically. Inducing prophylactic immunity against mutant p53 seems to be possible according to our in vitro data showing the possibility of inducing anti-mutant p53 T cell responses in healthy individuals and even against poorly immunogenic mutations when sufficient antigen concentration and proper immune co-stimulation are delivered. Our approach captures critical mechanistic determinants of mutant p53 fitness and is amenable to extensions as data become available. For instance, although we considered only functional alterations for a set of canonical p53-regulated genes in this study, future models can include additional new measures for describing mutant gain of function, such as novel binding interactions between mutant p53 and other molecules due to changes in protein conformation or concentration. Similarly, other functions reflecting the vital role of p53 as a central transcription factor may be incorporated with additional data, such as induction of apoptosis at the mitochondria, immune regulation and surveillance of transposons and other genome parasites. The latter evolutionary role of p53 in preserving genome integrity may be responsible for p53’s centrality as a bottleneck across transcriptional networks35,36,37. Finally, our free fitness framework lends itself naturally to interpretable, free energy-based machine learning models38, which broadens the applicability of our approach to additional topics and modalities. By quantifying the underlying mechanisms of driver mutation fitness, we can therefore uncover both fundamental knowledge about tumour evolution and new opportunities for precision therapies.