The immunopeptidome landscape associated with T cell infiltration, inflammation and immune editing i
Issuing time:2023-05-05 11:22
Abstract
One key barrier to improving efficacy of personalized cancer immunotherapies that are dependent on the tumor antigenic landscape remains patient stratification. Although patients with CD3+CD8+ T cell-inflamed tumors typically show better response to immune checkpoint inhibitors, it is still unknown whether the immunopeptidome repertoire presented in highly inflamed and noninflamed tumors is substantially different. We surveyed 61 tumor regions and adjacent nonmalignant lung tissues from 8 patients with lung cancer and performed deep antigen discovery combining immunopeptidomics, genomics, bulk and spatial transcriptomics, and explored the heterogeneous expression and presentation of tumor (neo)antigens. In the present study, we associated diverse immune cell populations with the immunopeptidome and found a relatively higher frequency of predicted neoantigens located within HLA-I presentation hotspots in CD3+CD8+ T cell-excluded tumors. We associated such neoantigens with immune recognition, supporting their involvement in immune editing. This could have implications for the choice of combination therapies tailored to the patient’s mutanome and immune microenvironment.
Main
Tumors are composed of heterogeneous populations of nonmalignant and malignant cells with variable genetic and epigenetic characteristics that shape their ability to coexist and coevolve. This evolutionary process diversifies the expression of tumor antigens, the human leukocyte antigen (HLA) presentation of those antigens to cytotoxic T cells and the induction and the duration of effective anti-tumor immunity. In patients with lung cancer, it has been shown that the tumor immune microenvironment (TME) is highly variable between and within patients1. Tumors have been grouped into two main subtypes, infiltrated and excluded, according to the magnitude of infiltration of cytotoxic T cells2,3,4. Patients with infiltrated tumors typically respond better to immune checkpoint blockade (ICB) therapy5. Never-smoker patients with lung cancer respond poorly to ICB6 and the low responsiveness is thought to be associated with low tumor mutational burden (TMB), low neoantigen load and lower expression of programmed cell death-ligand 1 (PD-L1)7,8. In addition, high density of tissue-resident memory T cells within non-small-cell lung cancers (NSCLCs) is associated with response to ICB9. However, most patients harbor excluded tumors and even patients with a high TMB may not respond10. Moreover, it remains unknown whether the repertoire of HLA-bound peptides presented in T cell-infiltrated lung cancer tumors is substantially different from the repertoire presented in excluded tumors, and which immunogenic antigens mediate tumor killing. Certainly, the rational development of more effective immunotherapy treatments targeting tumor antigens in T cell-infiltrated and -excluded tumors would benefit from a more complete understanding of the tumor antigenic landscape.
Immune editing of tumors is a dynamic process and the timing of immune pressure plays an important role in tumor evolution. Chronic tobacco smoking induces immune surveillance, promoting the growth of tumor clones capable of immune evasion early in carcinogenesis11. In a therapeutic setting, clonal neoantigens (that is, detectable in all cancer cells) were shown to have been eliminated after ICB treatment in resistant tumors12. It is commonly accepted that clonal mutated neoantigens are ideal targets for vaccine or adoptive cell therapies. However, the clonality and heterogeneity of other tumor-specific canonical and noncanonical antigens13 that can potentially manifest tumor recognition are largely unknown. Once identified, these new antigens may serve as biomarkers and guide the development of advanced personalized immunotherapy.
To capture the complex interplay between the tumor antigenic landscape and anti-tumor immunity in lung cancer, we integrated genomics, transcriptomics, immunopeptidomics, spatial transcriptomics and multiplexed immunofluorescence (mIF) imaging to investigate the antigenic landscape in tumors with variable degrees of immune infiltration. We surveyed 61 tumor regions and adjacent nonmalignant lung tissues in 8 patients with lung cancer and performed deep antigen discovery combining HLA-I and HLA-II mass spectrometry-based immunopeptidomics, identified tumor antigens and explored their heterogeneous presentation. We associated diverse immune cell populations with the HLA-II immunopeptidome and identified a panel of source proteins, the presentation of which is associated with either CD3+CD8+ T cell infiltration or inflammation. We found that CD3+CD8+ T cell-excluded tumors not only have a higher expression, but also a higher presentation efficiency of tumor-associated antigens (TAAs). A significantly higher frequency of predicted neoantigens within HLA-I presentation hotspots was detected in the excluded tumors and nonsmokers compared with T cell-infiltrated tumors or smokers. With an unbiased external resource of validated immunogenic neoantigens, we associated such neoantigens in presentation hotspots with immune recognition, supporting their involvement in immune editing. Our approach could guide the choice of combination therapies tailored to the patient’s mutanome and the TME.
Results
Characterization of the antigenic landscape and the TME
In the present study, we analyzed a collection of multiple lung tumor regions derived from the same masses and paired nonmalignant adjacent lung tissues (here defined as macro-regions) from 8 primary NSCLCs collected in treatment-naive patients. We subjected a total of 61 macro-regions from 5 lung adenocarcinomas (LUADs), 2 lung squamous-cell carcinomas (LUSCs) and 1 large-cell neuroendocrine carcinoma (LCNEC) to deep proteogenomic analyses which included generation of whole-exome sequencing (WES) and bulk RNA-sequencing (RNA-seq) datasets, as well as mass spectrometry-based HLA-I and HLA-II immunopeptidomics, applying data-dependent and -independent acquisition methods (DDA and DIA, respectively)14 (Fig. 1 and Supplementary Table 1). We accurately identified, in total, 102,323 HLA-I and 53,343 HLA-II peptides, as corroborated by the high fraction of peptides predicted to bind the respective HLA alleles (ranging from 90% in 02289 to 96.2% in 02672 for HLA-I and from 75.3% in 02287 to 84.2% in 02288 for HLA-II) and the typical peptide length distributions and binding specificities (Fig. 1, Extended Data Fig. 1a–e and Supplementary Tables 2 and 3). The exceptionally low recovery of peptides from samples 02288-5 and 02288-6 was probably due to the highly (95%) necrotic tissue (Fig. 1 and Supplementary Table 1). The number of identified HLA-I and -II peptides correlated with the amount of tissue available for analysis in individual patients (P = 0.027) but not across patients (P = 0.845; Extended Data Fig. 1f,g). Across patients, the number of HLA-I- and HLA-II-bound peptides correlated with the respective HLA expression as assessed by bulk RNA-seq (P = 0.0003 and 7.3 × 10−6, respectively; Extended Data Fig. 1h,i), suggesting important interpatient variability. This could relate to variable prevalence of immune cells, which typically express high levels of HLA molecules and may contribute substantially to the measured immunopeptidome.
A summary of tissues and analyses done on the multiregion tissues, as well as information on the number of somatic mutations affecting protein sequences passing our pipeline’s thresholds, mutational load, tumor purity, necrosis level, number of unique HLA-I and HLA-II peptides identified by mass spectrometry and the percentage of peptides predicted as binders to the respective HLA allotypes (rank <2%). Patient characteristics and processing information can also be found in Supplementary Tables 1 and 2.
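The percentage of predicted binders reported per sample can be illustrated with a minimal sketch. It assumes that each peptide already has a predicted percentile rank for every HLA allele of the patient, and that a peptide counts as a binder when its best (lowest) rank is below 2%. The function name, the input layout and the example peptides are hypothetical and are not taken from the authors' pipeline.

```python
def binder_fraction(peptide_ranks, rank_cutoff=2.0):
    """Fraction of peptides whose best predicted %rank is below rank_cutoff.

    peptide_ranks maps peptide -> {allele: predicted percentile rank}.
    This layout is an assumption made for illustration only.
    """
    if not peptide_ranks:
        raise ValueError("no peptides supplied")
    binders = 0
    for ranks in peptide_ranks.values():
        # A peptide is counted as a binder if any allele gives rank < cutoff.
        if ranks and min(ranks.values()) < rank_cutoff:
            binders += 1
    return binders / len(peptide_ranks)


# Hypothetical peptides and ranks, for illustration only.
example = {
    "PEPTIDEAA": {"HLA-A*02:01": 0.4, "HLA-B*07:02": 35.0},
    "PEPTIDEBB": {"HLA-A*02:01": 12.0, "HLA-B*07:02": 1.9},
    "PEPTIDECC": {"HLA-A*02:01": 8.0, "HLA-B*07:02": 6.0},
    "PEPTIDEDD": {"HLA-A*02:01": 0.1, "HLA-B*07:02": 0.2},
}
fraction = binder_fraction(example)
```

In this toy input, three of the four peptides have at least one allele with rank below 2%, so the fraction is 0.75.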
As expected, we found pathogenic mutations in oncogenes including KRAS and EGFR in LUAD samples, and multiple mutations in TP53 in both LUAD and LUSC samples (Fig. 2a), and prominent smoking mutational signatures were found in patients 02671, 03023, 02672 and 02290 (referred to below as ‘smokers’; Fig. 1). Principal component analysis (PCA) of genes known to be overexpressed exclusively in LUSC or LUAD tumors15 confirmed the classification of our samples (Fig. 2b and Supplementary Table 4). We calculated an inflammation score16 from bulk RNA-seq data using a defined immune-related gene panel17, shown to have optimal performance for lung cancer transcriptomes1. We assigned to each macro-region an inflammation status against the landscape of 1,012 LUADs and LUSCs from The Cancer Genome Atlas (TCGA) program (Fig. 2c,d). A wide range of inflammation was observed across patients and within individual patients, whereas the adjacent nonmalignant lung tissues were overall scored as inflamed.
a, Heat map of detected mutations (n = 157 mutations) that were annotated as pathogenic by the FATHMM prediction in COSMIC. Colors represent different patients and every line is a macro-region (n = 51 macro-regions). Mutations in KRAS, TP53 and EGFR are highlighted in red. b, PCA of genes associated with either LUADs or LUSCs confirming the classification of the samples. The list of genes was taken from Reili et al.15 and is provided in Supplementary Table 3 (n = 53 macro-regions). c, Inflammation scores calculated for each macro-region as well as LUAD and LUSC tumors from TCGA using expression levels of the immune-related gene panel as in Danaher et al.17. The different macro-regions (n = 53 macro-regions) of patients with lung cancer were superimposed on the TCGA data (n = 1,011 TCGA patients). d, Inflammation scores for each macro-region. The scatter plot denotes 53 regions of the 8 different patients; the red color denotes the healthy samples and red boxes denote the regions subjected to GeoMx analysis. In patient 02287, the tissue selected for GeoMx was not subjected to bulk RNA-seq and therefore not shown in this panel.
Spatial analysis of T cell infiltration and inflammation
Immune classification of lung cancer has proven quite challenging. Indeed, immune infiltration, as determined by detailed pathological evaluation, may disagree with infiltration status inferred by gene expression profiles1. Therefore, we determined the CD3+CD8+ T cell infiltration after pathological inspection with hematoxylin and eosin staining and mIF staining of T cell tumor infiltration markers (CD3, CD8, granzyme B (GrzB), Ki67, cytokeratin (CK) and DAPI) (Fig. 3a,b and Extended Data Fig. 2) in one randomly selected macro-region tissue per patient. The level of double-positive CD3+CD8+ T cells in tumor versus stroma areas and the level of GrzB in the tumor regions were relatively higher in samples 03023, 02290 and 02672. These samples were therefore assigned as CD3+CD8+ infiltrated and the remaining samples were assigned as CD3+CD8+ T cell excluded (Student’s t-test P = 0.036) (Fig. 3c).
a,b, The mIF images of 03023-02 (a) and 02288-07 (b) demonstrating the masking approach defining infiltration of CD3+CD8+ double-positive T cells expressing GrzB within tumor and stroma. c, The mIF quantification per patient (n = 8). Infiltrated samples (n = 3) have higher GrzB expression (dot size and inlay plot) and more CD3+CD8+ T cells in tumor than in stroma (one-sided Student’s t-test, P = 0.036). d, Micro-regions manually selected without independent repetition and classified into tumor, stroma, TLSs, CD45+-rich and ‘other’. Five micro-regions of sample 02671, representing 95 micro-regions, are shown. e, CD45 expression in tumor and stroma micro-regions calculated from the GeoMx transcriptome. The blue–red line and color scale denote the threshold classifying immune-high and immune-low tumors. Inset: CD45 expression in immune-high (n = 44 stroma and tumor micro-regions) or immune-low (n = 26 stroma and tumor micro-regions). f, Scheme of our relative classification. g, Expression in tumor micro-regions of immune activation markers calculated from the GeoMx transcriptome (excluded-high: n = 14; excluded-low: n = 11; infiltrated-high: n = 11; infiltrated-low: n = 7). h, The transcriptomes of all micro-regions (n = 95, GeoMx) were correlated with all macro-regions (n = 53, bulk RNA). The black boxes highlight correlations considering tumoral micro-regions per patient. i, The mean variance of these correlations in the boxes calculated as variance of correlation coefficients per patient. j, Increasing variance from tumors marked as infiltrated-low (02290, n = 7 tumor micro-regions), infiltrated-high (03023, 02672, n = 11 tumor micro-regions), excluded-high (02289, 02671 and 03421, n = 14 tumor micro-regions) and excluded-low (02287, 02288, n = 11 tumor micro-regions). k, LUSC tumors exhibiting a higher variance. 
l, In excluded tumors, the variance of correlation between tumoral micro-regions shown to be similar in LUADs (02287, 02671 and 03421, n = 14 tumor micro-regions) and LUSCs (02288 and 02289, n = 11 tumor micro-regions). m, The variance of correlation between macro- and micro-regions in excluded tumors. n, LUADs showing a significantly higher variance between micro- and macro-regions in excluded tumors (n = 14 micro-regions) rather than in infiltrated tumors (n = 11 micro-regions). Apart from c, one-sided Wilcoxon’s nonparametric tests were used. All boxplots show the median (line), the interquartile range (IQR) between the 25th and 75th percentiles (box) and 1.5× the IQR ± the upper and lower quartiles, respectively. No adjustments were made for multiple testing.
The presence of various immune cells is expected to affect the tumor antigenic landscape through potential immune editing, whereas immune cells are expected to contribute directly to the immunopeptidome. To explore the latter, we assessed overall inflammation level (on a scale of high versus low) by spatial transcriptome analyses using the GeoMx Cancer Transcriptome Atlas (CTA) platform. Using CD45, CK and DAPI (to capture immune cells, tumor and epithelial cells, and for segmentation, respectively) we selected for each patient defined micro-regions of interest that were subjected to spatial proteomic and transcriptional analyses. According to the morphological differences and the above markers, the selected micro-regions were annotated as: (1) tumor islets, (2) necrotic, (3) stroma (with variable contributions of tumor cells and immune cells), (4) CD45+ (immune) cell rich, (5) tertiary lymphoid structures (TLSs) and (6) other (including blood vessels and nonmalignant lung) (Fig. 3d, Supplementary Fig. 1 and Supplementary Table 5). CD45 expression in tumor and stroma micro-regions was relatively lower in sample 02290 compared with 03023 and 02672, as well as in samples 02287 and 02288 compared with 02289, 02671 and 03421. We therefore assigned samples 02290, 02287 and 02288 as relatively low and the rest as high inflammation (Fig. 3e).
Based on the above results, we grouped the patients in a two-dimensional (2D) space relative to each other. On the horizontal axis we ordered the patients on the scale of CD3+CD8+ T cell infiltration (excluded versus infiltrated) and on the vertical axis based on overall inflammation level (low versus high, Wilcoxon’s test P = 0.00022; Fig. 3f). Specifically in tumor micro-regions, the expression of the immune-related genes18 CCL5, CD274 (PD-L1), CD8A, CMKLR1, CXCL9, CXCR6, IDO1, LAG3, NKG7, PDCD1LG2 (PD-L2), PSMB10 and STAT1 followed the profile of CD45, supporting our classification (Fig. 3g and Extended Data Fig. 3a,b). This rather irregular classification was relevant for downstream assessment of immune editing mediated by CD3+CD8+ T cells and for the assessment of the global contribution of immune cells to the immunopeptidome. Furthermore, tumoral micro-regions in immune-infiltrated tumors are expected to better ‘mirror’ the bulk tissue because these micro-regions contain components of the immune compartment, as opposed to tumoral micro-regions of immune-excluded tumors. Indeed, correlating the GeoMx gene expression profiles of each tumor micro-region and the respective patient macro-regions’ bulk RNA-seq data revealed increasing variation (calculated as variance of correlation coefficients) from tumors marked as CD3+CD8+ T cell-infiltrated-low (02290, better mirror), CD3+CD8+ T cell-infiltrated-high (03023 and 02672), CD3+CD8+ T cell-excluded-high (02289, 02671 and 03421) and CD3+CD8+ T cell-excluded-low (02287 and 02288, poor mirror) (Student’s t-test P = 0.082; Fig. 3h–j), supporting our classification above. It is interesting that, compared with LUADs, LUSC tumors were reported to be more heterogeneous, due to both tumor-intrinsic factors (for example, driver mutations, copy number variations, gene expression profiles) and heterogeneous composition of the TME, and these are often linked19. 
Indeed, the above variance of correlations revealed that the two LUSC tumors are more variable than LUADs (P = 0.0019; Fig. 3k). We next minimized the bias introduced from the components of the immune compartment by calculating this variance only between tumoral micro-regions in the excluded tumors. The variance in LUAD (02287, 02671 and 03421) and LUSC (02288 and 02289) tumors was similar (P = 0.43; Fig. 3l). We then compared the variance of correlation between macro- and micro-regions similarly, only for excluded tumors, and found a higher variation for LUSCs compared with LUADs (P = 0.11; Fig. 3m), confirming that these two LUSC tumors are indeed more heterogeneous and the immune compartment may play an important role. Furthermore, considering only the five LUAD cases, we found a significantly higher variance of correlation between micro- and macro-regions in excluded tumors (P = 1.8 × 10−6; Fig. 3n), supporting our conclusion about this complementary approach to validate our classification.
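The "variance of correlation coefficients" used above can be sketched as follows. The sketch assumes Pearson correlation between each tumor micro-region profile and each macro-region profile of the same patient over a shared gene list, followed by the sample variance of the resulting coefficients. The text does not state the correlation type or the variance estimator here, so both are assumptions, and all function names are hypothetical.

```python
from math import sqrt
from statistics import variance


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length profiles with >= 2 genes")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


def correlation_variance(micro_profiles, macro_profiles):
    """Sample variance of all micro-versus-macro correlation coefficients.

    Each profile is a list of expression values over the same ordered genes.
    A low variance means the micro-regions 'mirror' the bulk tissue evenly.
    """
    coeffs = [pearson(mi, ma) for mi in micro_profiles for ma in macro_profiles]
    return variance(coeffs)  # assumption: sample (n - 1) variance


# Toy profiles: one micro-region tracks the bulk profile, one opposes it.
micro = [[1, 2, 3, 4], [4, 3, 2, 1]]
macro = [[1, 2, 3, 4]]
toy_variance = correlation_variance(micro, macro)
```

With these toy profiles the two coefficients are 1 and -1, which gives a sample variance of 2. Per-patient values computed this way could then be compared between groups with a nonparametric test.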
Biomarkers of immune infiltration in the HLA-II peptidome
HLA-II complexes are often abundantly and constitutively expressed on various immune cells in the TME. Furthermore, tumor-intrinsic and -extrinsic factors may influence their expression on the malignant cells. To investigate how such factors influence the HLA-II immunopeptidome, we first assessed the expression of the HLA-II presentation machinery in the different micro-regions. HLA-II machinery expression was higher in infiltrated-high tumor micro-regions compared with other groups, but similar to stroma micro-regions (except sample 03421, as explained below; Fig. 4a). In the CD3+CD8+ T cell-infiltrated-low sample, the expression of the machinery was higher in tumor micro-regions than in the stroma micro-regions, whereas, in excluded-high and excluded-low samples, the class II machinery was, as expected, more abundant in the stroma than in the tumor micro-regions (Fig. 4a). Next, we constructed a panel of source genes that were exclusively presented along the axis of infiltration (infiltrated versus excluded) and inflammation (high versus low), belonging to enriched immune-related terms (Extended Data Fig. 4 and Supplementary Table 6). For example, toll-like receptor 9 (TLR9) was presented in the HLA-II peptidome of infiltrated samples (03023 and 02672). TLR9 is known to be predominantly expressed by plasmacytoid dendritic cells and B cells20 and can reactivate immune surveillance to recognize tumor-specific antigens21. These results suggest that the HLA-II peptidome is influenced by the TME and it is a source of biomarkers that capture information about the TME.
a, Expression of genes of the HLA-II presentation machinery (HLA-DRA, HLA-DRB, HLA-DRB-3/4/5, HLA-DOA, HLA-DOB, HLA-DQA-1/2, HLA-DQB-1/2, HLA-DPA1, HLA-DPB1, HLA-DMA, HLA-DMB, CTSS and CD74) across all measured GeoMx regions (n = 95 micro-regions). b, Quantification of HLA-DRB expression in stroma and tumor regions by mIF. c, HLA-DR molecules expressed on the surface of cancer cells detected only in 03421 and 02672 samples with these tumors assigned as HLA-II+, representing n = 2 patients. Sample 02288 is shown as an example of an HLA-II− tumor, representing n = 6 patients. d, Expression of the transcription factor NKX2-1 in stroma (LUADs: n = 28; LUSCs: n = 9; LCNECs: n = 5) and tumor micro-regions (LUADs: n = 25; LUSCs: n = 11; LCNECs: n = 7) in LCNEC, LUAD and LUSC tumors. e, Expression of NKX2-1 in stroma, TLS and the CD45+ micro-regions (depicted here are stroma) and in tumor micro-regions in HLA-II+ (tumor: n = 12; stroma: n = 16), HLA-II− (tumor: n = 16, stroma: n = 9) and LUAD tumors. f,g, HLA-II sampling scores of source genes not found to be presented in any of the healthy tissues and found presented exclusively in HLA-II+ tumors (f) and their GO enrichment analysis (g). TOR, target of rapamycin. h, GO analysis of genes with higher expression in HLA-II+ (n = 12 tumor micro-regions; n = 16 stroma, TLS and CD45+ micro-regions) versus HLA-II− (n = 16 tumor micro-regions; n = 19 stroma, TLS and CD45+ micro-regions). ER, endoplasmic reticulum; NMDA, N-methyl-D-aspartate; UV, ultraviolet light. Top terms, according to the P value (Fisher’s exact test), are displayed. All statistical tests have been performed as one-sided Wilcoxon’s nonparametric test. All boxplots show the median (line), the IQR between the 25th and 75th percentiles (box) and 1.5× the IQR ± the upper and lower quartiles, respectively. No adjustments were made for multiple testing.
To explore this further, we assessed the expression of HLA-DRB across tumors and found higher expression in tumor regions than in stroma regions, specifically in the LUAD patients 03421 and 02672 (Fig. 4b), in whom HLA-II molecules were indeed immunolocalized to the membrane of tumor cells (assigned as HLA-II+ tumors; Fig. 4c). LUAD predominantly arises from a subset of alveolar type 2 (AT2) cells that are known to constitutively express HLA-II22,23. Mouse models suggest that de-differentiation of AT2 cells into a LUAD state is initiated by loss of the lineage transcription factor NKX2-1, which is a master regulator of pulmonary differentiation24. NKX2-1 was significantly more abundantly expressed in LUADs compared with LUSCs and LCNECs in tumor micro-regions, and slightly, yet not significantly, more in LUAD HLA-II+ tumors (samples 03421 and 02672; Fig. 4d,e). HLA-II peptides derived from source genes that were presented exclusively in the HLA-II+ tumors and not in any of the healthy tissues were associated with variable cellular processes (Supplementary Table 7). An interesting example is the category called activation of cysteine-type endopeptidase activity involved in the apoptotic process, including proteins such as CASP4, which is an inflammatory caspase that acts as an essential effector of inflammasomes25, and the human growth and transformation-dependent protein (HGTD-P), which promotes intrinsic apoptosis in response to hypoxia26 (Fig. 4f,g). HLA-II expression on the LUAD cancer cells may therefore reflect cancer intrinsic and de-differentiation states, but other factors may also be involved. 
Gene ontology (GO) enrichment analysis of genes overexpressed (z-score > 2) in tumor micro-regions of the two above HLA-II+ cases (patients 03421 and 02672), relative to all other patients, revealed a significant enrichment for genes associated with processing and presentation of exogenous antigens on HLA-II and on HLA-I, whereas terms related to cell cycle, regulation of transcription and cellular response to DNA damage were mostly enriched in HLA-II− tumors (Fig. 4h); however, these differences were not obvious when stroma, CD45+ and TLS micro-regions were analyzed (Fig. 4h). Overall, tumors 03421 and 02672 were classified as CD3+CD8+ T cell-infiltrated and -excluded tumors, respectively, suggesting a more complex underlying biology associated with the HLA-II immunopeptidome.
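The selection of overexpressed genes (z-score > 2) that feeds the GO analysis can be sketched as below. The exact z-score definition is not given in this section. The sketch assumes that the mean expression of a gene in the group of interest is standardized against the mean and sample standard deviation of the same gene in all other patients. Function and gene names are hypothetical.

```python
from statistics import mean, stdev


def overexpressed_genes(group_expr, rest_expr, z_cutoff=2.0):
    """Return {gene: z} for genes with z above z_cutoff.

    group_expr / rest_expr map gene -> list of expression values in the
    group of interest and in the reference group. ASSUMED definition:
    z = (mean(group) - mean(reference)) / stdev(reference).
    """
    hits = {}
    for gene, values in group_expr.items():
        ref = rest_expr.get(gene, [])
        if not values or len(ref) < 2:
            continue  # not enough data to standardize this gene
        sd = stdev(ref)
        if sd == 0:
            continue  # z-score undefined when the reference is constant
        z = (mean(values) - mean(ref)) / sd
        if z > z_cutoff:
            hits[gene] = z
    return hits


# Hypothetical expression values, for illustration only.
group = {"GENE_A": [10.0, 12.0], "GENE_B": [5.0, 5.0]}
rest = {"GENE_A": [4.0, 5.0, 6.0], "GENE_B": [4.0, 5.0, 6.0]}
hits = overexpressed_genes(group, rest)
```

Here only GENE_A passes the cutoff (z = 6), and the resulting gene list would be the input to an enrichment test such as Fisher's exact test.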
HLA-II peptidome associated with immune cells in the TME
Next, we explored the extent to which immune cell markers are captured by the immunopeptidome in the different groups of tumors. We leveraged a previously published immunopeptidomics dataset of isolated human immune cells before and after in vitro activation, including CD14+ precursor cells, immature and mature dendritic cells and CD19+ B cells, CD4+, CD8+ and their corresponding activated cells27. For each cell type, we obtained a list of source gene markers that were at >99th and >80th percentiles of the overall sampling score distribution across all the genes, for HLA-I and HLA-II immunopeptidomes, respectively (Supplementary Table 8), and assessed the presentation level of these immune cell markers in our cohort. Remarkably, significantly higher HLA-II presentation levels of CD8+ and CD4+ T cells, and their activated counterpart cells were found in infiltrated tumors and smokers, but not in the tumors annotated as immune high (Fig. 5a–c). By contrast, CD14+, immature and mature dendritic cells, as well as CD19+ and activated CD19+ cells, were significantly more represented only in the immune-high tumors (Fig. 5a–c). Not surprisingly, the HLA-I immunopeptidome did not reveal as much, potentially because HLA-I molecules are ubiquitously expressed (Extended Data Fig. 5). We concluded that activated CD8+ and CD4+ T cells are represented in the HLA-II immunopeptidome and even more substantially in their activated states, specifically in tumors annotated as T cell infiltrated and in smokers, whereas the presentation of B cells and dendritic cells is associated with overall high inflammation.
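The percentile-based choice of immune cell markers can be illustrated with a short sketch. It assumes one sampling score per source gene for a given cell type and keeps genes strictly above the chosen percentile of that distribution (99th for HLA-I, 80th for HLA-II in the text). The linear-interpolation percentile and all names are assumptions for illustration.

```python
def percentile(values, q):
    """Percentile (q in 0..100) of a non-empty list, linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("empty score distribution")
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def select_markers(sampling_scores, q):
    """Genes whose sampling score lies strictly above the q-th percentile."""
    cutoff = percentile(list(sampling_scores.values()), q)
    return sorted(g for g, s in sampling_scores.items() if s > cutoff)


# Hypothetical sampling scores for ten genes of one immune cell type.
scores = {f"g{i}": float(i) for i in range(1, 11)}
hla2_markers = select_markers(scores, 80)  # HLA-II threshold from the text
hla1_markers = select_markers(scores, 99)  # HLA-I threshold from the text
```

The stricter HLA-I threshold keeps fewer genes than the HLA-II threshold, as expected given that HLA-I ligands come from a much broader set of source proteins.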
a–c, Contribution of immune cells to the HLA-II immunopeptidome based on sampling scores of immune cell markers in tumors annotated as excluded (n = 29 tumor macro-regions) (a) and infiltrated (n = 15 tumor macro-regions), nonsmokers (n = 21 tumor macro-regions) and smokers (n = 23 tumor macro-regions) (b) and immune-high (n = 27 tumor macro-regions) and immune-low (n = 17 tumor macro-regions) (c) per cell type. P values were calculated using one-sided Wilcoxon’s test. The boxplots show the median (line), the IQR between the 25th and 75th percentiles (box) and 1.5× the IQR ± the upper and lower quartiles, respectively. No adjustments were made for multiple testing. d, The z-score distribution of the gene expression comparisons of tumor versus stroma + TLS + CD45+ micro-regions in the infiltrated-high samples. Genes in the upper quartile are more highly expressed in tumor micro-regions whereas those in the lower quartile are highly expressed in stroma micro-regions. e, Example of correlation of CD79B expression and B cell abundance in infiltrated-high samples (n = 26 stroma + TLS + CD45+ and tumor micro-regions). The error bands represent the 95% CI. f, The z-score distribution of the gene expression comparisons of tumor versus stroma + TLS + CD45+ micro-regions in excluded-high samples. g, Example of correlation of CD14 expression and macrophage abundance in excluded-high tumors (n = 34 stroma + TLS + CD45+ and tumor micro-regions). The error bands represent the 95% CI. h,i, Correlation of all genes attributed to stroma + TLS + CD45+ micro-regions (lower quartile) or with tumor micro-regions (upper quartile) with cell-type abundance in infiltrated-high (h) and excluded-high (i) samples. DCs, dendritic cells; NK cells, natural killer cells; Treg cells, regulatory T cells. 
j,k, Sum of sampling score for genes correlates with different immune cell type (Pearson’s correlation r > 0.5) in infiltrated-high (n = 2 patients and n = 163 genes) (j) and excluded-high (n = 3 patients and n = 168 genes) (k).
With an independent approach guided by the GeoMx transcriptome data, we further explored whether the presence of particular immune cell types in the different micro-regions could affect and contribute to the presented HLA-II immunopeptidome. We calculated the relative amount of immune cells in each micro-region17 (Extended Data Fig. 6a). As expected, immune cells were found to be more abundant in the stroma micro-regions than in the tumor micro-regions of excluded-high and excluded-low tumors, and vice versa in the infiltrated-low sample. Next, we focused on all source genes found to be presented in the HLA-II peptidome and further grouped these source genes as tumor related (upper quartile) or stroma, TLS and CD45+ related (lower quartile) (Fig. 5d–i), based on their expression in the micro-regions. We correlated their expression with the relative amount of immune cells (Pearson’s correlation coefficient; Fig. 5d–i and Extended Data Fig. 6) in each of the four groups separately. For example, the expression of stroma-, TLS- and CD45+-related CD79B gene correlated highly with the B cell abundance across all the micro-regions of the T cell-infiltrated-high patient samples (02672 and 03023), and the expression of the stroma-, TLS- and CD45+-related CD14 gene correlated highly with macrophages in excluded-high patients (03421, 02289 and 02671) (Fig. 5h,i, respectively). Last, to assess which immune cell types were most associated with the HLA-II peptidome, we summed up, per cell type, the HLA-II presentation sampling scores (which is an approximation of the presentation level) of all genes with Pearson’s correlation coefficient >0.5 (Methods, Fig. 5j,k and Extended Data Fig. 6). 
It is interesting that the HLA-II peptidome (represented by the presentation of these source genes) of infiltrated-high samples was associated with the presence of CD8+ T cells, cytotoxic T cells and exhausted CD8+ T cells in the tumor micro-regions, as well as most of the other immune cell types in the stroma, TLS and CD45+ micro-regions (Fig. 5j). By contrast, in excluded-high tumors, most of the immune cell types were contributing almost exclusively due to their presence in stroma, TLS and CD45+ micro-regions (Fig. 5k). These results highlight the influence that CD3+CD8+ T cell infiltration has on the HLA-II immunopeptidome.
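The per-cell-type summary described above (summing the HLA-II sampling scores of all genes whose expression correlates with a cell type's abundance at Pearson r > 0.5) can be sketched as follows. The input layout, in which expression and abundance vectors are aligned over the same micro-regions, and all names are assumptions. The actual procedure is described in the Methods.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def summed_scores_by_cell_type(gene_expr, cell_abundance, sampling_score,
                               r_cutoff=0.5):
    """Sum sampling scores of genes correlated (r > r_cutoff) with each cell type.

    gene_expr: gene -> expression per micro-region
    cell_abundance: cell type -> relative abundance per micro-region
    sampling_score: gene -> HLA-II sampling score
    """
    totals = {}
    for cell_type, abundance in cell_abundance.items():
        total = 0.0
        for gene, expr in gene_expr.items():
            if gene in sampling_score and pearson(expr, abundance) > r_cutoff:
                total += sampling_score[gene]
        totals[cell_type] = total
    return totals


# Hypothetical values over four micro-regions.
expr = {"G1": [1, 2, 3, 4], "G2": [4, 3, 2, 1]}
abundance = {"B cells": [2, 4, 6, 8], "macrophages": [8, 6, 4, 2]}
score = {"G1": 3.0, "G2": 1.5}
totals = summed_scores_by_cell_type(expr, abundance, score)
```

In this toy case G1 tracks B cell abundance and G2 tracks macrophage abundance, so each cell type receives the score of exactly one gene.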
HLA-I antigenic landscape and TAA presentation efficiency
The global HLA-I peptidome repertoire eluted from bulk tumor tissues is not expected to reveal immune-editing processes because peptides mainly derive from normal proteins and HLA-I molecules are ubiquitously expressed on nontumoral cells. Therefore, we focused on potentially immunogenic source antigens and we matched the mass spectrometry data against customized reference databases that included patient-specific genomic variants (SNPs and somatic mutations), as well as expressed noncanonical genes including long noncoding (lnc)RNAs, transposable elements and a publicly available ribo-seq-derived database of new open reading frames and pseudogenes (nuORFs)28 (see Methods for more information and Supplementary Table 9). Although we predicted 812–3,399 HLA-I- and 2,570–10,674 HLA-II-mutated neoantigens (MixMHCpred binding rank ≤2%) across the different samples, we could not detect any by mass spectrometry after manual inspection of tandem mass spectrometry (MS–MS) spectra. Similarly, HLA-II peptides from noncanonical sources were not confidently identified. We identified 18,342 and 12,856 HLA-I and HLA-II peptides, respectively, derived from canonical proteins that were not detected in the immunopeptidomes of adjacent healthy macro-regions and of other benign tissues after re-analysis of the HLA atlas29 (Supplementary Tables 2 and 3). Nevertheless, almost all of them were found to be expressed in the adjacent healthy tissues. We detected 218 unique peptides from transposable element sources and 773 unique peptides from other noncanonical sources such as lncRNAs and pseudogenes, but these were uniformly expressed in all tumor macro-regions as well as in the adjacent healthy tissues, indicating no tumor specificity (Extended Data Figs. 7 and 8 and Supplementary Table 9). In addition, most of the 1,409 nuORF-derived peptides were also found presented in the healthy macro-region tissues, with a fraction of those in addition detected in the HLA atlas29 (Extended Data Fig. 9 and Supplementary Table 9). The detection of the above noncanonical peptides was associated with HLA allotypes having basic amino acids in the carboxy terminus of their binding motifs, hence, in this small cohort, it was not feasible to associate the presentation level of such a new class of peptides with T cell infiltration.
Alternatively, we defined a set of 893 tumor-associated genes derived from canonical and noncanonical sources, collectively named TAAs, which were expressed (>1 transcript per million (TPM)) in at least one tumor macro-region but not in any of the nonmalignant tissues in the Genotype-Tissue Expression (GTEx) database (retaining genes with GTEx expression ≤1 TPM, except in testis) or in any of the adjacent healthy macro-regions (retaining genes with expression ≤1 TPM) (Fig. 6a, Extended Data Fig. 10 and Supplementary Table 10). Of these, 31 source TAAs were found to be presented by HLA-I in at least 1 macro-region in any of the patients. Presented-source TAAs were defined as those detected in the respective macro-region’s HLA-I immunopeptidome, whereas non-presented-source TAAs were those that were not detected, potentially due to lack of presentation resulting from too low expression or limited sensitivity of the immunopeptidomics analyses. Across patients, the expression of presented-source TAAs was higher in tumor macro-regions than in the adjacent healthy macro-regions (Fig. 6b) and higher than the expression of nonpresented-source TAAs (Fig. 6c,d). Furthermore, presented-source TAAs were expressed more abundantly on CD3+CD8+ T cell-excluded tumors (Fig. 6d) and source TAAs were presented mainly by HLA-I complexes (Wilcoxon’s test P = 1.7 × 10−8; Fig. 6e). To infer the propensity of a tumor to present TAAs, we computed the mean presentation efficiency of TAAs by normalizing the HLA-I sampling score with TAA gene expression and HLA-I expression levels (Methods). Remarkably, the mean presentation efficiency was higher in macro-regions of tumors classified as immune-low or CD3+CD8+ T cell excluded, and those of nonsmokers relative to inflamed-high, CD3+CD8+ T cell-infiltrated samples and smokers (Wilcoxon’s test P values of 0.0041, 0.045 and 0.27, respectively) (Fig. 6f–h). 
This suggests limited immune surveillance that may result in a more antigenic immunopeptidome landscape in the nonsmokers of this cohort and in CD3+CD8+ T cell-excluded tumors, and vice versa in smokers and infiltrated tumors.
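The expression-based definition of source TAAs given above can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function name `select_taa_genes`, the dictionary layout and the toy TPM values are hypothetical, and only the 1 TPM cutoff and the testis exemption come from the text.

```python
from typing import Dict, Iterable, List

TPM_CUTOFF = 1.0  # expression threshold quoted in the text (1 TPM)


def select_taa_genes(
    tumor_tpm: Dict[str, List[float]],
    healthy_tpm: Dict[str, List[float]],
    gtex_tpm: Dict[str, Dict[str, float]],
    exempt_tissues: Iterable[str] = ("testis",),
) -> List[str]:
    """Return genes expressed in at least one tumor macro-region but not in
    adjacent healthy macro-regions or in GTEx tissues (exempt tissues ignored).

    Hypothetical helper: genes missing from a reference table are treated as
    not expressed there, which is an assumption of this sketch.
    """
    exempt = set(exempt_tissues)
    selected = []
    for gene, tumor_values in tumor_tpm.items():
        # >1 TPM in at least one tumor macro-region
        expressed_in_tumor = any(v > TPM_CUTOFF for v in tumor_values)
        # <=1 TPM in every adjacent healthy macro-region
        silent_in_healthy = all(v <= TPM_CUTOFF for v in healthy_tpm.get(gene, []))
        # <=1 TPM in every GTEx tissue except the exempt ones
        silent_in_gtex = all(
            v <= TPM_CUTOFF
            for tissue, v in gtex_tpm.get(gene, {}).items()
            if tissue not in exempt
        )
        if expressed_in_tumor and silent_in_healthy and silent_in_gtex:
            selected.append(gene)
    return sorted(selected)


# Toy values, invented for illustration only.
toy_tumor = {"MAGEA4": [25.0, 0.4], "ACTB": [900.0, 850.0], "GENE_X": [0.2, 0.9]}
toy_healthy = {"MAGEA4": [0.0], "ACTB": [800.0], "GENE_X": [0.1]}
toy_gtex = {
    "MAGEA4": {"testis": 60.0, "lung": 0.1},
    "ACTB": {"lung": 700.0},
    "GENE_X": {"lung": 0.0},
}
toy_taas = select_taa_genes(toy_tumor, toy_healthy, toy_gtex)
```

In the toy input, only the gene that exceeds 1 TPM in a tumor macro-region while staying at or below 1 TPM everywhere except testis passes the filter.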
a, Tumor-associated source genes from canonical and noncanonical sources (n = 893 genes), collectively named TAAs, expressed in any of the tumor macro-regions but not in the GTEx databases (GTEx ≤ 1 TPM, except in testis) and not in any of the adjacent healthy macro-regions (≤1 TPM) defined by Wilcoxon’s one-sided test P = 2.22 × 10−16. No adjustments were made for multiple comparison. b,c, Across patients, expression of presented-source TAAs was higher in tumor macro-regions than in the adjacent healthy macro-regions (n = 29 TAAs) (b) and higher than the expression of nonpresented-source TAAs (n = 31 TAAs) (c). d,e, Presented-source TAAs (n = 31 TAAs) were expressed more abundantly across CD3+CD8+ T cell-excluded macro-regions (nonpresented_excluded: n = 148; presented_excluded: n = 45; nonpresented_infiltrated: n = 86; presented_infiltrated: n = 22; n refers to aggregated TAAs expression per macro-region) (d) and were presented mainly by HLA-I complexes (averaged across n = 41 HLA-I versus n = 43 HLA-II macro-regions, respectively; P = 1.7 × 10−8) (e). f–h, The presentation efficiency of TAAs was higher in macro-regions of tumors assigned as immune-low (n = 12 macro-regions) versus immune-high (n = 22 macro-regions) (f), nonsmokers (n = 17 macro-regions) versus smokers (n = 17 macro-regions) (g) and CD3+CD8+ T cell excluded (n = 20 macro-regions) versus infiltrated (n = 14 macro-regions) (h), with P values of 0.0041, 0.045 and 0.27, respectively. i, Heat map of source TAAs found to be presented exclusively in tumor macro-regions. Non-normalized log2(peptide intensity values) from the DIA analyses are shown. All statistical tests were performed as one-sided Wilcoxon’s nonparametric test.
We further retained source TAAs that were not found to be presented in any of the adjacent healthy macro-regions, resulting in 14 HLA-I and 4 HLA-II peptides (Fig. 6i). Ten HLA-I bound peptides derived from the melanoma-associated gene family sources MAGE-A1 and MAGE-A4, which are known to be expressed in many tumor types but not in normal tissues except for testis and placenta, were expressed and presented mainly in the CD3+CD8+ T cell-excluded LUSC tumors (that is, 02288 and 02289), supporting a previous study showing an association of MAGE-A4 expression in LUSCs compared with LUADs30. MAGE-A4 was the most abundantly expressed and presented TAA, from which six peptides in total were found in four patients and mostly in patient 02288. Furthermore, we found a new tumor-specific, noncanonical peptide in the tumor macro-regions of the CD3+CD8+ T cell-excluded and nonsmoker patient 02287, derived from the LINC02261 lncRNA.
Pruning of neoantigens from HLA-I presentation hotspots
We defined intratumor heterogeneity by calculating the prevalence of clonal mutations (observed in all macro-regions) and subclonal mutations (observed in a subset of the macro-regions) and inferred each tumor’s phylogeny (Fig. 7a)31. We found positive correlations between the detection of smoking mutational signatures and both TMB and the expression of GrzB in tumors (Student’s t-test P values 1.3 × 10−6 and 0.13, respectively; Fig. 7b–d). Furthermore, we found that CD3+CD8+ T cell infiltration (in patients 03023, 02672 and 02290), as well as smoking mutational signatures (in patients 02671, 03023, 02672 and 02290), were significantly associated with higher fractions of truncal mutations (Student’s t-test P values of 0.0066 and 0.019, respectively; Fig. 7b,e,f). Indeed, Łuksza et al. demonstrated recently that rare long-term pancreatic cancer survivors, who had stronger T cell activity in their primary tumors, developed recurrent tumors with less genetic heterogeneity and fewer high-quality immunogenic neoantigens, despite having more time to accumulate mutations32. They modeled neoantigen quality by the antigenic distance required for a neoantigen to differentially bind to the HLA or activate a T cell compared with its wild-type peptide and by the similarity to known antigens (Fig. 7g). In our cohort, we found that the most prominent difference in the quality of neoantigens was found among the truncal and private mutations in the two infiltrated-high patients 03023 and 02672, in whom truncal mutations had lower quality (Fig. 7h–j and Supplementary Table 11). These findings are evidence of neoantigen-mediated immune editing resulting in truncal tumors in smokers and are consistent with earlier results33.
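The clonal versus subclonal classification described above can be illustrated with a short sketch. It assumes that each mutation is annotated with the set of macro-regions in which it was detected, that 'private' means detected in exactly one macro-region and that 'shared' means detected in more than one but not all; these labels follow the figure legend, but the exact calling rules are assumptions and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def classify_mutation(regions_with_mutation: Set[str], all_regions: Set[str]) -> str:
    """Label a mutation as 'truncal', 'shared' or 'private' (assumed rules)."""
    n_hit = len(regions_with_mutation & all_regions)
    if n_hit == 0:
        raise ValueError("mutation not detected in any sampled macro-region")
    if n_hit == len(all_regions):
        return "truncal"  # clonal: observed in all macro-regions
    if n_hit == 1:
        return "private"  # subclonal: observed in a single macro-region
    return "shared"       # subclonal: observed in several, but not all


def truncal_fraction(mutations: Dict[str, Set[str]], all_regions: Set[str]) -> float:
    """Fraction of a patient's mutations that are truncal."""
    if not mutations:
        raise ValueError("no mutations supplied")
    labels = Counter(classify_mutation(r, all_regions) for r in mutations.values())
    return labels["truncal"] / len(mutations)


# Toy patient with three macro-regions; mutation names are placeholders.
toy_regions = {"R1", "R2", "R3"}
toy_mutations = {
    "mut_a": {"R1", "R2", "R3"},
    "mut_b": {"R1", "R2"},
    "mut_c": {"R3"},
    "mut_d": {"R1", "R2", "R3"},
}
toy_truncal_fraction = truncal_fraction(toy_mutations, toy_regions)
```

Here two of the four toy mutations are present in every macro-region, so the truncal fraction is 0.5.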
a, Phylogenetic trees based on all high-confidence mutations found across all regions per patient. b, The number of private, shared and truncal mutations in each patient plotted and fraction of truncal mutations calculated per patient (white numbers). For each patient, GrzB expression in tumor subregions based on mIF analysis and the defined CD3+CD8+ T cell infiltration status is indicated. Smoking status was defined based on deconvolution of the eight different mutational signatures and comparison to known mutational signatures from Alexandrov et al.62 with a threshold of >50% for tobacco smoking signature. c,d, Positive correlations found between the TMB and the smoking status (smokers n = 24 macro-regions; nonsmokers: n = 26 macro-regions; one-sided Student’s t-test P = 1.3 × 10−6) (c), as well as between the expression of GrzB in tumor subregions (smokers: n = 4 patients; nonsmokers: n = 4 patients; mIF, one-sided Student’s t-test P = 0.13) (d). e,f, A higher fraction of truncal (clonal) mutations was found to be significantly associated with smoking status (smokers: n = 4 patients; nonsmokers: n = 4 patients; one-sided Student’s t-test P = 0.019) (e) and with CD3+CD8+ T cell infiltration (infiltrated: n = 3 patients; excluded: n = 5 patients; one-sided Student’s t-test P = 0.0066) (f). g, Schematic overview of the predicted neoantigen quality model from Łuksza et al.32. h, Neoantigen quality score distributions of private and truncal mutations in each patient (02287: n = 99/121; 02288: n = 26/92; 02289: n = 79/130; 03421: n = 68/24; 02290: n = 21/225; 02671: n = 59/187; 02672: n = 38/489; 03023: n = 32/191 (private neoantigens/truncal neoantigens)). i,j, The ratio between the neoantigen quality of truncal versus private mutations in excluded and infiltrated tumors (excluded: n = 5 patients; infiltrated: n = 3 patients; boxplot lines show the mean) (i), as well as in nonsmokers (n = 4 patients) and smokers (n = 4 patients) (j). 
Unless indicated otherwise, all statistical tests were performed as one-sided Wilcoxon’s nonparametric test and boxplots show the median (line), the IQR between the 25th and 75th percentiles (box) and 1.5× the IQR ± the upper and lower quartiles, respectively. No adjustments were made for multiple testing.
By mining ipMSDB, a large collection of immunopeptidomics databases we acquired in recent years across a variety of tumor and healthy samples, we have previously observed that immunogenic mutated neoantigens accumulate in HLA-I presentation hotspots34, that is, regions in source proteins that are more frequently detected in immunopeptidomics datasets. Somatic mutations in these regions are therefore more likely to be presented than mutations in other regions or proteins that are rarely naturally presented. We theorized that, because of the immune pressure taking place during tumor evolution, cells expressing mutations within HLA-I presentation hotspots will be more frequently eliminated. We predicted in silico HLA-I neoantigen binding to the respective HLA-I allotypes of each patient (rank <2%), and examined for each predicted mutated peptide whether its exact wild-type counterpart peptide was included in the HLA-I presentation hotspot in ipMSDB (Supplementary Table 11). We exemplify this concept in Fig. 8a. The predicted neoantigen covering EXOSC8E178K is an ‘exact’ HLA-I presentation hotspot mutation, whereas the predicted neoantigens IDH1K236N and IGFBP1H148Y do not have a matched ‘exact’ wild-type peptide in ipMSDB. As controls, for each patient we calculated the presence of ‘exact’ matches covering synonymous variants, because these variants are not expected to be affected by immune pressure (Fig. 8a). A higher fraction of ‘exact’ nonsynonymous-predicted neoantigens was found for CD3+CD8+ T cell-excluded tumors versus infiltrated, whereas no difference was found in the fraction of synonymous mutations (P = 0.001 and 0.8, respectively; Fig. 8b,c). We normalized the fraction of nonsynonymous mutations with the fraction of synonymous mutations per patient to eliminate any inherent bias related to the overall representation of the patient’s HLA alleles in ipMSDB. The normalized fractions of ‘exact’ matches fell just short of significance (P = 0.054; Fig. 8d). 
A significantly lower fraction of ‘exact’ nonsynonymous-predicted neoantigens was detected also in tumors of smokers (patients 02671, 02290 and 03023, yet not in 02672) relative to nonsmokers (P = 2.3 × 10−8, Fig. 8e), whereas no difference was found in the fraction of synonymous mutations (P = 0.14, Fig. 8f). The normalized fractions of ‘exact’ matches were still significantly lower among smokers (P = 9.6 × 10−5, Fig. 8g). These results suggest that excessive immune pressure in T cell-infiltrated tumors and smokers may have led to the development of tumors expressing relatively fewer neoantigens within HLA-I presentation hotspots.
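The 'exact' match fraction and its normalization against synonymous controls can be sketched as follows. This is only an illustration: it assumes the hotspot reference is available as a plain set of wild-type peptide sequences, it takes the normalization to be the ratio of the nonsynonymous to the synonymous fraction (the 'nonsynonymous/synonymous' enrichment), and the function names and peptide strings are hypothetical placeholders.

```python
from typing import Iterable, Optional, Set


def exact_fraction(wild_type_peptides: Iterable[str], hotspot_peptides: Set[str]) -> Optional[float]:
    """Fraction of predicted neoantigens whose wild-type counterpart peptide
    is found verbatim in the hotspot reference set."""
    peptides = list(wild_type_peptides)
    if not peptides:
        return None  # no predicted neoantigens, fraction undefined
    hits = sum(1 for peptide in peptides if peptide in hotspot_peptides)
    return hits / len(peptides)


def normalized_enrichment(
    nonsyn_wild_types: Iterable[str],
    syn_wild_types: Iterable[str],
    hotspot_peptides: Set[str],
) -> Optional[float]:
    """Ratio of the nonsynonymous 'exact' fraction to the synonymous one."""
    f_nonsyn = exact_fraction(nonsyn_wild_types, hotspot_peptides)
    f_syn = exact_fraction(syn_wild_types, hotspot_peptides)
    if f_nonsyn is None or not f_syn:
        return None  # undefined when the control fraction is missing or zero
    return f_nonsyn / f_syn


# Placeholder sequences, not real ipMSDB entries.
toy_hotspots = {"AAAAAAAAK", "LLLLLLLLV", "KKKKKKKKV"}
toy_nonsyn = ["AAAAAAAAK", "QQQQQQQQR", "LLLLLLLLV", "WWWWWWWWK"]
toy_syn = ["KKKKKKKKV", "GGGGGGGGR", "TTTTTTTTK", "SSSSSSSSL"]
toy_enrichment = normalized_enrichment(toy_nonsyn, toy_syn, toy_hotspots)
```

With the toy input, half of the nonsynonymous and a quarter of the synonymous wild-type peptides match the reference, giving an enrichment of 2.0. A value above 1 would indicate more hotspot neoantigens than the synonymous background predicts.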
a, EXOSC8E178K, an example of ‘exact’ HLA-I presentation hotspot neoantigen. IDH1K236N and IGFBP1H148Y are examples of ‘nonexact’. b, The fraction of predicted neoantigens with nonsynonymous mutations matching ‘exact’ wild-type peptides in ipMSDB that is significantly higher in excluded (n = 31 macro-regions) than in infiltrated (n = 17 macro-regions) tumors (P = 0.001). c, No difference found when considering predicted neoantigens with synonymous mutations (P = 0.8, n as in b). d, Enrichment of ‘exact’ neoantigens in excluded tumors of nonsynonymous versus synonymous mutations per patient (P = 0.054). e, The fraction of nonsynonymous ‘exact’ neoantigens shown to be significantly higher in nonsmokers (n = 24 macro-regions) than in smokers (n = 24 macro-regions; two macro-regions were excluded because of lack of neoantigens) (P= 3.1 × 10−8). f, No difference found when considering synonymous mutations (P = 0.2, n as above). g, In smokers versus nonsmokers, significant enrichment per patient (P = 4.3 × 10−6, n as above). h, Similar enrichment in immune-high (n = 38 samples), -low (n = 52 samples) and -mixed (n = 46 samples) tumors of the TRACERx cohort1. i, Mean expression of immune markers17 in TRACERx cohort grouped by smoking status1 (never-smokers: n = 11; ex-smokers: n = 73; recent ex-smokers: n = 48; current smokers: n = 10; n refers to samples). j, The enrichment per smoking status1. k, TRACERx cohort re-classified (light: n = 39; intermediate: n = 76; and heavy smokers: n = 21; n refers to samples), considering mutational signature of tobacco smoking and pack-years1. l, The enrichment in the refined classification. m, Probability of inducing spontaneous CD8+ T cell responses to ‘exact’ and ‘nonexact’ neoantigens calculated using Gartner et al.’s cohort of validated immunogenic mutations35. n, Parameters used to calculate the relative immunogenicity per macro-region. o, The relative immunogenicity of our eight patients. 
p,q, Relative immunogenicity shown to be higher in nonsmokers (n = 24) versus smokers (n = 24) (p) and in excluded (n = 31) versus infiltrated tumors (n = 17) (q), P = 2.3 × 10−8 and 0.001, respectively (n refers to macro-regions). One-sided Wilcoxon’s nonparametric test was used for b–g, p and q and one-sided Student’s t-test for h–j and l. Boxplots show the median (line), IQR between the 25th and 75th percentiles (box) and 1.5× the IQR ± the upper and lower quartiles. No multiple testing adjustments were made.
To validate these results, we first analyzed samples from 63 patients from the TRACERx lung cancer cohort for which both WES and RNA-seq data were published by Rosenthal et al.1 (Methods). Initially, we directly used the immune score classification reported by Rosenthal et al.1, who also used the Danaher et al. method17 to estimate immune cell populations. With this larger dataset, we again found a higher fraction of ‘exact’ neoantigen matches (enrichment of nonsynonymous/synonymous) in tumors classified as having a low immune score compared with high immune score tumors (Student’s t-test P = 0.026; Fig. 8h and Supplementary Table 12). Furthermore, as expected, T cells, exhausted CD8+ T cells and cytotoxic cells were positively associated with the smoking status documented for these patients (Fig. 8i). Remarkably, a higher enrichment of nonsynonymous/synonymous ‘exact’ matches was observed for never-smokers compared with smokers (Student’s t-test P = 0.054; Fig. 8j). In addition, when we re-classified the patients into ‘light’, ‘intermediate’ and ‘heavy smokers’, according to the cumulative smoking severity, considering both the level of mutational signature of tobacco smoking and pack-years, we found a significantly higher enrichment of nonsynonymous/synonymous ‘exact’ matches in the ‘light’ group (Student’s t-test P = 0.02; Fig. 8k,l).
Finally, to assess to what extent predicted mutated neoantigens matching ‘exact’ peptide sequences in ipMSDB can mediate spontaneous CD8+ T cell responses in patients, we reanalyzed a large dataset published recently by Gartner et al.35, where immunogenicity was assessed by the mini-gene screening approach for thousands of mutations in tens of patients across tumor types. Importantly, this screening method is unbiased because it is not dependent on HLA-binding affinity prediction and, in addition, immunopeptidomics and HLA presentation hotspots information were not considered as selection criteria and therefore could not bias the results. We downloaded data for 77 patients, for which WES, RNA-seq and at least one confirmed immunogenic mutation were available. We analyzed the WES and RNA-seq datasets and flagged the mutations as: ‘immunogenic’, ‘nonimmunogenic’ and ‘not tested’ by the mini-gene approaches (when applicable and as reported by Gartner et al.35). We found that mutations predicted to be covered with at least one ‘exact’ match neoantigen have a fivefold higher probability of inducing spontaneous CD8+ T cell responses compared with all other mutations (Fig. 8m). We therefore derived the probabilities of a mutation being immunogenic, with Pexact = 0.0195 and Pnonexact = 0.00392, and with these probabilities we calculated the relative immunogenicity for each macro-region of our eight patients (see Methods for more details; Fig. 8n). After normalizing for the total number of mutations, the relative immunogenicity of tumors was higher in the nonsmokers than in the smokers, and higher in CD3+CD8+ T cell-excluded than in CD3+CD8+ T cell-infiltrated tumors (Student’s t-test P = 2.3 × 10−8 and 0.001, respectively, Fig. 8o–q). These results support our conclusion that ‘exact’ neoantigens are associated with CD3+CD8+ T cell-mediated recognition and that the lower fraction of ‘exact’ matches in smokers is associated with immune editing.
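The fivefold figure follows directly from the two reported probabilities, and a relative immunogenicity score can be sketched from them. The exact formula is given in the Methods, which are not reproduced here, so the per-mutation expectation below is an assumption made for illustration: each mutation contributes Pexact or Pnonexact depending on whether it is covered by at least one 'exact' match neoantigen, and the sum is divided by the total number of mutations. The function names are hypothetical.

```python
P_EXACT = 0.0195      # probability reported in the text for 'exact' mutations
P_NONEXACT = 0.00392  # probability reported in the text for all other mutations


def fold_difference() -> float:
    """Ratio of the two reported probabilities (about fivefold)."""
    return P_EXACT / P_NONEXACT


def relative_immunogenicity(n_exact: int, n_nonexact: int) -> float:
    """Expected fraction of immunogenic mutations in one macro-region.

    Assumed form: probability-weighted mutation counts, normalized by the
    total number of mutations. Not necessarily the formula of the study.
    """
    total = n_exact + n_nonexact
    if total == 0:
        raise ValueError("macro-region has no mutations")
    expected_immunogenic = n_exact * P_EXACT + n_nonexact * P_NONEXACT
    return expected_immunogenic / total


# Toy macro-regions with the same mutation count but different 'exact' shares.
toy_few_exact = relative_immunogenicity(n_exact=10, n_nonexact=90)
toy_many_exact = relative_immunogenicity(n_exact=40, n_nonexact=60)
```

Under this assumed form, a macro-region with a larger share of 'exact' mutations scores higher even when the total mutation count is unchanged, which matches the direction of the comparison reported for nonsmokers and excluded tumors.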
Discussion
A key barrier for improving efficacy of advanced personalized immunotherapies that are tailored to specific tumor antigens or the patient’s mutanome, such as neoantigen cancer vaccines and adoptive transfer of neoantigen-enriched T cells, remains patient stratification and the characterization of the antigenic landscape. We therefore aimed to deeply characterize the tumor antigenic landscape and the TME using multiple -omics and imaging approaches. Characterization of the TME from bulk RNA-seq data in lung cancer tissues is challenging, not only in the small cohort we studied here, but also in larger cohorts of tens of samples, as reported by Rosenthal et al.1, where lung cancer samples with high inflammation scores were finally classified by pathologists as having low infiltration of cytotoxic T cells and vice versa. Technical variability related to sampling of mirrored formalin-fixed paraffin-embedded (FFPE) tissue sections for staining, and snap-frozen tissues for RNA extraction, which may also include variable amounts of adjacent nonmalignant lung tissue, as well as the natural wide tissue heterogeneity, can be sources of such discrepancies. To overcome this, we applied mIF imaging techniques in combination with GeoMx spatial transcriptome analyses to define niches in the tissues. This approach facilitated the annotation of the samples in a 2D space. On the horizontal axis we ordered the patients on the scale of CD3+CD8+ T cell infiltration as excluded and infiltrated, and on the vertical axis we ordered them based on overall inflammation level indicative of immune-low and -high tumors. Importantly, mIF and GeoMx data were generated for one macro-region per patient, whereas bulk RNA-seq was done on all macro-regions. 
However, as the bulk RNA-seq approach was inconsistent for defining the immune compartment using the immunoscore, we did not focus on studying variability between macro-regions of each patient, and instead we compared the groups of patients, considering the different macro-regions as multiple biological replicates per patient.
TAAs were rarely found to be presented by HLA-II complexes. In addition, HLA-II molecules were found to be expressed directly by tumor cells only in samples 03421 and 02672. We therefore hypothesized that the HLA-II peptidome could represent the tumor-immune compartment. Higher or similar gene expression of the HLA-II machinery was found in stroma and tumor micro-regions of T cell-infiltrated samples, whereas in excluded samples, as expected, the machinery was more abundant in the stroma than in the tumor. Activated anti-tumor CD3+CD8+ T cells secrete interferon-γ that enhances HLA-II expression on neighboring cells in the TME. Hence, insights into the composition of the immune compartment can be uniquely captured by the HLA-II peptidome. We demonstrated that CD8+ and CD4+ cells were represented in the HLA-II immunopeptidome and even more profoundly in their activated states, specifically in tumors annotated as CD3+CD8+ T cell infiltrated and in smokers, whereas the presentation of activated B cells and dendritic cells was associated with overall high inflammation. It is interesting that, from the HLA-II presentation level of the source genes that were found to correlate most strongly with different immune cell subtypes in stroma or tumor micro-regions, the presence of CD3+CD8+ T cells, cytotoxic and exhausted cells in tumor micro-regions distinguished excluded-high and infiltrated-high samples. We have shown that the HLA-II peptidome captures the presence and activation of immune cells in the TME. Furthermore, we demonstrated an association between the presentation of several HLA-II peptides and T cell infiltration or inflammation. Therefore, if validated in a larger cohort, the repertoire of HLA-II peptides derived from immune-related genes should allow the classification of a TME. It may help the design of peptide-specific therapeutic modalities by revealing potential tumor-specific targets and reflecting the anti-tumor immune activation state.
Until now, it has been unclear whether CD3+CD8+ T cell-excluded tumors express and present TAAs to the same extent as infiltrated tumors. From our results in eight patients with lung cancer, we concluded that, rather unexpectedly, CD3+CD8+ T cell-excluded tumors express TAAs more abundantly and have a higher presentation efficiency of TAAs.
Furthermore, we found that the most prominent difference in the quality of neoantigens32 was present in infiltrated-high tumors, where truncal mutations had a lower quality. In infiltrated tumors and smokers, mutations were probably edited during tumor evolution11. In addition, a significantly higher frequency of predicted neoantigen sequences within HLA-I presentation hotspots was detected in the excluded tumors and in nonsmokers, potentially due to the absence of immune surveillance. This was further validated in the TRACERx cohort. We further demonstrated that the probability of inducing spontaneous CD8+ T cell responses against mutations predicted to be covered with at least one ‘exact’ match neoantigen was about fivefold higher than for mutations covered by ‘nonexact’ predicted neoantigens. Accordingly, in our cohort, the relative immunogenicity of tumors was higher in the nonsmokers and CD3+CD8+ T cell-excluded tumors than in the smokers and T cell-infiltrated tumors, respectively. We therefore propose that accumulation of mutations in presentation hotspots reflects limited immune pressure and lower infiltration of T cells, leading to development of rather heterogeneous and branched tumors.
Nonsmoker patients with lung cancer respond poorly to ICB6 and it has been suggested that the low responsiveness is associated with low TMB and lower expression of PD-L1. However, our results from the present study suggest that, even when low in number, neoantigens in nonsmokers and CD3+CD8+ T cell-excluded tumors have potentially a better chance to be presented to T cells. Consequently, adoptive transfer of neoantigen-enriched autologous T cells, in combination with immune modulators that can revert inhibitory signals in the TME and facilitate homing and persistence of the T cells, could potentially have a therapeutic impact. On the other hand, in CD3+CD8+ T cell-infiltrated tumors or smokers, too few immunogenic tumor antigens may be presented probably due to prolonged immune editing. In this case, additional therapeutic interventions, for example, epigenetic modulation, targeted therapy, DNA-damaging chemotherapy, irradiation or even hypoxia-inducing anti-angiogenesis therapy, may be needed to induce the expression of new tumor-specific antigens. An integrated exploration of the tumor antigenic landscape and the TME composition would advance the development of personalized immunotherapies that are more effective by tailoring them to clinically relevant tumor antigens for each patient, and identifying which patients are most likely to benefit from these treatments.