Xinqidi,Your biological research partner
Xinqidi Biotech Co.,Ltd,Wuhan,China 2008-2021
R&D 13th year

Chromatin-based, in cis and in trans regulatory rewiring underpins distinct oncogenic transcriptomes

Issuing time:2021-09-18 09:38

Abstract

Multiple myeloma is a genetically heterogeneous cancer of the bone marrow plasma cells (PC). Distinct myeloma transcriptome profiles are primarily driven by myeloma initiating events (MIE) and converge into a mutually exclusive overexpression of the CCND1 and CCND2 oncogenes. Here, with reference to their normal counterparts, we find that myeloma PC enhanced chromatin accessibility combined with paired transcriptome profiling can classify MIE-defined genetic subgroups. Across and within different MM genetic subgroups, we ascribe regulation of genes and pathways critical for myeloma biology to unique or shared, developmentally activated or de novo formed candidate enhancers. Such enhancers co-opt recruitment of existing transcription factors, which although not transcriptionally deregulated per se, organise aberrant gene regulatory networks that help identify myeloma cell dependencies with prognostic impact. Finally, we identify and validate the critical super-enhancer that regulates ectopic expression of CCND2 in a subset of patients with MM and in chronic lymphocytic leukemia.

Introduction

Multiple myeloma (MM) is a common, genetically heterogeneous and incurable cancer of the bone marrow (BM) plasma cells (PC)1, the terminally differentiated, immunoglobulin-secreting B lineage cells. The first level of genetic heterogeneity in MM is imparted by well-defined myeloma-initiating events (MIE) that are associated with distinct transcriptome profiles. In nearly half of MM cases, MIE include overexpression of the oncogenes CCND1, MAF and MMSET by their juxtaposition to the immunoglobulin heavy chain (IgH) enhancer, thus defining the t(11;14), t(14;16) and t(4;14) cytogenetic subgroups, respectively. Hyperdiploidy (HD), a functionally heterogeneous subgroup characterised by additional odd number chromosomes, is the MIE in the rest of MM cases2. Secondary events comprising copy number aberrations, single nucleotide variants and indels generate additional genetic heterogeneity and further shape the distinct impact of the MIE on oncogenic transcriptomes3. This heterogeneity converges, in most cases, to a functionally dichotomous, mutually exclusive overexpression of the cell cycle regulators CCND1 and CCND2 to which myeloma PC remain addicted, irrespective of primary or secondary genetic events4,5. The transcriptional mechanisms that result in CCND2 over-expression, seen in nearly 50% of MM cases and spanning all genetic subgroups except t(11;14), are not fully known. A previous study of a single myeloma cell line identified but not validated a super-enhancer that spans the promoter of CCND2, leaving the possibility of a distal enhancer/super-enhancer regulating transcription of CCND2 unexplored6.

Chromatin accessibility profiling by ATAC-seq has been used to characterise the regulatory landscape of hundreds of different solid tumour and blood cancers, such as chronic lymphocytic leukaemia (CLL)7,8. Further, by means of transcription factor (TF) footprinting, ATAC-seq allows inference of TF binding profiles. This, in combination with paired transcriptome profiles, enables the construction of gene regulatory networks8,9. Such networks may identify TFs with previously unrecognised roles in the biology of a given cancer.

In previous studies10,11 of chromatin-based regulatory changes of gene expression, myeloma was treated as a homogeneous cancer. However, how heterogeneity links chromatin accessibility and regulatory status with distinct oncogenic gene expression profiles has not been elucidated, while global insights into the interplay between in cis and in trans regulatory factors of gene transcription in MM are limited.

Here, by integrating chromatin accessibility dynamics of myeloma PC with their respective transcriptome profiles and other epigenetic datasets we resolve the chromatin changes that regulate distinct oncogenic transcriptomes and biological pathways. Through this process we discover cis- and trans-regulators of myeloma biology, including those involved in the regulation and aberrant expression of CCND2 in MM.

Results

Enhanced accessibility of distal chromatin elements is associated with gene over-expression in myeloma plasma cells

We isolated fresh, highly purified BM (CD19+ /−) PC from three healthy normal donors (ND) and myeloma PC from 30 MM patients, covering the main MIE subgroups as defined by fluorescent in situ hybridization (Fig. 1a, Supplementary Fig. 1a and Supplementary Data 1) and spanning both diagnostic and relapsed stages of the disease. For each sample, we obtained paired chromatin accessibility and transcriptome profiles by ATAC-seq and RNA-seq, respectively.

Fig. 1: Over-accessible distal chromatin classifies myeloma genetic subgroups.

a Study patient population and design. Transcriptome and chromatin accessibility were assessed by RNA-seq and ATAC-seq respectively in myeloma PC from 30 MM patients and PC from three healthy donors. For two of the controls we obtained samples of both CD19+ and CD19 PC. b Changes in average ATAC-seq signal over pan-myeloma (left) and subgroup (right) peaks expressed as normalized log2 myeloma/normal PC read count. Significant changes in the pan-myleoma plot (left) are shown in a darker colour. c Logistic regression of number of differentially accessible regions (DAR) within 500 kb of a gene (x-axis), whether a gene has a significantly more open promoter in myeloma (red – yes, black – no) on the probability of the gene being differentially upregulated in myeloma. Points represent fraction of genes upregulated with a given number of DAR within 500 kb and TSS status, and lines represent model fit, grey ribbon 95% confidence interval. Genes with 10 or more DAR were pooled together. d Correlation of signal for differentially accessible ATAC-seq regions and differentially expressed genes within the same topology associating domain (TAD). eVariance explained by the first 17 latent factors (LF) for each of RNA-seq and ATAC-seq signals as calculated by multi-omics factor analysis (MOFA). f, g LF scores for each sample for the first five latent factors in the MOFA model. Subtype for each sample is denoted by colour, as per (a). LF1 and LF2 distinguish normal from MM samples, LF5 distinguishes MAF and CCND1 samples from the rest. h LF scores for a subset of regions and genes used as predictors by MOFA analysis and previously identified as MM-subgroup classifiers12. The main driving oncogenes MAF, CCND1, CCND2 and NSD2are highlighted here.

ATAC-seq analysis generated 295,238 chromatin accessibility peaks from all samples (Supplementary Fig. 1b). These peaks were non-randomly distributed on promoters, coding and intergenic regions, and were found to be unique or shared between two or more genetic subgroups (Supplementary Fig. 1b and c). In each genetic subgroup and overall, chromatin accessibility of myeloma PC was enhanced in comparison to ND PC (Fig. 1b and Supplementary Data 2).

Transcriptome profiling identified 3036 differentially expressed genes (DEG) between myeloma and ND PC, including overexpression of known MIE-driven oncogenes in myeloma PC (Supplementary Fig. 1d and Supplementary Data 3). In addition, using genesets previously identified as classifiers for each myeloma genetic subgroup and the corresponding MIE12, we clustered the transcriptome profiles of our samples along with those of 892 myeloma PC samples from the MMRF CoMMpass study (Supplementary Fig. 1e). This analysis confirmed the accuracy of genetic subgroup annotation of our cohort samples by FISH, and assisted in the classification of samples with unknown cytogenetics status.

Malignant vs normal state analysis showed that over-expressed genes were associated with significantly more non-TSS peaks with increased accessibility than non-overexpressed genes (Supplementary Fig. 1f), and the distance from over-expressed genes to the closest such peak was shorter (Supplementary Fig. 1g). Combined logistic regression analysis revealed that the number of more accessible non-TSS regions was predictive of over-expression (p < 2 × 10−16) but a more open promoter was not (p = 0.5; Fig. 1c), although gain in TSS accessibility was mildly predictive when ignoring distal peaks (over expression odds ratio of 1.3 for TSS accessibility gain vs not, p = 0.01).

Using the 3D genome architecture of GM12878 B cells as reference13, we found a significant enrichment for differentially accessible regions (DARs) correlating with DEG within the same topologically associating domain (TAD)14 as compared to permuted DEG-DAR associations (p < 0.001, see Fig. 1d). We also found that the strength of correlations, i.e., mean DAR-DEG correlation coefficient and the percentage (%) of DAR-DEGs that show significant correlation (p < 0.05), is higher when DEG-DAR pairs are in the same TAD compared to distance matched controls not in the same TAD (p = 0.003, Supplementary Fig. 1h and i).

Thus, myelomagenesis leads to an overall increase in chromatin accessibility and, within the spatial limits of TADs, chromatin decondensation at distal regions rather than at promoters is associated with gene over-expression in the cancer state.

Over-accessible chromatin in myeloma PC partially distinguishes distinct myeloma transcriptomes according to MIE

Next, we sought to explore and potentially resolve the heterogeneity present across MM patients in an unsupervised manner, by employing multi-omics factor analysis (MOFA)15(Fig. 1e and f, Supplementary Fig. 2a). We found that within the combined ATAC-seq/RNA-seq model, chromatin accessibility accounted for more variance than expression (Fig. 1e). Of the top five identified MOFA latent factors (LF), the mostly chromatin accessibility-driven LF1 and the mainly transcriptome-driven LF2 distinguished ND from myeloma PC; the next three factors, LF3-5 separate different MM patients from another. To investigate possible sources of this heterogeneity, we superimposed the MIE-defined genetic subgroup characterisation of each sample. This revealed a clear separation of ND, MAF and CCND1 subgroups and less so of HD and MMSET by combined LF2, 3 and 5 (Fig. 1f and g).

By interrogating further the MOFA predictors underlying the observed segregation, we found that genes, previously identified as MIE-defined genetic subgroup classifiers, along with their linked chromatin changes delimited in the same TAD, displayed the same subgroup-segregation profile on those three LF (Fig. 1h). We validated this using two MAF-, two CCND1- and one MMSET-translocated myeloma cell lines as a test set not used in training the model, and confirmed separation of samples according to MIE in the LF2-LF3-LF5 MOFA space (Supplementary Fig. 2b–d).

To explore the properties of MOFA clustering further, we applied silhouette width (i.e., a measurement of cluster cohesion and separation) and Discriminant Ratio (i.e., the ratio of between group variance to within group variance; see “Methods”). Silhouette width showed that the combination of accessibility and transcriptome information of LF1-5 provides a clear separation of ND, MAF and CCND1 but not HD and MMSET subgroups, and this separation was clearer than in the model built using RNA-seq alone (Supplementary Fig. 2e).

When the discriminant ratio was applied to measure the ability of each LF to discriminate subgroups, LF5 had the highest ratio at 22.7. To understand how much discriminatory power ATAC-seq data was adding to the model, we trained a second model using the transcriptome data only. The best discriminating factor in this model was LF4, with a discriminant ratio of 11.3, showing that the addition of chromatin accessibility data almost doubled the power of the most discriminatory factor to separate subgroups. Applying linear discriminant analysis to each model (Supplementary Fig. 2f), showed the subtypes are well separated in the combined ATAC-RNA model, with the exception of one MMSET sample located with the HD samples, and one HD sample located with the CCND1 samples. Conversely, for the RNA-only samples, while MAF and ND samples are well separated, the HD, MMSET and CCND1 samples are largely overlapping.

Since there was no difference between CD19+ and CD19 ND PC cells, they were merged for subsequent analyses.

Given these findings, we conclude that combined epigenome–transcriptome-based categorisation of MM maps away myeloma PC from normal PC while within the main MM genetic subgroups it clearly delineates the MAF and CCND1 subgroups and less so the HD and MMSET subgroups.

The myeloma enhanceome is linked to known and novel MIE-associated genes and biological pathways

To expand our ability to delineate heterogeneity and given the importance of distal genetic elements into the regulation of MM transcriptomes, we next sought to identify the changes that together comprise the myeloma-specific enhanceome using a genetic subgroup supervised approach. We identified distal regions and genes where genetic subgroup (i.e., MAF/MMSET/CCND1/HD/ND) is a significant explanatory variable for accessibility/expression using an omnibus LRT test to maximise power compared to pairwise testing (adjusted LRT p-value <0.05 and >2-fold change between at least one MIE and ND, Supplementary Data 3). Integration of these two sets (4635 regions and 3,096 genes) gave 4199 DAR–DEG pairs within 1 Mb of one another (Supplementary Data 4), comprising 2581 unique DAR and 1354 unique DEGs. Hierarchical clustering of the DAR accessibility signal clearly separated the different myeloma MIE subgroups from each other (Fig. 2a). While this showed a group of regions where changes correlated with MIE, it also highlighted accessibility changes that are shared between different myeloma subgroups. The MAF subgroup for instance, demonstrated the highest number of distal DAR that are >2-fold different from ND, some of which were unique to MAF while others were also >2-fold more accessible in other genetic subgroups. To validate this further, we clustered myeloma PC (n = 26) from a previous study11 using these same 2581 regions, and confirmed that these samples mostly clustered with samples carrying the same MIE from this study (Supplementary Fig. 3a).

Fig. 2: Developmental origins and oncogenic pathways regulated by the myeloma enhanceome.

a Heatmap showing the ATAC-seq signal for all peaks found to be differentially open and within 1 Mb of a significantly differentially regulated gene. Data are row scaled. Samples are clustered using Pearson’s correlation distance and the bar above the columns shows the subtype of the sample coded as in Fig. 1a, i.e., MAF: orange, CCND1: light blue, MMSET: green, HD: blue, others: pink and ND PC: black/grey. Vertical bars to the right highlight regions where signal is >2-fold different to ND PC. b Example region around HGF showing: average ATAC-seq signal in each subtype; H3K27ac signal in a MAF-translocated cell line (orange) or CCND1-translocated cell line (blue); FP density: density of footprints in each of the ATAC-seq signals. c, d Enrichment analysis for upregulated genes with a peak of increased accessibility within 1 Mb using gene sets from the “Oncogenic signatures”, “Hallmarks” or “Curated gene sets” subsets of the MSigDB database. e Chromatin state of differentially open pan-MM peaks across a developmental range of B-cell types as determined by the Blueprint Epigenomics consortium using ChromHMM (nB naive B cells, GCB germinal center B cell, mB memory B cell, tPC tonsil PC, MM myeloma PC). Enhancers with de novo formed peaks in myeloma (254) are indicated. Chromatin states with strong combined H3K27ac and H3K4me1 signals were considered as active enhancers and indicated by an asterisk. f Radial plot of Homer motif analysis displaying the top 50 over-represented TF motifs in de novo myeloma enhancers.

Of note, 977 of the 2581 (38%) DAR were proximal to more than one DEG, while 962 of 1354 DEG (71%) were within 1 Mb of more than one DAR (Supplementary Data 4), and often in more than one genetic subgroup. These findings suggest that diverse MIE functionally converges to aberrantly regulate the same regions of chromatin, a process consistent with chromatin accessibility-based convergence evolution.

At least 14 of the DEG linked to distal DAR have been implicated in myeloma pathogenesis, including HGF16, DKK117 and UCHL118 (Fig. 2b, Supplementary Fig. 3b and c and Supplementary Data 4). In two myeloma cell lines studied, these same regions are marked by H3K27ac, a histone hallmark of active enhancers, and therefore in terms of chromatin status they can be considered as bona fide enhancers.

The 1354 unique DEGs linked to 2581 unique DAR showed overall enrichment for previously defined myeloma transcriptional signatures (Supplementary Fig. 4a), and amongst other pathways, they were also notably enriched for the oncogenic RAS pathway, activated in 40% of MM patients3,19 (Fig. 2c and Supplementary Data 5a). There was also enrichment of the Hedgehog pathway, previously implicated in the regulation of CCND1 and CCND220 (Fig. 2c) an over-representation of genes marked by H3K27me3 and Polycomb repressive complex components in several cell types (Supplementary Fig. 4b, Supplementary Data 5b) and enrichment for genes selectively expressed in neuronal cell types, particularly in the HD and MMSET subgroups (Supplementary Fig. 4b and c, Supplementary Data 5c).

Finally, enrichment of previously validated MIE geneset classifiers was highest in their corresponding genetic subgroups (Fig. 2d).

Together, these findings suggest that DAR with regulatory potential are to a large extend shaped by MIE, although secondary genetic events are also likely to contribute.

Developmentally ‘re-commissioned’ and de novo formed enhancers in MM

Next, we sought to gain insights into the developmental origins of the MM over-accessible candidate enhancers linked to DEG. For this purpose, we tracked the candidate enhancer chromatin status, as defined by combinatorial enrichment of histone marks (ChromHMM states21), across different mature B lineage cells22 (Fig. 2e and Supplementary Data 6).

Considering that an active enhancer requires combined enrichment for H3K27ac with H3K4me123, we identified 254 (out of 832) DAR with predicted regulatory activity over 201 DEG within the same TAD that are only present in myeloma and not normal PC (Fig. 2e and Supplementary Data 6), i.e., they are de novo formed (e.g., the enhancers predicted to regulate HGF and UCHL1 shown above). TF motif enrichment analysis of these 254 DAR identified IRF and MEF families of TF amongst others as possible leading transcriptional regulators of their activity (Fig. 2f). In addition, a smaller number of DAR are ‘re-commissioned’ in myeloma PC, i.e., active in one or more B lineage cells but inactive in ND PC (Supplementary Data 6).

Therefore, biochemical annotation of the distal over-accessible chromatin profile identifies enhancers that are myeloma PC unique or developmentally inherited.

TF ‘rewiring’ reveals myeloma dependencies with prognostic impact

Next, we employed ATAC-seq footprinting24 to identify the predicted association of DNA binding factors with chromatin across myeloma and ND PC. In total, 138 of 254 expressed TF (Fig. 3a and Supplementary Data 7a–c) displayed higher or similar predicted binding frequency in at least one myeloma subgroup compared to ND PC, and included TF such as XBP1, IRF4 and PRDM1 known to regulate myeloma transcriptomes, but also TF such as CBFB and ZNF384 which have not been previously linked to myeloma biology (Fig. 3a, b and c). Another 116 TFs were predicted to bind to chromatin in at least one MM subgroup but not in ND PC, including established, subgroup-specific oncogenic drivers (e.g., MAF). Almost one third of these 116 TF were predicted to be exclusively active in individual subgroups, with ISL2, a neural TF25 not previously linked to MM, showing activity solely in the HD subgroup (Fig. 3a–c, and Supplementary Data 7b).

Fig. 3: ‘Rewiring’ of transcription factors underpins aberrant regulatory gene networks in MM.

a Heatmap representation of the relative frequency of TF footprints across different MM genetic subgroups. TF are clustered based on their presence in both normal donor and myeloma PC (top) or in myeloma PC only (bottom). Grey values indicate lack of TF expression and/or predicted binding. b Footprints of indicated TFs identified in HD myeloma PC as determined by ATAC-seq. c Difference in relative footprint frequency of active TFs between different myeloma subgroups and normal donor PC. d TF dependency analysis using high-throughput CRISPR screens data from DepMap database (20 myeloma cell lines representing CCND1, MAF and MMSET genetic subgroups; colour-coded). Of the 247 TF shown, 137 are predicted to generate dependency, the top 100 of which are shown here. Examples of established and novel (bold) dependencies are indicated. In this analysis, dependency is defined as CERES score < −0.2 (horizontal dotted line) in at least 4/20 myeloma cell lines. e) TF that are differentially expressed between ND PC and myeloma genetic subgroup PC.

CRISPR/Cas9 screens involving depletion of 247/254 TF as retrieved from the DepMap database, suggested myeloma cell dependency on 55% (137/247) of TF in at least 4/20 myeloma cell lines at a CERES score of <−0.2 (Fig. 3d and Supplementary Data 7d), and confirmed MM cell dependency on the TF CBFB and ZNF384 identified by ATAC-seq footprinting (Fig. 3c and d). Interestingly only 13% (18/137) of these TF are differentially expressed in one or more myeloma subgroups compared to ND PC (Fig. 3e). This is consistent with a pattern of TF activity ‘re-wiring’ in myeloma PC that does not necessarily require transcriptional deregulation of the TF themselves.

To define this ‘re-wiring’ of TF further, for each MM subgroup and ND PC we built TF regulatory gene networks based on the weighted frequency of binding and level of expression (Fig. 4a). In general, compared to ND PC, we observed a higher number of active TF in all myeloma subgroups. These formed higher density regulatory connections with other TF and displayed auto-regulatory loops (Supplementary Fig. 5a and Supplementary Data 7e), commensurate with increased binding frequency for >90% of TF in each myeloma subgroup (Supplementary Fig. 5b). For example, the established myeloma cell dependencies and PC lineage-defining TF IRF4, XBP1 and PRDM1, are also predicted to be connected to a higher number of other TF in HD, MMSET and CCND1 subgroups than in ND PC, while TF of the MEF family that have not been linked to myeloma biology before and are predicted to regulate de novo formed myeloma enhancers (Fig. 2f) display higher connectivity and betweenness (centrality). A high level of myeloma cell line dependency to MEF2C in the DepMap CRISPR/Cas9 screen (Fig. 3d), highlights an important role of this TF in the biology of MM, least because of its inferred role in activating transcription of TF such as IRF4, XBP1 and PRDM1 (Supplementary Data 7f).

Fig. 4: TF regulatory gene networks provide biological and clinical insights into MM disease.
figure4

a TF regulatory gene networks per myeloma subgroup and normal donor PC, as inferred from footprinting analysis. TF are weighed by relative TF binding frequency (colour) and expression (node size). b Heatmap of ranked expression of CXXC1, BPTF, MAZ, KLF13, CBFB and RFX5 across >1000 human cancer cell lines (CCLE dataset). Multiple myeloma cell lines are highlighted in red. Bar plot depicts the number of cell lines per cancer group. c MM patient stratification based on CXXC1, BPTF, MAZ, KLF13, CBFB and RFX5 expression (red, high; blue, low) and analysis of overall survival using the Multiple Myeloma Arkansas (n = 414) and the MMRF Compass Dataset (n = 745). HR hazard ratio.

Focusing on the MAF-translocated subgroup, we performed ChIP-seq to obtain the cistrome of oncogenic MAF in the MAF-translocated myeloma cell line MM.1S (Supplementary Fig. 5c). MAF binding was predominantly enriched at TSS/promoters and intergenic areas (Supplementary Fig. 5d). Motif analysis performed on the MAF binding sites identified significant enrichment of the MAF motif (Supplementary Fig. 5e) and motifs of IRF1-4, NRF2/NFE2L2, ATF3, BACH1 TFs (Supplementary Fig. 5f), which were also predicted to be active within the MAF subgroup gene regulatory network (Supplementary Data 9b).

Six TFs (CXXC1, BPTF, MAZ, KLF13, CBFB and RFX5) were identified as showing higher connectivity in all myeloma subgroups compared to ND PC (Supplementary Data 7e), thus exemplifying the process of existing TF ‘re-wiring’ in MM. These TF, none of which have been previously linked to MM, share another three notable features: (a) they demonstrate dependency on DepMap upon CRISPR/Cas9 and siRNA screens (Fig. 3d and Supplementary Fig. 5g), (b) display highest expression in multiple myeloma cell lines compared to >1000 other cancer cell lines; and (c) their higher expression has a significant adverse impact on survival in two independent myeloma patient cohort datasets (Fig. 4b and c). In addition, myeloma cell dependency on CXXC1 was further validated by independent shRNA knockdown experiments (Supplementary Fig. 5h). Together, this multi-layered approach reveals TF dependencies and prognostic variables in MM.

Identification and characterisation of the CCND2 super-enhancer

Furthermore, we sought to identify and characterise the regulatory mechanisms of CCND2overexpression in MM.

The LF5 of MOFA analysis completely separated MAF-translocated from CCND1-translocated samples (Fig. 1f and g), placing extreme opposite weights on the expression of CCND2 and CCND1, respectively (Fig. 5a). LF5 also places high importance on a set of open-chromatin regions upstream of CCND2, linking them to enhanced expression of CCND2 and transcripts upstream, but not downstream, of CCND2 (Fig. 5b). Accessibility of this region correlates with CCND2 activity irrespective of MIE (Fig. 5c). Further, super-enhancer calling using H3K27ac and MED1 chromatin marks in MAF-translocated MM.1S myeloma cells identified the region of interest as a bona fide super-enhancer by both markers (Fig. 5d and Supplementary Fig. 6a).

Fig. 5: A super-enhancer regulates transcription of CCND2 in myeloma.
figure5

a Inverse correlation between expression of CCND1 and CCND2 across all MM and normal PC samples. The best linear regression fit is indicated as a blue line with standard error in grey. bLoadings from MOFA analysis for 5 regions and 6 genes within 1 Mb of CCND2 that were selected for MOFA analysis for latent factors 1–5. Both regions and genes ordered in a 5’->3’ direction. c Average chromatin accessibility of regions upstream of CCND2 and CCND2 gene expression across all samples in this study. d Genomic tracks visualization of the CCND2 region. From top to bottom: Hi-C signal for interactions with CCND2 promoter in GM12878 B cells; target location of sgRNAs designed for the CRISPRi experiment; Predicted footprints for MAF TF in MAF-translocated (orange) and MMSET-translocated (green) samples; ChIP-seq signal against MAF, H3K27ac and MED1 in MM.1S cells; normalised ATAC-signal in each subtype, ordered by mean CCND2 expression. e CCND2 expression as assessed by RNA-seq in samples across different MM subgroups and normal donor PC. Subgroups as indicted in adjacent tracks in (d). Boxplots display all values as points, whisker’s box (min to max) and mean expression per subgroup. f CCND2 expression as assessed by qPCR 4 days after CRISPRi of CCND2 super-enhancer and promoter regions. Two sgRNAs were employed to target the promoter (P) and each of four major peaks (1–4) of the CCND2 super-enhancer, as shown in (d), and compared with a non-targeting control (Gal4). Error bars represent mean + SEM for three independent experiments. Statistical analysis: one-way ANOVA with Dunnett’s post-hoc multiple comparisons correction. *p < 0.05, ***p < 0.001, ****p < 0.0001. gHeatmap illustration of TF motifs significantly enriched in CCDN2high versus CCND2low HD samples, as identified by differential footprinting analysis. Differential footprints were identified in peaks 3 and 4 (see Fig. 5d).

Consistent with it being a ‘re-commissioned’ enhancer of CCND2, the region of interest is Polycomb repressed in GCB and PC but active in naïve and memory B cells (Supplementary Fig. 6b); accordingly, CCND2 is expressed in naïve and memory B cells but not GCB cells or PC (Supplementary Fig. 6c).

Using Hi-C genome data from GM12878 B cells13, we identified exclusive, long-range interactions of the CCND2 promoter with the upstream accessible clusters of the putative enhancer (Fig. 5d and Supplementary Fig. 6d). In a complementary approach, we employed KRAB-dCAS9 CRISPRi in MAF-translocated myeloma cells to repress the activity of four prominent constituent peaks 1–4 (Fig. 5d) which engage in high-frequency interactions with the CCND2 promoter. As expected, targeting a promoter accessibility peak resulted in a significant decrease in CCND2 expression, while in the CCND2 enhancer, the most pronounced effect, similar to that of the promoter peak, was conferred by targeting the proximal peak 4 and distal peak 1 accessibility regions (Fig. 5f). Notably, accessible peaks 1 and 4, but not others in between, are Polycomb-repressed in GCB and PC but active in naïve and memory B cells (Supplementary Fig. 6b). Thus, the relative importance of peaks 1 and 4 is also validated from a developmental perspective.

Having dissected the in cis regulatory mechanisms of CCND2 expression, we proceeded with the characterisation of trans factors involved in this process.

Previous work showed that MAF binds to CCND2 promoter in vitro26, providing some insight into how CCND2 is regulated in the MAF genetic subgroup. Importantly, our ChIP-seq analysis in MM.1S myeloma cells shows that MAF binds to the enhancer of CCND2 in vivo (Fig. 5d), thus consolidating its role as a critical regulator of CCND2 over-expression in MAF-translocated MM cells. This finding provides also insights into CCND2 regulation in the MMSET genetic subgroup, in which, both CCND2 and MAF are expressed at lower levels26(Fig. 5e and Supplementary Fig. 6e). Since chromatin accessibility signal is also lower in the MMSET than in the MAF subgroup (Fig. 5d and Supplementary Fig. 6e), together, these findings are consistent with the notion that the transcriptional activity of CCND2 enhancer in the MAF and MMSET subgroups is MAF dosage-dependent.

To identify other TF potentially regulating CCND2 enhancer activity in CCND2-expressing HD MM (which lack expression of MAF), we performed differential footprinting analysis in CCND2high vs CCND2low HD myeloma PC (Fig. 5g and Supplementary Fig. 6f and g). In addition to TF known to be implicated in MM (IRF4, PRDM1, FLI1)11,27, we also identified a potential regulatory role for TF previously not linked to MM (e.g., CXXC1, ZNF394, and IRF3).

Finally, we explored the activity of the CCND2 super-enhancer in B cell chronic lymphocytic leukemia (CLL), the most common blood cancer (Supplementary Fig. 7a and b). High CCND2 expression has been previously documented in CLL B cells residing in proliferation centers28, structures in secondary lymphoid organs where malignant B cells receive survival and proliferative signals28. CLL malignant B cells express significantly higher CCND2 than CCND1 levels (Supplementary Fig. 7b) and as in CCND2high myeloma PC, we found that the same CCND2 super-enhancer is active in CLL B cells (Supplementary Fig. 7a). This observation extends the importance of the CCND2 enhancer in a wider range of B cell lineage malignancies.

Discussion

Our analysis of complementary datatypes sheds insights into how changes in the regulatory genome, and in particular of the MM-specific enhanceome, shape distinct myelomagenic transcriptomes and downstream biological pathways.

Increased overall chromatin accessibility in all myeloma genetic subgroups with reference to normal BM PC, the myeloma PC normal counterparts, is our first fundamental observation. Unlike previous studies, here we addressed chromatin and gene expression-derived heterogeneity in MM. Surprisingly, we found that chromatin accessibility explained more of the variance than gene expression. The main axes of variation correlated surprisingly well with ND/MM and MIE categorizations, and defining subgroups as clusters, the combined model provided better separation than either data type alone. This was possible for two of the three IgH-translocated subgroups, and less so for HD MM, likely because the latter represents an inherently biologically heterogenous group. While the small number of samples in each subgroup argues for caution in interpreting this, we were able to validate these axes with independent data from cell lines not used in training the model. While using peaks found in the primary samples to assess cell lines is likely to overemphasise the similarity of cell lines and primary samples overall, it should not make specific cell line subtypes appear more like their primary counterparts. Larger future datasets will undoubtedly be helpful for increasing further our confidence in difference between the programs associated with different MIE. We found that marker genes and genesets that have been previously shown to be regulated by MIE, are also correlated with distal DAR of myeloma PC chromatin. This suggests that such chromatin changes are, to a large extent, directly or indirectly, dependent on MIE. Nevertheless, some of the identified putative enhancers are predicted to regulate genes downstream of the oncogenic RAS pathway which is activated by secondary gain-of-function N- or K-RAS somatic mutations in 40–50% of MM cases3,19, thus highlighting the ability of our approach to also reveal chromatin traces of secondary driver genetic events. Future studies, specifically designed to address the impact of pre-defined, high-frequency secondary genetic events (e.g., RASmutations and MYC structural variants), will reveal the extent to which such secondary events impact chromatin and its regulatory activity. Another notable feature, mostly restricted to HD and MMSET MM, is the proximity of putative enhancers to genes involved in neurogenesis with ectopic expression of the TF ISL2 in HD MM being a prime example of this. While the functional significance of this observation requires further investigation, we note that recent work demonstrated ectopic activation of neural genes in different cancers with such genes engaging in functional cross-talk with the central and peripheral nervous system29,30.

By taking advantage of a comprehensive set of chromatin marks in the whole spectrum of B lineage cells, we validated the presence of H3K27ac in 832 over-accessible, candidate enhancer regions in myeloma PC. The candidate enhancers of HGF and UCHL1 exemplify de novo formed enhancers, i.e., not present at any stage of late B lineage development, while the CCND2 enhancer provides an example of ‘re-commissioned’ super-enhancer, i.e., active in myeloma but not in normal PC. In the case of CCND2, ‘recommissioning’ entails activation in myeloma PC of Polycomb-imposed ‘poised’ transcriptional states in GCB cells and normal PC.

As well as discovery of critical cis regulatory elements, ATAC-seq also affords the opportunity for inferring TF binding to chromatin through footprinting and motif analysis. A notable finding is that the majority of trans factors that are predicted to display increased binding frequency to chromatin are already active in ND PC and their expression is not deregulated in MM. Such TF engage at higher frequency interactions with other TF than in ND PC and are more likely to self-regulate, properties that have been associated with enhanced regulatory potential31. These properties, as revealed through construction of TF regulatory gene networks, led to insights with functional and prognostic implications. In the case of MEF2C, the set of its predicted target TF, includes archetypical myeloma PC dependencies, i.e., IRF4, PRDM1, thus in part explaining the high degree of myeloma cell dependency on MEF2C. Similarly, myeloma cells appear to be addicted to CBFB and ZNF384 which have not been linked to myeloma PC biology thus far. In addition, the case of CXXC1, BPTF, MAZ, KLF13, CBFB and RFX5, six TF with unknown function in MM, also highlights the strength of our functional epigenomics approach. None of these TF is differentially expressed in MM, yet they display increased predicted regulatory potential across all myeloma subgroups, likely reflecting their high expression levels in myeloma cells compared to other cancers. Moreover, all six TF demonstrated prominent myeloma cell dependency and were found to strongly predict prognosis. For CXXC1, this is in accordance with its known function in the regulation of H3K4me3 as part of the COMPASS activating complex and through its binding to CpG islands and interaction with the histone methyl-transferases SETD1A/B32,33. Although little is known about the role of CXXC1 in cancer, it has been reported that its overexpression portends adverse prognosis in all stages of gastric cancer34. Therefore, the inferred TF networks provide a firm basis for future research which will further validate and define the role of TF identified herein in myeloma biology.

Extensive genetic heterogeneity and diversification in MM poses significant therapeutic challenges. One of the striking features of myeloma biology is the early observation that a dichotomous over-expression of the cell cycle regulators CCND1 and CCND2 overarches genetic diversification4. While in the majority of MM cases overexpression of CCND1 can be explained by somatic structural variants i.e., juxtaposition to IGH enhancer or chr11q25 gain31, such aberrancies do not account for overexpression of CCND2. Our identification and functional validation by complementary approaches of the distal cis and trans regulators of CCND2 expression addresses this gap in the biology of MM and allowed further insights into how the activity of this enhancer is regulated in different myeloma genetic subgroups. While expression of MAF is highest in the MAF subgroup as a result of its juxtaposition to the powerful IgH enhancer, in MMSET MM expression of MAF is lower and previously shown to be regulated by the TF FOS in response to activated MAPK pathway35. It is likely that these differences in MAF dosage account for the strongest chromatin accessibility signal at the MAF-bound CCND2 enhancer and a higher level of CCND2 expression in MAF-translocated MM. Interestingly, the same enhancer with the same developmental chromatin features regulates expression of CCND2 in CLL cells, a finding that tallies with the notion that CLL originates either from a naïve or memory rather GCB cell36.

Overall, our dissection of the MM regulatory genome offers insights and a resource for further biological exploration. Discovery and subsequent functional dissection of the critical CCND2 enhancer is a prime example of the power of our multi-layered integrative computational approach and affords opportunity for enhancer-based therapeutic approaches.


Article classification: Biological abstract
Share to:
Tel:+86-027-87610298
Tel:+86-027-87610297
Add:Room A11-329, 1st Floor, No.1, SBI Venture Street, Optics Valley, East Lake
New Technology Development Zone, Wuhan, China.
Certificate NO.:U18Q28010569R0S