
# Functional and epigenetic phenotypes of humans and mice with DNMT3A Overgrowth Syndrome

Issuing time: 2021-07-30 14:38

## Abstract

Germline pathogenic variants in DNMT3A were recently described in patients with overgrowth, obesity, behavioral, and learning difficulties (DNMT3A Overgrowth Syndrome/DOS). Somatic mutations in the DNMT3A gene are also the most common cause of clonal hematopoiesis, and can initiate acute myeloid leukemia (AML). Using whole genome bisulfite sequencing, we studied DNA methylation in peripheral blood cells of 11 DOS patients and found a focal, canonical hypomethylation phenotype, which is most severe with the dominant negative DNMT3AR882H mutation. A germline mouse model expressing the homologous Dnmt3aR878H mutation phenocopies most aspects of the human DOS syndrome, including the methylation phenotype and an increased incidence of spontaneous hematopoietic malignancies, suggesting that all aspects of this syndrome are caused by this mutation.

## Introduction

Overgrowth syndromes are a heterogeneous group of rare disorders that are characterized by a global or localized tissue hypertrophy; the genetic causes of these syndromes are emerging as more patients’ genomes are molecularly scrutinized. The first report of DNMT3A overgrowth syndrome (DOS, also called Tatton–Brown–Rahman Syndrome; TBRS; MIM 615879) described a syndrome of increased growth, defined as height and/or head circumference at least two standard deviations above the mean, associated with facial dysmorphism and intellectual disability occurring in patients with de novo heterozygous germline mutations in DNMT3A1. In the initial report, 13/152 patients with overgrowth had mutations in the DNMT3A gene, and all mutations were located within the functional domains of the protein. A subsequent publication assessing 55 patients2 and numerous case reports3,4,5,6,7,8,9,10,11 confirmed the original findings, and also described an additional array of mutations scattered throughout the functional domains of DNMT3A, including missense, frameshift, and nonsense mutations. For the DOS patients described to date, the most prevalent mutations alter amino acid position R882, which is also the most common site of DNMT3A mutations in patients with AML12. To date, 12/100 patients identified in the literature have mutations affecting R8822,3,4,5,6,13. More than 80% of individuals with DOS had overgrowth and a variable degree of intellectual disability. Obesity (weight of more than two standard deviations above the age and sex-adjusted mean) has been reported in two-thirds of patients. A characteristic facial appearance with heavy, horizontal, and low-set eyebrows with prominent and enlarged upper central incisors are frequent findings. Some patients have been reported to have joint hypermobility, hypotonia, kyphoscoliosis, and afebrile seizures2.

Despite the emerging phenotypic and molecular characterization of DOS, limited data are available on the epigenetic consequences of the broad array of DNMT3A mutations seen in these patients. The precise consequences of most of the DNMT3A mutations are not yet clear, nor are genotype-phenotype correlations for patients with this syndrome.

Somatic mutations in DNMT3A are the most common cause of clonal hematopoiesis14,15,16 and are the most common initiating mutation in AML patients with a normal karyotype17,18,19. Although mutations occur throughout the DNMT3A gene in AML patients, more than half occur at amino acid R882 (e.g., R882H, R882C, R882S, and R882P)12. DNMT3AR882 mutations encode a dominant-negative protein that is thought to act via two mechanisms: first, R882 mutant proteins fail to homodimerize, reducing the activity of the enzyme by ~80%, and second, R882 mutant proteins preferentially interact with the wild-type (WT) protein, creating a catalytic sink that traps the WT protein in inactive heterodimers (dominant-negative effect)20,21. Previous studies have shown that the reduced methyltransferase activity of cells heterozygous for an R882 mutation is associated with a focal, canonical hypomethylation phenotype at specific regions within the genomes of AML cells17,20,22. Further, we previously showed that the morphologically normal peripheral blood cells of one 9-year-old DOS patient with a heterozygous germline DNMT3AR882H mutation had a focal hypomethylation phenotype that was similar to that of AML cells with this mutation22. These data strongly suggest that the hypomethylation phenotype precedes (and may contribute to) the development of AML. To date, the global methylation phenotypes of DOS patients with non-R882 mutations (defined by WGBS) have not been described or compared to those of patients with R882 mutations, although array-based methylation studies have been reported5,23. Furthermore, the risk of developing AML with germline DNMT3A mutations (especially R882) is not yet clear4.

In this work, we investigate the DNA methylation phenotypes of peripheral blood cells from 11 DOS patients with DNMT3A mutations, including three at R882, and eight with alternative mutations (including one with a heterozygous deletion of the DNMT3A gene). We also describe the phenotype of mice with a germline Dnmt3aR878H mutation (the murine homologue of DNMT3AR882H), which includes many features of the human syndrome, including obesity, overgrowth, as well as behavioral and movement deficits. Whole-genome bisulfite and single-cell RNA-sequencing studies reveal overlapping methylation and gene-expression phenotypes between mouse and human hematopoietic cells. Finally, we show that Dnmt3aR878H mice spontaneously develop B-cell and myeloid malignancies with a long latency, suggesting that patients with DOS should be prospectively monitored for the development of hematologic cancers.

## Results

### DNMT3A overgrowth syndrome (DOS) patients have focal hypomethylation in nonleukemic hematopoietic cells

To determine whether DOS patients have DNA methylation changes in the genomes of their hematopoietic cells, we obtained peripheral blood samples from 11 children and adults with DOS. Clinical features are summarized in Table 1. Patients ranged in age from 20 months to 36 years when samples were collected, and all exhibited hallmarks of DOS, including overgrowth, intellectual and developmental delays, behavioral disorders (including autism, anxiety, and panic disorder), hypotonia, and distinct facial features. One patient (UPN 894912) was diagnosed with AML (French-American-British classification subtype M4) 4 years before sample banking and was in morphologic complete remission when the test sample from the peripheral blood was collected. Three patients (UPN 624400, 154605, and 894912) had R882 mutations, and the remaining eight had unique mutations occurring throughout the DNMT3A gene; six were missense mutations, one was a nonsense mutation (UPN 228211) (Fig. 1a), and one (UPN 518693) had a heterozygous 135 kb deletion that encompassed the entire DNMT3A gene (Chr2:25,228,254–25,363,376; hg38) (Supplementary Fig. 1a).

To assess the methylation of genomic DNA in the blood cells of these patients, we performed whole-genome bisulfite sequencing (WGBS), with a median ~18x coverage of the human genome. We compared the methylation levels of 11 DOS samples to the peripheral blood samples of 15 healthy donors aged 4–43 years (DNMT3A+/+; eight male, seven female). We subcategorized the DOS patients into two groups: DNMT3AR882 (n = 3; one male, two female) and DNMT3Anon-R882 (n = 8; five male, three female) for subsequent analysis, since each non-R882 patient had a unique mutation. There was no significant difference in age between control vs. DNMT3AR882 (p = 0.157) or control vs. DNMT3Anon-R882 (p = 0.824) groups (two-sample t test). Globally, the methylation levels across the genome were subtly decreased in DNMT3AR882 and DNMT3Anon-R882 patients (Fig. 1c). Utilizing established methods24, we identified differentially methylated regions (DMRs) between DNMT3A+/+ (n = 15) vs. DNMT3AR882 samples (n = 3; Supplementary Data 1), and DNMT3A+/+ vs. DNMT3Anon-R882 samples (n = 8; Supplementary Data 2). DMRs were defined as having $\ge$10 CpGs, a mean methylation difference of $\ge$0.2, a false discovery rate (FDR) of $\le$0.05, and a standard deviation (SD) of $\le$0.1 among the test samples. To ensure that the 2,209 DMRs called in the R882 samples were independent of age and/or sex effects on CpG methylation, we used linear regression to test for the effect of genotype on methylation level while adjusting for sex and log(age). All of the DMRs remained significant (at FDR < 0.05) in this regression analysis. At DMRs, the mean methylation values of DNMT3AR882 samples were lower than those of the DNMT3Anon-R882 samples at all annotated regions of the genome examined, including gene bodies and promoters (p ≤ 0.0001, Fig. 1b, c).
However, the average width of DMRs (in base pairs) in the DNMT3AR882 samples (662.7 ± 441.6) was not statistically different from that in the DNMT3Anon-R882 samples (614.5 ± 366.8, p = 0.0587; Fig. 1d). Interestingly, the fraction of gene bodies containing at least one DMR (1,426/18,951, 7.52%) was significantly greater than that observed in other annotated regions of the genome (0.06–2.4%; p ≤ 0.0001; Supplementary Fig. 1b).
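
The four DMR criteria stated above can be read as a simple conjunctive filter. The following is a minimal sketch under that reading, not the published pipeline; the `CandidateRegion` type, its field names, and the toy regions are hypothetical and serve only to illustrate the thresholds.

```python
from dataclasses import dataclass


@dataclass
class CandidateRegion:
    """Hypothetical summary of one candidate region (illustrative only)."""
    n_cpgs: int        # number of CpGs in the region
    mean_diff: float   # signed mean methylation difference (0-1 scale)
    fdr: float         # false discovery rate for the region
    test_sd: float     # standard deviation among the test samples


def is_dmr(region, min_cpgs=10, min_diff=0.2, max_fdr=0.05, max_sd=0.1):
    """Return True when a region meets all four thresholds quoted in the text."""
    return (
        region.n_cpgs >= min_cpgs
        and abs(region.mean_diff) >= min_diff
        and region.fdr <= max_fdr
        and region.test_sd <= max_sd
    )


# Toy candidates (made-up values, not study data).
candidates = [
    CandidateRegion(n_cpgs=14, mean_diff=-0.31, fdr=0.01, test_sd=0.05),  # passes
    CandidateRegion(n_cpgs=8, mean_diff=-0.40, fdr=0.01, test_sd=0.05),   # too few CpGs
    CandidateRegion(n_cpgs=20, mean_diff=-0.12, fdr=0.01, test_sd=0.05),  # difference too small
    CandidateRegion(n_cpgs=20, mean_diff=-0.25, fdr=0.01, test_sd=0.15),  # too variable
]
dmrs = [r for r in candidates if is_dmr(r)]
```

Only the first toy region satisfies every criterion; each of the others fails exactly one threshold.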

The relative severity of the methylation phenotype in the DNMT3AR882 samples compared to DNMT3Anon-R882 (as well as the canonicality of the phenotype among samples within a category) is shown in a heatmap representation of the 2,209 DMRs identified by comparing the healthy DNMT3A+/+ donors vs. DNMT3AR882 samples (Fig. 1e). Passively plotting the methylation values of the non-R882 samples revealed an attenuated methylation difference in the same regions in those samples. In contrast, the direct comparison of DNMT3Anon-R882 samples to healthy donors revealed only 332 DMRs (Fig. 1f). In both comparisons, all DMRs identified were hypomethylated, and 215 DMRs (65% of DNMT3Anon-R882 and 10% of DNMT3AR882 DMRs) overlapped between DNMT3AR882 and DNMT3Anon-R882 samples (where overlap required at least 1 bp of shared sequence). An example of a region within the HOXB cluster with DMRs (highlighted by black boxes) is shown in Fig. 1g, along with two striking DMRs in exons four and five of the RASIP1 gene, which are specific for the R882 samples (Fig. 1g). The DMRs present in these regions are more pronounced in the R882 mutant samples, suggesting that R882 mutations cause a greater reduction in DNA methyltransferase activity than the non-R882 mutations.
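
The overlap rule above (at least 1 bp of shared sequence) can be sketched as follows. This assumes regions are stored as `(chrom, start, end)` tuples with half-open, BED-style coordinates, which is an assumption of this example and not something stated in the text; the coordinates are toy values.

```python
def overlaps(a, b, min_bp=1):
    """True if two (chrom, start, end) half-open intervals share >= min_bp bases."""
    if a[0] != b[0]:
        return False
    shared = min(a[2], b[2]) - max(a[1], b[1])
    return shared >= min_bp


def count_overlapping(query, reference):
    """Count query regions that overlap at least one reference region."""
    return sum(any(overlaps(q, r) for r in reference) for q in query)


# Toy coordinates (illustrative only, not study data).
non_r882 = [("chr17", 100, 600), ("chr19", 2000, 2500), ("chr2", 50, 90)]
r882 = [("chr17", 599, 900), ("chr19", 2500, 3000), ("chr5", 10, 400)]
shared = count_overlapping(non_r882, r882)  # only the chr17 pair shares 1 bp

# The percentages quoted in the text follow from the reported counts.
pct_of_non_r882 = round(100 * 215 / 332, 1)   # 64.8
pct_of_r882 = round(100 * 215 / 2209, 1)      # 9.7
```

Under half-open coordinates, two regions that merely abut (such as the chr19 pair) share 0 bp and are not counted.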

### DNMT3AR882 alters the transcriptional signatures of hematopoietic cells

Utilizing the 10x Genomics Chromium platform25, we performed single-cell RNA-sequencing (scRNA-seq) on fresh peripheral blood samples from UPN 624400 (at age 14) and his unaffected male sibling (at age 17). The two samples showed a high level of similarity (Supplementary Fig. 2a). Graph-based clustering identified 12 clusters that were functionally categorized with ToppGene26 by inputting the top 50 identifying genes for each cluster (Fig. 2a). These populations were then validated by the assessment of well-established gene markers enriched in different subsets (Supplementary Fig. 2b). The cell type distribution was different between the DNMT3AR882H patient and his sibling control (p < 0.0001, chi-squared), and subtle differences in several individual populations (relative to total cells) were observed (Fig. 2b). The fraction of CD4+ naïve T-cells (cluster 5) was reduced (from 13.94% to 4.77%), and the NK-cell (cluster 8) and NKT (cluster 10) fractions were increased (from 3.09% to 15.88% and from 3.80% to 7.01%, respectively, Fig. 2b). The reduction in T-cells and increase in NK-cells was orthogonally validated by 15-color flow cytometry of the same samples (Supplementary Fig. 2c). We identified differentially expressed genes in each cluster in the DNMT3AR882H vs. sibling control samples with a p-value cutoff of 0.05 and an expression cutoff of log2 ratio ±1 (Supplementary Data 3). Out of 12 clusters, nine had differentially expressed genes (DEGs; Fig. 2c); however, the total numbers of DEGs were relatively small, ranging from five (cluster 11) to 72 (cluster 1). Interestingly, 50 genes were identified as dysregulated in two or more clusters, and their dysregulation was canonical across cell types, suggesting they were dysregulated by mechanisms not specific to lineage or cell type (Fig. 2d). For example, the RASIP1 gene (which contains two DMRs; Fig. 1g) was upregulated in 7/9 clusters with detectable DEGs (Fig. 2e).
This was due to an increase in the number of expressing cells, rather than an increase in the expression level of RASIP1 per cell, since the control sample had a single cell with detectable RASIP1 expression, while in the DNMT3AR882H sample, 15% of cells had detectable reads. In contrast, HOXB genes, also associated with DMRs, were downregulated across multiple cell types. HOXB2 was downregulated due to both a decrease in the number of expressing cells and a significantly decreased mean number of reads per cell (Fig. 2e). However, there was no direct correlation between DMR location and expression of the HOXB genes, suggesting there may be additional local or long-range regulatory elements within the cluster that influence the expression of HOXB genes27. Gene Ontology terms for DEGs by cluster identified numerous biological processes, including terms related to T-cell activation and proliferation (IL-4 and IFNG signaling), and vascular function (angiogenesis and endothelial cell migration) (Supplementary Fig. 2d).
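
The distinction drawn above, between more cells expressing a gene and more expression per cell, can be made concrete with two summary statistics. The sketch below is illustrative only: the function name and the per-cell counts are hypothetical, chosen so that the fraction of expressing cells changes while the per-cell level does not.

```python
def expression_summary(counts):
    """Return (fraction of cells with >= 1 read, mean reads among expressing cells)."""
    expressing = [c for c in counts if c > 0]
    fraction = len(expressing) / len(counts) if counts else 0.0
    mean_expr = sum(expressing) / len(expressing) if expressing else 0.0
    return fraction, mean_expr


# Toy per-cell read counts for one gene (made-up values, 100 cells each).
control = [0] * 99 + [2]       # a single expressing cell
mutant = [0] * 85 + [2] * 15   # 15% of cells expressing, same level per cell

control_fraction, control_mean = expression_summary(control)
mutant_fraction, mutant_mean = expression_summary(mutant)
```

In this toy case the pseudobulk signal rises 15-fold even though the mean among expressing cells is identical, which mirrors the pattern described for RASIP1.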

Across all 12 clusters, there were 242 differentially expressed genes in total; only 38 of these were within 10 kb of a DMR (Supplementary Fig. 2e). This suggests that for the majority of DEGs, there is no direct correlation to a local DMR. Using bulk RNA-sequencing, we confirmed the presence of DEGs in this DNMT3AR882H patient, as well as one additional DNMT3AR882H patient (aged 1.7 years), including validation of RASIP1 and HOXB2 dysregulation (Supplementary Fig. 2f, g). While bulk RNA-seq was able to identify additional DEGs (in part due to increased sequencing depth), the lack of a global correlation between differential gene expression and DMRs was recapitulated, even across specific annotated regions of the genome (Supplementary Fig. 2h and Supplementary Data 4).
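
A proximity test of the kind described above (a gene lying within 10 kb of a DMR) might be sketched as below. How distance was measured in the study is not specified here; this example assumes the gap between the gene interval and the DMR interval, with half-open coordinates, and uses hypothetical toy coordinates.

```python
def gap(a_start, a_end, b_start, b_end):
    """Distance in bp between two half-open intervals; 0 if they overlap or abut."""
    return max(0, max(a_start, b_start) - min(a_end, b_end))


def near_any_dmr(gene, dmrs, window=10_000):
    """True if the gene interval lies within `window` bp of any DMR on the same chromosome."""
    chrom, start, end = gene
    return any(c == chrom and gap(start, end, s, e) <= window for c, s, e in dmrs)


# Toy coordinates (illustrative only).
gene = ("chr1", 100_000, 120_000)
close_dmr = [("chr1", 129_000, 129_500)]   # 9 kb downstream of the gene
far_dmr = [("chr1", 140_000, 140_500)]     # 20 kb downstream of the gene

# Share of DEGs near a DMR, from the counts reported in the text.
pct_degs_near_dmr = round(100 * 38 / 242, 1)  # 15.7
```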

### Mice with germline Dnmt3aR878H/+ exhibit overgrowth and obesity

To achieve germline expression of the Dnmt3aR878H/+ allele from its endogenous locus, we utilized the model established by Guryanova et al.: a minigene combining exons 23 and 24 carrying the point mutation encoding R878H was inserted in the place of the endogenous Dnmt3a+/+ exon 23 downstream of a lox-stop-lox cassette28. Heterozygous Dnmt3aR878H/+ mice were crossed with B6.C-Tg(CMV-cre)1Cgn/J deleter mice. The floxed founder line mice with Dnmt3aR878H/+ and CMV-cre were then backcrossed to C57BL/6J mice to transmit the mutant allele through the germline, and to select for mice without CMV-Cre. Heterozygous floxed Dnmt3aR878H/+ mice were born at the expected ratios (for over 30 genotyped litters, the ratio of Dnmt3aR878H/+ to Dnmt3a+/+ was 1.105) and were viable, surviving to 2+ years. Expression of the R878H allele after floxing was virtually identical to that of the WT allele28. Dnmt3aR878H/R878H mice were severely runted at birth, and did not survive past 1 week of age. Female heterozygous Dnmt3aR878H/+ mice had a significant incidence of dystocia during pregnancy, and were therefore not utilized to produce experimental mice. All experimental mice were generated by crossing male heterozygous Dnmt3aR878H/+ germline mice (CMV-Cre negative) with C57BL/6J female mice. Germline Dnmt3aR878H/+ mice had normal weight and size at birth, and no obvious developmental defects.

Many DOS patients have increased height and obesity2. We therefore tracked weight for heterozygous mutant Dnmt3aR878H/+ mice (n = 120; 59 female, 61 male) and Dnmt3a+/+ littermate controls (n = 90; 48 female, 42 male) from 21 to 600 days of age. Before 100 days of age, there was no difference in weights of the two cohorts (p = 0.5781 for genotype*time). However, with aging, the weights of the R878H mice diverged from littermate controls, reaching a mean of 37.73 and 31.2 g at 380 days of age, respectively (p ≤ 0.0001 for genotype*time; Fig. 3a). Visual inspection revealed a clear size difference for the mutant mice at 1 year of age (Fig. 3b). CT scans performed on four pairs of 210-day old mice showed significantly longer femur lengths (but not humerus lengths), which represents a surrogate for increased height (Fig. 3c–e). We also measured craniofacial landmarks to determine whether Dnmt3aR878H/+ mice had macrocephaly, a phenotype often observed in DOS children1,2. Although statistical analyses of skull measurements did not show a difference in head circumference (length and width of the neurocranial bones), some features of the skulls of Dnmt3aR878H/+ mice were slightly larger than those of their WT littermates, including the mandible and localized structures in the cranium (Fig. 3f). The MRI measurements of body composition revealed a significant, age-dependent increase in body fat (Fig. 3g), but not lean mass (Fig. 3h), paralleling the obesity phenotype observed in DOS patients. Obesity was not associated with an increase in food consumption; Dnmt3aR878H/+ mice ate significantly less chow than age-matched WT controls (mean = 3.51 ± 0.07 vs. 4.33 ± 0.17 grams/mouse/day, p = 0.0013; Fig. 3i). In addition, metabolic cage analysis confirmed that Dnmt3aR878H/+ mice ate less, a finding that was associated with slightly (albeit not significantly) reduced movement and overall speed of movement (Supplementary Fig. 3a–c).
The metabolic cage analysis also revealed that Dnmt3aR878H/+ mice under 6 months of age did not have significant differences in O2 consumption or CO2 expiration (although both trended toward reduced levels; Supplementary Fig. 3d, e). In mice over 6 months of age, there was a decrease (although not significant) in O2 consumption and a significant decrease in CO2 production (p = 0.0313). Together, this suggests that there is a slight decrease in the respiratory quotient (RQ) in Dnmt3aR878H/+ mice (Supplementary Fig. 3f); although not significant, this change may indicate a subtle reduction in basal metabolic rate in these mice, contributing, at least in part, to obesity.
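
For reference, the respiratory quotient is the ratio of CO2 produced to O2 consumed, so a larger drop in CO2 production than in O2 consumption lowers it. A minimal sketch with hypothetical gas-exchange values (not measurements from this study):

```python
def respiratory_quotient(vco2, vo2):
    """RQ = CO2 produced / O2 consumed (both in the same units)."""
    if vo2 <= 0:
        raise ValueError("O2 consumption must be positive")
    return vco2 / vo2


# Hypothetical values in arbitrary but matching units (illustrative only).
rq_reference = respiratory_quotient(vco2=0.90, vo2=1.00)
rq_lower_co2 = respiratory_quotient(vco2=0.80, vo2=0.95)  # CO2 falls more than O2
```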

The obesity phenotype was exacerbated by feeding a high-fat diet, which did not cause a similar weight gain in Dnmt3a+/+ littermates (Fig. 3j). Finally, we fasted mice for 6 h, and then assayed plasma for common analytes relevant to increased adiposity. The plasma levels of leptin, triglycerides, cholesterol, glucose, and free fatty acids were not significantly different between Dnmt3aR878H/+ and Dnmt3a+/+ control mice (Supplementary Fig. 3g–o).

### Mice with germline Dnmt3aR878H/+ exhibit behavioral abnormalities

To assess behavioral and neurological phenotypes in the Dnmt3aR878H/+ mice, we carried out a battery of tests on littermate and age-matched cohorts of WT and R878H mice (all mice were 100–200 days old at the time of testing, Fig. 4 and Supplementary Fig. 4); the measurement goals are detailed in the “Methods” section. A number of results were significantly altered in Dnmt3aR878H/+ mice, including reduced total ambulations and rearing events in 1 h open field tests, reduced pole climb down and inverted screen (60 and 90°) climb-up times, increased time spent freezing in contextual, conditional and cued fear testing, differential foot-shock response, and reduced marble-burying (Fig. 4a–k, respectively). Outcomes that were not statistically different in R878H mice included the sensorimotor battery, such as balance (ledge test and platform test), walking initiation, grip strength (inverted screen test) and motor coordination (rotarod), as well as the intellectual disability, memory and anxiety tests, including the elevated plus maze, Morris water maze test, and probe trial (Supplementary Fig. 4). These data indicate that Dnmt3aR878H/+ mice display reductions in volitional movement with an absence of frank changes to exploratory behavior, suggestive of complex emotionality in these mice. Overall, the behavioral analyses demonstrated that germline Dnmt3aR878H/+ mice have a predominant phenotype of volitional movement deficits, accompanied by complex emotionality and subtle cognitive alterations.

### Germline Dnmt3aR878H/+ mice have a focal hypomethylation phenotype in nonleukemic hematopoietic cells

To understand the methylation phenotypes of bone marrow cells derived from unmanipulated, littermate-matched Dnmt3a+/+ (n = 10; 2–52 weeks of age, five male, five female) or Dnmt3aR878H/+ mice (n = 6; 8–38 weeks of age, three male, three female), we performed WGBS. There was no significant difference in age between control and Dnmt3aR878H/+ groups (p = 0.779; two-sample t test). We also included the bone marrow cells from unmanipulated germline Dnmt3a-/- mice (n = 4; 2 weeks of age, two male, two female) and Dnmt3a+/- mice (n = 4; 12–52 weeks of age, three male, one female) as comparators to calibrate the methylation phenotypes of Dnmt3aR878H/+ relative to deficiency or haploinsufficiency. The DNA methylation phenotypes of Dnmt3a-/- 29 and Dnmt3a+/- 30 models have been described in the literature31,32,33, and will not be further discussed here.

Our WGBS sequence coverage (median of 18x) assessed 98% of individual CpGs in the mouse genome. The methylation values had Pearson’s correlation r > 0.8 within samples for each genotype, highlighting the reproducibility between biological replicates, despite the range in ages for the mice used in the study. Globally, the mean methylation levels across all CpG sites in the Dnmt3aR878H/+ bone marrow samples were not significantly different from those of the Dnmt3a+/+ samples (Fig. 5a). In contrast, Dnmt3a-/- samples had significantly reduced mean global methylation across all CpGs. We next defined DMRs (using the same parameters used for the human samples) for Dnmt3a+/+ vs. Dnmt3aR878H/+ samples (#DMRs = 2,172, Supplementary Data 5), Dnmt3a+/+ vs. Dnmt3a-/- samples (#DMRs = 20,161, Supplementary Data 6), and Dnmt3a+/+ vs. Dnmt3a+/- samples (#DMRs = 8, Supplementary Data 7) and found that the Dnmt3aR878H/+ phenotype was intermediate between the Dnmt3a-/- and Dnmt3a+/+ mice (Fig. 5b). To ensure that the 2,172 DMRs called in the R878H samples were independent of age and sex effects on CpG methylation, we used linear regression to test for the effect of genotype on methylation level, while adjusting for sex and log(age). All of the DMRs remained significant (at FDR < 0.05) in this regression analysis. Comparison of the 2,172 Dnmt3aR878H/+ -specific DMRs with the Dnmt3a-/- samples revealed virtually complete overlap, but the degree of methylation reduction in the Dnmt3aR878H/+ samples was uniformly less severe. Within those 2,172 DMRs, Dnmt3aR878H/+ samples had a mean CpG methylation value of 50.10%, Dnmt3a-/- samples had a mean methylation value of 33.35%, and Dnmt3a+/- samples had a mean methylation value of 85.5% compared to that of Dnmt3a+/+ controls (defined as having 100% methylation within the same DMRs).
This trend was observed across all annotated regions of the genome and suggests that the Dnmt3aR878H protein behaves as a dominant-negative at all functional regions of the genome (Fig. 5a). In addition, the mean size of DMRs was 890.1 ± 552 bp for Dnmt3a-/- samples, 751.1 ± 474.2 bp for the Dnmt3aR878H/+ samples, and 610.3 ± 295.7 bp for the Dnmt3a+/- samples (Fig. 5c).
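
The percentages above are expressed relative to WT controls set to 100%. A minimal sketch of such a normalization, assuming it is a simple ratio of mean methylation within the DMRs (the exact procedure is not given in this section, and the input values below are hypothetical):

```python
def relative_methylation(sample_mean, wt_mean):
    """Express a sample's mean DMR methylation as a percentage of the WT mean."""
    if wt_mean <= 0:
        raise ValueError("WT mean methylation must be positive")
    return 100.0 * sample_mean / wt_mean


# Hypothetical mean methylation fractions within a shared set of DMRs.
wt_mean = 0.80
relative = {
    "wt": relative_methylation(0.80, wt_mean),           # 100% by definition
    "intermediate": relative_methylation(0.40, wt_mean),  # half of the WT level
}
```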

Of 22,026 gene bodies annotated in the mouse genome, 1375 (6.24%) contained at least one DMR (Supplementary Fig. 1c), a significant enrichment compared to other annotated regions in the genome (p ≤ 0.0001). In human peripheral blood, 7.52% of gene bodies contained a DMR (Supplementary Fig. 1b). The fractions of other annotated regions associated with DMRs were also similar between human and mouse samples, suggesting there is an overlap of the functional consequences of human R882 and mouse R878H mutations.

The focal and canonical nature of the 2,172 and 20,161 DMRs identified in Dnmt3aR878H/+ and Dnmt3a-/- bone marrow samples, respectively, is highlighted in Fig. 5d, e. The methylation pattern across all samples within a genotype was highly reproducible. From these heatmaps, it was also clear that essentially all DMRs in Dnmt3aR878H/+ samples are likewise hypomethylated in the Dnmt3a-/- samples. An intersection analysis revealed that 81.4% of Dnmt3aR878H/+ DMRs were also detected in Dnmt3a-/- samples; in contrast, the Dnmt3a+/- samples had very few DMRs and most closely resembled Dnmt3a+/+ samples (Fig. 5d). The dramatic methylation loss at DMRs in Dnmt3a-/- bone marrow was less severe in the Dnmt3aR878H/+ samples (Fig. 5e). Examples of the focality and canonicality of methylation changes at specific, homologous loci are demonstrated for the Hoxb cluster and the Rasip1 gene (Fig. 5f), where similarly located DMRs were detected in human R882 samples. To compare DMRs across species, we used the UCSC lift-over tool (http://genome.ucsc.edu) to translate the human R882 DMR coordinates to the mouse genome, and found that 101 human DMRs (4.6%) directly corresponded to a mouse R878H DMR and 1,713 (77.5%) were within 10 kb of a mouse DMR. Conversely, when we lifted the mouse coordinates over to the human genome, 95 (4.37%) mouse DMRs intersected directly with a human DMR, and 1,889 (87%) were within 10 kb of a human R882 DMR.

### Germline Dnmt3aR878H/+ mice have differentially expressed genes in specific hematopoietic cell types

We performed scRNA-seq on whole bone marrow cells from two pairs of Dnmt3a+/+ and Dnmt3aR878H/+ mice using the 10x Genomics Chromium platform30. One littermate-matched pair was evaluated at 1 month of age, and the other at 9 months, and the paired samples showed remarkable similarity by tSNE (Supplementary Fig. 5a). After processing aligned data with Partek Flow software, we performed graph-based clustering and ToppGene analysis to identify functional cell types based on the top 50 defining genes for each cluster (Fig. 6a, b). Inferred lineages for each cluster were verified using a k-nearest neighbor algorithm trained on the Haemopedia Database34 (Supplementary Fig. 5b), and all lineages were present in both genotypes (Fig. 6a, b, Supplementary Fig. 5a–c). There were subtle, yet significant, differences in pre-B-cells, monocytes, macrophages, and MPPs in the Dnmt3aR878H/+ bone marrow relative to controls (by Fisher’s exact tests for ratios) suggesting that the expression of germline R878H does not lead to large disturbances in normal hematopoietic populations. Graph-based clustering identified one population that was expanded in the 9-month Dnmt3aR878H/+ sample that did not fit into a single lineage assignment because it had both B-cell and myeloid gene-expression signatures (cluster 14; mixed lineage).

We next assessed how many differentially expressed genes overlapped between the 1- and 9-month samples (6047 and 5573 unique DEGs, respectively across 24 clusters), and found 2,775 DEGs common to both ages (Supplementary Data 8). A comparison of differentially expressed genes identified in scRNA-seq data from DNMT3AR882 human peripheral blood (242 total unique DEG across 12 clusters, Fig. 2) and mouse Dnmt3aR878H/+ bone marrow (8,845 total genes across 1- and 9-month timepoints) showed concordant dysregulation for 121 genes (50% of human genes) based on gene name alone. We show examples of conserved dysregulation in the Hoxb gene cluster (Hoxb4; Fig. 6c) and for Rasip1 (Fig. 6d).
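
Matching DEGs across species "based on gene name alone" can be sketched as a case-insensitive symbol comparison, since human symbols are conventionally upper case and mouse symbols capitalized. The sketch below additionally requires the same direction of change, which is one reading of "concordant"; the function and the placeholder gene symbols are hypothetical.

```python
def concordant_genes(human_degs, mouse_degs):
    """Return human symbols whose mouse namesake changes in the same direction.

    Both inputs map gene symbol -> direction ("up" or "down"); symbols are
    compared case-insensitively, with no orthology lookup.
    """
    mouse_by_key = {symbol.upper(): direction for symbol, direction in mouse_degs.items()}
    return sorted(
        symbol
        for symbol, direction in human_degs.items()
        if mouse_by_key.get(symbol.upper()) == direction
    )


# Placeholder symbols and directions (not genes or results from the study).
human = {"GENEA": "up", "GENEB": "down", "GENEC": "up", "GENED": "down"}
mouse = {"Genea": "up", "Geneb": "down", "Genec": "down"}
matched = concordant_genes(human, mouse)

# Share of human DEGs reported as concordant in the text.
pct_concordant = round(100 * 121 / 242, 1)  # 50.0
```

Name-based matching is simple but will miss orthologs whose symbols differ between species, so it is best read as a first approximation.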

To further validate the dysregulation of Hoxb gene expression in Dnmt3aR878H cells, we evaluated its expression in a previously published Dnmt3a-/- scRNA-seq dataset that also utilized a doxycycline-inducible, WT DNMT3A transgene to restore DNA methylation29. Inducing DNMT3A activity by feeding mice dox chow results in time-dependent remethylation at DMRs, with partial remethylation of the Hoxb locus occurring at approximately 24 weeks (Supplementary Fig. 6a)29. In Dnmt3a-/- hematopoietic cells from these mice, Hoxb4 expression was likewise decreased, predominantly in myeloid lineage cells (PMNs; Supplementary Fig. 6b, monocytes; Supplementary Fig. 6c, and GMPs; Supplementary Fig. 6d); restoring DNMT3A expression with doxycycline in vivo led to a time-dependent increase in mean Hoxb4 expression per cell, and an increase in the fraction of myeloid cells expressing Hoxb4 (Supplementary Fig. 6b–d). These data suggest that the DNA methylation status of the Hoxb cluster may directly influence the expression of Hoxb genes.

### Hematopoietic phenotypes and spontaneous leukemias in germline Dnmt3aR878H/+ mice

The sizes of cell populations defined in the scRNA-seq data suggest that steady-state hematopoiesis is relatively unperturbed in Dnmt3aR878H/+ mice. Using 21-color flow cytometry, we verified these data in a larger cohort of mice, assessing both peripheral blood and bone marrow cells from Dnmt3a+/+ and Dnmt3aR878H/+ mice ranging from 1 month to 2 years of age (Supplementary Fig. 7). There were very few perturbations in mature cell populations (B-, T-, erythroid and myeloid cells), stem populations, or progenitor populations, although age-related alterations were observed for both genotypes.

We next asked whether Dnmt3aR878H/+ derived hematopoietic cells would exhibit defects following the stress of a cytotoxic challenge with doxorubicin and cytarabine; however, there were no significant differences in count recovery for the Dnmt3aR878H/+ mice (Fig. 7a). Regardless, spontaneous, fatal hematopoietic malignancies arose in six out of 80 unmanipulated Dnmt3aR878H/+ mice after 1 year of age, vs. 0/65 WT mice (p = 0.0296; Fig. 7b). Flow cytometry and morphologic examination using the Bethesda criteria35,36 by a board-certified hematopathologist were used to classify two samples as MDS with maturation (mLeuk1 and mLeuk2), two as B-cell malignancies with extensive plasma cells in the bone marrow and spleen (mLeuk3 and mLeuk4), one as AML without differentiation (mLeuk5), and one as CMML-like (mLeuk6) (Fig. 7c and d).
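
The statistical test behind the incidence comparison above is not named in this section. Purely as an illustration of how a 2x2 incidence table (6/80 vs. 0/65) can be tested, the sketch below assumes a two-sided Fisher's exact test; it is not presented as the authors' method and should not be expected to reproduce the reported p-value exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Rows: mutant, WT; columns: malignancy, no malignancy (counts from the text).
p_value = fisher_exact_two_sided(6, 74, 0, 65)
```

A time-to-event analysis would use the ages at which malignancies arose, which a simple 2x2 table ignores.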

## Discussion

In this report, we describe DNA methylation alterations and their consequences in human patients, and a mouse model of the DNMT3A Overgrowth Syndrome (DOS). The peripheral blood of DOS patients had focal, canonical DNA hypomethylation that was more severe in patients with mutations that altered amino acid R882, but present in all patients with non-R882 mutations. In mice with a germline Dnmt3aR878H/+ mutation, we found similarities to human patients for methylation and gene-expression patterns, as well as similar growth and behavioral alterations, strongly suggesting that this mutation can cause the syndrome. This model supports the observation that patients with clonal hematopoiesis caused by mutations in DNMT3A can live for many years without clinical progression to AML14,15,16,37. Similar to patients with clonal hematopoiesis, some mice with Dnmt3aR878H/+ develop spontaneous hematologic malignancies after long latent periods.

All of the germline DNMT3A mutations examined in this study caused a focal methylation phenotype in the hematopoietic cells of DOS patients, suggesting that DOS-associated DNMT3A mutations must cause a loss of DNA methyltransferase activity. However, the hypomethylation phenotype of patients with R882 mutations was much more severe than that of non-R882 mutations, consistent with the observation that R882 mutations cause a dominant-negative effect in hematopoietic cells20,22. The patients with DNMT3AR882 mutations had a total of 2,209 DMRs identified in their peripheral blood cells, while patients with non-R882 mutations had one-tenth that number. By coincidence, one patient had a true haploinsufficient state (UPN 518693), due to a heterozygous ~135 kb deletion that encompassed the entire DNMT3A gene (Table 1, Supplementary Fig. 1). This patient’s methylation phenotype defined the consequences of simple haploinsufficiency due to gene deletion. Since all of the other non-R882 missense mutations had a similar magnitude of methylation loss (Fig. 1), we suggest that the non-R882 mutations in this study may all lead to functional deficits that mimic the inactivation of the affected allele. However, the mechanisms of inactivation are almost certainly multifarious. The canonically hypomethylated DMRs in the non-R882 patients affected many of the same regions as in the R882 samples; 215/332 DMRs (65%) were concordant, suggesting that these regions may be highly relevant for epigenetic changes that alter the state of hematopoietic stem cells, rendering them more susceptible to transformation.

As noted in similar studies22,29, the global correlation between DMRs and altered gene expression is relatively weak in the hematopoietic cells of the DOS patients. However, two example genes highlighted in this study (HOXB2 and RASIP1) had reproducible but opposite alterations in gene-expression patterns that were associated with local DMRs. Hoxb4 expression and methylation correlations have previously been reported in a Dnmt3a-deficient serial transplantation model32. Globally, however, no strong correlative rules could be established for the expression of genes in close proximity to DMRs in peripheral blood cells, even when restricted to specific functional regions in the genome (Supplementary Fig. 2). Similar findings were reported in a parallel yet unique model of complete Dnmt3a loss29. Clearly, the expression of genes near DMRs is strongly influenced by other factors, including genomic context (exon vs. intron, early vs. late exon, etc.), transcription factor and protein networks, chromatin modifications, and long-range DNA interactions, none of which have yet been adequately explored in an integrated analysis.

The germline Dnmt3aR878H/+ mouse model phenocopies many key aspects of the human syndrome. Age-related weight gain was associated with increased body fat content (with no change in lean mass), suggesting that these mice were indeed obese; obesity was exacerbated by a high-fat diet but was not associated with overeating, and its physiologic basis was not revealed by standard metabolic studies. Epigenetic variability is associated with human obesity and can be altered by inheritance, diet, aging, and the intrauterine environment38. Behavioral tests revealed that Dnmt3aR878H/+ mice had reduced exploratory behavior, volitional movement deficits, and complex changes in learning and memory or anxiety-like behaviors often observed in DOS patients. While movement, coordination, memory, learning, and anxiety are intertwined, untangling these relationships in future studies will be important for understanding specific contributions of individual mutations to phenotypes in DOS patients. This model shows more locomotive deficits and fewer autism-like behaviors than noted in haploinsufficient mice39, suggesting possible specific genotype:phenotype relationships for different Dnmt3a mutations. Furthermore, movement deficits, which may be a consequence of behaviors, may contribute to the weight phenotype (or vice versa), and therefore may have important implications for the development of obesity in DOS patients. Further studies of brain methylation phenotypes will be required to fully elucidate epigenome:phenotype correlations in these two models of DOS.

There are a number of epigenetic similarities between DOS patients and mice with germline Dnmt3aR878H/+ mutations. We identified 2172 DMRs in the bone marrow cells of Dnmt3aR878H/+ mice, while 2209 were identified in the blood cells of DNMT3AR882 DOS patients. Because the age and sex of the samples from both species were well matched, it is very unlikely that these DMRs were significantly affected by either covariate. Further, a post hoc sensitivity analysis of defined DMRs using linear regression to test for age and sex as covariates revealed that genotype remained the most important predictor of methylation phenotype. The frequency of DMRs in annotated genomic locations was similar in both species; for example, DMRs were found in 7.52% of the gene bodies of R882 mutant human blood cells, vs. 6.24% of gene bodies in mice with the R878H mutation (Supplementary Fig. 1). Although direct human to mouse synteny mapping outside of gene bodies is informatically challenging, we have found that many of the DMRs within specific genes are similar between humans and mice. For illustrative purposes, we have highlighted similarly located DMRs in the HOXB gene cluster and the RASIP1 gene in both species. Although we do not suggest that dysregulation of these specific genes is directly relevant for the DOS phenotype, it is interesting to note that HOXB cluster genes have been found to be downregulated in children with autism spectrum disorders40, and that the RASIP1 locus has been found to be hypomethylated in the blood of children with the Floating Harbor Syndrome41, which is caused by pathogenic variants in SRCAP, and associated with facial dysmorphology and learning disabilities (MIM #136140). The similarity between phenotypes, DMRs, and DEGs between species suggests that this mouse model will be useful for the study of the epigenetic phenotypes in various organs, and how they may relate to the complex phenotypes observed in patients with this disorder.
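
The idea behind the post hoc sensitivity analysis mentioned above can be sketched as an ordinary least-squares fit of mean DMR methylation on genotype, age, and sex. This is a minimal illustration and not the authors' pipeline: the samples are synthetic, the helper names `solve` and `ols` are hypothetical, and comparing predictors by "coefficient times observed range" is an assumed, simple way to rank them.

```python
from typing import List

def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a.x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(x: List[List[float]], y: List[float]) -> List[float]:
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic samples, invented for illustration: (genotype, age in years, sex)
# genotype: 0 = wild type, 1 = mutant; sex: 0 = female, 1 = male
samples = [(0, 5, 0), (0, 12, 1), (0, 20, 0), (0, 9, 1),
           (1, 6, 1), (1, 12, 0), (1, 18, 1), (1, 10, 0)]

# Mean methylation per sample, generated from a known linear rule so the fit can be checked
y = [0.80 - 0.30 * g + 0.002 * age + 0.01 * sex for g, age, sex in samples]
x = [[1.0, g, age, sex] for g, age, sex in samples]

intercept, b_genotype, b_age, b_sex = ols(x, y)

# Rank predictors by the methylation change they explain across their observed range
age_range = max(a for _, a, _ in samples) - min(a for _, a, _ in samples)
effects = {"genotype": abs(b_genotype), "age": abs(b_age) * age_range, "sex": abs(b_sex)}
largest = max(effects, key=effects.get)
print(largest)  # genotype
```

In this toy data set genotype dominates by construction; a real analysis would also report standard errors and would typically fit one model per DMR.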

Is DOS a leukemia susceptibility syndrome? Evidence from this and other studies suggests that it is2,4,6,13. Because many of the recently identified DOS patients are quite young, the lifetime risk of developing a hematologic malignancy is not yet clear. The incidence of childhood ALL is 35 per million, and childhood AML, 7 per million42; although somatic DNMT3A mutations are common in adult AML, they are extremely rare in children with AML43. The number of identified DOS patients worldwide is currently ~200 (J. Kiernan, personal communication). Several of these patients have developed hematologic malignancies at an early age. In the largest series of DOS patients reported to date2, one out of 55 had developed AML at age 12 (DNMT3AY735S). In this series of 11 patients, one had presented with AML at age 12 (DNMT3AR882C); an additional DOS patient with a DNMT3AR882C mutation developed AML at age 154. With the help of the TBRS Community (a group of concerned TBRS families that facilitates research for this syndrome; https://tbrsyndrome.org), we have identified several other children and young adults with DOS and a history of hematologic malignancies (Ferris, Smith, and Ley, manuscript in preparation), including a 20-year-old with AML (DNMT3AI310F), a 9-year-old with Pre-B-ALL (DNMT3AR882H), a 7-year-old with a secondary T-cell leukemia/lymphoma (DNMT3AR882H), a 34-year-old with Essential Thrombocytosis (DNMT3AR882H), and a 27-year-old with Hodgkin Lymphoma (DNMT3AY735C). Importantly, cooperating mutations in genes identified in DOS patients with AML are similar to those found in adults with DNMT3A mutant AML, including FLT3-ITD, NPMc, and PTPN11 (UPN 894912)4.
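
To put the background incidence figures above in context, the following rough sketch estimates how many leukemia cases would be expected by chance in a cohort of about 200 people. It rests on several assumptions that are not in the text: that the quoted rates are per million children per year, that childhood rates apply to the whole cohort, that cases follow a Poisson distribution, and that average follow-up is 15 years (a purely hypothetical value).

```python
import math

def expected_cases(n_people: int, rate_per_million: float, years: float) -> float:
    """Expected case count, assuming the rate is per million people per year."""
    return n_people * years * rate_per_million / 1_000_000

def prob_at_least(k: int, mean: float) -> float:
    """Poisson probability of observing k or more cases given the expected count."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k))
    return 1.0 - below

N_PATIENTS = 200        # approximate worldwide count quoted in the text
FOLLOW_UP_YEARS = 15    # hypothetical average follow-up, not from the source

for name, rate in (("ALL", 35), ("AML", 7)):
    mean = expected_cases(N_PATIENTS, rate, FOLLOW_UP_YEARS)
    print(name, f"expected={mean:.3f}", f"P(>=1)={prob_at_least(1, mean):.3f}")
```

Under these assumptions the expected number of chance cases is well below one for either disease, which is the kind of comparison that makes several observed cases in such a small population notable. It is an order-of-magnitude illustration only and not a risk estimate.
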
Mouse models have further established the risk of developing hematopoietic malignancies: Dnmt3a deficiency is associated with the development of myeloid, erythroid, B-, and T-cell malignancies32,44,45,46,47,48,49,50 and Dnmt3a haploinsufficiency in the germline is associated with the development of myeloid malignancies after a long latent period30. In this report, Dnmt3aR878H/+ germline mutations are associated with the development of spontaneous myeloid and B-cell neoplasms. Natural history studies in humans will be needed to define the relative risk of leukemia development for different mutation types (i.e., haploinsufficiency-like vs. R882 mutations), and the nature of the genetic and epigenetic events that provoke progression. Regardless, these data suggest that the children with this disorder will need to be prospectively monitored for the development of hematopoietic (and perhaps other) malignancies. Indeed, young DOS patients have been described with a pituitary macroadenoma51, a medulloblastoma52, and a dermatofibrosarcoma in one patient in this study (UPN 228211).

DEGs defined in preleukemic blood cells in this study may provide some clues regarding the altered epigenetic state that predisposes to transformation. For example, several HOXB cluster genes are downregulated in cells expressing DNMT3AR882H. However, in AML cells, the same genes are often persistently expressed at high levels53, and genes within this cluster are normally repressed by methylation of the DERARE element between the Hoxb4 and Hoxb5 genes54. HOXA and B cluster genes can also be dysregulated in AML by fusions with NUP9855,56,57,58,59,60,61, and dysregulation is seen in association with MLL translocations62 and NPM1 mutations63. These observations underlie the complexity of HOX gene dysregulation patterns in AML and underscore the large gaps in our knowledge regarding the events that govern the progression from preleukemia to overt transformation.

In summary, the data presented in this report suggest that many DNMT3A mutations associated with DOS decrease the methyltransferase activity of DNMT3A, causing a focal, canonical hypomethylation phenotype. This phenotype is more pronounced with mutations at R882, probably because of the dominant-negative effects of these mutations. The germline R878H mutation results in a mouse phenotype that recapitulates many features of human patients with the same mutation, strongly suggesting that this mutation is sufficient to cause the syndrome. The availability of this model will allow for more precise characterization of the epigenetic states that are responsible for the various features of this syndrome, and should provide a preclinical model for approaches designed to correct it.
