Enhancer-gene communication is dependent on topologically associating domains (TADs) and boundaries enforced by the CCCTC-binding factor (CTCF) insulator, but the underlying structures and mechanisms remain controversial. Here, we investigate a boundary that typically insulates fibroblast growth factor (FGF) oncogenes but is disrupted by DNA hypermethylation in gastrointestinal stromal tumors (GISTs). The boundary contains an array of CTCF sites that enforce adjacent TADs, one containing FGF genes and the other containing ANO1 and its putative enhancers, which are specifically active in GIST and its likely cell of origin. We show that coordinate disruption of four CTCF motifs in the boundary fuses the adjacent TADs, allows the ANO1 enhancer to contact FGF3, and causes its robust induction. High-resolution micro-C maps reveal specific contact between transcription initiation sites in the ANO1 enhancer and FGF3 promoter that quantitatively scales with FGF3 induction such that modest changes in contact frequency result in strong changes in expression, consistent with a causal relationship.
Publications
2024
Transcriptional coregulators and transcription factors (TFs) contain intrinsically disordered regions (IDRs) that are critical for their association and function in gene regulation. More recently, IDRs have been shown to promote multivalent protein-protein interactions between coregulators and TFs to drive their association into condensates. By contrast, here we demonstrate how the IDR of the corepressor LSD1 excludes TF association, acting as a dynamic conformational switch that tunes repression of active cis-regulatory elements. Hydrogen-deuterium exchange shows that the LSD1 IDR interconverts between transient open and closed conformational states, the latter of which inhibits partitioning of the protein's structured domains with TF condensates. This autoinhibitory switch controls leukemic differentiation by modulating repression of active cis-regulatory elements bound by LSD1 and master hematopoietic TFs. Together, these studies unveil alternative mechanisms by which disordered regions and their dynamic crosstalk with structured regions can shape coregulator-TF interactions to control cis-regulatory landscapes and cell fate.
2023
Mammalian SWI/SNF chromatin remodeling complexes move and evict nucleosomes at gene promoters and enhancers to modulate DNA access. Although SWI/SNF subunits are commonly mutated in disease, therapeutic options are limited by our inability to predict SWI/SNF gene targets and conflicting studies on functional significance. Here, we leverage a fast-acting inhibitor of SWI/SNF remodeling to elucidate direct targets and effects of SWI/SNF. Blocking SWI/SNF activity causes a rapid and global loss of chromatin accessibility and transcription. Whereas repression persists at most enhancers, we uncover a compensatory role for the EP400/TIP60 remodeler, which reestablishes accessibility at most promoters during prolonged loss of SWI/SNF. Indeed, we observe synthetic lethality between EP400 and SWI/SNF in cancer cell lines and human cancer patient data. Our data define a set of molecular genomic features that accurately predict gene sensitivity to SWI/SNF inhibition in diverse cancer cell lines, thereby improving the therapeutic potential of SWI/SNF inhibitors.
Epigenetic lesions that disrupt regulatory elements represent potential cancer drivers. However, we lack experimental models for validating their tumorigenic impact. Here, we model aberrations arising in isocitrate dehydrogenase-mutant gliomas, which exhibit DNA hypermethylation. We focus on a CTCF insulator near the PDGFRA oncogene that is recurrently disrupted by methylation in these tumors. We demonstrate that disruption of the syntenic insulator in mouse oligodendrocyte progenitor cells (OPCs) allows an OPC-specific enhancer to contact and induce Pdgfra, thereby increasing proliferation. We show that a second lesion, methylation-dependent silencing of the Cdkn2a tumor suppressor, cooperates with insulator loss in OPCs. Coordinate inactivation of the Pdgfra insulator and Cdkn2a drives gliomagenesis in vivo. Despite locus synteny, the insulator is CpG-rich only in humans, a feature that may confer human glioma risk but complicates mouse modeling. Our study demonstrates the capacity of recurrent epigenetic lesions to drive OPC proliferation in vitro and gliomagenesis in vivo.
Although vast numbers of putative gene regulatory elements have been cataloged, the sequence motifs and individual bases that underlie their functions remain largely unknown. Here, we combine epigenetic perturbations, base editing, and deep learning to dissect regulatory sequences within the exemplar immune locus encoding CD69. We converge on a ∼170 base interval within a differentially accessible and acetylated enhancer critical for CD69 induction in stimulated Jurkat T cells. Individual C-to-T base edits within the interval markedly reduce element accessibility and acetylation, with corresponding reduction of CD69 expression. The most potent base edits may be explained by their effect on regulatory interactions between the transcriptional activators GATA3 and TAL1 and the repressor BHLHE40. Systematic analysis suggests that the interplay between GATA3 and BHLHE40 plays a general role in rapid T cell transcriptional responses. Our study provides a framework for parsing regulatory elements in their endogenous chromatin contexts and identifying operative artificial variants.
Systematic evaluation of the impact of genetic variants is critical for the study and treatment of human physiology and disease. While specific mutations can be introduced by genome engineering, we still lack scalable approaches that are applicable to the important setting of primary cells, such as blood and immune cells. Here, we describe the development of massively parallel base-editing screens in human hematopoietic stem and progenitor cells. Such approaches enable functional screens for variant effects across any hematopoietic differentiation state. Moreover, they allow for rich phenotyping through single-cell RNA sequencing readouts and separately for characterization of editing outcomes through pooled single-cell genotyping. We efficiently design improved leukemia immunotherapy approaches, comprehensively identify non-coding variants modulating fetal hemoglobin expression, define mechanisms regulating hematopoietic differentiation, and probe the pathogenicity of uncharacterized disease-associated variants. These strategies will advance effective and high-throughput variant-to-function mapping in human hematopoiesis to identify the causes of diverse diseases.
Although enhancers are central regulators of mammalian gene expression, the mechanisms underlying enhancer-promoter (E-P) interactions remain unclear. Chromosome conformation capture (3C) methods effectively capture large-scale three-dimensional (3D) genome structure but struggle to achieve the depth necessary to resolve fine-scale E-P interactions. Here, we develop Region Capture Micro-C (RCMC) by combining micrococcal nuclease (MNase)-based 3C with a tiling region-capture approach and generate the deepest 3D genome maps reported with only modest sequencing. By applying RCMC in mouse embryonic stem cells and reaching the genome-wide equivalent of 317 billion unique contacts, RCMC reveals previously unresolvable patterns of highly nested and focal 3D interactions, which we term microcompartments. Microcompartments frequently connect enhancers and promoters, and although loss of loop extrusion and inhibition of transcription disrupt some microcompartments, most are largely unaffected. We therefore propose that many E-P interactions form through a compartmentalization mechanism, which may partially explain why acute cohesin depletion only modestly affects global gene expression.
CRISPR gene editing holds great promise to modify DNA sequences in somatic cells to treat disease. However, standard computational and biochemical methods to predict off-target potential focus on reference genomes. We developed an efficient tool called CRISPRme that considers single-nucleotide polymorphism (SNP) and indel genetic variants to nominate and prioritize off-target sites. We tested the software with a BCL11A enhancer-targeting guide RNA (gRNA) showing promise in clinical trials for sickle cell disease and β-thalassemia and found that the top candidate off-target is produced by an allele common in African-ancestry populations (MAF 4.5%) that introduces a protospacer adjacent motif (PAM) sequence. We validated that SpCas9 generates strictly allele-specific indels and pericentric inversions in CD34+ hematopoietic stem and progenitor cells (HSPCs), although high-fidelity Cas9 mitigates this off-target. This report illustrates how genetic variants should be considered as modifiers of gene editing outcomes. We expect that variant-aware off-target assessment will become integral to therapeutic genome editing evaluation and provide a powerful approach for comprehensive off-target nomination.
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.
DNA methylation is critical for regulating gene expression, necessitating its accurate placement by enzymes such as the DNA methyltransferase DNMT3A. Dysregulation of this process is known to cause aberrant development and oncogenesis, yet how DNMT3A is regulated holistically by its three domains remains challenging to study. Here, we integrate base editing with a DNA methylation reporter to perform in situ mutational scanning of DNMT3A in cells. We identify mutations throughout the protein that perturb function, including ones at an interdomain interface that block allosteric activation. Unexpectedly, we also find mutations in the PWWP domain, a histone reader, that modulate enzyme activity despite preserving histone recognition and protein stability. These effects arise from altered PWWP domain DNA affinity, which we show is a noncanonical function required for full activity in cells. Our findings highlight mechanisms of interdomain crosstalk and demonstrate a generalizable strategy to probe sequence-activity relationships of nonessential chromatin regulators.