Publications

2023

Joung, Julia, Sai Ma, Tristan Tay, Kathryn R Geiger-Schuller, Paul C Kirchgatterer, Vanessa K Verdine, Baolin Guo, et al. 2023. “A Transcription Factor Atlas of Directed Differentiation”. Cell 186 (1): 209-229.e26. https://doi.org/10.1016/j.cell.2022.11.026.

Transcription factors (TFs) regulate gene programs, thereby controlling diverse cellular processes and cell states. To comprehensively understand TFs and the programs they control, we created a barcoded library of all annotated human TF splice isoforms (>3,500) and applied it to build a TF Atlas charting expression profiles of human embryonic stem cells (hESCs) overexpressing each TF at single-cell resolution. We mapped TF-induced expression profiles to reference cell types and validated candidate TFs for generation of diverse cell types, spanning all three germ layers and trophoblasts. Targeted screens with subsets of the library allowed us to create a tailored cellular disease model and integrate mRNA expression and chromatin accessibility data to identify downstream regulators. Finally, we characterized the effects of combinatorial TF overexpression by developing and validating a strategy for predicting combinations of TFs that produce target expression profiles matching reference cell types to accelerate cellular engineering efforts.

Mangiameli, Sarah M, Haiqi Chen, Andrew S Earl, Julie A Dobkin, Daniel Lesman, Jason D Buenrostro, and Fei Chen. 2023. “Photoselective Sequencing: Microscopically Guided Genomic Measurements With Subcellular Resolution”. Nature Methods 20 (5): 686-94. https://doi.org/10.1038/s41592-023-01845-8.

In biological systems, spatial organization and function are interconnected. Here we present photoselective sequencing, a new method for genomic and epigenomic profiling within morphologically distinct regions. Starting with an intact biological specimen, photoselective sequencing uses targeted illumination to selectively unblock a photocaged fragment library, restricting the sequencing-based readout to microscopically identified spatial regions. We validate photoselective sequencing by measuring the chromatin accessibility profiles of fluorescently labeled cell types within the mouse brain and comparing with published data. Furthermore, by combining photoselective sequencing with a computational strategy for decomposing bulk accessibility profiles, we find that the oligodendrocyte-lineage-cell population is relatively enriched for oligodendrocyte-progenitor cells in the cortex versus the corpus callosum. Finally, we leverage photoselective sequencing at the subcellular scale to identify features of chromatin that are correlated with positioning at the nuclear periphery. These results collectively demonstrate that photoselective sequencing is a flexible and generalizable platform for exploring the interplay of spatial structures with genomic and epigenomic properties.

Hu, Yan, Sai Ma, Vinay K Kartha, Fabiana M Duarte, Max Horlbeck, Ruochi Zhang, Rojesh Shrestha, et al. 2023. “Single-Cell Multi-Scale Footprinting Reveals the Modular Organization of DNA Regulatory Elements”. bioRxiv: The Preprint Server for Biology. https://doi.org/10.1101/2023.03.28.533945.

Cis-regulatory elements control gene expression and are dynamic in their structure, reflecting changes to the composition of diverse effector proteins over time [1-3]. Here we sought to connect the structural changes at cis-regulatory elements to alterations in cellular fate and function. To do this we developed PRINT, a computational method that uses deep learning to correct sequence bias in chromatin accessibility data and identifies multi-scale footprints of DNA-protein interactions. We find that multi-scale footprints enable more accurate inference of TF and nucleosome binding. Using PRINT with single-cell multi-omics, we discover widespread changes to the structure and function of candidate cis-regulatory elements (cCREs) across hematopoiesis, wherein nucleosomes slide, expose DNA for TF binding, and promote gene expression. Activity segmentation using the covariance across cell states identifies "sub-cCREs" as modular cCRE subunits of regulatory DNA. We apply this single-cell and PRINT approach to characterize the age-associated alterations to cCREs within hematopoietic stem cells (HSCs). Remarkably, we find a spectrum of aging alterations among HSCs corresponding to a global gain of sub-cCRE activity while preserving cCRE accessibility. Collectively, we reveal the functional importance of cCRE structure across cell states, highlighting changes to gene regulation at single-cell and single-base-pair resolution.

2022

Uzquiano, Ana, Amanda J Kedaigle, Martina Pigoni, Bruna Paulsen, Xian Adiconis, Kwanho Kim, Tyler Faits, et al. 2022. “Proper Acquisition of Cell Class Identity in Organoids Allows Definition of Fate Specification Programs of the Human Cerebral Cortex”. Cell 185 (20): 3770-3788.e27. https://doi.org/10.1016/j.cell.2022.09.010.

Realizing the full utility of brain organoids to study human development requires understanding whether organoids precisely replicate endogenous cellular and molecular events, particularly since acquisition of cell identity in organoids can be impaired by abnormal metabolic states. We present a comprehensive single-cell transcriptomic, epigenetic, and spatial atlas of human cortical organoid development, comprising over 610,000 cells, from generation of neural progenitors through production of differentiated neuronal and glial subtypes. We show that processes of cellular diversification correlate closely to endogenous ones, irrespective of metabolic state, empowering the use of this atlas to study human fate specification. We define longitudinal molecular trajectories of cortical cell types during organoid development, identify genes with predicted human-specific roles in lineage establishment, and uncover early transcriptional diversity of human callosal neurons. The findings validate this comprehensive atlas of human corticogenesis in vitro as a resource to prime investigation into the mechanisms of human cortical development.

Glinos, Dafni A, Garrett Garborcauskas, Paul Hoffman, Nava Ehsan, Lihua Jiang, Alper Gokden, Xiaoguang Dai, et al. 2022. “Transcriptome Variation in Human Tissues Revealed by Long-Read Sequencing”. Nature 608 (7922): 353-59. https://doi.org/10.1038/s41586-022-05035-y.

Regulation of transcript structure generates transcript diversity and plays an important role in human disease [1-7]. The advent of long-read sequencing technologies offers the opportunity to study the role of genetic variation in transcript structure [8-16]. In this Article, we present a large human long-read RNA-seq dataset using the Oxford Nanopore Technologies platform from 88 samples from Genotype-Tissue Expression (GTEx) tissues and cell lines, complementing the GTEx resource. We identified just over 70,000 novel transcripts for annotated genes, and validated the protein expression of 10% of novel transcripts. We developed a new computational package, LORALS, to analyse the genetic effects of rare and common variants on the transcriptome by allele-specific analysis of long reads. We characterized allele-specific expression and transcript structure events, providing new insights into the specific transcript alterations caused by common and rare genetic variants and highlighting the resolution gained from long-read data. We were able to perturb the transcript structure upon knockdown of PTBP1, an RNA binding protein that mediates splicing, thereby finding genetic regulatory effects that are modified by the cellular environment. Finally, we used this dataset to enhance variant interpretation and study rare variants leading to aberrant splicing patterns.

Gabriele, Michele, Hugo B Brandão, Simon Grosse-Holz, Asmita Jha, Gina M Dailey, Claudia Cattoglio, Tsung-Han S Hsieh, Leonid Mirny, Christoph Zechner, and Anders S Hansen. 2022. “Dynamics of CTCF- and Cohesin-Mediated Chromatin Looping Revealed by Live-Cell Imaging”. Science (New York, N.Y.) 376 (6592): 496-501. https://doi.org/10.1126/science.abn6583.

Animal genomes are folded into loops and topologically associating domains (TADs) by CTCF and loop-extruding cohesins, but the live dynamics of loop formation and stability remain unknown. Here, we directly visualized chromatin looping at the Fbn2 TAD in mouse embryonic stem cells using super-resolution live-cell imaging and quantified looping dynamics by Bayesian inference. Unexpectedly, the Fbn2 loop was both rare and dynamic, with a looped fraction of approximately 3 to 6.5% and a median loop lifetime of approximately 10 to 30 minutes. Our results establish that the Fbn2 TAD is highly dynamic, and about 92% of the time, cohesin-extruded loops exist within the TAD without bridging both CTCF boundaries. This suggests that single CTCF boundaries, rather than the fully CTCF-CTCF looped state, may be the primary regulators of functional interactions.

Vlaming, Hanneke, Claudia A Mimoso, Andrew R Field, Benjamin J E Martin, and Karen Adelman. 2022. “Screening Thousands of Transcribed Coding and Non-Coding Regions Reveals Sequence Determinants of RNA Polymerase II Elongation Potential”. Nature Structural & Molecular Biology 29 (6): 613-20. https://doi.org/10.1038/s41594-022-00785-9.

Precise regulation of transcription by RNA polymerase II (RNAPII) is critical for organismal growth and development. However, what determines whether an engaged RNAPII will synthesize a full-length transcript or terminate prematurely is poorly understood. Notably, RNAPII is far more susceptible to termination when transcribing non-coding RNAs than when synthesizing protein-coding mRNAs, but the mechanisms underlying this are unclear. To investigate the impact of transcribed sequence on elongation potential, we developed a method to screen the effects of thousands of INtegrated Sequences on Expression of RNA and Translation using high-throughput sequencing (INSERT-seq). We found that higher AT content in non-coding RNAs, rather than specific sequence motifs, drives RNAPII termination. Further, we demonstrate that 5' splice sites autonomously stimulate processive transcription, even in the absence of polyadenylation signals. Our results reveal a potent role for the transcribed sequence in dictating gene output and demonstrate the power of INSERT-seq toward illuminating these contributions.

Zhao, Tongtong, Zachary D Chiang, Julia W Morriss, Lindsay M LaFave, Evan M Murray, Isabella Del Priore, Kevin Meli, et al. 2022. “Spatial Genomics Enables Multi-Modal Study of Clonal Heterogeneity in Tissues”. Nature 601 (7891): 85-91. https://doi.org/10.1038/s41586-021-04217-4.

The state and behaviour of a cell can be influenced by both genetic and environmental factors. In particular, tumour progression is determined by underlying genetic aberrations [1-4] as well as the makeup of the tumour microenvironment [5,6]. Quantifying the contributions of these factors requires new technologies that can accurately measure the spatial location of genomic sequence together with phenotypic readouts. Here we developed slide-DNA-seq, a method for capturing spatially resolved DNA sequences from intact tissue sections. We demonstrate that this method accurately preserves local tumour architecture and enables the de novo discovery of distinct tumour clones and their copy number alterations. We then apply slide-DNA-seq to a mouse model of metastasis and a primary human cancer, revealing that clonal populations are confined to distinct spatial regions. Moreover, through integration with spatial transcriptomics, we uncover distinct sets of genes that are associated with clone-specific genetic aberrations, the local tumour microenvironment, or both. Together, this multi-modal spatial genomics approach provides a versatile platform for quantifying how cell-intrinsic and cell-extrinsic factors contribute to gene expression, protein abundance and other cellular phenotypes.

Johnstone, Sarah E, Vadim N Gladyshev, Martin J Aryee, and Bradley E Bernstein. 2022. “Epigenetic Clocks, Aging, and Cancer”. Science (New York, N.Y.) 378 (6626): 1276-77. https://doi.org/10.1126/science.abn4009.

Global methylation changes in aging cells affect cancer risk and tissue homeostasis.

Miller, Tyler E, Caleb A Lareau, Julia A Verga, Erica A K DePasquale, Vincent Liu, Daniel Ssozi, Katalin Sandor, et al. 2022. “Mitochondrial Variant Enrichment from High-Throughput Single-Cell RNA Sequencing Resolves Clonal Populations”. Nature Biotechnology 40 (7): 1030-34. https://doi.org/10.1038/s41587-022-01210-8.

The combination of single-cell transcriptomics with mitochondrial DNA variant detection can be used to establish lineage relationships in primary human cells, but current methods are not scalable to interrogate complex tissues. Here, we combine common 3' single-cell RNA-sequencing protocols with mitochondrial transcriptome enrichment to increase coverage by more than 50-fold, enabling high-confidence mutation detection. The method successfully identifies skewed immune-cell expansions in primary human clonal hematopoiesis.