Publications

2025

Inamo, Jun, Joshua Keegan, Alec Griffith, Tusharkanti Ghosh, Alice Horisberger, Kaitlyn Howard, John F Pulford, et al. 2025. “Deep Immunophenotyping Reveals Circulating Activated Lymphocytes in Individuals at Risk for Rheumatoid Arthritis.” The Journal of Clinical Investigation 135 (6). https://doi.org/10.1172/JCI185217.

Rheumatoid arthritis (RA) is a systemic autoimmune disease currently with no universally highly effective prevention strategies. Identifying pathogenic immune phenotypes in at-risk populations prior to clinical onset is crucial to establishing effective prevention strategies. Here, we applied multimodal single-cell technologies (mass cytometry and CITE-Seq) to characterize the immunophenotypes in blood from at-risk individuals (ARIs) identified through the presence of serum antibodies against citrullinated protein antigens (ACPAs) and/or first-degree relative (FDR) status, as compared with patients with established RA and people in a healthy control group. We identified significant cell expansions in ARIs compared with controls, including CCR2+CD4+ T cells, T peripheral helper (Tph) cells, type 1 T helper cells, and CXCR5+CD8+ T cells. We also found that CD15+ classical monocytes were specifically expanded in ACPA-negative FDRs, and an activated PAX5lo naive B cell population was expanded in ACPA-positive FDRs. Further, we uncovered the molecular phenotype of the CCR2+CD4+ T cells, expressing high levels of Th17- and Th22-related signature transcripts including CCR6, IL23R, KLRB1, CD96, and IL22. Our integrated study provides a promising approach to identify targets to improve prevention strategy development for RA.

Xu, Ziqi, Arya Massarat, Laurie Rumker, Melissa Gymrek, Soumya Raychaudhuri, Wei Zhou, and Tiffany Amariuta. 2025. “Estimating the Cis-Heritability of Gene Expression Using Single Cell Expression Profiles Controls False Positive Rate of eGene Detection.” bioRxiv: The Preprint Server for Biology. https://doi.org/10.1101/2025.02.24.639892.

For gene expression traits, cis-genetic heritability can quantify the strength of genetic regulation in particular cell types, elucidating the cell-type-specificity of disease variants and genes. To estimate gene expression heritability, standard models require a single gene expression value per individual, forcing data from single cell RNA-sequencing (scRNA-seq) experiments to be "pseudobulked". Here, we show that applying standard heritability models to pseudobulk data overestimates gene expression heritability and produces inflated false positive rates for detecting cis-heritable genes. Therefore, we introduce a new method called scGeneHE (single cell Gene expression Heritability Estimation), a Poisson mixed-effects model that quantifies the cis-genetic component of gene expression using individual cellular profiles. In simulations, scGeneHE has a consistently well-calibrated false positive rate for eGene detection and unbiasedly estimates cis-heritability at many parameter settings. We applied scGeneHE to scRNA-seq data from 969 individuals, 11 immune cell types, and 822,552 cells from the OneK1K cohort to infer cell-type-specificity of genetic regulation at risk genes for immune-mediated diseases and trace the fluctuation of cis-heritability across cellular populations of varying resolution. In summary, we developed a new statistical method that resolves the analytical challenge of estimating gene expression cis-heritability from native scRNA-seq data.

Millard, Nghia, Jonathan H Chen, Mukta G Palshikar, Karin Pelka, Maxwell Spurrell, Colles Price, Jiang He, Nir Hacohen, Soumya Raychaudhuri, and Ilya Korsunsky. 2025. “Batch Correcting Single-Cell Spatial Transcriptomics Count Data With Crescendo Improves Visualization and Detection of Spatial Gene Patterns.” Genome Biology 26 (1): 36. https://doi.org/10.1186/s13059-025-03479-9.

Spatial transcriptomics facilitates gene expression analysis of cells in their spatial anatomical context. Batch effects hinder visualization of gene spatial patterns across samples. We present the Crescendo algorithm to correct for batch effects at the gene expression level and enable accurate visualization of gene expression patterns across multiple samples. We show Crescendo's utility and scalability across three datasets ranging from 170,000 to 7 million single cells across spatial and single-cell RNA sequencing technologies. By correcting for batch effects, Crescendo enhances spatial transcriptomics analyses to detect gene colocalization and ligand-receptor interactions and enables cross-technology information transfer.

Reshef, Yakir, Lakshay Sood, Michelle Curtis, Laurie Rumker, Daniel J Stein, Mukta G Palshikar, Saba Nayar, et al. 2025. “Powerful and Accurate Case-Control Analysis of Spatial Molecular Data With Deep Learning-Defined Tissue Microniches.” bioRxiv: The Preprint Server for Biology. https://doi.org/10.1101/2025.02.07.637149.

As spatial molecular data grow in scope and resolution, there is a pressing need to identify key spatial structures associated with disease. Current approaches often rely on hand-crafted features such as local abundances of manually annotated, discrete cell types, which may overlook important signals. Here we introduce variational inference-based microniche analysis (VIMA), a method that combines deep learning with principled statistics to discover associated spatial features with greater flexibility and precision. VIMA uses a variational autoencoder to extract numerical "fingerprints" from small tissue patches that capture their biological content. It uses these fingerprints to define a large number of "microniches" - small, potentially overlapping groups of tissue patches with highly similar biology that span multiple samples. It then uses rigorous statistics to identify microniches whose abundance correlates with case-control status. We show in simulations that VIMA is well calibrated and more powerful and accurate than other approaches. We then apply VIMA to a 140-gene spatial transcriptomics dataset in Alzheimer's dementia, a 54-marker CO-Detection by indEXing (CODEX) dataset in ulcerative colitis (UC), and a 7-marker immunohistochemistry dataset in rheumatoid arthritis (RA), in each case recapitulating known biology and identifying novel spatial features of disease.

Donado, Carlos A, Erin Theisen, Fan Zhang, Aparna Nathan, Madison L Fairfield, Karishma Vijay Rupani, Dominique Jones, et al. 2025. “Granzyme K Activates the Entire Complement Cascade.” Nature. https://doi.org/10.1038/s41586-025-08713-9.

Granzymes are a family of serine proteases mainly expressed by CD8+ T cells, natural killer cells, and innate-like lymphocytes [1]. Although their primary function is thought to be the induction of cell death in virally infected and tumor cells, accumulating evidence indicates certain granzymes can elicit inflammation by acting on extracellular substrates [1]. Recently, we found that the majority of tissue CD8+ T cells in rheumatoid arthritis (RA) synovium and in inflamed organs across other diseases express granzyme K (GZMK) [2], a tryptase-like protease with poorly defined function. Here, we show that GZMK can activate the complement cascade by cleaving C2 and C4. The nascent C4b and C2b fragments form a C3 convertase that cleaves C3, enabling assembly of a C5 convertase that cleaves C5. The resulting convertases generate all the effector molecules of the complement cascade: the anaphylatoxins C3a and C5a, the opsonins C4b and C3b, and the membrane attack complex. In RA synovium, GZMK is enriched in regions with abundant complement activation, and fibroblasts are the major producers of complement proteins that serve as substrates for GZMK-mediated complement activation. Further, Gzmk-deficient mice have less severe arthritis and dermatitis with concomitant decreases in complement activation. Our findings describe the discovery of a previously unidentified mechanism of complement activation that is entirely driven by lymphocyte-derived GZMK. Given the widespread abundance of GZMK-expressing T cells in tissues in chronic inflammatory diseases, GZMK-mediated complement activation is likely to be an important contributor to tissue inflammation in multiple disease contexts.

Mueller, Alisa A, Angela E Zou, Lucy-Jayne Marsh, Samuel Kemble, Saba Nayar, Gerald F M Watts, Cassandra L Murphy, et al. 2025. “Wnt Signaling Drives Stromal Inflammation in Inflammatory Arthritis.” bioRxiv: The Preprint Server for Biology. https://doi.org/10.1101/2025.01.06.631510.

The concept that fibroblasts are critical mediators of inflammation is an emerging paradigm. In rheumatoid arthritis (RA), they are the main producers of IL-6 as well as a host of other cytokines and chemokines. Their pathologic activation also directly causes cartilage and bone degradation. Yet, therapeutic agents specifically targeting fibroblasts are not available. Here, we find that Wnt receptors and modulators are predominantly expressed in stromal populations in the synovium. Importantly, non-canonical Wnt activation induces robust inflammatory gene expression including an abundance of cytokines and chemokines in synovial fibroblasts in vitro. Strikingly, the addition of Wnt ligands or inhibition of Wnt secretion exacerbates or reduces arthritis severity, respectively, in vivo in a murine model of inflammatory arthritis. These observations are relevant in human disease, as Wnt activation signatures are enhanced in fibroblasts derived from inflamed RA synovial tissue as well as fibroblasts across other inflammatory diseases. Together, these findings implicate Wnt signaling as a major driver of fibroblast-mediated inflammation and joint pathology. They further suggest that targeting the Wnt pathway is a therapeutically relevant approach to rheumatoid arthritis, particularly in patients who do not respond to conventional treatments and who often express fibroblast-predominant synovial phenotypes.

2024

Lagattuta, Kaitlyn A, Ayano C Kohlgruber, Nouran S Abdelfattah, Aparna Nathan, Laurie Rumker, Michael E Birnbaum, Stephen J Elledge, and Soumya Raychaudhuri. 2024. “The T cell Receptor Sequence Influences the Likelihood of T cell Memory Formation.” Cell Reports 44 (1): 115098. https://doi.org/10.1016/j.celrep.2024.115098.

The amino acid sequence of the T cell receptor (TCR) varies between T cells of an individual's immune system. Particular TCR residues nearly guarantee mucosal-associated invariant T (MAIT) and natural killer T (NKT) cell transcriptional fates. To define how the TCR sequence affects T cell fates, we analyze the paired αβTCR sequence and transcriptome of 961,531 single cells. We find that hydrophobic complementarity-determining region (CDR)3 residues promote regulatory T cell fates in both the CD8 and CD4 lineages. Most strikingly, we find a set of TCR sequence features that promote the T cell transition from naive to memory. We quantify the extent of these features through our TCR scoring function "TCR-mem." Using TCR transduction experiments, we demonstrate that increased TCR-mem promotes T cell activation, even among T cells that recognize the same antigen. Our results reveal a common set of TCR sequence features that enable T cell activation and immunological memory.

Tegtmeyer, Matthew, Jatin Arora, Samira Asgari, Beth A Cimini, Ajay Nadig, Emily Peirent, Dhara Liyanage, et al. 2024. “High-Dimensional Phenotyping to Define the Genetic Basis of Cellular Morphology.” Nature Communications 15 (1): 347. https://doi.org/10.1038/s41467-023-44045-w.

The morphology of cells is dynamic and mediated by genetic and environmental factors. Characterizing how genetic variation impacts cell morphology can provide an important link between disease association and cellular function. Here, we combine genomic sequencing and high-content imaging approaches on iPSCs from 297 unique donors to investigate the relationship between genetic variants and cellular morphology to map what we term cell morphological quantitative trait loci (cmQTLs). We identify novel associations between rare protein altering variants in WASF2, TSPAN15, and PRLR with several morphological traits related to cell shape, nucleic granularity, and mitochondrial distribution. Knockdown of these genes by CRISPRi confirms their role in cell morphology. Analysis of common variants yields one significant association and nominates over 300 variants with suggestive evidence (P < 10⁻⁶) of association with one or more morphology traits. We then use these data to make predictions about sample size requirements for increasing discovery in cellular genetic studies. We conclude that, similar to molecular phenotypes, morphological profiling can yield insight about the function of genes and variants.

Sakaue, Saori, Kathryn Weinand, Shakson Isaac, Kushal K Dey, Karthik Jagadeesh, Masahiro Kanai, Gerald F M Watts, et al. 2024. “Tissue-Specific Enhancer-Gene Maps from Multimodal Single-Cell Data Identify Causal Disease Alleles.” Nature Genetics 56 (4): 615-26. https://doi.org/10.1038/s41588-024-01682-1.

Translating genome-wide association study (GWAS) loci into causal variants and genes requires accurate cell-type-specific enhancer-gene maps from disease-relevant tissues. Building enhancer-gene maps is essential but challenging with current experimental methods in primary human tissues. Here we developed a nonparametric statistical method, SCENT (single-cell enhancer target gene mapping), that models association between enhancer chromatin accessibility and gene expression in single-cell or nucleus multimodal RNA sequencing and ATAC sequencing data. We applied SCENT to 9 multimodal datasets including >120,000 single cells or nuclei and created 23 cell-type-specific enhancer-gene maps. These maps were highly enriched for causal variants in expression quantitative trait loci and GWAS for 1,143 diseases and traits. We identified likely causal genes for both common and rare diseases and linked somatic mutation hotspots to target genes. We demonstrate that application of SCENT to multimodal data from disease-relevant human tissue enables the scalable construction of accurate cell-type-specific enhancer-gene maps, essential for defining noncoding variant function.

Weinand, Kathryn, Saori Sakaue, Aparna Nathan, Anna Helena Jonsson, Fan Zhang, Gerald F M Watts, Majd Al Suqri, et al. 2024. “The Chromatin Landscape of Pathogenic Transcriptional Cell States in Rheumatoid Arthritis.” Nature Communications 15 (1): 4650. https://doi.org/10.1038/s41467-024-48620-7.

Synovial tissue inflammation is a hallmark of rheumatoid arthritis (RA). Recent work has identified prominent pathogenic cell states in inflamed RA synovial tissue, such as T peripheral helper cells; however, the epigenetic regulation of these states has yet to be defined. Here, we examine genome-wide open chromatin at single-cell resolution in 30 synovial tissue samples, including 12 samples with transcriptional data in multimodal experiments. We identify 24 chromatin classes and predict their associated transcription factors, including a CD8+ GZMK+ class associated with EOMES and a lining fibroblast class associated with AP-1. By integrating with an RA tissue transcriptional atlas, we propose that these chromatin classes represent 'superstates' corresponding to multiple transcriptional cell states. Finally, we demonstrate the utility of this RA tissue chromatin atlas through the associations between disease phenotypes and chromatin class abundance, as well as the nomination of classes mediating the effects of putatively causal RA genetic variants.