Publications

2023

Gupta, Anika, Kathryn Weinand, Aparna Nathan, Saori Sakaue, Martin Jinye Zhang, Accelerating Medicines Partnership RA/SLE Program and Network, Laura Donlin, et al. 2023. “Dynamic Regulatory Elements in Single-Cell Multimodal Data Implicate Key Immune Cell States Enriched for Autoimmune Disease Heritability.” Nature Genetics 55 (12): 2200-2210. https://doi.org/10.1038/s41588-023-01577-7.

In autoimmune diseases such as rheumatoid arthritis, the immune system attacks the body's own cells. Developing a precise understanding of the cell states where noncoding autoimmune risk variants impart causal mechanisms is critical to developing curative therapies. Here, to identify noncoding regions with accessible chromatin that associate with cell-state-defining gene expression patterns, we leveraged multimodal single-nucleus RNA and assay for transposase-accessible chromatin (ATAC) sequencing data across 28,674 cells from the inflamed synovial tissue of 12 donors. Specifically, we used a multivariate Poisson model to predict peak accessibility from single-nucleus RNA sequencing principal components. For 14 autoimmune diseases, we discovered that cell-state-dependent ('dynamic') chromatin accessibility peaks in immune cell types were enriched for heritability, compared with cell-state-invariant ('cs-invariant') peaks. These dynamic peaks marked regulatory elements associated with T peripheral helper, regulatory T, dendritic and STAT1+CXCL10+ myeloid cell states. We argue that dynamic regulatory elements can help identify precise cell states enriched for disease-critical genetic variation.
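The idea of predicting peak accessibility from expression principal components can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes simulated data, a single peak, a plain log-link Poisson regression fitted by Newton's method, and a likelihood-ratio statistic as the measure of cell-state dependence. The names `fit_poisson` and `poisson_loglik`, the sample size, and the effect sizes are all hypothetical.

```python
import numpy as np


def fit_poisson(features, counts, n_iter=50, tol=1e-8):
    """Fit a log-link Poisson regression with Newton's method."""
    design = np.column_stack([np.ones(len(counts)), features])
    beta = np.zeros(design.shape[1])
    beta[0] = np.log(counts.mean())  # start from the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(design @ beta)
        gradient = design.T @ (counts - mu)
        hessian = design.T @ (design * mu[:, None])
        step = np.linalg.solve(hessian, gradient)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def poisson_loglik(features, counts, beta):
    """Poisson log-likelihood, dropping the constant log(counts!) term."""
    design = np.column_stack([np.ones(len(counts)), features])
    eta = design @ beta
    return float(np.sum(counts * eta - np.exp(eta)))


# Simulated stand-in data (assumption): 5,000 cells, 3 expression PCs, one peak.
rng = np.random.default_rng(0)
n_cells = 5000
pcs = rng.normal(size=(n_cells, 3))
true_beta = np.array([-1.0, 0.5, 0.0, -0.3])  # intercept, then one slope per PC
counts = rng.poisson(np.exp(true_beta[0] + pcs @ true_beta[1:]))

beta_hat = fit_poisson(pcs, counts)

# Compare against a model in which accessibility does not depend on cell state.
beta_null = np.zeros_like(beta_hat)
beta_null[0] = np.log(counts.mean())
lrt = 2.0 * (poisson_loglik(pcs, counts, beta_hat)
             - poisson_loglik(pcs, counts, beta_null))

print("estimated coefficients:", np.round(beta_hat, 2))
print("likelihood-ratio statistic:", round(lrt, 1))
```

In this toy setting, a peak with a large likelihood-ratio statistic would be labeled 'dynamic' and a peak with a small one 'cs-invariant'; the thresholds and any covariates used in the paper are not given in the abstract.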

Sakaue, Saori, Saisriram Gurajala, Michelle Curtis, Yang Luo, Wanson Choi, Kazuyoshi Ishigaki, Joyce B Kang, et al. 2023. “Tutorial: a Statistical Genetics Guide to Identifying HLA Alleles Driving Complex Disease.” Nature Protocols 18 (9): 2625-41. https://doi.org/10.1038/s41596-023-00853-4.

The human leukocyte antigen (HLA) locus is associated with more complex diseases than any other locus in the human genome. In many diseases, HLA explains more heritability than all other known loci combined. In silico HLA imputation methods enable rapid and accurate estimation of HLA alleles in the millions of individuals that are already genotyped on microarrays. HLA imputation has been used to define causal variation in autoimmune diseases, such as type I diabetes, and in human immunodeficiency virus infection control. However, there are few guidelines on performing HLA imputation, association testing, and fine mapping. Here, we present a comprehensive tutorial to impute HLA alleles from genotype data. We provide detailed guidance on performing standard quality control measures for input genotyping data and describe options to impute HLA alleles and amino acids either locally or using the web-based Michigan Imputation Server, which hosts a multi-ancestry HLA imputation reference panel. We also offer best practice recommendations to conduct association tests to define the alleles, amino acids, and haplotypes that affect human traits. Along with the pipeline, we provide a step-by-step online guide with scripts and available software (https://github.com/immunogenomics/HLA_analyses_tutorial). This tutorial will be broadly applicable to large-scale genotyping data and will contribute to defining the role of HLA in human diseases across global populations.

Zhang, Fan, Anna Helena Jonsson, Aparna Nathan, Nghia Millard, Michelle Curtis, Qian Xiao, Maria Gutierrez-Arcelus, et al. 2023. “Deconstruction of Rheumatoid Arthritis Synovium Defines Inflammatory Subtypes.” Nature 623 (7987): 616-24. https://doi.org/10.1038/s41586-023-06708-y.

Rheumatoid arthritis is a prototypical autoimmune disease that causes joint inflammation and destruction. There is currently no cure for rheumatoid arthritis, and the effectiveness of treatments varies across patients, suggesting an undefined pathogenic diversity. Here, to deconstruct the cell states and pathways that characterize this pathogenic heterogeneity, we profiled the full spectrum of cells in inflamed synovium from patients with rheumatoid arthritis. We used multi-modal single-cell RNA-sequencing and surface protein data coupled with histology of synovial tissue from 79 donors to build a single-cell atlas of rheumatoid arthritis synovial tissue that includes more than 314,000 cells. We stratified tissues into six groups, referred to as cell-type abundance phenotypes (CTAPs), each characterized by selectively enriched cell states. These CTAPs demonstrate the diversity of synovial inflammation in rheumatoid arthritis, ranging from samples enriched for T and B cells to those largely lacking lymphocytes. Disease-relevant cell states, cytokines, risk genes, histology and serology metrics are associated with particular CTAPs. CTAPs are dynamic and can predict treatment response, highlighting the clinical utility of classifying rheumatoid arthritis synovial phenotypes. This comprehensive atlas and molecular, tissue-based stratification of rheumatoid arthritis synovial tissue reveal new insights into rheumatoid arthritis pathology and heterogeneity that could inform novel targeted treatments.

2022

Reshef, Yakir A, Laurie Rumker, Joyce B Kang, Aparna Nathan, Ilya Korsunsky, Samira Asgari, Megan B Murray, Branch Moody, and Soumya Raychaudhuri. 2022. “Co-Varying Neighborhood Analysis Identifies Cell Populations Associated With Phenotypes of Interest from Single-Cell Transcriptomics.” Nature Biotechnology 40 (3): 355-63. https://doi.org/10.1038/s41587-021-01066-4.

As single-cell datasets grow in sample size, there is a critical need to characterize cell states that vary across samples and associate with sample attributes, such as clinical phenotypes. Current statistical approaches typically map cells to clusters and then assess differences in cluster abundance. Here we present co-varying neighborhood analysis (CNA), an unbiased method to identify associated cell populations with greater flexibility than cluster-based approaches. CNA characterizes dominant axes of variation across samples by identifying groups of small regions in transcriptional space, termed neighborhoods, that co-vary in abundance across samples, suggesting shared function or regulation. CNA performs statistical testing for associations between any sample-level attribute and the abundances of these co-varying neighborhood groups. Simulations show that CNA enables more sensitive and accurate identification of disease-associated cell states than a cluster-based approach. When applied to published datasets, CNA captures a Notch activation signature in rheumatoid arthritis, identifies monocyte populations expanded in sepsis and identifies a novel T cell population associated with progression to active tuberculosis.
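The steps named in this abstract (neighborhood abundances per sample, dominant axes of co-variation, and an association test against a sample attribute) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the published method or its software: it uses simulated two-dimensional coordinates, a brute-force nearest-neighbor graph, a fixed number of random-walk steps, and a permutation test on the variance explained by the top components. The function names `neighborhood_abundance` and `association_test` and all parameter values are hypothetical.

```python
import numpy as np


def neighborhood_abundance(coords, sample_ids, n_samples, k=15, n_steps=3):
    """Samples x neighborhoods matrix from a random walk on a kNN graph."""
    n_cells = coords.shape[0]
    sq = np.sum(coords ** 2, axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * coords @ coords.T
    nearest = np.argpartition(dist2, k, axis=1)[:, :k + 1]
    adj = np.zeros((n_cells, n_cells))
    adj[np.repeat(np.arange(n_cells), k + 1), nearest.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)               # symmetrize the graph
    np.fill_diagonal(adj, 1.0)                 # every cell keeps a self-loop
    walk = adj / adj.sum(axis=0, keepdims=True)  # column-stochastic transitions
    nam = np.zeros((n_samples, n_cells))
    nam[sample_ids, np.arange(n_cells)] = 1.0  # each cell starts in its own sample
    for _ in range(n_steps):
        nam = nam @ walk.T                     # spread each sample's mass to neighbors
    return nam / nam.sum(axis=1, keepdims=True)


def association_test(nam, phenotype, n_pcs=3, n_perm=1000, seed=0):
    """Permutation test of phenotype variance explained by the top NAM components."""
    centered = nam - nam.mean(axis=0)
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    pcs = u[:, :n_pcs]                         # orthonormal sample loadings
    y = phenotype - phenotype.mean()

    def r_squared(values):
        return float(np.sum((pcs.T @ values) ** 2) / np.sum(values ** 2))

    observed = r_squared(y)
    rng = np.random.default_rng(seed)
    null = np.array([r_squared(rng.permutation(y)) for _ in range(n_perm)])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p_value)


# Simulated stand-in data (assumption): 20 samples, 100 cells each, two cell
# states; cases carry more cells from the second state than controls do.
rng = np.random.default_rng(0)
n_samples, cells_per_sample = 20, 100
phenotype = np.repeat([0.0, 1.0], n_samples // 2)
coords, sample_ids = [], []
for s in range(n_samples):
    n_second_state = 70 if phenotype[s] == 1.0 else 10
    centers = np.zeros((cells_per_sample, 2))
    centers[:n_second_state, 0] = 5.0
    coords.append(centers + rng.normal(size=(cells_per_sample, 2)))
    sample_ids.append(np.full(cells_per_sample, s))
coords = np.vstack(coords)
sample_ids = np.concatenate(sample_ids)

nam = neighborhood_abundance(coords, sample_ids, n_samples)
r2, p_value = association_test(nam, phenotype)
print("variance explained:", round(r2, 2), "permutation p-value:", p_value)
```

Because no clusters are defined at any point, the association is detected from local abundance shifts alone, which is the flexibility the abstract contrasts with cluster-based testing.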

Ishigaki, Kazuyoshi, Kaitlyn A Lagattuta, Yang Luo, Eddie A James, Jane H Buckner, and Soumya Raychaudhuri. 2022. “HLA Autoimmune Risk Alleles Restrict the Hypervariable Region of T Cell Receptors.” Nature Genetics 54 (4): 393-402. https://doi.org/10.1038/s41588-022-01032-z.

Polymorphisms in the human leukocyte antigen (HLA) genes strongly influence autoimmune disease risk. HLA risk alleles may influence thymic selection to increase the frequency of T cell receptors (TCRs) reactive to autoantigens (central hypothesis). However, research in human autoimmunity has provided little evidence supporting the central hypothesis. Here we investigated the influence of HLA alleles on TCR composition at the highly diverse complementarity determining region 3 (CDR3), which confers antigen recognition. We observed unexpectedly strong HLA-CDR3 associations. The strongest association was found at HLA-DRB1 amino acid position 13, the position that mediates genetic risk for multiple autoimmune diseases. We identified multiple CDR3 amino acid features enriched by HLA risk alleles. Moreover, the CDR3 features promoted by the HLA risk alleles are more enriched in candidate pathogenic TCRs than control TCRs (for example, citrullinated epitope-specific TCRs in patients with rheumatoid arthritis). Together, these results provide genetic evidence supporting the central hypothesis.

Lagattuta, Kaitlyn A, Joyce B Kang, Aparna Nathan, Kristen E Pauken, Anna Helena Jonsson, Deepak A Rao, Arlene H Sharpe, Kazuyoshi Ishigaki, and Soumya Raychaudhuri. 2022. “Repertoire Analyses Reveal T Cell Antigen Receptor Sequence Features That Influence T Cell Fate.” Nature Immunology 23 (3): 446-57. https://doi.org/10.1038/s41590-022-01129-x.

T cells acquire a regulatory phenotype when their T cell antigen receptors (TCRs) experience an intermediate- to high-affinity interaction with a self-peptide presented via the major histocompatibility complex (MHC). Using TCRβ sequences from flow-sorted human cells, we identified TCR features that promote regulatory T cell (Treg) fate. From these results, we developed a scoring system to quantify TCR-intrinsic regulatory potential (TiRP). When applied to the tumor microenvironment, TiRP scoring helped to explain why only some T cell clones maintained the conventional T cell (Tconv) phenotype through expansion. To elucidate drivers of these predictive TCR features, we then examined the two elements of the Treg TCR ligand separately: the self-peptide and the human MHC class II molecule. These analyses revealed that hydrophobicity in the third complementarity-determining region (CDR3β) of the TCR promotes reactivity to self-peptides, while TCR variable gene (TRBV gene) usage shapes the TCR's general propensity for human MHC class II-restricted activation.

Nathan, Aparna, Samira Asgari, Kazuyoshi Ishigaki, Cristian Valencia, Tiffany Amariuta, Yang Luo, Jessica I Beynor, et al. 2022. “Single-Cell EQTL Models Reveal Dynamic T Cell State Dependence of Disease Loci.” Nature 606 (7912): 120-28. https://doi.org/10.1038/s41586-022-04713-1.

Non-coding genetic variants may cause disease by modulating gene expression. However, identifying these expression quantitative trait loci (eQTLs) is complicated by differences in gene regulation across fluid functional cell states within cell types. These states (for example, neurotransmitter-driven programs in astrocytes or perivascular fibroblast differentiation) are obscured in eQTL studies that aggregate cells. Here we modelled eQTLs at single-cell resolution in one complex cell type: memory T cells. Using more than 500,000 unstimulated memory T cells from 259 Peruvian individuals, we show that around one-third of 6,511 cis-eQTLs had effects that were mediated by continuous multimodally defined cell states, such as cytotoxicity and regulatory capacity. In some loci, independent eQTL variants had opposing cell-state relationships. Autoimmune variants were enriched in cell-state-dependent eQTLs, including risk variants for rheumatoid arthritis near ORMDL3 and CTLA4; this indicates that cell-state context is crucial to understanding potential eQTL pathogenicity. Moreover, continuous cell states explained more variation in eQTLs than did conventional discrete categories, such as CD4+ versus CD8+, suggesting that modelling eQTLs and cell states at single-cell resolution can expand insight into gene regulation in functionally heterogeneous cell types.
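A cell-state-dependent eQTL can be thought of as a genotype effect on expression that changes along a continuous cell-state score, which is commonly written as a genotype-by-state interaction term. The sketch below illustrates that idea only. The abstract does not specify the model, so this uses ordinary least squares on simulated data as a stand-in and ignores donor-level random effects and the count nature of single-cell expression. All variable names and effect sizes are hypothetical.

```python
import numpy as np

# Simulated stand-in data (assumption): 50 donors, 100 cells per donor.
rng = np.random.default_rng(1)
n_donors, cells_per_donor = 50, 100
donor_genotype = rng.binomial(2, 0.3, size=n_donors)      # allele dosage 0/1/2
genotype = np.repeat(donor_genotype, cells_per_donor).astype(float)
cell_state = rng.normal(size=genotype.size)               # e.g. a cytotoxicity score

# Columns: intercept, genotype, cell state, genotype x cell state.
design = np.column_stack([
    np.ones_like(genotype),
    genotype,
    cell_state,
    genotype * cell_state,
])
true_effects = np.array([1.0, 0.2, 0.3, 0.4])
expression = design @ true_effects + rng.normal(size=genotype.size)

# Least-squares fit; the last coefficient is the state dependence of the eQTL.
estimates, *_ = np.linalg.lstsq(design, expression, rcond=None)


def eqtl_effect_at(state):
    """Per-allele effect on expression for cells with the given state score."""
    return estimates[1] + estimates[3] * state


print("interaction estimate:", round(float(estimates[3]), 2))
print("eQTL effect at state -1:", round(float(eqtl_effect_at(-1.0)), 2))
print("eQTL effect at state +1:", round(float(eqtl_effect_at(1.0)), 2))
```

A nonzero interaction coefficient means that averaging over all cells would report a single blended effect, which is the loss of information the abstract attributes to studies that aggregate cells.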

Korsunsky, Ilya, Kevin Wei, Mathilde Pohin, Edy Y Kim, Francesca Barone, Triin Major, Emily Taylor, et al. 2022. “Cross-Tissue, Single-Cell Stromal Atlas Identifies Shared Pathological Fibroblast Phenotypes in Four Chronic Inflammatory Diseases.” Med (New York, N.Y.) 3 (7): 481-518.e14. https://doi.org/10.1016/j.medj.2022.05.002.

BACKGROUND: Pro-inflammatory fibroblasts are critical for pathogenesis in rheumatoid arthritis, inflammatory bowel disease, interstitial lung disease, and Sjögren's syndrome and represent a novel therapeutic target for chronic inflammatory disease. However, the heterogeneity of fibroblast phenotypes, exacerbated by the lack of a common cross-tissue taxonomy, has limited our understanding of which pathways are shared by multiple diseases.

METHODS: We profiled fibroblasts derived from inflamed and non-inflamed synovium, intestine, lungs, and salivary glands from affected individuals with single-cell RNA sequencing. We integrated all fibroblasts into a multi-tissue atlas to characterize shared and tissue-specific phenotypes.

FINDINGS: Two shared clusters, CXCL10+CCL19+ immune-interacting and SPARC+COL3A1+ vascular-interacting fibroblasts, were expanded in all inflamed tissues and mapped to dermal analogs in a public atopic dermatitis atlas. We confirmed these human pro-inflammatory fibroblasts in animal models of lung, joint, and intestinal inflammation.

CONCLUSIONS: This work represents a thorough investigation into fibroblasts across organ systems, individual donors, and disease states that reveals shared pathogenic activation states across four chronic inflammatory diseases.

FUNDING: Grant from F. Hoffmann-La Roche (Roche) AG.

Ishigaki, Kazuyoshi, Saori Sakaue, Chikashi Terao, Yang Luo, Kyuto Sonehara, Kensuke Yamaguchi, Tiffany Amariuta, et al. 2022. “Multi-Ancestry Genome-Wide Association Analyses Identify Novel Genetic Mechanisms in Rheumatoid Arthritis.” Nature Genetics 54 (11): 1640-51. https://doi.org/10.1038/s41588-022-01213-w.

Rheumatoid arthritis (RA) is a highly heritable complex disease with unknown etiology. Multi-ancestry genetic research of RA promises to improve power to detect genetic signals, fine-mapping resolution and the performance of polygenic risk scores (PRS). Here, we present a large-scale genome-wide association study (GWAS) of RA, which includes 276,020 samples from five ancestral groups. We conducted a multi-ancestry meta-analysis and identified 124 loci (P < 5 × 10⁻⁸), of which 34 are novel. Candidate genes at the novel loci suggest essential roles of the immune system (for example, TNIP2 and TNFRSF11A) and joint tissues (for example, WISP1) in RA etiology. Multi-ancestry fine-mapping identified putatively causal variants with biological insights (for example, LEF1). Moreover, PRS based on multi-ancestry GWAS outperformed PRS based on single-ancestry GWAS and had comparable performance between populations of European and East Asian ancestries. Our study provides several insights into the etiology of RA and improves the genetic predictability of RA.
