Publications

2024

Lagattuta, Kaitlyn A, Hannah L Park, Laurie Rumker, Kazuyoshi Ishigaki, Aparna Nathan, and Soumya Raychaudhuri. 2024. “The Genetic Basis of Autoimmunity Seen through the Lens of T Cell Functional Traits.” Nature Communications 15 (1): 1204. https://doi.org/10.1038/s41467-024-45170-w.

Autoimmune disease heritability is enriched in T cell-specific regulatory regions of the genome. Modern-day T cell datasets now enable association studies between single nucleotide polymorphisms (SNPs) and a myriad of molecular phenotypes, including chromatin accessibility, gene expression, transcriptional programs, T cell antigen receptor (TCR) amino acid usage, and cell state abundances. Such studies have identified hundreds of quantitative trait loci (QTLs) in T cells that colocalize with genetic risk for autoimmune disease. The key challenge facing immunologists today lies in synthesizing these results toward a unified understanding of the autoimmune T cell: which genes, cell states, and antigens drive tissue destruction?

Weinand, Kathryn, Saori Sakaue, Aparna Nathan, Anna Helena Jonsson, Fan Zhang, Gerald F M Watts, Majd Al Suqri, et al. 2024. “The Chromatin Landscape of Pathogenic Transcriptional Cell States in Rheumatoid Arthritis.” Nature Communications 15 (1): 4650. https://doi.org/10.1038/s41467-024-48620-7.

Synovial tissue inflammation is a hallmark of rheumatoid arthritis (RA). Recent work has identified prominent pathogenic cell states in inflamed RA synovial tissue, such as T peripheral helper cells; however, the epigenetic regulation of these states has yet to be defined. Here, we examine genome-wide open chromatin at single-cell resolution in 30 synovial tissue samples, including 12 samples with transcriptional data in multimodal experiments. We identify 24 chromatin classes and predict their associated transcription factors, including a CD8+ GZMK+ class associated with EOMES and a lining fibroblast class associated with AP-1. By integrating with an RA tissue transcriptional atlas, we propose that these chromatin classes represent 'superstates' corresponding to multiple transcriptional cell states. Finally, we demonstrate the utility of this RA tissue chromatin atlas through the associations between disease phenotypes and chromatin class abundance, as well as the nomination of classes mediating the effects of putatively causal RA genetic variants.

Sakaue, Saori, Kathryn Weinand, Shakson Isaac, Kushal K Dey, Karthik Jagadeesh, Masahiro Kanai, Gerald F M Watts, et al. 2024. “Tissue-Specific Enhancer-Gene Maps from Multimodal Single-Cell Data Identify Causal Disease Alleles.” Nature Genetics 56 (4): 615-26. https://doi.org/10.1038/s41588-024-01682-1.

Translating genome-wide association study (GWAS) loci into causal variants and genes requires accurate cell-type-specific enhancer-gene maps from disease-relevant tissues. Building enhancer-gene maps is essential but challenging with current experimental methods in primary human tissues. Here we developed a nonparametric statistical method, SCENT (single-cell enhancer target gene mapping), that models association between enhancer chromatin accessibility and gene expression in single-cell or nucleus multimodal RNA sequencing and ATAC sequencing data. We applied SCENT to 9 multimodal datasets including >120,000 single cells or nuclei and created 23 cell-type-specific enhancer-gene maps. These maps were highly enriched for causal variants in expression quantitative trait loci and GWAS for 1,143 diseases and traits. We identified likely causal genes for both common and rare diseases and linked somatic mutation hotspots to target genes. We demonstrate that application of SCENT to multimodal data from disease-relevant human tissue enables the scalable construction of accurate cell-type-specific enhancer-gene maps, essential for defining noncoding variant function.

Luo, Yang, Chuan-Chin Huang, Nicole C Howard, Xin Wang, Qingyun Liu, Xinyi Li, Junhao Zhu, et al. 2024. “Paired Analysis of Host and Pathogen Genomes Identifies Determinants of Human Tuberculosis.” Nature Communications 15 (1): 10393. https://doi.org/10.1038/s41467-024-54741-w.

Infectious disease is the result of interactions between host and pathogen and can depend on genetic variations in both. We conduct a genome-to-genome study of paired human and Mycobacterium tuberculosis genomes from a cohort of 1556 tuberculosis patients in Lima, Peru. We identify an association between a human intronic variant (rs3130660, OR = 10.06, 95% CI: 4.87-20.77, P = 7.92 × 10⁻⁸) in the FLOT1 gene and a subclade of Mtb Lineage 2. In a human macrophage infection model, we observe hosts with the rs3130660-A allele exhibited stronger interferon gene signatures. The interacting strains have altered redox states due to a thioredoxin reductase mutation. We investigate this association in a 2020 cohort of 699 patients recruited during the COVID-19 pandemic. While the prevalence of the interacting strain almost doubled between 2010 and 2020, its infection is not associated with rs3130660 in this recent cohort. These findings suggest a complex interplay among host, pathogen, and environmental factors in tuberculosis dynamics.

Rumker, Laurie, Saori Sakaue, Yakir Reshef, Joyce B Kang, Seyhan Yazar, Jose Alquicira-Hernandez, Cristian Valencia, et al. 2024. “Identifying Genetic Variants That Influence the Abundance of Cell States in Single-Cell Data.” Nature Genetics 56 (10): 2068-77. https://doi.org/10.1038/s41588-024-01909-1.

Disease risk alleles influence the composition of cells present in the body, but modeling genetic effects on the cell states revealed by single-cell profiling is difficult because variant-associated states may reflect diverse combinations of the profiled cell features that are challenging to predefine. We introduce Genotype-Neighborhood Associations (GeNA), a statistical tool to identify cell-state abundance quantitative trait loci (csaQTLs) in high-dimensional single-cell datasets. Instead of testing associations to predefined cell states, GeNA flexibly identifies the cell states whose abundance is most associated with genetic variants. In a genome-wide survey of single-cell RNA sequencing peripheral blood profiling from 969 individuals, GeNA identifies five independent loci associated with shifts in the relative abundance of immune cell states. For example, rs3003-T (P = 1.96 × 10⁻¹¹) associates with increased abundance of natural killer cells expressing tumor necrosis factor response programs. This csaQTL colocalizes with increased risk for psoriasis, an autoimmune disease that responds to anti-tumor necrosis factor treatments. Flexibly characterizing csaQTLs for granular cell states may help illuminate how genetic background alters cellular composition to confer disease risk.

2023

Kang, Joyce B, Alessandro Raveane, Aparna Nathan, Nicole Soranzo, and Soumya Raychaudhuri. 2023. “Methods and Insights from Single-Cell Expression Quantitative Trait Loci.” Annual Review of Genomics and Human Genetics 24: 277-303. https://doi.org/10.1146/annurev-genom-101422-100437.

Recent advancements in single-cell technologies have enabled expression quantitative trait locus (eQTL) analysis across many individuals at single-cell resolution. Compared with bulk RNA sequencing, which averages gene expression across cell types and cell states, single-cell assays capture the transcriptional states of individual cells, including fine-grained, transient, and difficult-to-isolate populations at unprecedented scale and resolution. Single-cell eQTL (sc-eQTL) mapping can identify context-dependent eQTLs that vary with cell states, including some that colocalize with disease variants identified in genome-wide association studies. By uncovering the precise contexts in which these eQTLs act, single-cell approaches can unveil previously hidden regulatory effects and pinpoint important cell states underlying molecular mechanisms of disease. Here, we present an overview of recently deployed experimental designs in sc-eQTL studies. In the process, we consider the influence of study design choices such as cohort, cell states, and ex vivo perturbations. We then discuss current methodologies, modeling approaches, and technical challenges as well as future opportunities and applications.

Xiao, Qian, Joseph Mears, Aparna Nathan, Kazuyoshi Ishigaki, Yuriy Baglaenko, Noha Lim, Laura A Cooney, et al. 2023. “Immunosuppression Causes Dynamic Changes in Expression QTLs in Psoriatic Skin.” Nature Communications 14 (1): 6268. https://doi.org/10.1038/s41467-023-41984-2.

Psoriasis is a chronic, systemic inflammatory condition primarily affecting skin. While the role of the immune compartment (e.g., T cells) is well established, the changes in the skin compartment are more poorly understood. Using longitudinal skin biopsies (n = 375) from the "Psoriasis Treatment with Abatacept and Ustekinumab: A Study of Efficacy" (PAUSE) clinical trial (n = 101), we report 953 expression quantitative trait loci (eQTLs). Of those, 116 eQTLs have effect sizes that were modulated by local skin inflammation (eQTL interactions). By examining these eQTL genes (eGenes), we find that most are expressed in the skin tissue compartment, and a subset overlap with the NRF2 pathway. Indeed, the strongest eQTL interaction signal, rs1491377616-LCE3C, links a psoriasis risk locus with a gene specifically expressed in the epidermis. This eQTL study highlights the potential to use biospecimens from clinical trials to discover in vivo eQTL interactions with therapeutically relevant environmental variables.

Sakaue, Saori, Saisriram Gurajala, Michelle Curtis, Yang Luo, Wanson Choi, Kazuyoshi Ishigaki, Joyce B Kang, et al. 2023. “Tutorial: a Statistical Genetics Guide to Identifying HLA Alleles Driving Complex Disease.” Nature Protocols 18 (9): 2625-41. https://doi.org/10.1038/s41596-023-00853-4.

The human leukocyte antigen (HLA) locus is associated with more complex diseases than any other locus in the human genome. In many diseases, HLA explains more heritability than all other known loci combined. In silico HLA imputation methods enable rapid and accurate estimation of HLA alleles in the millions of individuals that are already genotyped on microarrays. HLA imputation has been used to define causal variation in autoimmune diseases, such as type I diabetes, and in human immunodeficiency virus infection control. However, there are few guidelines on performing HLA imputation, association testing, and fine mapping. Here, we present a comprehensive tutorial to impute HLA alleles from genotype data. We provide detailed guidance on performing standard quality control measures for input genotyping data and describe options to impute HLA alleles and amino acids either locally or using the web-based Michigan Imputation Server, which hosts a multi-ancestry HLA imputation reference panel. We also offer best practice recommendations to conduct association tests to define the alleles, amino acids, and haplotypes that affect human traits. Along with the pipeline, we provide a step-by-step online guide with scripts and available software (https://github.com/immunogenomics/HLA_analyses_tutorial). This tutorial will be broadly applicable to large-scale genotyping data and will contribute to defining the role of HLA in human diseases across global populations.

Gupta, Anika, Kathryn Weinand, Aparna Nathan, Saori Sakaue, Martin Jinye Zhang, Accelerating Medicines Partnership RA/SLE Program and Network, Laura Donlin, et al. 2023. “Dynamic Regulatory Elements in Single-Cell Multimodal Data Implicate Key Immune Cell States Enriched for Autoimmune Disease Heritability.” Nature Genetics 55 (12): 2200-2210. https://doi.org/10.1038/s41588-023-01577-7.

In autoimmune diseases such as rheumatoid arthritis, the immune system attacks the body's own cells. Developing a precise understanding of the cell states where noncoding autoimmune risk variants impart causal mechanisms is critical to developing curative therapies. Here, to identify noncoding regions with accessible chromatin that associate with cell-state-defining gene expression patterns, we leveraged multimodal single-nucleus RNA and assay for transposase-accessible chromatin (ATAC) sequencing data across 28,674 cells from the inflamed synovial tissue of 12 donors. Specifically, we used a multivariate Poisson model to predict peak accessibility from single-nucleus RNA sequencing principal components. For 14 autoimmune diseases, we discovered that cell-state-dependent ('dynamic') chromatin accessibility peaks in immune cell types were enriched for heritability, compared with cell-state-invariant ('cs-invariant') peaks. These dynamic peaks marked regulatory elements associated with T peripheral helper, regulatory T, dendritic and STAT1+CXCL10+ myeloid cell states. We argue that dynamic regulatory elements can help identify precise cell states enriched for disease-critical genetic variation.