Publications

2023

Albiñana, Clara, Zhihong Zhu, Andrew J Schork, Andrés Ingason, Hugues Aschard, Isabell Brikell, Cynthia M Bulik, et al. 2023. “Multi-PGS Enhances Polygenic Prediction by Combining 937 Polygenic Scores.” Nature Communications 14 (1): 4702. https://doi.org/10.1038/s41467-023-40330-w.

The predictive performance of polygenic scores (PGS) is largely dependent on the number of samples available to train the PGS. Increasing the sample size for a specific phenotype is expensive and takes time, but this sample size can be effectively increased by using genetically correlated phenotypes. We propose a framework to generate multi-PGS from thousands of publicly available genome-wide association studies (GWAS) with no need to individually select the most relevant ones. In this study, the multi-PGS framework increases prediction accuracy over single PGS for all included psychiatric disorders and other available outcomes, with prediction R² increases of up to 9-fold for attention-deficit/hyperactivity disorder compared to a single PGS. We also generate multi-PGS for phenotypes without an existing GWAS and for case-case predictions. We benchmark the multi-PGS framework against other methods and highlight its potential application to new emerging biobanks.

Pedersen, Emil M, Esben Agerbo, Oleguer Plana-Ripoll, Jette Steinbach, Morten D Krebs, David M Hougaard, Thomas Werge, et al. 2023. “ADuLT: An Efficient and Robust Time-to-Event GWAS.” Nature Communications 14 (1): 5553. https://doi.org/10.1038/s41467-023-41210-z.

Proportional hazards models have been proposed to analyse time-to-event phenotypes in genome-wide association studies (GWAS). However, little is known about the ability of proportional hazards models to identify genetic associations under different generative models and when ascertainment is present. Here we propose the age-dependent liability threshold (ADuLT) model as an alternative to a Cox regression based GWAS, here represented by SPACox. We compare ADuLT, SPACox, and standard case-control GWAS in simulations under two generative models and with varying degrees of ascertainment as well as in the iPSYCH cohort. We find Cox regression GWAS to be underpowered when cases are strongly ascertained (cases are oversampled by a factor of 5), regardless of the generative model used. ADuLT is robust to ascertainment in all simulated scenarios. Then, we analyse four psychiatric disorders in iPSYCH (ADHD, autism, depression, and schizophrenia), all with strong case ascertainment. Across these psychiatric disorders, ADuLT identifies 20 independent genome-wide significant associations, case-control GWAS finds 17, and SPACox finds 8, which is consistent with simulation results. As more genetic data are being linked to electronic health records, robust GWAS methods that can make use of age-of-onset information will help increase power in analyses for common health outcomes.
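As a rough illustration of the liability-threshold idea behind this abstract, the sketch below maps an age-specific cumulative incidence to a threshold on a standard-normal liability scale. The mapping T = Φ⁻¹(1 − K) is the generic liability threshold relation; how ADuLT itself derives and uses such thresholds is not described in the abstract, so the function name and the incidence values here are hypothetical.

```python
from statistics import NormalDist


def liability_threshold(cumulative_incidence: float) -> float:
    """Return the threshold T on a standard-normal liability scale such that
    P(liability > T) equals the given cumulative incidence."""
    if not 0.0 < cumulative_incidence < 1.0:
        raise ValueError("cumulative incidence must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - cumulative_incidence)


# Hypothetical cumulative incidence by age (illustrative numbers, not from the paper).
incidence_by_age = {10: 0.01, 20: 0.03, 40: 0.05}

# The threshold falls as cumulative incidence rises with age.
thresholds = {age: liability_threshold(k) for age, k in incidence_by_age.items()}
```

Under this toy setup, an older unaffected individual is compared against a lower threshold than a younger one, which is one way age information can enter a liability model.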

Privé, Florian, Clara Albiñana, Julyan Arbel, Bogdan Pasaniuc, and Bjarni J Vilhjálmsson. 2023. “Inferring Disease Architecture and Predictive Ability With LDpred2-Auto.” American Journal of Human Genetics 110 (12): 2042-55. https://doi.org/10.1016/j.ajhg.2023.10.010.

LDpred2 is a widely used Bayesian method for building polygenic scores (PGSs). LDpred2-auto can infer the two parameters from the LDpred model, the SNP heritability h² and polygenicity p, so that it does not require an additional validation dataset to choose the best-performing parameters. The main aim of this paper is to properly validate the use of LDpred2-auto for inferring multiple genetic parameters. Here, we present a new version of LDpred2-auto that adds an optional third parameter α to its model, for modeling negative selection. We then validate the inference of these three parameters (or two, when using the previous model). We also show that LDpred2-auto provides per-variant probabilities of being causal that are well calibrated and can therefore be used for fine-mapping purposes. We also introduce a formula to infer the out-of-sample predictive performance r² of the resulting PGS directly from the Gibbs sampler of LDpred2-auto. Finally, we extend the set of HapMap3 variants recommended for use with LDpred2 with 37% more variants to improve the coverage of this set, and we show that this new set of variants captures 12% more heritability and provides 6% more predictive performance, on average, in UK Biobank analyses.

Líndez, Pau Piera, Joachim Johansen, Svetlana Kutuzova, Arnor Ingi Sigurdsson, Jakob Nybo Nissen, and Simon Rasmussen. 2023. “Adversarial and Variational Autoencoders Improve Metagenomic Binning.” Communications Biology 6 (1): 1073. https://doi.org/10.1038/s42003-023-05452-3.

Assembly of reads from metagenomic samples is a hard problem, often resulting in highly fragmented genome assemblies. Metagenomic binning allows us to reconstruct genomes by re-grouping the sequences by their organism of origin, thus representing a crucial processing step when exploring the biological diversity of metagenomic samples. Here we present Adversarial Autoencoders for Metagenomics Binning (AAMB), an ensemble deep learning approach that integrates sequence co-abundances and tetranucleotide frequencies into a common denoised space that enables precise clustering of sequences into microbial genomes. When benchmarked, AAMB presented similar or better results compared with the state-of-the-art reference-free binner VAMB, reconstructing 7% more near-complete (NC) genomes across simulated and real data. In addition, genomes reconstructed using AAMB had higher completeness and greater taxonomic diversity compared with VAMB. Finally, we implemented a pipeline integrating VAMB and AAMB that enabled improved binning, recovering 20% and 29% more simulated and real NC genomes, respectively, compared to VAMB, with moderate additional runtime.
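To make the "tetranucleotide frequencies" input concrete, here is a minimal sketch that counts overlapping 4-mers in a contig and normalises them to frequencies. It is a plain illustration of the feature, not the AAMB or VAMB implementation; the function name is hypothetical, and windows containing characters other than A, C, G, or T are simply skipped, which is an assumption of this sketch.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(sequence: str) -> dict[str, float]:
    """Return the relative frequency of each of the 256 possible 4-mers,
    computed over all overlapping windows made up only of A, C, G, T."""
    seq = sequence.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    # Count every overlapping window of length 4 (empty if the sequence is too short).
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Only windows that are valid 4-mers contribute to the total.
    total = sum(counts[k] for k in kmers)
    if total == 0:
        return {k: 0.0 for k in kmers}
    return {k: counts[k] / total for k in kmers}


freqs = tetranucleotide_frequencies("ACGTACGT")
```

Each contig then becomes a fixed-length vector regardless of its length, which is what lets composition be combined with co-abundance in a shared feature space.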

Kronborg, Thit Mynster, Henry Webel, Malene Barfod O’Connell, Karen Vagner Danielsen, Lise Hobolth, Søren Møller, Rasmus Tanderup Jensen, et al. 2023. “Markers of Inflammation Predict Survival in Newly Diagnosed Cirrhosis: A Prospective Registry Study.” Scientific Reports 13 (1): 20039. https://doi.org/10.1038/s41598-023-47384-2.

The inflammatory activity in cirrhosis is often pronounced and related to episodes of decompensation. Systemic markers of inflammation may contain prognostic information, and we investigated their possible correlation with admissions and mortality among patients with newly diagnosed liver cirrhosis. We collected plasma samples from 149 patients with newly diagnosed (within the past 6 months) cirrhosis, and registered deaths and hospital admissions within 180 days. Ninety-two inflammatory markers were quantified and correlated with clinical variables, mortality, and admissions. Prediction models were calculated by logistic regression. We compared the disease courses of our cohort with a validation cohort of 86 patients with cirrhosis. Twenty of 92 markers of inflammation correlated significantly with mortality within 180 days (q-values of 0.00-0.044), whereas we found no significant correlations with liver-related admissions. The logistic regression models yielded AUROCs of 0.73 to 0.79 for mortality and 0.61 to 0.73 for liver-related admissions, based on a variety of modalities (clinical variables, inflammatory markers, clinical scores, or combinations thereof). The models performed moderately well in the validation cohort and were better able to predict mortality than liver-related admissions. In conclusion, markers of inflammation can be used to predict 180-day mortality in patients with newly diagnosed cirrhosis. Prediction models for newly diagnosed cirrhotic patients need further validation before implementation in clinical practice. Trial registration: NCT04422223 (and NCT03443934 for the validation cohort), and Scientific Ethics Committee No.: H-19024348.
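For readers unfamiliar with the AUROC values quoted above, the following sketch computes the metric directly from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores are toy values, not data from the study, and the function name is hypothetical.

```python
def auroc(labels: list[int], scores: list[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: predicted risk for four patients, two of whom had the outcome.
example = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported range of 0.73 to 0.79 for mortality indicates moderate discrimination.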

Negoita, Florentina, Alex B Addinsall, Kristina Hellberg, Conchita Fraguas Bringas, Paul S Hafen, Tyler J Sermersheim, Marianne Agerholm, et al. 2023. “CaMKK2 Is Not Involved in Contraction-Stimulated AMPK Activation and Glucose Uptake in Skeletal Muscle.” Molecular Metabolism 75: 101761. https://doi.org/10.1016/j.molmet.2023.101761.

OBJECTIVE: The AMP-activated protein kinase (AMPK) is activated in response to energetic stress such as contractions and plays a vital role in regulating various metabolic processes such as insulin-independent glucose uptake in skeletal muscle. The main upstream kinase that activates AMPK through phosphorylation of α-AMPK Thr172 in skeletal muscle is LKB1; however, some studies have suggested that Ca²⁺/calmodulin-dependent protein kinase kinase 2 (CaMKK2) acts as an alternative kinase to activate AMPK. We aimed to establish whether CaMKK2 is involved in activation of AMPK and promotion of glucose uptake following contractions in skeletal muscle.

METHODS: A recently developed CaMKK2 inhibitor (SGC-CAMKK2-1) alongside a structurally related but inactive compound (SGC-CAMKK2-1N), as well as CaMKK2 knock-out (KO) mice were used. In vitro kinase inhibition selectivity and efficacy assays, as well as cellular inhibition efficacy analyses of CaMKK inhibitors (STO-609 and SGC-CAMKK2-1) were performed. Phosphorylation and activity of AMPK following contractions (ex vivo) in mouse skeletal muscles treated with/without CaMKK inhibitors or isolated from wild-type (WT)/CaMKK2 KO mice were assessed. Camkk2 mRNA in mouse tissues was measured by qPCR. CaMKK2 protein expression was assessed by immunoblotting with or without prior enrichment of calmodulin-binding proteins from skeletal muscle extracts, as well as by mass spectrometry-based proteomics of mouse skeletal muscle and C2C12 myotubes.

RESULTS: STO-609 and SGC-CAMKK2-1 were equally potent and effective in inhibiting CaMKK2 in cell-free and cell-based assays, but SGC-CAMKK2-1 was much more selective. Contraction-stimulated phosphorylation and activation of AMPK were not affected by CaMKK inhibitors or in CaMKK2 null muscles. Contraction-stimulated glucose uptake was comparable between WT and CaMKK2 KO muscle. Both CaMKK inhibitors (STO-609 and SGC-CAMKK2-1) and the inactive compound (SGC-CAMKK2-1N) significantly inhibited contraction-stimulated glucose uptake. SGC-CAMKK2-1 also inhibited glucose uptake induced by a pharmacological AMPK activator or insulin. Relatively low levels of Camkk2 mRNA were detected in mouse skeletal muscle, but neither CaMKK2 protein nor its derived peptides were detectable in mouse skeletal muscle tissue.

CONCLUSIONS: We demonstrate that pharmacological inhibition or genetic loss of CaMKK2 affects neither contraction-stimulated AMPK phosphorylation and activation nor glucose uptake in skeletal muscle. The previously observed inhibitory effect of STO-609 on AMPK activity and glucose uptake is likely due to off-target effects. CaMKK2 protein is either absent from adult murine skeletal muscle or below the detection limit of currently available methods.

Ding, Yi, Kangcheng Hou, Ziqi Xu, Aditya Pimplaskar, Ella Petter, Kristin Boulier, Florian Privé, Bjarni J Vilhjálmsson, Loes M Olde Loohuis, and Bogdan Pasaniuc. 2023. “Polygenic Scoring Accuracy Varies across the Genetic Ancestry Continuum.” Nature 618 (7966): 774-81. https://doi.org/10.1038/s41586-023-06079-4.

Polygenic scores (PGSs) have limited portability across different groupings of individuals (for example, by genetic ancestries and/or social determinants of health), preventing their equitable use. PGS portability has typically been assessed using a single aggregate population-level statistic (for example, R²), ignoring inter-individual variation within the population. Here, using a large and diverse Los Angeles biobank (ATLAS, n = 36,778) along with the UK Biobank (UKBB, n = 487,409), we show that PGS accuracy decreases individual-to-individual along the continuum of genetic ancestries in all considered populations, even within traditionally labelled 'homogeneous' genetic ancestries. The decreasing trend is well captured by a continuous measure of genetic distance (GD) from the PGS training data: Pearson correlation of -0.95 between GD and PGS accuracy averaged across 84 traits. When applying PGS models trained on individuals labelled as white British in the UKBB to individuals with European ancestries in ATLAS, individuals in the furthest GD decile have 14% lower accuracy relative to the closest decile; notably, the closest GD decile of individuals with Hispanic Latino American ancestries show similar PGS performance to the furthest GD decile of individuals with European ancestries. GD is significantly correlated with PGS estimates themselves for 82 of 84 traits, further emphasizing the importance of incorporating the continuum of genetic ancestries in PGS interpretation. Our results highlight the need to move away from discrete genetic ancestry clusters towards the continuum of genetic ancestries when considering PGSs.
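The relationship between genetic distance and accuracy can be sketched as follows. The abstract only says that GD is a continuous measure of distance from the PGS training data; treating it as the Euclidean distance from the training-set centre in principal-component space is an assumption made here for illustration, and the function names and numbers are hypothetical.

```python
import math


def genetic_distance(pcs: tuple[float, ...], training_center: tuple[float, ...]) -> float:
    """Euclidean distance between an individual's principal-component
    coordinates and the centre of the training data (assumed definition)."""
    return math.dist(pcs, training_center)


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy values: accuracy per GD bin falling as distance grows (not data from the paper).
gd_bins = [0.5, 1.0, 1.5, 2.0, 2.5]
accuracy = [0.30, 0.27, 0.25, 0.21, 0.18]
r = pearson(gd_bins, accuracy)
```

A strongly negative coefficient in such an analysis corresponds to the reported pattern of accuracy declining along the ancestry continuum.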