Abstract
As spatial molecular data grow in scope and resolution, there is a pressing need to identify key spatial structures associated with disease. Current approaches often rely on hand-crafted features, such as local abundances of manually annotated, discrete cell types, and may therefore overlook important signals. Here we introduce variational inference-based microniche analysis (VIMA), a method that combines deep learning with principled statistics to discover disease-associated spatial features with greater flexibility and precision. VIMA uses a variational autoencoder to extract, from small tissue patches, numerical "fingerprints" that capture each patch's biological content. It uses these fingerprints to define a large number of "microniches": small, potentially overlapping groups of tissue patches with highly similar biology that span multiple samples. It then uses rigorous statistics to identify microniches whose abundance correlates with case-control status. We show in simulations that VIMA is well calibrated and is more powerful and accurate than other approaches. We then apply VIMA to a 140-gene spatial transcriptomics dataset in Alzheimer's dementia, a 54-marker CO-Detection by indEXing (CODEX) dataset in ulcerative colitis (UC), and a 7-marker immunohistochemistry dataset in rheumatoid arthritis (RA), in each case recapitulating known biology and identifying novel spatial features of disease.
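To make the three steps concrete (fingerprints, microniches, association testing), the following is a minimal illustrative sketch and not the VIMA implementation. It assumes the patch fingerprints have already been computed (the variational autoencoder is omitted), uses k-nearest-neighbor neighborhoods as a simple stand-in for microniches, and uses a max-statistic permutation test as a stand-in for the statistical procedure. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random

def knn_microniches(fingerprints, k):
    """One microniche per patch: the anchor patch plus its k nearest neighbors."""
    niches = []
    for i, anchor in enumerate(fingerprints):
        others = [j for j in range(len(fingerprints)) if j != i]
        others.sort(key=lambda j: math.dist(anchor, fingerprints[j]))
        niches.append([i] + others[:k])
    return niches

def abundance(niche, sample_of_patch, n_samples):
    """Fraction of each sample's patches that belong to the microniche."""
    totals = [0] * n_samples
    hits = [0] * n_samples
    for s in sample_of_patch:
        totals[s] += 1
    for patch in niche:
        hits[sample_of_patch[patch]] += 1
    # Assumes every sample contributes at least one patch.
    return [h / t for h, t in zip(hits, totals)]

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def association_test(fingerprints, sample_of_patch, is_case,
                     k=5, n_perm=200, seed=0):
    """Per-microniche correlations plus a global permutation p-value."""
    rng = random.Random(seed)
    n_samples = len(is_case)
    niches = knn_microniches(fingerprints, k)
    abund = [abundance(n, sample_of_patch, n_samples) for n in niches]

    def max_abs_corr(labels):
        return max(abs(pearson(a, labels)) for a in abund)

    observed = max_abs_corr(is_case)
    labels = list(is_case)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute case-control labels across samples
        if max_abs_corr(labels) >= observed:
            exceed += 1
    p_value = (1 + exceed) / (1 + n_perm)
    corrs = [pearson(a, is_case) for a in abund]
    return corrs, p_value

# Toy data (hypothetical): 8 samples with 6 patches each and 2-D fingerprints.
# In case samples, half of the patches come from a distinct "disease" cluster.
rng = random.Random(1)
is_case = [1, 1, 1, 1, 0, 0, 0, 0]
fingerprints, sample_of_patch = [], []
for sample, case in enumerate(is_case):
    for patch in range(6):
        center = 8.0 if (case and patch < 3) else 0.0
        spread = 0.5 if center else 1.0
        fingerprints.append((center + rng.uniform(-spread, spread),
                             center + rng.uniform(-spread, spread)))
        sample_of_patch.append(sample)

corrs, p_value = association_test(fingerprints, sample_of_patch, is_case)
print(f"max correlation: {max(corrs):.2f}, global permutation p: {p_value:.3f}")
```

In this sketch, labels are permuted at the sample level rather than the patch level, because patches from the same sample are not independent. Taking the maximum absolute correlation over all microniches yields a single global p-value that accounts for the many overlapping microniches tested.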