2795 Publications

Cryo-EM images are intrinsically low dimensional

L. Evans, Octavian-Vlad Murad, P. Cossio, et al.

Simulation-based inference provides a powerful framework for cryoelectron microscopy, employing neural networks in methods like CryoSBI to infer biomolecular conformations via learned latent representations. This latent space represents a rich opportunity, encoding valuable information about the physical system and the inference process. Harnessing this potential hinges on understanding the underlying geometric structure of these representations. We investigate this structure by applying manifold learning techniques to CryoSBI representations of a simulated benchmark dataset and both simulated and experimental images of hemagglutinin. We reveal that these high-dimensional data inherently populate low-dimensional, smooth manifolds, with simulated data effectively covering the experimental counterpart. By characterizing the manifold's geometry using Diffusion Maps and identifying its principal axes of variation via coordinate interpretation methods, we establish a direct link between the latent structure and key physical parameters. Discovering this intrinsic low-dimensionality and interpretable geometric organization not only validates the CryoSBI approach but also enables us to learn more from the data structure and provides opportunities for improving future inference strategies by exploiting this revealed manifold geometry.

September 22, 2025
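
For readers unfamiliar with Diffusion Maps, the technique named in the abstract above can be sketched in a few lines of NumPy. This is a minimal illustration on toy data (a noisy circle embedded in 10 dimensions), not the authors' pipeline and not CryoSBI code. The function name `diffusion_map`, the Gaussian kernel, and the bandwidth `epsilon` are assumptions chosen for the example.

```python
import numpy as np


def diffusion_map(X, epsilon, n_components=2):
    """Basic diffusion-map embedding of the rows of X (illustrative sketch)."""
    # Pairwise squared Euclidean distances between all points.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # Gaussian affinity kernel; epsilon is an assumed bandwidth.
    K = np.exp(-d2 / epsilon)
    # Symmetric normalization A = D^{-1/2} K D^{-1/2}. A has the same
    # eigenvalues as the Markov matrix P = D^{-1} K but is symmetric.
    d = K.sum(axis=1)
    A = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(A)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    # Convert to right eigenvectors of P, then drop the trivial first one.
    psi = vecs / np.sqrt(d)[:, None]
    keep = slice(1, n_components + 1)
    return vals[keep], psi[:, keep] * vals[keep]


# Toy data: a 1-D manifold (circle) linearly embedded in 10-D, plus noise.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 200)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
Q, _ = np.linalg.qr(rng.normal(size=(10, 2)))  # 10x2, orthonormal columns
X = circle @ Q.T + 0.01 * rng.normal(size=(200, 10))

eigvals, embedding = diffusion_map(X, epsilon=0.5)
```

Here the leading nontrivial diffusion coordinates in `embedding` parametrize the circle, which is the sense in which high-dimensional points can populate a low-dimensional manifold.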

Cell clusters sense their global shape to drive collective migration

Joan Térmens, Irina Pi-Jaumà, I. Lavi, et al.

The collective migration of epithelial groups of cells plays a central role in processes such as embryo development, wound healing, and cancer invasion. While finite cell clusters are known to collectively migrate in response to external gradients, the competing effect of possible endogenous cues is largely unknown. Here, we demonstrate that the polarization of peripheral cells that pull the cluster's edge outward is sufficient to induce and sustain the collective migration of confluent clusters. We use a general continuum model to show that the underlying shape-sensing mechanism is purely mechanical, relying on long-range hydrodynamic interactions and cell-cell alignment forces. As a proof-of-concept, we validate our findings with experiments on monolayer clusters from various cell lines, where we control initial shapes and sizes. The mechanism operates independently of external signals and will generally interfere with them. Specifically, we predict and observe experimentally that it can override collective durotaxis, reversing the direction of migration. Together, our results offer a physical framework for understanding how cell interactions govern the interplay between global shape and collective motion and afford engineering principles for optimal control and manipulation of cell cluster shape and motion.

September 19, 2025

Towards Seamless Interoperability of MPI-OpenMP Applications

B. Smith, M. Berger, Junchao Zhang, Hui Zhou

A chasm exists between mathematical software libraries written for MPI-based applications and those written for OpenMP applications. Recently, however, PETSc enables the simple use of its MPI-based linear solvers from OpenMP applications. Separately, the MPICH MPI development team has started a new project to allow almost seamless MPI use in OpenMP applications. Both proposed approaches would result in a similar user experience. We discuss the reasons for these projects and their potential for providing more numerical library choices for OpenMP applications, including the unlimited assortment of linear solvers available in PETSc. In addition, we present the performance of an application using the first approach, demonstrating its efficacy.

Predicting partially observable dynamical systems via diffusion models with a multiscale inference scheme

R. Morel, Francesco Pio Ramunno, Jeff Shen, A. Bietti, K. Cho, M. Cranmer, S. Golkar, Olexandr Gugnin, G. Krawezik, et al.

Conditional diffusion models provide a natural framework for probabilistic prediction of dynamical systems and have been successfully applied to fluid dynamics and weather prediction. However, in many settings, the available information at a given time represents only a small fraction of what is needed to predict future states, either due to measurement uncertainty or because only a small fraction of the state can be observed. This is true for example in solar physics, where we can observe the Sun’s surface and atmosphere, but its evolution is driven by internal processes for which we lack direct measurements. In this paper, we tackle the probabilistic prediction of partially observable, long-memory dynamical systems, with applications to solar dynamics and the evolution of active regions. We show that standard inference schemes, such as autoregressive rollouts, fail to capture long-range dependencies in the data, largely because they do not integrate past information effectively. To overcome this, we propose a multiscale inference scheme for diffusion models, tailored to physical processes. Our method generates trajectories that are temporally fine-grained near the present and coarser as we move farther away, which enables capturing long-range temporal dependencies without increasing computational cost. When integrated into a diffusion model, we show that our inference scheme significantly reduces the bias of the predicted distributions and improves rollout stability.
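
The idea of a time grid that is fine near the present and coarser farther away can be illustrated with a short sketch. This is only one possible construction (step size multiplied by a fixed factor after each block of points), not the scheme used in the paper. The function name `multiscale_offsets` and its parameters are hypothetical.

```python
def multiscale_offsets(points_per_level, n_levels, growth=2):
    """Return future time offsets whose spacing grows with distance.

    Hypothetical helper: each level contributes `points_per_level` offsets,
    and the step between offsets is multiplied by `growth` after each level.
    """
    offsets = []
    t = 0
    step = 1
    for _ in range(n_levels):
        for _ in range(points_per_level):
            t += step
            offsets.append(t)
        step *= growth  # coarser resolution farther from the present
    return offsets


offsets = multiscale_offsets(points_per_level=4, n_levels=3)
# offsets == [1, 2, 3, 4, 6, 8, 10, 12, 16, 20, 24, 28]
```

With these settings, 12 grid points span a horizon of 28 steps, whereas a uniform grid at the finest resolution would need 28 points to reach the same horizon.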

AION-1: Omnimodal Foundation Model for Astronomical Sciences

L. Parker, F. Lanusse, Jeff Shen, Ollie Liu, Tom Hehir, L. Sarra, Lucas Meyer, Micah Bowles, S. Wagner-Carena, H. Qu, S. Golkar, A. Bietti, R. Morel, et al.

While foundation models have shown promise across a variety of fields, astronomy lacks a unified framework for joint modeling across its highly diverse data modalities. In this paper, we present AION-1, the first large-scale multimodal foundation family of models for astronomy. AION-1 enables arbitrary transformations between heterogeneous data types using a two-stage architecture: modality-specific tokenization followed by transformer-based masked modeling of cross-modal token sequences. Trained on over 200M astronomical objects, AION-1 demonstrates strong performance across regression, classification, generation, and object retrieval tasks. Beyond astronomy, AION-1 provides a scalable blueprint for multimodal scientific foundation models that can seamlessly integrate heterogeneous combinations of real-world observations. Our model release is entirely open source, including the dataset, training script, and weights.

ArchVelo: Archetypal Velocity Modeling for Single-cell Multi-omic Trajectories

M. Avdeeva, Sarah Walker, et al.

Inferring dynamic cellular processes from static single-cell measurements remains a central challenge in genomics. Here we introduce ArchVelo, a new method for modeling gene regulation and inferring cell trajectories using single-cell simultaneous chromatin accessibility (scATAC-seq) and transcriptomic (scRNA-seq) profiling. ArchVelo represents chromatin accessibility as a set of archetypes—shared regulatory programs—and models their dynamic influence on transcription. Compared to previous methods, ArchVelo improves inference accuracy and gene-level latent time alignment, and enables identification of the underlying transcription factor activity. We benchmark ArchVelo on developing mouse brain and human hematopoiesis datasets and apply it to CD8 T cells responding to viral infection, revealing distinct trajectories of differentiation and proliferation. Focusing on the progenitor CD8 T cell population with key roles in sustaining immune responses and translationally linked to immunotherapy outcomes, we identify a previously uncharacterized differentiation trajectory from Ccr6− to Ccr6+ progenitors, shared between acute and chronic infection. In sum, ArchVelo provides a principled framework for modeling dynamic gene regulation in multi-omic single-cell data across biological systems.

September 17, 2025

Spatial Frequency Maps in Human Visual Cortex: A Replication and Extension

Jiyeong Ha, B. Broderick, Kendrick Kay, J. Winawer

In a step toward developing a model of human primary visual cortex, a recent study introduced a model of spatial frequency tuning in V1 (Broderick, Simoncelli, & Winawer, 2022). The model is compact, using just 9 parameters to predict BOLD response amplitude for locations across all of V1 as a function of stimulus orientation and spatial frequency. Here we replicated this analysis in a new dataset, the ‘nsdsynthetic’ supplement to the Natural Scenes Dataset (Allen et al., 2022), to assess generalization of model parameters. Furthermore, we extended the analyses to extrastriate maps V2 and V3. For each retinotopic map in the 8 NSD subjects, we fit the 9-parameter model. Despite many experimental differences between NSD and the original study, including stimulus size, experimental design, and MR field strength, there was good agreement in most model parameters. The dependence of preferred spatial frequency on eccentricity in V1 was similar between NSD and Broderick et al. Moreover, the effect of absolute stimulus orientation on spatial frequency maps was similar: higher preferred spatial frequency for horizontal and cardinal orientations compared to vertical and oblique orientations in both studies. The extension to extrastriate maps revealed that the biggest change in tuning between maps was in bandwidth: the bandwidth in spatial frequency tuning increased by 70% from V1 to V2 and 100% from V1 to V3, paralleling known increases in receptive field size. Together, the results show robust reproducibility and bring us closer to a systematic characterization of spatial encoding in the human visual system.

September 17, 2025

Collective multicellular patterns arising from cadherin-linked cytoskeletal domains

In multicellular systems, adhesion complexes, such as those composed of E-cadherin and associated catenins, mechanically couple neighboring cells by directly linking their actin-based cytoskeletal assemblies. However, the mechanics of how forces are transmitted across these adhesions remains largely unstudied. Here, we introduce a biophysical model that explicitly couples adhesion complex dynamics to intracellular mechanics across cell boundaries. A cadherin dimer plus associated catenins connecting two cells is represented as a spring whose ends experience drag with respect to the moving actin cytoskeleton. The cytoskeleton is modeled as a contractile gel driven by myosin activity in its bulk and forces from adhesion on its boundaries. Our model captures this bidirectional coupling via a coarse-grained continuum framework and reveals a range of observed cell- and tissue-scale behaviors. These include global cell polarization of the multicellular collective, other polarization patterns and oscillatory dynamics, spontaneously formed actin rings within cells, and supracellular stress chains. Many of these features arise from modeling the direct mechanical coupling between cytoskeleton and adhesion. This model can be extended to other adhesion-cytoskeleton feedback systems and used to advance our understanding of multicellular tissue dynamics, particularly during development.

September 16, 2025

Live imaging endogenous transcription factor dynamics reveals mechanisms of epiblast and primitive endoderm fate segregation

Rebecca P. Kim-Yip, David Denberg, H. Nunley, et al.

The segregation of the epiblast (EPI) and primitive endoderm (PE) cell types in the preimplantation mouse embryo is not only a crucial decision that sets aside the precursors of the embryo proper from extraembryonic cells, respectively, but also has served as a central model to study a key concept in mammalian development: how much of developmental patterning is predetermined vs. stochastically emergent. Here, we address this question by quantitative live imaging of multiple endogenously tagged transcription factors key to this fate decision and trace their dynamics at a single-cell resolution through the formation of EPI and PE cell fates. Strikingly, we reveal an initial symmetry breaking event, the formation of a primary EPI cell lineage, and show that this is linked to the dynamics of the prior inner cell mass/trophectoderm fate decision through the expression of SOX2. This primary EPI lineage, through fibroblast growth factor (FGF) signaling, induces an increase in the transcription factor GATA6 in other inner cell mass cells, setting them on the course toward PE differentiation. Interestingly, this trajectory can switch during a defined developmental window, leading to the emergence of secondary EPI cells. Finally, we show that early expression levels of NANOG, which are seemingly stochastic, can bias whether a cell’s trajectory switches to secondary EPI or continues as PE. Our data give unique insight into how fate patterning is initiated and propagated during unperturbed embryonic development through the interplay of lineage-history-biased and stochastic cell-intrinsic molecular features, unifying previous models of EPI/PE segregation.
