2573 Publications

Comparing noisy neural population dynamics using optimal transport distances

A. Nejatbakhsh, Victor Geadah, A. Williams, D. Lipshutz

Biological and artificial neural systems form high-dimensional neural representations that underpin their computational capabilities. Methods for quantifying geometric similarity in neural representations have become a popular tool for identifying computational principles that are potentially shared across neural systems. These methods generally assume that neural responses are deterministic and static. However, responses of biological systems, and some artificial systems, are noisy and dynamically unfold over time. Furthermore, these characteristics can have substantial influence on a system's computational capabilities. Here, we demonstrate that existing metrics can fail to capture key differences between neural systems with noisy dynamic responses. We then propose a metric for comparing the geometry of noisy neural trajectories, which can be derived as an optimal transport distance between Gaussian processes. We use the metric to compare models of neural responses in different regions of the motor system and to compare the dynamics of latent diffusion models for text-to-image synthesis.
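
The basic ingredient behind such optimal transport metrics is the closed-form 2-Wasserstein (Bures) distance between two Gaussian distributions. Below is a minimal NumPy/SciPy sketch of this static, single-time-point case; it is an illustration of the underlying quantity, not the paper's full metric for Gaussian processes:

```python
# Minimal sketch: closed-form 2-Wasserstein (Bures) distance between two
# Gaussians N(m1, S1) and N(m2, S2). Illustrates the static ingredient of
# an OT metric between noisy responses; not the paper's GP-level metric.
import numpy as np
from scipy.linalg import sqrtm

def gaussian_w2(m1, S1, m2, S2):
    """2-Wasserstein distance between N(m1, S1) and N(m2, S2)."""
    sqrt_S1 = sqrtm(S1)
    cross = sqrtm(sqrt_S1 @ S2 @ sqrt_S1)
    # Bures term; clip tiny negative values from numerical error
    bures = np.trace(S1 + S2 - 2.0 * np.real(cross))
    return np.sqrt(np.sum((m1 - m2) ** 2) + max(bures, 0.0))

# Example: two 3-d Gaussian response distributions to the same stimulus
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
print(gaussian_w2(np.zeros(3), A @ A.T, np.ones(3), B @ B.T))
```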

Learning locally dominant force balances in active particle systems

Dominik Sturm, S. Maddu, Ivo F. Sbalzarini

We use a combination of unsupervised clustering and sparsity-promoting inference algorithms to learn locally dominant force balances that explain macroscopic pattern formation in self-organized active particle systems. The self-organized emergence of macroscopic patterns from microscopic interactions between self-propelled particles can be widely observed in nature. Although hydrodynamic theories help us better understand the physical basis of this phenomenon, identifying a sufficient set of local interactions that shape, regulate and sustain self-organized structures in active particle systems remains challenging. We investigate a classic hydrodynamic model of self-propelled particles that produces a wide variety of patterns, such as asters and moving density bands. Our data-driven analysis shows that propagating bands are formed by local alignment interactions driven by density gradients, while steady-state asters are shaped by a mechanism of splay-induced negative compressibility arising from strong particle interactions. Our method also reveals analogous physical principles of pattern formation in a system in which particle speed depends on the local density, demonstrating its ability to reveal physical commonalities across models. The physical mechanisms inferred from the data are in excellent agreement with analytical scaling arguments and experimental observations.
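
A toy two-stage sketch of the general idea: cluster points by their local feature signature, then fit a sparsity-promoting regression per cluster to find the locally dominant terms. The features, coefficients, and library terms here are synthetic placeholders, not the paper's hydrodynamic terms or pipeline:

```python
# Toy sketch of the clustering + sparse-regression idea: group spatial
# points by local feature signature, then fit a sparse linear force
# balance per cluster. All data and term names are synthetic placeholders.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
# Columns stand in for candidate terms (density gradient, alignment, splay, ...)
features = rng.standard_normal((1000, 4))
target = 2.0 * features[:, 0] - 0.5 * features[:, 2]   # synthetic "force"

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
for k in range(3):
    mask = labels == k
    coef = Lasso(alpha=0.1).fit(features[mask], target[mask]).coef_
    print(f"cluster {k}: dominant terms {np.flatnonzero(np.abs(coef) > 1e-3)}")
```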

A minimal dynamical system and analog circuit for non-associative learning

M. Smart, S. Shvartsman, Martin Mönnigmann

Learning in living organisms is typically associated with networks of neurons. The use of large numbers of adjustable units has also been a crucial factor in the continued success of artificial neural networks. In light of the complexity of both living and artificial neural networks, it is surprising to see that very simple organisms -- even unicellular organisms that do not possess a nervous system -- are capable of certain forms of learning. Since in these cases learning may be implemented with much simpler structures than neural networks, it is natural to ask how simple the building blocks required for basic forms of learning may be. The purpose of this study is to discuss the simplest dynamical systems that model a fundamental form of non-associative learning, habituation, and to elucidate technical implementations of such systems, which may be used to implement non-associative learning in neuromorphic computing and related applications.
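
As an illustration of how few components such a system may need, the following two-variable toy model (an illustrative assumption, not necessarily the paper's system) habituates: a fast response variable is driven by a stimulus but suppressed by a slowly accumulating memory variable, so repeated pulses evoke progressively weaker responses:

```python
# Minimal illustrative habituation model (not necessarily the paper's
# system): a fast response r is driven by stimulus s but suppressed by a
# slowly accumulating memory variable m, so repeated pulses evoke
# progressively weaker responses that recover after a rest period.
import numpy as np

dt, T = 0.01, 60.0
t = np.arange(0.0, T, dt)
s = ((t % 5.0) < 0.5).astype(float)        # periodic stimulus pulses
r, m = np.zeros_like(t), np.zeros_like(t)
tau_r, tau_m, gain = 0.2, 10.0, 5.0

for i in range(1, len(t)):
    drive = s[i] / (1.0 + gain * m[i - 1])              # depressed drive
    r[i] = r[i - 1] + dt * (-r[i - 1] + drive) / tau_r  # fast response
    m[i] = m[i - 1] + dt * (-m[i - 1] + s[i]) / tau_m   # slow accumulation

# Peak response to each of the first five pulses shrinks monotonically
print([round(r[(t > 5 * k) & (t < 5 * k + 1)].max(), 3) for k in range(5)])
```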

xVal: A Continuous Numerical Tokenization for Scientific Language Models

Siavash Golkar, Mariel Pettee, M. Eickenberg, A. Bietti, et al.

Due in part to their discontinuous and discrete default encodings for numbers, Large Language Models (LLMs) have not yet been commonly used to process numerically-dense scientific datasets. Rendering datasets as text, however, could help aggregate diverse and multi-modal scientific data into a single training corpus, thereby potentially facilitating the development of foundation models for science. In this work, we introduce xVal, a strategy for continuously tokenizing numbers within language models that results in a more appropriate inductive bias for scientific applications. By training specially-modified language models from scratch on a variety of scientific datasets formatted as text, we find that xVal generally outperforms other common numerical tokenization strategies on metrics including out-of-distribution generalization and computational efficiency.
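
Schematically, xVal maps every number in the input to a single [NUM] token and scales that token's embedding by the numerical value, giving a continuous encoding. The sketch below is an illustrative reconstruction of this idea; the function names and tokenization details are placeholders, not the released implementation:

```python
# Illustrative reconstruction of the xVal idea: each number becomes a
# single [NUM] token whose embedding is multiplicatively scaled by the
# (normalized) value. Names here are placeholders, not the released code.
import re
import numpy as np

def xval_encode(text, embed, num_embedding):
    """embed: dict token -> vector; num_embedding: the [NUM] vector."""
    vectors = []
    for tok in text.split():
        if re.fullmatch(r"-?\d+(\.\d+)?", tok):
            vectors.append(float(tok) * num_embedding)  # value scales [NUM]
        else:
            vectors.append(embed[tok])
    return np.stack(vectors)

d = 8
vocab = {w: np.random.randn(d) for w in ["T", "="]}
enc = xval_encode("T = 0.75", vocab, np.random.randn(d))
print(enc.shape)  # (3, d): continuous embedding, no digit-level tokens
```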

Soft Matching Distance: A metric on neural representations that captures single-neuron tuning

A. Williams, Meenakshi Khosla

Common measures of neural representational (dis)similarity are designed to be insensitive to rotations and reflections of the neural activation space. Motivated by the premise that the tuning of individual units may be important, there has been recent interest in developing stricter notions of representational (dis)similarity that require neurons to be individually matched across networks. When two networks have the same size (i.e. same number of neurons), a distance metric can be formulated by optimizing over neuron index permutations to maximize tuning curve alignment. However, it is not clear how to generalize this metric to measure distances between networks with different sizes. Here, we leverage a connection to optimal transport theory to derive a natural generalization based on “soft” permutations. The resulting metric is symmetric, satisfies the triangle inequality, and can be interpreted as a Wasserstein distance between two empirical distributions. Further, our proposed metric avoids counter-intuitive outcomes suffered by alternative approaches, and captures complementary geometric insights into neural representations that are entirely missed by rotation-invariant metrics.
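
In the equal-size special case, the metric reduces to optimizing over hard permutations, which can be computed with the Hungarian algorithm; a minimal sketch of that case follows (the paper's contribution is the generalization to "soft" permutations, i.e., transport plans, for networks of different sizes):

```python
# Sketch of the hard-matching special case: with equal neuron counts, the
# distance minimizes tuning-curve mismatch over neuron permutations. The
# paper generalizes this to soft permutations (transport plans).
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def permutation_matching_distance(X, Y):
    """X, Y: (n_neurons, n_stimuli) tuning matrices of equal size."""
    cost = cdist(X, Y, metric="sqeuclidean")   # pairwise neuron mismatch
    rows, cols = linear_sum_assignment(cost)   # optimal permutation
    return np.sqrt(cost[rows, cols].mean())

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 20))
Y = X[rng.permutation(50)] + 0.1 * rng.standard_normal((50, 20))
print(permutation_matching_distance(X, Y))    # small: same tuning, shuffled
```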

A tug-of-war between germ cell motility and intercellular bridges controls germline cyst formation in mice

Ezra W. Levy, Isabella Leite, S. Shvartsman, et al.

Gametes in many species develop in cysts—clusters of germ cells formed by incomplete cytokinesis—that remain connected through intercellular bridges (ICBs). These connections enable sharing of cytoplasmic components between germ cells and, in the female germ line, enrich select cells in the cyst to become the oocyte(s). In mice, germline cysts of variable sizes are generated during embryonic development; this variability is thought to result from cyst fractures. Studies of fixed samples failed to capture fracture events, and thus the mechanism remained elusive. Here, we use high-resolution live imaging of germ cells within their native tissue environment to visualize germline cyst dynamics. With this novel approach, we reveal a striking motile phenotype of gonad-resident germ cells and show that this randomly oriented cell-autonomous motile behavior during cyst formation underlies fracture events. Conversely, we show that stabilized ICBs help resist excessive fracturing. Additionally, we find that motility and thus fracture rates gradually decrease during development in a sex-dependent manner, completely ceasing by the end of cyst-forming divisions. These results lead to a model where the opposing activities of developmentally regulated cell motility and stable ICBs give rise to cysts of variable sizes. We corroborate these results by developing a model that uses experimentally measured fracture rates to simulate cyst formation and fracture, and we show that it reproduces experimentally measured cyst sizes in both males and females. Understanding how variable cysts form will enable further studies of mammalian oocyte selection and establishment of the ovarian reserve.
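
A toy version of such a fracture simulation is sketched below, with illustrative parameters and an assumed chain-like bridge topology; both are simplifying assumptions for exposition, not the paper's calibrated model:

```python
# Toy fracture model: cysts double at each synchronous division, and each
# intercellular bridge along an assumed chain independently fractures with
# probability p between divisions, splitting the cyst into segments.
# Rates and topology are illustrative assumptions, not measured values.
import numpy as np

rng = np.random.default_rng(3)

def simulate_cysts(n_divisions=5, p_fracture=0.2, n_founders=100):
    cysts = [1] * n_founders
    for _ in range(n_divisions):
        cysts = [2 * c for c in cysts]                 # synchronous division
        next_gen = []
        for c in cysts:
            fractured = rng.random(c - 1) < p_fracture # each bridge may break
            cuts = np.flatnonzero(fractured) + 1
            # segment sizes between fractured bridges along the chain
            segments = np.diff(np.concatenate(([0], cuts, [c])))
            next_gen.extend(int(s) for s in segments)
        cysts = next_gen
    return np.array(cysts)

sizes = simulate_cysts()
print("mean cyst size:", sizes.mean(), "largest:", sizes.max())
```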

Flows, self-organization, and transport in living cells

This paper briefly reprises, with added commentary, a talk I gave on transport and flows within living cells at an APS-DFD meeting. Directed transport is especially important in large cells, such as eggs where developmental factors need to be properly localized, and early embryos whose organelles and genetic material must be properly positioned before cell division. I discuss two cases—a nematode single-cell embryo and a fruit fly egg cell—where advances in mathematical modeling and large-scale simulation of fluid-structure interactions have helped us understand fundamental mechanisms of force transduction and self-organization within the cell.

Multiple Physics Pretraining for Physical Surrogate Models

Michael McCabe, B. Régaldo-Saint Blancard, Liam Holden Parker, R. Ohana, Miles Cranmer, A. Bietti, Michael Eickenberg, et al.

We introduce multiple physics pretraining (MPP), an autoregressive task-agnostic pretraining approach for physical surrogate modeling of spatiotemporal systems with transformers. In MPP, rather than training one model on a specific physical system, we train a backbone model to predict the dynamics of multiple heterogeneous physical systems simultaneously in order to learn features that are broadly useful across systems and facilitate transfer. In order to learn effectively in this setting, we introduce a shared embedding and normalization strategy that projects the fields of multiple systems into a shared embedding space. We validate the efficacy of our approach on both pretraining and downstream tasks over a broad fluid mechanics-oriented benchmark. We show that a single MPP-pretrained transformer is able to match or outperform task-specific baselines on all pretraining sub-tasks without the need for finetuning. For downstream tasks, we demonstrate that finetuning MPP-trained models results in more accurate predictions across multiple time-steps on systems with previously unseen physical components or higher dimensional systems compared to training from scratch or finetuning pretrained video foundation models. We open-source our code and model weights trained at multiple scales for reproducibility.
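
A minimal sketch of the shared embedding and normalization idea: each field is normalized individually and projected by a learned, field-specific vector into one shared space, so a single backbone can consume systems with different numbers and types of fields. This is an illustrative PyTorch reconstruction, not the released implementation:

```python
# Illustrative sketch of a shared field embedding: per-field normalization
# followed by learned, field-specific projections into one embedding space.
# Class and argument names are assumptions, not the released MPP code.
import torch
import torch.nn as nn

class SharedFieldEmbedding(nn.Module):
    def __init__(self, n_known_fields: int, embed_dim: int):
        super().__init__()
        # One learned projection per physical field (velocity_x, pressure,
        # density, ...), shared across every system that contains it.
        self.field_proj = nn.Parameter(torch.randn(n_known_fields, embed_dim))

    def forward(self, fields: torch.Tensor, field_ids: torch.Tensor):
        # fields: (batch, n_fields, n_points); field_ids: (n_fields,)
        mean = fields.mean(dim=-1, keepdim=True)
        std = fields.std(dim=-1, keepdim=True) + 1e-6
        normed = (fields - mean) / std                      # per-field norm
        # Embed each spatial point as the sum of its fields' projections.
        return torch.einsum("bfp,fd->bpd", normed, self.field_proj[field_ids])

emb = SharedFieldEmbedding(n_known_fields=8, embed_dim=16)
x = torch.randn(2, 3, 64)                     # 2 samples, 3 fields, 64 points
print(emb(x, torch.tensor([0, 2, 5])).shape)  # (2, 64, 16)
```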

Provable Posterior Sampling with Denoising Oracles via Tilted Transport

Joan Bruna, J. Han

Score-based diffusion models have significantly advanced high-dimensional data generation across various domains, by learning a denoising oracle (or score) from datasets. From a Bayesian perspective, they offer a realistic modeling of data priors and facilitate solving inverse problems through posterior sampling. Although many heuristic methods have been developed recently for this purpose, they lack the quantitative guarantees needed in many scientific applications. This work addresses the topic from two perspectives. We first present a hardness result indicating that a generic method leveraging the prior denoising oracle for posterior sampling becomes infeasible as soon as the measurement operator is mildly ill-conditioned. We next develop the tilted transport technique, which leverages the quadratic structure of the log-likelihood in linear inverse problems in combination with the prior denoising oracle to exactly transform the original posterior sampling problem into a new one that is provably easier to sample from. We quantify the conditions under which the boosted posterior is strongly log-concave, highlighting how task difficulty depends on the condition number of the measurement matrix and the signal-to-noise ratio. The resulting general scheme is shown to match the best-known sampling methods for Ising models, and is further validated on high-dimensional Gaussian mixture models.
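
For context, the quadratic structure the method exploits: in a linear inverse problem with Gaussian noise, the posterior is the prior tilted by a log-likelihood that is quadratic in the unknown,

```latex
% Linear inverse problem with Gaussian noise: the posterior is the prior
% tilted by a quadratic log-likelihood, the structure tilted transport uses.
\begin{align*}
  y &= Ax + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I), \\
  \log p(x \mid y) &= \log p(x) - \frac{\lVert y - Ax \rVert^2}{2\sigma^2} + \text{const}.
\end{align*}
```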
