2697 Publications

Trapped acoustic waves and raindrops: high-order accurate integral equation method for localized excitation of a periodic staircase

F. Agocs, A. Barnett

We present a high-order boundary integral equation (BIE) method for the frequency-domain acoustic scattering of a point source by a singly-periodic, infinite, corrugated boundary. We apply it to the accurate numerical study of acoustic radiation in the neighborhood of a sound-hard two-dimensional staircase modeled after the El Castillo pyramid. Such staircases support trapped waves which travel along the surface and decay exponentially away from it. We use the array scanning method (Floquet–Bloch transform) to recover the scattered field as an integral over the family of quasiperiodic solutions parameterized by on-surface wavenumber. Each such BIE solution requires the quasiperiodic Green's function, which we evaluate using an efficient integral representation of lattice sum coefficients. We avoid the singularities and branch cuts present in the array scanning integral by complex contour deformation. For each frequency, this enables a solution accurate to around 10 digits in a few seconds. We propose a residue method to extract the limiting powers carried by trapped modes far from the source. Finally, by computing the trapped mode dispersion relation, we use a simple ray model to explain an acoustic chirp-like time-domain response that is referred to in the literature as the “raindrop effect.”
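The array scanning idea above has a compact discrete analogue. The sketch below is illustrative only, not the paper's method: a circulant linear system stands in for the periodic scattering problem, and a single-source (delta) solution is recovered by averaging quasiperiodic solves over a discrete Brillouin zone.

```python
import numpy as np

def array_scan_solve(c, rhs_site=0):
    """Solve the circulant system C x = e_{rhs_site} by 'scanning' over
    Bloch wavenumbers: each wavenumber gives one quasiperiodic solve
    (a scalar division here), and averaging the phased solutions over
    the Brillouin zone recovers the localized single-source response."""
    n = len(c)
    sites = np.arange(n)
    x = np.zeros(n, dtype=complex)
    for m in range(n):                        # discrete Brillouin-zone scan
        k = 2 * np.pi * m / n
        phases = np.exp(1j * k * sites)       # quasiperiodic Bloch mode
        chat = (c * np.exp(-1j * k * sites)).sum()  # symbol (eigenvalue) of C
        x += phases * np.exp(-1j * k * rhs_site) / chat
    return x / n

c = np.array([4.0, -1.0, 0.0, 0.0, -1.0])     # a simple circulant kernel
x = array_scan_solve(c)
C = np.array([[c[(i - j) % 5] for j in range(5)] for i in range(5)])
e0 = np.eye(5)[0]
```

Because the Bloch modes diagonalize any translation-invariant operator, the averaged solution satisfies `C @ x == e0` up to rounding, which is the discrete counterpart of the Floquet–Bloch integral used in the paper.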


xVal: A Continuous Numerical Tokenization for Scientific Language Models

Siavash Golkar, Mariel Pettee, M. Eickenberg, A. Bietti, et al.

Due in part to their discontinuous and discrete default encodings for numbers, Large Language Models (LLMs) have not yet been commonly used to process numerically-dense scientific datasets. Rendering datasets as text, however, could help aggregate diverse and multi-modal scientific data into a single training corpus, thereby potentially facilitating the development of foundation models for science. In this work, we introduce xVal, a strategy for continuously tokenizing numbers within language models that results in a more appropriate inductive bias for scientific applications. By training specially-modified language models from scratch on a variety of scientific datasets formatted as text, we find that xVal generally outperforms other common numerical tokenization strategies on metrics including out-of-distribution generalization and computational efficiency.
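The central trick — a single shared [NUM] token whose embedding is scaled by the number's value — can be sketched as follows. This is a hypothetical minimal version: the token name, regex, and embeddings are illustrative stand-ins, not the authors' implementation.

```python
import re
import numpy as np

NUM_TOKEN = "[NUM]"  # hypothetical placeholder token

def xval_tokenize(text):
    """Replace every literal number with [NUM]; keep values separately."""
    pattern = r"-?\d+\.?\d*"
    values = [float(m.group()) for m in re.finditer(pattern, text)]
    tokens = re.sub(pattern, NUM_TOKEN, text).split()
    return tokens, values

def embed(tokens, values, vocab_emb, num_emb):
    """Scale the one shared [NUM] embedding by each number's value,
    giving a continuous (rather than discrete) encoding of magnitude."""
    out, it = [], iter(values)
    for t in tokens:
        if t == NUM_TOKEN:
            out.append(next(it) * num_emb)   # continuous numerical encoding
        else:
            out.append(vocab_emb[t])
    return np.stack(out)

vocab = {"T": np.ones(4), "=": np.full(4, 2.0)}
tokens, vals = xval_tokenize("T = 3.5")
emb = embed(tokens, vals, vocab, np.full(4, 0.5))
```

Because nearby numbers now get nearby embeddings, the model inherits a smooth inductive bias over numerical values instead of treating each digit string as an unrelated symbol.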


Soft Matching Distance: A metric on neural representations that captures single-neuron tuning

A. Williams, Meenakshi Khosla

Common measures of neural representational (dis)similarity are designed to be insensitive to rotations and reflections of the neural activation space. Motivated by the premise that the tuning of individual units may be important, there has been recent interest in developing stricter notions of representational (dis)similarity that require neurons to be individually matched across networks. When two networks have the same size (i.e. same number of neurons), a distance metric can be formulated by optimizing over neuron index permutations to maximize tuning curve alignment. However, it is not clear how to generalize this metric to measure distances between networks with different sizes. Here, we leverage a connection to optimal transport theory to derive a natural generalization based on “soft” permutations. The resulting metric is symmetric, satisfies the triangle inequality, and can be interpreted as a Wasserstein distance between two empirical distributions. Further, our proposed metric avoids counter-intuitive outcomes suffered by alternative approaches, and captures complementary geometric insights into neural representations that are entirely missed by rotation-invariant metrics.
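The "soft permutation" relaxation can be sketched with entropy-regularized optimal transport computed by Sinkhorn iterations, one standard way to obtain such plans; the regularization value, cost choice, and data here are illustrative, not the paper's implementation.

```python
import numpy as np

def sinkhorn_soft_match(X, Y, reg=5.0, n_iter=200):
    """Soft-match rows (neurons) of X (m x d) to rows of Y (n x d).

    Returns the entropic transport plan P between uniform marginals and
    the matching cost sum(P * C), where C holds pairwise squared
    distances between tuning curves. P plays the role of a "soft"
    permutation, which is well-defined even when m != n.
    """
    m, n = X.shape[0], Y.shape[0]
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # pairwise costs
    K = np.exp(-C / reg)
    a, b = np.full(m, 1.0 / m), np.full(n, 1.0 / n)     # uniform marginals
    v = np.ones(n)
    for _ in range(n_iter):                              # Sinkhorn iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]
    return P, float((P * C).sum())

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))
P, cost_self = sinkhorn_soft_match(X, X)                 # a network vs itself
_, cost_other = sinkhorn_soft_match(X, rng.normal(size=(7, 10)))
```

Note the second call matches a 5-neuron network against a 7-neuron one — the size-mismatch case that hard permutation metrics cannot handle.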


A tug-of-war between germ cell motility and intercellular bridges controls germline cyst formation in mice

Ezra W. Levy, Isabella Leite, S. Shvartsman, et al.

Gametes in many species develop in cysts—clusters of germ cells formed by incomplete cytokinesis—that remain connected through intercellular bridges (ICBs). These connections enable sharing of cytoplasmic components between germ cells and, in the female germ line, enrich select cells in the cyst to become the oocyte(s). In mice, germline cysts of variable sizes are generated during embryonic development, thought to result from cyst fractures. Studies of fixed samples failed to capture fracture events, and thus, the mechanism remained elusive. Here, we use high-resolution live imaging of germ cells within their native tissue environment to visualize germline cyst dynamics. With this novel approach, we reveal a striking motile phenotype of gonad-resident germ cells and show that this randomly oriented cell-autonomous motile behavior during cyst formation underlies fracture events. Conversely, we show that stabilized ICBs help resist excessive fracturing. Additionally, we find that motility and thus fracture rates gradually decrease during development in a sex-dependent manner, completely ceasing by the end of cyst-forming divisions. These results lead to a model where the opposing activities of developmentally regulated cell motility and stable ICBs give rise to cysts of variable sizes. We corroborate these results by developing a model that uses experimentally measured fracture rates to simulate cyst formation and fracture and show that it can reproduce experimentally measured cyst sizes in both males and females. Understanding how variable cysts form will enable further studies of mammalian oocyte selection and establishment of the ovarian reserve.


A Dual-space Multilevel Kernel-splitting Framework for Discrete and Continuous Convolution

We introduce a new class of multilevel, adaptive, dual-space methods for computing fast convolutional transformations. These methods can be applied to a broad class of kernels, from the Green's functions for classical partial differential equations (PDEs) to power functions and radial basis functions such as those used in statistics and machine learning. The DMK (dual-space multilevel kernel-splitting) framework uses a hierarchy of grids, computing a smoothed interaction at the coarsest level, followed by a sequence of corrections at finer and finer scales until the problem is entirely local, at which point direct summation is applied. Unlike earlier multilevel summation schemes, DMK exploits the fact that the interaction at each scale is diagonalized by a short Fourier transform, permitting the use of separation of variables, but without relying on the FFT. This requires careful attention to the discretization of the Fourier transform at each spatial scale. Like multilevel summation, we make use of a recursive (telescoping) decomposition of the original kernel into the sum of a smooth far-field kernel, a sequence of difference kernels, and a residual kernel, which plays a role only in leaf boxes in the adaptive tree. At all higher levels in the grid hierarchy, the interaction kernels are designed to be smooth in both physical and Fourier space, admitting efficient Fourier spectral approximations. The DMK framework substantially simplifies the algorithmic structure of the fast multipole method (FMM) and unifies the FMM, Ewald summation, and multilevel summation, achieving speeds comparable to the FFT in work per gridpoint, even in a fully adaptive context. For continuous source distributions, the evaluation of local interactions is further accelerated by approximating the kernel at the finest level as a sum of Gaussians (SOG) with a highly localized remainder. The Gaussian convolutions are calculated using tensor product transforms, and the remainder term is calculated using asymptotic methods. We illustrate the performance of DMK for both continuous and discrete sources with extensive numerical examples in two and three dimensions.
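A one-level analogue of the kernel-splitting idea is the classical Ewald-style decomposition of 1/r into a smooth, Fourier-friendly far field and a rapidly decaying residual. This is a toy illustration of the telescoping split, not the DMK algorithm itself.

```python
import math

def split_kernel(r, sigma=1.0):
    """Split 1/r into a smooth long-range part and a localized residual.

    smooth(r)   = erf(r / sigma) / r    (analytic at r = 0, smooth in
                                         Fourier space)
    residual(r) = erfc(r / sigma) / r   (decays super-exponentially, so
                                         it only matters locally)
    Their sum recovers 1/r exactly since erf + erfc = 1 — the simplest
    one-level instance of the telescoping kernel decomposition.
    """
    smooth = math.erf(r / sigma) / r
    residual = math.erfc(r / sigma) / r
    return smooth, residual

s, res = split_kernel(3.0)   # s + res reproduces 1/3
```

In a multilevel scheme this split is applied recursively with a shrinking sigma per level, so each difference kernel is smooth at its own scale and the residual survives only in leaf boxes.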


Flows, self-organization, and transport in living cells

This paper briefly reprises, with added commentary, a talk I gave on transport and flows within living cells at an APS-DFD meeting. Directed transport is especially important in large cells, such as eggs where developmental factors need to be properly localized, and early embryos whose organelles and genetic material must be properly positioned before cell division. I discuss two cases—a nematode single-cell embryo and a fruit fly egg cell—where advances in mathematical modeling and large-scale simulation of fluid-structure interactions have helped us understand fundamental mechanisms of force transduction and self-organization within the cell.


Multiple Physics Pretraining for Physical Surrogate Models

Michael McCabe, B. Régaldo-Saint Blancard, Liam Holden Parker, R. Ohana, Miles Cranmer, A. Bietti, Michael Eickenberg, et al.

We introduce multiple physics pretraining (MPP), an autoregressive task-agnostic pretraining approach for physical surrogate modeling of spatiotemporal systems with transformers. In MPP, rather than training one model on a specific physical system, we train a backbone model to predict the dynamics of multiple heterogeneous physical systems simultaneously in order to learn features that are broadly useful across systems and facilitate transfer. In order to learn effectively in this setting, we introduce a shared embedding and normalization strategy that projects the fields of multiple systems into a shared embedding space. We validate the efficacy of our approach on both pretraining and downstream tasks over a broad fluid mechanics-oriented benchmark. We show that a single MPP-pretrained transformer is able to match or outperform task-specific baselines on all pretraining sub-tasks without the need for finetuning. For downstream tasks, we demonstrate that finetuning MPP-trained models results in more accurate predictions across multiple time-steps on systems with previously unseen physical components or higher dimensional systems compared to training from scratch or finetuning pretrained video foundation models. We open-source our code and model weights trained at multiple scales for reproducibility.
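The shared embedding and normalization strategy can be caricatured as: normalize each physical field independently, then project every field through its own learned map into one common space, so systems with different variable sets coexist in a single backbone. A minimal sketch — the field names and random projections are stand-ins, not the released model:

```python
import numpy as np

def embed_fields(fields, proj, eps=1e-6):
    """Project heterogeneous physical fields into one shared space.

    `fields` maps field names (e.g. "velocity_x", "pressure") to arrays
    of shape (n_points,). Each field is normalized independently, then
    sent through its own projection vector (learned in practice; random
    stand-ins here), and the contributions are summed so that systems
    with different variables share one embedding space.
    """
    rows = []
    for name, values in fields.items():
        v = np.asarray(values, dtype=float)
        v = (v - v.mean()) / (v.std() + eps)   # per-field normalization
        rows.append(np.outer(v, proj[name]))   # (n_points, d_embed)
    return np.sum(rows, axis=0)                # shared embedding space

rng = np.random.default_rng(1)
proj = {"velocity_x": rng.normal(size=8), "pressure": rng.normal(size=8)}
emb = embed_fields(
    {"velocity_x": rng.normal(size=16), "pressure": rng.normal(size=16)},
    proj,
)
```

A system missing some field simply omits that term from the sum, which is what makes one backbone trainable across heterogeneous systems.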


Provable Posterior Sampling with Denoising Oracles via Tilted Transport

Joan Bruna, J. Han

Score-based diffusion models have significantly advanced high-dimensional data generation across various domains, by learning a denoising oracle (or score) from datasets. From a Bayesian perspective, they offer a realistic modeling of data priors and facilitate solving inverse problems through posterior sampling. Although many heuristic methods have been developed recently for this purpose, they lack the quantitative guarantees needed in many scientific applications. This work addresses the topic from two perspectives. We first present a hardness result indicating that a generic method leveraging the prior denoising oracle for posterior sampling becomes infeasible as soon as the measurement operator is mildly ill-conditioned. We next develop the tilted transport technique, which leverages the quadratic structure of the log-likelihood in linear inverse problems in combination with the prior denoising oracle to exactly transform the original posterior sampling problem into a new one that is provably easier to sample from. We quantify the conditions under which the boosted posterior is strongly log-concave, highlighting how task difficulty depends on the condition number of the measurement matrix and the signal-to-noise ratio. The resulting general scheme is shown to match the best-known sampling methods for Ising models, and is further validated on high-dimensional Gaussian mixture models.
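In the fully Gaussian special case the dependence on conditioning and noise is explicit: for a linear measurement y = Ax + n with isotropic Gaussian prior, the posterior precision is AᵀA/σ² + I/σ₀². The toy check below illustrates that dependence only; it is not the tilted transport construction itself.

```python
import numpy as np

def posterior_condition_number(A, noise_var, prior_var=1.0):
    """Condition number of the Gaussian posterior precision for y = Ax + n.

    With an isotropic Gaussian prior N(0, prior_var * I), the posterior
    precision is A^T A / noise_var + I / prior_var. Its conditioning,
    governed by A's condition number and the signal-to-noise ratio, is
    the quantity the hardness and log-concavity statements hinge on.
    """
    d = A.shape[1]
    prec = A.T @ A / noise_var + np.eye(d) / prior_var
    return np.linalg.cond(prec)

A_good = np.eye(4)
A_bad = np.diag([1.0, 1.0, 1.0, 1e-3])   # mildly ill-conditioned operator
k_good = posterior_condition_number(A_good, noise_var=0.1)
k_bad = posterior_condition_number(A_bad, noise_var=0.1)
```

Shrinking one singular value of A from 1 to 1e-3 already inflates the posterior condition number by an order of magnitude, matching the intuition that even mild ill-conditioning hardens posterior sampling.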


Statistical Mechanics of Support Vector Regression

A key problem in deep learning and computational neuroscience is relating the geometrical properties of neural representations to task performance. Here, we consider this problem for continuous decoding tasks where neural variability may affect task precision. Using methods from statistical mechanics, we study the average-case learning curves for ε-insensitive Support Vector Regression (ε-SVR) and discuss its capacity as a measure of linear decodability. Our analysis reveals a phase transition in the training error at a critical load, capturing the interplay between the tolerance parameter ε and neural variability. We uncover a double-descent phenomenon in the generalization error, showing that ε acts as a regularizer, both suppressing and shifting these peaks. Theoretical predictions are validated both on toy models and deep neural networks, extending the theory of Support Vector Machines to continuous tasks with inherent neural variability.
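For reference, the loss whose tolerance parameter ε the analysis centers on is the ε-insensitive loss: zero inside a tube of half-width ε around the target, linear outside it.

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.5):
    """ε-insensitive loss used by ε-SVR: errors with |y_true - y_pred|
    <= eps incur no penalty (the tube), larger errors grow linearly."""
    return max(0.0, abs(y_true - y_pred) - eps)

small_err = eps_insensitive_loss(1.0, 1.3)   # inside the tube: no loss
large_err = eps_insensitive_loss(1.0, 2.0)   # outside: linear penalty
```

Widening ε enlarges the zero-loss tube, which is the mechanism behind its regularizing effect on the double-descent peaks described above.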


Programming tissue-sensing T cells that deliver therapies to the brain

Milos S. Simic, Payal B. Watchmaker, O. Troyanskaya, et al.

Cells modified outside of the body and then reintroduced provide an advantage over most small-molecule therapeutics in that cells can be designed to recognize target molecules in specific tissues and then act locally. Two studies now demonstrate advances in cell engineering for treating human disease (see the Perspective by Davila and Brentjens). Reddy et al. engineered human T cells to make a synthetic receptor that recognized overactive T cells such as those causing autoimmune disease and organ rejection. The most effective modified cells tested were ones in which the synthetic receptor initiated a program causing the production of both an anti-inflammatory cytokine and a receptor that acted as a sink for a locally produced proinflammatory cytokine. In mouse models, such cells could be designed with logic programs that protect the desired tissues without detrimental systemic immunosuppression. Simic et al. modified T cells to produce a synthetic receptor that recognized an antigen localized to the extracellular matrix of the brain. The synthetic receptor activated a circuit stimulating the production of chimeric antigen receptors that targeted and killed cancer cells in the brain but not those implanted elsewhere in the mouse. A mouse model of neuroinflammatory brain disease could be treated with cells engineered to locally produce an anti-inflammatory cytokine.
