Publications

cryoJAX: A Cryo-electron Microscopy Image Simulation Library in JAX

Michael J. O'Brien, S. Hanson, D. Needleman, et al.

While cryo-electron microscopy (cryo-EM) has come to prominence in the last decade due to its ability to resolve biomolecular complexes at atomic resolution, advances in experimental and computational methods have also made cryo-EM promising for investigating intracellular organization and heterogeneous molecular states. A primary challenge for these alternative applications is the development of data-analysis techniques, which are highly computationally demanding. To this end, it is advantageous to leverage advanced scientific computing frameworks for statistical analysis. One such framework is JAX, an array-oriented Python numerical computing package for automatic differentiation and vectorization with a growing ecosystem for statistical inference and machine learning. We have developed cryoJAX, a cryo-EM image-simulation library for building computational data-analysis applications in JAX. CryoJAX is a flexible modeling language for cryo-EM image formation and can therefore support a wide range of downstream data analyses. By integrating with the JAX ecosystem, cryoJAX enables the development and deployment of algorithms for the growing breadth of scientific applications of cryo-EM.
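
As a rough illustration of the pattern the abstract describes (this is not cryoJAX's actual API; all names and parameters below are invented), the following sketch builds a differentiable projection-style image simulator in plain JAX, vectorizes it over poses with vmap, and differentiates through it for gradient-based pose refinement:

```python
# Minimal sketch of a differentiable cryo-EM-style image simulator in
# plain JAX (illustrative only; NOT the cryoJAX API).
import jax
import jax.numpy as jnp

def rotation_z(theta):
    """3x3 rotation about the z-axis (a stand-in for a full pose model)."""
    c, s = jnp.cos(theta), jnp.sin(theta)
    return jnp.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def simulate_image(atoms, theta, grid, sigma=1.0):
    """Rotate a 3D point cloud, project along z, render Gaussian blobs."""
    rotated = atoms @ rotation_z(theta).T              # (n_atoms, 3)
    xy = rotated[:, :2]                                # orthographic projection
    gx, gy = grid                                      # (H, W) coordinate grids
    d2 = (gx[None] - xy[:, 0, None, None]) ** 2 \
       + (gy[None] - xy[:, 1, None, None]) ** 2
    return jnp.exp(-d2 / (2 * sigma**2)).sum(axis=0)   # (H, W) image

# A batch of images at many poses, with one vmap call:
atoms = jax.random.normal(jax.random.PRNGKey(0), (50, 3)) * 3.0
coords = jnp.linspace(-8, 8, 64)
grid = jnp.meshgrid(coords, coords, indexing="ij")
thetas = jnp.linspace(0, jnp.pi, 16)
images = jax.vmap(simulate_image, in_axes=(None, 0, None))(atoms, thetas, grid)

# Differentiability enables gradient-based pose refinement:
loss = lambda th: jnp.sum((simulate_image(atoms, th, grid) - images[3]) ** 2)
pose_grad = jax.grad(loss)(jnp.array(0.1))
```

The point of the sketch is the composability the abstract emphasizes: because the simulator is an ordinary JAX function, vectorization and differentiation come for free, which is what makes downstream statistical inference over poses or structures tractable.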


Natural Image Statistics, Visual Representation, and Denoising

Imran Thobani, Alisa Leshchenko, E. P. Simoncelli

This article, gathered and elaborated from a lecture by Eero Simoncelli at the 2024 Analytical Connectionism Summer School, reviews several approaches for modeling the probability distribution of natural images and their interaction with the problem of image denoising. The lecture starts with the Gaussian spectral model of the 1950s as a conceptual foundation and quantitative baseline, followed by sparse-coding models, which took hold in the 1990s. These statistical models of natural images can be used as prior probability distributions for solving inverse problems such as denoising within a Bayesian framework. Finally, the lecture describes recent work in machine learning in which the process of constructing a denoiser is reversed: a deep neural network (DNN) is trained to solve the denoising problem without first specifying a prior distribution, and this trained network is subsequently used as an implicit model of the distribution of natural images. Images can be drawn from this implicit model through a reverse diffusion process, and the model can also be used to solve inference problems. This allows researchers to investigate the extent to which these DNNs are generalizing beyond their training data (as necessary for accurately modeling the distribution of natural images) as opposed to memorizing the images they were trained on.
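
The 1950s Gaussian spectral baseline can be made concrete: for a stationary Gaussian prior with power spectrum S(k) and additive white noise of variance sigma^2, the Bayesian (MMSE) denoiser is the Wiener filter S(k) / (S(k) + sigma^2), applied frequency by frequency. A minimal sketch, assuming an illustrative 1/|k|^alpha power-law spectrum of the kind observed for natural images:

```python
# Wiener denoising under a Gaussian spectral prior (illustrative baseline).
import jax
import jax.numpy as jnp

def wiener_denoise(noisy, sigma, alpha=2.0):
    """MMSE denoiser for a stationary Gaussian prior with 1/|k|^alpha spectrum."""
    h, w = noisy.shape
    ky = jnp.fft.fftfreq(h)[:, None]
    kx = jnp.fft.fftfreq(w)[None, :]
    k2 = kx**2 + ky**2
    # Power-law spectrum typical of natural images; regularize k = 0.
    spectrum = 1.0 / jnp.maximum(k2, 1e-6) ** (alpha / 2)
    gain = spectrum / (spectrum + sigma**2)    # Wiener gain per frequency
    return jnp.real(jnp.fft.ifft2(gain * jnp.fft.fft2(noisy)))

key = jax.random.PRNGKey(0)
clean = jnp.zeros((64, 64)).at[16:48, 16:48].set(1.0)   # toy "image"
noisy = clean + 0.5 * jax.random.normal(key, clean.shape)
denoised = wiener_denoise(noisy, sigma=0.5)
```

The later approaches in the lecture can be read as successive refinements of this one formula: sparse-coding priors replace the fixed spectral gain with signal-adaptive shrinkage, and trained DNN denoisers dispense with the explicit prior altogether.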


The head-direction signal is generated by multiple attractor-like networks

G. Viejo, Sofia Skromne Carrasco, Adrien Peyrache

While the thalamus is known to relay and modulate sensory signals to the cortex, whether it also participates in active computation and intrinsic signal generation remains unresolved. The anterodorsal nucleus of the thalamus broadcasts the head-direction (HD) signal, which is generated in the brainstem, particularly in the upstream lateral mammillary nucleus, and thalamic HD cells remain coordinated even during sleep. Here, by recording and manipulating neuronal activity along the mammillary–thalamic–cortical pathway, we show that coherence among thalamic HD cells persists even when their upstream inputs are decorrelated, particularly during non-rapid eye movement (non-REM) sleep. These findings suggest that thalamic circuits are sufficient to generate and maintain coherent population dynamics in the absence of structured input.
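
For intuition about what "attractor-like" means here, the toy sketch below implements a textbook cosine ring-attractor network (not the authors' biological circuit; all parameters are ad hoc): local excitation plus global inhibition on a ring of HD-like cells sustains a self-generated activity bump even when the only input is unstructured noise.

```python
# Toy ring-attractor sketch (illustrative; parameters are ad hoc).
import jax
import jax.numpy as jnp

n = 120                                     # cells tiling the head-direction ring
pref = jnp.linspace(0.0, 2 * jnp.pi, n, endpoint=False)
diff = pref[:, None] - pref[None, :]
W = (2.5 * jnp.cos(diff) - 0.5) / n         # local excitation, global inhibition

def step(r, key):
    noise = 0.05 * jax.random.normal(key, (n,))   # unstructured input only
    r_new = r + 0.2 * (-r + jnp.tanh(W @ r) + noise)
    return r_new, r_new

r0 = 0.1 * jnp.cos(pref - jnp.pi)           # weak initial bump at 180 degrees
keys = jax.random.split(jax.random.PRNGKey(1), 1000)
_, trace = jax.lax.scan(step, r0, keys)     # the bump persists across steps
bump_angle = jnp.angle(jnp.sum(trace[-1] * jnp.exp(1j * pref)))  # decoded HD
```

The recurrent gain exceeds one only for the cosine mode, so a bump of activity is a stable fixed point of the dynamics; this self-sustaining coherence without structured input is the signature the recordings test for.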

February 27, 2026

Learning a distance measure from the information-estimation geometry of data

We introduce the Information-Estimation Metric (IEM), a novel form of distance function derived from an underlying continuous probability density over a domain of signals. The IEM is rooted in a fundamental relationship between information theory and estimation theory, which links the log-probability of a signal with the errors of an optimal denoiser applied to noisy observations of the signal. In particular, the IEM between a pair of signals is obtained by comparing their denoising error vectors over a range of noise amplitudes. Geometrically, this amounts to comparing the score vector fields of the blurred density around the signals over a range of blur levels. We prove that the IEM is a valid global metric and derive a closed-form expression for its local second-order approximation, which yields a Riemannian metric. For Gaussian-distributed signals, the IEM coincides with the Mahalanobis distance. But for more complex distributions, it adapts, both locally and globally, to the geometry of the distribution. In practice, the IEM can be computed by applying a learned denoiser (as in generative diffusion models) and evaluating a one-dimensional integral. To demonstrate the value of our framework, we learn an IEM on the ImageNet database. Experiments show that this IEM is competitive with or outperforms state-of-the-art supervised image quality metrics in predicting human perceptual judgments.
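
A sketch of the ingredients the abstract names, with the caveat that the paper's exact weighting and integrand are not reproduced here: via the Miyasawa/Tweedie identity, a denoiser yields the score of the blurred density, and a one-dimensional quadrature over noise levels aggregates the squared score differences between the two signals. The function `denoise(x, sigma)` is a hypothetical user-supplied denoiser; the toy check uses the closed-form Gaussian MMSE denoiser, for which the abstract says the metric reduces to a Mahalanobis-like form.

```python
# Sketch of the IEM ingredients (not the paper's exact construction).
import jax.numpy as jnp

def score_from_denoiser(denoise, x, sigma):
    """Miyasawa/Tweedie identity: score of the sigma-blurred density."""
    return (denoise(x, sigma) - x) / sigma**2

def iem_sketch(denoise, x1, x2, sigmas, weights):
    """Weighted 1-D quadrature over noise levels of squared score differences.
    The true IEM uses a specific weighting/integrand; this is a stand-in."""
    def gap(sigma):
        d = score_from_denoiser(denoise, x1, sigma) \
          - score_from_denoiser(denoise, x2, sigma)
        return jnp.sum(d**2)
    gaps = jnp.stack([gap(s) for s in sigmas])
    return jnp.sqrt(jnp.sum(weights * gaps))

# Toy check with the closed-form Gaussian denoiser E[x|y] = C (C + s^2 I)^-1 y:
C = jnp.array([[2.0, 0.5], [0.5, 1.0]])
denoise = lambda x, s: C @ jnp.linalg.solve(C + s**2 * jnp.eye(2), x)
sigmas = jnp.linspace(0.1, 3.0, 30)
weights = jnp.full(30, (3.0 - 0.1) / 30)    # uniform quadrature weights
d = iem_sketch(denoise, jnp.array([1.0, 0.0]), jnp.array([0.0, 1.0]),
               sigmas, weights)
```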


Emergent Manifold Separability during Reasoning in Large Language Models

Alexandre Polo, C. Chun, S. Chung

Chain-of-Thought (CoT) prompting significantly improves reasoning in Large Language Models, yet the temporal dynamics of the underlying representation geometry remain poorly understood. We investigate these dynamics by applying Manifold Capacity Theory (MCT) to a compositional Boolean logic task, allowing us to quantify the linear separability of latent representations without the confounding factors of probe training. Our analysis reveals that reasoning manifests as a transient geometric pulse, where concept manifolds are untangled into linearly separable subspaces immediately prior to computation and rapidly compressed thereafter. This behavior diverges from standard linear probe accuracy, which remains high long after computation, suggesting a fundamental distinction between information that is merely retrievable and information that is geometrically prepared for processing. We interpret this phenomenon as "Dynamic Manifold Management," a mechanism whereby the model dynamically modulates representational capacity to optimize the bandwidth of the residual stream throughout the reasoning chain.
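
MCT's mean-field capacity calculation is more involved than can be shown here; as a stand-in, the sketch below computes a simple signal-to-noise separability proxy for two concept manifolds, which could be tracked across chain-of-thought steps to look for the transient pulse the abstract describes.

```python
# Toy separability proxy (NOT the mean-field MCT capacity): distance
# between two concept-manifold centroids relative to within-manifold spread.
import jax.numpy as jnp

def separability(h_a, h_b):
    """h_a, h_b: (n_samples, d) hidden states for two concept classes."""
    mu_a, mu_b = h_a.mean(axis=0), h_b.mean(axis=0)
    signal = jnp.sum((mu_a - mu_b) ** 2)
    noise = h_a.var(axis=0).sum() + h_b.var(axis=0).sum()
    return signal / (noise + 1e-8)

# Applied over time: with hidden states of shape (n_samples, d) at each CoT
# step, a transient pulse would show this quantity peaking just before the
# answer token and collapsing afterward, e.g.:
# pulse = [separability(true_states[t], false_states[t]) for t in steps]
```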

February 26, 2026

A Lightweight, Geometrically Flexible Fast Algorithm for the Evaluation of Layer and Volume Potentials

F. Fryklund, L. Greengard, S. Jiang, Samuel Potter

Over the last two decades, several fast, robust, and high-order accurate methods have been developed for solving the Poisson equation in complicated geometry using potential theory. In this approach, rather than discretizing the partial differential equation itself, one first evaluates a volume integral to account for the source distribution within the domain, followed by solving a boundary integral equation to impose the specified boundary conditions. Here, we present a new fast algorithm which is easy to implement and compatible with virtually any discretization technique, including unstructured domain triangulations, such as those used in standard finite element or finite volume methods. Our approach combines earlier work on potential theory for the heat equation, asymptotic analysis, the nonuniform fast Fourier transform (NUFFT), and the dual-space multilevel kernel-splitting (DMK) framework. It is insensitive to flaws in the triangulation, permitting not just nonconforming elements but also arbitrary-aspect-ratio triangles, gaps, and various other degeneracies. On a single CPU core, the scheme computes the solution at a rate comparable to that of the fast Fourier transform (FFT) in work per gridpoint.
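
The paper's NUFFT/DMK scheme handles complicated geometry; as a point of reference, here is the regular-grid periodic FFT Poisson solve whose per-gridpoint cost the abstract uses as a benchmark, verified on a manufactured solution:

```python
# Reference FFT Poisson solve on a periodic box: -Laplacian(u) = f.
import jax.numpy as jnp

def poisson_fft_periodic(f, L=1.0):
    """Solve -Lap(u) = f on a periodic [0, L)^2 grid via the FFT."""
    n = f.shape[0]
    k = 2 * jnp.pi * jnp.fft.fftfreq(n, d=L / n)
    k2 = k[:, None] ** 2 + k[None, :] ** 2
    f_hat = jnp.fft.fft2(f)
    u_hat = jnp.where(k2 > 0, f_hat / k2, 0.0)    # zero-mean solution
    return jnp.real(jnp.fft.ifft2(u_hat))

# Manufactured solution u = sin(2 pi x) sin(2 pi y), so -Lap u = 8 pi^2 u:
n, L = 64, 1.0
x = jnp.linspace(0, L, n, endpoint=False)
X, Y = jnp.meshgrid(x, x, indexing="ij")
u_exact = jnp.sin(2 * jnp.pi * X) * jnp.sin(2 * jnp.pi * Y)
f = 8 * jnp.pi**2 * u_exact
err = jnp.abs(poisson_fft_periodic(f, L) - u_exact).max()   # ~ machine precision
```

The contribution of the paper is to retain roughly this work-per-gridpoint rate while dropping the requirements of a regular grid and a periodic box.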


Months-long stability of the head-direction system

Sofia Skromne Carrasco, G. Viejo, Adrien Peyrache

Spatial orientation enables animals to navigate their environment by rapidly mapping the external world and remembering key locations. In mammals, the head-direction (HD) system is an essential component of the navigation system of the brain. Although the tuning of neurons in other areas of this system is unstable—evidenced, for example, by the change in the spatial tuning of hippocampal place cells across days—the stability of the neuronal code that underlies the sense of direction remains unclear. Here, by longitudinally tracking the activity of the same HD cells in the post-subiculum of freely moving mice, we show stability and plasticity at two levels. Although the population structure remained highly conserved across environments and over time, subtle shifts in population coherence encoded environment identity. In addition, the HD system established a distinct, environment-specific alignment between its internal representation and external landmarks, which persisted for weeks, even after a single exposure. These findings suggest that the HD system forms long-lasting orientation memories that are anchored to specific environments.


Blind denoising diffusion models and the blessings of dimensionality

Z. Kadkhodaie, Aram-Alexandre Pooladian, Sinho Chewi, E. P. Simoncelli

We analyze, theoretically and empirically, the performance of generative diffusion models based on blind denoisers, in which the denoiser is not given the noise amplitude in either the training or sampling processes. Assuming that the data distribution has low intrinsic dimensionality, we prove that blind denoising diffusion models (BDDMs), despite not having access to the noise amplitude, automatically track a particular implicit noise schedule along the reverse process. Our analysis shows that BDDMs can accurately sample from the data distribution in polynomially many steps as a function of the intrinsic dimension. Empirical results corroborate these mathematical findings on both synthetic and image data, demonstrating that the noise variance is accurately estimated from the noisy image. Remarkably, we observe that schedule-free BDDMs produce samples of higher quality compared to their non-blind counterparts. We provide evidence that this performance gain arises because BDDMs correct the mismatch between the true residual noise (of the image) and the noise assumed by the schedule used in non-blind diffusion models.
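
A simplified sketch of the mechanism (the paper's sampler may differ in detail): because a blind denoiser receives no noise amplitude, the residual D(x) - x supplies its own noise estimate during reverse sampling, producing the implicit schedule the abstract describes. Here `blind_denoise` is a stand-in for a trained network, and the step rule follows a standard coarse-to-fine stochastic sampler with step size h and noise-injection parameter beta.

```python
# One step of a blind coarse-to-fine sampler (simplified, illustrative).
import jax
import jax.numpy as jnp

def sample_step(x, key, blind_denoise, h=0.2, beta=0.5):
    """Move toward the denoised point, then re-inject a fraction of the
    *estimated* remaining noise; no explicit noise schedule is supplied."""
    d = blind_denoise(x) - x                            # denoiser residual
    sigma_est = jnp.linalg.norm(d) / jnp.sqrt(x.size)   # implicit noise level
    gamma = sigma_est * jnp.sqrt((1 - beta * h) ** 2 - (1 - h) ** 2)
    z = jax.random.normal(key, x.shape)
    return x + h * d + gamma * z, sigma_est

# Usage: start from pure noise and iterate; sigma_est shrinks automatically
# as x approaches the data manifold, tracing out an implicit schedule.
# x, key = init_noise, jax.random.PRNGKey(0)
# for _ in range(200):
#     key, sub = jax.random.split(key)
#     x, sig = sample_step(x, sub, blind_denoise)
```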

February 10, 2026

On the Superlinear Relationship between SGD Noise Covariance and Loss Landscape Curvature

Yikuan Zhang, Ning Yang, Y. Tu

Stochastic Gradient Descent (SGD) introduces anisotropic noise that is correlated with the local curvature of the loss landscape, thereby biasing optimization toward flat minima. Prior work often assumes an equivalence between the Fisher Information Matrix and the Hessian for negative log-likelihood losses, leading to the claim that the SGD noise covariance $C$ is proportional to the Hessian $H$. We show that this assumption holds only under restrictive conditions that are typically violated in deep neural networks. Using the recently discovered Activity–Weight Duality, we find a more general relationship agnostic to the specific loss formulation, showing that $C \propto \mathbb{E}_p[h_p^2]$, where $h_p$ denotes the per-sample Hessian with $H = \mathbb{E}_p[h_p]$. As a consequence, $C$ and $H$ commute approximately rather than coincide exactly, and their diagonal elements follow an approximate power-law relation $C_{ii} \propto H_{ii}^{\gamma}$ with a theoretically bounded exponent $1 \le \gamma \le 2$, determined by per-sample Hessian spectra. Experiments across datasets, architectures, and loss functions validate these bounds, providing a unified characterization of the noise-curvature relationship in deep learning.
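
A toy sketch of the diagnostic implied by the abstract, on a model small enough that $C$ and $H$ are exactly tractable: estimate the per-parameter SGD gradient-noise variance and the Hessian diagonal, then fit $\gamma$ by log-log regression. On this linear toy problem the fitted exponent need not fall in the deep-network regime; the point is only the procedure.

```python
# Estimate diag(C) and diag(H) on a tiny model, then fit C_ii ~ H_ii^gamma.
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
X = jax.random.normal(key, (256, 5))
w_true = jnp.arange(1.0, 6.0)
y = X @ w_true + 0.1 * jax.random.normal(jax.random.PRNGKey(1), (256,))

def per_sample_loss(w, x, yi):
    return 0.5 * (x @ w - yi) ** 2

w = jnp.zeros(5)
grads = jax.vmap(jax.grad(per_sample_loss), in_axes=(None, 0, 0))(w, X, y)
C_diag = grads.var(axis=0)                  # diag of single-sample noise covariance
mean_loss = lambda w: jax.vmap(per_sample_loss, (None, 0, 0))(w, X, y).mean()
H_diag = jnp.diag(jax.hessian(mean_loss)(w))   # diag of full-batch Hessian
# Fit gamma: log C_ii = gamma * log H_ii + const.
A = jnp.stack([jnp.log(H_diag), jnp.ones_like(H_diag)], axis=1)
gamma, _ = jnp.linalg.lstsq(A, jnp.log(C_diag))[0]
```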

February 5, 2026

Neural population geometry and optimal coding of tasks with shared latent structure

Albert J. Wakhloo, Will Slatton, S. Chung

Animals can recognize latent structures in their environment and apply this information to efficiently navigate the world. Several works argue that the brain supports these abilities by forming neural representations from which behaviorally relevant variables can be read out across contexts and tasks. However, it is unclear which features of neural activity facilitate downstream readout. Here we analytically determine the geometric properties of neural activity that govern linear readout generalization on a set of tasks sharing a common latent structure. We show that four statistics summarizing the dimensionality, factorization, and correlation structure of neural activity determine generalization. Early in learning, optimal neural representations are lower dimensional and exhibit higher correlations between single units and task variables than late in learning. We support these predictions through analyses of biological and artificial neural data. Our results tie the linearly decodable information in neural population activity to its geometry.
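
The paper's four statistics are specific to its theory; as one familiar example of the kind of geometric summary involved, the sketch below computes the participation ratio, a standard effective-dimensionality measure of population activity of the sort the dimensionality statistic tracks.

```python
# Participation ratio: effective dimensionality of a population-activity matrix.
import jax
import jax.numpy as jnp

def participation_ratio(R):
    """R: (n_trials, n_neurons) activity matrix.
    PR = (sum_i lam_i)^2 / sum_i lam_i^2 over covariance eigenvalues lam_i."""
    Rc = R - R.mean(axis=0)
    cov = Rc.T @ Rc / (R.shape[0] - 1)
    lam = jnp.linalg.eigvalsh(cov)
    return jnp.sum(lam) ** 2 / jnp.sum(lam**2)

# Toy usage: activity confined to a 3-D subspace of 50 neurons yields PR <= 3,
# close to 3 when variance is spread evenly across the subspace.
key = jax.random.PRNGKey(0)
latents = jax.random.normal(key, (500, 3))
proj = jax.random.normal(jax.random.PRNGKey(1), (3, 50))
print(participation_ratio(latents @ proj))
```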
