2573 Publications

Provable convergence guarantees for black-box variational inference

Justin Domke, R. M. Gower, Guillaume Garrigos

Black-box variational inference is widely used in situations where there is no proof that its stochastic optimization succeeds. We suggest this is due to a theoretical gap in existing stochastic optimization proofs—namely the challenge of gradient estimators with unusual noise bounds, and a composite non-smooth objective. For dense Gaussian variational families, we observe that existing gradient estimators based on reparameterization satisfy a quadratic noise bound and give novel convergence guarantees for proximal and projected stochastic gradient descent using this bound. This provides rigorous guarantees that methods similar to those used in practice converge on realistic inference problems.
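The core estimator the abstract refers to can be illustrated with a minimal sketch: a dense Gaussian family q(z) = N(m, CC^T), a one-sample reparameterization gradient, and a projected stochastic gradient step. The target density (a standard normal), step size, and projection details below are toy assumptions, not the paper's exact scheme.

```python
import numpy as np

def grad_log_p(z):
    # Stand-in target posterior: log p(z) = -||z||^2 / 2 + const.
    return -z

def reparam_gradient(m, C, rng):
    """One-sample gradient of the negative ELBO w.r.t. (m, C);
    the Gaussian entropy term is differentiated in closed form."""
    eps = rng.standard_normal(m.shape)
    z = m + C @ eps                       # reparameterized sample
    g = grad_log_p(z)
    grad_m = -g
    grad_C = -np.outer(g, eps) - np.linalg.inv(C).T  # incl. entropy term
    return grad_m, grad_C

rng = np.random.default_rng(0)
m, C = np.ones(2), np.eye(2)
for _ in range(5000):
    gm, gC = reparam_gradient(m, C, rng)
    m -= 0.02 * gm
    C -= 0.02 * gC
    # Projection: keep C lower-triangular with positive diagonal,
    # a simple analogue of the projected step analyzed in the paper.
    C = np.tril(C)
    np.fill_diagonal(C, np.maximum(np.diag(C), 1e-3))

print(np.round(m, 2), np.round(np.diag(C), 2))
```

For this toy target the optimum is m = 0, C = I, and the iterates hover near it despite gradient noise.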

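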

Building a “trap model” of glassy dynamics from a local structural predictor of rearrangements

S. A. Ridout, I. Tah, A. J. Liu

Here we introduce a variation of the trap model of supercooled liquids based on softness, a particle-based variable identified by machine learning that quantifies the local structural environment and energy barrier for the particle to rearrange. As in the trap model, we assume that each particle's softness, and hence energy barrier, evolves independently. We show that our model makes qualitatively reasonable predictions of behaviors such as the dependence of fragility on density in a model supercooled liquid. We also show failures of the model, indicating in some cases signs that softness may be missing important information, and in other cases features that may only be explained by correlations neglected in the trap model.
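The independence assumption in the abstract — each particle escapes its trap at a rate set by its own barrier, drawing a fresh barrier on escape — can be sketched with a classic Bouchaud-style trap model (not the softness-based variant of the paper; the exponential barrier distribution is an illustrative assumption).

```python
import numpy as np

def simulate_hops(n_particles, t_max, temperature, rng):
    """Count rearrangements (trap escapes) per particle up to t_max.
    Each particle's barrier E ~ Exp(1) evolves independently; the
    escape rate from a trap of depth E is exp(-E / T)."""
    hops = np.zeros(n_particles, dtype=int)
    for i in range(n_particles):
        t = 0.0
        while True:
            E = rng.exponential(1.0)                    # fresh barrier
            dwell = rng.exponential(np.exp(E / temperature))
            t += dwell
            if t > t_max:
                break
            hops[i] += 1
    return hops

rng = np.random.default_rng(1)
hot = simulate_hops(200, 100.0, 2.0, rng).mean()
cold = simulate_hops(200, 100.0, 0.8, rng).mean()
print(hot, cold)  # rearrangements slow down markedly on cooling
```

Below T = 1 the mean trapping time of this toy model diverges, a simple caricature of glassy slowdown.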


Conformations, correlations, and instabilities of a flexible fiber in an active fluid

S. Weady, D. Stein, Alexandra Zidovska, M. Shelley

Fluid-structure interactions between active and passive components are important for many biological systems to function. A particular example is chromatin in the cell nucleus, where ATP-powered processes drive coherent motions of the chromatin fiber over micron lengths. Motivated by this system, we develop a multiscale model of a long flexible polymer immersed in a suspension of active force dipoles as an analog to a chromatin fiber in an active fluid – the nucleoplasm. Linear analysis identifies an orientational instability driven by hydrodynamic and alignment interactions between the fiber and the suspension, and numerical simulations show activity can drive coherent motions and structured conformations. These results demonstrate how active and passive components, connected through fluid-structure interactions, can generate coherent structures and self-organize on large scales.

December 6, 2023

Uniqueness and characteristic flow for a non strictly convex singular variational problem

Jean-François Babadjian, G. Francfort

This work addresses the question of uniqueness of the minimizers of a convex but not strictly convex integral functional with linear growth in a two-dimensional setting. The integrand -- whose precise form derives directly from the theory of perfect plasticity -- behaves quadratically close to the origin and grows linearly once a specific threshold is reached. Thus, in contrast with the only existing literature on uniqueness for functionals with linear growth, namely that pertaining to the generalized least gradient, the integrand is not a norm. We make use of hyperbolic conservation laws hidden in the structure of the problem to tackle uniqueness. Our argument strongly relies on the regularity of a vector field -- the Cauchy stress in the terminology of perfect plasticity -- which allows us to define characteristic lines, and then to employ the method of characteristics. Using the detailed structure of the characteristic landscape evidenced in our preliminary study [BF], we show that this vector field is actually continuous, save for possibly two points. The different behaviors of the energy density at zero and at infinity imply an inequality constraint on the Cauchy stress. Under a barrier-type convexity assumption on the set where the inequality constraint is saturated, we show that uniqueness holds for pure Dirichlet boundary data, a stronger result than that of uniqueness for a given trace on the whole boundary since our minimizers can fail to attain the boundary data.

December 4, 2023

Sharp error estimates for target measure diffusion maps with applications to the committor problem

Shashank Sule, L. Evans, Maria Cameron

We obtain asymptotically sharp error estimates for the consistency error of the Target Measure Diffusion map (TMDmap) (Banisch et al. 2020), a variant of diffusion maps featuring importance sampling and hence allowing input data drawn from an arbitrary density. The derived error estimates include the bias error and the variance error. The resulting convergence rates are consistent with the approximation theory of graph Laplacians. The key novelty of our results lies in the explicit quantification of all the prefactors on leading-order terms. We also prove an error estimate for solutions of Dirichlet BVPs obtained using TMDmap, showing that the solution error is controlled by consistency error. We use these results to study an important application of TMDmap in the analysis of rare events in systems governed by overdamped Langevin dynamics using the framework of transition path theory (TPT). The cornerstone ingredient of TPT is the solution of the committor problem, a boundary value problem for the backward Kolmogorov PDE. Remarkably, we find that the TMDmap algorithm is particularly suited as a meshless solver to the committor problem due to the cancellation of several error terms in the prefactor formula. Furthermore, significant improvements in bias and variance errors occur when using a quasi-uniform sampling density. Our numerical experiments show that these improvements in accuracy are realizable in practice when using $\delta$-nets as spatially uniform inputs to the TMDmap algorithm.
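The TMDmap construction and its use as a meshless committor solver can be sketched compactly: build a Gaussian kernel on the point cloud, right-normalize by the square root of the target density over a kernel density estimate, and solve the resulting discrete Dirichlet problem. Constants and normalization conventions below follow the general pattern of this algorithm family and may differ from the paper's.

```python
import numpy as np

def tmdmap_generator(X, target_density, eps):
    """Discrete approximation of the generator of overdamped Langevin
    dynamics with invariant density `target_density`, from a point
    cloud X of shape (n, d)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-d2 / (4 * eps))
    q = K.sum(axis=1)                        # kernel density estimate
    K1 = K * (np.sqrt(target_density) / q)[None, :]   # right normalization
    P = K1 / K1.sum(axis=1)[:, None]
    return (P - np.eye(len(X))) / eps

# Toy committor problem on a line with Gaussian target density.
x = np.linspace(-2, 2, 60)[:, None]
pi = np.exp(-x[:, 0] ** 2)
L = tmdmap_generator(x, pi, eps=0.05)

idx_A = np.where(x[:, 0] <= -1.5)[0]         # reactant set
idx_B = np.where(x[:, 0] >= 1.5)[0]          # product set
idx_I = np.where((x[:, 0] > -1.5) & (x[:, 0] < 1.5))[0]

# Dirichlet BVP: L q = 0 on the interior, q = 0 on A, q = 1 on B.
q = np.zeros(len(x))
q[idx_B] = 1.0
q[idx_I] = np.linalg.solve(L[np.ix_(idx_I, idx_I)],
                           -L[np.ix_(idx_I, idx_B)].sum(axis=1))
print(np.round(q[::10], 2))
```

By the discrete maximum principle the solution stays in [0, 1], and by symmetry the committor passes through 1/2 near the origin.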


Adaptive whitening with fast gain modulation and slow synaptic plasticity

L. Duong, E. P. Simoncelli, D. Chklovskii, D. Lipshutz

Neurons in early sensory areas rapidly adapt to changing sensory statistics, both by normalizing the variance of their individual responses and by reducing correlations between their responses. Together, these transformations may be viewed as an adaptive form of statistical whitening. Existing mechanistic models of adaptive whitening exclusively use either synaptic plasticity or gain modulation as the biological substrate for adaptation; however, on their own, each of these models has significant limitations. In this work, we unify these approaches in a normative multi-timescale mechanistic model that adaptively whitens its responses with complementary computational roles for synaptic plasticity and gain modulation. Gains are modified on a fast timescale to adapt to the current statistical context, whereas synapses are modified on a slow timescale to match structural properties of the input statistics that are invariant across contexts. Our model is derived from a novel multi-timescale whitening objective that factorizes the inverse whitening matrix into basis vectors, which correspond to synaptic weights, and a diagonal matrix, which corresponds to neuronal gains. We test our model on synthetic and natural datasets and find that the synapses learn optimal configurations over long timescales that enable adaptive whitening on short timescales using gain modulation.
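A toy version of the fast-gain mechanism can be sketched as follows: hold the "synaptic" basis W fixed (as if the slow timescale had already converged) and adapt per-neuron gains so that the variance of the response projected onto each basis vector equals one. The recurrent steady state, update rule, and constants below are illustrative assumptions, not the paper's derivation.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 2
A = np.array([[2.0, 0.5], [0.5, 1.0]])   # context: correlated inputs
Sigma = A @ A.T
W = np.linalg.eigh(Sigma)[1]             # fixed basis ("learned synapses")
g = np.zeros(d)                          # gains, adapted quickly

eta_g = 0.005
for _ in range(50000):
    x = A @ rng.standard_normal(d)
    M = np.eye(d) + W @ np.diag(g) @ W.T
    y = np.linalg.solve(M, x)            # recurrent steady-state response
    # Fast gain update: increase gain when projected variance exceeds 1.
    g += eta_g * ((W.T @ y) ** 2 - 1.0)

# Responses in this context should now be approximately white.
ys = np.linalg.solve(np.eye(d) + W @ np.diag(g) @ W.T,
                     A @ rng.standard_normal((d, 5000)))
C = ys @ ys.T / 5000
print(np.round(C, 1))
```

At the fixed point each gain settles at sqrt(lambda_i) - 1, where lambda_i are eigenvalues of the input covariance, so the response covariance approaches the identity.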


A polar prediction model for learning to represent visual transformations

All organisms make temporal predictions, and their evolutionary fitness level depends on the accuracy of these predictions. In the context of visual perception, the motions of both the observer and objects in the scene structure the dynamics of sensory signals, allowing for partial prediction of future signals based on past ones. Here, we propose a self-supervised representation-learning framework that extracts and exploits the regularities of natural videos to compute accurate predictions. We motivate the polar architecture by appealing to the Fourier shift theorem and its group-theoretic generalization, and we optimize its parameters on next-frame prediction. Through controlled experiments, we demonstrate that this approach can discover the representation of simple transformation groups acting in data. When trained on natural video datasets, our framework achieves better prediction performance than traditional motion compensation and rivals conventional deep networks, while maintaining interpretability and speed. Furthermore, the polar computations can be restructured into components resembling normalized simple and direction-selective complex cell models of primate V1 neurons. Thus, polar prediction offers a principled framework for understanding how the visual system represents sensory inputs in a form that simplifies temporal prediction.
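The Fourier-shift intuition behind polar prediction can be demonstrated directly: a translating signal advances each Fourier coefficient's phase at a constant rate, so the next frame is predicted by repeating the most recent per-frequency phase advance. This illustrates the underlying principle only, not the paper's learned architecture.

```python
import numpy as np

def predict_next(frame_prev, frame_curr):
    """Predict the next frame of a periodically translating 1-D signal
    by extrapolating each frequency's phase advance."""
    F0, F1 = np.fft.fft(frame_prev), np.fft.fft(frame_curr)
    phase_step = np.exp(1j * (np.angle(F1) - np.angle(F0)))
    return np.real(np.fft.ifft(F1 * phase_step))

n, shift = 64, 3
x = np.arange(n)
signal = np.sin(2 * np.pi * x / n) + 0.5 * np.cos(6 * np.pi * x / n)
f0, f1 = signal, np.roll(signal, shift)
f2_true = np.roll(signal, 2 * shift)
f2_pred = predict_next(f0, f1)
err = np.abs(f2_pred - f2_true).max()
print(err)  # essentially zero for a pure translation
```

For genuine video the transformation is only locally and approximately a translation, which is what motivates learning the representation rather than fixing the Fourier basis.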


Efficient coding of natural images using maximum manifold capacity representations

The efficient coding hypothesis posits that sensory systems are adapted to the statistics of their inputs, maximizing mutual information between environmental signals and their representations, subject to biological constraints. While elegant, information theoretic quantities are notoriously difficult to measure or optimize, and most research on the hypothesis employs approximations, bounds, or substitutes (e.g., reconstruction error). A recently developed measure of coding efficiency, the "manifold capacity", quantifies the number of object categories that may be represented in a linearly separable fashion, but its calculation relies on a computationally intensive iterative procedure that precludes its use as an objective. Here, we simplify this measure to a form that facilitates direct optimization, use it to learn Maximum Manifold Capacity Representations (MMCRs), and demonstrate that these are competitive with state-of-the-art results on current self-supervised learning (SSL) recognition benchmarks. Empirical analyses reveal important differences between MMCRs and the representations learned by other SSL frameworks, and suggest a mechanism by which manifold compression gives rise to class separability. Finally, we evaluate a set of SSL methods on a suite of neural predictivity benchmarks, and find MMCRs are highly competitive as models of the primate ventral stream.
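The simplified objective is usually summarized as follows: embed several augmented views per image, normalize to the unit sphere, average views into per-image centroids, and maximize the nuclear norm of the centroid matrix. The sketch below illustrates that form; weighting and regularization details in the paper may differ.

```python
import numpy as np

def mmcr_loss(Z):
    """Z: (n_images, n_views, dim) array of embeddings.
    Returns the negative nuclear norm of the view-centroid matrix."""
    Z = Z / np.linalg.norm(Z, axis=-1, keepdims=True)   # unit sphere
    centroids = Z.mean(axis=1)                          # per-image mean
    return -np.linalg.norm(centroids, ord='nuc')

rng = np.random.default_rng(3)
base = rng.standard_normal((8, 1, 16))
# Collapsed embeddings (all images alike) should score worse (higher
# loss) than spread-out, view-consistent embeddings.
collapsed = np.repeat(base[:1], 8, axis=0) + 0.01 * rng.standard_normal((8, 4, 16))
spread = np.repeat(base, 4, axis=1) + 0.01 * rng.standard_normal((8, 4, 16))
print(mmcr_loss(spread), mmcr_loss(collapsed))
```

Maximizing the centroids' nuclear norm rewards spreading image centroids across many dimensions while view averaging implicitly compresses each image's manifold.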


Comparing neural models using their perceptual discriminability predictions

J. Zhou, Chanwoo Chun, Ajay Subramanian, E. P. Simoncelli

Internal representations are not uniquely identifiable from perceptual measurements: different representations can generate identical perceptual predictions, and similar representations may predict dissimilar percepts. Here, we generalize a previous method (``Eigendistortions'' -- Berardino et al., 2017) to enable comparison of models based on their metric tensors, which can be verified perceptually. Metric tensors characterize sensitivity to stimulus perturbations, reflecting both the geometric and stochastic properties of the representation, and providing an explicit prediction of perceptual discriminability. Brute force comparison of model-predicted metric tensors would require estimation of human perceptual thresholds along an infeasibly large set of stimulus directions. To circumvent this ``perceptual curse of dimensionality'', we compute and measure discrimination capabilities for a small set of most-informative perturbations, reducing the measurement cost from thousands of hours (a conservative estimate) to a single trial. We show that this single measurement, made for a variety of different test stimuli, is sufficient to differentiate models, select models that better match human perception, or generate new models that combine the advantages of existing models. We demonstrate the power of this method in comparison of (1) two models for trichromatic color representation, with differing internal noise; and (2) two autoencoders trained with different regularizers.
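The metric-tensor comparison can be sketched in a toy setting: each model's sensitivity at stimulus s is summarized by F = JᵀJ (J the response Jacobian), and an informative perturbation is an extremal direction of the difference of the two models' metric tensors. The models, stimulus, and the specific surrogate direction below are illustrative stand-ins.

```python
import numpy as np

def metric_tensor(f, s, h=1e-5):
    """Finite-difference Jacobian-based metric tensor F = J^T J at s."""
    d = len(s)
    f0 = f(s)
    J = np.stack([(f(s + h * np.eye(d)[i]) - f0) / h for i in range(d)],
                 axis=1)
    return J.T @ J

# Two toy "models": linear responses with swapped channel weightings.
f1 = lambda s: np.array([2.0 * s[0], 1.0 * s[1]])
f2 = lambda s: np.array([1.0 * s[0], 2.0 * s[1]])

s = np.array([0.5, 0.5])
F1, F2 = metric_tensor(f1, s), metric_tensor(f2, s)

# Perturbation direction where predicted discriminability differs most:
# an extremal eigenvector of F1 - F2.
w, V = np.linalg.eigh(F1 - F2)
v = V[:, np.argmax(np.abs(w))]
gap = abs(v @ F1 @ v - v @ F2 @ v)
print(gap)  # large gap: one perceptual measurement separates the models
```

Measuring human thresholds along such a direction is what replaces the brute-force sweep over all stimulus perturbations.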


Metal-insulator transition and quantum magnetism in the SU(3) Fermi-Hubbard model

We use state-of-the-art numerical techniques to compute ground state correlations in the two-dimensional SU(3) Fermi Hubbard model at 1/3-filling, modeling fermions with three possible spin flavors moving on a square lattice with an average of one particle per site. We find clear evidence of a quantum critical point separating a non-magnetic uniform metallic phase from a regime where long-range `spin' order is present. In particular, there are multiple successive transitions to states with regular, long-range alternation of the different flavors, whose symmetry changes as the interaction strength increases. In addition to the rich quantum magnetism, this important physical system allows one to study integer filling and the associated Mott transition disentangled from nesting, in contrast to the usual SU(2) model. Our results also provide a significant step towards the interpretation of present and future experiments on fermionic alkaline-earth atoms, and other realizations of SU(N) physics.