2596 Publications

posteriordb: Testing, Benchmarking and Developing Bayesian Inference Algorithms

Måns Magnusson, Jakob Torgander, Paul-Christian Bürkner, Lu Zhang, B. Carpenter, Aki Vehtari

The generality and robustness of inference algorithms is critical to the success of widely used probabilistic programming languages such as Stan, PyMC, Pyro, and Turing. When designing a new general-purpose inference algorithm, whether it involves Monte Carlo sampling or variational approximation, a fundamental problem is evaluating its accuracy and efficiency across a range of representative target models. To solve this problem, we propose posteriordb, a database of models and data sets defining target densities along with reference Monte Carlo draws. We further provide a guide to best practices in using posteriordb for model evaluation and comparison. To provide a wide range of realistic target densities, posteriordb currently comprises 120 representative models and has been instrumental in developing several general inference algorithms.


Good Rates From Bad Coordinates: The Exponential Average Time-dependent Rate Approach

Nicodemo Mazzaferro, Subarna Sasmal, P. Cossio, Glen M. Hocky

Our ability to calculate rate constants of biochemical processes using molecular dynamics simulations is severely limited by the fact that the time scales for reactions, or changes in conformational state, scale exponentially with the relevant free-energy barrier heights. In this work, we improve upon a recently proposed rate estimator that allows us to predict transition times with molecular dynamics simulations biased to rapidly explore one or several collective variables (CVs). This approach relies on the idea that not all bias goes into promoting transitions, and along with the rate, it estimates a concomitant scale factor for the bias termed the “CV biasing efficiency” γ. First, we demonstrate mathematically that our new formulation allows us to derive the commonly used Infrequent Metadynamics (iMetaD) estimator when using a perfect CV, where γ = 1. After testing it on a model potential, we then study the unfolding behavior of a previously well characterized coarse-grained protein, which is sufficiently complex that we can choose many different CVs to bias, but which is sufficiently simple that we are able to compute the unbiased rate directly. For this system, we demonstrate that predictions from our new Exponential Average Time-Dependent Rate (EATR) estimator converge to the true rate constant more rapidly as a function of bias deposition time than does the previous iMetaD approach, even for bias deposition times that are short. We also show that the γ parameter can serve as a good metric for assessing the quality of the biasing coordinate. We demonstrate that these results hold when applying the methods to an atomistic protein folding example. Finally, we demonstrate that our approach works when combining multiple less-than-optimal bias coordinates, and adapt our method to the related “OPES flooding” approach. Overall, our time-dependent rate approach offers a powerful framework for predicting rate constants from biased simulations.
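The rescaled-time idea underlying the iMetaD and EATR estimators can be sketched in a few lines. The helper below is a hypothetical illustration (the function name, arguments, and toy values are not from the paper): each biased time step is inflated by exp(γβV), so that γ = 1 recovers the iMetaD limit of a perfect CV, while γ < 1 discounts bias that does not actually promote the transition.

```python
import math

def rescaled_time(bias_values, dt, beta, gamma=1.0):
    """Estimate the unbiased elapsed time from a biased trajectory.

    t* = sum_i dt * exp(gamma * beta * V_i), where V_i is the bias acting
    on the system at step i. gamma = 1 gives the iMetaD (perfect-CV)
    estimator; smaller gamma reflects a lower CV biasing efficiency.
    """
    return sum(dt * math.exp(gamma * beta * v) for v in bias_values)
```

With zero bias the rescaled time reduces to the simulated time; with a finite bias it grows exponentially, which is why a poor CV (γ overestimated) can badly overestimate the true transition time.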


Fishing for Planets: A Comparative Analysis of EPRV Survey Performance in the Presence of Correlated Noise

A. Gupta, M. Bedell

With dedicated exoplanet surveys underway for multiple extreme-precision radial velocity (EPRV) instruments, the near-future prospects of RV exoplanet science are promising. These surveys' generous time allocations are expected to facilitate the discovery of Earth analogs around bright, nearby Sun-like stars. But survey success will depend critically on the choice of observing strategy, which will determine the survey's ability to mitigate known sources of noise and extract low-amplitude exoplanet signals. Here we present an analysis of the Fisher information content of simulated EPRV surveys, accounting for the most recent advances in our understanding of stellar variability on both short and long timescales (i.e., oscillations and granulation within individual nights, and activity-induced variations across multiple nights). In this analysis, we capture the correlated nature of stellar variability by parameterizing these signals with Gaussian process kernels. We describe the underlying simulation framework and the physical interpretation of the Fisher information content, and we evaluate the efficacy of EPRV survey strategies that have been presented in the literature. We explore and compare strategies for scheduling observations over various timescales, and we make recommendations to optimize survey performance for the detection of Earth-like exoplanets.
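The central quantity in such an analysis can be sketched as follows. This is a simplified stand-in, not the paper's simulation framework: the kernel is a plain squared-exponential surrogate for activity-induced correlations, and the only fit parameter is the amplitude K of a circular-orbit RV signal, for which the Fisher information under Gaussian noise with covariance C is F = dᵀC⁻¹d, with d the derivative of the model with respect to K.

```python
import numpy as np

def fisher_information_amplitude(times, period, C):
    # Derivative of the RV model v(t) = K * sin(2*pi*t / period) w.r.t. K
    d = np.sin(2 * np.pi * times / period)
    # Fisher information for K under correlated Gaussian noise
    return d @ np.linalg.solve(C, d)

def activity_kernel(times, amp, tau):
    # Illustrative squared-exponential surrogate for correlated stellar noise
    dt = times[:, None] - times[None, :]
    return amp**2 * np.exp(-0.5 * (dt / tau) ** 2)

times = np.arange(0.0, 30.0, 1.0)   # one observation per night, in days
C = activity_kernel(times, 2.0, 5.0) + 0.1**2 * np.eye(len(times))
F = fisher_information_amplitude(times, 10.0, C)
sigma_K = 1.0 / np.sqrt(F)          # Cramer-Rao lower bound on the amplitude
```

Comparing sigma_K across candidate observing schedules (the `times` array) is the essence of an information-based survey-strategy comparison: schedules that decorrelate the planet signal from the noise covariance yield larger F and a tighter amplitude bound.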


Open Data In Neurophysiology: Advancements, Solutions & Challenges

Colleen J Gillon, Cody Baker, Ryan Ly, E. Balzani, Bingni W Brunton, Manuel Schottdorf, Satrajit Ghosh, Noma Dehghani

Across the life sciences, an ongoing effort over the last 50 years has made data and methods more reproducible and transparent. This openness has led to transformative insights and vastly accelerated scientific progress (1, 2). For example, structural biology (3) and genomics (4, 5) have undertaken systematic collection and publication of protein sequences and structures over the past half-century, and these data have led to scientific breakthroughs that were unthinkable when data collection first began (e.g., (6)). We believe that neuroscience is poised to follow the same path, and that principles of open data and open science will transform our understanding of the nervous system in ways that are impossible to predict at the moment. To this end, new social structures along with active and open scientific communities are essential (7) to facilitate and expand the still limited adoption of open science practices in our field (8). Unified by shared values of openness, we set out to organize a symposium for Open Data in Neuroscience (ODIN) to strengthen our community and facilitate transformative neuroscience research at large. In this report, we share what we learned during this first ODIN event. We also lay out plans for how to grow this movement, document emerging conversations, and propose a path toward a better and more transparent science of tomorrow.


Magnetic, charge, and bond order in the two-dimensional Su-Schrieffer-Heeger-Holstein model

Most nonperturbative numerical studies of electron-phonon interactions focus on model Hamiltonians where the electrons interact with a phonon branch via a single type of microscopic mechanism. Two commonly explored couplings in this context are the Holstein and Su-Schrieffer-Heeger (SSH) interactions, which describe phonons modulating the on-site energy and intersite electron hopping, respectively. Many materials, however, have multiple phonon branches that can each interact with electronic degrees of freedom in different ways. We present here a determinant quantum Monte Carlo study of the half-filled two-dimensional (bond) SSH-Holstein Hamiltonian, where electrons couple to different phonon branches via either the Holstein or SSH mechanism. We map the model's phase diagram and determine the nature of the transitions between charge-density wave, bond-order wave, and antiferromagnetic order.
July 1, 2024

Cross-extrapolation reconstruction of low-rank functions and application to quantum many-body observables in the strong coupling regime

We present a general-purpose algorithm to extrapolate a low-rank function of two variables from a small domain to a larger one. It is based on the cross-interpolation formula. We apply it to reconstruct physical quantities in some quantum many-body perturbative expansions in the real-time Keldysh formalism, considered as a function of time t and interaction U. These functions are of remarkably low rank. This property, combined with the convergence of the perturbative expansion in U both at finite t (for any U) and at small U (for any t), is sufficient for our algorithm to reconstruct the physical quantity in the long-time, strong-coupling regime. Our method constitutes an alternative to standard resummation techniques in perturbative methods, such as diagrammatic quantum Monte Carlo. We benchmark it on the single impurity Anderson model and show that it is successful even in some regimes where standard conformal mapping resummation techniques fail.
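The cross-interpolation (skeleton) formula at the heart of the method can be sketched with a toy numpy example. The pivot choices and the rank-2 test function below are illustrative, not those of the paper: a function of two variables with exact rank k, sampled on a grid, is reconstructed everywhere from just k rows and k columns.

```python
import numpy as np

def cross_interpolation(A, rows, cols):
    """Skeleton (cross) approximation: A ~ C @ pinv(P) @ R.

    C holds the selected columns, R the selected rows, and P their
    intersection. For a rank-k matrix with a nonsingular k x k pivot
    block, the reconstruction is exact on the whole grid."""
    C = A[:, cols]
    R = A[rows, :]
    P = A[np.ix_(rows, cols)]
    return C @ np.linalg.pinv(P) @ R

# A rank-2 sampled function f(x, y) = sin(x)cos(y) + x*y^2
x = np.linspace(0.0, 1.0, 50)
A = np.outer(np.sin(x), np.cos(x)) + np.outer(x, x**2)
A_hat = cross_interpolation(A, rows=[10, 40], cols=[5, 30])
```

In the paper's setting the pivots live in the small (t, U) domain where the perturbative expansion converges, and the same formula then extends the quantity to the larger domain; that extrapolation step is what the toy example does not capture.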
July 1, 2024


The neuron as a direct data-driven controller

J. Moore, A. Genkin, Magnus Tournoy, J. Pughe-Sanford, Rob R. de Ruyter van Steveninck, D. Chklovskii

Building upon the efficient coding and predictive information theories, we present a perspective that neurons not only predict but may also actively influence their future inputs through their outputs. We model neurons as feedback controllers of their environments, a role traditionally considered computationally demanding, particularly when the dynamical system characterizing the environment is unknown. By harnessing an advanced data-driven control framework, we illustrate the feasibility of biological neurons functioning as effective feedback controllers. This innovative approach enables us to coherently explain various experimental findings that previously seemed unrelated. Our research has multiple potential implications, from the modeling of neuronal circuits to enabling biologically inspired artificial intelligence systems. In the quest to model neuronal function amid gaps in physiological data, a promising strategy is to develop a normative theory that interprets neuronal physiology as optimizing a computational objective. This study extends current normative models, which primarily optimize prediction, by conceptualizing neurons as optimal feedback controllers. We posit that neurons, especially those beyond early sensory areas, steer their environment toward a specific desired state through their output. This environment comprises both synaptically interlinked neurons and external motor-sensory feedback loops, enabling neurons to evaluate the effectiveness of their control via synaptic feedback. To model neurons as biologically feasible controllers that implicitly identify loop dynamics, infer latent states, and optimize control, we utilize the contemporary direct data-driven control (DD-DC) framework.
Our DD-DC neuron model explains various neurophysiological phenomena: the shift from potentiation to depression in spike-timing-dependent plasticity with its asymmetry, the duration and adaptive nature of feedforward and feedback neuronal filters, the imprecision in spike generation under constant stimulation, and the characteristic operational variability and noise in the brain. Our model presents a significant departure from the traditional, feedforward, instant-response McCulloch–Pitts–Rosenblatt neuron, offering a modern, biologically informed fundamental unit for constructing neural networks.
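The direct data-driven control principle can be illustrated on a scalar toy plant. Everything below (the plant coefficients, horizon, and Hankel construction) is a hypothetical sketch of the DD-DC idea, not the authors' neuron model: a recorded trajectory of an unknown linear system is reorganized into Hankel matrices whose linear combinations are themselves valid trajectories, so the controller can plan inputs without ever identifying the plant.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 0.9, 0.5                      # unknown scalar plant: x+ = a*x + b*u
T, L = 20, 3                         # data length, planning horizon

# Collect one input-output trajectory (the controller's "experience")
u = rng.standard_normal(T)
x = np.zeros(T + 1)
for k in range(T):
    x[k + 1] = a * x[k] + b * u[k]

# Hankel matrices: each column is one length-L trajectory snippet
N = T - L + 1
Hu = np.column_stack([u[j:j + L] for j in range(N)])
Hx = np.column_stack([x[j:j + L + 1] for j in range(N)])

# DD-DC step: pick a combination g whose snippet starts at the current
# state and ends at the target; (a, b) are never identified.
x_cur, x_tgt = 1.0, 0.0
A_eq = np.vstack([Hx[0], Hx[L]])
g, *_ = np.linalg.lstsq(A_eq, np.array([x_cur, x_tgt]), rcond=None)
u_plan = Hu @ g

# Applying the planned inputs to the true plant reaches the target
z = x_cur
for k in range(L):
    z = a * z + b * u_plan[k]
```

The key fact exploited here (Willems' fundamental lemma, in spirit) is that for a linear system the set of trajectories is a subspace, so any linear combination of recorded snippets is itself a feasible trajectory.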


AstroCLIP: a cross-modal foundation model for galaxies

Liam Parker, Francois Lanusse, Siavash Golkar, Leopoldo Sarra, Miles Cranmer, A. Bietti, Michael Eickenberg, Geraud Krawezik, Michael McCabe, R. Morel, R. Ohana, B. Régaldo-Saint Blancard, et al.

We present AstroCLIP, a single, versatile model that can embed both galaxy images and spectra into a shared, physically meaningful latent space. These embeddings can then be used – without any model fine-tuning – for a variety of downstream tasks including (1) accurate in-modality and cross-modality semantic similarity search, (2) photometric redshift estimation, (3) galaxy property estimation from both images and spectra, and (4) morphology classification. Our approach to implementing AstroCLIP consists of two parts. First, we embed galaxy images and spectra separately by pre-training separate transformer-based image and spectrum encoders in self-supervised settings. We then align the encoders using a contrastive loss. We apply our method to spectra from the Dark Energy Spectroscopic Instrument and images from its corresponding Legacy Imaging Survey. Overall, we find remarkable performance on all downstream tasks, even relative to supervised baselines. For example, for a task like photometric redshift prediction, we find similar performance to a specifically trained ResNet18, and for additional tasks like physical property estimation (stellar mass, age, metallicity, and specific star-formation rate), we beat this supervised baseline by 19 per cent in terms of R². We also compare our results with a state-of-the-art self-supervised single-modal model for galaxy images, and find that our approach outperforms this benchmark by roughly a factor of two on photometric redshift estimation and physical property prediction in terms of R², while remaining roughly in line on morphology classification. Ultimately, our approach represents the first cross-modal self-supervised model for galaxies, and the first self-supervised transformer-based architectures for galaxy images and spectra.
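The contrastive alignment step can be sketched with a numpy version of the symmetric CLIP/InfoNCE objective. This is a generic illustration of that loss family (function name and temperature value are assumptions), not AstroCLIP's training code.

```python
import numpy as np

def clip_loss(img_emb, spec_emb, temperature=0.07):
    """Symmetric InfoNCE loss: matched image/spectrum pairs lie on the diagonal."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    spec = spec_emb / np.linalg.norm(spec_emb, axis=1, keepdims=True)
    logits = img @ spec.T / temperature      # pairwise cosine similarities
    n = len(logits)

    def xent(lg):
        # Cross-entropy of each row against its own index (stable log-softmax)
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()

    # Average the image-to-spectrum and spectrum-to-image directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

Minimizing this loss pulls the two encoders' embeddings of the same galaxy together while pushing apart embeddings of different galaxies, which is what makes the shared latent space usable for cross-modal similarity search.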


Privileged representational axes in biological and artificial neural networks

Meenakshi Khosla, A. Williams, Josh McDermott, N. Kanwisher

How do neurons code information? Recent work emphasizes properties of population codes, such as their geometry and decodable information, using measures that are blind to the native tunings (or 'axes') of neural responses. But might these representational axes matter, with some privileged systematically over others? To find out, we developed methods to test for alignment of neural tuning across brains and deep convolutional neural networks (DCNNs). Across both vision and audition, both brains and DCNNs consistently favored certain axes for representing the natural world. Moreover, the representational axes of DCNNs trained on natural inputs were aligned to those in perceptual cortices, such that axis-sensitive model-brain similarity metrics better differentiated competing models of biological sensory systems. We further show that coding schemes that privilege certain axes can reduce downstream wiring costs and improve generalization. These results motivate a new framework for understanding neural tuning in biological and artificial networks and its computational benefits.
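One simple way to operationalize "privileged axes" is to ask whether single-unit tunings match across two systems better than shared geometry alone would guarantee. The metric below is a hypothetical illustration of that idea, not the authors' method: a rotation of a population code preserves its geometry and decodable information but scrambles the axes, and so lowers this score.

```python
import numpy as np

def axis_alignment(R1, R2):
    """Mean best-match correlation between single-unit tuning curves.

    R1, R2: (stimuli, units) response matrices. Identical codes score 1;
    a rotated copy of the same code (same geometry, different axes)
    scores lower, exposing axis sensitivity."""
    Z1 = (R1 - R1.mean(0)) / R1.std(0)
    Z2 = (R2 - R2.mean(0)) / R2.std(0)
    C = Z1.T @ Z2 / len(R1)          # unit-by-unit correlation matrix
    return C.max(axis=1).mean()      # best partner for each unit in R1
```

An axis-blind metric (e.g. one based on pairwise stimulus distances) would rate a code and its rotated copy as identical; comparing the two metrics on the same data is what distinguishes "same geometry" from "same axes".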
