481 Publications

Algebraically rigorous quaternion framework for the neural network pose estimation problem

C. Lin, Andrew J. Hanson, S. Hanson

The 3D pose estimation problem – aligning pairs of noisy 3D point clouds – is a problem with a wide variety of real- world applications. Here we focus on the use of quaternion- based neural network approaches to this problem and ap- parent anomalies that have arisen in previous efforts to re- solve them. In addressing these anomalies, we draw heav- ily from the extensive literature on closed-form methods to solve this problem. We suggest that the major concerns that have been put forward could be resolved using a sim- ple multi-valued training target derived from rigorous theo- retical properties of the rotation-to-quaternion map of Bar- Itzhack. This multi-valued training target is then demon- strated to have good performance for both simulated and ModelNet targets. We provide a comprehensive theoretical context, using the quaternion adjugate, to confirm and es- tablish the necessity of replacing single-valued quaternion functions by quaternions treated in the extended domain of multiple-charted manifolds.

Show Abstract

Conditionally Strongly Log-Concave Generative Models

Florentin Guth, Etienne Lempereur, Joan Bruna, S. Mallat

There is a growing gap between the impressive results of deep image generative models and classical algorithms that offer theoretical guarantees. The former suffer from mode collapse or memorization issues, limiting their application to scientific data. The latter require restrictive assumptions such as log-concavity to escape the curse of dimensionality. We partially bridge this gap by introducing conditionally strongly log-concave (CSLC) models, which factorize the data distribution into a product of conditional probability distributions that are strongly log-concave. This factorization is obtained with orthogonal projectors adapted to the data distribution. It leads to efficient parameter estimation and sampling algorithms, with theoretical guarantees, although the data distribution is not globally logconcave. We show that several challenging multiscale processes are conditionally log-concave using wavelet packet orthogonal projectors. Numerical results are shown for physical fields such as the φ4 model and weak lensing convergence maps with higher resolution than in previous works.

Show Abstract

A Stochastic Proximal Polyak Step Size

Fabian Schaipp, R. M. Gower, Michael Ulbrich

Recently, the stochastic Polyak step size (SPS) has emerged as a competitive adaptive step size scheme for stochastic gradient descent. Here we develop ProxSPS, a proximal variant of SPS that can handle regularization terms. Developing a proximal variant of SPS is particularly important, since SPS requires a lower bound of the objective function to work well. When the objective function is the sum of a loss and a regularizer, available estimates of a lower bound of the sum can be loose. In contrast, ProxSPS only requires a lower bound for the loss which is often readily available. As a consequence, we show that ProxSPS is easier to tune and more stable in the presence of regularization. Furthermore for image classification tasks, ProxSPS performs as well as AdamW with little to no tuning, and results in a network with smaller weight parameters. We also provide an extensive convergence analysis for ProxSPS that includes the non-smooth, smooth, weakly convex and strongly convex setting.

Show Abstract

Learning multi-scale local conditional probability models of images

Zahra Kadkhodaie, Florentin Guth, S. Mallat, Eero P Simoncelli

Deep neural networks can learn powerful prior probability models for images, as evidenced by the high-quality generations obtained with recent score-based diffusion methods. But the means by which these networks capture complex global statistical structure, apparently without suffering from the curse of dimensionality, remain a mystery. To study this, we incorporate diffusion methods into a multi-scale decomposition, reducing dimensionality by assuming a stationary local Markov model for wavelet coefficients conditioned on coarser-scale coefficients. We instantiate this model using convolutional neural networks (CNNs) with local receptive fields, which enforce both the stationarity and Markov properties. Global structures are captured using a CNN with receptive fields covering the entire (but small) low-pass image. We test this model on a dataset of face images, which are highly non-stationary and contain large-scale geometric structures.
Remarkably, denoising, super-resolution, and image synthesis results all demonstrate that these structures can be captured with significantly smaller conditioning neighborhoods than required by a Markov model implemented in the pixel domain. Our results show that score estimation for large complex images can be reduced to low-dimensional Markov conditional models across scales, alleviating the curse of dimensionality.

Show Abstract

Complex-to-Real Sketches for Tensor Products with Applications to the Polynomial Kernel

Jonas Wacker, R. Ohana, Maurizio Filippone

Randomized sketches of a tensor product of pvectors follow a tradeoff between statistical efficiency and computational acceleration. Commonly used approaches avoid computing the high-dimensional tensor product explicitly, resulting in a suboptimal dependence of O(3p) in the embedding dimension. We propose a simple Complex-to-Real (CtR) modification of well-known sketches that replaces real random projections by complex ones, incurring a lower O(2p)factor in the embedding dimension. The output of our sketches is real-valued, which renders
their downstream use straightforward. In particular, we apply our sketches to p-fold self-tensored inputs corresponding to the feature maps of the
polynomial kernel. We show that our method achieves state-of-the-art performance in terms of accuracy and speed compared to other randomized approximations from the literature.

Show Abstract

A Cheat Sheet for Bayesian Prediction

Bertrand Clarke, Y. Yao

This paper reviews the growing field of Bayesian prediction. Bayes point and interval prediction are defined and exemplified and situated in statistical prediction more generally. Then, four general approaches to Bayes prediction are defined and we turn to predictor selection. This can be done predictively or non-predictively and predictors can be based on single models or multiple models. We call these latter cases unitary predictors and model average predictors, respectively. Then we turn to the most recent aspect of prediction to emerge, namely prediction in the context of large observational data sets and discuss three further classes of techniques. We conclude with a summary and statement of several current open problems.

Show Abstract

Simulation-based inference of single-molecule force spectroscopy

Lars Dingeldein, P. Cossio, Roberto Covino

Single-molecule force spectroscopy (smFS) is a powerful approach to studying molecular self-organization. However, the coupling of the molecule with the ever-present experimental device introduces artifacts, that complicate the interpretation of these experiments. Performing statistical inference to learn hidden molecular properties is challenging because these measurements produce non-Markovian time series, and even minimal models lead to intractable likelihoods. To overcome these challenges, we developed a computational framework built on novel statistical methods called simulation-based inference (SBI). SBI enabled us to directly estimate the Bayesian posterior, and extract reduced quantitative models from smFS, by encoding a mechanistic model into a simulator in combination with probabilistic deep learning. Using synthetic data, we could systematically disentangle the measurement of hidden molecular properties from experimental artifacts. The integration of physical models with machine-learning density estimation is general, transparent, easy to use, and broadly applicable to other types of biophysical experiments.

Show Abstract

Simulation-based inference of single-molecule force spectroscopy

Lars Dingeldein, P. Cossio, Roberto Covino

Single-molecule force spectroscopy (smFS) is a powerful approach to studying molecular self-organization. However, the coupling of the molecule with the ever-present experimental device introduces artifacts, that complicate the interpretation of these experiments. Performing statistical inference to learn hidden molecular properties is challenging because these measurements produce non-Markovian time series, and even minimal models lead to intractable likelihoods. To overcome these challenges, we developed a computational framework built on novel statistical methods called simulation-based inference (SBI). SBI enabled us to directly estimate the Bayesian posterior, and extract reduced quantitative models from smFS, by encoding a mechanistic model into a simulator in combination with probabilistic deep learning. Using synthetic data, we could systematically disentangle the measurement of hidden molecular properties from experimental artifacts. The integration of physical models with machine-learning density estimation is general, transparent, easy to use, and broadly applicable to other types of biophysical experiments.

Show Abstract

A fast, accurate and easy to implement Kapur — Rokhlin quadrature scheme for singular integrals in axisymmetric geometries

Evan Toler, A.J. Cerfon, D. Malhotra

Many applications in magnetic confinement fusion require the efficient calculation of surface integrals with singular integrands. The singularity subtraction approaches typically used to handle such singularities are complicated to implement and low-order accurate. In contrast, we demonstrate that the Kapur–Rokhlin quadrature scheme is well-suited for the logarithmically singular integrals encountered for a toroidally axisymmetric confinement system, is easy to implement and is high-order accurate. As an illustration, we show how to apply this quadrature scheme for the efficient and accurate calculation of the normal component of the magnetic field due to the plasma current on the plasma boundary, via the virtual-casing principle.

Show Abstract

Folding and modulation of the helical conformation of Glycophorin A by point mutations

Pei-Yin Lee, A. Sahoo, Silvina Matysiak

Transmembrane helix folding and self-association play important roles in biological signaling and transportation pathways across biomembranes. With molecular simulations, studies to explore the structural biochemistry of this process have been limited to focusing on individual fragments of this process – either helix formation or dimerization. While at an atomistic resolution, it can be prohibitive to access long spatio-temporal scales, at the coarse grained (CG) level, current methods either employ additional constraints to prevent spontaneous unfolding or have a low resolution on sidechain beads that restricts the study of dimer disruption caused by mutations. To address these research gaps, in this work, we apply our recent, in-house developed CG model (ProMPT) to study the folding and dimerization of Glycophorin A (GpA) and its mutants in the presence of Dodecyl-phosphocholine (DPC) micelles. Our results first validate the two-stage model that folding and dimerization are independent events for transmembrane helices and found a positive correlation between helix folding and DPC-peptide contacts. The wild type (WT) GpA is observed to be a right-handed dimer with specific GxxxG contacts, which agrees with experimental findings. Specific point mutations reveal several features responsible for the structural stability of GpA. While the T87L mutant forms anti-parallel dimers due to an absence of T87 interhelical hydrogen bonds, a slight loss in helicity and a hinge-like feature at the GxxxG region develops for the G79L mutant. We note that the local changes in the hydrophobic environment, affected by the point mutation, contribute to the development of this helical bend. This work presents a holistic overview of the structural stability of GpA in a micellar environment, while taking secondary structural fluctuations into account. Moreover, it presents opportunities for applications of computationally efficient CG models to study conformational alterations of transmembrane proteins that have physiological relevance.

Show Abstract
  • Previous Page
  • Viewing
  • Next Page
Advancing Research in Basic Science and MathematicsSubscribe to Flatiron Institute announcements and other foundation updates