Publications

Lab icebergs melt down and flip out

Bobae Johnson, S. Weady, et al.

Ice in nature is dynamic at all scales, from glacial sheets that deform and flow to icebergs that melt down and capsize [1,2]. For the latter, much of the ice and much of the action is unseen beneath the surface [3–5]. Here we study laboratory-scale icebergs that freely float and melt, where direct visualizations show interesting and interconnected changes in the shape of the ice, its posture, and the flows of the surrounding water.

Our experiments reveal that free-floating ice persistently melts into unstable geometries, causing it to repeatedly capsize. Figure 1 shows the shape progression for a cylindrical piece of ice floating at the surface of room-temperature water. It locks to an orientation, melts in place for several minutes, then abruptly rotates to a new posture and locks again. This process repeats for about 10 to 15 flips over the 30 minutes it takes to melt away. The photographs sample some of the locked orientations. Figure 2 displays the flows of the meltwater beneath the iceberg, where the two photos capture views along the axis and from the side, respectively. Below we describe the specialized techniques that enabled these images.

Estimating the tails of the spectrum of the Hessian of the log-likelihood for ab-initio single-particle reconstruction in electron cryomicroscopy

A. Rangan, W. S. Wai Shing, P. Cossio, et al.

Electron cryomicroscopy (cryo-EM) is a technique in structural biology used to reconstruct accurate volumetric maps of molecules. One step of the cryo-EM pipeline involves solving an inverse problem. This inverse problem, referred to as ab-initio single-particle reconstruction, takes as input a collection of 2D images, each a projection of a molecule from an unknown viewing angle, and attempts to reconstruct the 3D volume representing the underlying molecular density.
Most methods for solving this inverse problem search for a solution that optimizes a posterior likelihood of generating the observed image data, given the reconstructed volume. Within this framework, it is natural to study the Hessian of the log-likelihood: the eigenvectors and eigenvalues of the Hessian determine how the likelihood changes with respect to perturbations in the solution, and can give insight into the sensitivity of the solution to aspects of the input.
In this paper we describe a simple strategy for estimating the smallest eigenvalues and eigenvectors (i.e., the 'softest modes') of the Hessian of the log-likelihood for the ab-initio single-particle reconstruction problem. This strategy involves rewriting the log-likelihood as a 3D integral. This interpretation holds in the low-noise limit, as well as in many practical scenarios which allow for noise-marginalization.
Once we have estimated the softest modes, we can use them to perform many kinds of sensitivity analysis. For example, we can determine which parts of the reconstructed volume are trustworthy, and which are unreliable, and how this unreliability might depend on the data-set and the imaging parameters. We believe that this kind of analysis can be used alongside more traditional strategies for sensitivity analysis, as well as in other applications, such as free-energy estimation.
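
To make the notion of a softest mode concrete, here is a minimal sketch on a toy problem. It assumes a small, dense, symmetric Hessian of the negative log-likelihood is already available, which is not how the paper proceeds (the abstract describes an estimation strategy, not a full eigendecomposition). The function name `softest_modes` and the toy matrix are hypothetical.

```python
import numpy as np

def softest_modes(hessian, n_modes):
    """Return the n_modes smallest eigenvalues and their eigenvectors.

    `hessian` is assumed to be the Hessian of the NEGATIVE log-likelihood at
    the optimum, so small eigenvalues mark directions along which the
    likelihood barely changes (poorly constrained parts of the solution).
    """
    sym = 0.5 * (hessian + hessian.T)    # symmetrize against round-off
    evals, evecs = np.linalg.eigh(sym)   # eigenvalues in ascending order
    return evals[:n_modes], evecs[:, :n_modes]

# Toy example: stiff along coordinates 0 and 2, soft along coordinate 1.
toy_hessian = np.diag([50.0, 0.02, 8.0])
evals, modes = softest_modes(toy_hessian, 1)
```

In this toy case the softest mode points along coordinate 1, the direction in which the solution is least well determined by the data.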

Computing whole embryo strain maps during gastrulation

David Denberg, Xiaoxuan Zhang, S. Shvartsman, et al.

Gastrulation is a critical process during embryonic development that transforms a single-layered blastula into a multilayered embryo with distinct germ layers, which eventually give rise to all the tissues and organs of the organism. Studies across species have uncovered the mechanisms underlying the building blocks of gastrulation movements, such as localized in-plane and out-of-plane epithelial deformations. The next challenge is to understand dynamics on the scale of the embryo: this requires quantifying strain tensors, which rigorously describe the differences between the deformed configurations taken on by local clusters of cells at time instants of observation and their reference configuration at an initial time. We present a systematic strategy for computing such tensors from the local dynamics of cell clusters, which are chosen across the embryo from several regions whose morphogenetic fate is central to viable gastrulation. As an application of our approach, we demonstrate a strategy of identifying distinct Drosophila morphological domains using strain tensors.
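
As an illustration of what a strain tensor for a local cell cluster can look like, the following sketch fits a single deformation gradient to the cluster by least squares and forms the Green-Lagrange strain from it. This is a standard continuum-mechanics construction used here as an assumption; the abstract does not specify the authors' estimator, and the function name and synthetic data are hypothetical.

```python
import numpy as np

def green_lagrange_strain(ref_pts, def_pts):
    """Fit a deformation gradient F to one cell cluster and return (F, E).

    ref_pts, def_pts: (n_cells, dim) arrays of cell positions in the
    reference configuration and at the observation time.
    """
    X = ref_pts - ref_pts.mean(axis=0)   # centering removes rigid translation
    x = def_pts - def_pts.mean(axis=0)
    # Least-squares solution of X @ F.T ≈ x
    F_transposed, *_ = np.linalg.lstsq(X, x, rcond=None)
    F = F_transposed.T
    # Green-Lagrange strain: E = (F^T F - I) / 2, insensitive to rotation
    E = 0.5 * (F.T @ F - np.eye(F.shape[0]))
    return F, E

# Synthetic cluster: 10% stretch along x, then a 30 degree rotation and a shift.
rng = np.random.default_rng(0)
ref = rng.normal(size=(12, 2))
theta = np.pi / 6
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
stretch = np.diag([1.1, 1.0])
deformed = ref @ (rotation @ stretch).T + np.array([3.0, -1.0])

F, E = green_lagrange_strain(ref, deformed)
```

Because the strain is built from F^T F, the rotation and translation of the cluster drop out and only the stretch contributes to E.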

Representational learning by optimization of neural manifolds in an olfactory memory network

Bo Hu, Nesibe Z. Temiz, C. Chou, Peter Rupprecht, Claire Meissner-Bernard, Benjamin Titze, S. Chung, Rainer W. Friedrich

Higher brain functions depend on experience-dependent representations of relevant information that may be organized by attractor dynamics or by geometrical modifications of continuous “neural manifolds”. To explore these scenarios we analyzed odor-evoked activity in telencephalic area pDp of juvenile and adult zebrafish, the homolog of piriform cortex. No obvious signatures of attractor dynamics were detected. Rather, olfactory discrimination training selectively enhanced the separation of neural manifolds representing task-relevant odors from other representations, consistent with predictions of autoassociative network models endowed with precise synaptic balance. Analytical approaches using the framework of manifold capacity revealed multiple geometrical modifications of representational manifolds that supported the classification of task-relevant sensory information. Manifold capacity predicted odor discrimination across individuals, indicating a close link between manifold geometry and behavior. Hence, pDp and possibly related recurrent networks store information in the geometry of representational manifolds, resulting in joint sensory and semantic maps that may support distributed learning processes.

A numerical method for scattering problems with unbounded interfaces

Tristan Goodwill, C. Epstein

We introduce a new class of computationally tractable scattering problems in unbounded domains, which we call decomposable problems. In these decomposable problems, the computational domain can be split into a finite collection of subdomains in which the scatterer has a "simple" structure. A subdomain is simple if the domain Green's function for this subdomain is either available analytically or can be computed numerically with arbitrary accuracy by a tractable method. These domain Green's functions are then used to reformulate the scattering problem as a system of boundary integral equations on the union of the subdomain boundaries. This reformulation gives a practical numerical method, as the resulting integral equations can then be solved, to any desired degree of accuracy, by using coordinate complexification over a finite interval, and standard discretization techniques.

A robust and versatile computational peptide design pipeline to inform wet-lab experiments

V. Mulligan, Tristan Zaborniak, Benjamin P. Brown, D. Renfrew

Since Merrifield’s development of solid-phase peptide synthesis, we have seen explosive growth in the number of synthetic building-blocks that can be incorporated into peptides. This has created a problem: the number of possible molecules that could be synthesized is many orders of magnitude greater than the largest conceivable combinatorial libraries. Computational design, based on combinatorial optimization algorithms, addresses this problem by proposing sequences likely to have desired folds and functions. These computational methods complement experiments by reducing astronomically large numbers of combinatorial possibilities to experimentally tractable shortlists. This presentation describes our robust, versatile methods, made available to peptide scientists in the Rosetta and Masala software suites, for designing peptides that fold into rigid conformations. Our physics-based methods generalize to exotic chemical building blocks poorly amenable to machine learning-based methods for want of training data. Our pipeline has produced experimentally-validated mixed-chirality peptides that bind to targets of therapeutic interest, and peptides that diffuse across cell membranes. Ongoing research is mapping the sequence optimization problem (which grows intractable even for supercomputers as the number of candidate chemical building blocks grows very large) to current and near-future quantum computers, allowing use of quantum algorithms in the context of the existing, widely-used design protocols.

On the construction of scattering matrices for irregular or elongated enclosures using Green’s representation formula

Carlos Borges, L. Greengard, Michael O'Neil, M. Rachh

Multiple scattering methods are widely used to reduce the computational complexity of acoustic or electromagnetic scattering problems when waves propagate through media containing many identical inclusions. Historically, this numerical technique has been limited to situations in which the inclusions (particles) can be covered by nonoverlapping disks in two dimensions or spheres in three dimensions. This allows for the use of separation of variables in cylindrical or spherical coordinates to represent the solution to the governing partial differential equation. Here, we provide a more flexible approach, applicable to a much larger class of geometries. We use a Green’s representation formula and the associated layer potentials to construct incoming and outgoing solutions on rectangular enclosures. The performance and flexibility of the resulting scattering operator formulation in two dimensions is demonstrated via several numerical examples for multi-particle scattering in free space as well as in layered media. The mathematical formalism extends directly to the three-dimensional case, and can easily be coupled with several commercial numerical PDE software packages.

Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence

B. Şimşek, Amire Bendjeddou, Daniel Hsu

This work focuses on the gradient flow dynamics of a neural network model that uses correlation loss to approximate a multi-index function on high-dimensional standard Gaussian data. Specifically, the multi-index function we consider is a sum of neurons $f^*(x) = \sum_{j=1}^k \sigma^*(v_j^T x)$, where $v_1, \dots, v_k$ are unit vectors and $\sigma^*$ lacks the first and second Hermite polynomials in its Hermite expansion. It is known that, for the single-index case ($k = 1$), overcoming the search phase requires polynomial time complexity. We first generalize this result to multi-index functions characterized by vectors in arbitrary directions. After the search phase, it is not clear whether the network neurons converge to the index vectors or get stuck at a sub-optimal solution. When the index vectors are orthogonal, we give a complete characterization of the fixed points and prove that neurons converge to the nearest index vectors. Therefore, using $n \asymp k \log k$ neurons ensures finding the full set of index vectors with gradient flow with high probability over random initialization. When $v_i^T v_j = \beta \geq 0$ for all $i \neq j$, we prove the existence of a sharp threshold $\beta_c = c/(c+k)$ at which the fixed point that computes the average of the index vectors transitions from a saddle point to a minimum. Numerical simulations show that using a correlation loss and a mild overparameterization suffices to learn all of the index vectors when they are nearly orthogonal; however, the correlation loss fails when the dot product between the index vectors exceeds a certain threshold.
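
The condition that $\sigma^*$ lacks the first and second Hermite polynomials can be checked numerically for a candidate activation. The sketch below computes coefficients in the probabilists' Hermite basis by Gauss-Hermite quadrature. The normalization $c_k = \mathbb{E}[\sigma(z)\,\mathrm{He}_k(z)]/k!$ for $z \sim N(0,1)$, the function name, and the example activation $\mathrm{He}_3(z) = z^3 - 3z$ are assumptions chosen for illustration, not taken from the paper.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

def hermite_coefficients(sigma, max_degree, n_nodes=64):
    """Coefficients c_k = E[sigma(z) He_k(z)] / k! for z ~ N(0, 1)."""
    nodes, weights = hermegauss(n_nodes)         # weight function exp(-z^2 / 2)
    weights = weights / np.sqrt(2.0 * np.pi)     # turn sums into Gaussian expectations
    coeffs = []
    for k in range(max_degree + 1):
        basis = np.zeros(k + 1)
        basis[k] = 1.0                           # selects He_k in hermeval
        he_k = hermeval(nodes, basis)
        coeffs.append(np.sum(weights * sigma(nodes) * he_k) / math.factorial(k))
    return np.array(coeffs)

# Example activation: the third Hermite polynomial, He_3(z) = z^3 - 3z.
def sigma_star(z):
    return z**3 - 3.0 * z

coeffs = hermite_coefficients(sigma_star, max_degree=4)
```

For this example the degree-1 and degree-2 coefficients vanish and the first nonzero coefficient sits at degree 3, which is the kind of activation the abstract describes.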

Efficient Implementation of the Random Phase Approximation with Domain-based Local Pair Natural Orbitals

Yu Hsuan Liang, Xing Zhang, G. K. Chan, T. Berkelbach, Hong-Zhou Ye

We present an efficient implementation of the random phase approximation (RPA) for molecular systems within the domain-based local pair natural orbital (DLPNO) framework. With optimized parameters, DLPNO-RPA achieves approximately 99.9% accuracy in the total correlation energy compared to a canonical implementation, enabling highly accurate reaction energies and potential energy surfaces to be computed while substantially reducing computational costs. As an application, we demonstrate the capability of DLPNO-RPA to efficiently calculate basis set-converged binding energies for a set of large molecules, with results showing excellent agreement with high-level reference data from both coupled cluster and diffusion Monte Carlo. This development paves the way for the routine use of RPA-based methods in molecular quantum chemistry.

Diabatic states of charge transfer with constrained charge equilibration

Sohang Kundu, Hong-Zhou Ye, T. Berkelbach

Charge transfer (CT) processes that are electronically non-adiabatic are ubiquitous in chemistry, biology, and materials science, but their theoretical description requires diabatic states or adiabatic excited states. For complex systems, these latter states are more difficult to calculate than the adiabatic ground state. Here, we propose a simple method to obtain diabatic states, including energies and charges, by constraining the atomic charges within the charge equilibration framework. For two-state systems, the exact diabatic coupling can be determined, from which the adiabatic excited-state energy can also be calculated. The method can be viewed as an affordable alternative to constrained density functional theory (CDFT), and so we call it constrained charge equilibration (CQEq). We test the CQEq method on the anthracene-tetracyanoethylene CT complex and the reductive decomposition of ethylene carbonate on a lithium metal surface. We find that CQEq predicts diabatic energies, charges, and adiabatic excitation energies in good agreement with CDFT, and we propose that CQEq is promising for combination with machine learning force fields to study non-adiabatic CT in the condensed phase.
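
The two-state statement above can be illustrated with the textbook 2x2 model. Assuming two orthonormal diabatic states with energies $E_1$, $E_2$ and a real coupling $V$, the adiabatic ground-state energy $E_0$ satisfies $(E_1 - E_0)(E_2 - E_0) = V^2$, and the trace gives the excited-state energy $E_1 + E_2 - E_0$. The sketch below applies these relations; it is not necessarily the procedure used in the paper, and the function name and numbers are hypothetical.

```python
import math

def two_state_coupling(e_dia_1, e_dia_2, e_ground):
    """Coupling magnitude and adiabatic excited-state energy of a 2x2 model.

    Assumes orthonormal diabatic states and a real coupling V, so that the
    Hamiltonian is [[E1, V], [V, E2]] with lowest eigenvalue e_ground.
    """
    gap_product = (e_dia_1 - e_ground) * (e_dia_2 - e_ground)
    if gap_product < 0.0:
        raise ValueError("ground-state energy must lie below both diabatic energies")
    coupling = math.sqrt(gap_product)            # |V| from the secular equation
    e_excited = e_dia_1 + e_dia_2 - e_ground     # trace of the 2x2 Hamiltonian
    return coupling, e_excited

# Consistency check: build the ground state from a known coupling, then recover it.
e1, e2, v_true = 0.0, 1.0, 0.5
e0 = 0.5 * (e1 + e2) - math.sqrt((0.5 * (e1 - e2)) ** 2 + v_true ** 2)
coupling, e_excited = two_state_coupling(e1, e2, e0)
```

Only the magnitude of the coupling is recovered this way; its sign is not determined by the energies alone.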
