Publications

An adaptive spectral method for oscillatory second-order linear ODEs with frequency-independent cost

We introduce an efficient numerical method for second-order linear ODEs whose solution may vary between highly oscillatory and slowly changing over the solution interval. In oscillatory regions the solution is generated via a nonoscillatory phase function that obeys the nonlinear Riccati equation. We propose a defect correction iteration that gives an asymptotic series for such a phase function; this is numerically approximated on a Chebyshev grid with a small number of nodes. For analytic coefficients we prove that each iteration, up to a certain maximum number, reduces the residual by a factor of order of the local frequency. The algorithm adapts both the stepsize and the choice of method, switching to a conventional spectral collocation method away from oscillatory regions. In numerical experiments we find that our proposal outperforms other state-of-the-art oscillatory solvers, most significantly at low to intermediate frequencies and at low tolerances, where it may use up to $10^6$ times fewer function evaluations. Even in high-frequency regimes, our implementation is on average 10 times faster than other specialized solvers.
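The link between the linear ODE and the Riccati equation can be made concrete with a short derivation. The abstract does not state the equation's form, so the sketch below assumes $u'' + 2\gamma(t)u' + \omega^2(t)u = 0$; the symbols $\gamma$, $\omega$, $x$, and $R$ are illustrative choices. The iteration shown is one plausible defect correction, obtained by linearizing the residual about $x_j$ and dropping the derivative of the correction. It is not necessarily the exact scheme used in the paper.

```latex
% Assumed model form (not stated in the abstract):
%   u'' + 2\gamma(t) u' + \omega^2(t) u = 0
\begin{align*}
u(t) &= \exp\!\left(\int^{t} x(s)\,ds\right)
  \;\Longrightarrow\; u' = x\,u, \quad u'' = (x' + x^2)\,u, \\
0 &= x' + x^2 + 2\gamma x + \omega^2 \qquad \text{(Riccati equation for } x\text{)}, \\
R[x] &:= x' + x^2 + 2\gamma x + \omega^2 \qquad \text{(residual)}, \\
x_{j+1} &= x_j - \frac{R[x_j]}{2\,(x_j + \gamma)}, \qquad x_0 = \pm i\omega .
\end{align*}
```

With $x_0 = \pm i\omega$ and $\gamma = 0$, the initial residual is $R[x_0] = \pm i\omega'$, which is small relative to $\omega^2$ when the frequency varies slowly. This matches the WKB-like regime in which an asymptotic series in inverse frequency is expected.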

Combining machine learning with structure-based protein design to predict and engineer post-translational modifications of proteins

Moritz Ertelt, V. Mulligan, et al.

Post-translational modifications (PTMs) of proteins play a vital role in their function and stability. These modifications influence protein folding, signaling, protein-protein interactions, enzyme activity, binding affinity, aggregation, degradation, and much more. To date, over 400 types of PTMs have been described, representing chemical diversity well beyond the genetically encoded amino acids. Such modifications pose a challenge to the successful design of proteins, but also represent a major opportunity to diversify the protein engineering toolbox. To this end, we first trained artificial neural networks (ANNs) to predict eighteen of the most abundant PTMs, including protein glycosylation, phosphorylation, methylation, and deamidation. In a second step, these models were implemented inside the computational protein modeling suite Rosetta, which allows flexible combination with existing protocols to model the modified sites and understand their impact on protein stability as well as function. Lastly, we developed a new design protocol that either maximizes or minimizes the predicted probability of a particular site being modified. We find that this combination of ANN prediction and structure-based design can enable the modification of existing, as well as the introduction of novel, PTMs. The potential applications of our work include, but are not limited to, glycan masking of epitopes, strengthening protein-protein interactions through phosphorylation, as well as protecting proteins from deamidation liabilities. These applications are especially important for the design of new protein therapeutics where PTMs can drastically change the therapeutic properties of a protein. Our work adds novel tools to Rosetta’s protein engineering toolbox that allow for the rational design of PTMs.

Nuclear instance segmentation and tracking for preimplantation mouse embryos

H. Nunley, Binglun Shao, Prateek Grover, A. Watters, S. Shvartsman, L. M. Brown, et al.

For investigations into fate specification and cell rearrangements in live images of preimplantation embryos, automated and accurate 3D instance segmentation of nuclei is invaluable; however, the performance of segmentation methods is limited by the images' low signal-to-noise ratio and high voxel anisotropy and the nuclei's dense packing and variable shapes. Supervised machine learning approaches have the potential to radically improve segmentation accuracy but are hampered by a lack of fully annotated 3D data. In this work, we first establish a novel mouse line expressing near-infrared nuclear reporter H2B-miRFP720. H2B-miRFP720 is the longest wavelength nuclear reporter in mice and can be imaged simultaneously with other reporters with minimal overlap. We then generate a dataset, which we call BlastoSPIM, of 3D microscopy images of H2B-miRFP720-expressing embryos with ground truth for nuclear instance segmentation. Using BlastoSPIM, we benchmark the performance of five convolutional neural networks and identify Stardist-3D as the most accurate instance segmentation method across preimplantation development. Stardist-3D, trained on BlastoSPIM, performs robustly up to the end of preimplantation development (>100 nuclei) and enables studies of fate patterning in the late blastocyst. We then demonstrate BlastoSPIM's usefulness as pretraining data for related problems. BlastoSPIM and its corresponding Stardist-3D models are available at: blastospim.flatironinstitute.org.

To be or not to be: orb, the fusome and oocyte specification in Drosophila

J. I. Alsous, S. Shvartsman, et al.

In the fruit fly Drosophila melanogaster, two cells in a cyst of 16 interconnected cells have the potential to become the oocyte, but only one of these will assume an oocyte fate as the cysts transition through regions 2a and 2b of the germarium. The mechanism of specification depends on a polarized microtubule network, a dynein dependent Egl:BicD mRNA cargo complex, a special membranous structure called the fusome and its associated proteins, and the translational regulator orb. In this work, we have investigated the role of orb and the fusome in oocyte specification. We show here that specification is a stepwise process. Initially, orb mRNAs accumulate in the two pro-oocytes in close association with the fusome. This association is accompanied by the activation of the orb autoregulatory loop, generating high levels of Orb. Subsequently, orb mRNAs become enriched in only one of the pro-oocytes, the presumptive oocyte, and this is followed, with a delay, by Orb localization to the oocyte. We find that fusome association of orb mRNAs is essential for oocyte specification in the germarium, is mediated by the orb 3′ UTR, and requires Orb protein. We also show that the microtubule minus end binding protein Patronin functions downstream of orb in oocyte specification. Finally, in contrast to a previously proposed model for oocyte selection, we find that the choice of which pro-oocyte becomes the oocyte does not seem to be predetermined by the amount of fusome material in these two cells, but instead depends upon a competition for orb gene products.

Efficient tensor network simulation of IBM’s Eagle kicked Ising experiment

J. Tindall, M. Fishman, M. Stoudenmire, D. Sels

We report an accurate and efficient classical simulation of a kicked Ising quantum system on the heavy hexagon lattice. A simulation of this system was recently performed on a 127-qubit quantum processor using noise-mitigation techniques to enhance accuracy [Y. Kim et al., Nature, 618, 500–5 (2023)]. Here we show that, by adopting a tensor network approach that reflects the geometry of the lattice and is approximately contracted using belief propagation, we can perform a classical simulation that is significantly more accurate and precise than the results obtained from the quantum processor and many other classical methods. We quantify the treelike correlations of the wave function in order to explain the accuracy of our belief propagation-based approach. We also show how our method allows us to perform simulations of the system to long times in the thermodynamic limit, corresponding to a quantum computer with an infinite number of qubits. Our tensor network approach has broader applications for simulating the dynamics of quantum systems with treelike correlations.

Nested R̂: Assessing the convergence of Markov chain Monte Carlo when running many short chains

C. Margossian, Matthew D. Hoffman, Pavel Sountsov, Lionel Riou-Durand, Aki Vehtari, Andrew Gelman

Recent developments in Markov chain Monte Carlo (MCMC) algorithms allow us to run thousands of chains in parallel almost as quickly as a single chain, using hardware accelerators such as GPUs. While each chain still needs to forget its initial point during a warmup phase, the subsequent sampling phase can be shorter than in classical settings, where we run only a few chains. To determine if the resulting short chains are reliable, we need to assess how close the Markov chains are to their stationary distribution after warmup. The potential scale reduction factor R̂ is a popular convergence diagnostic but unfortunately can require a long sampling phase to work well. We present a nested design to overcome this challenge and a generalization called nested R̂. This new diagnostic works under conditions similar to R̂ and completes the workflow for GPU-friendly samplers. In addition, the proposed nesting provides theoretical insights into the utility of R̂, in both classical and short-chains regimes.
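To illustrate the nested design, here is a minimal Python sketch of the classical R̂ and one common formulation of a nested variant. It assumes draws arranged as K superchains of M chains with N samples each, with M ≥ 2 and N ≥ 2. The function names, array layout, and the specific estimator (between-superchain variance over within-superchain variance) are assumptions for illustration; the paper's exact definition, including its handling of edge cases, may differ.

```python
import numpy as np

def rhat(draws):
    """Classical potential scale reduction factor.

    draws: array of shape (n_chains, n_samples) for one scalar quantity.
    """
    n_samples = draws.shape[1]
    within = draws.var(axis=1, ddof=1).mean()    # mean within-chain variance
    between = draws.mean(axis=1).var(ddof=1)     # variance of the chain means
    var_hat = (n_samples - 1) / n_samples * within + between
    return np.sqrt(var_hat / within)

def nested_rhat(draws):
    """Nested variant (assumed form): sqrt(1 + B / W).

    draws: array of shape (K, M, N) = (superchains, chains, samples).
    """
    chain_means = draws.mean(axis=2)             # shape (K, M)
    super_means = chain_means.mean(axis=1)       # shape (K,)
    # Between-superchain variance of the superchain means.
    between_super = super_means.var(ddof=1)
    # Within-superchain variance: spread of the chain means inside each
    # superchain plus the average within-chain variance.
    between_chains = chain_means.var(axis=1, ddof=1)
    within_chains = draws.var(axis=2, ddof=1).mean(axis=1)
    within_super = (between_chains + within_chains).mean()
    return np.sqrt(1.0 + between_super / within_super)
```

In this sketch the chains inside a superchain are meant to share an initial point, so a large spread between superchain means relative to the spread within superchains signals that the chains have not yet forgotten their initialization, even when each chain contributes only a few draws.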

Protein dynamics underlying allosteric regulation

Miro A. Astore, Akshada S. Pradhan, E. Thiede, S. Hanson

Allostery is the mechanism by which information and control are propagated in biomolecules. It regulates ligand binding, chemical reactions, and conformational changes. An increasing level of experimental resolution and control over allosteric mechanisms promises a deeper understanding of the molecular basis for life and powerful new therapeutics. In this review, we survey the literature for an up-to-date biological and theoretical understanding of protein allostery. By delineating five ways in which the energy landscape or the kinetics of a system may change to give rise to allostery, we aim to help the reader grasp its physical origins. To illustrate this framework, we examine three systems that display these forms of allostery: allosteric inhibitors of beta-lactamases, thermosensation of TRP channels, and the role of kinetic allostery in the function of kinases. Finally, we summarize the growing power of computational tools available to investigate the different forms of allostery presented in this review.

Modulation of Aβ 16–22 aggregation by glucose

Meenal Jain, A. Sahoo, Silvina Matysiak

The self-assembly of amyloid-beta (Aβ) peptides into fibrillar structures in the brain is a signature of Alzheimer's disease. Recent studies have reported correlations between Alzheimer's disease and type-2 diabetes. Structurally, hyperglycemia induces covalent protein crosslinkings by advanced glycation end products (AGE), which can affect the stability of Aβ oligomers. In this work, we leverage physics-based coarse-grained molecular simulations to probe alternate thermodynamic pathways that affect peptide aggregation propensities at varying concentrations of glucose molecules. Similar to previous experimental reports, our simulations show a glucose concentration-dependent increase in Aβ aggregation rates, without changes in the overall secondary structure content. We discovered that glucose molecules prefer partitioning onto the aggregate–water interface at a specific orientation, resulting in a loss of molecular rotational entropy. This effectively hastens the aggregation rates, as peptide self-assembly can reduce the available surface area for peptide–glucose interactions. This work introduces a new thermodynamic-driven pathway, beyond chemical cross-linking, that can modulate Aβ aggregation.

Uniform approximation of common Gaussian process kernels using equispaced Fourier grids

A. Barnett, Philip Greengard, M. Rachh

The high efficiency of a recently proposed method for computing with Gaussian processes relies on expanding a (translationally invariant) covariance kernel into complex exponentials, with frequencies lying on a Cartesian equispaced grid. Here we provide rigorous error bounds for this approximation for two popular kernels—Matérn and squared exponential—in terms of the grid spacing and size. The kernel error bounds are uniform over a hypercube centered at the origin. Our tools include a split into aliasing and truncation errors, and bounds on sums of Gaussians or modified Bessel functions over various lattices. For the Matérn case, motivated by numerical study, we conjecture a stronger Frobenius-norm bound on the covariance matrix error for randomly-distributed data points. Lastly, we prove bounds on, and study numerically, the ill-conditioning of the linear systems arising in such regression problems.
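The kernel expansion described above can be sketched in one dimension for the squared-exponential kernel. The example below assumes the kernel $k(x) = \exp(-x^2/(2\ell^2))$ and the Fourier convention $k(x) = \int \hat{k}(\xi) e^{2\pi i \xi x}\,d\xi$, and approximates the integral by an equispaced trapezoid sum with spacing `h` and `2m + 1` nodes. The function names and parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def se_kernel(x, ell):
    """Squared-exponential kernel k(x) = exp(-x^2 / (2 ell^2))."""
    return np.exp(-x**2 / (2 * ell**2))

def se_kernel_hat(xi, ell):
    """Fourier transform of se_kernel under the e^{2 pi i xi x} convention."""
    return ell * np.sqrt(2 * np.pi) * np.exp(-2 * np.pi**2 * ell**2 * xi**2)

def fourier_kernel_approx(x, ell, h, m):
    """Approximate k(x) by h * sum_{|n| <= m} k_hat(n h) exp(2 pi i n h x)."""
    idx = np.arange(-m, m + 1)
    xi = h * idx                                  # equispaced frequency grid
    weights = h * se_kernel_hat(xi, ell)          # trapezoid-rule weights
    phases = np.exp(2j * np.pi * np.outer(x, xi)) # shape (len(x), 2m + 1)
    return (phases @ weights).real

# Example parameters: length scale 0.1, targets in [-1, 1].
x = np.linspace(-1.0, 1.0, 201)
approx = fourier_kernel_approx(x, ell=0.1, h=0.25, m=60)
max_error = np.max(np.abs(approx - se_kernel(x, 0.1)))
```

The two error sources named in the abstract are visible here. The sum is periodic with period `1/h`, so a smaller `h` pushes the aliased copies of the kernel further from the target interval, while a larger `m` reduces the truncation of the Gaussian tail in frequency.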

Decomposing imaginary time Feynman diagrams using separable basis functions: Anderson impurity model strong coupling expansion

J. Kaye, H. Strand, D. Golez

We present a deterministic algorithm for the efficient evaluation of imaginary time diagrams based on the recently introduced discrete Lehmann representation (DLR) of imaginary time Green's functions. In addition to the efficient discretization of diagrammatic integrals afforded by its approximation properties, the DLR basis is separable in imaginary time, allowing us to decompose diagrams into linear combinations of nested sequences of one-dimensional products and convolutions. Focusing on the strong coupling bold-line expansion of generalized Anderson impurity models, we show that our strategy reduces the computational complexity of evaluating an $M$th-order diagram at inverse temperature $\beta$ and spectral width $\omega_{\max}$ from $\mathcal{O}((\beta \omega_{\max})^{2M-1})$ for a direct quadrature to $\mathcal{O}(M (\log (\beta \omega_{\max}))^{M+1})$, with controllable high-order accuracy. We benchmark our algorithm using third-order expansions for multi-band impurity problems with off-diagonal hybridization and spin-orbit coupling, presenting comparisons with exact diagonalization and quantum Monte Carlo approaches. In particular, we perform a self-consistent dynamical mean-field theory calculation for a three-band Hubbard model with strong spin-orbit coupling representing a minimal model of Ca$_2$RuO$_4$, demonstrating the promise of the method for modeling realistic strongly correlated multi-band materials. For both strong and weak coupling expansions of low and intermediate order, in which diagrams can be enumerated, our method provides an efficient, straightforward, and robust black-box evaluation procedure. In this sense, it fills a gap between diagrammatic approximations of the lowest order, which are simple and inexpensive but inaccurate, and those based on Monte Carlo sampling of high-order diagrams.
