2573 Publications

Improved statistical and computational complexity of the mean-field Langevin dynamics under structured data

Atsushi Nitanda, Kazusato Oko, Taiji Suzuki, D. Wu

Recent works have shown that neural networks optimized by gradient-based methods can adapt to sparse or low-dimensional target functions through feature learning; an often-studied target is the sparse parity function on the unit hypercube. However, such an isotropic data setting does not capture the anisotropy and low intrinsic dimensionality exhibited in realistic datasets. In this work, we address this shortcoming by studying how gradient-based feature learning interacts with structured (anisotropic) input data: we consider the classification of k-sparse parity on a high-dimensional orthotope where the feature coordinates have varying magnitudes, and analyze the learning complexity of the mean-field Langevin dynamics (MFLD), which describes the noisy gradient descent update on a two-layer neural network. We show that the statistical complexity (i.e., sample size) and computational complexity (i.e., network width) of MFLD can both be improved when prominent directions of the anisotropic input data align with the support of the target function. Moreover, by employing a coordinate transform determined by the gradient covariance, the width can be made independent of the target degree. Lastly, we demonstrate the benefit of feature learning by establishing a kernel lower bound on the classification error, which applies to neural networks in the lazy regime.
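
As a rough illustration of the data setting described in this abstract, the following minimal sketch draws points from an orthotope (a hypercube with per-coordinate scales) and labels them with a sparse parity. The function name `sample_sparse_parity`, the choice of the first k coordinates as the support, and the specific scale values are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np

def sample_sparse_parity(n, d, k, scales, rng):
    """Draw n points from an orthotope and label them with a k-sparse parity.

    Illustrative assumptions (not from the paper): the target support is the
    first k coordinates, and coordinate i is +/- scales[i] with equal probability.
    """
    signs = rng.choice([-1.0, 1.0], size=(n, d))  # uniform hypercube vertices
    x = signs * scales                            # rescale each axis -> orthotope
    y = np.prod(signs[:, :k], axis=1)             # parity of the k support signs
    return x, y

rng = np.random.default_rng(0)
d, k = 20, 3
scales = np.ones(d)
scales[:k] = 4.0  # hypothetical: prominent directions aligned with the support
x, y = sample_sparse_parity(1000, d, k, scales, rng)
```

Setting every entry of `scales` to 1 recovers the isotropic hypercube case that the abstract contrasts against.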


Self-organized dynamics of a viscous drop with interfacial nematic activity

M. Firouznia, David Saintillan

We study emergent dynamics in a viscous drop subject to interfacial nematic activity. Using hydrodynamic simulations, we show how the interplay of nematodynamics, activity-driven flows and surface deformations gives rise to a sequence of self-organized behaviors of increasing complexity, from periodic braiding motions of topological defects to chaotic defect dynamics and active turbulence, along with spontaneous shape changes and translation. Our findings recapitulate qualitative features of experiments and shed light on the mechanisms underpinning morphological dynamics in active interfaces.

April 17, 2024

Multiscale simulations of molecular recognition by phase separated MUT-16: A scaffolding protein of Mutator foci

Kumar Gaurav, Virginia Busetto, S. Hanson

Biomolecular recruitment by phase separated condensates has emerged as a key organising principle of biological processes. One such process is the RNA silencing pathway, which regulates gene expression and genomic defense against foreign nucleic acids. In C. elegans, this pathway involves siRNA amplification at perinuclear germ granules named Mutator foci. The formation of Mutator foci depends on the phase separation of MUT-16, acting as a scaffolding protein to recruit other components of the Mutator complex. Earlier studies have indicated a crucial role for an exoribonuclease, MUT-7, in RNA silencing. The recruitment of MUT-7 to Mutator foci is facilitated by a bridging protein, MUT-8. However, how MUT-8 binds to MUT-16 remains elusive. We resolved the molecular drivers of MUT-16 phase separation and the recruitment of MUT-8 using multi-scale molecular dynamics simulations and in vitro experiments. Residue-level coarse-grained simulations predicted the relative phase separation propensities of MUT-16 disordered regions, which we validated by experiments.

Coarse-grained simulations at residue level and near-atomic resolution also indicated the essential role of aromatic amino acids (Tyr and Phe) in MUT-16 phase separation. Furthermore, coarse-grained and atomistic simulations of the MUT-8 N-terminal prion-like domain with the phase-separated MUT-16 condensate revealed the importance of cation-π interactions between Tyr residues of MUT-8 and Arg/Lys residues of MUT-16. By re-introducing atomistic detail to condensates from coarse-grained and 350 µs all-atom simulations in explicit solvent on Folding@Home, we demonstrate that Arg-Tyr interactions surpass the strength of Lys-Tyr interactions in the recruitment of MUT-8. The atomistic simulations show that the planar guanidinium group of Arg also engages in sp2-π interactions and hydrogen bonds with the Tyr residues; these additional favorable contacts are missing in the Lys-Tyr interactions. In agreement with simulations, the mutation of seven Arg residues in MUT-16 to Lys and Ala weakens MUT-8 binding in vitro.

April 15, 2024

Efficient convergent boundary integral methods for slender bodies

The interaction of fibers in a viscous (Stokes) fluid plays a crucial role in industrial and biological processes, such as sedimentation, rheology, transport, cell division, and locomotion. Numerical simulations generally rely on slender body theory (SBT), an asymptotic, nonconvergent approximation whose error blows up as fibers approach each other. Yet convergent boundary integral equation (BIE) methods which completely resolve the fiber surface have so far been impractical due to the prohibitive cost of layer-potential quadratures in such high aspect-ratio 3D geometries. We present a high-order Nyström quadrature scheme with aspect-ratio independent cost, making such BIEs practical. It combines centerline panels (each with a small number of poloidal Fourier modes), toroidal Green's functions, generalized Chebyshev quadratures, HPC parallel implementation, and FMM acceleration. We also present new BIE formulations for slender bodies that lead to well conditioned linear systems upon discretization. We test Laplace and Stokes Dirichlet problems, and Stokes mobility problems, for slender rigid closed fibers with (possibly varying) circular cross-section, at separations down to 1/20 of the slender radius, reporting convergence typically to at least 10 digits. We use this to quantify the breakdown of numerical SBT for close-to-touching rigid fibers. We also apply the methods to time-step the sedimentation of 512 loops with up to 1.65 million unknowns at around 7 digits of accuracy.


Design of Coiled-Coil Protein Nanostructures for Therapeutics and Drug Delivery

D. Renfrew, et al.

Coiled-coil protein motifs have become widely employed in the design of biomaterials. Some of these designs have been studied for use in drug delivery due to the unique ability of coiled-coils to impart stability, oligomerization, and supramolecular assembly. To leverage these properties and improve drug delivery, release, and targeting, a variety of nano- to mesoscale architectures have been adopted. Coiled-coil drug delivery and therapeutics have been developed by using the coiled-coil alone, designing for higher-order assemblies such as fibers and hydrogels, and combining coiled-coil proteins with other biocompatible structures such as lipids and polymers. We review the recent development of these structures and the design criteria used to generate functional proteins of varying sizes and morphologies.


Deep Learning Sequence Models for Transcriptional Regulation

Deciphering the regulatory code of gene expression and interpreting the transcriptional effects of genome variation are critical challenges in human genetics. Modern experimental technologies have resulted in an abundance of data, enabling the development of sequence-based deep learning models that link patterns embedded in DNA to the biochemical and regulatory properties contributing to transcriptional regulation, including modeling epigenetic marks, 3D genome organization, and gene expression, with tissue and cell-type specificity. Such methods can predict the functional consequences of any noncoding variant in the human genome, even rare or never-before-observed variants, and systematically characterize their consequences beyond what is tractable from experiments or quantitative genetics studies alone. Recently, the development and application of interpretability approaches have led to the identification of key sequence patterns contributing to the predicted tasks, providing insights into the underlying biological mechanisms learned and revealing opportunities for improvement in future models.


The Hund-metal path to strong electronic correlations

Different families of materials follow distinct routes to strong-correlation physics. In so-called Mott insulator systems, the Coulomb repulsion of electrons impedes their motion and blocks their kinetic energy. Materials in the heavy-fermion family have two fluids of electrons, which live rather independent lives at high temperatures: mobile electrons and localized f electrons that form local magnetic moments. At very low temperatures, the hybridization, or quantum mechanical mixing, between those two species of electrons becomes relevant. Then a single fluid of itinerant, albeit slowly moving, “heavy” electronic quasiparticles emerges below a characteristic scale known as the Kondo temperature.
April 1, 2024

Galaxy clustering analysis with SimBIG and the wavelet scattering transform

B. Régaldo-Saint Blancard, ChangHoon Hahn, Shirley Ho, Jiamin Hou, Pablo Lemos, Elena Massara, C. Modi, Azadeh Moradinezhad Dizgah, Liam Parker, Y. Yao, M. Eickenberg

The non-Gaussian spatial distribution of galaxies traces the large-scale structure of the Universe and therefore constitutes a prime observable to constrain cosmological parameters. We conduct Bayesian inference of the ΛCDM parameters Ωm, Ωb, h, ns, and σ8 from the Baryon Oscillation Spectroscopic Survey CMASS galaxy sample by combining the wavelet scattering transform (WST) with a simulation-based inference approach enabled by the SimBIG forward model. We design a set of reduced WST statistics that leverage symmetries of redshift-space data. Posterior distributions are estimated with a conditional normalizing flow trained on 20,000 simulated SimBIG galaxy catalogs with survey realism. We assess the accuracy of the posterior estimates using simulation-based calibration and quantify generalization and robustness to the change of forward model using a suite of 2,000 test simulations. When probing scales down to kmax = 0.5 h/Mpc, we are able to derive accurate posterior estimates that are robust to the change of forward model for all parameters except σ8. We mitigate the robustness issues with σ8 by removing the WST coefficients that probe scales smaller than k ∼ 0.3 h/Mpc. Applied to the Baryon Oscillation Spectroscopic Survey CMASS sample, our WST analysis yields seemingly improved constraints over those obtained from a standard perturbation-theory-based power spectrum analysis with kmax = 0.25 h/Mpc for all parameters except h. However, we still raise concerns about these results. The observational predictions vary significantly across different normalizing flow architectures, which we interpret as a form of model misspecification. This highlights a key challenge for forward modeling approaches when using summary statistics that are sensitive to detailed model-specific or observational imprints on galaxy clustering.
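
The simulation-based calibration check mentioned in this abstract can be illustrated on a toy problem. The sketch below is not the SimBIG pipeline: it swaps the normalizing-flow posterior for the exact posterior of a one-dimensional conjugate Gaussian model (prior N(0, 1), likelihood N(θ, 1)), and the function name `sbc_ranks` and all sample sizes are assumptions chosen for illustration. For a well-calibrated posterior, the rank of the true parameter among posterior draws should be roughly uniform.

```python
import numpy as np

def sbc_ranks(n_sims, n_post, rng):
    """Rank statistics for simulation-based calibration on a toy Gaussian model.

    Toy assumptions (not from the paper): theta ~ N(0, 1), x | theta ~ N(theta, 1),
    so the exact posterior is N(x / 2, 1 / 2) and stands in for a learned estimator.
    """
    ranks = np.empty(n_sims, dtype=int)
    for i in range(n_sims):
        theta = rng.normal(0.0, 1.0)                 # draw a parameter from the prior
        x_obs = rng.normal(theta, 1.0)               # simulate one observation
        post = rng.normal(x_obs / 2.0, np.sqrt(0.5), size=n_post)  # posterior draws
        ranks[i] = np.sum(post < theta)              # rank of the truth, in 0..n_post
    return ranks

rng = np.random.default_rng(0)
n_post = 99
ranks = sbc_ranks(2000, n_post, rng)
# Histogram over the n_post + 1 possible ranks; roughly flat when calibrated.
counts = np.bincount(ranks, minlength=n_post + 1)
```

A posterior that is too narrow would pile ranks up near 0 and `n_post`, while one that is too wide would concentrate them near the middle.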


Promoter and Gene-Body RNA-Polymerase II co-exist in partial demixed condensates

Arya Changiarath, Jasper J. Michels, S. Hanson

In cells, transcription is tightly regulated on multiple layers. The condensation of the transcription machinery into distinct phases is hypothesized to spatio-temporally fine-tune RNA polymerase II behaviour during two key stages: transcription initiation and the elongation of the nascent RNA transcripts. However, it has remained unclear whether these phases would mix when present at the same time or remain distinct chemical environments, either as multi-phase condensates or by forming entirely separate condensates. Here we combine particle-based multi-scale simulations and experiments in the model organism C. elegans to characterise the biophysical properties of RNA polymerase II condensates. Both simulations and the in vivo work describe a lower critical solution temperature (LCST) behaviour of RNA polymerase II, with condensates dissolving at lower temperatures whereas higher temperatures promote condensate stability, which highlights that these condensates are physicochemically distinct from heterochromatin condensates. The LCST behaviour of the disordered C-terminal domain of RNA polymerase II (CTD) correlates with gradual shifts in the transcription program but is largely uncoupled from the classical stress response. Expanding the simulations, we model how the degree of phosphorylation of the CTD, which is characteristic for each step of transcription, controls the existence and morphology of multi-phasic condensates. We show that the two phases putatively underpinning the initiation of transcription and transcription elongation constitute distinct chemical environments and are in agreement with RNA polymerase II condensates observed in C. elegans embryos by super-resolution microscopy. Our analysis shows how, depending on its post transcriptional modifications and its interaction partner, a single protein can form multiple partially engulfed condensates, potentially promoting the selective recruitment of additional factors to these two phases.

March 27, 2024