2697 Publications

Electrohydrodynamic drift of a drop away from an insulating wall

Diptendu Sen, M. Firouznia, Jeremy Koch, et al.

An isolated charge-neutral drop suspended in an unbounded medium does not migrate in a uniform dc electric field. A nearby wall breaks the symmetry and causes the drop to drift towards or away from the boundary, depending on the electric properties of the fluids and the wall. In the case of an electrically insulating wall and an electric field applied tangentially to the wall, the interaction of the drop with its electrostatic image gives rise to repulsion by the wall. However, the electrohydrodynamic flow causes either repulsion for a drop with R/P < 1 or attraction for R/P > 1. We experimentally measure droplet trajectories and quantify the wall-induced electrohydrodynamic lift in the case R/P < 1. The results show that the lateral migration of a drop in a uniform electric field applied parallel to an insulating wall is dominated by the long-range flow due to the image stresslet.

Integrated single-cell multiome analysis reveals muscle fiber-type gene regulatory circuitry modulated by endurance exercise

Aliza B. Rubenstein, X. Chen, O. Troyanskaya, et al.

Endurance exercise induces multisystem adaptations that improve performance and benefit health. Gene regulatory circuit responses within individual skeletal muscle cell types, which are key mediators of exercise effects, have not been studied. Here, we map transcriptome, chromatin, and regulatory circuit responses to acute endurance exercise in muscle using same-cell RNA-seq/ATAC-seq multiome assays. High-quality data were obtained from 37,154 nuclei comprising 14 cell types in vastus lateralis samples collected before and 3.5 h after either 40 min cycling exercise at 70% VO2max or 40 min supine rest. Both shared and cell-type-specific regulatory programs were identified. Differential gene expression and accessibility sites are largely distinct within nuclei for each cell type and muscle fiber, with the largest numbers of regulatory events observed in the three muscle fiber types (slow, fast, and intermediate) and lumican (LUM)-expressing fibro-adipogenic progenitor cells. Single-cell regulatory circuit triad reconstruction (transcription factor, chromatin interaction site, regulated gene) also identifies largely distinct gene regulatory circuits modulated by exercise in the three muscle fiber types and LUM-expressing fibro-adipogenic progenitor cells, involving a total of 328 transcription factors acting at chromatin sites regulating 2025 genes. This web-accessible single-cell data set and regulatory circuitry map serve as a resource for understanding the molecular underpinnings of the metabolic and physiological effects of exercise and for guiding interpretation of the exercise response literature in bulk tissue.

Learning Compositional Functions with Transformers from Easy-to-Hard Data

Zixuan Wang, Eshaan Nichani, A. Bietti, Alex Damian, Daniel Hsu, Jason D. Lee, D. Wu

Transformer-based language models have demonstrated impressive capabilities across a range of complex reasoning tasks. Prior theoretical work exploring the expressive power of transformers has shown that they can efficiently perform multi-step reasoning tasks involving parallelizable computations. However, the learnability of such constructions, particularly the conditions on the data distribution that enable efficient learning via SGD, remains an open question. Towards answering this question, we study the learnability of a task called the $k$-fold composition, which requires computing an interleaved composition of $k$ input permutations and $k$ hidden permutations, and can be expressed by a transformer with $O(\log k)$ layers. On the negative front, we provide a Statistical Query lower bound showing that any learner which is trained on samples from the $k$-fold composition task and makes polynomially many queries must have sample size exponential in $k$, thus establishing a statistical-computational gap. On the other hand, we show that this function class can be efficiently learned, with runtime and sample complexity polynomial in $k$, by gradient descent on an $O(\log k)$-depth transformer via two different curriculum learning strategies: one in which data consists of $k'$-fold composition functions with $k' \le k$ presented in increasing order of difficulty, and another in which all data is presented simultaneously. Our work sheds light on the necessity and sufficiency of having both easy and hard examples in the data distribution for transformers to learn complex compositional tasks.
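
As a rough illustration of what an interleaved composition of $k$ input permutations and $k$ hidden permutations could look like, the sketch below applies them alternately to a starting element. The interleaving order (input permutation first, then hidden permutation, repeated $k$ times), the list-based encoding of permutations, and the function name `k_fold_composition` are assumptions made for this example; the abstract does not specify them.

```python
def k_fold_composition(x, input_perms, hidden_perms):
    """Apply sigma_1, pi_1, ..., sigma_k, pi_k to x in turn.

    Each permutation is a list p where p[i] is the image of i.
    The alternation order is an assumption, not taken from the paper.
    """
    if len(input_perms) != len(hidden_perms):
        raise ValueError("need the same number of input and hidden permutations")
    for sigma, pi in zip(input_perms, hidden_perms):
        x = pi[sigma[x]]  # input permutation first, then the hidden one
    return x


# Hypothetical example on 3 elements with k = 2.
sigma = [1, 2, 0]      # 0 -> 1, 1 -> 2, 2 -> 0
identity = [0, 1, 2]
result = k_fold_composition(0, [sigma, sigma], [identity, identity])
```

With identity hidden permutations, the result is simply the input permutation applied twice, which makes the alternation easy to check by hand.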

Variations in neuronal selectivity create efficient representational geometries for perception

Our visual capabilities depend on neural response properties in visual areas of our brains. Neurons exhibit a wide variety of selective response properties, but the reasons for this diversity are unknown. Here, we related the distribution of neuronal tuning properties to the information capacity of the population. Our results from theory, simulations, and analysis of recordings from macaque primary visual cortex (V1) reveal that diversity of amplitude and bandwidth drive complementary changes to the representational geometry of a population. Amplitude diversity pushes the centers of the representations further apart, whereas bandwidth heterogeneity decorrelates the center locations. These geometric changes separate out representations for distinct stimuli, creating more efficient encoding. We study how both types of diversity affect the population code for two different perceptual tasks: discrimination and identification. While both types of diversity improve encoding for both tasks, their distinct impacts on geometry make each more beneficial for one of the two tasks. Amplitude diversity impacts coding efficiency more for discrimination than it does for identification, while bandwidth diversity has a stronger impact on identification. These complementary effects indicate the importance of both types of diversity for perception. Finally, because tuning diversity exists across species and brain areas, our results suggest a fundamental neural coding strategy that may be applicable to a wide range of behavior.

Sequestration of ribosome biogenesis factors in HSV-1 nuclear aggregates revealed by spatially resolved thermal profiling

Peter J. Metzger, Tavis J. Reed, O. Troyanskaya

Viruses exploit host cell reliance on compartmentalization to facilitate their replication. Herpes simplex virus type 1 (HSV-1) modulates the subcellular localization of host proteins to suppress immune activation, license viral gene expression, and achieve translational shutoff. To spatially resolve dynamic protein-protein interaction (PPI) networks during infection with an immunostimulatory HSV-1 strain, we integrated nuclear/cytoplasmic fractionation with thermal proximity coaggregation analysis (N/C-TPCA). The resulting expanded depth and spatial resolution of PPIs charted compartment-specific assemblies of protein complexes throughout infection. We find that a broader suite of host chaperones than previously anticipated exhibits nuclear recruitment to form condensates known as virus-induced chaperone-enriched (VICE) domains. Monitoring protein and RNA constituents and ribosome activity, we establish that VICE domains sequester ribosome biogenesis factors from ribosomal RNA, accompanying a cell-wide defect in ribosome supply. These findings highlight infection-driven VICE domains as nodes of translational remodeling and demonstrate the utility of N/C-TPCA to study dynamic biological contexts.

A live-cell biosensor of in vivo receptor tyrosine kinase activity reveals feedback regulation of a developmental gradient

Emily K. Ho, Rebecca P. Kim-Yip, S. Shvartsman, et al.

A lack of tools for detecting receptor activity in vivo has limited our ability to fully explore receptor-level control of developmental patterning. Here, we extend phospho-tyrosine tag (pYtag) biosensors to visualize endogenous receptor tyrosine kinase (RTK) activity in Drosophila. We build biosensors for three RTKs that function across developmental stages and tissues. By characterizing Torso::pYtag during embryonic terminal patterning, we find that Torso activity differs from downstream extracellular signal-regulated kinase (ERK) activity in two surprising ways: Torso activity is narrowly restricted to the poles but produces a broader gradient of ERK and decreases over developmental time, while ERK activity is sustained, an effect mediated by ERK pathway-dependent negative feedback. Our results suggest that a narrow domain of Torso activity, tuned in amplitude by negative feedback, locally activates signaling effectors, which diffuse through the syncytial embryo to form the ERK gradient. Altogether, the results of this work highlight the usefulness of pYtags for investigating receptor-level regulation of developmental patterning.

Microtubules in Martini: Parameterizing a heterogeneous elastic-network towards a mechanically accurate microtubule

Microtubules are essential cytoskeletal filaments involved in cell motility, division, and intracellular transport, exhibiting complex structural dynamics governed by diverse biophysical factors. Atomistic simulations of microtubule assemblies remain challenging due to their extensive spatiotemporal scales. To address this, we present a multiscale approach combining the primarily top-down Martini 3 coarse-grained (CG) model with an appropriately parameterized heterogeneous elastic network to capture microtubule mechanics and molecular detail efficiently. By iteratively tuning the elastic network, we matched the structural fluctuations of CG heterodimeric building blocks to atomistic reference data, reproducing experimentally consistent mechanical properties. This framework helped us identify stabilizing long-lived interactions between charged C-terminal tails and the folded domain of neighboring tubulin subunits, offering insight into sequence-specific contributions to lattice stability. Our efforts culminated in the construction of a 200 nm microtubule composed of million interaction centers, enabling exploration of large-scale microtubule-associated processes with amino acid-level resolution. This work bridges the gap between molecular specificity and computational scalability, offering a platform for simulating biophysical processes across cellular length and time scales.

RocketSHP: Ultra-fast Proteome-scale Prediction of Protein Dynamics

Proteins are dynamic molecules that depend on conformational flexibility to carry out functions in the cell, yet despite significant advances in the modeling of static protein structure, prediction of these dynamics remains challenging. We introduce RocketSHP, a machine learning model that predicts dynamic protein properties from sequence or static structure with unprecedented speed and accuracy. Trained on thousands of molecular dynamics trajectories spanning diverse protein families, RocketSHP simultaneously models multiple dynamics features: root-mean-square fluctuations (RMSF), generalized correlation coefficients (GCC-LMI), and a novel structural heterogeneity profile (SHP) based on recent structure quantization methods. RocketSHP significantly outperforms existing methods in predicting simulation-derived dynamics. We reduce RMSF prediction error by 57% compared to BioEmu and calibrated Dyna-1 predictions, including an up to 73% error reduction for long proteins. We validate these predictions with experimental hetNOE data, and we demonstrate the ability to adapt predictions to different physical temperatures. We highlight RocketSHP’s utility in constructing allosteric networks in the oncogene KRAS and identify structural sub-modules with correlated motions, and we validate RocketSHP by showing that changes in node centrality within predicted KRAS allosteric networks correlate with changes of folding free energy in experimental DMS data. Our approach makes predictions in seconds rather than hours or days, enabling us to perform the first comprehensive dynamics analysis of the entire human proteome. RocketSHP bridges the gap between static structural biology and dynamic functional understanding, enabling dynamics-aware structural analysis and variant effect prediction at scales previously unavailable. RocketSHP is available as free and open-source software at https://github.com/flatironinstitute/RocketSHP.
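
For readers unfamiliar with RMSF, the following minimal sketch computes the standard per-atom root-mean-square fluctuation from a set of coordinate frames. It is not RocketSHP's code or API: the function name `rmsf`, the nested-list trajectory layout, and the assumption that frames are already aligned to a common reference are all choices made for this illustration.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF for trajectory[frame][atom] = (x, y, z).

    Assumes all frames are already aligned to a common reference.
    """
    n_frames = len(trajectory)
    if n_frames == 0:
        raise ValueError("trajectory must contain at least one frame")
    n_atoms = len(trajectory[0])
    values = []
    for a in range(n_atoms):
        # Time-averaged position of atom a.
        mean = [sum(frame[a][d] for frame in trajectory) / n_frames
                for d in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((frame[a][d] - mean[d]) ** 2 for d in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


# Hypothetical two-frame, two-atom trajectory: the first atom moves, the second is static.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
fluctuations = rmsf(frames)
```

A larger value marks a more mobile atom or residue; the static atom in the example has an RMSF of zero.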

June 17, 2025

Complex scaling for open waveguides

C. Epstein, Tristan Goodwill, Jeremy Hoskins, S. Quinn, M. Rachh

In this work we analyze the complex scaling method applied to the problem of time-harmonic scalar wave propagation in junctions between "leaky," or open, dielectric waveguides. In [arXiv:2302.04353, arXiv:2310.05816, arXiv:2401.04674, arXiv:2411.11204], it was shown that under suitable assumptions the problem can be reduced to a system of Fredholm second-kind integral equations on an infinite interface, transverse to the waveguides. Here, we show that the kernels appearing in the integral equation admit a rapidly decaying analytic continuation on certain natural totally real submanifolds of $\mathbb{C}^2$. We then show that for suitable, physically meaningful boundary data the resulting solutions to the integral equations themselves admit analytic continuation and satisfy related asymptotic estimates. By deforming the integral equation to a suitable contour, the decay in the kernels, density, and data enables straightforward discretization and truncation, with an error that decays exponentially in the truncation length. We illustrate our results with several representative numerical examples.

ExEnDiff: An Experiment-Guided Diffusion Model for Protein Conformational Ensemble Generation

Yikai Liu, A. Sahoo, S. Hanson, et al.

Understanding protein conformation is key to understanding their function. Importantly, most proteins adopt multiple conformations with nontrivial ensemble distributions that change depending on their environment to perform functions like catalysis, signaling, and transport. Recently, machine learning techniques, especially deep generative models, have been employed to develop protein conformation generators. These models, known as unified protein ensemble samplers, are trained on the Protein Data Bank (PDB) dataset and can generate diverse protein conformation ensembles given a protein sequence. However, their reliance solely on structural data from the PDB, which primarily captures folded protein states, restricts the diversity of the generated ensembles and can result in physically unrealistic conformations. In this paper, we overcome these challenges by introducing ExEnDiff, an experiment-guided diffusion model for protein conformation generation. ExEnDiff integrates experimental measurements as a physical prior, enabling the generation of protein conformations with desired properties. Our experiments on a variety of fast-folding and intrinsically disordered proteins demonstrate that ExEnDiff significantly advances the capabilities of current unified protein ensemble samplers. With little computational cost, ExEnDiff can capture important proteins' configuration properties and the underlying Boltzmann distribution, paving the way for a next-generation molecular dynamics engine. We further demonstrate the effectiveness of ExEnDiff to capture conformational changes in the presence of mutations and as an efficient tool for determining a reasonable collective variable space for protein ensembles. With these results, ExEnDiff is well poised to push the study of protein ensembles into a data-rich regime currently available to few problems in biology.

June 10, 2025