2697 Publications

Query Efficient Structured Matrix Learning

Noah Amsel, Pratyush Avi, Tyler Chen, Feyza Duman Keles, Chinmay Hegde, Cameron Musco, Christopher Musco, D. Persson

We study the problem of learning a structured approximation (low-rank, sparse, banded, etc.) to an unknown matrix $A$ given access to matrix-vector product (matvec) queries of the form $x \rightarrow Ax$ and $x \rightarrow A^Tx$. This problem is of central importance to algorithms across scientific computing and machine learning, with applications to fast multiplication and inversion for structured matrices, building preconditioners for first-order optimization, and as a model for differential operator learning. Prior work focuses on obtaining query complexity upper and lower bounds for learning specific structured matrix families that commonly arise in applications.
We initiate the study of the problem in greater generality, aiming to understand the query complexity of learning approximations from general matrix families. Our main result focuses on finding a near-optimal approximation to $A$ from any finite-sized family of matrices, $\mathcal{F}$. Standard results from matrix sketching show that $O(\log|\mathcal{F}|)$ matvec queries suffice in this setting. This bound can also be achieved, and is optimal, for vector-matrix-vector queries of the form $x,y\rightarrow x^TAy$, which have been widely studied in work on rank-$1$ matrix sensing.
Surprisingly, we show that, in the matvec model, it is possible to obtain a nearly quadratic improvement in complexity, to $\tilde{O}(\sqrt{\log|\mathcal{F}|})$. Further, we prove that this bound is tight up to log-log factors. Via covering number arguments, our result extends to well-studied infinite families. As an example, we establish that a near-optimal approximation from any linear matrix family of dimension $q$ can be learned with $\tilde{O}(\sqrt{q})$ matvec queries, improving on an $O(q)$ bound achievable via sketching techniques and vector-matrix-vector queries.
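
To make the query model concrete, the following is a minimal sketch of the standard sketching baseline mentioned above, not the paper's $\tilde{O}(\sqrt{\log|\mathcal{F}|})$ algorithm. It assumes a Gaussian sketch and a Frobenius-norm comparison; the function name `select_from_family`, the family size, and the number of queries are hypothetical choices made for illustration.

```python
import numpy as np

def select_from_family(matvec, family, n_cols, num_queries, rng):
    """Return the index of the family member closest to the unknown A,
    measured on a random Gaussian sketch (an assumed baseline, not the
    paper's method)."""
    G = rng.standard_normal((n_cols, num_queries))
    # One matvec query x -> Ax per column of the sketch.
    AG = np.column_stack([matvec(G[:, j]) for j in range(num_queries)])
    # Compare each candidate B to A using only the sketched products.
    errors = [np.linalg.norm(AG - B @ G) for B in family]
    return int(np.argmin(errors))

rng = np.random.default_rng(0)
n = 30
# Hypothetical finite family F of 16 candidate matrices.
family = [rng.standard_normal((n, n)) for _ in range(16)]
# Unknown matrix: a small perturbation of one family member.
A = family[5] + 0.01 * rng.standard_normal((n, n))

best = select_from_family(lambda x: A @ x, family, n, num_queries=8, rng=rng)
print("selected family member:", best)
```

Only 8 matvec queries are used here, far fewer than the 30 needed to read off $A$ column by column; the selection relies solely on the sketched products $AG$.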

Comprehensive characterization of human color discrimination thresholds

Fangfang Hong, Ruby Bouhassira, Jason Chow, Craig Sanders, Michael Shvartsman, Phillip Guan, A. Williams, D. H. Brainard

Discrimination thresholds reveal the limits of human perception; scientists have studied them since the time of Fechner in the 1800s. Forced-choice psychophysical methods combined with the method of constant stimuli or parametric adaptive trial-placement procedures are well-suited for measuring one-dimensional psychometric functions. However, extending these methods to characterize psychometric fields in higher-dimensional stimulus spaces, such as three-dimensional color space, poses a significant challenge. Here, we introduce a novel Wishart Process Psychophysical Model (WPPM) that leverages the smooth variation of threshold across stimulus space. We demonstrate the use of the WPPM in conjunction with a non-parametric adaptive trial-placement procedure by characterizing the full psychophysical field for color discrimination in the isoluminant plane. Each participant (N = 8) completed between 6,000 and 6,466 three-alternative forced-choice (3AFC) oddity color discrimination trials. The WPPM was fit to these trials. Importantly, once fit, the WPPM allows readout of discrimination performance between any pair of stimuli, providing a comprehensive characterization of the psychometric field. In addition, the WPPM readouts were validated for each participant by comparison with 25 probe psychometric functions. These were measured with an additional 6,000 trials per participant that were held out from the WPPM fit. The dataset offers a foundational resource for developing perceptual color metrics and for benchmarking mechanistic models of color processing. This approach is broadly generalizable to other perceptual domains …

Representational drift and learning-induced stabilization in the piriform cortex

Guillermo B. Morales, Miguel A. Muñoz, Y. Tu

The brain encodes external stimuli through patterns of neural activity, forming internal representations of the world. Increasing experimental evidence has shown that neural representations of a specific stimulus can change over time, a phenomenon called “representational drift” (RD). However, the underlying mechanisms of this widespread phenomenon remain poorly understood. Here, we study RD in the piriform cortex of the olfactory system with a realistic neural network model that incorporates two general mechanisms for synaptic weight dynamics operating at two well-separated timescales: spontaneous multiplicative fluctuations on a scale of days and spike-timing-dependent plasticity (STDP) effects on a scale of seconds. We show that the slow multiplicative fluctuations in synaptic sizes, which lead to a steady-state distribution of synaptic weights consistent with experiments, can induce RD effects that are in quantitative agreement with recent empirical evidence. Furthermore, our model reveals that the fast STDP learning dynamics during presentation of a given odor drives the system toward a low-dimensional representational manifold, which effectively reduces the dimensionality of synaptic weight fluctuations and thus suppresses RD. Specifically, our model explains why representations of already “learned” odors drift more slowly than those of unfamiliar ones, as well as the dependence of the drift rate on the frequency of stimulus presentation, both of which align with recent experimental data. The proposed model not only offers a simple explanation for the emergence of RD and its relation to learning in the piriform cortex, but also provides a general theoretical framework for studying representation dynamics in other neural systems.

Prediction of local convergent shifts in evolutionary rates with phyloConverge

Elysia Saputra, W. Mao, et al.

Convergence analysis can characterize genetic elements underlying morphological adaptations. However, its performance on regulatory elements is limited due to their modular composition of transcription factor motifs, which have rapid turnover and experience different evolutionary pressures.

We introduce phyloConverge, a phylogenetic method that performs scalable, fine-grained local convergence analysis of genomic elements at flexible length scales. Using a benchmarking case of convergent subterranean mammal adaptation, phyloConverge identifies rate-accelerated conserved noncoding elements (CNEs) with high specificity and statistical robustness relative to competing methods. From CNE-level scoring, we detect the convergent regression of entire CNE units and highlight the contrast that subterranean-associated coding region regression is highly specific to ocular functions, whereas regulatory element regression is enriched for accompanying neuronal phenotypes and other developmental processes. From transcription factor motif-level scoring, we dissect elements into subregions with uneven convergence signals and demonstrate the modular adaptation of CNEs with high functional specificity. Finally, we demonstrate phyloConverge’s scalability to perform high-resolution convergence analysis genome-wide.

On learning Gaussian multi-index models with gradient flow part I: General properties and two-timescale learning

A. Bietti, Joan Bruna, L. Pillaud-Vivien

We study gradient flow on the multi-index regression problem for high-dimensional Gaussian data. Multi-index functions consist of a composition of an unknown low-rank linear projection and an arbitrary unknown, low-dimensional link function. As such, they constitute a natural template for feature learning in neural networks. We consider a two-timescale algorithm, whereby the low-dimensional link function is learnt with a non-parametric model infinitely faster than the subspace parametrizing the low-rank projection. By appropriately exploiting the matrix semigroup structure arising over the subspace correlation matrices, we establish global convergence of the resulting Grassmannian gradient flow dynamics, and provide a quantitative description of its associated “saddle-to-saddle” dynamics. Notably, the timescales associated with each saddle can be explicitly characterized in terms of an appropriate Hermite decomposition of the target link function.
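
In notation assumed here for illustration (the symbols $d$, $r$, $W_\star$, and $g_\star$ do not appear in the abstract), a multi-index target composes a low-rank linear projection with a low-dimensional link function:

$$f_\star(x) = g_\star(W_\star^\top x), \qquad x \sim \mathcal{N}(0, I_d), \quad W_\star \in \mathbb{R}^{d \times r}, \quad r \ll d.$$

Here $g_\star : \mathbb{R}^r \to \mathbb{R}$ is the unknown link function, and the column span of $W_\star$ is the unknown subspace, a point on the Grassmannian of $r$-dimensional subspaces of $\mathbb{R}^d$.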

DISCO: learning to DISCover an evolution Operator for multi-physics-agnostic prediction

R. Morel, J. Han, E. Oyallon

We address the problem of predicting the next states of a dynamical system governed by unknown temporal partial differential equations (PDEs) using only a short trajectory. While standard transformers provide a natural black-box solution to this task, the presence of a well-structured evolution operator in the data suggests a more tailored and efficient approach. Specifically, when the PDE is fully known, classical numerical solvers can evolve the state accurately with only a few parameters. Building on this observation, we introduce DISCO, a model that uses a large hypernetwork to process a short trajectory and generate the parameters of a much smaller operator network, which then predicts the next states through time integration. Our framework decouples dynamics estimation (i.e., DISCovering an evolution Operator from a short trajectory) from state prediction (i.e., evolving this operator). Experiments show that pretraining our model on diverse physics datasets achieves state-of-the-art performance while requiring significantly fewer epochs. Moreover, it generalizes well to unseen initial conditions and remains competitive when fine-tuned on downstream tasks. The code will be made publicly available upon publication at https://github.com/RudyMorel/DISCO.
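
The decoupling described above can be illustrated with a toy sketch. This is not the DISCO implementation: the hypernetwork here is a single untrained random linear map, the operator network is a tiny two-layer MLP, and explicit Euler is an assumed integrator. All names (`hypernetwork`, `operator`, `rollout`) and sizes are hypothetical.

```python
import numpy as np

def hypernetwork(trajectory, W_hyper):
    """Map a short trajectory to the parameter vector of the operator network.
    A single linear layer with tanh stands in for a large hypernetwork."""
    return np.tanh(W_hyper @ trajectory.ravel())

def operator(state, theta, hidden):
    """Small operator network: returns an estimated time derivative of the state."""
    d = state.size
    W1 = theta[: hidden * d].reshape(hidden, d)
    W2 = theta[hidden * d:].reshape(d, hidden)
    return W2 @ np.tanh(W1 @ state)

def rollout(state, theta, hidden, dt, steps):
    """Predict future states by explicit Euler time integration of the operator."""
    states = [state]
    for _ in range(steps):
        state = state + dt * operator(state, theta, hidden)
        states.append(state)
    return np.stack(states)

rng = np.random.default_rng(0)
d, T, hidden = 4, 5, 3                       # state size, trajectory length, hidden width
trajectory = rng.standard_normal((T, d))     # stand-in for an observed short trajectory
n_params = 2 * hidden * d                    # parameters of the small operator network
W_hyper = 0.1 * rng.standard_normal((n_params, T * d))  # untrained hypernetwork weights

theta = hypernetwork(trajectory, W_hyper)    # dynamics estimation
pred = rollout(trajectory[-1], theta, hidden, dt=0.01, steps=10)  # state prediction
print(pred.shape)
```

The two calls at the end mirror the two stages: the first estimates the dynamics from the trajectory, the second evolves the last observed state with the resulting operator.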

Feature Learning beyond the Lazy-Rich Dichotomy: Insights from Representational Geometry

Integrating task-relevant information into neural representations is a fundamental ability of both biological and artificial intelligence systems. Recent theories have categorized learning into two regimes: the rich regime, where neural networks actively learn task-relevant features, and the lazy regime, where networks behave like random feature models. Yet this simple lazy-rich dichotomy overlooks a diverse underlying taxonomy of feature learning, shaped by differences in learning algorithms, network architectures, and data properties. To address this gap, we introduce an analysis framework to study feature learning via the geometry of neural representations. Rather than inspecting individual learned features, we characterize how task-relevant representational manifolds evolve throughout the learning process. We show, in both theoretical and empirical settings, that as networks learn features, task-relevant manifolds untangle, with changes in manifold geometry revealing distinct learning stages and strategies beyond the lazy-rich dichotomy. This framework provides novel insights into feature learning across neuroscience and machine learning, shedding light on structural inductive biases in neural circuits and the mechanisms underlying out-of-distribution generalization.

July 11, 2025

The Fruit Fly Auxodrome: a computer vision setup for longitudinal studies of Drosophila development

Changyuan Wang, Denis F Faerberg, S. Shvartsman, Robert A Marmion

Studies in Drosophila have contributed a great deal to our understanding of developmental mechanisms. Indeed, familiar names of critical signaling components, such as Hedgehog and Notch, have their origins in the readily identifiable morphological phenotypes of Drosophila. Most studies that led to the identification of these and many other highly conserved genes were based on the end-point phenotypes, such as the larval cuticle or the adult wing. Additional information can be extracted from longitudinal studies, which can reveal how the phenotypes emerge over time. Here we present the Fruit Fly Auxodrome, an experimental setup that enables monitoring and quantitative analysis of the entirety of development of 96 individually housed Drosophila from hatching to eclosion. The Auxodrome combines an inexpensive live imaging setup and a computer vision pipeline that provides access to a wide range of quantitative information, such as the times of hatching and pupation, as well as dynamic patterns of larval activity. We demonstrate the Auxodrome in action by recapitulating several previously reported features of wild-type development as well as developmental delay in a Drosophila model of a human disease. The scalability of the presented design makes it readily suitable for large-scale longitudinal studies in multiple developmental contexts.

The open-source Masala software suite: Facilitating rapid methods development for synthetic heteropolymer design

Tristan Zaborniak, B. Turzo, D. Renfrew, V. Mulligan, et al.

Although canonical protein design has benefited from machine learning methods trained on databases of protein sequences and structures, synthetic heteropolymer design still relies heavily on physics-based methods. The Rosetta software, which provides diverse physics-based methods for designing sequences, exploring conformations, docking molecules, and performing analysis, has proven invaluable to this field. Nevertheless, Rosetta’s aging architecture, monolithic structure, non-open source code, and steep development learning curve are beginning to hinder new methods development. Here, we introduce the Masala software suite, a free, open-source set of C++ libraries intended to extend Rosetta and other software, and ultimately to be a successor to Rosetta. Masala is structured for modern computing hardware, and its build system automates the creation of application programming interface (API) layers, permitting Masala’s use as an extension library for existing software, including Rosetta. Masala features modular architecture in which it is easy for novice developers to add new plugin modules, which can be independently compiled and loaded at runtime, extending functionality of software linking Masala without source code alteration. Here, we describe implementation of Masala modules that accelerate protein and synthetic peptide design. We describe the implementation of Masala real-valued local optimizers and cost function network optimizers that can be used as drop-in replacements for Rosetta’s minimizer and packer when designing heteropolymers. We explore design-centric guidance terms for promoting desirable features, such as hydrogen bond networks, or discouraging undesirable features, such as unsatisfied buried hydrogen bond donors and acceptors, which we have re-implemented far more efficiently in Masala, providing up to two orders of magnitude of speedup in benchmarks. Finally, we discuss development goals for future versions of Masala.

Decomposition of phenotypic heterogeneity in autism reveals underlying genetic programs

Aviya Litman, N. Sauerwald, C. Park, O. Troyanskaya, et al.

Unraveling the phenotypic and genetic complexity of autism is extremely challenging yet critical for understanding the biology, inheritance, trajectory and clinical manifestations of the many forms of the condition. Using a generative mixture modeling approach, we leverage broad phenotypic data from a large cohort with matched genetics to identify robust, clinically relevant classes of autism and their patterns of core, associated and co-occurring traits, which we further validate and replicate in an independent cohort. We demonstrate that phenotypic and clinical outcomes correspond to genetic and molecular programs of common, de novo and inherited variation and further characterize distinct pathways disrupted by the sets of mutations in each class. Remarkably, we discover that class-specific differences in the developmental timing of affected genes align with clinical outcome differences. These analyses demonstrate the phenotypic complexity of children with autism, identify genetic programs underlying their heterogeneity, and suggest specific biological dysregulation patterns and mechanistic hypotheses.
