2795 Publications

Error Breakdown and Sensitivity Analysis of Dynamical Quantities in Markov State Models

Yehor Tuchkov, L. Evans, S. Hanson, E. Thiede

Markov state models (MSMs) are widely employed to analyze the kinetics of complex systems. But despite their effectiveness in many applications, MSMs are prone to systematic or statistical errors, often exacerbated by suboptimal hyperparameter choice. In this article, we attempt to understand how these choices affect the error of estimates of mean first-passage times and committors, key quantities in chemical rate theory. We first evaluate the performance of the recently introduced “stopped-process estimator” [Strahan, J. Long-time-scale predictions from short-trajectory data: A benchmark analysis of the trp-cage miniprotein. J. Chem. Theory Comput. 2021, 17, 2948–2963. 10.1021/acs.jctc.0c00933.] that attempts to reduce error caused by choosing too large a lag time. We then study the effect of statistical errors on Markov state model construction using the condition number, which measures an MSM’s sensitivity to perturbation. This analysis gives insight into which factors cause an MSM to be more or less sensitive to statistical error. Our work highlights the importance of choosing a good sampling measure, the measure from which the initial points are drawn, and has implications for recent work applying a variational principle for evaluating the committor.
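As a rough illustration of the quantities named in this abstract, the sketch below computes a committor and mean first-passage times from a small row-stochastic transition matrix by solving the standard linear systems. The function names `committor` and `mfpt`, the three-state matrix, and the use of `np.linalg.cond` as a sensitivity indicator are assumptions made for illustration; this is not the authors' estimator, and the paper's condition-number analysis may be defined differently.

```python
import numpy as np

def committor(T, A, B):
    """Probability of reaching set B before set A, for each state.

    T is a row-stochastic transition matrix; A and B are disjoint sets
    of state indices. Solves (I - T_DD) q_D = T_DB 1 on the remaining
    states D, with q = 0 on A and q = 1 on B.
    """
    n = T.shape[0]
    B = sorted(B)
    D = [i for i in range(n) if i not in A and i not in B]
    q = np.zeros(n)
    q[B] = 1.0
    M = np.eye(len(D)) - T[np.ix_(D, D)]
    rhs = T[np.ix_(D, B)].sum(axis=1)
    q[D] = np.linalg.solve(M, rhs)
    return q

def mfpt(T, B, lag=1.0):
    """Mean first-passage time into set B, in units of the lag time.

    Solves (I - T_CC) m_C = lag * 1 on the complement C of B,
    with m = 0 on B.
    """
    n = T.shape[0]
    C = [i for i in range(n) if i not in B]
    m = np.zeros(n)
    M = np.eye(len(C)) - T[np.ix_(C, C)]
    m[C] = np.linalg.solve(M, lag * np.ones(len(C)))
    return m

# Hypothetical three-state chain: 0 <-> 1 <-> 2.
T = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])

q = committor(T, A={0}, B={2})
m = mfpt(T, B={2})

# Condition number of the MFPT linear system, used here only as a
# generic indicator of sensitivity to perturbations of T.
kappa = np.linalg.cond(np.eye(2) - T[:2, :2])
```

A large value of `kappa` would suggest that small statistical errors in the estimated transition matrix can be strongly amplified in the solution.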

Space-time adaptive methods for parabolic evolution equations

We present a family of integral equation-based solvers for the heat equation, reaction-diffusion systems, the unsteady Stokes equation and the incompressible Navier-Stokes equations in two space dimensions. Our emphasis is on the development of methods that can efficiently follow complex solution features in space-time by refinement and coarsening at each time step on an adaptive quadtree. For simplicity, we focus on problems posed in a square domain with periodic boundary conditions. The performance and robustness of the methods are illustrated with several numerical examples.

Randomized block-Krylov subspace methods for low-rank approximation of matrix functions

D. Persson, Tyler Chen, Christopher Musco

The randomized SVD is a method to compute an inexpensive, yet accurate, low-rank approximation of a matrix. The algorithm assumes access to the matrix through matrix-vector products (matvecs). Therefore, when we would like to apply the randomized SVD to a matrix function, f(A), we need to approximate matvecs with f(A) using some other algorithm, which is typically treated as a black box. Chen and Hallman (SIMAX 2023) argued that, in the common setting where matvecs with f(A) are approximated using Krylov subspace methods (KSMs), a more efficient low-rank approximation is possible if we open this black box. They present an alternative approach that significantly outperforms the naive combination of KSMs with the randomized SVD, although the method lacked theoretical justification. In this work, we take a closer look at the method, and provide strong and intuitive error bounds that justify its excellent performance for low-rank approximation of matrix functions.
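For orientation, here is a minimal sketch of the basic randomized SVD with black-box matvec access, which is the baseline this abstract starts from. It is not the block-Krylov method analyzed in the paper. The function name `randomized_svd`, its parameters, and the test matrix are illustrative assumptions; in the matrix-function setting, `matvec` and `rmatvec` would be replaced by approximate products with f(A), for example from a Krylov subspace method.

```python
import numpy as np

def randomized_svd(matvec, rmatvec, n, rank, oversample=10, seed=0):
    """Basic randomized SVD using only products with A and A^T.

    matvec(X)  returns A @ X   for X of shape (n, k)
    rmatvec(X) returns A.T @ X for X of shape (m, k)
    """
    rng = np.random.default_rng(seed)
    k = rank + oversample
    omega = rng.standard_normal((n, k))   # random test matrix
    Y = matvec(omega)                     # sketch of the range of A
    Q, _ = np.linalg.qr(Y)                # orthonormal basis for the sketch
    B = rmatvec(Q).T                      # B = Q^T A, shape (k, n)
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank]

# Hypothetical example: an exactly rank-5 matrix accessed only via matvecs.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 5)) @ rng.standard_normal((5, 40))
U, s, Vt = randomized_svd(lambda X: A @ X, lambda X: A.T @ X, n=40, rank=5)
```

Because the example matrix has exact rank 5, the rank-5 approximation `(U * s) @ Vt` reproduces it up to rounding error; for matrices with slowly decaying singular values the approximation is only near-optimal.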

Truncated kernel windowed Fourier projection: a fast algorithm for the 3D free-space wave equation

We present a spectrally accurate fast algorithm for evaluating the solution to the scalar wave equation in free space driven by a large collection of point sources in a bounded domain. With $M$ sources temporally discretized by $N_t$ time steps of size $\Delta t$, a naive potential evaluation at $M$ targets on the same time grid requires $\mathcal O(M^2 N_t)$ work. Our scheme requires $\mathcal{O}\left((M + N^3\log N)N_t\right)$ work, where $N$ scales as $\mathcal O(1/\Delta t)$, i.e., the maximum signal frequency. This is achieved by using the recently-proposed windowed Fourier projection (WFP) method to split the potential into a local part, evaluated directly, plus a smooth history part approximated by an $N^3$-point equispaced discretization of the Fourier transform, where each Fourier coefficient obeys a simple recursion relation. The growing oscillations in the spectral representation (which would be present with a naive use of the Fourier transform) are controlled by spatially truncating the hyperbolic Green's function itself. Thus, the method avoids the need for absorbing boundary conditions. We demonstrate the performance of our algorithm with up to a million sources and targets at 6-digit accuracy. We believe it can serve as a key component in addressing time-domain wave equation scattering problems.

Physics Steering: Causal Control of Cross-Domain Concepts in a Physics Foundation Model

Rio Alexa Fear, Payel Mukhopadhyay, M. McCabe, A. Bietti, M. Cranmer

Recent advances in mechanistic interpretability have revealed that large language models (LLMs) develop internal representations corresponding not only to concrete entities but also to distinct, human-understandable abstract concepts and behaviour. Moreover, these hidden features can be directly manipulated to steer model behaviour. However, it remains an open question whether this phenomenon is unique to models trained on inherently structured data (i.e., language, images) or if it is a general property of foundation models. In this work, we investigate the internal representations of a large physics-focused foundation model. Inspired by recent work identifying single directions in activation space for complex behaviours in LLMs, we extract activation vectors from the model during forward passes over simulation datasets for different physical regimes. We then compute "delta" representations between the two regimes. These delta tensors act as concept directions in activation space, encoding specific physical features. By injecting these concept directions back into the model during inference, we can steer its predictions, demonstrating causal control over physical behaviours, such as inducing or removing some particular physical feature from a simulation. These results suggest that scientific foundation models learn generalised representations of physical principles. They do not merely rely on superficial correlations and patterns in the simulations. Our findings open new avenues for understanding and controlling scientific foundation models and have implications for AI-enabled scientific discovery.
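The extract, difference, and inject steps described in this abstract can be sketched on a toy two-layer network. Everything here is a hypothetical stand-in: the random weights `W_in` and `W_out`, the synthetic "regimes", the choice of a mean-activation difference as the delta, and the scale `alpha` are assumptions for illustration and do not reproduce the authors' model or procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a model: one hidden layer, linear readout.
W_in = rng.standard_normal((8, 4))
W_out = rng.standard_normal((2, 8))

def hidden(x):
    """Hidden activations for a batch of inputs of shape (batch, 4)."""
    return np.tanh(x @ W_in.T)

def predict(x, steer=None, alpha=1.0):
    """Forward pass, optionally adding a steering vector to the hidden layer."""
    h = hidden(x)
    if steer is not None:
        h = h + alpha * steer
    return h @ W_out.T

# Two synthetic "regimes" of input data (assumed, for illustration only).
regime_a = rng.standard_normal((100, 4))
regime_b = rng.standard_normal((100, 4)) + 2.0

# Delta representation: difference of mean activations between regimes.
delta = hidden(regime_b).mean(axis=0) - hidden(regime_a).mean(axis=0)

# Inject the concept direction at inference time.
x = rng.standard_normal((5, 4))
baseline = predict(x)
steered = predict(x, steer=delta)
```

In this toy setting the readout after the injection point is linear, so steering shifts every prediction by exactly `delta @ W_out.T`; in a deep nonlinear model the effect of an injected direction is not this simple.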

Diffusion for Fusion: Designing Stellarators with Generative AI

Misha Padidar, T. Huang, A. Giuliani, M. Spivak

Stellarators are a prospective class of fusion-based power plants that confine a hot plasma with three-dimensional magnetic fields. Typically framed as a PDE-constrained optimization problem, stellarator design is a time-consuming process that can take hours to solve on a computing cluster. Developing fast methods for designing stellarators is crucial for advancing fusion research. Given the recent development of large datasets of optimized stellarators, machine learning approaches have emerged as a potential candidate. Motivated by this, we present an open inverse problem to the machine learning community: to rapidly generate high-quality stellarator designs which have a set of desirable characteristics. As a case study in the problem space, we train a conditional diffusion model on data from the QUASR database to generate quasisymmetric stellarator designs with desirable characteristics (aspect ratio and mean rotational transform). The diffusion model is applied to design stellarators with characteristics not seen during training. We provide evaluation protocols and show that many of the generated stellarators exhibit solid performance: less than 5% deviation from quasisymmetry and the target characteristics. The modest deviation from quasisymmetry highlights an opportunity to reach the sub-1% target. Beyond the case study, we share multiple promising avenues for generative modeling to advance stellarator design.

Comprehensive characterization of human color discrimination thresholds

Fangfang Hong, Ruby Bouhassira, Jason Chow, Craig Sanders, Michael Shvartsman, Phillip Guan, A. Williams, D. H. Brainard

Color discrimination thresholds—the smallest detectable color differences—provide a benchmark for models of color vision, enable quantitative evaluation of eye diseases, and inform the design of display technologies. Despite their importance, a comprehensive characterization of these thresholds has long been considered intractable due to the psychophysical curse of dimensionality. Here, we address this challenge using a novel semi-parametric Wishart Process Psychophysical Model (WPPM), which leverages the feature that the internal noise limiting color discrimination varies smoothly across stimulus space. The model was fit to data collected with a non-parametric adaptive trial-placement procedure, enabling efficient stimulus selection. Together, through the combination of adaptive trial placement and post hoc WPPM fitting, we achieved comprehensive characterization of color discrimination in the isoluminant plane with only ~6,000 trials per participant (N = 8). Once fit, the WPPM allows readouts of discrimination performance for any stimulus pair. We validated these readouts against 25 probe psychometric functions, measured with an additional 6,000 trials per participant held out from model fitting. In conclusion, our study provides a foundational dataset for color vision, and our approach generalizes beyond color to any domain in which the internal noise limiting performance varies smoothly across stimulus space, offering a powerful and efficient method for comprehensively characterizing various perceptual discrimination thresholds.

The Determinant Ratio Matrix Approach to Solving 3D Matching and 2D Orthographic Projection Alignment Tasks

Andrew J. Hanson, S. Hanson

Pose estimation is a general problem in computer vision with wide applications. The relative orientation of a 3D reference object can be determined from a 3D rotated version of that object, or from a projection of the rotated object to a 2D planar image. This projection can be a perspective projection (the PnP problem) or an orthographic projection (the OnP problem). We restrict our attention here to the OnP problem and the full 3D pose estimation task (the EnP problem). Here we solve the least squares systems for both the error-free EnP and OnP problems in terms of the determinant ratio matrix (DRaM) approach. The noisy-data case can be addressed with a straightforward rotation correction scheme. While the SVD and optimal quaternion eigensystem methods solve the noisy EnP 3D-3D alignment exactly, the noisy 3D-2D orthographic (OnP) task has no known comparable closed form, and can be solved by DRaM-class methods. We note that while previous similar work has been presented in the literature exploiting both the QR decomposition and the Moore-Penrose pseudoinverse transformations, here we place these methods in a larger context that has not previously been fully recognized in the absence of the corresponding DRaM solution. We call this class of solutions the DRaM family, and conduct comparisons of the behavior of the families of solutions for the EnP and OnP rotation estimation problems. Overall, this work both presents a new solution to the 3D and 2D orthographic pose estimation problems and provides valuable insight into these classes of problems. With hindsight, we are able to show that our DRaM solutions to the exact EnP and OnP problems possess derivations that could have been discovered in the time of Gauss, and in fact generalize to all analogous N-dimensional Euclidean pose estimation problems.
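For context, the SVD-based solution to the 3D-3D alignment problem that this abstract mentions as an existing exact method can be sketched as follows. This is the standard SVD (Kabsch-style) construction, not the DRaM approach introduced in the paper. The function name `align_rotation`, the row-vector point layout, and the synthetic test data are illustrative assumptions.

```python
import numpy as np

def align_rotation(X, Y):
    """Least-squares rotation R such that Y ≈ X @ R.T.

    X and Y have shape (N, 3): reference points and rotated points,
    stored as rows. Uses the SVD of the 3x3 cross-covariance matrix,
    with a sign correction so that det(R) = +1.
    """
    H = Y.T @ X                               # sum over i of y_i x_i^T
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(U @ Vt))        # +1 or -1 for orthogonal U, Vt
    D = np.diag([1.0, 1.0, d])
    return U @ D @ Vt

# Hypothetical noise-free test: build a random rotation and recover it.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1.0                           # force a proper rotation
R_true = Q

X = rng.standard_normal((20, 3))
Y = X @ R_true.T
R_est = align_rotation(X, Y)
```

With noise-free data and points that span three dimensions, the estimate matches the true rotation up to rounding error; with noisy data the same formula returns the least-squares optimal rotation.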

November 24, 2025

Cellular and Spatial Drivers of Unresolved Injury and Functional Decline in the Human Kidney

Blue B. Lake, X. Chen, R. Sealfon, O. Troyanskaya, et al.

Building upon a foundational Human Kidney resource, we present a comprehensive multi-modal atlas that defines spatially resolved versus unresolved repair states and mechanisms in human kidney disease. Homeostatic interactions between injured kidney epithelium and its surrounding milieu determine successful repair outcomes, while pathogenic signaling promotes unresolved inflammation and fibrosis leading to chronic disease. We integrated multiple single-cell and spatial modalities across ∼700 samples from >350 patients (∼250 research biopsies), analyzing ∼1.7 million cells alongside complementary mouse multi-omic profiles spanning acute-to-chronic injury and aging (>300,000 cells) and spatial transcriptomic analysis of >150 human biopsies. This cross-species atlas delineates functional pathways and druggable targets across the nephron and defines gene regulatory networks and chromatin landscapes governing tubular, fibroblast, and immune cell transitions from injury to either recovery or failed repair states. We identified distinct cellular states associated with specific pathological features that show dynamic distributions between acute kidney injury (AKI) and chronic kidney disease (CKD), organized within unique spatial niches that reveal progression mechanisms from early injury to unresolved disease. Gene regulatory analyses prioritized key transcription factor activities (SOX4, SOX9, NFKB1, REL, KLFs) and their target networks establishing disease states and tissue microenvironments. These regulatory programs were directly linked to clinical outcomes, identifying molecular signatures of recovery and secreted biomarkers predictive of AKI-to-CKD progression, providing a key resource for therapeutic development and precision medicine approaches in kidney disease.

November 24, 2025