2596 Publications

Tensor Decomposition Meets RKHS: Efficient Algorithms for Smooth and Misaligned Data

B. Larsen, Tamara G. Kolda, Anru R. Zhang, A. Williams

The canonical polyadic (CP) tensor decomposition decomposes a multidimensional data array into a sum of outer products of finite-dimensional vectors. Instead, we can replace some or all of the vectors with continuous functions (infinite-dimensional vectors) from a reproducing kernel Hilbert space (RKHS). We refer to tensors with some infinite-dimensional modes as quasitensors, and the approach of decomposing a tensor with some continuous RKHS modes is referred to as CP-HiFi (hybrid infinite and finite dimensional) tensor decomposition. An advantage of CP-HiFi is that it can enforce smoothness in the infinite-dimensional modes. Further, CP-HiFi does not require the observed data to lie on a regular and finite rectangular grid and naturally incorporates misaligned data. We detail the methodology and illustrate it on a synthetic example.
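
As a rough illustration of the idea in this abstract, the sketch below builds a tensor as a weighted sum of outer products and lets one mode come from smooth functions expressed through a kernel. It is not the authors' algorithm: the function names (`cp_reconstruct`, `rkhs_factor`), the Gaussian kernel, and all sizes are assumptions chosen for illustration only.

```python
import numpy as np

def cp_reconstruct(weights, factors):
    """Build a tensor as a weighted sum of rank-one outer products."""
    shape = tuple(f.shape[0] for f in factors)
    tensor = np.zeros(shape)
    for r in range(len(weights)):
        component = weights[r]
        for f in factors:
            # Grow the rank-one term one mode at a time.
            component = np.multiply.outer(component, f[:, r])
        tensor += component
    return tensor

def rkhs_factor(t, centers, coeffs, length_scale=0.2):
    """Evaluate f_r(t) = sum_j coeffs[j, r] * k(t, centers[j]) (Gaussian k, assumed)."""
    diff = np.asarray(t)[:, None] - np.asarray(centers)[None, :]
    gram = np.exp(-0.5 * (diff / length_scale) ** 2)
    return gram @ coeffs

rng = np.random.default_rng(0)
rank = 2
A = rng.standard_normal((4, rank))          # finite-dimensional mode
B = rng.standard_normal((5, rank))          # finite-dimensional mode
centers = np.linspace(0.0, 1.0, 5)          # kernel centers (hypothetical)
coeffs = rng.standard_normal((5, rank))     # kernel coefficients (hypothetical)
t = np.sort(rng.uniform(0.0, 1.0, size=7))  # irregular sample points
C = rkhs_factor(t, centers, coeffs)         # continuous mode sampled at t
w = np.array([1.0, 0.5])
X = cp_reconstruct(w, [A, B, C])
```

Because the third mode is a function, it can be evaluated at any set of points `t`, which is one way to picture how samples that do not share a common grid could be handled.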

CryoBench: Diverse and challenging datasets for the heterogeneity problem in cryo-EM

Minkyu Jeon, M. Astore, S. Hanson, P. Cossio, et al.

Cryo-electron microscopy (cryo-EM) is a powerful technique for determining high-resolution 3D biomolecular structures from imaging data. As this technique can capture dynamic biomolecular complexes, 3D reconstruction methods are increasingly being developed to resolve this intrinsic structural heterogeneity. However, the absence of standardized benchmarks with ground truth structures and validation metrics limits the advancement of the field. Here, we propose CryoBench, a suite of datasets, metrics, and performance benchmarks for heterogeneous reconstruction in cryo-EM. We propose five datasets representing different sources of heterogeneity and degrees of difficulty. These include conformational heterogeneity generated from simple motions and random configurations of antibody complexes and from tens of thousands of structures sampled from a molecular dynamics simulation. We also design datasets containing compositional heterogeneity from mixtures of ribosome assembly states and 100 common complexes present in cells. We then perform a comprehensive analysis of state-of-the-art heterogeneous reconstruction tools including neural and non-neural methods and their sensitivity to noise, and propose new metrics for quantitative comparison of methods. We hope that this benchmark will be a foundational resource for analyzing existing methods and new algorithmic development in both the cryo-EM and machine learning communities.

cppdlr: Imaginary time calculations using the discrete Lehmann representation

J. Kaye, Hugo U. R. Strand, Nils Wentzell

We introduce cppdlr, a C++ library implementing the discrete Lehmann representation (DLR) of functions in imaginary time and Matsubara frequency, such as Green's functions and self-energies. The DLR is based on a low-rank approximation of the analytic continuation kernel, and yields a compact and explicit basis consisting of exponentials in imaginary time and simple poles in Matsubara frequency. cppdlr constructs the DLR basis and associated interpolation grids, and implements standard operations. It provides a flexible yet high-level interface, facilitating the incorporation of the DLR into both small-scale applications and existing large-scale software projects.
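
The low-rank property mentioned in this abstract can be checked numerically. The sketch below does not use the cppdlr API; it discretizes a commonly used fermionic kernel in scaled variables (imaginary time in [0, 1], frequency in [-cutoff, cutoff]) and counts singular values above a tolerance. The grid sizes, the cutoff, the tolerance, and the helper names are all assumptions made for illustration.

```python
import numpy as np

def fermionic_kernel(tau, omega):
    """K(tau, omega) = exp(-tau * omega) / (1 + exp(-omega)) for tau in [0, 1]."""
    tau = np.asarray(tau)[:, None]
    omega = np.asarray(omega)[None, :]
    return np.exp(-tau * omega) / (1.0 + np.exp(-omega))

def numerical_rank(matrix, eps):
    """Count singular values larger than eps times the largest one."""
    s = np.linalg.svd(matrix, compute_uv=False)
    return int(np.sum(s > eps * s[0]))

cutoff = 10.0                                   # dimensionless frequency cutoff (illustrative)
tau_grid = np.linspace(0.0, 1.0, 200)           # uniform grid, for simplicity only
omega_grid = np.linspace(-cutoff, cutoff, 200)
K = fermionic_kernel(tau_grid, omega_grid)
rank = numerical_rank(K, eps=1e-10)
print(f"matrix size {K.shape}, numerical rank {rank}")
```

The numerical rank comes out far below the matrix dimension, which is the property that makes a compact basis of a few exponentials in imaginary time possible. A library would typically select basis frequencies and interpolation nodes with a pivoted factorization rather than the plain SVD used here.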

Unraveling the Molecular Complexity of N-Terminus Huntingtin Oligomers: Insights into Polymorphic Structures

Neha Nanajkar, A. Sahoo, Silvina Matysiak

Huntington’s disease (HD) is a fatal neurodegenerative disorder resulting from an abnormal expansion of polyglutamine (polyQ) repeats in the N-terminus of the huntingtin protein. When the polyQ tract surpasses 35 repeats, the mutated protein undergoes misfolding, culminating in the formation of intracellular aggregates. Research in mouse models suggests that HD pathogenesis involves the aggregation of N-terminal fragments of the huntingtin protein (htt). These early oligomeric assemblies of htt, exhibiting diverse characteristics during aggregation, are implicated as potential toxic entities in HD. However, a consensus on their specific structures remains elusive. Understanding the heterogeneous nature of htt oligomers provides crucial insights into disease mechanisms, emphasizing the need to identify various oligomeric conformations as potential therapeutic targets. Employing coarse-grained molecular dynamics, our study aims to elucidate the mechanisms governing the aggregation process and resultant aggregate architectures of htt. The polyQ tract within htt is flanked by two regions: an N-terminal domain (N17) and a short C-terminal proline-rich segment. We conducted self-assembly simulations involving five distinct N17 + polyQ systems with polyQ lengths ranging from 7 to 45, utilizing the ProMPT force field. Prolongation of the polyQ domain correlates with an increase in β-sheet-rich structures. Longer polyQ lengths favor intramolecular β-sheets over intermolecular interactions due to the folding of the elongated polyQ domain into hairpin-rich conformations. Importantly, variations in polyQ length significantly influence resulting oligomeric structures. Shorter polyQ domains lead to N17 domain aggregation, forming a hydrophobic core, while longer polyQ lengths introduce a competition between N17 hydrophobic interactions and polyQ polar interactions, resulting in densely packed polyQ cores with outwardly distributed N17 domains. 
Additionally, at extended polyQ lengths, we observe distinct oligomeric conformations with varying degrees of N17 bundling. These findings can help explain the toxic gain-of-function that htt with expanded polyQ acquires.

Computational tools for cellular scale biophysics

Mathematical models are indispensable for disentangling the interactions through which biological components work together to generate the forces and flows that position, mix, and distribute proteins, nutrients, and organelles within the cell. To illuminate the ever more specific questions studied at the edge of biological inquiry, such models inevitably become more complex. Solving, simulating, and learning from these more realistic models requires the development of new analytic techniques, numerical methods, and scalable software. In this review, we discuss some recent developments in tools for understanding how large numbers of cytoskeletal filaments, driven by molecular motors and interacting with the cytoplasm and other structures in their environment, generate fluid flows, instabilities, and material deformations which help drive crucial cellular processes.

Deciphering missense coding variants with AlphaMissense

Z. Pan, Chandra L. Theesfeld

Genetic diagnosis promises to guide treatment and manage expectations for patients and physicians. Yet even when a variant in a disease gene is identified, the assignment of pathogenic impact is not always possible [1]. Of the 215 million possible substitutions in approximately 19,900 genes, 71 million are missense mutations that result in an amino acid substitution rather than a stop codon or a frameshift [2]. Only 4 million missense variants have been observed, of which approximately 2% have been clinically classified as pathogenic or benign by testing companies and collected in the public ClinVar repository. The rest are classified as variants of uncertain significance (VUS) due to the dearth of information on the functional impact or pathogenic consequences of the mutation.
A key challenge is to understand how changes in protein sequence affect function and contribute to disease. While the development of mutational scanning assays enables scientists to test thousands of substitutions at a time in cell lines, it is not possible to experimentally test all mutations, let alone assess fitness in humans. To meet this challenge, computational approaches that integrate many types of information and can predict functional impacts are becoming increasingly sophisticated in their ability to accurately classify variants.
An early and powerful strategy for modeling the pathogenic impacts of variants involved employing evolutionary sequence information through the use of multiple sequence alignments (MSA). This approach examines sequence conservation across species and within humans, as demonstrated in models like PolyPhen and SIFT [3]. The integration of functional insights related to protein domains and functions further enhances these models, coupled with artificial intelligence [3]. Prediction of a correct 3-dimensional protein structure has long been a grail in research. Marks et al. [4] suggested a global statistical model to massively reduce the search space of protein conformations by linking the pairwise correlations from MSA to fold a protein into a correct 3-dimensional structure (directly from Marks et al. [4]). AlphaFold [5] marked a significant advancement in the field by using a large language model (LLM) to associate protein structure with MSA with unprecedented accuracy, effectively solving the “protein folding problem.” The ability of protein LLMs to learn not just amino acid relationships in linear sequences but also extremely rich relationships in any number of dimensions and contexts powers such models.

Neuronal and behavioral responses to naturalistic texture images in macaque monkeys

C M Ziemba, R L T Goris, G M Stine, R K Perez, E. P. Simoncelli, J A Movshon

The visual world is richly adorned with texture, which can serve to delineate important elements of natural scenes. In anesthetized macaque monkeys, selectivity for the statistical features of natural texture is weak in V1, but substantial in V2, suggesting that neuronal activity in V2 might directly support texture perception. To test this, we investigated the relation between single cell activity in macaque V1 and V2 and simultaneously measured behavioral judgments of texture. We generated stimuli along a continuum between naturalistic texture and phase-randomized noise and trained two macaque monkeys to judge whether a sample texture more closely resembled one or the other extreme. Analysis of responses revealed that individual V1 and V2 neurons carried much less information about texture naturalness than behavioral reports. However, the sensitivity of V2 neurons, especially those preferring naturalistic textures, was significantly closer to that of behavior compared with V1. The firing of both V1 and V2 neurons predicted perceptual choices in response to repeated presentations of the same ambiguous stimulus in one monkey, despite low individual neural sensitivity. However, neither population predicted choice in the second monkey. We conclude that neural responses supporting texture perception likely continue to develop downstream of V2. Further, combined with neural data recorded while the same two monkeys performed an orientation discrimination task, our results demonstrate that choice-correlated neural activity in early sensory cortex is unstable across observers and tasks, untethered from neuronal sensitivity, and therefore unlikely to directly reflect the formation of perceptual decisions.

Analytic method for quadratic polarons in nonparabolic bands

Including the effect of lattice anharmonicity on electron-phonon interactions has recently garnered attention due to its role as a necessary and significant component in explaining various phenomena, including superconductivity, optical response, and temperature dependence of mobility. This study focuses on analytically treating the effects of anharmonic electron-phonon coupling on the polaron self-energy, combined with numerical Diagrammatic Monte Carlo data. Specifically, we incorporate a quadratic interaction into the method of squeezed phonon states, which has proven effective for analytically calculating the polaron parameters. Additionally, we extend this method to nonparabolic finite-width conduction bands while maintaining the periodic translation symmetry of the system. Our results are compared with those obtained from Diagrammatic Monte Carlo, partially reported in a recent study [S. Ragni et al., Phys. Rev. B 107, L121109 (2023)], covering a wide range of coupling strengths for the nonlinear interaction. Remarkably, our analytic method predicts the same features as the Diagrammatic Monte Carlo simulation.
August 1, 2024

Detector-tuned overlap catastrophe in quantum dots

The Anderson overlap catastrophe (AOC) is a many-body effect arising as a result of a shakeup of a Fermi sea due to an abrupt change of a local potential, leading to a power-law dependence of the density of states on energy. Here we demonstrate that a standard quantum-dot detector can be employed as a highly tuneable probe of the AOC, where the power law can be continuously modified by a gate voltage. We show that signatures of the AOC have already appeared in previous experiments, and give explicit predictions that allow one to tune and pinpoint their non-perturbative aspects.
August 1, 2024