2005 Publications

Cutting Some Slack for SGD with Adaptive Polyak Stepsizes

R. M. Gower, Mathieu Blondel, Nidham Gazagnadou, Fabian Pedregosa

Tuning the step size of stochastic gradient descent is tedious and error prone. This has motivated the development of methods that automatically adapt the step size using readily available information. In this paper, we consider the family of SPS (Stochastic gradient with a Polyak Stepsize) adaptive methods. These are methods that make use of gradient and loss value at the sampled points to adaptively adjust the step size. We first show that SPS and its recent variants can all be seen as extensions of the Passive-Aggressive methods applied to nonlinear problems. We use this insight to develop new variants of the SPS method that are better suited to nonlinear models. Our new variants are based on introducing a slack variable into the interpolation equations. This single slack variable tracks the loss function across iterations and is used in setting a stable step size. We provide extensive numerical results supporting our new methods and a convergence theory.
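
As a rough illustration of the step size family this abstract refers to, the sketch below implements a capped stochastic Polyak step size on a toy least-squares problem. It is not the authors' code and does not include the slack-variable variants the paper introduces. The function names, the constants `c` and `gamma_max`, and the assumption that each per-sample optimal loss is zero (interpolation) are all illustrative choices.

```python
import numpy as np

def sps_step(loss_i, grad_i, loss_star_i=0.0, c=0.5, gamma_max=10.0):
    """Capped Polyak step: min((f_i(x) - f_i*) / (c * ||grad f_i(x)||^2), gamma_max)."""
    grad_sq = float(np.dot(grad_i, grad_i))
    if grad_sq == 0.0:
        return 0.0  # already at a stationary point of the sampled loss
    return min((loss_i - loss_star_i) / (c * grad_sq), gamma_max)

def run_sps(A, b, n_steps=2000, seed=0):
    """SGD with the step above on f_i(x) = 0.5 * (a_i . x - b_i)^2."""
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    for _ in range(n_steps):
        i = rng.integers(A.shape[0])      # sample one data point
        r = A[i] @ x - b[i]               # residual of the sampled equation
        loss_i = 0.5 * r * r
        grad_i = r * A[i]
        x -= sps_step(loss_i, grad_i) * grad_i
    return x

# Toy consistent system, so every f_i can reach zero (assumed interpolation).
rng = np.random.default_rng(1)
A = rng.normal(size=(50, 5))
x_true = rng.normal(size=5)
b = A @ x_true
x_hat = run_sps(A, b)
```

With `c=0.5` and this quadratic loss, each uncapped step projects the iterate onto the solution set of the sampled equation, which is why no manual step size tuning is needed in this toy setting.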

The Hough Stream Spotter: A New Method for Detecting Linear Structure in Resolved Stars and Application to the Stellar Halo of M31

S. Pearson, S. E. Clark, A. J. Demirjian, K. Johnston, M. Ness, T. Starkenburg, B. F. Williams, R. A. Ibata

Stellar streams from globular clusters (GCs) offer constraints on the nature of dark matter and have been used to explore the dark matter halo structure and substructure of our Galaxy. Detection of GC streams in other galaxies would broaden this endeavor to a cosmological context, yet no such streams have been detected to date. To enable such exploration, we develop the Hough Stream Spotter (HSS), and apply it to the Pan-Andromeda Archaeological Survey (PAndAS) photometric data of resolved stars in M31's stellar halo. We first demonstrate that our code can re-discover known dwarf streams in M31. We then use the HSS to blindly identify 27 linear GC stream-like structures in the PAndAS data. For each HSS GC stream candidate, we investigate the morphologies of the streams and the colors and magnitudes of all stars in the candidate streams. We find that the five most significant detections show a stronger signal along the red giant branch in color–magnitude diagrams than spurious non-stream detections. Lastly, we demonstrate that the HSS will easily detect globular cluster streams in future Nancy Grace Roman Space Telescope data of nearby galaxies. This has the potential to open up a new discovery space for GC stream studies, GC stream gap searches, and for GC stream-based constraints on the nature of dark matter.
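
The idea of finding linear structure with a Hough transform can be sketched on synthetic points as follows. This is the classic (θ, ρ) accumulator only, not the HSS code, and it omits the significance testing a real stream search would need. The function name, the grid sizes, and the synthetic "stream" are assumptions made for illustration.

```python
import numpy as np

def hough_accumulate(points, n_theta=180, n_rho=200):
    """Count points per (theta, rho) cell, where rho = x*cos(theta) + y*sin(theta)."""
    pts = np.asarray(points, dtype=float)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    rho_max = np.max(np.hypot(pts[:, 0], pts[:, 1]))
    # Signed distance of every line through each point, shape (n_points, n_theta).
    rho = pts[:, [0]] * np.cos(thetas) + pts[:, [1]] * np.sin(thetas)
    edges = np.linspace(-rho_max, rho_max, n_rho + 1)
    bins = np.clip(np.digitize(rho, edges) - 1, 0, n_rho - 1)
    acc = np.zeros((n_theta, n_rho), dtype=int)
    for j in range(n_theta):
        acc[j] = np.bincount(bins[:, j], minlength=n_rho)
    return acc, thetas, edges

# Synthetic field: uniform background plus a thin line y = x + 0.5.
rng = np.random.default_rng(0)
background = rng.uniform(-1.0, 1.0, size=(200, 2))
xs = np.linspace(-1.0, 0.5, 60)
stream = np.column_stack([xs, xs + 0.5])
acc, thetas, edges = hough_accumulate(np.vstack([background, stream]))

# The densest accumulator cell gives the orientation and offset of the line.
j, k = np.unravel_index(np.argmax(acc), acc.shape)
theta_peak = thetas[j]
rho_peak = 0.5 * (edges[k] + edges[k + 1])
```

For the line y = x + 0.5, the normal angle is 3π/4 and the offset is 0.5/√2, so the peak cell should land close to those values.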

Euchromatin activity enhances segregation and compaction of heterochromatin in the cell nucleus

Achal Mahajan, W. Yan, Alexandra Zidovska, D. Saintillan, M. Shelley

The large-scale organization of the genome inside the cell nucleus is critical for the cell’s function. Chromatin – the functional form of DNA in cells – serves as a substrate for active nuclear processes such as transcription, replication and DNA repair. Chromatin’s spatial organization directly affects its accessibility by ATP-powered enzymes, e.g., RNA polymerase II in the case of transcription. In differentiated cells, chromatin is spatially segregated into compartments – euchromatin and heterochromatin – the former being largely transcriptionally active and loosely packed, the latter containing mostly silent genes and densely compacted. The euchromatin/heterochromatin segregation is crucial for proper genomic function, yet the physical principles behind it are far from understood. Here, we model the nucleus as filled with hydrodynamically interacting active Zimm chains – chromosomes – and investigate how large heterochromatic regions form and segregate from euchromatin through their complex interactions. Each chromosome presents a block copolymer composed of heterochromatic blocks, capable of crosslinking that increases chromatin’s local compaction, and euchromatic blocks, subjected to stochastic force dipoles that capture the microscopic stresses exerted by nuclear ATPases. These active stresses lead to a dynamic self-organization of the genome, with its coherent motions driving the mixing of chromosome territories as well as large-scale heterochromatic segregation through crosslinking of distant genomic regions. We study the stresses and flows that arise in the nucleus during the heterochromatic segregation, and identify their signatures in Hi-C proximity maps. Our results reveal the fundamental role of active mechanical processes and hydrodynamic interactions in the kinetics of chromatin compartmentalization and in the emergent large-scale organization of the nucleus.

February 22, 2022

The Homogeneity of the Star-forming Environment of the Milky Way Disk over Time

M. Ness, A. J. Wheeler, K. McKinnon, D. Horta Darrington, A. R. Casey, E. Cunningham, A. Price-Whelan

Stellar abundances and ages afford the means to link chemical enrichment to galactic formation. In the Milky Way, individual element abundances show tight correlations with age, which vary in slope across ([Fe/H]–[α/Fe]). Here, we step from characterizing abundances as measures of age, to understanding how abundances trace properties of stellar birth environment in the disk over time. Using measurements from ∼27,000 APOGEE stars (R = 22,500, signal-to-noise ratio > 200), we build simple local linear models to predict a sample of elements (X = Si, O, Ca, Ti, Ni, Al, Mn, Cr) using (Fe, Mg) abundances alone, as fiducial tracers of supernovae production channels. Given [Fe/H] and [Mg/H], we predict these elements, [X/H], to about double the uncertainty of their measurements. The intrinsic dispersion, after subtracting measurement errors in quadrature, is ≈0.015–0.04 dex. The residuals of the prediction (measurement − model) for each element demonstrate that each element has an individual link to birth properties at fixed (Fe, Mg). Residuals from primarily massive-star supernovae (i.e., Si, O, Al) partially correlate with guiding radius. Residuals from primarily supernovae Ia (i.e., Mn, Ni) partially correlate with age. A fraction of the intrinsic scatter that persists at fixed (Fe, Mg), however, after accounting for correlations, does not appear to further discriminate between birth properties that can be traced with present-day measurements. Presumably, this is because the residuals are also, in part, a measure of the typical (in)-homogeneity of the disk's stellar birth environments, previously inferred only using open cluster systems. Our study implies at fixed birth radius and time that there is a median scatter of ≈0.01–0.015 dex in elements generated in supernovae sources.

Interacting Stellar EMRIs as Sources of Quasi-periodic Eruptions in Galactic Nuclei

B. Metzger, N. C. Stone, S. Gilbaum

A star that approaches a supermassive black hole (SMBH) on a circular extreme mass ratio inspiral (EMRI) can undergo Roche lobe overflow (RLOF), resulting in a phase of long-lived mass transfer onto the SMBH. If the interval separating consecutive EMRIs is less than the mass-transfer timescale driven by gravitational wave emission (typically ∼1–10 Myr), the semimajor axes of the two stars will approach each other on scales of ≲ hundreds to thousands of gravitational radii. Close flybys tidally strip gas from one or both RLOFing stars, briefly enhancing the mass-transfer rate onto the SMBH and giving rise to a flare of transient X-ray emission. If both stars reside in a common orbital plane, these close interactions will repeat on a timescale as short as hours, generating a periodic series of flares with properties (amplitudes, timescales, source lifetimes) remarkably similar to the “quasi-periodic eruptions” (QPEs) recently observed from galactic nuclei hosting low-mass SMBHs. A cessation of QPE activity is predicted on a timescale of months to years, due to nodal precession of the EMRI orbits out of alignment by the SMBH spin. Channels for generating the requisite coplanar EMRIs include the tidal separation of binaries (Hills mechanism) or Type I inward migration through a gaseous AGN disk. Alternative stellar dynamical scenarios for QPEs, that invoke single stellar EMRIs on an eccentric orbit undergoing a runaway sequence of RLOF events, are strongly disfavored by formation rate constraints.

Spatial Transformer K-Means

Romain Cosentino, Randall Balestriero, Y. Bahroun, A. Sengupta, Richard Baraniuk, Behnaam Aazhang

K-means is one of the most widely used centroid-based clustering algorithms, with performance tied to the data's embedding. Intricate data embeddings have been designed to push K-means performance at the cost of reduced theoretical guarantees and interpretability of the results. Instead, we propose preserving the intrinsic data space and augmenting K-means with a similarity measure invariant to non-rigid transformations. This enables (i) the reduction of intrinsic nuisances associated with the data, reducing the complexity of the clustering task, increasing performance and producing state-of-the-art results, (ii) clustering in the input space of the data, leading to a fully interpretable clustering algorithm, and (iii) the benefit of convergence guarantees.

Inverse Dirichlet weighting enables reliable training of physics informed neural networks

S. Maddu, et al.

We characterize and remedy a failure mode that may arise from multi-scale dynamics with scale imbalances during training of deep neural networks, such as physics informed neural networks (PINNs). PINNs are popular machine-learning templates that allow for seamless integration of physical equation models with data. Their training amounts to solving an optimization problem over a weighted sum of data-fidelity and equation-fidelity objectives. Conflicts between objectives can arise from scale imbalances, heteroscedasticity in the data, stiffness of the physical equation, or from catastrophic interference during sequential training. We explain the training pathology arising from this and propose a simple yet effective inverse Dirichlet weighting strategy to alleviate the issue. We compare with Sobolev training of neural networks, providing the baseline of analytically ε-optimal training. We demonstrate the effectiveness of inverse Dirichlet weighting in various applications, including a multi-scale model of active turbulence, where we show orders of magnitude improvement in accuracy and convergence over conventional PINN training. For inverse modeling using sequential training, we find that inverse Dirichlet weighting protects a PINN against catastrophic forgetting.

Heating of Magnetically Dominated Plasma by Alfvén-Wave Turbulence

J. Nättilä, A. Beloborodov

Magnetic energy around astrophysical compact objects can strongly dominate over plasma rest mass. Emission observed from these systems may be fed by dissipation of Alfvén wave turbulence, which cascades to small damping scales, energizing the plasma. We use 3D kinetic simulations to investigate this process. When the cascade is excited naturally, by colliding large-scale Alfvén waves, we observe quasithermal heating with no nonthermal particle acceleration. We also find that the particles are energized along the magnetic field lines and so are poor producers of synchrotron radiation. At low plasma densities, our simulations show the transition to “charge-starved” cascades, with a distinct damping mechanism.

Measuring errors over time: towards a quantitative theory of chromosome segregation error correction

G. Ha, P. Dieterle, H. Shen, D. Needleman

The mammalian mitotic spindle segregates an equal number of chromosomes to daughter cells. Over the course of spindle assembly, many initially erroneous attachments between kinetochores and microtubules are fixed through a process called error correction. Despite the importance of chromosome segregation errors in many human health conditions, we lack quantitative methods to characterize the dynamic error correction process and how it is impaired in disease states. We have developed a novel experimental method and analysis framework to quantify chromosome segregation error correction in human tissue culture cells with live cell confocal imaging of spindle assembly, timed premature chromosome separation, and automated counting of kinetochores after cell division. Using our assay we targeted Aurora B kinase, a key regulator of kinetochore-microtubule attachments, with two small molecules that either inhibited Aurora B activity or perturbed its localization. While both inhibitors increased the steady state error baseline over 10-fold from control, they differed in their initial error states and times to reach steady state. Our results indicate that error correction dynamics, and not just endpoint segregation errors, are important for understanding the involvement of proteins in error correction. Future work will focus on distinguishing the functional roles of different proteins in error correction, characterizing how kinetochore-microtubule affinity and microtubule stability determine error correction dynamics, and constructing and testing a mathematical theory of error correction.
