443 Publications
Good Rates From Bad Coordinates: The Exponential Average Time-dependent Rate Approach
Nicodemo Mazzaferro, Subarna Sasmal, P. Cossio, Glen M. Hocky
Learning sum of diverse features: computational hardness and efficient gradient-based training for ridge combinations
Kazusato Oko, Yujin Song, Taiji Suzuki, D. Wu
Variational Inference for Uncertainty Quantification: an Analysis of Trade-offs
C. Margossian, L. Pillaud-Vivien, L. Saul
How Truncating Weights Improves Reasoning in Language Models
Lei Chen, Joan Bruna, A. Bietti
Contextual Counting: A Mechanistic Study of Transformers on a Quantitative Task
Siavash Golkar, A. Bietti, Mariel Pettee, Michael Eickenberg, et al.
Crowdsourcing with Difficulty: A Bayesian Rating Model for Heterogeneous Items
Seong Woo Han, Ozan Adıgüzel, B. Carpenter
Neurosift: DANDI exploration and NWB visualization in the browser
J. Magland, J. Soules, Cody Baker, Benjamin Dichter
Why is parameter averaging beneficial in SGD? An objective smoothing perspective
Atsushi Nitanda, Ryuhei Kikuchi, Shugo Maeda, D. Wu