428
Publications
Should Under-parameterized Student Networks Copy or Average Teacher Weights?
B. Şimşek, Amire Bendjeddou, Wulfram Gerstner, Johanni Brea
Uniform approximation of common Gaussian process kernels using equispaced Fourier grids
A. Barnett, Philip Greengard, Ph.D., M. Rachh
Directional Smoothness and Gradient Methods: Convergence and Adaptivity
Aaron Mishkin, Ahmed Khaled, Yuanhao Wang, Aaron Defazio, R. M. Gower
Stochastic Optimal Control Matching
Carles Domingo-Enrich, J. Han, Brandon Amos, Joan Bruna, Ricky T. Q. Chen