Gerald D. Fischbach Auditorium, 2nd Floor
160 Fifth Avenue at 21st Street
New York, NY 10010
Deadline: Friday, April 28
Participation is by invitation only. All participants must register.
Friday, May 19

9:00 AM   Breakfast & Welcome
9:35 AM   Irit Dinur
10:25 AM  Break
10:40 AM  Tengyu Ma
11:30 AM  Break
11:45 AM  Jelani Nelson
12:35 PM  Lunch & Discussion
2:15 PM   Moritz Hardt
3:05 PM   Break
3:20 PM   Afonso Bandeira
4:10 PM   Tea
5:10 PM   Public Lecture: Alex Andoni
Tengyu Ma, Princeton University
Jelani Nelson, Harvard University
Moritz Hardt
Afonso Bandeira, New York University
Irit Dinur, Weizmann Institute of Science
Alex Andoni, Columbia University
Better understanding of non-convex methods in machine learning
Recent breakthroughs in machine learning, especially deep learning, often involve learning complex, high-dimensional models on massive datasets with non-convex optimization. Though these methods are empirically successful, their formal study is much less developed.
The talk will showcase a few results that aim to develop new algorithmic approaches and analysis tools in these settings. First, we show that matrix completion — a famous problem in machine learning — can be solved by stochastic gradient descent on the straightforward non-convex objective function in polynomial time. (Formally, we show that all local minima of the objective are also global minima.) Then, we will analyze the landscape of the objective functions for linearized residual nets, and demonstrate that over-parameterization and re-parameterization of the models can make the optimization easier.
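As a toy illustration of the matrix-completion claim above (not the construction from the talk), the following NumPy sketch runs plain gradient descent on the non-convex factored objective; all dimensions, the sampling rate, and the step size are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 50, 3

# Ground-truth rank-r matrix M = U* U*^T, with roughly half the entries observed
U_star = rng.normal(size=(n, r))
M = U_star @ U_star.T
mask = rng.random((n, n)) < 0.5
mask = np.triu(mask) | np.triu(mask).T        # symmetric observation pattern

# Gradient descent on the non-convex objective
#   f(U) = 0.5 * || P_Omega(U U^T - M) ||_F^2
U = 0.1 * rng.normal(size=(n, r))             # small random initialization
lr = 0.002
for _ in range(5000):
    R = mask * (U @ U.T - M)                  # residual on observed entries only
    U -= lr * (R + R.T) @ U                   # gradient of f with respect to U

rel_err = np.linalg.norm(U @ U.T - M) / np.linalg.norm(M)
```

Despite the non-convexity, the iterate recovers the planted low-rank matrix to small relative error, consistent with the "all local minima are global" phenomenon described above.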
Based on joint works with Rong Ge, Jason D. Lee, and Moritz Hardt. No prior knowledge of machine learning is assumed.
Optimality of the Johnson-Lindenstrauss lemma
Dimensionality reduction in Euclidean space, as attainable by the Johnson-Lindenstrauss lemma (also known as “random projections”), has been a fundamental tool in algorithm design and machine learning. The JL lemma states that any n points in Euclidean space can be mapped to m-dimensional Euclidean space while preserving all pairwise distances up to 1+epsilon, where m only needs to be on the order of (log n) /epsilon^2, independent of the original dimension. In this talk, I discuss our recent proof that the JL lemma is optimal, in the sense that for any n there are point sets of size n such that no embedding providing (1+epsilon)-distortion exists into a dimension that is more than a constant factor better than what the JL lemma guarantees. I will also discuss some subsequent work and future directions.
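The upper-bound direction of the JL lemma is easy to demonstrate empirically. Below is a minimal sketch using a Gaussian random projection; the constant 8 in the target dimension and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, eps = 100, 10_000, 0.5
m = int(8 * np.log(n) / eps**2)               # target dimension ~ (log n) / eps^2

X = rng.normal(size=(n, d))                   # n points in d dimensions
G = rng.normal(size=(d, m)) / np.sqrt(m)      # Gaussian random projection
Y = X @ G

def pairwise_dists(A):
    sq = (A ** 2).sum(axis=1)
    D2 = sq[:, None] + sq[None, :] - 2 * A @ A.T
    i, j = np.triu_indices(len(A), k=1)
    return np.sqrt(np.maximum(D2[i, j], 0))

# All pairwise distances survive up to roughly a (1 +/- eps) factor
ratios = pairwise_dists(Y) / pairwise_dists(X)
```

Note that m depends only on n and epsilon, not on the original dimension d; the optimality result in the talk says this m cannot be improved by more than a constant factor.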
Joint work with Kasper Green Larsen (Aarhus University).
Gradient Descent Learns Linear Dynamical Systems
We prove that gradient descent efficiently converges to the global optimizer of the maximum likelihood objective for an unknown linear time-invariant dynamical system, given a sequence of noisy observations generated by the system. Even though the objective function is non-convex, we provide polynomial running-time and sample-complexity bounds under strong but natural assumptions. Linear system identification has been studied for many decades, yet, to the best of our knowledge, these are the first polynomial guarantees for the problem we consider.
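As a toy version of this setup (a scalar system, finite-difference gradients, and hand-picked step sizes, none of which are from the paper), one can fit an unknown stable system by gradient descent on the non-convex rollout loss:

```python
import numpy as np

rng = np.random.default_rng(2)

# Unknown stable scalar system: h_{t+1} = a h_t + b u_t, observed as y_t = h_t + noise
a_true, b_true, T = 0.8, 1.0, 200
u = rng.normal(size=T)

def rollout(a, b):
    h, ys = 0.0, np.empty(T)
    for t in range(T):
        ys[t] = h
        h = a * h + b * u[t]
    return ys

y = rollout(a_true, b_true) + 0.01 * rng.normal(size=T)

def loss(p):
    return np.mean((rollout(p[0], p[1]) - y) ** 2)

# Gradient descent on the (non-convex) rollout loss; gradients by central
# finite differences, clipped to keep updates small and the iterate stable
p, lr, fd = np.array([0.1, 0.1]), 0.01, 1e-6
for _ in range(3000):
    g = np.array([(loss(p + fd * e) - loss(p - fd * e)) / (2 * fd)
                  for e in np.eye(2)])
    p -= lr * np.clip(g, -1.0, 1.0)
```

The iterate converges to the true parameters (a, b) = (0.8, 1.0) despite the non-convexity of the multi-step prediction loss; the result in the talk gives polynomial guarantees for the general multi-dimensional version of this problem.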
Joint work with Tengyu Ma and Ben Recht.
New York University
On Phase Transitions for Spiked Random Matrix and Tensor Models
A central problem of random matrix theory is to understand the eigenvalues of spiked random matrix models, in which a prominent eigenvector (or low-rank structure) is planted into a random matrix. These distributions form natural statistical models for principal component analysis (PCA) problems throughout the sciences, where the goal is often to recover or detect the planted low-rank structure. In this talk, we discuss fundamental limitations of statistical methods for these tasks, as well as methods that outperform PCA at them. Emphasis will be given to low-rank structures arising in synchronization problems. Time permitting, analogous results for spiked tensor models will also be discussed.
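The phase transition for PCA in the rank-one spiked Wigner model is easy to see numerically. A minimal sketch (the normalization and signal strengths are illustrative; the talk covers much more general settings):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
x = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)   # planted unit-norm spike

def top_eigvec_corr(lam):
    W = rng.normal(size=(n, n))
    W = (W + W.T) / np.sqrt(2 * n)                 # Wigner noise, entries ~ N(0, 1/n)
    Y = lam * np.outer(x, x) + W                   # rank-one spike plus noise
    vals, vecs = np.linalg.eigh(Y)                 # eigenvalues in ascending order
    return abs(vecs[:, -1] @ x)                    # |<top eigenvector, spike>|

# Below the threshold lam = 1 the top eigenvector is essentially uncorrelated
# with the spike; above it, the correlation is macroscopic.
weak, strong = top_eigvec_corr(0.5), top_eigvec_corr(2.0)
```

Above the threshold the squared correlation concentrates around 1 - 1/lam^2, while below it PCA carries essentially no information about the spike, which is the kind of fundamental limitation the talk addresses.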
Joint work with Amelia Perry, Alex Wein, and Ankur Moitra.
Weizmann Institute of Science
Agreement testing and PCPs
I will describe the notion of agreement testing, which allows one to deduce global structure from local agreement checks. In retrospect, agreement testing theorems are a key combinatorial component of nearly all PCPs.
Recently, we have discovered a connection between agreement tests and high dimensional expanders. Namely, every high dimensional expander gives rise to an agreement theorem. I will describe (some properties of) the fascinating bounded degree high dimensional expanders that were constructed by Lubotzky, Samuels, and Vishne. Since these objects are sparse, they imply a strong derandomization of direct product and direct sum testing.
Based on joint work with Tali Kaufman.
The Geometry of Similarity Search
How does one efficiently search for similar items in a large dataset, say, of images? This is a classic algorithmic task that arises when dealing with massive datasets. Its applications span many areas, including information retrieval, machine learning, data mining, computer vision, signal processing, bioinformatics, etc. For example, this task underlies the classic ‘nearest neighbor’ classification rule in machine learning. The natural solution for similarity search — to scan the entire dataset, comparing a given item to each item in the dataset — is prohibitively slow for modern datasets.
Andoni will describe how efficient solutions for similarity search benefit from the tools and perspectives of high-dimensional geometry. The latter emerged as the natural language of similarity search: e.g., a 20 x 20 pixel image naturally corresponds to a 400-dimensional vector, one coordinate per pixel. Andoni will show how geometric tools such as dimension reduction and randomized space partitions enable efficient similarity search algorithms.
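One concrete instance of a randomized space partition is hashing by random hyperplanes (a SimHash-style locality-sensitive hash for cosine similarity). The sketch below is self-contained and illustrative; it is not one of the data-dependent constructions from the lecture, and all sizes are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(4)
d, n, k = 400, 10_000, 16
data = rng.normal(size=(n, d))

# SimHash: the bucket key is the sign pattern of k random hyperplanes,
# so nearby vectors (small angle) tend to land in the same bucket
H = rng.normal(size=(k, d))
codes = data @ H.T > 0                        # (n, k) boolean sign patterns

buckets = {}
for i, c in enumerate(codes):
    buckets.setdefault(c.tobytes(), []).append(i)

def query(q):
    # Scan only the bucket q lands in, instead of all n points
    cand = buckets.get((H @ q > 0).tobytes(), [])
    return max(cand,
               key=lambda i: data[i] @ q /
                   (np.linalg.norm(data[i]) * np.linalg.norm(q)),
               default=None)
```

With k = 16 hyperplanes the dataset is split across thousands of buckets, so each query compares against only a tiny fraction of the points, illustrating how a geometric random partition replaces the prohibitive linear scan described above.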
Air and Train
Group A
The foundation will arrange and pay for all air and train travel to the Annual Meeting for those in Group A. Please provide your travel specifications by clicking the registration link above. If you are unsure of your group, please refer to your invitation sent via email.

Group B
The foundation will cover hotel accommodations, if requested, for those in Group B. Please register at the link above so that we are able to capture your hotel and dietary requirements. If you are unsure of your group, please refer to your invitation sent via email.

Group C
Individuals in Group C are considered local and will not receive travel or hotel support. Please register at the link above so that we are able to capture your dietary requirements. If you are unsure of your group, please refer to your invitation sent via email.

Personal Car
For participants driving to Manhattan, The Roger Hotel offers valet parking. Please note there are no in-and-out privileges when using the hotel’s garage, so participants are encouraged to walk or take public transportation to the Simons Foundation.
Participants in Groups A & B who require accommodations are hosted by the foundation for a maximum of two nights at The Roger Hotel:
The Roger New York
131 Madison Avenue
New York, NY 10016
(between 30th and 31st Streets)
Any additional nights are at the attendee’s own expense.
To arrange accommodations, please register at the link above.
Reimbursement and Travel Policy
Any expenses not directly paid for by the Simons Foundation are subject to reimbursement based on the travel policy. An email will be sent within a week following the conclusion of the symposium with further instructions on submitting your expenses via the foundation’s web-based expense reimbursement platform.
Receipts are required for any expenses over $50 USD and are due within thirty (30) days after the conclusion of the symposium. Should you have any questions, please contact Emily Klein.
Travel and Hotel Assistance
Christophe Vergnol, Protravel International

General Meeting Assistance
Meghan Fazzi