2023 Mathematical and Scientific Foundations of Deep Learning Annual Meeting

Peter Bartlett, University of California, Berkeley
Rene Vidal, Johns Hopkins University

Meeting Goals:
This meeting will bring together members of the NSF-Simons Research Collaborations on the Mathematical and Scientific Foundations of Deep Learning (MoDL) and of projects in the NSF program on Stimulating Collaborative Advances Leveraging Expertise in the Mathematical and Scientific Foundations of Deep Learning (SCALE MoDL). The focus of the meeting is the set of challenging theoretical questions posed by deep learning methods and the development of mathematical and statistical tools to understand their success and limitations, to guide the design of more effective methods, and to initiate the study of the mathematical problems that emerge. The meeting aims to report on progress in these directions and to stimulate discussions of future

  • Agenda

    Thursday, September 28th

    9:30 AM Emmanuel Abbe | Neural networks, logic, and generalization on the unseen
    10:30 AM BREAK
    11:00 AM Shankar Sastry | Human AI teams in societal systems transformation
    12:00 PM LUNCH
    1:00 PM Misha Belkin | The challenges of training infinitely large neural networks
    2:00 PM BREAK
    2:30 PM Joshua Vogelstein | Beyond IID: surprising results in out-of-distribution and prospective learning theory
    3:30 PM BREAK
    4:00 PM Nati Srebro | An agnostic view of overfitting

    Friday, September 29th

    9:30 AM Robert Ghrist | Topological Tools in Deep Learning: Sheaves, Cohomology, and Laplacians
    10:30 AM BREAK
    11:00 AM Bin Yu | Sparse dictionary learning and deep learning in practice and theory
    12:00 PM LUNCH
    1:00 PM Guillermo Sapiro | Large Observational Study of the Causal Effects of a Nudge and the Geometry of Causality
  • Abstracts

    Emmanuel Abbe

    Neural networks, logic, and generalization on the unseen

    We consider the learning of Boolean/logic functions with neural networks trained by SGD. We first discuss the time and sample complexity to control the classic generalization error, via the leap complexity measure. We then move to the generalization on the unseen (GOTU) setting. There we show that certain architectures like Transformers have a bias towards learning min-degree-profile interpolators. This provides an explanation for the limitation of such architectures to “extrapolate” or “reason” in out-of-distribution scenarios, such as for length generalization problems. We then conclude with some directions to improve the latter.

    Misha Belkin
    UC San Diego

    The challenges of training infinitely large neural networks

    Remarkable recent advances in deep neural networks are rapidly changing science and society. Never before has a technology been deployed so widely and so quickly with so little understanding of its fundamentals. I will argue that developing a fundamental mathematical theory of deep learning is necessary for a successful AI transition and, furthermore, that such a theory may well be within reach. I will discuss what such a theory might look like and some of its ingredients that we already have available. In particular, I will discuss why infinitely wide neural networks make sense from both theoretical and practical points of view and how feature learning can be incorporated into the resulting algorithms.

    Robert Ghrist
    University of Pennsylvania

    Topological Tools in Deep Learning: Sheaves, Cohomology, and Laplacians

    This presentation gives an introduction to the intersection of sheaf theory and cohomology with deep learning. Sheaves are data structures over topological spaces and serve as a powerful language for articulating complex relationships over networks and higher-order structures. Cohomology is the theory that converts local structure and constraints into global qualitative features of the data structure. Classical entities in deep learning (neural nets, the graph Laplacian, convolution) lift to richer objects and actions on sheaves and cohomology. The talk will focus on introducing the framework and generating novel Laplacian operators to solve distributed problems, with an emphasis on applicability to network learning algorithms. Also addressed is the generalization of these methods from vector-valued data with linear constraints to lattice-valued data with nonlinear structure. Although rooted deeply in algebraic-topological methods, the concepts and techniques introduced will shed light on practical applications and be accessible to all.

    Guillermo Sapiro
    Duke University

    Large Observational Study of the Causal Effects of a Nudge and the Geometry of Causality

    Nudges are interventions promoting healthy behavior without forbidding options or significant incentives. As an example of a nudge, the Apple Watch encourages users to stand by delivering a notification if they have been sitting for the first 50 minutes of an hour.

    We use 76 billion minutes of observational standing data from 160,000 subjects in the public Apple Heart and Movement Study, a volume of data that makes this work one of the largest studies ever conducted on the subject. From these data we estimate the causal effect of this notification using a novel regression discontinuity design for time-series data with time-varying treatment. We show that the nudge increases the probability of standing by up to 44%, a very significant effect compared to what has been reported in the literature, and that it remains effective over time, even after almost 2 years. The nudge’s effectiveness increases with age, and it is independent of gender. Closing Apple Watch Activity Rings, a visualization of participants’ daily progress in Move, Exercise, and Stand, further increases the nudge’s impact. We conclude the presentation with some recent work on connections between geometry and causal inference.

    The first part of the presentation is joint work with Achille Nazaret while the second is with Amir Farzam and Allen Tannenbaum.

    Nati Srebro

    An agnostic view of overfitting

    In recent years it has become clear that overfitting is not always as catastrophic as classical statistical learning theory suggests, and that even in noisy settings, interpolating learning can still yield good generalization. This has led to a line of research correcting our classical understanding and seeking to explain how, and more importantly when, overfitting is benign and doesn’t hurt generalization. Most of this work in the past couple of years focused on achieving consistency in linear (or kernel) regression under strong distribution assumptions. In this talk I will discuss a view of overfitting which is more in line with agnostic distribution-independent learning theory, and focuses not on consistency but on asking “how much do we lose by overfitting vs optimally balancing fit and complexity,” as well as on more “universal” models of learning that go beyond linear regression.

    Bin Yu
    University of California, Berkeley

    Sparse dictionary learning and deep learning in practice and theory

    Sparse dictionary learning has a long history and produces wavelet-like filters when fed with natural image patches, corresponding to the V1 primary visual cortex of the human brain. Wavelets as local Fourier Transforms are interpretable in physical sciences and beyond. In this talk, we will first describe adaptive wavelet distillation (AWD), which makes black-box deep learning models interpretable in cosmology and cellular biology problems while improving predictive performance. Then we present theoretical results showing that, under simple sparse dictionary models, gradient descent in auto-encoder fitting converges to one point on a manifold of global minima, and that which minimum it reaches depends on the batch size. In particular, we show that when using a small batch size as in stochastic gradient descent (SGD), a qualitatively different type of “feature selection” occurs.

  • Travel & Hotel

    Air and Train
    For individuals in Groups A and B the foundation will arrange and pay for round-trip travel from their home city to the conference.

    All travel and hotel arrangements must be booked through the Simons Foundation’s preferred travel agency. Travel arrangements not booked through the preferred agency must be pre-approved by the Simons Foundation and a reimbursement quote must be obtained through the foundation’s travel agency. Travel specifications can be provided by clicking the registration link above.

    Personal Car
    Personal car trips over 250 miles each way require prior approval from the Simons Foundation via email.

    The James NoMad Hotel offers valet parking. Please note that there are no in-and-out privileges when using the hotel’s garage; participants are therefore encouraged to walk or take public transportation to the Simons Foundation.

    Participants in Groups A & B who require accommodations are hosted by the foundation for a maximum of three nights at The James NoMad Hotel. Any additional nights are at the attendee’s own expense. To arrange accommodations, please register at the link above.

    The James NoMad Hotel
    22 E 29th St
    New York, NY 10016
    (between 28th and 29th Streets)

  • Attendance & Reimbursement Policies

    In-person participants and speakers are expected to attend all meeting days. Partial participation is permitted so long as the individual fully attends the first day, which is typically Thursday for two-day meetings. Participants receiving hotel and travel support wishing to arrive on meeting days which conclude at 2:00 PM will be asked to attend remotely.

    COVID-19 Vaccination
    Individuals accessing Simons Foundation and Flatiron Institute buildings must be fully vaccinated against COVID-19.

    Entry & Building Access
    Upon arrival, guests will be required to show their photo ID to enter the Simons Foundation and Flatiron Institute buildings. After checking in at the meeting reception desk, guests will be able to show their meeting name badge to re-enter the building. If you forget your name badge, you will need to provide your photo ID.

    The Simons Foundation and Flatiron Institute buildings are not considered “open campuses” and meeting participants will only have access to the spaces in which the meeting will take place. All other areas are off limits without prior approval.

    If you require a private space to conduct a phone call or remote meeting, please contact your meeting manager at least 48 hours ahead of time so that they may book a space for you within the foundation’s room reservation system.

    Meeting participants are required to give 24 hours’ advance notice of any guests meeting them at the Simons Foundation either before or after the meeting. Outside guests are discouraged from joining meeting activities, including meals.

    Ad hoc meeting participants who did not receive a meeting invitation directly from the Simons Foundation are discouraged.

    Children under the age of 18 are not permitted to attend meetings at the Simons Foundation. Furthermore, the Simons Foundation does not provide childcare facilities or support of any kind. Special accommodations will be made for nursing parents.

    Individuals in Groups A & B will be reimbursed for meals and local expenses including ground transportation. Expenses should be submitted through the foundation’s online expense reimbursement platform after the meeting’s conclusion.

    Expenses accrued as a result of meetings not directly related to the Simons Foundation-hosted meeting (a satellite collaboration meeting held at another institution, for example) will not be reimbursed by the Simons Foundation and should be paid by other sources.

    Below are key reimbursement takeaways; a full policy will be provided with the final logistics email circulated approximately 2 weeks prior to the meeting’s start.

    The daily meal limit is $125 and itemized receipts are required for expenses over $24 USD. The foundation DOES NOT provide a meal per diem and only reimburses actual meal expenses.

    • Meals taken on travel days are reimbursable.
    • Meals taken outside those provided by the foundation (breakfast, lunch, breaks and/or dinner) are not reimbursable.
    • If a meal was not provided on a meeting day, dinner for example, that expense is reimbursable.
    • Meals taken on days not associated with Simons Foundation-coordinated events are not reimbursable.
    • Minibar expenses are not reimbursable.
    • Meal expenses for a non-foundation guest are not reimbursable.
    • Group meals consisting of fellow meeting participants paid by a single person will be reimbursed up to $65 per person per meal and the amount will count towards each individual’s $125 daily meal limit.

    Ground Transportation
    Expenses for ground transportation will be reimbursed for travel days (i.e. traveling to/from the airport) as well as local transportation. While in NYC, individuals are encouraged to use public transportation and not use taxi, Uber or Lyft services.

  • Contacts

    Registration and Travel Assistance
    Ovation Travel Group
    [email protected]
    (917) 408-8384 (24-Hours)

    Meeting Questions and Assistance
    Emily Klein
    Assistant Manager, Events, MPS, Simons Foundation
    [email protected]
    (646) 751-1262
