New Directions in Theoretical Machine Learning

Sanjeev Arora, Princeton University
Maria-Florina Balcan, Carnegie Mellon University
Sanjoy Dasgupta, University of California, San Diego
Sham Kakade, University of Washington

The Simons Symposium on New Directions in Theoretical Machine Learning brought together a select group of experts with diverse skills and backgrounds to discuss the following questions:

  • How can we build on the recent success in supervised learning for perceptual and related tasks?
  • What’s next for ML if perception gets solved?
  • Is the current set of methods sufficient to take us to the next level of “intelligent” reasoning?
  • If not, what is missing, and how can we rectify it?
  • What role do classical ideas in Reasoning, Representation Learning, Reinforcement Learning, Interactive Learning, etc., have to play?
  • What modes of analysis do we need to even conceptualize the next level of machine capabilities, and what will be good ways to test those capabilities?
  • Meeting Report

    Overview: The workshop brought together a select group of experts with different backgrounds to discuss the next set of theoretical challenges for advancing machine learning and artificial intelligence. There have been a number of breakthrough developments in the last few years: language models are rapidly getting better (in the last few months, there have been impressive advances in computer-generated text); image recognition is working far more accurately in new domains (relevant to scene parsing and navigation); and progress in robotics is rapidly accelerating (with much focus on self-driving cars). The workshop had leading experts in all of these areas and focused on formulating new foundational and algorithmic questions to accelerate the progress of the field.

    Developments and Novel Questions:

    What is the role of models and physics in robotics, reinforcement learning, interactive learning and self-driving cars?

    A central question in robotics and interactive learning is how to make use of models of the environment (such as physical simulators). While much of machine learning does not utilize models of the world (ML is largely prediction based), robotics is an area where it is increasingly clear that models of the environment are important. D. Bagnell is one of the world’s leading experts in self-driving cars, along with being a leading robotics theorist, and he discussed the interplay of human demonstration with physical models; the focus was on how to obtain provably better planning algorithms (say, for self-driving cars) by utilizing models of the world along with human examples. E. Todorov is a robotics theorist who has also developed one of the most widely used physics simulators in robotics; he spoke about how ML methods for robotics are unlikely to succeed without incorporating physical models. E. Brunskill discussed the fundamental limits on how accurately we need to learn a model in order to utilize it for planning purposes.

    How do our current set of methods need to be modified in order to take us to the next level of ‘intelligent’ reasoning?

    There has been much recent remarkable success in natural language processing. A number of talks addressed the key shortcomings of current language modeling approaches, in particular their limited ability to capture reasoning and long-term dependencies. D. Roth gave a thought-provoking talk on how to incorporate reasoning and logic in language models through a notion of incidental supervision, where an agent learns how to reason from co-occurrences in the data. G. van den Broeck talked about systems for probabilistic reasoning that can both be learned and used efficiently, overcoming complexity barriers that have traditionally hobbled the field. S. Kakade discussed the challenges with language models that utilize long-term memory, relevant to reasoning from information stored in the ‘deep past.’ On a related topic, C. Szegedy discussed the role of reasoning in the context of mathematics and theorem proving.

    How can representation learning and unsupervised learning be used for faster learning in new domains and for finding better perceptual representations?

    For example, people learn to identify new objects from just a few examples. There is an increasing belief that such accelerated learning may be possible with machine learning approaches in the near future, where systems can likewise rapidly learn to identify new objects. P. Isola gave a talk, “Toward the Evolution and Development of Prior Knowledge,” which focused on insightful ideas concerning how to build systems that learn from implicit signals (e.g., how to incorporate video information when learning about object recognition). S. Arora presented a theory of representation learning that explains how perceptual representations in language (or other domains, including vision) can be learned in an unsupervised manner (without a teacher), allowing for more rapid learning in new contexts.

    What are new models of learning, such as lifelong learning, learning with a teacher, etc.?

    There is a growing need to have systems evolve over time and gradually improve from experience. T. Mitchell gave a talk about systems that continually learn from experience based on insights from his Never-Ending Language Learner (NELL) system, which was a continually running algorithm for learning. S. Dasgupta spoke about models in which a benign teacher chooses examples that enable a learner to quickly acquire an accurate model. D. Sadigh and Y. Yue discussed challenges in settings where a robot or other mechanical device must interact with and learn from a human.

    Other topics on the frontiers of machine learning

    1. To what extent, and in what ways, can causality be inferred from data? (B. Schölkopf)
    2. How can data be used to guide the design of algorithms? (N. Balcan)
    3. What is the new theoretical landscape of privacy models, in the wake of recent EU regulation? (K. Chaudhuri)

    Another important direction discussed was data-driven algorithm design for combinatorial problems, an important aspect of modern data science and algorithm design. Rather than using off-the-shelf algorithms that only have worst-case performance guarantees, practitioners typically optimize over large families of parameterized algorithms, tuning the parameters on a training set of problem instances from their domain to find a configuration with high expected performance over future instances. So far, however, most of this work has come with no performance guarantees. Nina Balcan presented recent provable guarantees for these scenarios, in both the batch and online settings, where a collection of typical problem instances from the given application is presented either all at once or in an online fashion, respectively. The challenge is that for many combinatorial problems of significant importance to machine learning, including partitioning and subset-selection problems, a small tweak to the parameters can cause a cascade of changes in the algorithm’s behavior, so the algorithm’s performance is a discontinuous function of its parameters; this leads to very interesting learning-theory questions as well.
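    The tuning loop described above can be sketched in a few lines. This is a minimal illustration, not from the talk: the parameterized family here is a greedy knapsack heuristic whose scoring exponent `rho` is tuned by grid search over a training set of random instances; the heuristic, the parameter, and the helper names are all illustrative choices.

    ```python
    # Illustrative sketch of data-driven algorithm design (batch setting):
    # tune the exponent rho of a parameterized greedy heuristic on training
    # instances, then deploy the best rho on future instances.
    import random

    def greedy_knapsack(values, weights, capacity, rho):
        """Greedy heuristic: take items in decreasing value / weight**rho order."""
        order = sorted(range(len(values)),
                       key=lambda i: values[i] / weights[i] ** rho,
                       reverse=True)
        total_value = total_weight = 0.0
        for i in order:
            if total_weight + weights[i] <= capacity:
                total_weight += weights[i]
                total_value += values[i]
        return total_value

    def tune_rho(instances, grid):
        """Pick the rho with the highest average value over training instances."""
        def avg_value(rho):
            return sum(greedy_knapsack(v, w, c, rho)
                       for v, w, c in instances) / len(instances)
        return max(grid, key=avg_value)

    # Synthetic training set of knapsack instances from "the application domain."
    random.seed(0)
    train = []
    for _ in range(50):
        v = [random.uniform(1, 10) for _ in range(20)]
        w = [random.uniform(1, 10) for _ in range(20)]
        train.append((v, w, 25.0))

    grid = [i / 10 for i in range(21)]  # candidate exponents 0.0 .. 2.0
    best_rho = tune_rho(train, grid)
    print(best_rho)
    ```

    Note that on any single instance the heuristic's value is a piecewise-constant function of `rho` (it only changes when two items swap places in the greedy ordering), which illustrates the discontinuity that makes the learning-theoretic analysis of such parameter tuning nontrivial.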

    Future collaborations:

    The interactions between the researchers were excellent, with active discussion and potential future follow-ups in a number of areas. These include:

    • Tengyu Ma and Sham Kakade plan to examine the question of implicit regularization in deep neural networks for language models.
    • Drew Bagnell, Sham Kakade, Emo Todorov and Elad Hazan discussed questions of error feedback learning (and iterative learning control) as robust control methods, where we could obtain provable guarantees.
    • Nina Balcan and Yisong Yue discussed providing fast branch and bound algorithms with provable guarantees for solving mixed-integer programs by using a data-driven algorithm design approach.
    • Nina Balcan, Sanjoy Dasgupta, Guy van den Broeck and Dan Roth talked about label-efficient learning in large-scale multitask scenarios where one can exploit known or learned logical constraints among tasks.
    • Phillip Isola and Sham Kakade discussed questions regarding a theory and algorithm for learning from ‘two views’ (e.g., learn from a video stream, where one view is the past and the other is the ‘future’).
    • Phillip Isola and Sanjeev Arora plan to further discuss ideas for representation learning.
  • Agenda & Slides


    10:00 - 11:00 AM  Christian Szegedy | Is Math Only for Humans?
    11:30 - 12:30 PM  Tom Mitchell | What Questions Should a Theory of Learning Agent Answer?
    5:00 - 6:00 PM  Dan Roth | Incidental Supervision and Reasoning Challenges
    6:15 - 7:15 PM  Sanjeev Arora | Discussion: Thoughts on a theory for unsupervised learning, with applications to RL, learning to learn, etc.

    10:00 - 11:00 AM  Drew Bagnell | Imitation, Feedback and Games
    11:30 - 12:30 PM  Emo Todorov | Model-based Control
    Andreas Krause | Towards Safe Reinforcement Learning
    5:00 - 6:00 PM  Yisong Yue | Blending Learning & Control via Functional Regularization
    6:15 - 7:15 PM  Emma Brunskill | Machine Learning Challenges from Computers that Learn to Help People

    9:45 - 2:00 PM  Guided Hike to Partnach Gorge
    5:00 - 6:00 PM  Elad Hazan | New Algorithms and Directions in Reinforcement Learning
    6:15 - 7:15 PM  Phillip Isola | From Amoebas to Infants: Toward the Evolution and Development of Prior Knowledge

    10:00 - 11:00 AM  Dorsa Sadigh | Machine Learning for Human-Robot Systems
    11:30 - 12:30 PM  Nina Balcan | Data-driven/Machine-learning Augmented Algorithm Design
    5:00 - 6:00 PM  Bernhard Schölkopf | Toward causal learning
    6:15 - 7:15 PM  Sanjoy Dasgupta | Using Interaction to Overcome Basic Hurdles in Learning
    Daniel Hsu | Interactive Learning via Reductions

    10:00 - 11:00 AM  Ulrike von Luxburg | Comparison-based Machine Learning
    Sham Kakade | Learning, Memory, and Entropy
    11:30 - 12:30 PM  Guy van den Broeck | Circuit Languages as a Synthesis of Learning and Reasoning
    5:00 - 6:00 PM  Kamalika Chaudhuri | New Directions in Privacy-preserving Data Analysis
    Moritz Hardt | The Sociotechnical Forces Against Overfitting
    6:15 - 7:15 PM  Tengyu Ma | Data-dependent Regularization and Sample Complexity Bounds of Deep Neural Networks
  • Participants
    Sanjeev Arora, Princeton University
    Drew Bagnell, Aurora Innovation
    Maria-Florina Balcan, Carnegie Mellon University
    Emma Brunskill, Stanford University
    Kamalika Chaudhuri, University of California, San Diego
    Sanjoy Dasgupta, University of California, San Diego
    Moritz Hardt, University of California, Berkeley
    Elad Hazan, Princeton University
    Daniel Hsu, Columbia University
    Phillip Isola, Massachusetts Institute of Technology
    Sham Kakade, University of Washington
    Andreas Krause, ETH Zürich
    Tengyu Ma, Stanford University
    Tom Mitchell, Carnegie Mellon University
    Dan Roth, University of Pennsylvania
    Dorsa Sadigh, Stanford University
    Bernhard Schölkopf, Max Planck Institute for Intelligent Systems
    Christian Szegedy, Google
    Emo Todorov, University of Washington
    Guy van den Broeck, University of California, Los Angeles
    Ulrike von Luxburg, University of Tübingen
    Yisong Yue, California Institute of Technology