Machine Learning at the Flatiron Institute: Clément Hongler

Date


Title: Arrows of Time for Large Language Models

Abstract:
Large Language Models famously predict the next token in a text. What happens if we teach them to predict the previous token instead? It turns out that some subtle differences emerge. I will discuss some empirical and theoretical results about this, and also some (hopefully exciting) consequences and perspectives suggested by our results.

About the Speaker

Clément Hongler has worked on statistical mechanics, quantum field theory, deep learning theory, and a few other things. He enjoys talking with people from a variety of backgrounds.

