Title: Ensuring Reliability and Understanding Theoretical Limits of Foundation Models
Abstract: Why do even sophisticated LLMs struggle with simple code completions? Why do we find new jailbreaking attacks every day? And what do these failures tell us about how to build agentic systems?