Scaling and Generalizing Bayesian Inference
A core problem in statistics and machine learning is to approximate difficult-to-compute probability distributions. This problem is especially important in probabilistic modeling, which frames all inference about unknown quantities as a calculation about a conditional distribution. In this talk, I review and discuss variational inference (VI), a method that approximates probability distributions through optimization. VI has been used in myriad applications in machine learning. It tends to be faster than more traditional methods, such as Markov chain Monte Carlo sampling.
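To make "approximation through optimization" concrete, here is the standard objective in notation this abstract does not itself fix. The symbols are assumptions for illustration: x is the observed data, z the unknown quantities, and q(z; λ) a family of approximating distributions indexed by free parameters λ. VI picks the λ that maximizes the evidence lower bound (ELBO):

```latex
\begin{align*}
\mathcal{L}(\lambda) &= \mathbb{E}_{q(z;\lambda)}\big[\log p(x,z) - \log q(z;\lambda)\big], \\
\log p(x) &= \mathcal{L}(\lambda) + \mathrm{KL}\big(q(z;\lambda)\,\|\,p(z \mid x)\big).
\end{align*}
```

Because log p(x) does not depend on λ, maximizing the ELBO is equivalent to minimizing the KL divergence from q to the conditional distribution p(z | x), and it only requires evaluating the joint p(x, z).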
After quickly reviewing the basics, I will discuss our recent research on VI. I first describe stochastic variational inference, an approximate inference algorithm for handling massive data sets, and demonstrate its application to probabilistic topic models of millions of articles. Then I discuss black box variational inference, a generic algorithm for approximating the posterior. Black box inference applies easily to many models and requires minimal mathematical work to implement. I will demonstrate black box inference on deep exponential families, a method for Bayesian deep learning, and describe how it enables powerful tools for probabilistic programming.
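As a rough illustration of why black box inference needs so little model-specific mathematics, the sketch below fits a Gaussian approximation using only evaluations of the log joint and a score-function (REINFORCE-style) gradient estimate. It is a minimal sketch under stated assumptions, not the implementation from the talk: the toy model (z ~ N(0, 1), x_i ~ N(z, 1)), the function names `log_joint` and `bbvi`, the step size, and the sample count are all hypothetical choices made for this example.

```python
import numpy as np


def log_joint(z, x):
    """log p(x, z) up to an additive constant for the toy model
    z ~ N(0, 1), x_i | z ~ N(z, 1). z has shape (S,), x has shape (n,)."""
    log_prior = -0.5 * z**2
    log_lik = -0.5 * np.sum((x[None, :] - z[:, None]) ** 2, axis=1)
    return log_prior + log_lik


def bbvi(x, steps=3000, samples=200, lr=0.01, seed=0):
    """Fit q(z) = N(mu, sigma^2) by stochastic gradient ascent on the ELBO.

    The model enters only through log_joint; no model-specific
    derivatives are used. Returns (mu, sigma).
    """
    rng = np.random.default_rng(seed)
    mu, log_sigma = 0.0, 0.0  # hypothetical initialization
    for _ in range(steps):
        sigma = np.exp(log_sigma)
        z = rng.normal(mu, sigma, size=samples)
        eps = (z - mu) / sigma

        # log q(z) up to a constant, and the instantaneous ELBO term.
        log_q = -0.5 * eps**2 - log_sigma
        f = log_joint(z, x) - log_q

        # Score function: gradient of log q w.r.t. (mu, log_sigma).
        score_mu = eps / sigma
        score_ls = eps**2 - 1.0

        # Centering f and dividing by (samples - 1) is equivalent to a
        # leave-one-out baseline, which reduces variance without bias.
        f_centered = f - f.mean()
        grad_mu = np.sum(score_mu * f_centered) / (samples - 1)
        grad_ls = np.sum(score_ls * f_centered) / (samples - 1)

        mu += lr * grad_mu
        log_sigma += lr * grad_ls
    return mu, np.exp(log_sigma)


if __name__ == "__main__":
    data = np.random.default_rng(1).normal(2.0, 1.0, size=20)
    mu, sigma = bbvi(data)
    # For this conjugate toy model the exact posterior is
    # N(sum(x) / (n + 1), 1 / (n + 1)), which gives a sanity check.
    print("fitted:", mu, sigma)
    print("exact: ", data.sum() / (len(data) + 1), (len(data) + 1) ** -0.5)
```

Swapping in a different model means changing only `log_joint`; the optimization loop is untouched. The toy model is chosen so that the exact posterior is known in closed form and the fit can be checked against it.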