
An Introduction to Probabilistic Programming (1809.10756v2)

Published 27 Sep 2018 in stat.ML, cs.AI, cs.LG, and cs.PL

Abstract: This book is a graduate-level introduction to probabilistic programming. It not only provides a thorough background for anyone wishing to use a probabilistic programming system, but also introduces the techniques needed to design and build these systems. It is aimed at people who have an undergraduate-level understanding of either or, ideally, both probabilistic machine learning and programming languages. We start with a discussion of model-based reasoning and explain why conditioning is a foundational computation central to the fields of probabilistic machine learning and artificial intelligence. We then introduce a first-order probabilistic programming language (PPL) whose programs correspond to graphical models with a known, finite set of random variables. In the context of this PPL we introduce fundamental inference algorithms and describe how they can be implemented. We then turn to higher-order probabilistic programming languages. Programs in such languages can define models with dynamic computation graphs, which may not instantiate the same set of random variables in each execution. Inference requires methods that generate samples by repeatedly evaluating the program. Foundational algorithms for this kind of language are discussed in the context of an interface between program executions and an inference controller. Finally we consider the intersection of probabilistic and differentiable programming. We begin with a discussion of automatic differentiation, and how it can be used to implement efficient inference methods based on Hamiltonian Monte Carlo. We then discuss gradient-based maximum likelihood estimation in programs that are parameterized using neural networks, how to amortize inference by learning neural approximations to the program posterior, and how language features impact the design of deep probabilistic programming systems.

Citations (189)

Summary

  • The paper introduces a unified probabilistic programming framework that integrates finite and dynamic models to automate complex inference.
  • It details evaluation-based inference techniques and a CPS transformation to streamline sequential and distributed computations.
  • The study highlights key applications, including open-universe Gaussian mixtures and program induction, demonstrating the paradigm's versatility.

Overview of "An Introduction to Probabilistic Programming"

This academic paper serves as a detailed introduction to probabilistic programming, a paradigm that combines the strengths of programming languages and probabilistic modeling to automate the inference process. The document is structured to guide readers through both theoretical fundamentals and practical considerations, beginning with finite variable models and extending to dynamic models with an unbounded number of random variables. The primary focus is on the design and operational intricacies of probabilistic programming languages and corresponding inference methodologies.

Finite and Dynamic Graphical Models

The paper initially presents a First-order Probabilistic Programming Language (FOPPL), which is equipped to handle models with a finite number of random variables. These are mapped to static computation graphs akin to Bayesian networks or factor graphs. Through translation to graphical models, it becomes possible to apply traditional inference algorithms such as Gibbs sampling and Metropolis-Hastings. This approach leverages existing theories in graphical models to automate inference, albeit within the constraints of finite variable cardinality.
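As a concrete illustration of the kind of inference a FOPPL-style system automates, the sketch below runs Metropolis-Hastings on a minimal finite-variable model: a latent mean with a standard normal prior and a single noisy observation. This is an illustrative stand-in, not the book's FOPPL; the model, proposal scale, and helper names are all choices made here for brevity.

```python
import math
import random

random.seed(0)

def log_normal_pdf(x, mu, sigma):
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

def log_joint(mu, data):
    # Prior: mu ~ Normal(0, 1); likelihood: y_i ~ Normal(mu, 1).
    lp = log_normal_pdf(mu, 0.0, 1.0)
    for y in data:
        lp += log_normal_pdf(y, mu, 1.0)
    return lp

def metropolis_hastings(data, n_iters=20000, step=0.5):
    """Random-walk MH over the single latent variable mu."""
    mu = 0.0
    samples = []
    for _ in range(n_iters):
        proposal = mu + random.gauss(0.0, step)
        # Symmetric proposal, so the acceptance ratio is just the joint ratio.
        log_alpha = log_joint(proposal, data) - log_joint(mu, data)
        if math.log(random.random()) < log_alpha:
            mu = proposal
        samples.append(mu)
    return samples

samples = metropolis_hastings([2.0])
# Discard burn-in before estimating the posterior mean (analytically 1.0 here).
posterior_mean = sum(samples[5000:]) / len(samples[5000:])
```

Because both the prior and likelihood are normal with unit variance, the exact posterior is Normal(1.0, 0.5), which gives a useful sanity check on the chain.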

The document progresses to discuss Higher-order Probabilistic Programming Languages (HOPPLs), which support recursion, first-class functions, and unbounded loops. Such features allow for dynamic models that may instantiate an unbounded number of random variables, addressing a critical limitation of FOPPL-like languages. Models such as open-universe Gaussian mixtures and program induction are compelling examples of the expressive power of HOPPLs.
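The sketch below illustrates, under assumptions of my own choosing, why such models defeat static graph compilation: the number of mixture components is itself drawn from an unbounded (geometric) prior, so no fixed set of random variables can be enumerated ahead of time. The prior parameters and structure are illustrative, not taken from the book.

```python
import random

random.seed(1)

def open_universe_gmm():
    """Generative program whose random-variable count is itself random."""
    # Number of components follows a geometric prior: unbounded support,
    # so a compiler cannot lay out a fixed computation graph in advance.
    k = 1
    while random.random() < 0.5:
        k += 1
    # Component means are only materialized once k is known.
    means = [random.gauss(0.0, 10.0) for _ in range(k)]
    z = random.randrange(k)          # mixture assignment for one data point
    x = random.gauss(means[z], 1.0)  # the observation
    return k, x

k, x = open_universe_gmm()
```

Each execution may touch a different set of variables (`means[0]`, `means[1]`, ...), which is exactly the situation that motivates the evaluation-based inference strategies discussed next.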

Evaluation-Based Inference Strategies

For dynamic models, evaluation-based inference strategies provide valuable methodologies. These techniques involve executing a program while pausing at stochastic choices like sample and observe expressions, thereby avoiding the need for precompiled graphs. Algorithms such as Likelihood Weighting and Sequential Monte Carlo are adapted to this framework, leveraging lazy evaluation and enabling their application to models with dynamic support.
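The idea can be sketched as follows: the inference engine supplies the handlers for `sample` and `observe`, runs the program forward many times, and accumulates a log-weight at each `observe`. This is a minimal likelihood-weighting sketch in Python, with a toy one-latent model; the handler-passing style and all names are assumptions made here, not the book's interface.

```python
import math
import random

random.seed(2)

def log_normal_pdf(x, mu, sigma):
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

def model(sample, observe):
    # A tiny probabilistic program: latent mu, one noisy observation of 2.0.
    mu = sample(lambda: random.gauss(0.0, 1.0))
    observe(lambda v: log_normal_pdf(v, mu, 1.0), 2.0)
    return mu

def likelihood_weighting(model, n_particles=5000):
    """Run the program forward; weight each execution by its observe terms."""
    results = []
    for _ in range(n_particles):
        log_w = [0.0]
        def sample(draw):
            return draw()               # at sample: just run forward
        def observe(log_pdf, value):
            log_w[0] += log_pdf(value)  # at observe: accumulate log-weight
        ret = model(sample, observe)
        results.append((ret, math.exp(log_w[0])))
    total = sum(w for _, w in results)
    return sum(r * w for r, w in results) / total

posterior_mean = likelihood_weighting(model)
```

No graph is ever built: the program is simply evaluated, and the handlers intercept the two kinds of stochastic checkpoints. The same interception points are where Sequential Monte Carlo would resample particles.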

The paper further explores the Metropolis-Hastings algorithm tailored to probabilistic programs. The challenge in this setting is ensuring unique addressing for random variables, managed through an addressing transformation that dynamically generates trace-specific identifiers. A messaging-interface approach separates model execution from inference control, positing a client-server architecture that enables asynchronous, distributed inference.
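One simple addressing scheme, sketched here as an illustration rather than the book's actual transformation, combines a base name with an occurrence counter, so that each sample statement gets a distinct trace-specific identifier even inside loops. Replaying a model against a stored trace then reuses values at matching addresses, which is the core bookkeeping behind trace-based MH.

```python
import random

random.seed(3)

class AddressedTrace:
    """Assigns each sample a per-execution address: base name + occurrence
    counter. Loop iterations therefore get distinct keys ("x#0", "x#1", ...)."""

    def __init__(self, reuse=None):
        self.counts = {}
        self.values = {}
        self.reuse = reuse or {}

    def sample(self, name, draw):
        n = self.counts.get(name, 0)
        self.counts[name] = n + 1
        addr = f"{name}#{n}"              # trace-specific identifier
        if addr in self.reuse:
            self.values[addr] = self.reuse[addr]  # replay a stored value
        else:
            self.values[addr] = draw()            # fresh draw
        return self.values[addr]

def model(trace):
    # The number of draws is itself random, so the address set varies per run.
    n = trace.sample("n", lambda: random.randrange(1, 4))
    return [trace.sample("x", lambda: random.gauss(0.0, 1.0)) for _ in range(n)]

t1 = AddressedTrace()
xs1 = model(t1)
# Re-executing with the old trace reuses every matching address, so the
# replay reproduces the first run exactly.
t2 = AddressedTrace(reuse=dict(t1.values))
xs2 = model(t2)
```

A single-site MH step would perturb the value at one chosen address and replay the rest, falling back to fresh draws at any addresses the new control path introduces.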

Continuation-Passing Style (CPS) Transformation

A key technical accomplishment presented is the continuation-passing style (CPS) transformation. This transformation linearizes the program execution into discrete steps, each capable of pausing and resuming, facilitating operations like forking and inter-process communication necessary for stateful inference processes. Inference algorithms such as Single-Site Metropolis-Hastings derive substantial efficiencies by utilizing CPS to condition computations on previously sampled values and distributing computational load across processes.
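The mechanism can be illustrated with a hand-CPS-transformed toy model: instead of drawing values itself, the program returns a (tag, payload, continuation) triple at every checkpoint, and an inference controller decides how to resume it. This sketch is an assumption-laden miniature of the idea, with a trampoline-style controller of my own design, not the book's system.

```python
import random

random.seed(4)

def model_cps(k):
    """A toy model in continuation-passing style: one sample, one observe.
    Each checkpoint hands control back as (tag, payload, continuation)."""
    def after_mu(mu):
        def after_obs(_):
            return ("return", mu, k)
        return ("observe", (mu, 2.0), after_obs)
    return ("sample", "mu", after_mu)

def run(program):
    """A minimal controller: it owns the loop, so it can pause, resume,
    reweight, or (in a real system) fork the paused computation."""
    log_weight = 0.0
    state = program(lambda v: v)
    while True:
        tag, payload, k = state
        if tag == "sample":
            state = k(random.gauss(0.0, 1.0))   # controller picks the value
        elif tag == "observe":
            mu, y = payload
            log_weight += -0.5 * (y - mu) ** 2  # log-density up to a constant
            state = k(None)
        else:  # "return"
            return k(payload), log_weight

value, log_w = run(model_cps)
```

Because the paused state is just a continuation closure, it can be stored, copied, or sent to another process, which is what makes resampling in SMC and distributed client-server inference straightforward in this style.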

Implications and Future Directions

The implications of these methodologies extend beyond mere technical execution to enable the integration of probabilistic reasoning into a wide array of existing software infrastructures. The document outlines how probabilistic programming can be used to denote complex models where the number and relationships of random variables arise naturally from the problem domain.

The introduction of probabilistic programming enables researchers and practitioners to abstractly and succinctly define models over complex, high-dimensional spaces. The theoretical groundwork supports automated machinery to perform inferences that were traditionally infeasible, offering expanded research and application opportunities across domains such as computer vision, natural language processing, and artificial intelligence.

Conclusion

This paper is a comprehensive resource for understanding the principles and practice of probabilistic programming. It discusses the language features needed to express models naturally, details the program transformations required for inference, and evaluates appropriate algorithms for both static and inherently dynamic models. This foundation paves the way for future research on enhanced inference methods, integration with deep learning architectures, and further refinement of probabilistic programming languages to support increasingly sophisticated models.
