Thu 8 Nov 2018 13:52 - 14:14

Probabilistic programming systems (PP systems) allow developers
to model stochastic phenomena and perform efficient inference on
the models. The number and adoption of probabilistic programming
systems are growing significantly. However, there is no prior study
of bugs in these systems and no methodology for systematically
testing PP systems. Yet, testing PP systems is highly non-trivial,
especially when they perform approximate inference.
In this paper, we characterize 118 previously reported bugs in
three open-source PP systems (Edward, Pyro, and Stan) and propose
ProbFuzz, an extensible system for testing PP systems. ProbFuzz
allows a developer to specify templates of probabilistic models,
from which it generates concrete probabilistic programs and data
for testing. ProbFuzz uses language-specific translators to generate
these concrete programs, which use the APIs of each PP system.
ProbFuzz finds potential bugs by checking the output from running
the generated programs against several oracles, including an
accuracy checker. Using ProbFuzz, we found 67 previously unknown
bugs in recent versions of these PP systems. Developers have already
accepted 51 bug fixes that we submitted to the three PP systems
and to their underlying systems, PyTorch and TensorFlow.
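
The workflow described above (fill a model template, generate data,
run inference, check the result against an accuracy oracle) can be
illustrated with a minimal sketch. This is not ProbFuzz's actual
implementation: the template syntax, the names generate_case,
toy_inference and accuracy_oracle, the tolerance, and the use of a
least-squares estimate as a stand-in for a PP system's inference
engine are all assumptions made for illustration.

```python
import math
import random

# Hypothetical model template with "holes" for the prior and noise level.
TEMPLATE = "w ~ {prior}; y ~ Normal(w * x, {noise})"

# Hypothetical candidate priors used to fill the {prior} hole.
PRIORS = ["Normal(0, 10)", "Uniform(-10, 10)", "Cauchy(0, 5)"]


def generate_case(rng, n=200, noise=0.1):
    """Fill the template and generate data from a known ground-truth w."""
    true_w = rng.uniform(-5.0, 5.0)
    xs = [rng.uniform(1.0, 10.0) for _ in range(n)]
    ys = [true_w * x + rng.gauss(0.0, noise) for x in xs]
    program = TEMPLATE.format(prior=rng.choice(PRIORS), noise=noise)
    return program, xs, ys, true_w


def toy_inference(xs, ys):
    """Stand-in for a PP system: least-squares estimate of w (assumption)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def accuracy_oracle(estimate, expected, tol=0.1):
    """Flag a potential bug if the estimate is non-finite or too far off."""
    return math.isfinite(estimate) and abs(estimate - expected) <= tol


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        program, xs, ys, true_w = generate_case(rng)
        ok = accuracy_oracle(toy_inference(xs, ys), true_w)
        print(program, "->", "pass" if ok else "POTENTIAL BUG")
```

In a real setting, a language-specific translator would turn the
filled template into a program for each PP system, and the oracle
would compare each system's posterior estimate against the known
ground truth or against the other systems' results.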

Probabilistic Reasoning
Antonio Filieri
