001: Paul Bürkner

A conversation with Paul Bürkner about his R package brms that implements Bayesian multilevel models in R using the probabilistic programming language Stan.


What is special about the brms package?
The brms (Bayesian Regression Models using Stan) package provides a consistent framework (e.g., consistent programming syntax in R) for fitting a wide range of Bayesian statistical models.

How and why did you start working on brms?
Existing solutions that used JAGS (Just Another Gibbs Sampler) weren't good enough when I tried to fit ordinal models. When I first tried Stan, it worked quite well. 

How does Markov Chain Monte Carlo (MCMC) work?
The goal is to explore and understand distributions, which are like hills. But since we don't know what those distributions or hills look like, we have to walk over them to explore them. We move more frequently to regions of high density (regions where the hills are high). By exploring the distributions this way, we end up with samples from the distributions (in contrast to maximum likelihood estimation, where we get only a point estimate).
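
To make the hill-walking picture concrete, here is a minimal sketch of a random-walk Metropolis sampler in Python. It was not discussed in the episode and is not the algorithm Stan uses; the target (a standard normal), the function names, the step size, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def log_density(x):
    """Unnormalized log density of a standard normal: the 'hill' (assumed target)."""
    return -0.5 * x * x


def metropolis(n_samples, step_size=1.0, start=0.0, seed=1):
    """Random-walk Metropolis: propose a nearby point and accept it with
    probability min(1, density(proposal) / density(current))."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step_size)
        log_ratio = log_density(proposal) - log_density(x)
        # Uphill moves are always accepted; downhill moves only sometimes,
        # so the walker spends more time where the hill is high.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples


samples = metropolis(20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean = {mean:.2f}, variance = {var:.2f}")
```

The result is a whole list of samples rather than a single point estimate, so summaries such as the mean and variance (here they should land near 0 and 1) are computed from the samples afterwards.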

Why is Stan so good?
It handles complex models very well, and its sampling algorithm (the No-U-Turn Sampler, a variant of Hamiltonian Monte Carlo) converges very well relative to the algorithms used by JAGS and BUGS. The algorithm uses more information about the distribution: not just the height of the density but also its derivative, which lets it make more informed sampling steps. Hence, fewer steps or samples are needed to explore and understand the distribution, but each step is more expensive to compute.
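
The role of the derivative can be sketched with a bare-bones Hamiltonian Monte Carlo step in Python. This is a simplified illustration with a fixed number of leapfrog steps, not the No-U-Turn Sampler and not Stan's implementation; the standard normal target, the function names, and the tuning values (`step_size`, `n_leapfrog`, the seed) are assumptions.

```python
import math
import random


def log_density(x):
    """Unnormalized log density of a standard normal (assumed target)."""
    return -0.5 * x * x


def grad_log_density(x):
    """Derivative of the log density: the extra information HMC uses."""
    return -x


def hmc_step(x, rng, step_size=0.2, n_leapfrog=10):
    """One HMC transition with a fixed-length leapfrog trajectory."""
    p = rng.gauss(0.0, 1.0)  # draw a random momentum
    x_new, p_new = x, p
    # Leapfrog integration: half momentum step, alternating full steps,
    # then a final half momentum step. Each step needs a gradient evaluation.
    p_new += 0.5 * step_size * grad_log_density(x_new)
    for i in range(n_leapfrog):
        x_new += step_size * p_new
        if i < n_leapfrog - 1:
            p_new += step_size * grad_log_density(x_new)
    p_new += 0.5 * step_size * grad_log_density(x_new)
    # Metropolis correction based on the change in total energy.
    h_old = -log_density(x) + 0.5 * p * p
    h_new = -log_density(x_new) + 0.5 * p_new * p_new
    log_accept = h_old - h_new
    if log_accept >= 0 or rng.random() < math.exp(log_accept):
        return x_new
    return x


rng = random.Random(1)
x = 0.0
hmc_samples = []
for _ in range(5000):
    x = hmc_step(x, rng)
    hmc_samples.append(x)

hmc_mean = sum(hmc_samples) / len(hmc_samples)
hmc_var = sum((s - hmc_mean) ** 2 for s in hmc_samples) / len(hmc_samples)
print(f"mean = {hmc_mean:.2f}, variance = {hmc_var:.2f}")
```

Each transition here costs ten gradient evaluations, which is the "more expensive step" side of the trade-off; in return, a single transition can travel far across the distribution instead of taking one small random step.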

What is model complexity?
It's not just the number of parameters, but also how highly correlated the parameters are. For example, hierarchical models have complex dependencies between parameters. 

What was the process of making the package like?
I was the only person working on it. The most challenging part was to figure out how to structure the package to allow different parts to interact with one another. I refactored the package at least three times and spent maybe more than a thousand hours working on it so far. Many ideas and features are suggested by users. 

What are the plans for brms?
brms version 3.0 will be released in the near future. It will be another major refactoring to ensure even more consistency throughout. Planned additions include support for structural equation modelling (models with latent variables) and, in the future (with the help of grants and funding), Bayesian multilevel structural equation modelling.

A common misunderstanding or error?
Computing Bayes factors isn't the most important part of Bayesian inference, so even though brms can compute Bayes factors, it isn't the main goal. Also, Bayes factors depend heavily on the priors. 

What would you do with a pot of money?
Pay myself to work on the package. Or maybe pay someone else to work on it. 

What's your productivity stack? 
* Programs in R (with RStudio) and Stan, and relies on the tidyverse family of R packages.
* Most pressing need: something that closes his email.
* Writes manuscripts in R Markdown and converts them to LaTeX and PDF later.
* Daily/weekly schedule: fairly unstructured and flexible unless giving lectures; few appointments; otherwise works from anywhere. "I write too little, program too much."
* For academia: freedom. Against academia: uncertainty with the future. 
* If not an academic, what would you do? Statistics freelancer, and dance a little bit more. 
* Tip or hack for new graduate students: Take time to go into the details because later on you won't have time for it. Do what you like; academia is hard, so try to have fun.
© 2019 The Science PaperCast