
What are Priors in Bayesian Models?

With Bayesian statistics we can incorporate prior information into a model. That means that if we have a priori knowledge about the parameters in our model (and we usually do), we can actually use it! We just need to specify prior probability distributions for our model parameters.

For example, we know that a binary variable for whether a field is organic should have a negative impact on yield, so we might want to nudge the coefficient on organic, let’s call it \(\alpha\), in that direction. One way to accomplish this is by specifying a normal prior probability distribution on \(\alpha\) with a mean of -0.5 and a standard deviation of 0.5.

In formula notation, it looks like this:

\[\alpha \sim N(-0.5, 0.5)\]
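As a concrete sketch of how this might look in code, here is one way to set that prior using the brms package (recommended below). The `fields` data frame and its `yield` and `organic` columns are made up purely for illustration:

```r
library(brms)

# Made-up example data: a (roughly standardized) yield and a
# binary indicator for whether a field is organic
set.seed(42)
fields <- data.frame(organic = rbinom(100, 1, 0.5))
fields$yield <- -0.5 * fields$organic + rnorm(100)

# Normal(-0.5, 0.5) prior on the organic coefficient
fit <- brm(
  yield ~ organic,
  data  = fields,
  prior = set_prior("normal(-0.5, 0.5)", class = "b", coef = "organic")
)

summary(fit)
```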

Of course, when building Bayesian models you’ll want to standardize your variables first. Different scales across variables can make Bayesian model fitting using Markov chain Monte Carlo (MCMC) less efficient. Usually, standardization means we center each variable by subtracting its mean and scale it by dividing by its standard deviation.
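In R, standardizing is a one-liner (a minimal sketch with a made-up vector; base R’s `scale()` does the same thing):

```r
# Center and scale: subtract the variable's mean, then divide by its sd
standardize <- function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

x <- c(10, 20, 30, 40)   # made-up values
standardize(x)           # same result as as.numeric(scale(x))
```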

Interpreting coefficients for standardized variables is more straightforward, and it becomes easier to compare the magnitudes of the effects of different variables on the outcome of interest. For example, if \(\alpha\) turns out to be -0.8, this means that organic fields have a 0.8 standard deviation lower yield than conventional fields, ceteris paribus. Specifying priors on coefficients also seems to be somewhat easier for standardized variables, although some would (strongly) disagree.

This is all great, but how do we determine what the priors should be? Usually we can use domain knowledge or previous research. But sometimes we don’t know a priori what the effect of a certain variable should be on the outcome. What should we do in these cases?

Even if we don’t know the relationship between a variable and the outcome, we can use regularizing priors to reduce overfitting. A common regularizing prior is \(N(0, 1)\). Regularization is the process of adding information to the model to ensure it doesn’t get too excited by noise in the data. It makes estimates more conservative, and may improve the predictive power of your model on new data.
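In brms, such a regularizing prior can be placed on all population-level coefficients at once rather than naming each coefficient individually (again just a sketch, reusing the made-up `fields` data from the first example):

```r
# N(0, 1) regularizing prior on all population-level ("b") coefficients
fit_reg <- brm(
  yield ~ organic,
  data  = fields,
  prior = set_prior("normal(0, 1)", class = "b")
)
```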

To learn more about prior choices in Bayesian models, see the Stan developers’ Prior Choice Recommendations page on GitHub.

If you use R, I highly recommend the brms package for fitting Bayesian models. It is a flexible, highly capable package that lets users fit very complex Bayesian models in Stan with ease. The author is also very responsive to issues users raise on GitHub.

For a thorough and insightful introduction to Bayesian statistics, try Richard McElreath’s Statistical Rethinking. This is a textbook, but it is a very good textbook with R code examples and thoughtful explanations throughout. The textbook also has an accompanying YouTube series that you can watch for free!