Week #2

Author

Adrian Correndo & Josefina Lacasa

Introduction to Bayesian Stats

This article is intended to provide a brief introduction to key concepts of Bayesian theory and its differences from the traditional Frequentist approach.

Important

Neither approach, Frequentist nor Bayesian, is always the most suitable solution for your analysis. However… 😉

For this reason, today we are going to discuss and compare both approaches.

Let’s watch some short videos about it

1. Frequentism vs Bayesianism

What do you think?

  • Open discussion….

2. Main Differences

Perhaps the main disagreement between Frequentism and Bayesianism is about the TRUTH.

The Frequentist vision is heavily rooted in the actual existence of the TRUTH. Every time we estimate a model’s parameter, we expect to approximate a true value. It is named “frequentism” because it is based on the frequency of repeated events.

For example, if we want to assess the probability of getting a #6 when rolling a die, Frequentism says that “if we roll a die close to infinite times, the proportion of #6 out of the total number of rolls will approach 16.7% (the theoretical probability)”. Thus, Frequentism makes inference conditional on an ideal, “theoretical” condition of repeating the same experiment infinite times. In other words, conclusions rely on events we have not observed.
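The long-run frequency idea can be sketched with a quick simulation (a minimal illustrative snippet; the seed and the roll counts are arbitrary choices, not from the text):

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

def proportion_of_sixes(n_rolls: int) -> float:
    """Simulate rolling a fair six-sided die and return the observed
    proportion of #6 results."""
    rolls = [random.randint(1, 6) for _ in range(n_rolls)]
    return rolls.count(6) / n_rolls

# As the number of rolls grows, the observed proportion should
# approach the theoretical 1/6 ≈ 16.7%.
for n in (100, 10_000, 1_000_000):
    print(n, round(proportion_of_sixes(n), 4))
```

The larger the number of rolls, the closer the simulated proportion gets to the theoretical value, which is exactly the long-run frequency that Frequentist inference is conditioned on.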

The Bayesian approach, instead, DOES NOT assume the existence of the TRUTH. In contrast, it is based on PROBABILITIES & BELIEFS.

PROBABILITIES: For the Bayesian vision, everything is a matter of probability. Any fact or result about an “estimate” could range from extremely unlikely to extremely likely. However, nothing is considered completely true or false.

BELIEFS: this is probably the most important point of the Bayesian vision. Bayesian models allow us to introduce (and update) prior knowledge on a topic, expressing our own certainty or uncertainty about events. EVEN IF WE DON’T KNOW ANYTHING about it (spoiler alert: uninformative priors!).

Bayesianism considers probability as an expression of the degree of certainty (or uncertainty).

Following the same example with the die roll, the Bayesian interpretation says that “we are, a priori, 16.7% certain we are going to get a #6”. Then, Bayesianism makes inference conditional on the data we observed. We basically assess the likelihood of a prior hypothesis being true given the observed data, and we generate a new “likelihood” of the updated hypothesis being true given the observed data.

And now, to compare previous beliefs (prior) with updated knowledge (posterior), we can introduce the concept of the Bayes Factor: a ratio between two candidate statistical models, represented by their marginal likelihoods, which measures the support for one model over the other. For example, if our prior belief of obtaining a #6 is 0.167, and our updated likelihood (after combining with the observed data) is 0.334:

\[ \text{Bayes Factor} = \frac{0.334}{0.167} = 2 \]

Thus, our updated hypothesis is twice as likely to be true as our prior hypothesis, given the observed data.
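The Bayes Factor arithmetic above is just a ratio; a minimal sketch using the numbers from the example:

```python
# Numbers taken from the example in the text.
prior_likelihood = 0.167    # a priori support for "we get a #6"
updated_likelihood = 0.334  # support after combining with observed data

# Bayes Factor: ratio of the two likelihoods. A value > 1 favors the
# updated hypothesis; a value < 1 would favor the prior one.
bayes_factor = updated_likelihood / prior_likelihood
print(bayes_factor)  # 2.0
```

A Bayes Factor of 2 reads as: the updated hypothesis is twice as well supported by the observed data as the prior one.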

Therefore, when we analyze our data:

  1. Frequentism assumes that the model (its parameters) is fixed and our data are random (maximum likelihood, conditional on theoretical repetitions of the experiment).

  2. Bayesianism assumes that model parameters can vary around our data (conditional on the observed data).

Tip

For simple models, however, the two approaches can be practically indistinguishable…

HINT: think about uninformative priors!

Note

The structure of Bayesian theory is similar to the human reasoning process:

(i) we have some data,

(ii) we have beliefs about the underlying process,

(iii) combining both, we can update our beliefs.

3. Bayes Theorem

\[ P(A_{true} \mid B) = \frac{P(B \mid A_{true}) \times P(A_{true})}{P(B)} \] \[ Posterior = \frac{Likelihood \times Prior}{Evidence} \]
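As a worked illustration of the theorem, here is a sketch with hypothetical numbers (a made-up diagnostic-test example; the prevalence and test accuracies below are illustrative assumptions, not from the text):

```python
# Bayes' theorem with hypothetical numbers:
# A = "condition is present", B = "test comes back positive".
p_A = 0.01             # prior P(A): prevalence of the condition
p_B_given_A = 0.95     # likelihood: P(positive | condition present)
p_B_given_not_A = 0.05 # false-positive rate: P(positive | absent)

# Evidence P(B): total probability of a positive test.
p_B = p_B_given_A * p_A + p_B_given_not_A * (1 - p_A)

# Posterior P(A | B) = likelihood * prior / evidence.
posterior = p_B_given_A * p_A / p_B
print(round(posterior, 3))
```

Despite the accurate test, the posterior stays modest because the prior (prevalence) is low; this is exactly the prior-times-likelihood-over-evidence structure of the theorem.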

Bayes’ Rule Video

4. The Priors

Priors are basically a formalization of our beliefs in the form of a mathematical function describing a “distribution”. First, the choice of prior depends on the nature of the variable of interest, which could be “discrete” or “continuous”. Second, it depends on what we know (or don’t) about the process of interest…
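As a sketch of what “a mathematical function describing a distribution” looks like, the Beta distribution is a common prior for a continuous probability parameter; Beta(1, 1) is the flat, uninformative case (the helper function and parameter values below are illustrative assumptions):

```python
from math import gamma

def beta_pdf(x: float, a: float, b: float) -> float:
    """Density of a Beta(a, b) prior evaluated at x (0 < x < 1)."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * x**(a - 1) * (1 - x)**(b - 1)

# Uninformative prior: Beta(1, 1) is flat (uniform on [0, 1]),
# expressing no preference for any value of the parameter.
print(beta_pdf(0.5, 1, 1))  # 1.0 everywhere on (0, 1)

# Informative prior: Beta(2, 10) concentrates belief on small values,
# e.g. when we already expect the probability to be low.
print(round(beta_pdf(0.1, 2, 10), 3))
```

The same machinery applies to other variables: discrete parameters get discrete priors (e.g. a probability mass function), continuous ones get densities like the Beta above.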

5. Credible vs. Confidence intervals

There is a very important difference between Frequentism and Bayesianism in terms of error interpretation. Let’s say we estimate 95% confidence and credible intervals for \(\theta\):

  • Confidence intervals (Frequentist): “If we repeat the experiment infinite times, 95% of the estimated confidence intervals will contain the true value of \(\theta\) (based on repeated measurements)”. Note that since \(\theta\) is fixed, it is either within or outside any given interval.

  • Credible intervals (Bayesian): In contrast, Bayesianism has a “literal” interpretation of the error: “there is a 95% probability that the parameter \(\theta\) lies within this credible interval”. This is a range of probable values. Note: “given that the prior is correct”…
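To make the credible-interval interpretation concrete, here is a minimal Monte Carlo sketch using a conjugate Beta-Binomial model (the 7-successes-in-20-trials data and the uniform prior are illustrative assumptions):

```python
import random

random.seed(1)  # fixed seed for reproducibility

# Toy example: uniform Beta(1, 1) prior on a probability parameter,
# then observe 7 successes in 20 trials. By Beta-Binomial conjugacy,
# the posterior is Beta(1 + 7, 1 + 13).
a_post, b_post = 1 + 7, 1 + 13

# Draw from the posterior and read off the 2.5% and 97.5% quantiles:
# the resulting interval contains the parameter with 95% probability
# (given the model and prior) -- the "literal" Bayesian interpretation.
draws = sorted(random.betavariate(a_post, b_post) for _ in range(100_000))
lower = draws[int(0.025 * len(draws))]
upper = draws[int(0.975 * len(draws))]
print(f"95% credible interval: ({lower:.3f}, {upper:.3f})")
```

Note the contrast with the confidence interval: here probability statements are made directly about \(\theta\), conditional on the observed data (and the prior), not about hypothetical repeated experiments.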

6. Useful Resources

Introductory theory:

Bayesian Models: A Statistical Primer for Ecologists. Hobbs and Hooten
Bringing Bayesian Models to Life. Hooten and Hefley

Not from biological sciences but still very good:
Statistical Rethinking. McElreath

On Bayesian workflow/philosophy:
Bayesian workflow
Scientific Reasoning: The Bayesian Approach. Howson and Urbach

Advanced theory:

Bayesian Data Analysis 3

Agronomy papers:

Makowski et al., 2020

Miscellaneous:

Blog: Statistical Modeling, Causal Inference, and Social Science. Gelman et al.
Podcast: Learning Bayesian Statistics