4. The Missing Half of Bayes' Theorem

By: Krischal Khanal

In the previous posts, we saw how to use Bayes' Theorem to update a belief. Then we coded that up, and ended with a question: what if the result is stochastic? To answer that, we first need to understand one missing piece of Bayes' Theorem.

Bayes' Theorem is classically defined as:

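Identities

$$P(H | E) = \frac{P(H)\,P(E | H)}{P(E)}$$

Where,

$$P(E) = P(H)\,P(E | H) + P(\neg H)\,P(E | \neg H)$$

Where, the symbols are defined as:

$H$: Hypothesis
$E$: Evidence/event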

Using Bayes' Theorem, we are able to update our belief in a hypothesis in the light of evidence.
In other words, $P(H)$ updates to $P(H | E)$.

Note

The belief in a hypothesis means "how certain are you that the hypothesis is true?" Thus, if the hypothesis is $H$, the belief in that hypothesis is $P(H)$. It's a probability.

If you assign a probability to your belief, you also implicitly assign the rest of the probability to the exact negation of that belief.

Let's take the same example of the rare disease.
If $H$: You have the rare disease.
$P(H)$ is your belief that you have the disease.

Then, $\neg H$: You do not have the rare disease.
$P(\neg H)$ is your belief that you do not have the disease.

It's always the case that,

$$P(H) + P(\neg H) = 1$$

If you can use Bayes' Theorem to update $P(H)$, you can also update $P(\neg H)$ using the same theorem.

Thus,

Identity

$$P(\neg H | E) = \frac{P(\neg H)\,P(E | \neg H)}{P(E)}$$

Complementary Bayes Theorem
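As a rough sketch of the two halves working together, the snippet below updates both the belief in a hypothesis and the belief in its negation from the same evidence. The function name `update_pair` and the numbers (prevalence, sensitivity, false-positive rate) are hypothetical and chosen only for illustration.

```python
def update_pair(prior_h, lik_e_given_h, lik_e_given_not_h):
    """Update P(H) and P(not H) with the same evidence E."""
    prior_not_h = 1.0 - prior_h
    # Marginal probability of the evidence, P(E)
    p_e = prior_h * lik_e_given_h + prior_not_h * lik_e_given_not_h
    post_h = prior_h * lik_e_given_h / p_e              # P(H | E)
    post_not_h = prior_not_h * lik_e_given_not_h / p_e  # P(not H | E)
    return post_h, post_not_h


# Hypothetical numbers: 0.1% prevalence, 99% sensitivity, 5% false positives
post_h, post_not_h = update_pair(0.001, 0.99, 0.05)
print(post_h, post_not_h, post_h + post_not_h)
```

Both posteriors share the same denominator, which is why they still add up to 1 after the update.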

This is the missing half of Bayes' Theorem. It might seem obvious, in part because it is obvious. But by generalizing it, we gain a much deeper understanding of Bayes' Theorem.

Generalizing the Bayes' Theorem: 1. Multiple hypotheses

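Here we consider two competing, mutually exclusive hypotheses: $H_1$ and $H_2$. As with a hypothesis and its negation, exactly one of them is true, so their beliefs sum to 1:

$$P(H_1) + P(H_2) = 1$$

Using Bayes' Theorem for each one:

$$P(H_1 | E) = \frac{P(H_1)\,P(E | H_1)}{P(H_1)\,P(E | H_1) + P(H_2)\,P(E | H_2)}$$

$$P(H_2 | E) = \frac{P(H_2)\,P(E | H_2)}{P(H_1)\,P(E | H_1) + P(H_2)\,P(E | H_2)}$$

Adding them,

$$P(H_1 | E) + P(H_2 | E) = 1$$

And, if there were $n$ mutually exclusive hypotheses $H_1, H_2, \dots, H_n$, exactly one of which is true:
Using Bayes' Theorem for each one, we get:

$$P(H_i | E) = \frac{P(H_i)\,P(E | H_i)}{P(H_1)\,P(E | H_1) + P(H_2)\,P(E | H_2) + \dots + P(H_n)\,P(E | H_n)}, \quad i = 1, \dots, n$$

Equivalently,

$$P(H_i | E) = \frac{P(H_i)\,P(E | H_i)}{\sum_{k=1}^n P(H_k)\,P(E | H_k)}$$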

Note

Every denominator can be replaced by $P(E)$, but it's important to emphasize that it's composed of multiple hypotheses, and your belief has a role in shaping the overall marginal probability $P(E)$.
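To make the role of belief in the marginal probability concrete, here is a small sketch that computes the denominator for two different sets of priors while keeping the likelihoods fixed. The helper `marginal` and all the numbers are hypothetical.

```python
def marginal(priors, likelihoods):
    """P(E) as the sum over k of P(H_k) * P(E | H_k)."""
    return sum(p * l for p, l in zip(priors, likelihoods))


likelihoods = [0.9, 0.5, 0.1]  # hypothetical P(E | H_k) for three hypotheses

p_e_uniform = marginal([1 / 3, 1 / 3, 1 / 3], likelihoods)
p_e_skewed = marginal([0.8, 0.1, 0.1], likelihoods)
print(p_e_uniform, p_e_skewed)
```

The likelihoods are identical in both calls, yet the marginal differs, because the priors decide how much weight each hypothesis contributes.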

Using Bayes' Theorem in Parallel:

$$
\begin{bmatrix} P(H_1 | E) \\ P(H_2 | E) \\ \vdots \\ P(H_n | E) \end{bmatrix} = \frac{1}{\sum_{k=1}^n P(H_k)P(E | H_k)} \begin{bmatrix} P(H_1) \cdot P(E | H_1) \\ P(H_2) \cdot P(E | H_2) \\ \vdots \\ P(H_n) \cdot P(E | H_n) \end{bmatrix}
$$

Keep in mind, the beliefs over the mutually exclusive hypotheses still sum to 1 after the update.

Bayes' Theorem is only shifting the probabilities (adjusting the belief) among the hypotheses based on the evidence.
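A minimal sketch of this parallel update in plain Python, assuming the hypotheses are mutually exclusive and that exactly one of them is true. The name `bayes_update` and the example priors and likelihoods are hypothetical.

```python
def bayes_update(priors, likelihoods):
    """Update every P(H_k) at once given P(E | H_k) for the same evidence E."""
    # Numerators: P(H_k) * P(E | H_k)
    unnormalized = [p * l for p, l in zip(priors, likelihoods)]
    # Shared denominator: the marginal probability P(E)
    p_e = sum(unnormalized)
    if p_e == 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    return [u / p_e for u in unnormalized]


priors = [0.5, 0.3, 0.2]        # hypothetical beliefs before the evidence
likelihoods = [0.1, 0.4, 0.7]   # hypothetical P(E | H_k)
posteriors = bayes_update(priors, likelihoods)
print(posteriors, sum(posteriors))
```

Dividing every numerator by the same sum is what keeps the total at 1: probability mass only moves between hypotheses.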

What if hypotheses aren't mutually exclusive?

We can update each belief in the same way, dividing by $P(E)$, but the sum of the probabilities is no longer guaranteed to stay constant.
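One way to see this is with a fair six-sided die and two overlapping hypotheses. The setup below is a made-up illustration: hypothesis A is "the roll is even", hypothesis B is "the roll is at least 4", and the evidence is "the roll is at least 5". Exact fractions are used so the sums are easy to read.

```python
from fractions import Fraction


def prob(event):
    """Probability of a set of outcomes on a fair six-sided die."""
    return Fraction(len(event), 6)


even = {2, 4, 6}      # hypothesis A
high = {4, 5, 6}      # hypothesis B, overlaps with A
evidence = {5, 6}     # E: the roll is at least 5


def posterior(hypothesis, evidence):
    """P(H | E) computed through Bayes' Theorem."""
    prior = prob(hypothesis)
    likelihood = prob(hypothesis & evidence) / prior  # P(E | H)
    return prior * likelihood / prob(evidence)


prior_sum = prob(even) + prob(high)
posterior_sum = posterior(even, evidence) + posterior(high, evidence)
print(prior_sum, posterior_sum)
```

Here the priors happen to sum to 1, but the posteriors sum to 3/2, because the two hypotheses can be true at the same time.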

Thus, we are able to update a belief expressed as a discrete probability distribution over a finite set of hypotheses.

Generalizing the Bayes' Theorem: 2. Continuous domain

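We can even generalize to the case where the domain of the hypothesis is continuous and the belief takes the form of a probability density function. Writing $p$ for a density, the sum in the denominator becomes an integral:

$$p(H | E) = \frac{p(H)\,p(E | H)}{\int p(H')\,p(E | H')\,dH'}$$

This is the foundation of Bayesian inference. If you imagine $E$ being the data and $H$ being the parameters of a probabilistic model, with $p(H)$ as your belief about them, then $p(H | E)$ is your updated belief after you've seen the data. This is an example of Bayesian inference in action.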
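The continuous case can be approximated numerically by evaluating the density on a fine grid and replacing the integral with a sum. The sketch below assumes a made-up scenario: the parameter is a coin's probability of heads, the prior is uniform on [0, 1], and the data is 7 heads in 10 flips. All function names are hypothetical.

```python
def grid_posterior(prior_density, likelihood, grid):
    """Approximate the posterior density on an evenly spaced grid."""
    step = grid[1] - grid[0]
    unnormalized = [prior_density(h) * likelihood(h) for h in grid]
    # Riemann-sum approximation of the integral in the denominator
    evidence = sum(unnormalized) * step
    return [u / evidence for u in unnormalized]


def uniform_prior(h):
    """Flat prior density on [0, 1]."""
    return 1.0


def likelihood_7_of_10(h):
    """Likelihood of 7 heads in 10 flips, up to a constant factor."""
    return h**7 * (1 - h) ** 3


n_points = 1001
grid = [i / (n_points - 1) for i in range(n_points)]
posterior = grid_posterior(uniform_prior, likelihood_7_of_10, grid)

area = sum(posterior) * (grid[1] - grid[0])    # should be close to 1
mode = grid[posterior.index(max(posterior))]   # most believable parameter value
print(area, mode)
```

The constant factor dropped from the likelihood does not matter, since it cancels when dividing by the approximated integral.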

In the next post, we'll look into coding the parallel Bayesian update for multiple hypotheses.
