Mathematics · Probability & Statistics
Bayes' Theorem Calculator
Calculate conditional probability using Bayes' Theorem given prior probability, likelihood, and marginal probability.
Calculator
Formula
P(A|B) = [P(B|A) × P(A)] / P(B)
P(A|B) is the posterior probability of event A given event B has occurred. P(B|A) is the likelihood — the probability of observing B given A is true. P(A) is the prior probability of event A. P(B) is the marginal probability of event B (the total probability of observing B under all hypotheses).
Source: Bayes, T. (1763). An Essay towards Solving a Problem in the Doctrine of Chances. Philosophical Transactions of the Royal Society of London, 53, 370–418. Reproduced in Biometrika, 45(3/4), 293–315 (1958).
How it works
Bayes' Theorem mathematically formalizes the process of belief revision. The theorem states that the posterior probability of a hypothesis A given observed evidence B is equal to the likelihood of observing B if A were true, multiplied by the prior probability of A, divided by the marginal probability of B. The prior P(A) encodes your initial belief about A before seeing any evidence — this might come from historical base rates, expert knowledge, or previous experiments. The likelihood P(B|A) quantifies how probable the observed evidence is under the assumption that A is true. The marginal probability P(B), sometimes called the 'evidence' or 'normalizing constant', is the total probability of observing B across all possible hypotheses, ensuring that the posterior probabilities sum to one.
The marginal probability P(B) can be expanded using the law of total probability: P(B) = P(B|A)·P(A) + P(B|¬A)·P(¬A). This means if you know the prior P(A), the likelihood P(B|A), and the false positive rate P(B|¬A), you can compute P(B) without needing it directly. This decomposition is critical in practice because P(B) is rarely measured directly — it must be derived from the constituent parts. The result, P(A|B), is the posterior probability: your updated rational belief about hypothesis A after accounting for the evidence B.
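The update described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the calculator itself; the function and argument names are chosen here for clarity:

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Posterior P(A|B), expanding P(B) with the law of total probability.

    prior               -- P(A)
    likelihood          -- P(B|A)
    false_positive_rate -- P(B|~A)
    """
    # P(B) = P(B|A)*P(A) + P(B|~A)*P(~A)
    marginal = likelihood * prior + false_positive_rate * (1 - prior)
    # P(A|B) = P(B|A)*P(A) / P(B)
    return likelihood * prior / marginal
```

Note that P(B) never has to be supplied directly — it is derived entirely from the prior, the likelihood, and the false positive rate.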
The theorem underpins Bayesian inference, a complete statistical paradigm that treats probability as a degree of belief rather than a long-run frequency. In machine learning, Bayesian classifiers use this theorem to classify text, filter spam, and diagnose medical conditions. In epidemiology, it is used to determine the probability that a patient truly has a disease given a positive diagnostic test result — often revealing startlingly counterintuitive results when base rates are low. In court, it has been applied (and sometimes misapplied) to assess the weight of forensic evidence. Understanding and correctly applying Bayes' Theorem is a cornerstone of probabilistic reasoning in science and engineering.
Worked example
Consider a classic medical screening scenario. Suppose a disease affects 1% of the population, so the prior probability P(Disease) = 0.01. A diagnostic test for this disease has a sensitivity (true positive rate) of 90%, meaning P(Positive Test | Disease) = 0.90. The test also has a false positive rate of 10%, so P(Positive Test | No Disease) = 0.10. We want to find the probability that a person actually has the disease given they tested positive.
Step 1 — Compute the marginal probability P(Positive Test):
P(Positive) = P(Pos|Disease)·P(Disease) + P(Pos|No Disease)·P(No Disease)
P(Positive) = (0.90 × 0.01) + (0.10 × 0.99)
P(Positive) = 0.009 + 0.099 = 0.108
Step 2 — Apply Bayes' Theorem:
P(Disease | Positive) = [P(Positive | Disease) × P(Disease)] / P(Positive)
P(Disease | Positive) = (0.90 × 0.01) / 0.108
P(Disease | Positive) = 0.009 / 0.108 ≈ 0.0833 (8.33%)
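The two steps above can be reproduced directly, using the same numbers from the scenario (variable names are illustrative):

```python
p_disease = 0.01   # prior: 1% prevalence
sensitivity = 0.90 # P(Positive | Disease)
fpr = 0.10         # P(Positive | No Disease)

# Step 1: marginal probability of a positive test
p_positive = sensitivity * p_disease + fpr * (1 - p_disease)
# = 0.009 + 0.099 = 0.108

# Step 2: Bayes' Theorem
p_disease_given_positive = sensitivity * p_disease / p_positive
# = 0.009 / 0.108 ≈ 0.0833
```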
This result is often deeply surprising: even with a test that detects 90% of true cases and has only a 10% false positive rate, a positive result only means an 8.33% chance of actually having the disease. This happens because the disease is rare — the prior is very low at 1% — so most positive tests come from the large pool of healthy people generating false positives. This phenomenon, known as the base rate fallacy, is exactly why Bayes' Theorem is so vital in medical and scientific reasoning.
Limitations & notes
Bayes' Theorem is mathematically exact, but its practical application depends entirely on the quality of its inputs. Choosing an appropriate prior P(A) is often subjective and contested — different researchers may assign different priors to the same hypothesis, leading to different posteriors. In low-data regimes, the choice of prior can dominate the result, making conclusions sensitive to assumptions that may be difficult to justify. The marginal probability P(B) must be computed carefully using the law of total probability; omitting relevant alternative hypotheses will produce an underestimated P(B) and an overconfident posterior. Additionally, Bayes' Theorem in its simple form applies to binary hypotheses (A and ¬A); extending it to multiple competing hypotheses requires the full sum over all alternatives in the denominator. It also assumes that probabilities are well-calibrated and that events are defined precisely — in real-world applications, ambiguous event definitions or poorly measured base rates can severely distort results. Finally, the theorem does not account for model uncertainty or the possibility that the hypotheses themselves are incorrectly specified.
Frequently asked questions
What is the difference between prior and posterior probability?
The prior probability P(A) represents your belief about hypothesis A before observing any new evidence — it encodes existing knowledge such as historical base rates or expert judgment. The posterior probability P(A|B) is your updated belief about A after incorporating the new evidence B via Bayes' Theorem. The theorem provides the mathematically correct way to transition from prior to posterior, ensuring belief revision is coherent and consistent with probability theory.
How do I calculate the marginal probability P(B) if I don't know it directly?
You can calculate P(B) using the law of total probability: P(B) = P(B|A)·P(A) + P(B|¬A)·P(¬A). This requires knowing the false positive rate P(B|¬A) — the probability of observing the evidence B even when hypothesis A is false — along with the prior P(A). Since P(¬A) = 1 − P(A), you only need P(A), P(B|A), and P(B|¬A) to fully determine P(B). This decomposition is the standard approach in diagnostic testing and probabilistic inference.
Why does a highly accurate test sometimes give a low posterior probability?
This counterintuitive result is caused by a low prior probability (base rate). When a condition is very rare in the tested population, even a small false positive rate produces many more false positives than true positives in absolute numbers, because the healthy population is vastly larger. For example, testing for a disease with a 1% prevalence using a test with 90% sensitivity and a 10% false positive rate still yields a posterior of only ~8.3% — meaning over 90% of positive results are false alarms. This is why population-level screening programs require very high specificity and careful interpretation.
What is a likelihood ratio and why does it matter?
The likelihood ratio (LR) is the ratio P(B|A) / P(B|¬A), measuring how much more likely the evidence B is when hypothesis A is true compared to when it is false. A high likelihood ratio means the evidence strongly supports A, while an LR near 1 means the evidence is uninformative. Likelihood ratios are particularly useful in clinical medicine and forensic science because they allow rapid mental multiplication: the posterior odds equal the prior odds multiplied by the likelihood ratio, providing an intuitive way to chain multiple pieces of independent evidence together.
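The odds form described above can be sketched as follows (a minimal illustration; the helper names are not from any particular library). Note that with the screening numbers used earlier, LR = 0.90 / 0.10 = 9, and the odds form recovers the same ~8.3% posterior:

```python
def posterior_from_odds(prior_prob, likelihood_ratio):
    """Posterior probability via the odds form of Bayes' Theorem:
    posterior odds = prior odds × likelihood ratio."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    # convert odds back to a probability
    return posterior_odds / (1 + posterior_odds)
```

With prior 0.01 and LR = 9: prior odds are 1/99, posterior odds are 9/99, and the posterior probability is 9/108 ≈ 0.0833, matching the direct calculation.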
Can Bayes' Theorem be applied sequentially with multiple pieces of evidence?
Yes — one of the most elegant properties of Bayes' Theorem is that it can be applied iteratively. The posterior from one application becomes the prior for the next when new independent evidence arrives. For example, after a positive test result raises your posterior probability from 1% to 8.3%, if a second independent positive test is obtained, you apply Bayes' Theorem again using 8.3% as the new prior. This sequential updating is the foundation of Bayesian filtering algorithms like the Kalman filter used in navigation systems, and online learning algorithms in artificial intelligence.
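Sequential updating with the two-test scenario above can be sketched in a short loop (an illustration under the assumption that the two tests are conditionally independent given disease status):

```python
p = 0.01          # initial prior: 1% prevalence
sensitivity = 0.90
fpr = 0.10

# Each positive result feeds the previous posterior back in as the prior.
for _ in range(2):
    p = sensitivity * p / (sensitivity * p + fpr * (1 - p))
# after one positive test:  p ≈ 0.0833
# after two positive tests: p = 0.45
```

Two independent positive results raise the probability from 1% to 45% — still below 50%, again because the starting base rate is so low.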
Last updated: 2025-01-15 · Formula verified against primary sources.