The Standard Engine.

Binomial Distribution Calculator

Calculates the binomial probability P(X = k), cumulative probability P(X ≤ k), mean, variance, and standard deviation for a binomial experiment with n trials and success probability p.

Formula

The probability of exactly k successes in n independent trials is P(X = k) = C(n, k) · p^k · (1−p)^(n−k). Here n is the total number of trials, k is the number of successes of interest (0 ≤ k ≤ n), p is the probability of success on any single trial, and (1 − p) is the probability of failure. The binomial coefficient C(n, k) counts the number of ways to arrange k successes among n trials. The mean is μ = np, the variance is σ² = np(1−p), and the standard deviation is σ = √(np(1−p)).

Source: Wackerly, D., Mendenhall, W., & Scheaffer, R. (2008). Mathematical Statistics with Applications, 7th ed. Thomson Brooks/Cole. Chapter 3.

How it works

A binomial experiment has four defining characteristics: a fixed number of trials (n), each trial is independent of all others, each trial has exactly two outcomes (success or failure), and the probability of success (p) is constant across all trials. When all four conditions are satisfied, the number of successes X follows a binomial distribution, written X ~ B(n, p). This distribution is discrete, meaning X can only take integer values from 0 to n.

The probability mass function (PMF) is P(X = k) = C(n, k) · p^k · (1−p)^(n−k). The binomial coefficient C(n, k) = n! / (k!(n−k)!) counts the number of distinct ways to choose k successes from n trials, regardless of order. The term p^k represents the probability of those k successes occurring, and (1−p)^(n−k) represents the probability of the remaining (n−k) failures. The cumulative distribution function (CDF), P(X ≤ k), sums the PMF from 0 to k and gives the probability of observing at most k successes. The distribution's mean is μ = np, its variance is σ² = np(1−p), and its standard deviation is σ = √(np(1−p)).
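The PMF, CDF, and moments above can be written directly with Python's standard library. This is a minimal illustration, not the calculator's own code; the helper names `binom_pmf`, `binom_cdf`, and `binom_stats` are chosen here for clarity.

```python
from math import comb, sqrt

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) = C(n, k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k): sum of the PMF from 0 through k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def binom_stats(n: int, p: float) -> tuple:
    """Mean, variance, and standard deviation of B(n, p)."""
    mean = n * p
    var = n * p * (1 - p)
    return mean, var, sqrt(var)
```

Direct evaluation like this is fine for moderate n; for large n, the log-space approach described below is preferred.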

Practical applications are vast. A pharmaceutical company testing a drug with a known 30% response rate can calculate the probability that at least 5 out of 12 patients respond. A manufacturer knowing a 2% defect rate can determine the probability of finding 0 defective items in a batch of 50. In finance, binomial models underpin option pricing theory. In machine learning, the binomial distribution models classification accuracy under random guessing. This calculator handles n up to several hundred accurately by computing probabilities in log-space to avoid numerical overflow or underflow.
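The log-space technique mentioned above can be sketched as follows, assuming 0 < p < 1. The function names are illustrative, not the calculator's internals.

```python
from math import exp, lgamma, log, log1p

def log_binom_pmf(k: int, n: int, p: float) -> float:
    """log P(X = k), assuming 0 < p < 1.

    lgamma(n + 1) equals log(n!), so the three lgamma terms give
    log C(n, k) without ever forming the huge factorials.
    """
    log_coef = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_coef + k * log(p) + (n - k) * log1p(-p)

def binom_pmf_stable(k: int, n: int, p: float) -> float:
    """Exponentiate only at the very end; every intermediate stays
    finite even when n! would overflow a float (n > 170)."""
    return exp(log_binom_pmf(k, n, p))
```

Using `log1p(-p)` rather than `log(1 - p)` also preserves precision when p is very small.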

Worked example

Suppose a student guesses randomly on a multiple-choice exam with 10 questions, each having 4 answer choices (so p = 0.25). What is the probability of getting exactly 3 correct?

Step 1 — Identify parameters: n = 10, k = 3, p = 0.25, (1−p) = 0.75.

Step 2 — Compute the binomial coefficient: C(10, 3) = 10! / (3! × 7!) = (10 × 9 × 8) / (3 × 2 × 1) = 720 / 6 = 120.

Step 3 — Compute the probability terms: p^k = 0.25³ = 0.015625. (1−p)^(n−k) = 0.75⁷ = 0.133484.

Step 4 — Multiply: P(X = 3) = 120 × 0.015625 × 0.133484 = 0.250282, or about 25.03%.

Step 5 — Cumulative probability P(X ≤ 3): Summing P(X=0) + P(X=1) + P(X=2) + P(X=3) gives approximately 0.7759, meaning there is a 77.59% chance of getting 3 or fewer correct by pure guessing.

Mean: μ = 10 × 0.25 = 2.5 correct answers. Standard deviation: σ = √(10 × 0.25 × 0.75) = √1.875 ≈ 1.369.
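The five steps above can be reproduced in a few lines of Python (an illustrative check, not the calculator's own code):

```python
from math import comb, sqrt

n, k, p = 10, 3, 0.25                               # Step 1: parameters

coef = comb(n, k)                                   # Step 2: C(10, 3) = 120
pmf = coef * p**k * (1 - p)**(n - k)                # Steps 3-4: P(X = 3)
cdf = sum(comb(n, i) * p**i * (1 - p)**(n - i)
          for i in range(k + 1))                    # Step 5: P(X <= 3)
mean, sd = n * p, sqrt(n * p * (1 - p))             # mu = 2.5, sigma ~ 1.369
```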

Limitations & notes

The binomial distribution requires strict independence between trials — if outcomes influence each other (e.g., sampling without replacement from a small population), the hypergeometric distribution is more appropriate. The probability p must remain constant across all trials; if p changes over time or across trials, a more general model is needed. For very large n combined with very small p (rare events), the Poisson distribution with λ = np often provides a computationally simpler and equally accurate approximation. Similarly, when np ≥ 5 and n(1−p) ≥ 5, the normal distribution with μ = np and σ² = np(1−p) can be used as an approximation, which is especially useful for calculating tail probabilities. This calculator computes probabilities in log-space to maintain numerical accuracy for large n, but for n greater than a few thousand combined with extreme values of k, results should be verified with dedicated statistical software such as R, Python (SciPy), or Mathematica.
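The Poisson rule of thumb above can be checked numerically. The parameters here (n = 500, p = 0.004, so λ = 2) are an illustrative rare-event case, not from the text.

```python
from math import comb, exp, factorial

n, p = 500, 0.004        # large n, small p: rare-event regime
lam = n * p              # Poisson rate lambda = np = 2

def binom_pmf(k: int) -> float:
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k: int) -> float:
    return exp(-lam) * lam**k / factorial(k)

# Largest pointwise gap between the two PMFs over small k.
gap = max(abs(binom_pmf(k) - poisson_pmf(k)) for k in range(10))
```

In this regime the two PMFs agree to roughly three decimal places, which is why the Poisson form is often preferred for rare events.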

Frequently asked questions

What is the difference between P(X = k) and P(X ≤ k) in the binomial distribution?

P(X = k) is the exact probability of getting precisely k successes — no more, no less. P(X ≤ k) is the cumulative probability of getting k or fewer successes, which equals the sum of P(X=0) + P(X=1) + ... + P(X=k). The cumulative form is often more useful in hypothesis testing and decision-making contexts.

When can I use the normal approximation to the binomial distribution?

The normal approximation is reliable when both np ≥ 5 and n(1−p) ≥ 5. Under these conditions, X is approximately normally distributed with mean μ = np and variance σ² = np(1−p). Apply a continuity correction by computing P(X ≤ k) ≈ P(Z ≤ (k + 0.5 − np) / √(np(1−p))) to improve accuracy.
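The continuity-corrected formula can be expressed with only the standard library, since `erf` yields the standard normal CDF. The n = 100, p = 0.3 check is an illustrative choice satisfying np ≥ 5 and n(1−p) ≥ 5.

```python
from math import comb, erf, sqrt

def binom_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) by summing the PMF."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) via the normal approximation with continuity correction."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    z = (k + 0.5 - mu) / sigma                    # continuity correction
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))       # standard normal CDF at z
```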

What is the difference between the binomial and hypergeometric distributions?

The binomial distribution applies when sampling with replacement (or from a very large population), so each trial's probability p stays constant. The hypergeometric distribution applies when sampling without replacement from a finite population, meaning p changes with each draw. For population sizes more than 20 times the sample size, the two distributions give nearly identical results.
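The 20-to-1 rule of thumb can be illustrated by comparing the two PMFs directly; the population numbers below (N = 200, K = 50, so p = 0.25) are an example, and the function names are chosen here for clarity.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """Exactly k successes when drawing n items without replacement
    from a population of N containing K successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Population (N = 200) is 20x the sample (n = 10), p = K/N = 0.25:
# the two models nearly coincide.
diff = abs(binom_pmf(3, 10, 0.25) - hypergeom_pmf(3, 200, 50, 10))
```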

How do I calculate the probability of at least k successes using this calculator?

P(X ≥ k) = 1 − P(X ≤ k−1). For example, to find the probability of at least 4 successes, calculate the cumulative probability with k set to 3, then subtract from 1. If k = 0, then P(X ≥ 0) = 1 by definition since at least 0 successes always occurs.
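The complement rule above translates into a short helper (illustrative names, not the calculator's API):

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) = 1 - P(X <= k - 1); equals 1 when k == 0."""
    return 1.0 if k == 0 else 1.0 - binom_cdf(k - 1, n, p)
```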

Why does the binomial distribution assume independence between trials?

Independence ensures that the outcome of one trial carries no information about any other trial, which is the mathematical foundation required for the probability of any specific sequence of k successes and (n−k) failures to equal p^k(1−p)^(n−k). Violations of independence — such as contagion in disease spread, or sequential dependencies in quality control — invalidate the model and typically require Markov chain models, the negative binomial distribution, or simulation-based methods instead.

Last updated: 2025-01-15 · Formula verified against primary sources.