The Standard Engine.

Sample Size Calculator

Calculates the minimum sample size required for a population proportion study given a desired confidence level, margin of error, and estimated population proportion.

Formula

n = Z² · p(1−p) / e²

Here n is the required sample size. Z is the Z-score corresponding to the desired confidence level (e.g., 1.96 for 95% confidence). p is the estimated population proportion (use 0.5 if unknown, which maximises the required sample size). e is the margin of error expressed as a decimal (e.g., 0.05 for ±5%). For finite populations, apply the finite population correction: n_adjusted = n / (1 + (n − 1) / N), where N is the total population size.

Source: Cochran, W.G. (1977). Sampling Techniques, 3rd Edition. John Wiley & Sons.

How it works

Sample size determination is rooted in the mathematics of probability and estimation theory. When drawing a random sample from a population, there is always a degree of uncertainty about whether the sample accurately reflects the whole. Statistical theory lets us quantify this uncertainty through two key parameters: the confidence level and the margin of error. The confidence level (commonly 95%) expresses how often an interval constructed this way would contain the true population parameter if the study were repeated many times. The margin of error (commonly ±5%) is the half-width of that interval, that is, the largest deviation from the true value you are willing to accept.

The core formula — n = Z² · p(1−p) / e² — directly links these parameters to sample size. The Z-score converts the confidence level into a standardised value from the normal distribution: 1.645 for 90%, 1.96 for 95%, 2.326 for 98%, and 2.576 for 99%. The term p(1−p) represents the variance of a Bernoulli distribution; it is maximised at p = 0.5, which is why 50% is the conservative default when the true proportion is unknown. The margin of error e is entered as a decimal. All three components together determine how many observations are needed to meet your precision requirements.
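The core formula can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own implementation: the function names z_score and sample_size are hypothetical, and the Z-score is computed from the standard normal distribution with statistics.NormalDist instead of being taken from the rounded table values above, so results may differ very slightly from hand calculations that use 1.96.

```python
import math
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided standard normal critical value for a confidence level in (0, 1)."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # A two-sided interval leaves (1 - confidence) / 2 in each tail.
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


def sample_size(confidence: float, margin: float, p: float = 0.5) -> int:
    """n = Z^2 * p * (1 - p) / e^2, rounded up to a whole respondent."""
    if margin <= 0.0:
        raise ValueError("margin must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    z = z_score(confidence)
    return math.ceil(z * z * p * (1.0 - p) / (margin * margin))


if __name__ == "__main__":
    for level in (0.90, 0.95, 0.98, 0.99):
        print(f"{level:.0%}: Z = {z_score(level):.3f}, n = {sample_size(level, 0.05)}")
```

Rounding up is deliberate: rounding down would leave the margin of error slightly wider than requested.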

For finite populations, the unadjusted formula over-estimates the necessary sample size. The finite population correction (FPC) — n_adj = n / (1 + (n−1)/N) — reduces the requirement proportionally to how large the sample is relative to the total population. This correction becomes meaningful when the sample would represent more than 5% of the population. The calculator applies the FPC automatically when you enter a population size greater than zero, which is especially useful for internal company surveys, clinical trials with limited patient pools, or quality audits of finite production batches.
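The correction itself is a one-line adjustment. The sketch below assumes the behaviour described above (no correction unless a population size greater than zero is supplied); the names apply_fpc and sampling_fraction are illustrative only.

```python
import math


def apply_fpc(n: float, population: int) -> int:
    """Finite population correction: n_adj = n / (1 + (n - 1) / N), rounded up."""
    if population <= 0:
        # No population size given: treat the population as effectively infinite.
        return math.ceil(n)
    return math.ceil(n / (1.0 + (n - 1.0) / population))


def sampling_fraction(n: float, population: int) -> float:
    """Share of the population that the uncorrected sample would represent."""
    return n / population


if __name__ == "__main__":
    for population in (1_000, 4_000, 1_000_000):
        adjusted = apply_fpc(385, population)
        fraction = sampling_fraction(385, population)
        print(f"N = {population}: n_adj = {adjusted}, sampling fraction = {fraction:.2%}")
```

With a population of one million the corrected figure is unchanged, which matches the rule of thumb that the correction only matters once the sample exceeds roughly 5% of the population.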

Worked example

A market research firm wants to survey customers of a retail chain to estimate the proportion who are satisfied with their in-store experience. The team wants a 95% confidence level and a margin of error of ±4%. They have no prior data on satisfaction, so they use the conservative proportion of p = 50%.

Step 1 — Identify the Z-score: For 95% confidence, Z = 1.96.

Step 2 — Apply the formula:
n = (1.96² × 0.50 × 0.50) / 0.04²
n = (3.8416 × 0.25) / 0.0016
n = 0.9604 / 0.0016
n = 600.25 → rounded up to 601 respondents

Step 3 — Apply the finite population correction: The chain has N = 4,000 loyalty card members. Applying the FPC:
n_adj = 601 / (1 + (601 − 1) / 4000)
n_adj = 601 / (1 + 0.15)
n_adj = 601 / 1.15 ≈ 522.6 → rounded up to 523 respondents

By accounting for the finite population, the firm reduces its required sample by 78 respondents — saving interview time and cost — while maintaining the same statistical validity. The adjusted sample represents approximately 13.1% of the total population.
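The arithmetic in this example can be checked with a short script. It uses the rounded table value Z = 1.96, as the steps above do; the variable names are chosen here for illustration.

```python
import math

Z = 1.96    # Step 1: Z-score for 95% confidence (rounded table value)
p = 0.50    # conservative proportion
e = 0.04    # margin of error of ±4%, as a decimal
N = 4000    # loyalty card members

# Step 2: uncorrected sample size, rounded up to a whole respondent.
n_raw = (Z ** 2 * p * (1 - p)) / e ** 2
n = math.ceil(n_raw)

# Step 3: finite population correction, again rounded up.
n_adj_raw = n / (1 + (n - 1) / N)
n_adj = math.ceil(n_adj_raw)

saved = n - n_adj
share = 100 * n_adj / N

print(f"uncorrected: {n_raw:.2f} -> {n}")
print(f"corrected:   {n_adj_raw:.1f} -> {n_adj}")
print(f"saved: {saved} respondents, sample is {share:.1f}% of the population")
```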

Limitations & notes

This calculator assumes simple random sampling, meaning every member of the population has an equal chance of being selected. Results may not be valid for cluster sampling, stratified sampling, or systematic sampling designs, which require more complex formulas and design effect adjustments. Without the finite population correction, the formula assumes the population is large relative to the sample. In either form it assumes that responses are independent, which may not hold in small or tightly clustered communities, and it relies on the normal approximation, which becomes less reliable when the expected proportion is very close to 0 or 1.

The margin of error calculated here applies specifically to proportions (yes/no, agree/disagree questions). For continuous measurements such as average income or test scores, different formulas based on the population standard deviation are required. The formula also does not account for non-response: if a significant portion of your sampled respondents do not reply, the achieved sample size shrinks and your precision degrades. Researchers should inflate their target sample size by the expected response rate (e.g., divide by 0.8 if you expect 20% non-response). This restores precision but does not remove non-response bias, which arises when those who reply differ systematically from those who do not. Finally, an adequate sample size is a necessary but not sufficient condition for a valid study: sampling frame errors, measurement bias, and confounding variables can invalidate results regardless of how large the sample is.
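The non-response adjustment mentioned above amounts to dividing the required sample by the expected response rate. A minimal sketch follows; the function name inflate_for_nonresponse is hypothetical, and the response rate is an estimate you must supply from prior experience.

```python
import math


def inflate_for_nonresponse(n_required: int, expected_response_rate: float) -> int:
    """Number of people to invite so that about n_required are expected to reply."""
    if not 0.0 < expected_response_rate <= 1.0:
        raise ValueError("expected_response_rate must be in (0, 1]")
    return math.ceil(n_required / expected_response_rate)


if __name__ == "__main__":
    # 20% expected non-response means an 80% response rate.
    print(inflate_for_nonresponse(523, 0.8))
```

This only compensates for the loss of precision. It does nothing about bias if the people who respond differ from those who do not.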

Frequently asked questions

Why is 50% used as the default population proportion?

The term p(1−p) in the formula reaches its maximum value of 0.25 when p = 0.5. Using 50% therefore produces the largest — and most conservative — sample size estimate. This ensures adequate precision even if the true proportion turns out to be different from your initial guess. If you have reliable prior data suggesting the proportion is closer to 20% or 80%, using that value will reduce your required sample size.
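To see how the assumed proportion changes the requirement, the sketch below tabulates p(1−p) and the resulting sample size across a range of p values. It assumes 95% confidence with the rounded Z = 1.96 and a ±5% margin of error; the function name n_for_proportion is illustrative.

```python
import math

Z = 1.96   # 95% confidence (rounded table value)
E = 0.05   # margin of error of ±5%, as a decimal


def n_for_proportion(p: float) -> int:
    """Uncorrected sample size for an assumed proportion p."""
    return math.ceil(Z ** 2 * p * (1 - p) / E ** 2)


if __name__ == "__main__":
    for p in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
        print(f"p = {p:.1f}  p(1-p) = {p * (1 - p):.2f}  n = {n_for_proportion(p)}")
```

The values are symmetric around p = 0.5, so assuming 20% or 80% gives the same requirement.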

What is the difference between confidence level and margin of error?

The confidence level expresses how reliably your interval captures the true population value across repeated samples. A 95% confidence level means that if you ran the same study 100 times, approximately 95 of the resulting intervals would contain the true proportion. The margin of error is the half-width of that interval — for example, a result of 42% ± 5% means the true value likely falls between 37% and 47%. These two parameters trade off against each other and against sample size: a higher confidence level or smaller margin of error always requires a larger sample.

When should I apply the finite population correction?

The finite population correction (FPC) becomes important when your sample size is more than roughly 5% of the total population. For large populations of tens of thousands or more, the correction has negligible effect and the infinite-population formula is sufficient. However, for internal surveys of a company's workforce, audits of a production run, or studies targeting a specific small community, applying the FPC can meaningfully reduce the required sample size and save resources.

How does sample size affect the margin of error?

Sample size and margin of error have an inverse square-root relationship. Doubling the sample size does not halve the margin of error — it only reduces it by a factor of √2 (about 29%). To cut the margin of error in half, you must quadruple the sample size. This diminishing return is why very small margins of error (below ±2%) become extremely expensive to achieve in practice, and why ±5% is the most common standard in opinion polling and market research.
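Rearranging the core formula for e gives e = Z · √(p(1−p) / n), which makes the square-root relationship easy to check numerically. The sketch below assumes Z = 1.96 and p = 0.5 as defaults; margin_of_error is an illustrative name.

```python
import math


def margin_of_error(n: int, z: float = 1.96, p: float = 0.5) -> float:
    """Margin of error for a proportion: e = z * sqrt(p * (1 - p) / n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    base = margin_of_error(1000)
    for n in (1000, 2000, 4000):
        e = margin_of_error(n)
        print(f"n = {n}: e = {e:.4f} ({e / base:.3f} of the n = 1000 margin)")
```

Doubling n from 1000 to 2000 shrinks the margin to about 71% of its original value, while quadrupling it to 4000 cuts the margin in half.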

Can this calculator be used for clinical trials or medical research?

This calculator is appropriate for estimating a single proportion to a given precision (e.g., the proportion of patients who respond to a treatment). However, clinical trials often require more sophisticated power calculations that account for the Type I error rate (alpha), the Type II error rate (beta, where statistical power is 1 − beta), effect size, and sometimes interim analyses. Tools based on the specific statistical test being used, such as a two-proportion z-test or a t-test, should be used for formal trial design. Regulatory submissions to bodies like the FDA or EMA require detailed power and sample size justifications prepared with validated statistical software.

Last updated: 2025-01-15 · Formula verified against primary sources.