
Statistical Power Calculator

Calculates the statistical power of a hypothesis test given sample size, effect size, significance level, and test type.


Formula

Power (1 − β) is the probability of correctly rejecting a false null hypothesis. Here, β is the Type II error rate; z_α is the critical z-value corresponding to significance level α (one-tailed: z_α = 1.645 for α = 0.05; two-tailed: z_α/2 = 1.96); δ is the true difference between means (the effect); σ is the population standard deviation; and n is the sample size per group. For a standardized effect size d = δ/σ and two independent groups of n subjects each, the non-centrality parameter λ = d√(n/2) drives the power calculation.
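In symbols, the relationship described above can be written compactly (a restatement of the definitions in this section; Φ is the standard normal CDF, and the negligible opposite-tail term of the two-tailed test is dropped):

```latex
\mathrm{Power} \;=\; 1 - \beta \;\approx\; \Phi\!\left( d\sqrt{\tfrac{n}{2}} \;-\; z_{\alpha/2} \right),
\qquad d = \frac{\delta}{\sigma},
\qquad \lambda = d\sqrt{\tfrac{n}{2}}
```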

Source: Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Lawrence Erlbaum Associates.

How it works

Statistical power is the complement of the Type II error rate (β): the probability of rejecting a false null hypothesis. A test with 80% power has a 20% chance of failing to detect a true effect — the conventional minimum acceptable threshold in most research fields. Power depends on four interrelated quantities: the effect size, the sample size, the significance level (α), and whether the test is one-tailed or two-tailed. Increasing any of the first three factors raises power; tightening α (e.g., from 0.05 to 0.01) reduces it.

For a two-sample z-test or large-sample t-test with n subjects per group, the non-centrality parameter is λ = d√(n/2), where d is Cohen's standardized effect size (d = δ/σ). Power is approximately the probability that a standard normal variable exceeds the threshold z_α − λ, i.e. Φ(λ − z_α); the tiny contribution from the opposite rejection tail is ignored. This calculator uses the standard normal CDF approximation (Abramowitz & Stegun 7.1.26) to evaluate that probability, giving results accurate to well within ±0.0001 across typical input ranges. It also back-calculates the per-group sample size required to achieve 80% power, n = 2((z_α + z_β)/d)² with z_β = 0.8416, as a planning reference.
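As a sketch of the same computation (not the site's actual code), Python's standard library `statistics.NormalDist` provides the normal CDF and its inverse directly, so no Abramowitz & Stegun polynomial is needed:

```python
from math import ceil, sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution

def power_two_sample(d: float, n: int, alpha: float = 0.05,
                     two_tailed: bool = True) -> float:
    """Approximate power of a two-sample z-test with n subjects per group
    and standardized effect size d = delta/sigma (Cohen's d)."""
    z_alpha = _Z.inv_cdf(1 - alpha / 2) if two_tailed else _Z.inv_cdf(1 - alpha)
    lam = d * sqrt(n / 2)              # non-centrality parameter
    return _Z.cdf(lam - z_alpha)       # opposite-tail term is negligible

def required_n(d: float, power: float = 0.80, alpha: float = 0.05,
               two_tailed: bool = True) -> int:
    """Per-group sample size needed to reach the target power."""
    z_alpha = _Z.inv_cdf(1 - alpha / 2) if two_tailed else _Z.inv_cdf(1 - alpha)
    z_beta = _Z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)
```

For example, `power_two_sample(0.5, 64)` returns about 0.807, and `required_n(0.5)` returns 63, matching the classic benchmark of roughly 64 subjects per group for 80% power at a medium effect.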

Cohen's d benchmarks provide practical guidance: d = 0.2 is considered a small effect, d = 0.5 medium, and d = 0.8 large. Clinical and safety-critical studies often target 90% or 95% power. A priori power analysis performed before data collection is far more defensible than post hoc power computed after a non-significant result: post hoc power is a deterministic function of the observed p-value and carries no additional information.

Worked example

Suppose a researcher is designing a randomized controlled trial comparing a new drug to placebo. Based on prior literature, they expect a standardized mean difference of d = 0.5 (medium effect). They plan to use a two-tailed test at α = 0.05 and recruit n = 64 participants per group.

Step 1 — Critical z-value: For a two-tailed test at α = 0.05, z_α = 1.96.

Step 2 — Non-centrality parameter: λ = d × √(n/2) = 0.5 × √32 ≈ 0.5 × 5.657 ≈ 2.828.

Step 3 — z for power: z_power = λ − z_α = 2.828 − 1.96 = 0.868.

Step 4 — Power from normal CDF: Φ(0.868) ≈ 0.807, meaning the study has approximately 80.7% power to detect this medium effect with n = 64 per group.

Step 5 — Type II error rate: β = 1 − 0.807 = 0.193, a roughly 19% chance of a false negative.

Step 6 — Required n for 80% power: n = 2 × ((z_α + z_β) / d)² = 2 × ((1.96 + 0.8416) / 0.5)² = 2 × 5.6032² ≈ 62.8, rounded up to 63 per group. The researcher's planned n = 64 per group just clears this threshold, confirming the study is adequately powered at the conventional 80% level.
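The steps above can be reproduced with a few lines of Python (a sketch using the standard library's `NormalDist`; hand-rounded intermediate values may differ in the last digit):

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()

d, n, alpha = 0.5, 64, 0.05          # planned design
z_alpha = Z.inv_cdf(1 - alpha / 2)   # Step 1: 1.96 (two-tailed)
lam = d * sqrt(n / 2)                # Step 2: non-centrality, ~2.828
power = Z.cdf(lam - z_alpha)         # Steps 3-4: ~0.807
beta = 1 - power                     # Step 5: ~0.193
z_beta = Z.inv_cdf(0.80)             # 0.8416 for 80% power
n_req = ceil(2 * ((z_alpha + z_beta) / d) ** 2)   # Step 6: 63 per group

print(f"power={power:.3f}, beta={beta:.3f}, required n={n_req}")
```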

Limitations & notes

This calculator assumes a two-sample z-test framework with equal group sizes and a known (or well-estimated) effect size. For small samples (n < 30 per group), a t-distribution with finite degrees of freedom should be used, and power will be slightly lower than reported here. The effect size input (Cohen's d) must be estimated from prior research or pilot data; poorly chosen effect sizes are the most common source of error in power analyses. The required-n output targets exactly 80% power; for higher thresholds (90%, 95%), use n = 2((z_α + z_β)/d)² per group with z_β = 1.2816 or 1.6449 respectively. This tool does not cover non-normal outcomes, paired designs, ANOVA, chi-squared tests, or survival analyses, each of which requires its own power formula. Post hoc power analysis (computing power after collecting data) is discouraged by most statisticians and journal editors, as it provides no information beyond the p-value already observed.
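The back-calculation for higher power targets can be sketched as follows (the z_β values come from the normal inverse CDF rather than the hard-coded constants; d = 0.5 is used purely for illustration):

```python
from math import ceil
from statistics import NormalDist

Z = NormalDist()
z_alpha = Z.inv_cdf(0.975)           # two-tailed test at alpha = 0.05
d = 0.5                              # medium effect, for illustration

results = {}
for target in (0.80, 0.90, 0.95):
    z_beta = Z.inv_cdf(target)       # 0.8416, 1.2816, 1.6449
    results[target] = ceil(2 * ((z_alpha + z_beta) / d) ** 2)
    print(f"{target:.0%} power: n = {results[target]} per group")
```

For a medium effect this yields 63, 85, and 104 subjects per group for 80%, 90%, and 95% power respectively, showing how quickly recruitment costs grow with the power target.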

Frequently asked questions

What is a good statistical power level for a study?

The widely accepted minimum is 80% power (β = 0.20), as established by Cohen (1988). Clinical trials and studies with high-stakes decisions often require 90% or even 95% power to reduce the risk of false negatives. Regulatory agencies such as the FDA typically require at least 80–90% power for confirmatory trials.

What is the difference between Type I and Type II errors?

A Type I error (false positive) occurs when you reject a true null hypothesis; its probability is α. A Type II error (false negative) occurs when you fail to reject a false null hypothesis; its probability is β. Statistical power equals 1 − β, so higher power means fewer false negatives. Researchers must balance both error types depending on the consequences of each mistake.

How do I choose the right effect size for my power analysis?

Effect size should be estimated from previous studies, meta-analyses, or domain expertise — not from your own pilot data, which can be unreliable due to small sample sizes. Cohen's benchmarks (small d = 0.2, medium d = 0.5, large d = 0.8) are useful starting points when no prior data exists. Always use the smallest effect size that would be clinically or practically meaningful, not the largest one you might hope for.

Why does increasing sample size increase statistical power?

Larger samples produce more precise estimates of the population parameter, reducing the standard error (σ√(2/n) for the difference of two group means). This narrows the sampling distribution under the null hypothesis and shifts the non-centrality parameter λ = d√(n/2) upward, making it easier to distinguish the true effect from random noise. The non-centrality parameter scales with √n, so quadrupling the sample size doubles λ (though power itself saturates near 1).
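The √n relationship is easy to confirm numerically (a sketch; each step quadruples n, which exactly doubles λ while power climbs toward its ceiling of 1):

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()
d = 0.3                              # small-to-medium effect, for illustration
z_alpha = Z.inv_cdf(0.975)           # two-tailed alpha = 0.05

rows = []
for n in (25, 100, 400):             # each entry quadruples the previous n
    lam = d * sqrt(n / 2)            # non-centrality parameter
    rows.append((n, lam, Z.cdf(lam - z_alpha)))
    print(f"n={n:4d}  lambda={lam:.3f}  power={rows[-1][2]:.3f}")
```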

Is it valid to calculate power after seeing a non-significant result?

Post hoc (observed) power is strongly discouraged by statisticians because it is a deterministic function of the p-value — a non-significant result always corresponds to low observed power, providing no independent information. If your study was non-significant, report the confidence interval for the effect size instead, which gives readers the range of plausible true effects that are consistent with your data.

Last updated: 2025-01-15 · Formula verified against primary sources.