P-Value Calculator — Z-Test One & Two-Tailed | LazyTools


P-Value Calculator (Z-Test)

Calculate the p-value from a Z-score (or standard normal test statistic) for one-tailed or two-tailed hypothesis tests. Results are automatically evaluated at the α = 0.05, 0.01, and 0.001 significance thresholds, with a plain-English interpretation.


How to use the P-Value Calculator (Z-Test)

Enter the Z-score

Type the standardised test statistic. For a one-sample Z-test: Z = (x̄ − µ₀)/(σ/√n). For comparing two proportions: Z = (p̂₁ − p̂₂)/SE, where SE is the standard error of the difference. Large |Z| values give small p-values.

Select the test type

Two-tailed (H₁: µ ≠ µ₀): p = 2 × P(Z > |z|). One-tailed upper (H₁: µ > µ₀): p = P(Z > z). One-tailed lower (H₁: µ < µ₀): p = P(Z < z). Choose the tail based on your alternative hypothesis, not on what looks significant.

Click Calculate

The p-value appears alongside its significance at three standard alpha thresholds. The accompanying note explains the correct interpretation of p: it is not the probability that H₀ is true.

Interpret correctly

p < 0.05: reject H₀ at the 5% level (statistically significant). p < 0.01: highly significant. p ≥ 0.05: insufficient evidence to reject H₀, which is not the same as showing that H₀ is true.

Apply to Z-score from data

Z = (x̄ − µ₀)/(σ/√n) requires a known population σ. When σ is unknown, use the t-distribution with n−1 degrees of freedom; it approaches the standard normal distribution for large n.
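The two Z formulas from the steps above can be sketched in Python. The function names are hypothetical, the sample numbers are made up for illustration, and the pooled standard error in the two-proportion case is an assumption, since the page does not say how SE is computed.

```python
import math

def one_sample_z(sample_mean, mu0, sigma, n):
    """Z = (sample mean - mu0) / (sigma / sqrt(n)), with known population sigma."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

def two_proportion_z(successes1, n1, successes2, n2):
    """Z = (p1_hat - p2_hat) / SE, using a pooled SE (assumed; SE is not defined on the page)."""
    p1 = successes1 / n1
    p2 = successes2 / n2
    pooled = (successes1 + successes2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Illustrative inputs only: sample mean 103, H0 mean 100, sigma 15, n = 100
print(round(one_sample_z(103, 100, 15, 100), 3))
# Illustrative inputs only: 60/100 successes versus 50/100 successes
print(round(two_proportion_z(60, 100, 50, 100), 3))
```

Either result can then be entered as the Z-score in the calculator.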

Variants, options and when to use each

Z-score	Two-tailed p	Interpretation
1.645	0.100	Critical value for α = 0.10
1.960	0.050	Critical value for α = 0.05
2.326	0.020	Critical value for α = 0.02
2.576	0.010	Critical value for α = 0.01
3.291	0.001	Critical value for α = 0.001
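The table can be reproduced with Python's standard library. This is a minimal sketch using `statistics.NormalDist` (Python 3.8+); the helper names are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def two_tailed_p(z):
    """p = 2 * P(Z > |z|) under the standard normal distribution."""
    return 2 * (1 - std_normal.cdf(abs(z)))

def two_tailed_critical_z(alpha):
    """|Z| cut-off whose two-tailed p-value equals alpha."""
    return std_normal.inv_cdf(1 - alpha / 2)

# Z-scores from the table above
for z in (1.645, 1.960, 2.326, 2.576, 3.291):
    print(f"Z = {z:.3f}  two-tailed p = {two_tailed_p(z):.3f}")
```

Going the other way, `two_tailed_critical_z(0.05)` recovers the familiar cut-off of about 1.960.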

The formula explained

p (two-tailed) = 2 × [1 − Φ(|Z|)] | p (one-tailed upper) = 1 − Φ(Z) | p (one-tailed lower) = Φ(Z)

p = p-value — probability of Z this extreme or more under H₀
Φ(Z) = standard normal CDF = P(Z ≤ z)
Z = standardised test statistic
α = significance level (threshold for rejecting H₀)

The p-value is the probability of obtaining a test statistic as extreme or more extreme than the observed value, assuming H₀ is true. For a two-tailed test, both tails are included: p = 2 × P(Z > |z|). For a one-tailed test, only the tail named in the alternative hypothesis is used. The p-value does NOT measure the probability that H₀ is true; this is one of the most common misinterpretations in statistics.
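As a rough sketch of how these formulas might be implemented (not necessarily how this calculator does it), the upper-tail probability 1 − Φ(z) can be computed from the complementary error function in Python's `math` module. The function names and the tail labels 'two', 'upper', and 'lower' are hypothetical.

```python
import math

def upper_tail(z):
    """P(Z > z) = 1 - Phi(z), via erfc to keep precision in the far tail."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def p_value(z, tail="two"):
    """p-value for a Z statistic; tail is 'two', 'upper', or 'lower'."""
    if tail == "two":
        return 2 * upper_tail(abs(z))
    if tail == "upper":
        return upper_tail(z)
    if tail == "lower":
        return upper_tail(-z)  # P(Z < z) = P(Z > -z) by symmetry
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

print(p_value(1.96, "two"))
print(p_value(1.96, "upper"))
print(p_value(1.96, "lower"))
```

Note that for a positive Z the lower-tail p-value is close to 1, which is why the tail must match the alternative hypothesis.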

Worked example — Z = 2.5, two-tailed test

Step	Calculation	Result
Φ(2.5)	P(Z ≤ 2.5)	0.99379
P(Z > 2.5)	1 − 0.99379	0.00621
Two-tailed p	2 × 0.00621	p = 0.01242
Significant at α=0.05?	0.01242 < 0.05	Yes

Z = 2.5 gives p = 0.0124 (two-tailed), which is significant at α = 0.05 and α = 0.02, but not at α = 0.01. If H₀ were true, we would observe |Z| ≥ 2.5 only 1.24% of the time by chance. With α = 0.05, we reject H₀ and conclude that the observed difference is statistically significant.
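The worked example can be checked numerically. This is a minimal sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

z = 2.5
phi = NormalDist().cdf(z)   # Phi(2.5) = P(Z <= 2.5)
upper = 1 - phi             # P(Z > 2.5)
p_two_tailed = 2 * upper    # both tails

print(f"Phi(z) = {phi:.5f}, upper tail = {upper:.5f}, p = {p_two_tailed:.5f}")

# Compare p against the thresholds discussed in the text
decisions = {alpha: p_two_tailed < alpha for alpha in (0.05, 0.02, 0.01, 0.001)}
for alpha, reject in decisions.items():
    print(f"alpha = {alpha}: {'reject H0' if reject else 'fail to reject H0'}")
```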

What is a p-value in statistics?

The p-value is the probability, under the null hypothesis H₀, of observing a test statistic as extreme or more extreme than the one calculated from the data. A small p-value means the observed data would be unlikely if H₀ were true, which counts as evidence against H₀. The conventional significance level α = 0.05 means we reject H₀ when p < 0.05.

P-values are derived from the null distribution of the test statistic; in this calculator, that is the standard normal (Z) distribution. The Z-test applies when the population standard deviation σ is known and either the population is normal or the sample is large enough for the sample mean to be approximately normal (n > 30 is a common rule of thumb). For unknown σ and small samples, the t-distribution should be used instead.

The p-value does not equal the probability that H₀ is true, although reading it that way is one of the most common misinterpretations. A p-value is a conditional probability, computed on the assumption that H₀ is true. Bayesian posterior probabilities P(H₀|data) require a prior P(H₀) and are mathematically distinct from p-values.

Who uses this calculator?

Scientists and researchers use p-values in hypothesis testing across quantitative fields, including clinical trials, psychology experiments, biology, and engineering. Regulatory bodies (FDA, EMA) conventionally expect p < 0.05 as a minimum threshold for drug efficacy claims. Quality engineers use p-values in process capability testing and SPC (statistical process control). Data scientists use p-values in A/B testing for website and product optimisation.

Historical context and related concepts

The p-value was introduced by Karl Pearson in the early 1900s and formalised by Ronald Fisher in "Statistical Methods for Research Workers" (1925). Fisher proposed p < 0.05 as a conventional threshold for significance. Jerzy Neyman and Egon Pearson (1933) formalised the complementary framework of Type I and Type II errors. The p < 0.05 threshold has been debated extensively; the American Statistical Association issued a statement in 2016 cautioning against over-reliance on p-values alone.

Why p-value calculation is central to scientific inference and drug approval

Regulatory approval of new drugs typically relies on randomised controlled trials that show p < 0.05 (often with multiple hypothesis correction). A new drug showing efficacy with p = 0.01 in a Phase III trial provides strong statistical evidence in support of approval. P-values also guide resource allocation in research: statistically significant findings are more likely to receive further investigation and publication.

P-values and the replication crisis in science

Many scientific fields have experienced a "replication crisis": published findings with p < 0.05 often fail to replicate. Contributing causes include p-hacking (running or reporting analyses selectively until a significant result appears), small sample sizes (underpowered studies), and publication bias. Proposed remedies include pre-registration of hypotheses, larger samples, reporting effect sizes alongside p-values, and stricter significance thresholds (some researchers advocate p < 0.005 as the standard for discovery claims).

Frequently asked questions

What is the difference between p-value and significance level α?

α is the pre-specified probability of a Type I error (falsely rejecting H₀), typically 0.05. It is set before looking at the data, whereas the p-value is computed from the data. We reject H₀ when p < α. Treating p = 0.049 and p = 0.051 as fundamentally different (one "significant", one not) is statistically inappropriate.

What does "two-tailed" mean?

Two-tailed means the alternative hypothesis is H₁: µ ≠ µ₀, so the mean could be above or below µ₀. Both tails of the distribution are included: p = 2 × P(Z > |z|). One-tailed (upper) is H₁: µ > µ₀; one-tailed (lower) is H₁: µ < µ₀. The choice of tail must be made before seeing the data; switching to a one-tailed test after observing the direction is p-hacking.

What is the difference between Z-test and t-test?

The Z-test uses the standard normal distribution and requires a known population σ. The t-test uses the t-distribution with n−1 degrees of freedom when σ is estimated from the data. For large n (> 30), the t-distribution is close to the standard normal and the results are nearly identical. For small samples from non-normal populations, neither Z nor t provides exact p-values, and bootstrap methods may be preferred.

What is multiple comparison correction?

When testing many hypotheses simultaneously, the probability of at least one false positive increases. The Bonferroni correction uses α/m instead of α, where m is the number of tests. For 20 tests at α = 0.05, the individual threshold is 0.05/20 = 0.0025. False discovery rate (FDR) control (Benjamini-Hochberg) is less conservative and is often preferred for large-scale genomics and proteomics studies.
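Both corrections can be sketched in a few lines of Python. The function names are hypothetical, the p-values are made up for illustration, and the Benjamini-Hochberg code follows the standard step-up rule (reject the k smallest p-values, where k is the largest rank with p ≤ (k/m) × α).

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject H0 for each test whose p-value is below alpha / m."""
    m = len(p_values)
    return [p < alpha / m for p in p_values]

def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    reject = [False] * m
    for idx in order[:max_k]:
        reject[idx] = True
    return reject

# Made-up p-values for illustration only
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
print(bonferroni_reject(pvals))
print(benjamini_hochberg_reject(pvals))
```

With these eight example values, Bonferroni rejects only the smallest p-value, while Benjamini-Hochberg rejects the two smallest, which illustrates why it is described as less conservative.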

Can I use this for t-test p-values?

For large samples (n > 30), Z and t give very similar results, so this calculator gives a reasonable approximation to t-test p-values. For exact t-test p-values with small samples, use a t-distribution with n−1 degrees of freedom. The critical values differ most at low n: the two-tailed critical value at α = 0.05 is 2.571 for t with 5 df, versus 1.960 for Z.


