P-Value Calculator — Z-Test One & Two-Tailed | LazyTools

P-Value Calculator (Z-Test)

Calculate the p-value from a Z-score (a standard normal test statistic) for one-tailed or two-tailed hypothesis tests. Results are automatically evaluated at the α = 0.05, 0.01, and 0.001 significance thresholds, with a plain-English interpretation.

✓ Free forever ✓ No login required ✓ Works offline ✓ Instant results ✓ Step-by-step shown

How to use the P-Value Calculator (Z-Test)

1. Enter the Z-score

Type the standardised test statistic. For a one-sample Z-test: Z = (x̄ − µ₀)/(σ/√n). For comparing two proportions: Z = (p̂₁ − p̂₂)/SE, where SE is the standard error of the difference. The larger |Z| is, the smaller the p-value.

2. Select the test type

Two-tailed (H₁: µ ≠ µ₀): p = 2 × P(Z > |z|). One-tailed upper (H₁: µ > µ₀): p = P(Z > z). One-tailed lower (H₁: µ < µ₀): p = P(Z < z). Choose the tail based on your alternative hypothesis, not on what looks significant.

3. Click Calculate

The p-value appears alongside its significance at the three standard α thresholds. The accompanying insight explains the correct interpretation of p: it is not the probability that H₀ is true.

4. Interpret correctly

p < 0.05: reject H₀ at the 5% level (statistically significant). p < 0.01: highly significant. p ≥ 0.05 means there is insufficient evidence to reject H₀, not that H₀ is true.

5. Apply to a Z-score from data

Z = (x̄ − µ₀)/(σ/√n) requires a known population σ. When σ is unknown, use the t-distribution with n − 1 degrees of freedom; the t-distribution approaches the standard normal for large n.
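
The two Z formulas from the steps above can be sketched in Python as follows. This is a minimal illustration, not the calculator's own code: the function names and the sample numbers are hypothetical, and the two-proportion version assumes the pooled standard error, which is one common choice for SE.

```python
import math

def z_one_sample(sample_mean, mu0, sigma, n):
    """One-sample Z statistic: (sample mean - mu0) / (sigma / sqrt(n)).

    Assumes the population standard deviation sigma is known.
    """
    if sigma <= 0 or n <= 0:
        raise ValueError("sigma and n must be positive")
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

def z_two_proportions(successes1, n1, successes2, n2):
    """Two-proportion Z statistic: (p1_hat - p2_hat) / SE.

    Assumption: SE is the pooled standard error under H0 (equal proportions).
    """
    p1 = successes1 / n1
    p2 = successes2 / n2
    pooled = (successes1 + successes2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical data: sample mean 103, mu0 = 100, sigma = 15, n = 100.
z = z_one_sample(sample_mean=103.0, mu0=100.0, sigma=15.0, n=100)
print(z)  # 2.0
```

The resulting Z value is what you would type into step 1.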

Variants, options and when to use each

| Z-score | Two-tailed p | Interpretation |
|---------|--------------|----------------|
| 1.645 | 0.100 | Significant at α = 0.10 only |
| 1.960 | 0.050 | Critical value for α = 0.05 |
| 2.326 | 0.020 | Significant at α = 0.02 |
| 2.576 | 0.010 | Critical value for α = 0.01 |
| 3.291 | 0.001 | Critical value for α = 0.001 |
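
The Z-scores in this table are the two-tailed critical values for each α. As a rough check, they can be recovered from the inverse standard normal CDF using `statistics.NormalDist` from the Python standard library (3.8+). The helper name `critical_z_two_tailed` is illustrative only.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def critical_z_two_tailed(alpha):
    """Return the z for which 2 * P(Z > z) equals alpha."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.02, 0.01, 0.001):
    print(f"alpha = {alpha:<5}  critical |Z| = {critical_z_two_tailed(alpha):.3f}")
```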

The formula explained

p (two-tailed) = 2 × [1 − Φ(|Z|)]
p (one-tailed, upper) = 1 − Φ(Z)
p (one-tailed, lower) = Φ(Z)
p = p-value — probability of Z this extreme or more under H₀
Φ(Z) = standard normal CDF = P(Z ≤ z)
Z = standardised test statistic
α = significance level (threshold for rejecting H₀)

The p-value is the probability of obtaining a test statistic as extreme as, or more extreme than, the observed value, assuming H₀ is true. For a two-tailed test, both tails are included: p = 2 × P(Z > |z|). For a one-tailed test, only the tail named by the alternative hypothesis is used. The p-value does NOT measure the probability that H₀ is true; this is one of the most common misinterpretations in statistics.
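
One way to compute these formulas without external libraries is through the complementary error function, using the identity 1 − Φ(z) = ½ · erfc(z/√2). The sketch below is an assumption about how such a calculation could be written, not the calculator's actual implementation. The function name `p_value` and the tail labels are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, Phi(z) = P(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def p_value(z, tail="two"):
    """p-value for a Z statistic; tail is 'two', 'upper', or 'lower'."""
    if tail == "two":
        return math.erfc(abs(z) / math.sqrt(2))    # 2 * [1 - Phi(|z|)]
    if tail == "upper":
        return 0.5 * math.erfc(z / math.sqrt(2))   # 1 - Phi(z)
    if tail == "lower":
        return normal_cdf(z)                        # Phi(z)
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

print(f"{p_value(1.96, 'two'):.4f}")    # 0.0500
print(f"{p_value(1.96, 'upper'):.4f}")  # 0.0250
```

Using `erfc` directly rather than `1 - cdf` avoids losing precision in the far tails, where Φ(z) is very close to 1.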

Worked example — Z = 2.5, two-tailed test

| Step | Calculation | Result |
|------|-------------|--------|
| Φ(2.5) | P(Z ≤ 2.5) | 0.99379 |
| P(Z > 2.5) | 1 − 0.99379 | 0.00621 |
| Two-tailed p | 2 × 0.00621 | p = 0.01242 |
| Significant at α = 0.05? | 0.01242 < 0.05 | Yes |
Z = 2.5 gives p = 0.0124 (two-tailed), which is significant at α = 0.05 and α = 0.02 but not at α = 0.01. If H₀ were true, we would observe |Z| ≥ 2.5 only about 1.24% of the time by chance. With α = 0.05, we reject H₀ and conclude that the observed difference is statistically significant.
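
The worked example can be reproduced, and checked against the three thresholds the calculator reports, with a few lines of Python. This is an illustrative sketch. The function name `evaluate` is hypothetical and the calculator's own logic may differ.

```python
from statistics import NormalDist

def evaluate(z, alphas=(0.05, 0.01, 0.001)):
    """Return the two-tailed p-value and a significance verdict per alpha."""
    phi = NormalDist().cdf(abs(z))   # Phi(|z|)
    p = 2 * (1 - phi)                # both tails
    return p, {alpha: p < alpha for alpha in alphas}

p, verdicts = evaluate(2.5)
print(f"p = {p:.5f}")  # p = 0.01242
for alpha, significant in verdicts.items():
    print(f"alpha = {alpha}: {'significant' if significant else 'not significant'}")
```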

What is a p-value in statistics?

The p-value is the probability, under the null hypothesis H₀, of observing a test statistic as extreme as, or more extreme than, the one calculated from the data. A small p-value means the observed data would be unlikely if H₀ were true, which counts as evidence against H₀. Under the conventional significance level α = 0.05, we reject H₀ when p < 0.05.

P-values are derived from the null distribution of the test statistic; in this calculator, that is the standard normal (Z) distribution. The Z-test applies when the population standard deviation σ is known and either the population is normal or the sample is large enough (n > 30 is a common rule of thumb) for the sample mean to be approximately normal. For unknown σ and small samples, the t-distribution should be used instead.

The p-value does not equal the probability that H₀ is true; treating it that way is the most common misinterpretation. A p-value is a conditional probability, conditioned on H₀ being true. Bayesian posterior probabilities P(H₀|data) require a prior P(H₀) and are mathematically distinct from p-values.
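
A small simulation can make the "conditioned on H₀" point concrete. In the sketch below every experiment is generated with H₀ true, yet roughly a fraction α of them still produce p < α. All parameters (sample size, number of experiments, seed) are arbitrary choices for illustration.

```python
import math
import random

def two_tailed_p(z):
    """Two-tailed p-value for a Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def false_positive_rate(n_experiments=10000, n=30, alpha=0.05, seed=0):
    """Fraction of experiments with p < alpha when H0 is true by construction."""
    rng = random.Random(seed)
    mu0, sigma = 0.0, 1.0  # data are drawn exactly from the H0 distribution
    rejections = 0
    for _ in range(n_experiments):
        sample_mean = sum(rng.gauss(mu0, sigma) for _ in range(n)) / n
        z = (sample_mean - mu0) / (sigma / math.sqrt(n))
        if two_tailed_p(z) < alpha:
            rejections += 1
    return rejections / n_experiments

rate = false_positive_rate()
print(f"Share of true-H0 experiments with p < 0.05: {rate:.3f}")
```

The printed share should land close to 0.05. That figure is the Type I error rate, and it says nothing about how likely H₀ is for any single experiment.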

Who uses this calculator?

Scientists and researchers use p-values in hypothesis testing across quantitative fields, including clinical trials, psychology experiments, biology, and engineering. Regulatory bodies (FDA, EMA) conventionally expect p < 0.05 as a minimum threshold for drug efficacy claims. Quality engineers use p-values in process capability testing and SPC (statistical process control). Data scientists use p-values in A/B testing for website and product optimisation.

Historical context and related concepts

The p-value was introduced by Karl Pearson in the early 1900s and formalised by Ronald Fisher in "Statistical Methods for Research Workers" (1925). Fisher proposed p < 0.05 as a conventional threshold for significance. Jerzy Neyman and Egon Pearson (1933) formalised the complementary framework of Type I and Type II errors. The p < 0.05 threshold has been debated extensively, and in 2016 the American Statistical Association issued a statement cautioning against over-reliance on p-values alone.

Why p-value calculation is central to scientific inference and drug approval

Regulatory approval of new drugs typically requires randomised controlled trials with p < 0.05 (often with a correction for multiple hypotheses). A new drug showing efficacy with p = 0.01 in a Phase III trial provides strong evidence in support of approval. P-values also guide resource allocation in research: statistically significant findings are more likely to receive further investigation and publication.

P-values and the replication crisis in science

Many scientific fields have experienced a "replication crisis": published findings with p < 0.05 often fail to replicate. This arises from p-hacking (selectively reporting significant results), small sample sizes (underpowered studies), and publication bias. Proposed remedies include pre-registration of hypotheses, larger samples, reporting effect sizes alongside p-values, and stricter significance thresholds (some researchers advocate p < 0.005 as the new standard for discovery claims).

Frequently asked questions

What is the difference between α and the p-value?
α is the pre-specified probability of a Type I error (falsely rejecting H₀), typically 0.05. It is set before looking at the data, whereas the p-value is computed from the data. We reject H₀ when p < α. Treating p = 0.049 and p = 0.051 as fundamentally different (one "significant", one not) is statistically inappropriate.

What does two-tailed mean?
Two-tailed means the alternative hypothesis is H₁: µ ≠ µ₀, so the mean could be above or below µ₀. Both tails of the distribution are included: p = 2 × P(Z > |z|). One-tailed (upper) is H₁: µ > µ₀; one-tailed (lower) is H₁: µ < µ₀. The choice of tail must be made before seeing the data; switching to a one-tailed test after observing the direction is p-hacking.

What is the difference between a Z-test and a t-test?
The Z-test uses the standard normal distribution and requires a known population σ. The t-test uses the t-distribution with n − 1 degrees of freedom when σ is estimated from the data. For large n (> 30), the t-distribution is close to Z and the results are nearly identical. For small samples from non-normal populations, neither Z nor t provides exact p-values, and bootstrap methods may be preferred.

How should p-values be handled when testing many hypotheses?
When testing many hypotheses simultaneously, the probability of at least one false positive increases. The Bonferroni correction uses α/m instead of α, where m is the number of tests. For 20 tests at α = 0.05, the individual threshold is 0.05/20 = 0.0025. False discovery rate (FDR) control (Benjamini-Hochberg) is less conservative and is preferred for large-scale genomics and proteomics studies.

Can this calculator be used for a t-test?
For large samples (n > 30), Z and t give very similar results, so this calculator gives a close approximation to t-test p-values. For exact t-test p-values with small samples, use a t-distribution with n − 1 degrees of freedom. The critical values differ most at low n: the two-tailed 5% critical value is t = 2.571 with 5 df, versus Z = 1.960.
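
The multiple-testing corrections described in the FAQ above can be sketched as follows. The Bonferroni part is the α/m rule from the text. The Benjamini-Hochberg part is a minimal version of the standard step-up procedure. The function names and the example p-values are hypothetical.

```python
def bonferroni_threshold(alpha, m):
    """Per-test threshold alpha / m for m simultaneous tests."""
    return alpha / m

def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure at FDR level q.

    Returns booleans in the same order as p_values: True where H0 is rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    # Find the largest rank k whose sorted p-value is <= k * q / m.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            largest_k = rank
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected

p_values = [0.001, 0.012, 0.025, 0.041, 0.20]  # hypothetical results of 5 tests
threshold = bonferroni_threshold(0.05, len(p_values))
print([p < threshold for p in p_values])  # [True, False, False, False, False]
print(benjamini_hochberg(p_values))       # [True, True, True, False, False]
```

On these made-up values Bonferroni rejects one hypothesis and Benjamini-Hochberg rejects three, which illustrates why the latter is described as less conservative.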

Related tools

Z-Score Calculator

Calculate a Z-score from a value, mean, and standard deviation. The Z-score is the input to this p-value calculator.

Significant Figures Calculator

Round p-values appropriately: report them to 2–3 significant figures. Exact p-values are more informative than a dichotomous "significant/not significant" label.

Bacteria Generation Time Calculator

Statistical analysis of growth data. Biological experiments use p-values to test treatment effects on growth rates.

Log Reduction Calculator

Statistical validation of disinfection log reductions. Disinfectant efficacy claims are typically supported by statistically significant p-values.

Normal Distribution Calculator

Convert Z to a percentile and p-value. For a one-tailed upper test, p-value = 1 − Φ(Z).

Scientific Notation Converter

Very small p-values (e.g. p = 1.2×10⁻⁸) are best written in scientific notation. Genomic association studies regularly report p-values below 10⁻⁷.
