The following was presented at the College of Applied Biology’s annual conference Evidence Matters: Professional Practice in a Post-Truth World in Victoria, BC on March 3rd 2015. It is also provided below with additional text.

Using P-Values with Confidence

Background

The p-value is perhaps the most ubiquitous statistical index.

It is also the most misunderstood, and/or misused, and/or maligned, depending on whom you ask.

American Statistical Association

Wasserstein, R.L., and Lazar, N.A. 2016. The ASA’s Statement on P-Values: Context, Process, and Purpose. The American Statistician 70(2): 129–133.

What is a p-value?

A p-value of <0.05 indicates that the 95% confidence interval excludes 0.

What is the utility of a p-value?

P-values are useful because they indicate the confidence with which we can exclude 0.

T-Test

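# Linear model of fish length on flow; with a two-level factor this is a t-test.
model <- lm(Length ~ Flow, data = data)
# Coefficient table: estimates, standard errors, t values and p-values.
round(summary(model)$coefficients, 4)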

##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  10.1079     0.1576 64.1487    0.000
## FlowHigh     -0.5200     0.2228 -2.3336    0.025
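The link between the p-value and the confidence interval can be checked directly. The sketch below does not use the data from the talk, which are not reproduced here. It simulates a stand-in dataset (the sample size, means and standard deviation are assumptions chosen only for illustration) and confirms that the p-value for FlowHigh is below 0.05 exactly when the 95% confidence interval excludes 0.

```r
# Simulated stand-in for the talk's data: 20 fish per flow level (assumed values).
set.seed(42)
sim <- data.frame(
  Flow = factor(rep(c("Low", "High"), each = 20), levels = c("Low", "High")),
  Length = c(rnorm(20, mean = 10, sd = 0.7), rnorm(20, mean = 9.5, sd = 0.7))
)

sim_model <- lm(Length ~ Flow, data = sim)

# p-value for the difference between High and Low flow
p_value <- summary(sim_model)$coefficients["FlowHigh", "Pr(>|t|)"]

# 95% confidence interval for the same coefficient
ci <- confint(sim_model, level = 0.95)["FlowHigh", ]

# The two statements agree: p < 0.05 if and only if the interval excludes 0.
excludes_zero <- ci[1] > 0 || ci[2] < 0
print(p_value)
print(ci)
print((p_value < 0.05) == excludes_zero)
```

The agreement holds because summary() and confint() use the same t distribution and the same standard error for the coefficient.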

Principle 1

1. P-values can indicate how incompatible the data are with a specified statistical model.

A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
Proper inference requires full reporting and transparency.
P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
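To make the first principle concrete: the p-value in a coefficient table is the probability, under a model in which the true coefficient is 0, of a t statistic at least as extreme as the one observed. The following minimal sketch recomputes it by hand. The data are simulated, and every value in the simulation is an assumption made for illustration, not the data from the talk.

```r
# Simulated data: 15 fish per flow level (assumed values).
set.seed(1)
flow <- factor(rep(c("Low", "High"), each = 15), levels = c("Low", "High"))
len <- rnorm(30, mean = ifelse(flow == "High", 9.5, 10), sd = 0.7)
fit <- lm(len ~ flow)

coefs <- summary(fit)$coefficients
t_value <- coefs["flowHigh", "t value"]
resid_df <- df.residual(fit)

# Two-sided tail probability of the t distribution under the null model
p_by_hand <- 2 * pt(-abs(t_value), df = resid_df)

print(c(by_hand = p_by_hand, from_summary = coefs["flowHigh", "Pr(>|t|)"]))
```

The two numbers should match, which shows that the p-value is a statement about the data given the null model and nothing more.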

Significance

In biology, p-values below 0.05 are generally considered to be statistically significant.

The End

Conclusion

Flow is a significant predictor of fish length.

Appendix

##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   10.108      0.158  64.149    0.000
## FlowHigh      -0.520      0.223  -2.334    0.025

Principle 2

2. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
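One way to see why a p-value does not measure the size of an effect is to hold the observed difference fixed and vary only the sample size. The sketch below works from assumed summary statistics (a difference of -0.52, echoing the estimate above, and a hypothetical within-group standard deviation of 0.7) and applies the standard equal-variance two-sample t-test.

```r
# Assumed summary statistics (illustrative only): the same observed difference
# and within-group standard deviation at three different sample sizes.
difference <- -0.52            # difference in mean length (High minus Low)
within_sd <- 0.7               # hypothetical within-group standard deviation
n_per_group <- c(5, 20, 100)

# Equal-variance two-sample t-test computed from the summary statistics
std_error <- within_sd * sqrt(2 / n_per_group)
t_value <- difference / std_error
p_values <- 2 * pt(-abs(t_value), df = 2 * n_per_group - 2)

print(data.frame(n_per_group, difference, p_value = signif(p_values, 3)))
```

The estimated effect is identical in every row, yet the p-value falls as the sample size grows, so the p-value alone says nothing about how large or how important the effect is.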

A Limitation

p = 1

p = 0.01

Biological Importance

Principle 3

3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.

Confidence Interval

Effect Size
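As a rough sketch of how a confidence interval can be expressed as an effect size, one common option (an assumption here, not necessarily the approach used in the talk) is to model log(Length), so that the back-transformed coefficient and its interval read as a percent change relative to low flow. The data are simulated and all values are hypothetical.

```r
# Simulated stand-in data: 20 fish per flow level (assumed values).
set.seed(3)
sim <- data.frame(
  Flow = factor(rep(c("Low", "High"), each = 20), levels = c("Low", "High"))
)
sim$Length <- rlnorm(
  40,
  meanlog = ifelse(sim$Flow == "High", log(9.5), log(10)),
  sdlog = 0.07
)

log_model <- lm(log(Length) ~ Flow, data = sim)

# Estimate and 95% confidence interval for High vs Low on the log scale
estimate <- unname(coef(log_model)["FlowHigh"])
interval <- unname(confint(log_model)["FlowHigh", ])

# Back-transform to a percent change in length relative to low flow
percent_change <- 100 * (exp(c(estimate = estimate,
                               lower = interval[1],
                               upper = interval[2])) - 1)
print(round(percent_change, 1))
```

Reported this way, the result is a range of plausible percent changes in length, which can be compared directly with what is considered biologically important.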

Principle 4

4. Proper inference requires full reporting and transparency.

P-Hacking
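A small simulation illustrates why selective reporting is a problem. In the sketch below every "study" tests 20 unrelated responses that have no true effect, and only the smallest p-value is kept. The number of studies, tests and fish per group are assumptions chosen for illustration.

```r
set.seed(4)
n_studies <- 500     # simulated studies (assumed)
n_tests <- 20        # unrelated responses tested in each study (assumed)
n_per_group <- 20    # fish per flow level (assumed)

# Smallest p-value among n_tests t-tests when there is no true effect at all
min_p_one_study <- function() {
  p <- replicate(n_tests, t.test(rnorm(n_per_group), rnorm(n_per_group))$p.value)
  min(p)
}

min_p <- replicate(n_studies, min_p_one_study())

# Share of studies that could report at least one "significant" result
share_significant <- mean(min_p < 0.05)
expected_share <- 1 - 0.95^n_tests
print(c(simulated = share_significant, expected = expected_share))
```

Even though nothing is going on in any of the simulated datasets, well over half of the studies contain at least one p-value below 0.05, which is why all tests that were run need to be reported.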

Principle 5

5. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.

Conditional Probability

Much of the additional confusion around p-values stems from the fact that frequentist methods make statements about data in relation to a model.

However, intuitively what we actually want are (Bayesian) statements about models in relation to the data.
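The difference between the two kinds of statement can be shown with Bayes' rule. The sketch below converts the probability of a significant result given the model into the probability of a real effect given a significant result. The prior and the power are assumed values, picked only to illustrate the calculation.

```r
# Assumed inputs (illustrative only)
prior_true <- 0.10   # proportion of tested hypotheses with a real effect
power <- 0.80        # P(p < 0.05 | real effect)
alpha <- 0.05        # P(p < 0.05 | no effect)

# Bayes' rule: P(real effect | p < 0.05)
p_significant <- prior_true * power + (1 - prior_true) * alpha
posterior_true <- prior_true * power / p_significant

print(round(posterior_true, 2))
```

Under these assumptions a significant result reflects a real effect about 64% of the time, not 95% of the time, and the answer changes whenever the assumed prior or power changes.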

Conclusions

Use p-values with confidence intervals.

Express confidence intervals as effect sizes.

Discuss biological importance.

Don’t p-hack.
