Extra-Poisson Variation with the Negative Binomial Distribution

Joe Thorley · 2018-11-26 · 2 minute read

Poisson Distribution

The Poisson distribution gives the probability of observing a given number of rare, independent events when they occur at a constant base rate (\(\lambda\)).

set.seed(101)
n <- 1e+05
lambda <- 1.7
rpois <- rpois(n, lambda)
# centre one bin on each integer so that counts of 0 and 1 are not pooled
hist(rpois, breaks = seq(-0.5, max(rpois) + 0.5, by = 1))

The variance (\(\sigma^2\)) and mean (\(\mu\)) of a Poisson distribution both equal \(\lambda\).

round(mean(rpois), 1)
## [1] 1.7
round(var(rpois), 1)
## [1] 1.7

The dispersion index (\(\text{DI}\)), defined as \(\sigma^2 / \mu\), is therefore 1.
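As a minimal sketch, the dispersion index can be wrapped in a small helper and applied to simulated counts. The function name `dispersion_index` is hypothetical (it is not part of base R), and the seed and sample size simply mirror the values used above.

```r
# Hypothetical helper: sample dispersion index (variance / mean) of a count vector
dispersion_index <- function(x) {
  var(x) / mean(x)
}

set.seed(101)
counts <- rpois(1e+05, lambda = 1.7)

# For Poisson counts the sample estimate should be close to 1
di_pois <- dispersion_index(counts)
round(di_pois, 1)
```

With a sample this large the estimate should sit very close to the theoretical value of 1; smaller samples will scatter more widely around it.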

Overdispersion

The Poisson distribution is often used in ecology to describe counts of the number of organisms in a given area. However, due to social behaviours such as herding and shoaling, organisms are often encountered in groups. Alternatively, organisms may distribute themselves independently with respect to each other but tend to be found in areas with higher resources. If the density of resources cannot be accounted for in the model then the organisms will appear to be clustering together. In both cases \(\text{DI} > 1\) and an over-dispersed Poisson distribution is required.
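The resource scenario can be sketched with a short simulation. The assumptions here are illustrative only: each site has an unobserved resource level drawn from a gamma distribution with mean 1 and variance 0.5, and the expected count at a site is proportional to that level. All variable names are made up for the example.

```r
set.seed(101)
n_sites <- 1e+05
mean_rate <- 1.7

# Assumption: unobserved resource level scales each site's expected count;
# a gamma with shape = rate = 2 has mean 1 and variance 0.5
resource <- rgamma(n_sites, shape = 2, rate = 2)

# Given its own rate, each site's count is still Poisson
counts_het <- rpois(n_sites, lambda = mean_rate * resource)

mean_het <- mean(counts_het)
di_het <- var(counts_het) / mean(counts_het)
round(c(mean = mean_het, DI = di_het), 1)
```

Under these assumptions the mean stays near 1.7, but the variation in rates between sites adds to the Poisson variance, so the dispersion index should come out well above 1 (about \(1 + 1.7 \cdot 0.5 = 1.85\) in expectation) even though individuals are placed independently within each site.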

Negative Binomial Distribution

A common approach is to use the negative binomial distribution (NBD), which introduces the \(\phi\) parameter to model the increase in the variance.¹

\[\sigma^2 = \lambda + \lambda^2 \cdot \phi\]

phi <- 1
rnbinom <- rnbinom(n, mu = lambda, size = 1/phi)
# centre one bin on each integer so that counts of 0 and 1 are not pooled
hist(rnbinom, breaks = seq(-0.5, max(rnbinom) + 0.5, by = 1))
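As a quick check on the parameterisation, the sample variance can be compared with the formula above. R's `rnbinom()` in its `mu`/`size` form has variance `mu + mu^2 / size`, so `size = 1 / phi` matches \(\lambda + \lambda^2 \cdot \phi\). The variable names below are chosen for the example and the values repeat those used in this post.

```r
set.seed(101)
n <- 1e+05
lambda <- 1.7
phi <- 1

# R's mean/size form has variance mu + mu^2 / size, so size = 1 / phi
x <- rnbinom(n, mu = lambda, size = 1 / phi)

var_theory <- lambda + lambda^2 * phi  # 1.7 + 2.89 = 4.59
var_sample <- var(x)
c(theory = var_theory, sample = round(var_sample, 2))
```

The two values should agree to within sampling error, which is small at this sample size.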

With \(\lambda = 1.7\) and \(\phi = 1\) the mean is of course unaltered

round(mean(rnbinom), 1)
## [1] 1.7

but \(\text{DI} = 1 + \lambda \cdot \phi = 2.7\)

round(var(rnbinom)/mean(rnbinom), 1)
## [1] 2.7
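The expression for the dispersion index follows from dividing the variance by the mean, with \(\mu = \lambda\):

\[\text{DI} = \frac{\sigma^2}{\mu} = \frac{\lambda + \lambda^2 \cdot \phi}{\lambda} = 1 + \lambda \cdot \phi = 1 + 1.7 \cdot 1 = 2.7\]

Setting \(\phi = 0\) in the same expression gives \(\text{DI} = 1\), the Poisson value.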

  1. It is worth being aware that there are many different formulations of the negative binomial distribution; the code above uses R's mean (mu) and size parameterisation with size = \(1 / \phi\).