Notes on the Negative Binomial Distribution in Ecology

Joe Thorley · 2018-08-06 · 2 minute read

There are many different formulations of the negative binomial distribution (NBD).

Default Formulation

In the default formulation, the NBD describes the number of failures that occur in a series of Bernoulli trials before a target number of successes is reached.

For example, consider the number of failures that occur before \(N = 10\) successes are achieved when the probability of success on each trial (\(\rho\)) is 0.5.

In R, this distribution can be simulated and plotted as follows.

N <- 10
rho <- 0.5
hist(rnbinom(1e+05, size = N, prob = rho))
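As a quick sanity check on this formulation, the simulated draws can be compared with the theoretical moments, which for the number of failures before \(N\) successes are a mean of \(N(1 - \rho)/\rho\) and a variance of \(N(1 - \rho)/\rho^2\). The sketch below is illustrative only; the seed and the names expected_mean and expected_var are assumptions introduced here.

```r
# Illustrative check of the default formulation against its theoretical moments.
set.seed(42)  # arbitrary seed, chosen only for reproducibility

N <- 10     # target number of successes
rho <- 0.5  # probability of success on each trial

# Number of failures before N successes, simulated 100,000 times
x <- rnbinom(1e+05, size = N, prob = rho)

expected_mean <- N * (1 - rho) / rho    # theoretical mean
expected_var <- N * (1 - rho) / rho^2   # theoretical variance

mean(x)  # should be close to expected_mean
var(x)   # should be close to expected_var
```

With these values the theoretical mean is 10 and the theoretical variance is 20, so the simulated values should land close to both.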

Ecological Formulation

A common use of the NBD in ecology is to model over-dispersion in count data. In this situation, a more useful formulation is in terms of the mean count (\(\mu\)) and the dispersion (\(\phi\)), where the standard deviation of the distribution is given by

\[\sigma = \sqrt{\mu + \mu^2 \cdot \phi}\]

It is worth noting that if \(\phi = 0\) then the variance equals the mean and the NBD reduces to the Poisson distribution.
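The Poisson limit can be checked numerically by shrinking \(\phi\) towards zero and comparing probability mass functions. This is a minimal sketch: the grid of counts, the set of \(\phi\) values and the names pois and max_diff are arbitrary choices made here, and \(\phi\) is kept strictly positive so that size = 1/phi stays finite.

```r
# Compare the NBD with the Poisson distribution as the dispersion phi shrinks.
mu <- 100
counts <- 0:300
pois <- dpois(counts, lambda = mu)

phis <- c(1, 0.1, 0.01, 1e-06)

# Largest absolute difference between the two mass functions for each phi
max_diff <- sapply(phis, function(phi) {
  nb <- dnbinom(counts, mu = mu, size = 1 / phi)
  max(abs(nb - pois))
})

data.frame(phi = phis, max_diff = max_diff)
```

The differences should shrink steadily as \(\phi\) decreases, which is the sense in which the two distributions coincide.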

R

In R, the alternative ecological formulation is obtained by passing mu (\(\mu\)) and size (\(1/\phi\)) to rnbinom() in place of prob.

mu <- 100
phi <- 10
x <- rnbinom(1e+05, mu = mu, size = 1/phi)
mean(x)
## [1] 99.69504
sd(x)
## [1] 315.0446
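The simulated standard deviation can be compared with the value implied by \(\sigma = \sqrt{\mu + \mu^2 \cdot \phi}\). This is a rough sketch; the seed and the name sigma_theory are assumptions introduced here.

```r
# Compare the simulated standard deviation with the theoretical one.
set.seed(1)  # arbitrary seed, chosen only for reproducibility

mu <- 100
phi <- 10

sigma_theory <- sqrt(mu + mu^2 * phi)  # standard deviation implied by mu and phi
x <- rnbinom(1e+05, mu = mu, size = 1 / phi)

sigma_theory
sd(x) / sigma_theory  # ratio should be close to 1
```

For \(\mu = 100\) and \(\phi = 10\) the formula gives roughly 316.4, which is consistent with the simulated value of about 315 reported above. Because the distribution is strongly skewed at this level of dispersion, the sample standard deviation varies noticeably between runs.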

TMB

In TMB, dnbinom() follows the default formulation, taking \(N\) (size) and \(\rho\) (prob). An alternative, dnbinom2(), takes the mean (\(\mu\)) and the variance (\(\sigma^2\)). To parameterise in terms of \(\mu\) and \(\phi\), set the variance to \(\mu + \mu^2 \cdot \phi\).
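The algebra linking a mean and variance to the default parameters can be checked in R. The sketch below is not TMB code and makes no claim about how TMB is implemented; it only uses the facts that the default formulation has mean \(N(1 - \rho)/\rho\) and variance equal to the mean divided by \(\rho\). The names variance, prob_default and size_default are illustrative.

```r
# Sketch in R (not TMB code): map a mean and variance onto size and prob.
mu <- 100
phi <- 10
variance <- mu + mu^2 * phi  # variance implied by mu and phi

# Solve mean = size * (1 - prob) / prob and variance = mean / prob
prob_default <- mu / variance
size_default <- mu * prob_default / (1 - prob_default)

counts <- 0:1000
dens_default <- dnbinom(counts, size = size_default, prob = prob_default)
dens_ecological <- dnbinom(counts, mu = mu, size = 1 / phi)

all.equal(size_default, 1 / phi)          # size recovers 1 / phi
all.equal(dens_default, dens_ecological)  # both routes give the same densities
```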

STAN

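In STAN, the neg_binomial_2() formulation uses the same mean and size parameters as the alternative R formulation, except that the second parameter (which is equivalent to \(\phi^{-1}\)) is called phi. STAN's phi is therefore the inverse of the dispersion \(\phi\) used here.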
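To keep the two meanings of phi apart, write STAN's second parameter as \(\phi_{\text{Stan}}\), a label introduced here purely for clarity. The variance under neg_binomial_2() and the value needed to match \(\sigma^2 = \mu + \mu^2 \cdot \phi\) are then

\[\sigma^2 = \mu + \frac{\mu^2}{\phi_{\text{Stan}}}, \qquad \phi_{\text{Stan}} = \frac{1}{\phi}\]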

For examples of the NBD in action see https://github.com/joethorley/bioRxiv-028274/blob/master/model-lek.R.

JAGS

In JAGS, the only formulation is the default one, but with \(N\) (referred to as \(r\)) and \(\rho\) (referred to as \(p\)) given in the opposite order: dnegbin(p, r).

In order to reparameterise the default formulation in terms of \(\mu\) and \(\phi\), \(r\) should be \(1/\phi\) and \(p\) should be \(r/(r + \mu)\).
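This mapping can be verified in R rather than in JAGS, since R's dnbinom() accepts both the default and the ecological parameters. A minimal sketch, with the grid of counts chosen arbitrarily:

```r
# Check that r = 1 / phi and p = r / (r + mu) reproduce the ecological formulation.
mu <- 100
phi <- 10

r <- 1 / phi       # JAGS r, equivalent to R's size
p <- r / (r + mu)  # JAGS p, equivalent to R's prob

counts <- 0:1000
dens_rp <- dnbinom(counts, size = r, prob = p)   # default formulation
dens_mu <- dnbinom(counts, mu = mu, size = r)    # ecological formulation

r * (1 - p) / p              # mean implied by r and p, should equal mu
all.equal(dens_rp, dens_mu)  # both formulations give the same densities
```

In a JAGS model the same r and p would be passed in the order dnegbin(p, r).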