Often the concentration of a water quality parameter below a *detection limit* (DL) cannot be measured reliably.

A common approach to dealing with such *non-detects* is to substitute them with the value 0, DL/2 or DL which discards information and introduces biases.

A more reliable approach is to explicitly model the *non-detects* using censored probability distributions.

## Simulated Data

Consider the following log-normally distributed parameter (log mean of 0 and log SD of 1).

With a DL of 0.75 the detects would be distributed as follows.

## Bayesian Analysis

Now let us analyse 1,000 simulated data points using the following Bayesian model.

```
model <- "model{
log_mn ~ dnorm(0, 1)
log_sd ~ dnorm(0, 1) T(0,)
for(i in 1:length(Conc)) {
Conc[i] ~ dlnorm(log_mn, log_sd^-2)
}
}"
```

### Reliable Detection

If there was no detection limit then any differences between the estimated (red) and underlying distributions (black) are due to sampling error.

```
## log_mn log_sd
## -0.03827642 0.96038244
```

### Substitution

If, however, we had a detection limit of 0.75 and we substituted the non-detects for the DL then we would introduce an upwards bias.

```
## log_mn log_sd
## 0.2351745 0.6460452
```

### Missing Values

If we replaced the non-detects with missing values the bias would be even worse.

```
## log_mn log_sd
## 0.5697294 0.6322951
```

### Censored Distribution

But if we also updated the model to include the `interval()`

distribution and a indicator variable specifying which of the missing values are non-detects then the bias vanishes.

```
model <- "model{
log_mn ~ dnorm(0, 1)
log_sd ~ dnorm(0, 1) T(0,)
for(i in 1:length(Conc)) {
Conc[i] ~ dlnorm(log_mn, log_sd^-2)
Indicator[i] ~ dinterval(Conc[i], c(0,0.75))
}
}"
```

```
## log_mn log_sd
## -0.03030919 0.95506565
```

In the current example, the DL is a constant 0.75 but if it varied it would be necessary to pass the value for each observation in a third column.

Bayesian models provide a natural framework for modeling censored data because they readily handle missing values, easily incorporate multiple distributions and describe all the uncertainty.

## Maximum-Likelihood Analysis

However, Bayesian models can be slow to fit and require the specification of prior distributions. Consequently, some users prefer maximum-likelihood (ML) models.

### Censored Distribution

The final censored distribution can be refitted as an ML model using the NADA package in R.

`library(NADA, warn.conflicts = FALSE)`

`## Loading required package: survival`

```
data$Conc[is.na(data$Conc)] <- 0.75
mle <- with(data, cenmle(Conc, Indicator))
```

```
## log_mn log_sd
## -0.02945614 0.95384629
```