Hurdle model

A hurdle model is a class of statistical models in which a random variable is modelled in two parts: the first gives the probability of attaining the value 0, and the second gives the probability distribution of the non-zero values. The use of hurdle models is often motivated by an excess of zeros in the data that is not sufficiently accounted for by more standard statistical models.

In a hurdle model, a random variable x is modelled as

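\Pr(x=0)=\theta

\Pr(x=k)=(1-\theta)\,p_{x\neq 0}(k),\quad k\neq 0

where $p_{x\neq 0}$ is a probability distribution truncated at 0, that is, a distribution with the value 0 excluded and the remaining probability renormalized to total one. The factor $1-\theta$ ensures that the probabilities of the zero and non-zero values together sum to one.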
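As an illustration, the following is a minimal sketch of a hurdle model for count data, assuming the non-zero part is a zero-truncated Poisson distribution weighted by $1-\theta$ so that all probabilities sum to one. The function names and the parameter values (`theta=0.4`, `lam=2.5`) are hypothetical and chosen only for the example.

```python
import math


def zero_truncated_poisson_pmf(k, lam):
    """P(K = k | K > 0) for K ~ Poisson(lam); zero for k < 1."""
    if k < 1:
        return 0.0
    poisson_pmf = math.exp(-lam) * lam**k / math.factorial(k)
    # Renormalize by the probability of a non-zero Poisson draw.
    return poisson_pmf / (1.0 - math.exp(-lam))


def hurdle_poisson_pmf(k, theta, lam):
    """Hurdle pmf: mass theta at zero, (1 - theta) spread over k >= 1."""
    if k == 0:
        return theta
    return (1.0 - theta) * zero_truncated_poisson_pmf(k, lam)


# Illustrative parameter values (assumptions, not taken from any data set).
theta, lam = 0.4, 2.5

# The probabilities over 0, 1, 2, ... should sum to (approximately) one.
total = sum(hurdle_poisson_pmf(k, theta, lam) for k in range(100))
print(f"P(x = 0) = {hurdle_poisson_pmf(0, theta, lam):.3f}")
print(f"sum over k = 0..99: {total:.6f}")
```

In this sketch the probability of a zero is controlled by `theta` alone, independently of the Poisson rate `lam`, which is what lets the model absorb more (or fewer) zeros than an ordinary Poisson model would predict.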

Hurdle models were introduced by John G. Cragg in 1971,^[1] with the non-zero values of x modelled using a normal model and the zeros modelled using a probit model. The probit part of the model was said to model the presence of "hurdles" that must be overcome for x to attain non-zero values, hence the designation hurdle model. Hurdle models were later developed for count data, with Poisson, geometric,^[2] and negative binomial^[3] models for the non-zero counts.

[1]

[2]

[3]


