Non-uniform random variate generation

Finite discrete distributions

For a discrete probability distribution with a finite number n of indices at which the probability mass function f takes non-zero values, the basic sampling algorithm is straightforward. The interval [0, 1) is divided into n intervals [0, f(1)), [f(1), f(1) + f(2)), ... The width of interval i equals the probability f(i). One draws a uniformly distributed pseudo-random number X in [0, 1) and searches for the index i of the interval that contains it. The index i determined in this way has the distribution f(i).

Formalizing this idea becomes easier by using the cumulative distribution function

F(i) = \sum_{j=1}^{i} f(j).

It is convenient to set F(0) = 0. The n intervals are then simply [F(0), F(1)), [F(1), F(2)), ..., [F(n − 1), F(n)). The main computational task is then to determine i for which F(i − 1) ≤ X < F(i).
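The search for i can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `cumulative`, `sample_linear` and `sample_bisect` are invented for this example, indices are 1-based to match the text, and the final clamp to n is an added safeguard against F(n) rounding slightly below 1 in floating point.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def cumulative(f):
    """Return [F(1), ..., F(n)] for probabilities f = [f(1), ..., f(n)]."""
    return list(accumulate(f))


def sample_linear(F, x):
    """Linear search: smallest 1-based i with F(i - 1) <= x < F(i)."""
    for i, upper in enumerate(F, start=1):
        if x < upper:
            return i
    return len(F)  # only reached if F(n) rounded below x


def sample_bisect(F, x):
    """Binary search for the same index, using O(log n) comparisons."""
    return min(bisect_right(F, x) + 1, len(F))


F = cumulative([0.25, 0.5, 0.25])
i = sample_bisect(F, random.random())  # i is 1, 2 or 3
```

Both functions return the same index for the same X; only the search cost differs.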

This can be done by different algorithms:

Linear search, computational time linear in n.
Binary search, computational time logarithmic in n.
Indexed search,^[2] also called the cutpoint method.^[3]
Alias method, constant computational time per sample, using pre-computed tables.
Other constant-time methods also exist.^[4]
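As an illustration of the table-based approach, the following is a minimal sketch of the alias method in the variant usually attributed to Vose. The names `build_alias` and `sample_alias` are invented for this example, indices are 0-based here, and the input probabilities are assumed to sum to 1.

```python
import random


def build_alias(f):
    """Precompute the probability and alias tables for probabilities f."""
    n = len(f)
    scaled = [p * n for p in f]       # average column height becomes 1
    prob = [1.0] * n                  # chance of keeping column i itself
    alias = list(range(n))            # fallback index for column i
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        lo = small.pop()
        hi = large.pop()
        prob[lo] = scaled[lo]
        alias[lo] = hi
        # Column hi donates the mass needed to fill column lo up to 1.
        scaled[hi] -= 1.0 - scaled[lo]
        (small if scaled[hi] < 1.0 else large).append(hi)
    # Any leftover columns keep prob 1.0; they differ from 1 only by rounding.
    return prob, alias


def sample_alias(prob, alias, rng=random):
    """Draw one index in constant time: pick a column, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


prob, alias = build_alias([0.5, 0.25, 0.125, 0.125])
index = sample_alias(prob, alias)
```

Building the tables costs time linear in n, after which every draw needs one uniform integer, one uniform real and one comparison.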


Continuous distributions

Generic methods for generating independent samples:

Rejection sampling for arbitrary density functions
Inverse transform sampling for distributions whose CDF is known
Ratio of uniforms, combining a change of variables and rejection sampling
Slice sampling
Ziggurat algorithm, for monotonically decreasing density functions as well as symmetric unimodal distributions
Convolution random number generator, which is not a sampling method in itself: it describes the use of arithmetic on top of one or more existing sampling methods to generate more involved distributions.
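Two of the methods listed above can be sketched briefly in Python. The example distributions are assumptions chosen for illustration: an exponential distribution for inverse transform sampling, because its CDF 1 − exp(−rate · x) inverts in closed form, and a density on [0, 1) with a known upper bound for rejection sampling with a uniform proposal. The function names are likewise illustrative.

```python
import math
import random


def exponential_inverse_transform(rate, rng=random):
    """Inverse transform sampling: solve 1 - exp(-rate * x) = u for x."""
    u = rng.random()                   # u lies in [0, 1)
    return -math.log(1.0 - u) / rate   # 1 - u lies in (0, 1], so the log is finite


def rejection_sample(density, bound, rng=random):
    """Rejection sampling on [0, 1) with a uniform proposal.

    Assumes 0 <= density(x) <= bound for every x in [0, 1).
    """
    while True:
        x = rng.random()                        # candidate from the proposal
        if rng.random() * bound < density(x):   # accept with prob. density(x) / bound
            return x


# Example target: the Beta(2, 2) density 6 x (1 - x), whose maximum is 1.5.
sample = rejection_sample(lambda x: 6.0 * x * (1.0 - x), 1.5)
```

With this target and bound, a candidate is accepted with probability 1/1.5, so about two thirds of the proposals are kept; a tighter bound wastes fewer draws.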

Generic methods for generating correlated samples (often necessary for unusually shaped or high-dimensional distributions):

Markov chain Monte Carlo, the general principle
Metropolis–Hastings algorithm
Gibbs sampling
Slice sampling
Reversible-jump Markov chain Monte Carlo, when the number of dimensions is not fixed (e.g. when estimating a mixture model and simultaneously estimating the number of mixture components)
Particle filters, when the observed data is connected in a Markov chain and should be processed sequentially
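To show why such samples are correlated, here is a minimal sketch of the Metropolis–Hastings algorithm in its random-walk form. It assumes a one-dimensional target whose log-density is known up to an additive constant and uses a symmetric Gaussian proposal; the function name, the `step` parameter and the standard normal target are assumptions of this example.

```python
import math
import random


def metropolis_hastings(log_density, x0, n_samples, step=1.0, rng=random):
    """Random-walk Metropolis: symmetric Gaussian proposal, one-dimensional state."""
    x, logp = x0, log_density(x0)
    chain = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)), compared in log space.
        if math.log(1.0 - rng.random()) < logp_new - logp:
            x, logp = proposal, logp_new
        chain.append(x)  # a rejected proposal repeats the current state
    return chain


# Example target: standard normal, log-density -x^2 / 2 up to a constant.
chain = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 20000)
```

Each state depends on the previous one, and rejected proposals repeat it, so consecutive entries of `chain` are not independent draws.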

For generating a normal distribution:
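A standard choice is the Box–Muller transform, which maps two independent uniform variates to two independent standard normal variates. A minimal sketch follows; the function name `box_muller` is illustrative.

```python
import math
import random


def box_muller(rng=random):
    """Return two independent standard normal variates from two uniforms."""
    u1 = 1.0 - rng.random()   # lies in (0, 1], which keeps the log finite
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))   # radius
    theta = 2.0 * math.pi * u2           # angle, uniform on [0, 2*pi)
    return r * math.cos(theta), r * math.sin(theta)


z0, z1 = box_muller()
```

A variate with mean mu and standard deviation sigma is then obtained as mu + sigma * z0.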

For generating a Poisson distribution:

See Poisson distribution#Generating Poisson-distributed random variables

Software libraries

Library	Beta	Binomial	Cauchy	Chi-squared	Dirichlet	Exponential	F	Gamma	Geometric	Gumbel	Hypergeometric	Laplace	Logistic	Log-normal	Logarithmic	Multinomial	Multivariate hypergeometric	Multivariate normal	Negative binomial	Noncentral chi-squared	Noncentral F	Normal	Pareto	Poisson	Power	Rayleigh	Student's t	Triangular	von Mises	Wald	Zeta
NumPy	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
GNU Scientific Library^[5]	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	No	Yes	Yes	No	No	Yes	Yes	Yes	?	Yes	Yes	No	No	No	No
