Given two random variables that are defined on the same probability space,[1] the joint probability distribution is the corresponding probability distribution on all possible pairs of outputs. The joint distribution can just as well be considered for any given number of random variables. The joint distribution encodes the marginal distributions, i.e. the distributions of each of the individual random variables. It also encodes the conditional probability distributions, which deal with how the outputs of one random variable are distributed when given information on the outputs of the other random variable(s).

${\displaystyle X}$
${\displaystyle Y}$
${\displaystyle p(X)}$
${\displaystyle p(Y)}$
Many sample observations (black) are shown from a joint probability distribution. The marginal densities are shown as well.

In the formal mathematical setup of measure theory, the joint distribution is given by the pushforward measure, by the map obtained by pairing together the given random variables, of the sample space's probability measure.

In the case of real-valued random variables, the joint distribution, as a particular multivariate distribution, may be expressed by a multivariate cumulative distribution function, or by a multivariate probability density function together with a multivariate probability mass function. In the special case of continuous random variables, it is sufficient to consider probability density functions, and in the case of discrete random variables, it is sufficient to consider probability mass functions.