# Prior probability

## Distribution of an uncertain quantity / From Wikipedia, the free encyclopedia

A **prior probability distribution** of an uncertain quantity, often simply called the **prior**, is its assumed probability distribution before some evidence is taken into account. For example, the prior could be the probability distribution representing the relative proportions of voters who will vote for a particular politician in a future election. The unknown quantity may be a parameter of the model or a latent variable rather than an observable variable.

In Bayesian statistics, Bayes' rule prescribes how to update the prior with new information to obtain the posterior probability distribution, which is the conditional distribution of the uncertain quantity given new data. Historically, the choice of priors was often constrained to a conjugate family of a given likelihood function, because this yields a tractable posterior of the same family. The widespread availability of Markov chain Monte Carlo methods, however, has made this less of a concern.
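A conjugate update can be sketched in a few lines. The example below assumes a beta prior on the share of voters supporting a candidate and a binomial likelihood for poll responses. The function names, the Beta(2, 2) prior, and the poll counts are all hypothetical, chosen only to illustrate why conjugacy keeps the posterior tractable.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data.

    With a Beta(alpha, beta) prior and a binomial likelihood, the
    posterior is again a beta distribution, so the update is addition.
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical prior: Beta(2, 2), weakly centred on 0.5.
prior = (2.0, 2.0)

# Hypothetical poll: 60 of 100 respondents support the candidate.
posterior = beta_binomial_update(*prior, successes=60, failures=40)

print(posterior)              # (62.0, 42.0)
print(beta_mean(*prior))      # 0.5
print(beta_mean(*posterior))  # 62/104, roughly 0.596
```

The posterior mean lies between the prior mean (0.5) and the observed proportion (0.6), and moves toward the data as the sample grows.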

There are many ways to construct a prior distribution.^{[1]} In some cases, a prior may be determined from past information, such as previous experiments. A prior can also be *elicited* from the purely subjective assessment of an experienced expert.^{[2]}^{[3]} When no information is available, an **uninformative prior** may be adopted as justified by the principle of indifference.^{[4]}^{[5]} In modern applications, priors are also often chosen for their mechanical properties, such as regularization and feature selection.^{[6]}^{[7]}^{[8]}
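The link between priors and regularization can be illustrated with a small sketch. It assumes normally distributed data with known variance `sigma2` and a zero-centred normal prior with variance `tau2` on the mean. Under those assumptions the maximum a posteriori (MAP) estimate minimizes the squared error plus an L2 penalty with weight `sigma2 / tau2`. The function name and the data are hypothetical.

```python
def map_mean(data, sigma2, tau2):
    """MAP estimate of a normal mean under a N(0, tau2) prior.

    Minimizes sum((x - mu)**2) + lam * mu**2 with lam = sigma2 / tau2,
    whose closed-form solution is sum(data) / (n + lam).
    """
    lam = sigma2 / tau2  # penalty strength implied by the prior
    return sum(data) / (len(data) + lam)


data = [2.0, 3.0, 4.0]  # hypothetical observations, sample mean 3.0

# A wide prior (large tau2) barely changes the sample mean.
weak = map_mean(data, sigma2=1.0, tau2=100.0)

# A narrow prior (small tau2) shrinks the estimate toward zero.
strong = map_mean(data, sigma2=1.0, tau2=1.0)

print(weak)    # close to 3.0
print(strong)  # 2.25
```

A tighter prior therefore acts like a stronger penalty: the estimate is pulled further from the sample mean toward the prior's centre.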

The prior distributions of model parameters will often depend on parameters of their own. Uncertainty about these hyperparameters can, in turn, be expressed as hyperprior probability distributions. For example, if one uses a beta distribution to model the distribution of the parameter *p* of a Bernoulli distribution, then:

- *p* is a parameter of the underlying system (Bernoulli distribution), and
- *α* and *β* are parameters of the prior distribution (beta distribution); hence *hyper*parameters.
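Written out, the beta-Bernoulli example takes the following form. The symbols *x* (a single observation), *π* (the prior density) and *B* (the beta function) are introduced here for illustration and do not appear in the text above.

$$
p \sim \mathrm{Beta}(\alpha, \beta), \qquad
x \mid p \sim \mathrm{Bernoulli}(p), \qquad
\pi(p \mid \alpha, \beta) = \frac{p^{\alpha - 1}(1 - p)^{\beta - 1}}{B(\alpha, \beta)}, \quad 0 \le p \le 1.
$$

The prior mean of *p* is *α* / (*α* + *β*), so the hyperparameters fix both where the prior is centred and, through their sum, how concentrated it is.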

In principle, priors can be decomposed into many conditional levels of distributions, so-called *hierarchical priors*.^{[9]}
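A hierarchical prior can be sketched as nested sampling, using only Python's standard `random` module. The three-level structure below is an assumed example, not a prescribed model: the Gamma(2, 1) hyperprior, the function name, and the group layout are hypothetical choices made for illustration.

```python
import random


def sample_hierarchical(n_groups, n_obs, seed=0):
    """Draw one sample from a hypothetical three-level hierarchical model."""
    rng = random.Random(seed)

    # Level 1 (hyperprior): draw the beta hyperparameters themselves.
    alpha = rng.gammavariate(2.0, 1.0)
    beta = rng.gammavariate(2.0, 1.0)

    groups = []
    for _ in range(n_groups):
        # Level 2 (prior): each group's success probability given alpha, beta.
        p = rng.betavariate(alpha, beta)
        # Level 3 (likelihood): Bernoulli observations given p.
        xs = [1 if rng.random() < p else 0 for _ in range(n_obs)]
        groups.append((p, xs))
    return alpha, beta, groups


alpha, beta, groups = sample_hierarchical(n_groups=3, n_obs=5, seed=1)
for p, xs in groups:
    print(round(p, 3), xs)
```

Because all groups share the same draw of `alpha` and `beta`, their probabilities are related through the hyperprior rather than being independent, which is the point of the hierarchical construction.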