# Backpropagation

## Optimization algorithm for artificial neural networks / From Wikipedia, the free encyclopedia


In machine learning, **backpropagation** is a gradient estimation method used to train neural network models. The gradient estimate is used by the optimization algorithm to compute the network parameter updates.


It is an efficient application of the chain rule, derived by Gottfried Wilhelm Leibniz in 1673,[1] to such networks.[2] It is also known as the reverse mode of automatic differentiation or reverse accumulation, due to Seppo Linnainmaa (1970).[3][4][5][6][7][8][9] The term "back-propagating error correction" was introduced in 1962 by Frank Rosenblatt,[10][2] but he did not know how to implement it, even though Henry J. Kelley had already described a continuous precursor of backpropagation[11] in 1960 in the context of control theory.[2]

Backpropagation computes the gradient of a loss function with respect to the weights of the network for a single input–output example. It does so efficiently, computing the gradient one layer at a time and iterating backward from the last layer to avoid redundant calculations of intermediate terms in the chain rule; this can be derived through dynamic programming.[11][12][13] Gradient descent, or variants such as stochastic gradient descent,[14] are commonly used to apply these gradients as parameter updates.
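The layer-by-layer backward sweep can be sketched for a small two-layer network. This is an illustrative example, not code from the article: the network shapes, the sigmoid activations, and the squared-error loss are all assumptions chosen to keep the sketch short. The key point is that the error term `delta2` computed at the output layer is reused when computing `delta1` one layer back, which is exactly the saving over naively expanding the chain rule per weight.

```python
# Sketch of backpropagation for a tiny 2-layer network (assumed shapes:
# 3 inputs -> 4 hidden units -> 1 output, sigmoid activations, squared error).
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 3)), np.zeros(4)
W2, b2 = rng.standard_normal((1, 4)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop(x, y):
    """Return the loss and its gradients w.r.t. all weights and biases."""
    # Forward pass: cache each layer's activations for reuse below.
    z1 = W1 @ x + b1
    a1 = sigmoid(z1)
    z2 = W2 @ a1 + b2
    a2 = sigmoid(z2)
    loss = 0.5 * np.sum((a2 - y) ** 2)

    # Backward pass: start from the last layer...
    delta2 = (a2 - y) * a2 * (1 - a2)          # dL/dz2
    gW2, gb2 = np.outer(delta2, a1), delta2
    # ...then propagate one layer back, reusing delta2 instead of
    # re-deriving the chain rule from scratch for each weight.
    delta1 = (W2.T @ delta2) * a1 * (1 - a1)   # dL/dz1
    gW1, gb1 = np.outer(delta1, x), delta1
    return loss, (gW1, gb1, gW2, gb2)
```

A quick way to sanity-check such a sketch is to compare one analytic gradient entry against a finite-difference estimate of the loss.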

Strictly speaking, the term *backpropagation* refers only to the algorithm for computing the gradient, not to how the gradient is used; however, the term is often used loosely to refer to the entire learning algorithm, including how the gradient is used, such as by stochastic gradient descent.[15] In 1986, David E. Rumelhart et al. published an experimental analysis of the technique.[16] This contributed to the popularization of backpropagation and helped to initiate an active period of research in multilayer perceptrons.
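The separation between computing the gradient and using it can be made concrete: once backpropagation has produced the gradients, an optimizer such as plain stochastic gradient descent simply moves each parameter a small step against its gradient. A minimal sketch, with an assumed learning rate and illustrative parameter names:

```python
def sgd_step(params, grads, lr=0.1):
    """One plain SGD update: theta <- theta - lr * grad, per parameter."""
    return [p - lr * g for p, g in zip(params, grads)]
```

Any other gradient-based optimizer (momentum, Adam, etc.) could consume the same backpropagated gradients; backpropagation itself is agnostic to this choice.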
