# Stochastic gradient descent

## Optimization algorithm / From Wikipedia, the free encyclopedia

#### Dear Wikiwand AI, let's keep it short by simply answering these key questions:

Can you list the top facts and stats about Stochastic gradient descent?

Summarize this article for a 10 year old

**Stochastic gradient descent** (often abbreviated **SGD**) is an iterative method for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). It can be regarded as a stochastic approximation of gradient descent optimization, since it replaces the actual gradient (calculated from the entire data set) by an estimate thereof (calculated from a randomly selected subset of the data). Especially in high-dimensional optimization problems this reduces the very high computational burden, achieving faster iterations in exchange for a lower convergence rate.^{[1]}

The basic idea behind stochastic approximation can be traced back to the Robbins–Monro algorithm of the 1950s. Today, stochastic gradient descent has become an important optimization method in machine learning.^{[2]}