Calculus of variations
The calculus of variations (or variational calculus) is a field of mathematical analysis that uses variations, which are small changes in functions and functionals, to find maxima and minima of functionals: mappings from a set of functions to the real numbers. Functionals are often expressed as definite integrals involving functions and their derivatives. Functions that maximize or minimize functionals may be found using the Euler–Lagrange equation of the calculus of variations.
A simple example of such a problem is to find the curve of shortest length connecting two points. If there are no constraints, the solution is a straight line between the points. However, if the curve is constrained to lie on a surface in space, then the solution is less obvious, and possibly many solutions may exist. Such solutions are known as geodesics. A related problem is posed by Fermat's principle: light follows the path of shortest optical length connecting two points, which depends upon the material of the medium. One corresponding concept in mechanics is the principle of least/stationary action.
Many important problems involve functions of several variables. Solutions of boundary value problems for the Laplace equation satisfy Dirichlet's principle. Plateau's problem requires finding a surface of minimal area that spans a given contour in space: a solution can often be found by dipping a frame in soapy water. Although such experiments are relatively easy to perform, their mathematical formulation is far from simple: there may be more than one locally minimizing surface, and they may have non-trivial topology.
The calculus of variations may be said to begin with Newton's minimal resistance problem in 1687, followed by the brachistochrone curve problem raised by Johann Bernoulli (1696).[2] It immediately occupied the attention of Jacob Bernoulli and the Marquis de l'Hôpital, but Leonhard Euler first elaborated the subject, beginning in 1733. Lagrange was influenced by Euler's work to contribute significantly to the theory. After Euler saw the 1755 work of the 19-year-old Lagrange, Euler dropped his own partly geometric approach in favor of Lagrange's purely analytic approach and renamed the subject the calculus of variations in his 1756 lecture Elementa Calculi Variationum.[3][4]
Legendre (1786) laid down a method, not entirely satisfactory, for the discrimination of maxima and minima. Isaac Newton and Gottfried Leibniz also gave some early attention to the subject.[5] Vincenzo Brunacci (1810), Carl Friedrich Gauss (1829), Siméon Poisson (1831), Mikhail Ostrogradsky (1834), and Carl Jacobi (1837) have been among the contributors to this discrimination. An important general work is that of Sarrus (1842), which was condensed and improved by Cauchy (1844). Other valuable treatises and memoirs have been written by Strauch (1849), Jellett (1850), Otto Hesse (1857), Alfred Clebsch (1858), and Lewis Buffett Carll (1885), but perhaps the most important work of the century is that of Weierstrass. His celebrated course on the theory is epoch-making, and it may be asserted that he was the first to place it on a firm and unquestionable foundation. The 20th and the 23rd Hilbert problems, published in 1900, encouraged further development.[5]
In the 20th century David Hilbert, Oskar Bolza, Gilbert Ames Bliss, Emmy Noether, Leonida Tonelli, Henri Lebesgue and Jacques Hadamard among others made significant contributions.[5] Marston Morse applied calculus of variations in what is now called Morse theory.[6] Lev Pontryagin, Ralph Rockafellar and F. H. Clarke developed new mathematical tools for the calculus of variations in optimal control theory.[6] The dynamic programming of Richard Bellman is an alternative to the calculus of variations.[7][8][9]
The calculus of variations is concerned with the maxima or minima (collectively called extrema) of functionals. A functional maps functions to scalars, so functionals have been described as "functions of functions." Functionals have extrema with respect to the elements $y$ of a given function space defined over a given domain. A functional $J[y]$ is said to have an extremum at the function $f$ if $\Delta J = J[y] - J[f]$ has the same sign for all $y$ in an arbitrarily small neighborhood of $f.$ The function $f$ is called an extremal function or extremal. The extremum $J[f]$ is called a local maximum if $\Delta J \leq 0$ everywhere in an arbitrarily small neighborhood of $f,$ and a local minimum if $\Delta J \geq 0$ there. For a function space of continuous functions, extrema of corresponding functionals are called strong extrema or weak extrema, depending on whether the first derivatives of the continuous functions are respectively all continuous or not.[11]
Both strong and weak extrema of functionals are defined for a space of continuous functions, but strong extrema have the additional requirement that the first derivatives of the functions in the space be continuous. Thus a strong extremum is also a weak extremum, but the converse may not hold. Finding strong extrema is more difficult than finding weak extrema.[12] An example of a necessary condition that is used for finding weak extrema is the Euler–Lagrange equation.[13]
Finding the extrema of functionals is similar to finding the maxima and minima of functions. The maxima and minima of a function may be located by finding the points where its derivative vanishes (i.e., is equal to zero). The extrema of functionals may be obtained by finding functions for which the functional derivative is equal to zero. This leads to solving the associated Euler–Lagrange equation.
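To make the analogy concrete, the following sketch (an illustrative example, not from the article, using NumPy and SciPy) discretizes the functional $J[y] = \int_0^1 \left(\tfrac{1}{2}[y'(x)]^2 + y(x)\right) dx$ with $y(0) = y(1) = 0$ and minimizes it over the grid values; the numerical minimizer matches $y(x) = (x^2 - x)/2,$ the solution of the corresponding Euler–Lagrange equation $y'' = 1.$

```python
# Illustrative sketch: minimize a discretized functional and compare with the
# Euler-Lagrange solution y'' = 1, i.e. y(x) = (x^2 - x)/2 for y(0) = y(1) = 0.
import numpy as np
from scipy.optimize import minimize

n = 101
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

def J(y_inner):
    y = np.concatenate(([0.0], y_inner, [0.0]))  # enforce the boundary values
    dy = np.diff(y) / h                          # finite-difference y'
    ym = 0.5 * (y[:-1] + y[1:])                  # y at the cell midpoints
    return h * np.sum(0.5 * dy**2 + ym)          # Riemann sum approximating J[y]

res = minimize(J, np.zeros(n - 2), method="L-BFGS-B")
y_exact = 0.5 * (x**2 - x)
print(np.max(np.abs(res.x - y_exact[1:-1])))     # small discretization error
```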
Consider the functional
$$J[y] = \int_{x_1}^{x_2} L\left(x, y(x), y'(x)\right)\, dx,$$
where $x_1, x_2$ are constants, $y(x)$ is twice continuously differentiable, $y'(x) = \frac{dy}{dx},$ and $L\left(x, y(x), y'(x)\right)$ is twice continuously differentiable with respect to its arguments $x,$ $y,$ and $y'.$
If the functional $J[y]$ attains a local minimum at $f,$ and $\eta(x)$ is an arbitrary function that has at least one derivative and vanishes at the endpoints $x_1$ and $x_2,$ then for any number $\varepsilon$ close to 0,
$$J[f] \leq J[f + \varepsilon \eta].$$
The term $\varepsilon \eta$ is called the variation of the function $f$ and is denoted by $\delta f.$[1]
Substituting $f + \varepsilon \eta$ for $y$ in the functional $J[y],$ the result is a function of $\varepsilon,$
$$\Phi(\varepsilon) = J[f + \varepsilon \eta].$$
Since the functional $J[y]$ has a minimum for $y = f,$ the function $\Phi(\varepsilon)$ has a minimum at $\varepsilon = 0$ and thus,
$$\Phi'(0) \equiv \left.\frac{d\Phi}{d\varepsilon}\right|_{\varepsilon = 0} = \int_{x_1}^{x_2} \left.\frac{dL}{d\varepsilon}\right|_{\varepsilon = 0} dx = 0.$$
Taking the total derivative of $L\left[x, y, y'\right],$ where $y = f + \varepsilon \eta$ and $y' = f' + \varepsilon \eta'$ are considered as functions of $\varepsilon$ rather than $x,$ yields
$$\frac{dL}{d\varepsilon} = \frac{\partial L}{\partial y}\frac{dy}{d\varepsilon} + \frac{\partial L}{\partial y'}\frac{dy'}{d\varepsilon},$$
and because $\frac{dy}{d\varepsilon} = \eta$ and $\frac{dy'}{d\varepsilon} = \eta',$
$$\frac{dL}{d\varepsilon} = \frac{\partial L}{\partial y}\eta + \frac{\partial L}{\partial y'}\eta'.$$
Therefore,
$$\begin{aligned} \int_{x_1}^{x_2} \left.\frac{dL}{d\varepsilon}\right|_{\varepsilon = 0} dx &= \int_{x_1}^{x_2} \left(\frac{\partial L}{\partial f}\eta + \frac{\partial L}{\partial f'}\eta'\right) dx \\ &= \int_{x_1}^{x_2} \left(\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'}\right)\eta\, dx + \left.\frac{\partial L}{\partial f'}\eta\right|_{x_1}^{x_2}, \end{aligned}$$
where $L\left[x, y, y'\right] \to L\left[x, f, f'\right]$ when $\varepsilon = 0,$ and we have used integration by parts on the second term. The second term on the second line vanishes because $\eta = 0$ at $x_1$ and $x_2$ by definition. Also, as previously mentioned, the left side of the equation is zero, so that
$$\int_{x_1}^{x_2} \eta(x)\left(\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'}\right) dx = 0.$$
According to the fundamental lemma of calculus of variations, the part of the integrand in parentheses is zero, i.e.
$$\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'} = 0,$$
which is called the Euler–Lagrange equation. The left hand side of this equation is called the functional derivative of $J[f]$ and is denoted $\frac{\delta J}{\delta f(x)}.$
In general this gives a second-order ordinary differential equation which can be solved to obtain the extremal function $f(x).$ The Euler–Lagrange equation is a necessary, but not sufficient, condition for an extremum $J[f].$ A sufficient condition for a minimum is given in the section Variations and sufficient condition for a minimum.
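As a brief sketch (assuming SymPy is available; its euler_equations helper performs exactly this computation, and the Lagrangian below is an illustrative choice, not one used above), the Euler–Lagrange equation can be formed symbolically:

```python
# Form the Euler-Lagrange equation for L = y'(x)^2/2 - y(x)^2/2,
# whose extremals satisfy y''(x) + y(x) = 0.
from sympy import Function, Symbol
from sympy.calculus.euler import euler_equations

x = Symbol('x')
y = Function('y')
L = y(x).diff(x)**2 / 2 - y(x)**2 / 2
print(euler_equations(L, y(x), x))  # [Eq(-y(x) - Derivative(y(x), (x, 2)), 0)]
```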
In order to illustrate this process, consider the problem of finding the extremal function $y = f(x),$ which is the shortest curve that connects two points $\left(x_1, y_1\right)$ and $\left(x_2, y_2\right).$ The arc length of the curve is given by
$$A[y] = \int_{x_1}^{x_2} \sqrt{1 + [y'(x)]^2}\, dx,$$
with
$$y'(x) = \frac{dy}{dx}, \quad y(x_1) = y_1, \quad y(x_2) = y_2.$$
Note that assuming $y$ is a function of $x$ loses generality; ideally both should be a function of some other parameter. This approach is good solely for instructive purposes.
The Euler–Lagrange equation will now be used to find the extremal function $f(x)$ that minimizes the functional $A[y],$
$$\frac{\delta A}{\delta f} = \frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'} = 0,$$
with
$$L = \sqrt{1 + [f'(x)]^2}.$$
Since $f$ does not appear explicitly in $L,$ the first term in the Euler–Lagrange equation vanishes for all $f(x)$ and thus,
$$\frac{d}{dx}\frac{\partial L}{\partial f'} = 0.$$
Substituting for $L$ and taking the derivative,
$$\frac{d}{dx}\, \frac{f'(x)}{\sqrt{1 + [f'(x)]^2}} = 0.$$
Thus
$$\frac{f'(x)}{\sqrt{1 + [f'(x)]^2}} = c,$$
for some constant $c.$ Then
$$\frac{[f'(x)]^2}{1 + [f'(x)]^2} = c^2, \quad \text{where } 0 \leq c^2 < 1.$$
Solving, we get
$$[f'(x)]^2 = \frac{c^2}{1 - c^2},$$
which implies that
$$f'(x) = m$$
is a constant and therefore that the shortest curve that connects two points $\left(x_1, y_1\right)$ and $\left(x_2, y_2\right)$ is
$$f(x) = m x + b, \qquad \text{with } m = \frac{y_2 - y_1}{x_2 - x_1} \text{ and } b = \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1},$$
and we have thus found the extremal function $f(x)$ that minimizes the functional $A[y]$ so that $A[f]$ is a minimum. The equation for a straight line is $y = mx + b.$ In other words, the shortest distance between two points is a straight line.
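The same conclusion can be checked symbolically. In this sketch (SymPy assumed), the Euler–Lagrange expression for the arc-length integrand simplifies to a multiple of $f''(x),$ so it vanishes exactly for straight lines:

```python
# The Euler-Lagrange equation for L = sqrt(1 + f'(x)^2) reduces to f''(x) = 0,
# confirming that the extremals are straight lines.
from sympy import Function, Symbol, simplify, sqrt
from sympy.calculus.euler import euler_equations

x = Symbol('x')
f = Function('f')
L = sqrt(1 + f(x).diff(x)**2)
eq = euler_equations(L, f(x), x)[0]
print(simplify(eq.lhs))  # -f''(x)/(1 + f'(x)**2)**(3/2), zero exactly when f'' = 0
```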
In physics problems it may be the case that $\frac{\partial L}{\partial x} = 0,$ meaning the integrand is a function of $f(x)$ and $f'(x)$ but $x$ does not appear separately. In that case, the Euler–Lagrange equation can be simplified to the Beltrami identity[16]
$$L - f'\frac{\partial L}{\partial f'} = C,$$
where $C$ is a constant. The left hand side is the Legendre transformation of $L$ with respect to $f'(x).$
The intuition behind this result is that, if the variable is actually time, then the statement implies that the Lagrangian is time-independent. By Noether's theorem, there is an associated conserved quantity. In this case, this quantity is the Hamiltonian, the Legendre transform of the Lagrangian, which (often) coincides with the energy of the system. This is (minus) the constant in Beltrami's identity.
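The identity itself follows from a standard one-line computation (not specific to any one source): differentiating $L - f'\frac{\partial L}{\partial f'}$ with respect to $x$ along a solution and applying the chain rule gives
$$\frac{d}{dx}\left(L - f'\frac{\partial L}{\partial f'}\right) = \frac{\partial L}{\partial x} + f'\frac{\partial L}{\partial f} + f''\frac{\partial L}{\partial f'} - f''\frac{\partial L}{\partial f'} - f'\frac{d}{dx}\frac{\partial L}{\partial f'} = \frac{\partial L}{\partial x} + f'\left(\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'}\right).$$
The parenthesized term vanishes by the Euler–Lagrange equation, so when $\frac{\partial L}{\partial x} = 0$ the left hand side is constant, which is the Beltrami identity.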
If $S$ depends on higher derivatives of $y(x),$ that is, if
$$S = \int_a^b f\left(x, y(x), y'(x), \dots, y^{(n)}(x)\right)\, dx,$$
then $y$ must satisfy the Euler–Poisson equation,[17]
$$\frac{\partial f}{\partial y} - \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) + \dots + (-1)^n \frac{d^n}{dx^n}\left[\frac{\partial f}{\partial y^{(n)}}\right] = 0.$$
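As a small sketch of this generalization (SymPy assumed; the integrand is an illustrative choice), the Euler–Poisson equation for $f = \tfrac{1}{2}[y''(x)]^2$ reduces to $y''''(x) = 0,$ the equation of an unloaded Euler–Bernoulli beam:

```python
# Higher-derivative case: the Euler-Poisson equation for f = y''(x)^2/2
# reduces to y''''(x) = 0.
from sympy import Function, Symbol
from sympy.calculus.euler import euler_equations

x = Symbol('x')
y = Function('y')
f = y(x).diff(x, 2)**2 / 2
print(euler_equations(f, y(x), x))  # [Eq(Derivative(y(x), (x, 4)), 0)]
```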