Algebraic statistics
From Wikipedia, the free encyclopedia
Algebraic statistics is a branch of mathematical statistics that focuses on the use of algebraic, geometric, and combinatorial methods in statistics. While the use of these methods has a long history in statistics, algebraic statistics is continuously forging new interdisciplinary connections.
This growing field sits at the intersection of several areas of mathematics, including multilinear algebra, commutative algebra, algebraic geometry, convex geometry, and combinatorics, and it addresses both theoretical problems in statistics and their practical applications. For example, algebraic statistics has been useful for experimental design, parameter estimation, and hypothesis testing.
History
Algebraic statistics can be traced back to Karl Pearson, who used polynomial algebra to study Gaussian mixture models. Subsequently, Ronald A. Fisher, Henry B. Mann, and Rosemary A. Bailey applied Abelian groups to the design of experiments. Experimental designs were also studied using affine geometry over finite fields and, later, the association schemes introduced by R. C. Bose. C. R. Rao introduced orthogonal arrays, likewise for experimental design.
The field experienced a major revitalization in the 1990s. In 1996, Pistone and Wynn applied computational commutative algebra to the design and analysis of experiments, providing new tools for understanding confounding and identifiability in complex experimental settings. In 1998, Diaconis and Sturmfels introduced Gröbner bases for constructing Markov chain Monte Carlo algorithms for conditional sampling from discrete exponential families. These works, along with the monograph by Giovanni Pistone, Eva Riccomagno, and Henry P. Wynn, in which the term “algebraic statistics” was first used, played a pivotal role in establishing this field as a unified area of research.
Modern researchers in algebraic statistics explore a wide range of topics, including computational biology, graphical models, and statistical learning.
Active Research Areas
Phylogenetics
Maximum likelihood estimation
Method of moments
Graphical models
Tropical statistics
Statistical learning theory
Algebraic geometry has also recently found applications to statistical learning theory, including a generalization of the Akaike information criterion to singular statistical models.[1]
Other topics
Algebraic analysis and abstract statistical inference
Invariant measures on locally compact groups have long been used in statistical theory, particularly in multivariate analysis. Beurling's factorization theorem and much of the work on (abstract) harmonic analysis sought better understanding of the Wold decomposition of stationary stochastic processes, which is important in time series statistics.
Encompassing previous results on probability theory on algebraic structures, Ulf Grenander developed a theory of "abstract inference". Grenander's abstract inference and his theory of patterns are useful for spatial statistics and image analysis; these theories rely on lattice theory.
Partially ordered sets and lattices
Partially ordered vector spaces and vector lattices are used throughout statistical theory. Garrett Birkhoff metrized the positive cone using Hilbert's projective metric and proved Jentzsch's theorem using the contraction mapping theorem.[2] Birkhoff's results have been used for maximum entropy estimation (which can be viewed as linear programming in infinite dimensions) by Jonathan Borwein and colleagues.
Vector lattices and conical measures were introduced into statistical decision theory by Lucien Le Cam.
Introductory Example
Consider a random variable X which can take on the values 0, 1, 2. Such a variable is completely characterized by the three probabilities
$p_i = \Pr(X = i), \quad i = 0, 1, 2,$
and these numbers satisfy
$\sum_{i=0}^{2} p_i = 1 \quad \text{and} \quad 0 \le p_i \le 1.$
Conversely, any three such numbers unambiguously specify a random variable, so we can identify the random variable X with the tuple $(p_0, p_1, p_2) \in \mathbb{R}^3$.
Now suppose X is a binomial random variable with parameter q and n = 2, i.e. X represents the number of successes when repeating a certain experiment two times, where each experiment has an individual success probability of q. Then
$p_i = \Pr(X = i) = \binom{2}{i} q^i (1 - q)^{2 - i},$
and it is not hard to show that the tuples $(p_0, p_1, p_2)$ which arise in this way are precisely the ones satisfying
$4 p_0 p_2 - p_1^2 = 0.$
The latter is a polynomial equation defining an algebraic variety (or surface) in $\mathbb{R}^3$, and this variety, when intersected with the simplex given by
$\sum_{i=0}^{2} p_i = 1, \quad 0 \le p_i \le 1,$
yields a piece of an algebraic curve which may be identified with the set of all binomial random variables with n = 2. Determining the parameter q amounts to locating one point on this curve; testing the hypothesis that a given variable X is binomial amounts to testing whether a certain point lies on that curve or not.
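The example can be checked numerically. For a binomial variable with n = 2 the probabilities are (1 - q)^2, 2q(1 - q), and q^2, so 4 p_0 p_2 = 4 q^2 (1 - q)^2 = p_1^2. The following is a minimal Python sketch of that check; the function names `binomial_point`, `variety_residual`, and `recover_q` are illustrative assumptions, not part of any library, and recovering q from the mean E[X] = 2q is one convenient choice among several.

```python
from math import comb


def binomial_point(q, n=2):
    """Return the tuple (p_0, ..., p_n) of binomial probabilities for success probability q."""
    return tuple(comb(n, i) * q**i * (1 - q) ** (n - i) for i in range(n + 1))


def variety_residual(p):
    """Evaluate the polynomial 4*p0*p2 - p1^2; it is zero exactly on the variety."""
    p0, p1, p2 = p
    return 4 * p0 * p2 - p1**2


def recover_q(p):
    """Locate the point on the curve: since E[X] = p1 + 2*p2 = 2q, q is half the mean."""
    p0, p1, p2 = p
    return (p1 + 2 * p2) / 2


if __name__ == "__main__":
    on_curve = binomial_point(0.3)
    off_curve = (1 / 3, 1 / 3, 1 / 3)  # uniform distribution, not binomial
    print(variety_residual(on_curve))   # numerically zero
    print(variety_residual(off_curve))  # clearly nonzero
    print(recover_q(on_curve))
```

With exact arithmetic the residual vanishes for every q in [0, 1]; in floating point a small tolerance is needed, and with observed frequencies rather than exact probabilities the question of whether a point lies on the curve becomes a statistical test.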
References