Non-negative least squares


In mathematical optimization, the problem of non-negative least squares (NNLS) is a type of constrained least squares problem where the coefficients are not allowed to become negative. That is, given a matrix A and a (column) vector of response variables y, the goal is to find[1]

  arg minx ‖Ax − y‖2 subject to x ≥ 0.

Here x ≥ 0 means that each component of the vector x should be non-negative, and ‖·‖2 denotes the Euclidean norm.
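
For a concrete (made-up) numerical illustration, the problem can be solved with SciPy's optimize.nnls routine, discussed under Algorithms below; the unconstrained least-squares fit is shown alongside for contrast:

  import numpy as np
  from scipy.optimize import nnls

  # Made-up example data: 20 observations, 5 coefficients
  rng = np.random.default_rng(0)
  A = rng.standard_normal((20, 5))
  y = rng.standard_normal(20)

  x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)  # unconstrained solution
  x_nn, rnorm = nnls(A, y)                      # non-negative solution

  print(x_ls)   # may contain negative coefficients
  print(x_nn)   # every component is >= 0
  print(rnorm)  # residual norm ||Ax - y||_2 at the NNLS solution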

Non-negative least squares problems turn up as subproblems in matrix decomposition, e.g. in algorithms for PARAFAC[2] and non-negative matrix/tensor factorization.[3][4] The latter can be considered a generalization of NNLS.[1]

Another generalization of NNLS is bounded-variable least squares (BVLS), with simultaneous upper and lower bounds αi ≤ xi ≤ βi.[5]:291[6]
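
As an illustrative sketch (again with made-up data), BVLS is available in SciPy through optimize.lsq_linear, and NNLS is recovered as the special case with bounds (0, ∞):

  import numpy as np
  from scipy.optimize import lsq_linear

  rng = np.random.default_rng(0)
  A = rng.standard_normal((20, 5))
  y = rng.standard_normal(20)

  # Simultaneous bounds alpha_i <= x_i <= beta_i, here 0 <= x_i <= 2
  res = lsq_linear(A, y, bounds=(0.0, 2.0), method='bvls')
  print(res.x)  # all components lie in [0, 2]

  # NNLS as the special case alpha_i = 0, beta_i = infinity
  res_nnls = lsq_linear(A, y, bounds=(0.0, np.inf), method='bvls')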

Quadratic programming version


The NNLS problem is equivalent to a quadratic programming problem

  minimize (1/2)xTQx − cTx subject to x ≥ 0,

where Q = ATA and c = ATy. This problem is convex, as Q is positive semidefinite and the non-negativity constraints form a convex feasible set.[7]
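
To make the equivalence concrete, the quadratic form can be minimized directly, for example by projected gradient descent; the sketch below illustrates the convex formulation and is not one of the published NNLS algorithms:

  import numpy as np

  def nnls_projected_gradient(A, y, n_iter=5000):
      """Minimize (1/2) x'Qx - c'x subject to x >= 0 by projected
      gradient descent (illustrative sketch only)."""
      Q = A.T @ A                  # Q = A^T A, positive semidefinite
      c = A.T @ y                  # c = A^T y
      # Step size 1/L, where L is the largest eigenvalue of Q
      step = 1.0 / np.linalg.eigvalsh(Q)[-1]
      x = np.zeros(A.shape[1])
      for _ in range(n_iter):
          grad = Q @ x - c                      # gradient of the objective
          x = np.maximum(0.0, x - step * grad)  # project onto x >= 0
      return x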

Algorithms


The first widely used algorithm for solving this problem is an active-set method published by Lawson and Hanson in their 1974 book Solving Least Squares Problems.[5]:291 In pseudocode, this algorithm looks as follows:[1][2]

  • Inputs:
    • a real-valued matrix A of dimension m × n,
    • a real-valued vector y of dimension m,
    • a real value ε, the tolerance for the stopping criterion.
  • Initialize:
    • Set P = ∅.
    • Set R = {1, ..., n}.
    • Set x to an all-zero vector of dimension n.
    • Set w = AT(y − Ax).
    • Let wR denote the sub-vector with indexes from R.
  • Main loop: while R ≠ ∅ and max(wR) > ε:
    • Let j in R be the index of max(wR) in w.
    • Add j to P.
    • Remove j from R.
    • Let AP be A restricted to the variables included in P.
    • Let s be a vector of the same length as x. Let sP denote the sub-vector with indexes from P, and let sR denote the sub-vector with indexes from R.
    • Set sP = ((AP)T AP)−1 (AP)Ty.
    • Set sR to zero.
    • While min(sP) ≤ 0:
      • Let α = min(xi / (xi − si)) for i in P where si ≤ 0.
      • Set x to x + α(s − x).
      • Move to R all indices j in P such that xj ≤ 0.
      • Set sP = ((AP)T AP)−1 (AP)Ty.
      • Set sR to zero.
    • Set x to s.
    • Set w to AT(y − Ax).
  • Output: x

This algorithm takes a finite number of steps to reach a solution and smoothly improves its candidate solution as it goes (so it can find good approximate solutions when cut off at a reasonable number of iterations), but is very slow in practice, owing largely to the computation of the pseudoinverse ((AP)T AP)−1.[1] Variants of this algorithm are available in MATLAB as the routine lsqnonneg[8][1] and in SciPy as optimize.nnls.[9]
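
The pseudocode above translates almost line for line into Python. The following is an illustrative sketch (the library routines just mentioned should be preferred in practice); the iteration cap is a safeguard added here, not part of the original algorithm:

  import numpy as np

  def lawson_hanson_nnls(A, y, eps=1e-10, max_iter=None):
      """Sketch of the Lawson-Hanson active-set algorithm above."""
      m, n = A.shape
      if max_iter is None:
          max_iter = 3 * n            # safeguard, not in the pseudocode
      P = np.zeros(n, dtype=bool)     # passive set P; R is its complement ~P
      x = np.zeros(n)
      w = A.T @ (y - A @ x)           # w = A^T (y - Ax)
      iters = 0
      # Main loop: while R is non-empty and max(w_R) > eps
      while (~P).any() and np.max(w[~P]) > eps and iters < max_iter:
          # Move the index of max(w_R) from R to P
          j = np.argmax(np.where(~P, w, -np.inf))
          P[j] = True
          while True:
              iters += 1
              # Least squares restricted to the columns in P; s_R stays zero
              s = np.zeros(n)
              s[P] = np.linalg.lstsq(A[:, P], y, rcond=None)[0]
              if np.min(s[P]) > 0:
                  x = s               # s is feasible: accept it
                  break
              # Largest feasible step from x toward s
              mask = P & (s <= 0)
              alpha = np.min(x[mask] / (x[mask] - s[mask]))
              x = x + alpha * (s - x)
              # Move indices whose coefficient reached zero back to R
              P = P & (x > eps)
              x = np.where(P, x, 0.0)
              if not P.any():         # degenerate case: give up this pass
                  break
          w = A.T @ (y - A @ x)
      return x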

Many improved algorithms have been suggested since 1974.[1] Fast NNLS (FNNLS) is an optimized version of the Lawson–Hanson algorithm.[2] Other algorithms include variants of Landweber's gradient descent method,[10] coordinate-wise optimization based on the quadratic programming problem above,[7] and an active-set method called TNT-NN.[11]
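
As a sketch of the coordinate-wise idea, each component of the quadratic program can be minimized exactly in turn, clipping the result at zero; this illustrates the approach rather than reproducing any particular published variant:

  import numpy as np

  def nnls_coordinate_descent(A, y, n_sweeps=200):
      """Cyclic coordinate descent on (1/2) x'Qx - c'x with x >= 0
      (illustrative sketch only)."""
      Q = A.T @ A
      c = A.T @ y
      x = np.zeros(A.shape[1])
      for _ in range(n_sweeps):
          for i in range(len(x)):
              if Q[i, i] > 0:
                  # Exact minimizer over x_i, projected onto x_i >= 0
                  x[i] = max(0.0, x[i] + (c[i] - Q[i] @ x) / Q[i, i])
      return x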
