The solution for the problem is:

$$Z\bar{X}_{i}+(1-Z)\mu$$

where:

$$Z=\frac{1}{1+\frac{\sigma^{2}}{v^{2}m}}$$

We can interpret this result as follows: a fraction Z of the premium is based on the information that we have about the specific risk, and a fraction (1 - Z) is based on the information that we have about the whole population.
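As a minimal illustration, the sketch below computes this premium in Python for a single risk, assuming the structural parameters mu, sigma^2 and v^2 have already been estimated; all claim amounts and parameter values are made up for the example.

```python
# Hypothetical inputs: observed claims of one risk and assumed structural parameters.
claims = [1200.0, 800.0, 1500.0, 1100.0]   # X_{i1}, ..., X_{im}
m = len(claims)
mu, sigma2, v2 = 1000.0, 4.0e5, 9.0e4      # mu, E[s^2(theta)], Var[m(theta)]

Z = 1.0 / (1.0 + sigma2 / (v2 * m))        # credibility factor
x_bar = sum(claims) / m                    # average claim of the risk
premium = Z * x_bar + (1.0 - Z) * mu       # credibility premium

print(f"Z = {Z:.3f}, premium = {premium:.2f}")
```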
Proof
The following proof is slightly different from the one in the original paper. It is also more general, because it considers all linear estimators, whereas the original proof considers only estimators based on the average claim.[2]
- Lemma. The problem can be stated alternatively as:
$$f=\mathbb{E}\left[\left(a_{i0}+\sum_{j=1}^{m}a_{ij}X_{ij}-m(\vartheta)\right)^{2}\right]\to\min$$
Proof:
$$\begin{aligned}
\mathbb{E}\left[\left(a_{i0}+\sum_{j=1}^{m}a_{ij}X_{ij}-m(\vartheta)\right)^{2}\right]
&=\mathbb{E}\left[\left(a_{i0}+\sum_{j=1}^{m}a_{ij}X_{ij}-\Pi\right)^{2}\right]+\mathbb{E}\left[\left(m(\vartheta)-\Pi\right)^{2}\right]-2\,\mathbb{E}\left[\left(a_{i0}+\sum_{j=1}^{m}a_{ij}X_{ij}-\Pi\right)\left(m(\vartheta)-\Pi\right)\right]\\
&=\mathbb{E}\left[\left(a_{i0}+\sum_{j=1}^{m}a_{ij}X_{ij}-\Pi\right)^{2}\right]+\mathbb{E}\left[\left(m(\vartheta)-\Pi\right)^{2}\right]
\end{aligned}$$
The last equation follows from the fact that
$$\begin{aligned}
\mathbb{E}\left[\left(a_{i0}+\sum_{j=1}^{m}a_{ij}X_{ij}-\Pi\right)\left(m(\vartheta)-\Pi\right)\right]
&=\mathbb{E}_{\Theta}\left[\mathbb{E}_{X}\left[\left.\left(a_{i0}+\sum_{j=1}^{m}a_{ij}X_{ij}-\Pi\right)\left(m(\vartheta)-\Pi\right)\right|X_{i1},\ldots,X_{im}\right]\right]\\
&=\mathbb{E}_{\Theta}\left[\left(a_{i0}+\sum_{j=1}^{m}a_{ij}X_{ij}-\Pi\right)\mathbb{E}_{X}\left[\left.m(\vartheta)-\Pi\right|X_{i1},\ldots,X_{im}\right]\right]\\
&=0
\end{aligned}$$
Here we use the law of total expectation and the fact that $\Pi=\mathbb{E}[m(\vartheta)\mid X_{i1},\ldots,X_{im}]$.
In the previous equation we decomposed the minimized function into a sum of two expressions. The second expression does not depend on the parameters used in the minimization, so minimizing the function is the same as minimizing the first term of the sum.
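The decomposition can also be checked numerically. The sketch below uses a toy Poisson-Gamma example chosen for illustration only (it is not part of the model setup above): $\vartheta\sim\mathrm{Gamma}(\alpha,\beta)$ with rate $\beta$ and $X_{ij}\mid\vartheta\sim\mathrm{Poisson}(\vartheta)$, so that $m(\vartheta)=\vartheta$ and the Bayesian premium has the closed form $\Pi=(\alpha+\sum_{j}X_{ij})/(\beta+m)$. For an arbitrary fixed linear estimator the two sides of the decomposition then agree up to Monte Carlo error.

```python
# Monte Carlo check of the decomposition in a toy Poisson-Gamma model
# (hypothetical example, not part of the article's general setup).
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, m, n = 3.0, 2.0, 5, 200_000

theta = rng.gamma(shape=alpha, scale=1.0 / beta, size=n)   # risk parameters, m(theta) = theta
X = rng.poisson(lam=theta[:, None], size=(n, m))           # claims X_{i1}, ..., X_{im}
Pi = (alpha + X.sum(axis=1)) / (beta + m)                  # exact Bayesian premium E[m(theta) | X]

a0, a = 0.3, np.full(m, 0.1)                               # an arbitrary linear estimator
g = a0 + X @ a

lhs = np.mean((g - theta) ** 2)                            # E[(linear estimator - m(theta))^2]
rhs = np.mean((g - Pi) ** 2) + np.mean((theta - Pi) ** 2)  # sum of the two terms
print(lhs, rhs)                                            # close up to Monte Carlo error
```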
Let us find the critical points of the function:

$$\frac{1}{2}\frac{\partial f}{\partial a_{i0}}=\mathbb{E}\left[a_{i0}+\sum_{j=1}^{m}a_{ij}X_{ij}-m(\vartheta)\right]=a_{i0}+\sum_{j=1}^{m}a_{ij}\mathbb{E}(X_{ij})-\mathbb{E}(m(\vartheta))=a_{i0}+\left(\sum_{j=1}^{m}a_{ij}-1\right)\mu$$

Setting this derivative equal to zero gives

$$a_{i0}=\left(1-\sum_{j=1}^{m}a_{ij}\right)\mu$$
For $k=1,\ldots,m$ we have:

$$\frac{1}{2}\frac{\partial f}{\partial a_{ik}}=\mathbb{E}\left[X_{ik}\left(a_{i0}+\sum_{j=1}^{m}a_{ij}X_{ij}-m(\vartheta)\right)\right]=\mathbb{E}[X_{ik}]\,a_{i0}+\sum_{j=1,\,j\neq k}^{m}a_{ij}\mathbb{E}[X_{ik}X_{ij}]+a_{ik}\mathbb{E}[X_{ik}^{2}]-\mathbb{E}[X_{ik}m(\vartheta)]=0$$
We can simplify the derivative by noting that:

$$\begin{aligned}
\mathbb{E}[X_{ij}X_{ik}]&=\mathbb{E}\left[\mathbb{E}[X_{ij}X_{ik}\mid\vartheta]\right]=\mathbb{E}[\operatorname{cov}(X_{ij},X_{ik}\mid\vartheta)+\mathbb{E}(X_{ij}\mid\vartheta)\mathbb{E}(X_{ik}\mid\vartheta)]=\mathbb{E}[(m(\vartheta))^{2}]=v^{2}+\mu^{2}\qquad(j\neq k)\\
\mathbb{E}[X_{ik}^{2}]&=\mathbb{E}\left[\mathbb{E}[X_{ik}^{2}\mid\vartheta]\right]=\mathbb{E}[s^{2}(\vartheta)+(m(\vartheta))^{2}]=\sigma^{2}+v^{2}+\mu^{2}\\
\mathbb{E}[X_{ik}m(\vartheta)]&=\mathbb{E}\left[\mathbb{E}[X_{ik}m(\vartheta)\mid\vartheta]\right]=\mathbb{E}[(m(\vartheta))^{2}]=v^{2}+\mu^{2}
\end{aligned}$$

where the conditional covariance in the first line vanishes because, given $\vartheta$, the claims $X_{ij}$ and $X_{ik}$ are independent.
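These three moments are easy to verify by simulation as well, again in the hypothetical Poisson-Gamma model used above, where $\mu=\alpha/\beta$, $v^{2}=\alpha/\beta^{2}$ and, since $s^{2}(\vartheta)=\vartheta$ for Poisson claims, $\sigma^{2}=\alpha/\beta$:

```python
# Monte Carlo check of the three moment identities in the toy Poisson-Gamma model.
import numpy as np

rng = np.random.default_rng(1)
alpha, beta, n = 3.0, 2.0, 1_000_000
mu, v2, sigma2 = alpha / beta, alpha / beta**2, alpha / beta

theta = rng.gamma(shape=alpha, scale=1.0 / beta, size=n)
X1 = rng.poisson(lam=theta)              # X_{ij}
X2 = rng.poisson(lam=theta)              # X_{ik}, conditionally independent given theta

print(np.mean(X1 * X2), v2 + mu**2)            # E[X_{ij} X_{ik}], j != k
print(np.mean(X1**2), sigma2 + v2 + mu**2)     # E[X_{ik}^2]
print(np.mean(X1 * theta), v2 + mu**2)         # E[X_{ik} m(theta)]
```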
Taking the above equations and inserting them into the derivative, we have:

$$\frac{1}{2}\frac{\partial f}{\partial a_{ik}}=a_{i0}\mu+\sum_{j=1,\,j\neq k}^{m}a_{ij}(v^{2}+\mu^{2})+a_{ik}(\sigma^{2}+v^{2}+\mu^{2})-(v^{2}+\mu^{2})=0$$

$$\sigma^{2}a_{ik}=(v^{2}+\mu^{2})\left(1-\sum_{j=1}^{m}a_{ij}\right)-a_{i0}\mu$$

The right side does not depend on k. Therefore, all $a_{ik}$ are constant:

$$a_{i1}=\cdots=a_{im}=\frac{(v^{2}+\mu^{2})\left(1-\sum_{j=1}^{m}a_{ij}\right)-a_{i0}\mu}{\sigma^{2}}$$
From the solution for $a_{i0}$ we have

$$a_{i0}=\left(1-\sum_{j=1}^{m}a_{ij}\right)\mu=(1-ma_{i1})\mu$$

Substituting this back into the previous equation gives $\sigma^{2}a_{i1}=(v^{2}+\mu^{2})(1-ma_{i1})-(1-ma_{i1})\mu^{2}=(1-ma_{i1})v^{2}$, and hence

$$a_{i1}=\cdots=a_{im}=\frac{v^{2}}{\sigma^{2}+mv^{2}}=\frac{Z}{m}$$
Finally, the best estimator is

$$a_{i0}+\sum_{j=1}^{m}a_{ij}X_{ij}=ma_{i1}\bar{X}_{i}+(1-ma_{i1})\mu=Z\bar{X}_{i}+(1-Z)\mu$$

where $\bar{X}_{i}=\frac{1}{m}\sum_{j=1}^{m}X_{ij}$ and $Z=\frac{mv^{2}}{mv^{2}+\sigma^{2}}=\frac{1}{1+\frac{\sigma^{2}}{v^{2}m}}$, as claimed.
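The normal equations above can also be solved symbolically for a small number of observations. The following sketch, assuming SymPy is available, takes m = 3 and confirms that the critical point is $a_{i0}=(1-Z)\mu$ and $a_{i1}=\cdots=a_{im}=Z/m$:

```python
# Symbolic check of the normal equations for m = 3 (illustrative sketch using SymPy).
import sympy as sp

m = 3
mu, sigma2, v2 = sp.symbols('mu sigma2 v2', positive=True)
a0 = sp.Symbol('a0')
a = sp.symbols('a1:%d' % (m + 1))        # a1, a2, a3

# d f / d a_{i0} = 0  =>  a0 + (sum_j a_j - 1) * mu = 0
eqs = [sp.Eq(a0 + (sum(a) - 1) * mu, 0)]
# d f / d a_{ik} = 0  =>  a0*mu + sum_{j != k} a_j (v^2 + mu^2)
#                         + a_k (sigma^2 + v^2 + mu^2) - (v^2 + mu^2) = 0
for k in range(m):
    eqs.append(sp.Eq(a0 * mu
                     + sum(a[j] for j in range(m) if j != k) * (v2 + mu**2)
                     + a[k] * (sigma2 + v2 + mu**2)
                     - (v2 + mu**2), 0))

sol = sp.solve(eqs, (a0,) + a, dict=True)[0]
Z = 1 / (1 + sigma2 / (v2 * m))

print(sp.simplify(sol[a0] - (1 - Z) * mu))          # 0, i.e. a_{i0} = (1 - Z) mu
print([sp.simplify(sol[ak] - Z / m) for ak in a])   # [0, 0, 0], i.e. a_{ik} = Z / m
```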