Squared deviations from the mean

When data are available for $k$ different treatment groups of size $n_{i}$, where $i$ varies from 1 to $k$, it is assumed that the expected mean of each group is

\operatorname {E} (\mu _{i})=\mu +T_{i}

and the variance of each treatment group is unchanged from the population variance $\sigma ^{2}$ .

Under the null hypothesis that the treatments have no effect, each of the $T_{i}$ is zero.

It is now possible to calculate three sums of squares. In the formulas below, $n=\sum _{i=1}^{k}n_{i}$ is the total number of observations.

Individual

I=\sum x^{2}

\operatorname {E} (I)=n\sigma ^{2}+n\mu ^{2}

Treatments

T=\sum _{i=1}^{k}\left(\left(\sum x\right)^{2}/n_{i}\right)

\operatorname {E} (T)=k\sigma ^{2}+\sum _{i=1}^{k}n_{i}(\mu +T_{i})^{2}

\operatorname {E} (T)=k\sigma ^{2}+n\mu ^{2}+2\mu \sum _{i=1}^{k}(n_{i}T_{i})+\sum _{i=1}^{k}n_{i}(T_{i})^{2}

Under the null hypothesis that the treatments cause no differences and all the $T_{i}$ are zero, the expectation simplifies to

\operatorname {E} (T)=k\sigma ^{2}+n\mu ^{2}.

Combination

C=\left(\sum x\right)^{2}/n

\operatorname {E} (C)=\sigma ^{2}+n\mu ^{2}
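The three quantities can be computed directly from grouped data. The following is a minimal Python sketch; the function names and the sample data are illustrative assumptions and do not come from the text.

```python
def individual_ss(groups):
    """I: sum of the squares of every observation."""
    return sum(x * x for g in groups for x in g)


def treatment_ss(groups):
    """T: for each group, (sum of the group)^2 / group size, added up."""
    return sum(sum(g) ** 2 / len(g) for g in groups)


def combination_ss(groups):
    """C: (sum of all observations)^2 / total number of observations."""
    n = sum(len(g) for g in groups)
    return sum(x for g in groups for x in g) ** 2 / n


# Hypothetical data: two treatment groups of sizes 2 and 1.
groups = [[2, 4], [6]]
I = individual_ss(groups)
T = treatment_ss(groups)
C = combination_ss(groups)
print(I, T, C)
```

Each function takes a list of groups, so group sizes may differ, matching the $n_{i}$ in the formulas.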

Sums of squared deviations

Under the null hypothesis, the difference between any pair of $I$, $T$, and $C$ does not depend on $\mu$, only on $\sigma ^{2}$, because the $n\mu ^{2}$ terms cancel.

\operatorname {E} (I-C)=(n-1)\sigma ^{2}

total squared deviations aka total sum of squares

\operatorname {E} (T-C)=(k-1)\sigma ^{2}

treatment squared deviations aka explained sum of squares

\operatorname {E} (I-T)=(n-k)\sigma ^{2}

residual squared deviations aka residual sum of squares

The constants (n − 1), (k − 1), and (n − k) are normally referred to as the number of degrees of freedom.
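These expectations can be checked numerically. The sketch below draws many samples under the null hypothesis and averages the three differences. It assumes normally distributed observations, and the group sizes, $\mu$, $\sigma$, and number of trials are arbitrary values chosen for illustration.

```python
import random


def sums_of_squares(groups):
    """Return (I, T, C) as defined above for a list of groups."""
    values = [x for g in groups for x in g]
    i_ss = sum(x * x for x in values)
    t_ss = sum(sum(g) ** 2 / len(g) for g in groups)
    c_ss = sum(values) ** 2 / len(values)
    return i_ss, t_ss, c_ss


def simulate_null(sizes, mu, sigma, trials, seed=0):
    """Average I-C, T-C and I-T over many samples drawn under the null."""
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(trials):
        # Every group shares the same mean, i.e. all T_i are zero.
        groups = [[rng.gauss(mu, sigma) for _ in range(n_i)] for n_i in sizes]
        i_ss, t_ss, c_ss = sums_of_squares(groups)
        totals[0] += i_ss - c_ss
        totals[1] += t_ss - c_ss
        totals[2] += i_ss - t_ss
    return [t / trials for t in totals]


sizes = [3, 2]          # assumed group sizes: n = 5, k = 2
mu, sigma = 10.0, 2.0   # assumed population values; sigma^2 = 4
averages = simulate_null(sizes, mu, sigma, trials=20000)

n, k = sum(sizes), len(sizes)
expected = [(n - 1) * sigma**2, (k - 1) * sigma**2, (n - k) * sigma**2]
print(averages)   # simulated averages of I-C, T-C, I-T
print(expected)   # (n-1), (k-1), (n-k) times sigma^2
```

The simulated averages should lie close to the theoretical values, and changing `mu` should leave them essentially unchanged.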

Example

In a very simple example, 5 observations arise from two treatments. The first treatment gives three values, 1, 2, and 3, and the second treatment gives two values, 4 and 6.

I={\frac {1^{2}}{1}}+{\frac {2^{2}}{1}}+{\frac {3^{2}}{1}}+{\frac {4^{2}}{1}}+{\frac {6^{2}}{1}}=66

T={\frac {(1+2+3)^{2}}{3}}+{\frac {(4+6)^{2}}{2}}=12+50=62

C={\frac {(1+2+3+4+6)^{2}}{5}}=256/5=51.2

Giving

Total squared deviations = 66 − 51.2 = 14.8 with 4 degrees of freedom.

Treatment squared deviations = 62 − 51.2 = 10.8 with 1 degree of freedom.

Residual squared deviations = 66 − 62 = 4 with 3 degrees of freedom.
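The arithmetic above can be reproduced in a few lines. The sketch below uses exact fractions and also checks that each difference equals the corresponding sum of squared deviations from a mean; the variable names are illustrative.

```python
from fractions import Fraction

groups = [[1, 2, 3], [4, 6]]  # the two treatments from the example
values = [Fraction(x) for g in groups for x in g]
n, k = len(values), len(groups)

# Shortcut quantities I, T and C.
I = sum(x * x for x in values)
T = sum(Fraction(sum(g)) ** 2 / len(g) for g in groups)
C = sum(values) ** 2 / n

# Direct definitions as squared deviations from the relevant means.
grand_mean = sum(values) / n
group_means = [Fraction(sum(g), len(g)) for g in groups]
total = sum((x - grand_mean) ** 2 for x in values)
treatment = sum(
    len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
)
residual = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

# The shortcut differences agree with the direct definitions.
assert (I - C, T - C, I - T) == (total, treatment, residual)

print(float(total), n - 1)      # 14.8 4
print(float(treatment), k - 1)  # 10.8 1
print(float(residual), n - k)   # 4.0 3
```

The total splits exactly into the treatment and residual parts: 14.8 = 10.8 + 4.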

Two-way analysis of variance

In statistics, the two-way analysis of variance (ANOVA) is used to study how two categorical independent variables affect one continuous dependent variable.^[2] It extends the one-way analysis of variance (one-way ANOVA) by allowing both factors to be analyzed at the same time. A two-way ANOVA evaluates the main effect of each independent variable and whether there is any interaction between them.^[2]

Researchers use this test to see whether two factors act independently or in combination to influence a dependent variable. It is used in fields such as psychology, agriculture, education, and biomedical research.^[3] For example, it can be used to study how fertilizer type and water level together affect plant growth. The analysis produces F-statistics that indicate whether observed differences between groups are statistically significant.^[4]^[3]
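As a rough illustration of the fertilizer and water example, the sketch below computes the sums of squares and F-statistics for a balanced design with the same number of replicates in every cell. The measurements are made-up numbers, the function and key names are hypothetical, and the decomposition used is the usual fixed-effects one for balanced data, which the text above does not spell out.

```python
# Hypothetical plant-growth measurements, two replicates per cell.
# Keys are (fertilizer, water level); all names and numbers are made up.
data = {
    ("F1", "low"): [10.0, 12.0],
    ("F1", "high"): [14.0, 16.0],
    ("F2", "low"): [11.0, 13.0],
    ("F2", "high"): [20.0, 22.0],
}


def mean(xs):
    return sum(xs) / len(xs)


def two_way_anova(data):
    """Sums of squares, degrees of freedom and F for a balanced design."""
    a_levels = sorted({a for a, _ in data})
    b_levels = sorted({b for _, b in data})
    r = len(next(iter(data.values())))  # replicates per cell (assumed equal)
    grand = mean([x for cell in data.values() for x in cell])

    # Main effects: squared deviations of each level mean from the grand mean.
    ss_a = len(b_levels) * r * sum(
        (mean([x for b in b_levels for x in data[a, b]]) - grand) ** 2
        for a in a_levels
    )
    ss_b = len(a_levels) * r * sum(
        (mean([x for a in a_levels for x in data[a, b]]) - grand) ** 2
        for b in b_levels
    )
    # Between-cell variation; what the main effects leave over is interaction.
    ss_cells = r * sum((mean(cell) - grand) ** 2 for cell in data.values())
    ss_ab = ss_cells - ss_a - ss_b
    # Residual: variation of observations around their own cell mean.
    ss_err = sum((x - mean(cell)) ** 2 for cell in data.values() for x in cell)

    df = {"A": len(a_levels) - 1, "B": len(b_levels) - 1}
    df["AB"] = df["A"] * df["B"]
    df["error"] = len(a_levels) * len(b_levels) * (r - 1)

    ss = {"A": ss_a, "B": ss_b, "AB": ss_ab, "error": ss_err}
    ms_err = ss_err / df["error"]
    f = {name: (ss[name] / df[name]) / ms_err for name in ("A", "B", "AB")}
    return ss, df, f


ss, df, f = two_way_anova(data)
for name in ("A", "B", "AB"):
    print(name, ss[name], df[name], f[name])
```

Here `A` is the fertilizer factor, `B` is the water level, and `AB` is their interaction. Judging significance would require comparing each F-statistic with an F distribution on the corresponding degrees of freedom, which this sketch leaves out.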
