Analysis of Variance (ANOVA): formula and example
What is Analysis of Variance?
Analysis of Variance is a statistical test used to analyze data from experiments that have two or more groups. It is used to decide whether the difference between group 1 and group 2 is real or a chance event. Another way of saying this: ANOVA is used to decide whether the difference between the mean of group 1 and the mean of group 2 is reliable. In statistical jargon, we test whether the difference is significant.

ANOVA made simple

Here we will demystify ANOVA. Despite the frightening formulas and ANOVA summary tables of the mathematicians, ANOVA is just calculating variance. ANOVA is partitioning variance. The total variance of the scores in an experiment is partitioned into two parts: 1. variance within, and 2. variance between.
How to calculate variance within: Calculate the variance in group 1, group 2, group 3, ... and pool (average) these variances. This is the Mean Square within, MS within groups. (With equal group sizes, MS within is simply the mean of the group variances.)
How to calculate variance between: Calculate the variance of the means. Treat the means as scores. For example, if you have 3 groups and therefore 3 means, calculate the mean of the means and the variance of the means, then multiply by the number of scores per group, n. This is the Mean Square between, MS between groups. Lastly, we compare variance between to variance within, MSbetween/MSwithin; that comparison is the F ratio. (A sketch in code follows.)
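To make the partitioning concrete, here is a minimal sketch in plain Python (the function name one_way_anova is our own, not from any library):

```python
def one_way_anova(groups):
    """Return the F ratio for a list of groups of scores."""
    k = len(groups)                               # number of groups
    n_total = sum(len(g) for g in groups)         # total number of scores
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variance within: squared deviations of each score from its group mean,
    # pooled over all groups and divided by the within degrees of freedom
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    ms_within = ss_within / (n_total - k)

    # Variance between: squared deviations of each group mean from the grand
    # mean, weighted by group size, divided by the between degrees of freedom
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ms_between = ss_between / (k - 1)

    return ms_between / ms_within                 # the F ratio
```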
The layout of a single factor ANOVA
| Group 1 | Group 2 |
|---|---|
| Subject 1 | Subject 6 |
| Subject 2 | Subject 7 |
| Subject 3 | Subject 8 |
| Subject 4 | Subject 9 |
| Subject 5 | Subject 10 |
In this example, Group 1 received a sugar pill and Group 2 received a new drug, Drugx. It is important to note that the subjects of group 1 do not receive Drugx. Similarly, the subjects of group 2 do not receive the sugar pill. This is the concept of independence, an important concept in statistics.
Analysis of Variance (ANOVA): formula
The formula for ANOVA is:

$$F={{MS_{between}} \over {MS_{within}}}$$

We read this as follows: mean square between over mean square within. What is a mean square, you ask. It is the mean of squares. What are squares, you ask. Squares is the statistical term for the squared deviations (or squared differences) of each score X from the mean. What are the squared differences, you ask. Remember the formula for variance?

$$s^2 ={\sum{(X-\bar{X})^2} \over {n-1}}$$

Look at the numerator:

$$\sum{(X-\bar{X})^2}$$

These are the squared differences, summed. To complete our reasoning, we go back to where we started: the F formula, or F ratio, the formula for ANOVA. Why mean squares? Simple: because, like all averages, we divide the sum of squares by the number of pieces of information it is based on, its degrees of freedom. (If you are observant, you will notice that the F formula is a modified t formula: with two groups, F = t².)
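As a quick sanity check of the variance formula, here is a short sketch in Python using the Group 1 scores from the practice example below:

```python
# Sum of squared deviations (SS) and variance for the Group 1 scores
scores = [200, 203, 199, 190, 204]
mean = sum(scores) / len(scores)               # 199.2
ss = sum((x - mean) ** 2 for x in scores)      # SS = 122.8
variance = ss / (len(scores) - 1)              # s squared = 30.7
print(mean, ss, variance)
```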
Analysis of Variance (ANOVA): practice example
| Group 1 | Group 2 | Group 3 |
|---|---|---|
| 200 | 204 | 214 |
| 203 | 210 | 220 |
| 199 | 214 | 225 |
| 190 | 219 | 220 |
| 204 | 211 | 229 |
| n1 = 5 | n2 = 5 | n3 = 5 |
| df1 = n - 1 = 5 - 1 = 4 | df2 = n - 1 = 5 - 1 = 4 | df3 = n - 1 = 5 - 1 = 4 |
| Mean1 = 199.2 | Mean2 = 211.6 | Mean3 = 221.6 |
| SS1 = 122.8 | SS2 = 121.2 | SS3 = 129.2 |
| \(s^2_1\) = SS1/(n - 1) = 122.8/(5 - 1) = 30.7 | \(s^2_2\) = SS2/(n - 1) = 121.2/(5 - 1) = 30.3 | \(s^2_3\) = SS3/(n - 1) = 129.2/(5 - 1) = 32.3 |
ANOVA SUMMARY TABLE
| Source | SS | df | MS | F | p |
|---|---|---|---|---|---|
| Between | 1259.2 | 2 | 629.6 | 20.24 | <0.05 |
| Within | 373.2 | 12 | 31.1 | | |
| Total | 1632.4 | 14 | | | |
After we calculate the F, we go to the F tables and enter with the degrees of freedom we have, in this case 2 and 12. We first check the 0.05 level (level of significance). The critical F at df 2 and 12 is 3.89. The F of our calculations (above table) is 20.24. Because our F is greater than the one in the F table, we say p < 0.05, p less than point o five. It has been accepted among scientists that at the 0.05 level we are allowed to say that we have significance, that the finding of our experiment is reliable.
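If you prefer software to F tables, the same analysis can be reproduced in Python. This is a sketch assuming SciPy is installed:

```python
from scipy import stats

g1 = [200, 203, 199, 190, 204]
g2 = [204, 210, 214, 219, 211]
g3 = [214, 220, 225, 220, 229]

# One-way ANOVA: returns the F ratio and its p value
f_stat, p_value = stats.f_oneway(g1, g2, g3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")   # F = 20.24, p < 0.05

# Critical F at the 0.05 level with df 2 (between) and 12 (within)
f_crit = stats.f.ppf(0.95, dfn=2, dfd=12)
print(f"critical F(2, 12) = {f_crit:.2f}")      # about 3.89
```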
Post hoc tests
In experiments with 3 or more groups, after the ANOVA we may wish to pinpoint our effect, that is, which difference is significant: the one between mean 1 and mean 2, the one between mean 1 and mean 3, and so on. We then run a posteriori tests, also called post hoc tests. Here is a partial list:

- Duncan's new multiple range test (MRT)
- Dunn's Multiple Comparison Test
- Fisher's Least Significant Difference (LSD)
- Newman-Keuls test
- Scheffé's test
Which post hoc test do you choose? Duncan's has a reputation of not being tough enough; I do not recommend it. Scheffé's test is at the other end of the continuum: it is too strict. I recommend the Newman-Keuls test. (A sketch in code follows.)
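The Newman-Keuls test is not included in the common Python statistics libraries, so as a sketch here is the closely related Tukey HSD test from statsmodels, applied to the practice data above (this assumes statsmodels is installed, and Tukey HSD is a stand-in for Newman-Keuls, not the same test):

```python
from statsmodels.stats.multicomp import pairwise_tukeyhsd

scores = [200, 203, 199, 190, 204,   # Group 1
          204, 210, 214, 219, 211,   # Group 2
          214, 220, 225, 220, 229]   # Group 3
groups = ["Group 1"] * 5 + ["Group 2"] * 5 + ["Group 3"] * 5

# Pairwise comparisons of all group means at the 0.05 level
result = pairwise_tukeyhsd(endog=scores, groups=groups, alpha=0.05)
print(result)   # reports which pairwise mean differences are significant
```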