ANOVA 1
Objectives
What is the difference between a continuous variable and a categorical variable? Give a concrete example of each.
What is dummy [effect, orthogonal] coding?
What do the intercept and b weights mean for these models?
Why might I choose one model rather than another (that is, choose either dummy, effect or orthogonal coding) to analyze my data?
Is there any advantage to using regression rather than some ANOVA program to analyze designs with categorical IVs?
What effect does unbalanced (unequal) cell size have on the interpretation of dummy [effect, orthogonal] coded regression slopes and intercepts?
Categorical IVs: Dummy, Effect, & Orthogonal Coding
What we are doing here is ANOVA with regression techniques; that is, we are analyzing designs with categorical (nominal) independent variables rather than continuous ones. There are some advantages to doing this, especially if you have unequal cell sizes. The computer will be doing the work for you, but I want to show you what happens with the three kinds of coding so you will understand it. You are already familiar with ANOVA, so I am not going to discuss it. We are going to cover lots of ground quickly here. This is designed merely to familiarize you with the correspondence between regression and analysis of variance. Both methods are special cases of a larger family called the general linear model.
Dummy Coding

With this kind of coding, we put a 1 to indicate that a person is a member of a category, and a 0 otherwise, so category membership is indicated in one or more columns of zeros and ones. For example, we could code sex as 1=female, 0=male (or 1=male, 0=female); that would give us a column variable indicating status as male or female. Or we could code marital status as 1=single, 0=married (or 1=married, 0=single). Ordinarily, if we wanted to test for group differences, we would use a t-test or an F-test. But we can do the same thing with regression. Let's suppose we want to know whether people in general are happier if they are married or single. So we take a small sample of people shopping at University Square Mall and promise them some ice cream if they fill out our life satisfaction survey, which some do. They also fill out some demographic information, one item of which is marital status (Status), which we code 1=single, 0=married. For fun, let's also see what happens if we code it the other way (Status2: 0=single, 1=married). Our data:

Group      Status   Satisfaction   Status2
Single       1          25            0
S            1          28            0
S            1          20            0
S            1          26            0
S            1          25            0       M = 24.8, SD = 2.95, N = 5
Married      0          30            1
M            0          28            1
M            0          32            1
M            0          33            1
M            0          28            1       M = 30.20, SD = 2.28, N = 5
Mean        .5         27.5          .5
SD          .53        3.78          .53
Sat    Grand mean    Dev     Dev^2    Cell mean    Dev     Dev^2
25       27.5       -2.5      6.25      24.8        0.2      0.04
28       27.5        0.5      0.25      24.8        3.2     10.24
20       27.5       -7.5     56.25      24.8       -4.8     23.04
26       27.5       -1.5      2.25      24.8        1.2      1.44
25       27.5       -2.5      6.25      24.8        0.2      0.04
30       27.5        2.5      6.25      30.2       -0.2      0.04
28       27.5        0.5      0.25      30.2       -2.2      4.84
32       27.5        4.5     20.25      30.2        1.8      3.24
33       27.5        5.5     30.25      30.2        2.8      7.84
28       27.5        0.5      0.25      30.2       -2.2      4.84
Sum     275          0      128.5      275          0       55.60
We have 10 people, 5 each in two groups. The sum of squared deviations from the grand mean is 128.5 (SStot); the sum of squared deviations from the cell means is 55.60 (SSwithin); and the difference must be SSbetween = 128.5 - 55.60 = 72.90. To test for the difference, we find the ratio of the two mean squares:

F = MSbetween / MSwithin = (72.9/1) / (55.6/8) = 72.9/6.95 = 10.49
Or we could compute a t-test using the pooled variance (55.6/8 = 6.95):

t = (30.2 - 24.8) / sqrt(6.95(1/5 + 1/5)) = 5.4/1.667 = 3.239

And if we square this result, we get 10.49, which is our value for F (recall that F = t^2 when there are two groups).
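If you want to check these sums of squares and test statistics yourself, here is a quick sketch in Python (my own illustration; the variable names are mine):

```python
# Verify the ANOVA sums of squares and the F and t tests by hand.
# Data from the table above (satisfaction scores).
single = [25, 28, 20, 26, 25]
married = [30, 28, 32, 33, 28]
scores = single + married

grand_mean = sum(scores) / len(scores)               # 27.5
ss_tot = sum((y - grand_mean) ** 2 for y in scores)  # 128.5

ss_within = 0.0
for group in (single, married):
    m = sum(group) / len(group)
    ss_within += sum((y - m) ** 2 for y in group)    # 34.8 + 20.8 = 55.6

ss_between = ss_tot - ss_within                      # 72.9

# F = MS_between / MS_within with 1 and 8 degrees of freedom
f = (ss_between / 1) / (ss_within / 8)

# Equivalent two-sample t-test using the pooled variance
pooled_var = ss_within / 8                           # 6.95
se = (pooled_var * (1 / 5 + 1 / 5)) ** 0.5           # 1.667
t = (sum(married) / 5 - sum(single) / 5) / se        # 3.239

print(round(ss_tot, 1), round(ss_within, 1), round(f, 2), round(t ** 2, 2))
# 128.5 55.6 10.49 10.49
```

Note that t^2 reproduces F exactly, just as the text claims.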
To compute regressions, we find that:
X    Mean X     x      x^2     Y    Mean Y     y       xy
1     0.5      0.5    0.25    25    27.5     -2.5    -1.25
1     0.5      0.5    0.25    28    27.5      0.5     0.25
1     0.5      0.5    0.25    20    27.5     -7.5    -3.75
1     0.5      0.5    0.25    26    27.5     -1.5    -0.75
1     0.5      0.5    0.25    25    27.5     -2.5    -1.25
0     0.5     -0.5    0.25    30    27.5      2.5    -1.25
0     0.5     -0.5    0.25    28    27.5      0.5    -0.25
0     0.5     -0.5    0.25    32    27.5      4.5    -2.25
0     0.5     -0.5    0.25    33    27.5      5.5    -2.75
0     0.5     -0.5    0.25    28    27.5      0.5    -0.25
Sum   5        0      2.5    275   275        0     -13.5

(X here is Status, coded 1 = single; for Status2 the x and xy columns simply change sign.)
Formula                          Status                         Status2
b = Sum(xy)/Sum(x^2)             -13.5/2.5 = -5.4               13.5/2.5 = 5.4
a = mean(Y) - b*mean(X)          27.5 - (-5.4)(.5) = 30.20      27.5 - (5.4)(.5) = 24.8
Regression equation              Y' = 30.20 - 5.4X              Y' = 24.8 + 5.4X
SSreg = b*Sum(xy)                (-5.4)(-13.5) = 72.90          (5.4)(13.5) = 72.90
SSres = SStot - SSreg            128.5 - 72.90 = 55.6           128.5 - 72.9 = 55.6
MSres = SSres/(N-2)              55.6/8 = 6.95                  55.6/8 = 6.95
SE(b) = sqrt(MSres/Sum(x^2))     sqrt(6.95/2.5) = 1.667         sqrt(6.95/2.5) = 1.667
t = b/SE(b)                      -5.4/1.667 = -3.239            5.4/1.667 = 3.239
R^2 = SSreg/SStot                72.9/128.5 = .5673             72.9/128.5 = .5673
F = (R^2/1)/((1-R^2)/8)          (.57/1)/(.43/8) = 10.49        (.57/1)/(.43/8) = 10.49
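You can verify both fitted lines with a few lines of Python (my own sketch, using the same b = Sum(xy)/Sum(x^2) and a = mean(Y) - b*mean(X) formulas):

```python
# Fit the two simple regressions (Status and Status2) from deviation sums.
def simple_regression(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx          # slope: Sum(xy) / Sum(x^2)
    a = my - b * mx        # intercept: mean(Y) - b * mean(X)
    return a, b

y = [25, 28, 20, 26, 25, 30, 28, 32, 33, 28]
status = [1] * 5 + [0] * 5    # 1 = single
status2 = [0] * 5 + [1] * 5   # 1 = married

a1, b1 = simple_regression(status, y)
a2, b2 = simple_regression(status2, y)
print(round(a1, 1), round(b1, 1), round(a2, 1), round(b2, 1))
# 30.2 -5.4 24.8 5.4
```

Reversing the 0/1 assignment flips the sign of the slope and moves the intercept to the other group's mean, exactly as in the table.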
Points to notice:
- The intercept is the mean of the group coded 0: 30.2 (the married mean) for Status, 24.8 (the single mean) for Status2.
- The b weight is the difference between the two group means (5.4 in absolute value); reversing the coding reverses its sign but changes nothing else of substance.
- The t test of the b weight (3.239) is the same as the two-sample t-test, and the F test based on R^2 (10.49) is the same as the ANOVA F.
We can apply dummy coding to categorical variables with more than two levels, still using only zeros and ones. However, we will always need as many columns as there are degrees of freedom between groups. With two levels, we need one column; with three levels, we need two columns. With C levels, we need C - 1 columns.
Suppose we have three groups of people, single, married, and divorced, and we want to estimate their life satisfaction. Note how the first vector selects (identifies) the single group, and the second identifies the married group. The divorced folks are left over. The overall results will be the same, however, no matter which groups we select.
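To make the C - 1 rule concrete, here is a small Python sketch (the function and labels are mine, purely illustrative) that builds the dummy columns for any number of levels:

```python
# Build the C-1 dummy columns for a C-level categorical variable; the last
# level listed serves as the base (all-zeros) group.
def dummy_code(labels, levels):
    return [[1 if lab == lev else 0 for lev in levels[:-1]] for lab in labels]

labels = ["S", "S", "M", "M", "D", "D"]
print(dummy_code(labels, ["S", "M", "D"]))
# [[1, 0], [1, 0], [0, 1], [0, 1], [0, 0], [0, 0]]
```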
Group        Satisfaction   Vector1   Vector2   Group mean
Single            25           1         0        24.80
S                 28           1         0
S                 20           1         0
S                 26           1         0
S                 25           1         0
Married           30           0         1        30.20
M                 28           0         1
M                 32           0         1
M                 33           0         1
M                 28           0         1
Divorced          20           0         0        23.80
D                 22           0         0
D                 28           0         0
D                 25           0         0
D                 24           0         0
Grand mean      26.27         .33       .33
The descriptive statistics for the variables are:

               Sat      V1      V2
Satisfaction    1
Vector 1       -.28      1
Vector 2        .74     -.50     1
Mean          26.27     .33     .33
SD             3.88     .49     .49
When we run the program with satisfaction as the DV and the two vectors as the IVs, we find that R^2 is .5619. The significance of this is found by:

F = (R^2/df_reg) / ((1 - R^2)/df_res) = (.5619/2) / (.4381/12) = 7.70
Note that there are three groups and thus two degrees of freedom between groups. There are 15 people and thus 12 df for error. The F test based on R^{2} gives us the same result we would get if we used the traditional ANOVA approach to analyze these data.
The parameter estimates for these data are:
Variable     df     Est    Std Err      t      P > |t|
Intercept           23.8     1.24     19.18     .0001
V1            1      1.0     1.75       .57     .5793
V2            1      6.4     1.75      3.65     .0033
Thus, the regression equation using this particular dummy code is:
Y' = 23.8 + 1(V1) + 6.4(V2)
Points to notice:
The group that gets all zeros is the base or comparison group. Each regression coefficient represents a contrast, or difference, between the group identified by its vector and the comparison group. For our example, the comparison group is the divorced group. The first b weight corresponds to the single group: it is the difference between the single and divorced means (24.8 - 23.8 = 1). The second b weight is the difference between the married and divorced means (30.2 - 23.8 = 6.4).
The tests of significance of the b weights are equivalent to t-tests of the difference between the means of the identified and comparison groups.
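If you want to see the whole thing numerically, here is a self-contained Python sketch (my own illustration) that solves the normal equations for the dummy-coded model and reproduces the estimates, R^2, and F above:

```python
# Solve the dummy-coded regression (3 groups) by hand: build the normal
# equations X'X b = X'y and solve them with Gaussian elimination.
def ols(X, y):
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    v = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    for i in range(k):                       # forward elimination
        for j in range(i + 1, k):
            f = A[j][i] / A[i][i]
            A[j] = [aj - f * ai for aj, ai in zip(A[j], A[i])]
            v[j] -= f * v[i]
    b = [0.0] * k
    for i in range(k - 1, -1, -1):           # back substitution
        b[i] = (v[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b

y = [25, 28, 20, 26, 25, 30, 28, 32, 33, 28, 20, 22, 28, 25, 24]
v1 = [1] * 5 + [0] * 10             # identifies the single group
v2 = [0] * 5 + [1] * 5 + [0] * 5    # identifies the married group
X = [[1, a, b] for a, b in zip(v1, v2)]

intercept, b1, b2 = ols(X, y)
print(round(intercept, 1), round(b1, 1), round(b2, 1))   # 23.8 1.0 6.4

# R^2 and the overall F test with 2 and 12 degrees of freedom
my = sum(y) / len(y)
yhat = [intercept + b1 * a + b2 * c for a, c in zip(v1, v2)]
r2 = 1 - sum((yi - yh) ** 2 for yi, yh in zip(y, yhat)) / \
        sum((yi - my) ** 2 for yi in y)
f = (r2 / 2) / ((1 - r2) / 12)
print(round(r2, 4), round(f, 2))                         # 0.5619 7.7
```

The intercept is the divorced (base) mean, and each b weight is that group's mean minus the base mean, exactly as described.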
Effect Coding

Effect coding is similar to dummy coding. The difference is that in effect coding the comparison group is assigned -1 on every vector rather than 0. Our example looks like this:
Group        Satisfaction   Vector1   Vector2   Group mean
Single            25           1         0        24.80
S                 28           1         0
S                 20           1         0
S                 26           1         0
S                 25           1         0
Married           30           0         1        30.20
M                 28           0         1
M                 32           0         1
M                 33           0         1
M                 28           0         1
Divorced          20          -1        -1        23.80
D                 22          -1        -1
D                 28          -1        -1
D                 25          -1        -1
D                 24          -1        -1
Grand mean      26.27          0         0

Descriptives    Sat      V1      V2
Satisfaction     1
Vector 1        .11       1
Vector 2        .70      .50      1
Mean           26.27      0       0
SD              3.88     .85     .85
The R^{2} for this model is also .5619. The estimates are somewhat different, however.
Variable     df     Est    Std Err      t      P > |t|
Intercept          26.27      .72     36.66     .0001
V1            1    -1.47     1.01     -1.45     .17
V2            1     3.93     1.01      3.88     .002
Note that the regression equation is different:
Y' = 26.27 - 1.47(V1) + 3.93(V2)
Points to notice:
- The intercept (26.27) is the unweighted mean of the three group means.
- Each b weight is the deviation of the corresponding group mean from that unweighted mean: 24.8 - 26.27 = -1.47 for single, and 30.2 - 26.27 = 3.93 for married.
- The comparison (-1, -1) group's deviation is not printed, but it equals minus the sum of the printed b weights: -(-1.47 + 3.93) = -2.47, which matches 23.8 - 26.27.
- R^2 and the overall F are the same as with dummy coding; only the meaning of the estimates changes.
Orthogonal Coding

Orthogonal coding is used to compute contrasts. You can use it if you have specific, planned comparisons in mind going into the analysis. Our example:
Group        Satisfaction   Vector1   Vector2   Group mean
Single            25          -1         1        24.80
S                 28          -1         1
S                 20          -1         1
S                 26          -1         1
S                 25          -1         1
Married           30           1         1        30.20
M                 28           1         1
M                 32           1         1
M                 33           1         1
M                 28           1         1
Divorced          20           0        -2        23.80
D                 22           0        -2
D                 28           0        -2
D                 25           0        -2
D                 24           0        -2
Grand mean      26.27          0         0

Descriptives    Sat      V1      V2
Satisfaction     1
Vector 1        .59       1
Vector 2        .47      .00      1
Mean           26.27     0.0     0.0
SD              3.88     .85    1.46
Take a look at the contrasts implied by the positive and negative numbers in the two vectors. In the first vector, we are comparing single and married people and ignoring divorced people. In the second vector, we are comparing the mean of the single and married people to the mean of the divorced group. Notice in the correlation matrix that V1 and V2 are uncorrelated; hence the name orthogonal coding. There are only as many orthogonal contrasts allowed in one analysis as there are degrees of freedom between groups; in this case, two. Exactly which two are tested depends entirely upon the design and the hypothesized effects. In our example, we would need to specify in advance that we expect a difference between single and married people, and a difference between these two groups combined and the divorced. We could have chosen other contrasts; for example, we could have hypothesized that the single would differ from the combined married and divorced.
The R^{2} for this analysis is .5619, just as for dummy and effect coding.
Variable     df     Est    Std Err      t      P > |t|
Intercept          26.27      .72     36.66     .0001
V1            1     2.70      .88      3.08     .01
V2            1     1.23      .51      2.43     .03
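Because the contrast vectors sum to zero and are uncorrelated, each b weight can be recovered on its own, without a full multiple regression, as b = Sum(code * y) / Sum(code^2); the intercept is just the grand mean. A Python sketch of my own:

```python
# Orthogonal contrasts: with uncorrelated, zero-sum code vectors, each b
# weight is simply Sum(code * y) / Sum(code^2).
y = [25, 28, 20, 26, 25, 30, 28, 32, 33, 28, 20, 22, 28, 25, 24]
v1 = [-1] * 5 + [1] * 5 + [0] * 5    # single vs. married
v2 = [1] * 5 + [1] * 5 + [-2] * 5    # single + married vs. divorced

assert sum(a * b for a, b in zip(v1, v2)) == 0   # the vectors are orthogonal

intercept = sum(y) / len(y)
b1 = sum(a * yi for a, yi in zip(v1, y)) / sum(a * a for a in v1)
b2 = sum(a * yi for a, yi in zip(v2, y)) / sum(a * a for a in v2)
print(round(intercept, 2), round(b1, 2), round(b2, 2))
# 26.27 2.7 1.23
```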
Points to notice:
- The intercept (26.27) is again the grand mean.
- Each person's predicted score is his or her group mean. People in the married group have predicted scores of
Y' = 26.27 + 2.70(1) + 1.23(1) = 30.20.
People in the divorced group have predicted scores of
Y' = 26.27 + 2.70(0) + 1.23(-2) = 23.81, which is the divorced group mean (23.80) within rounding.
- The b weights estimate the planned contrasts: 2.70 is half the married-single difference, (30.2 - 24.8)/2, and 1.23 reflects the contrast of single and married combined against divorced.
Designs in which the cells contain unequal frequencies introduce minor complications to the types of coding shown here. For effect coding, the meaning of the intercept changes: it no longer equals the grand mean, but rather the unweighted mean of the cell means.
That is, instead of

a = (Sum of Y_i) / N

where Y_i is an individual score on the DV, we have, say,

a = (mean_1 + mean_2 + mean_3) / 3

with three cells. Note that the two values will not be the same if the cells do not have equal frequencies; with equal frequencies, the weighted and unweighted means are identical. With orthogonal coding, the intercept will still be the grand mean, but we have to change the values of the codes to maintain orthogonality. For dummy coding, the intercept still refers to the mean of the base or comparison group.
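Here is a tiny check of that point, using the unequal cell sizes from the table below (3 single, 4 married, 5 divorced); the sketch is mine:

```python
# With unequal cell sizes, the unweighted mean of the cell means (the
# effect-coded intercept) differs from the grand (weighted) mean.
single = [25, 28, 20]              # n = 3
married = [30, 32, 33, 28]         # n = 4
divorced = [20, 22, 28, 25, 24]    # n = 5

cells = [single, married, divorced]
unweighted = sum(sum(g) / len(g) for g in cells) / len(cells)
weighted = sum(sum(g) for g in cells) / sum(len(g) for g in cells)
print(round(unweighted, 2), round(weighted, 2))
# 26.29 26.25
```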
For orthogonal coding, you have to adjust the code numbers to keep the sums equal to zero and the vectors orthogonal. For example,
Group        Satisfaction   Vector1   Vector2   Group mean
Single            25          -4         5        24.33
S                 28          -4         5
S                 20          -4         5
Married           30           3         5        30.75
M                 32           3         5
M                 33           3         5
M                 28           3         5
Divorced          20           0        -7        23.80
D                 22           0        -7
D                 28           0        -7
D                 25           0        -7
D                 24           0        -7
Grand mean      26.25          0         0
The correlation between V1 and V2 is still zero.
Otherwise, the results of the regressions are the same. All three types of coding give the same R^2. The interpretation of the b weights is as before: for dummy coding, the contrast between a cell and a comparison cell; for effect coding, the contrast between a cell and the unweighted mean of the cell means; and for orthogonal coding, specific planned comparisons.
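As a closing check, this Python sketch (my own) fits all three codings to the equal-n, three-group data and confirms that they yield identical R^2:

```python
# Fit dummy, effect, and orthogonal codings of the same three groups and
# show that all three give identical R^2 (only the b weights differ).
def ols_r2(X, y):
    """R^2 from the normal equations X'X b = X'y (Gaussian elimination)."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    v = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    for i in range(k):                       # forward elimination
        for j in range(i + 1, k):
            f = A[j][i] / A[i][i]
            A[j] = [aj - f * ai for aj, ai in zip(A[j], A[i])]
            v[j] -= f * v[i]
    b = [0.0] * k
    for i in range(k - 1, -1, -1):           # back substitution
        b[i] = (v[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    my = sum(y) / len(y)
    yhat = [sum(c * w for c, w in zip(row, b)) for row in X]
    return 1 - sum((yi - yh) ** 2 for yi, yh in zip(y, yhat)) / \
               sum((yi - my) ** 2 for yi in y)

y = [25, 28, 20, 26, 25, 30, 28, 32, 33, 28, 20, 22, 28, 25, 24]
codings = {
    "dummy":      ([1] * 5 + [0] * 10,           [0] * 5 + [1] * 5 + [0] * 5),
    "effect":     ([1] * 5 + [0] * 5 + [-1] * 5, [0] * 5 + [1] * 5 + [-1] * 5),
    "orthogonal": ([-1] * 5 + [1] * 5 + [0] * 5, [1] * 5 + [1] * 5 + [-2] * 5),
}
for name, (v1, v2) in codings.items():
    X = [[1, a, b] for a, b in zip(v1, v2)]
    print(name, round(ols_r2(X, y), 4))      # each coding prints 0.5619
```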