Annotated SAS input and output for computing intraclass correlations for interrater reliability.
Annotations appear directly below the statement or output they describe.
Input File
data d1;
input r1-r4;
*********************************************
* The data in this program were taken from *
* Shrout & Fleiss, Psychological Bulletin, *
* 1979, 420-428, Table 2. The program and *
* output are to show you a computational *
* example of their analysis. *
*********************************************;
cards;
9 2 5 8
6 1 3 2
8 4 6 8
7 1 2 6
10 5 6 9
6 2 4 7
proc print;
I always print my data to verify that the numbers are correctly input and that the computer thinks the variables are the same ones I'm thinking of. It's easy to be off a column or two.
proc corr;
This will show the correlations between judges, one measure of interjudge reliability.
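Neither proc statement above names a dataset, so SAS works on the most recently created one. A slightly more defensive sketch names the dataset and the variables explicitly. The data= option and the var statement are additions for illustration and are not part of the original program.

```sas
/* Sketch: the same two steps, with the dataset named explicitly. */
proc print data=d1;      /* list the raw ratings, one row per target */
run;

proc corr data=d1;       /* Pearson correlations among the four judges */
  var r1-r4;
run;
```

Naming the dataset becomes more useful once a second dataset (d2) exists, because an unnamed proc would then read d2 rather than d1.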
data d2; set d1;
This statement creates a dataset called d2 and copies the contents of dataset d1 into it. The data were input with one row for each target and one column for each judge. This is typically how we would collect and enter data for such a design. Unfortunately, SAS and most ANOVA programs want the data in the format used by a regression program, like this:
Rating   Judge   Target
   9       1       1
   2       2       1
   5       3       1
   8       4       1
   6       1       2
   1       2       2
etc. We could reenter the data. I'm going to have the computer do this for us, however, because you may find yourself in a situation someday with a large dataset that needs to be rearranged, and it would be impossible to do it by hand. The following statements will rearrange the data.
array a r1-r4;
An array is a shorthand way for SAS to refer to a collection of variables. The statement says create an array (collection) of variables called a. Put the variables r1 through r4 into a.
do over a;
This is a do loop statement. It says do whatever follows once for each element in the array. So the first time thru the loop, a will be variable r1. The second time thru the loop a will be r2, and so forth until all elements (in this case, 4) have been processed.
rating = a;
This says create a variable called rating and set it equal to the value of a. The first time thru the loop rating will be r1, the second time thru it will be r2, and so forth.
judge = _I_;
SAS has some internal counters that it uses to keep track of things. _I_ is the internal counter for the do over loop; it holds the position of the current element in the array. In our case it will index from 1 to 4. The statement says to create a variable called judge and set it equal to _I_. This will label each judge for us to use in subsequent analysis.
target = _N_;
The internal counter _N_ is the record or observation number. In this case, there are 6 records or observations in the dataset, so _N_ will vary from 1 to 6. The statement says to create a variable called target (for the ratee or person rated) and set it equal to _N_. This will label each target for us to use in subsequent analysis.
output;
This statement says to write (output) a record.
end;
This statement says to end the do loop.
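The do over loop relies on SAS's older implicit-array syntax. The same rearrangement can be sketched with an explicitly subscripted array. The dataset name d2_alt, the array name r, and the keep statement are assumptions made for illustration; none of them appear in the original program.

```sas
/* Sketch: wide-to-long rearrangement with an explicit array. */
data d2_alt;                 /* hypothetical name, so d2 is left untouched */
  set d1;
  array r{4} r1-r4;          /* r{1} through r{4} refer to r1 through r4 */
  target = _N_;              /* row number of d1 = the target being rated */
  do judge = 1 to 4;         /* judge doubles as the array subscript */
    rating = r{judge};
    output;                  /* write one record per judge per target */
  end;
  keep rating judge target;  /* optional: leave r1-r4 out of the long file */
run;
```

Unlike the original step, this version drops r1-r4 from the output; removing the keep statement would retain them. If this step is run in addition to the original one, name the dataset on later proc statements (for example, data=d2), since an unnamed proc uses the most recently created dataset.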
proc print;
This shows what the new data set looks like.
proc glm;
GLM stands for the general linear model. This program is extremely powerful and flexible: it can be used to compute analysis of variance, regression, and analysis of covariance, and it handles unequal cell frequencies. It allows the user to specify error terms and construct linear contrasts. It slices, it dices... It does everything! I use it for most of my analyses. It isn't very efficient, so if you have space limitations on your computer, you can sometimes run other programs like ANOVA and REG if your data meet the requirements of these simpler, more efficient procedures (for example, you can use PROC ANOVA if you have equal numbers of observations in each cell). For most analyses, try GLM first and see if it works.
class judge target;
By default, GLM treats every variable as continuous. You have to tell the program which variables are nominal (class) variables. This statement tells GLM that the variables judge and target are categorical.
model rating = judge target judge*target;
The model statement tells GLM what the linear model is. This statement says the dependent variable is rating, and the independent variables are judge, target, and the judge by target interaction.
run;
The micro Windows version of SAS needs this statement to execute the last command.
Output File
OBS   R1   R2   R3   R4
  1    9    2    5    8
  2    6    1    3    2
  3    8    4    6    8
  4    7    1    2    6
  5   10    5    6    9
  6    6    2    4    7
This output was produced by the first proc print statement. It shows the input data from Shrout and Fleiss. Check to verify its accuracy (I did. It is.)
Correlation Analysis
4 'VAR' Variables: R1 R2 R3 R4
Simple Statistics
Variable   N    Mean    Std Dev     Sum    Minimum   Maximum
R1         6    7.667     1.633    46.00      6.00     10.00
R2         6    2.500     1.643    15.00      1.00      5.00
R3         6    4.333     1.633    26.00      2.00      6.00
R4         6    6.667     2.503    40.00      2.00      9.00
Pearson Correlation Coefficients / Prob > |R| under Ho: Rho=0 / N = 6
         R1       R2       R3       R4
R1     1.00     .745     .725     .750
       0.0      .089     .103     .086
R2     .745     1.00     .894     .729
       .089     0.0      .012     .100
R3     .725     .894     1.00     .718
       .103     .012     0.0      .108
R4     .750     .729     .718     1.00
       .086     .100     .108     0.0
This output was produced by proc corr. This analysis provides a first snapshot of our results. Note that the means across judges are quite different, but the correlations among the judges are all rather high.
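As a check on one entry, the correlation between judges 1 and 3 can be worked by hand from the raw ratings. The symbols x (ratings in R1) and y (ratings in R3) are notation introduced here for the illustration.

```latex
r_{13}
  = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}
  = \frac{9.667}{\sqrt{(13.333)(13.333)}}
  = .725
```

Here the mean of R1 is 7.667 and the mean of R3 is 4.333, as reported in the Simple Statistics table.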
OBS   R1   R2   R3   R4   RATING   JUDGE   TARGET
  1    9    2    5    8      9       1       1
  2    9    2    5    8      2       2       1
  3    9    2    5    8      5       3       1
  4    9    2    5    8      8       4       1
  5    6    1    3    2      6       1       2
  6    6    1    3    2      1       2       2
  7    6    1    3    2      3       3       2
  8    6    1    3    2      2       4       2
  9    8    4    6    8      8       1       3
 10    8    4    6    8      4       2       3
 11    8    4    6    8      6       3       3
 12    8    4    6    8      8       4       3
 13    7    1    2    6      7       1       4
 14    7    1    2    6      1       2       4
 15    7    1    2    6      2       3       4
 16    7    1    2    6      6       4       4
 17   10    5    6    9     10       1       5
 18   10    5    6    9      5       2       5
 19   10    5    6    9      6       3       5
 20   10    5    6    9      9       4       5
 21    6    2    4    7      6       1       6
 22    6    2    4    7      2       2       6
 23    6    2    4    7      4       3       6
 24    6    2    4    7      7       4       6
This output was produced by the second proc print statement. Notice how the data have been rearranged so that they are now in a format for a regression problem. Instead of four columns of dependent variable data, we now have one column called rating, and categorical variables that label each judge and target. These data can be analyzed by ANOVA.
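One quick way to confirm that the rearranged file holds exactly one rating for each judge-target combination is a crosstabulation. This is a sketch added for illustration, not a step from the original program; it assumes the long dataset is named d2, as above.

```sas
/* Sketch: count the observations in each judge-by-target cell. */
proc freq data=d2;
  tables judge*target / norow nocol nopercent;  /* show frequencies only */
run;
```

With these data every cell of the 4 by 6 table should show a frequency of 1.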
General Linear Models Procedure
Class Level Information
Class    Levels   Values
JUDGE       4     1 2 3 4
TARGET      6     1 2 3 4 5 6
This output was produced by proc glm. It first tells you what the categorical variables are and the number of levels and values for each.
Number of observations in data set = 24
Dependent Variable: RATING
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model             23         168.9583        7.3460      .         .
Error              0           .              .
Corrected Total   23         168.9583

R-Square   C.V.   Root MSE   Rating Mean
   1.000      0          0       5.29167

Source            DF   Type I SS     Mean Square   F Value   Pr > F
JUDGE              3     97.4583        32.4861       .         .
TARGET             5     56.2083        11.24167      .         .
JUDGE*TARGET      15     15.2917         1.0194       .         .

Source            DF   Type III SS   Mean Square   F Value   Pr > F
JUDGE              3     97.4583        32.4861       .         .
TARGET             5     56.2083        11.24167      .         .
JUDGE*TARGET      15     15.2917         1.0194       .         .
This is the main output of GLM. GLM reports two major analyses, one for the model as a whole, and one for each term in the model. The model as a whole is reported at the top. It says that there are 23 df for the model and 0 for error. The sum of squares for the whole model is 168.95833333 and the mean square for the whole model is 7.34601449. The F value and associated p level are missing (SAS uses a period to denote a missing value). The program goes on to list R-square, C.V., Root MSE and the mean of the dependent variable. The reason for the missing values and R-square of 1.00 is that there is only one observation per cell in this design. This means that there is no within cell error term. Another way of saying this is that the interaction and error terms are not separately estimable. We will use the interaction as our error term in the analyses that follow.
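Because the model uses up all 23 degrees of freedom, the model sum of squares is simply the sum of the three effects in the source table, which can be checked by hand:

```latex
SS_{\text{model}}
  = SS_{\text{judge}} + SS_{\text{target}} + SS_{\text{judge} \times \text{target}}
  = 97.4583 + 56.2083 + 15.2917
  = 168.9583

df_{\text{model}} = 3 + 5 + 15 = 23,
\qquad
MS_{\text{model}} = \frac{168.9583}{23} = 7.3460
```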
The second part of the analysis shows an ANOVA source table. It says, for example, that the sum of squares for the judge effect is 97.46 and its mean square is 32.49. Always use the Type III sums of squares. The Type III sums of squares are regression or semipartial sums of squares. They are the sums of squares that you get when you enter the variable into the equation last. The Type I and Type III sums of squares are equal when the independent variables are uncorrelated, that is, when there are equal numbers of observations in each cell. When there are unequal numbers of observations in each cell, the Type III sums of squares correspond to tests of the same hypotheses as when there are equal numbers of observations per cell. The Type I sums of squares no longer correspond to the same hypotheses, and are not usually of interest.
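Because the interaction will serve as the error term, one way to get F tests directly is to leave the interaction out of the model statement so that GLM pools it into the Error line. This is a sketch of that idea, not a step from the original program. The ss3 option is included only to limit the printout to the Type III sums of squares.

```sas
/* Sketch: main-effects model; judge*target becomes the residual (error). */
proc glm data=d2;
  class judge target;
  model rating = judge target / ss3;  /* ss3: print Type III SS only */
run;
quit;
```

With these data the Error line should then carry the 15 degrees of freedom and the sum of squares of 15.2917 that the full model attributed to JUDGE*TARGET, so the F values for judge and target are no longer missing.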