Validation
Campbell and Fiske concluded that if our measures are valid, we should expect to see a certain pattern emerge among the correlations in the MTMM matrix. (And don't you want to know what it is?) They discussed four properties of the MTMM necessary to show validity, but they didn't talk about reliability much. Before applying the Campbell and Fiske criteria to the MTMM, the first thing we need to do is look at the reliability diagonal. If the reliabilities are high and fairly even (e.g., all are .75 to .85), we can proceed with the Campbell and Fiske procedure. If they are low or very uneven (e.g., some are .45 and some are .85), we have some very serious measurement problems and probably ought to go back and do some development work before coming back to the MTMM. It is also possible to statistically correct the correlations using the now-familiar formula for disattenuation for unreliability. However, in general you should avoid using such corrections if the reliability is less than about .50.
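The correction for attenuation mentioned above divides an observed correlation by the square root of the product of the two reliabilities. Here is a minimal Python sketch of it. The function name `disattenuate` and the `min_reliability` argument are illustrative assumptions, with the .50 floor taken from the rule of thumb above, and the example values are hypothetical.

```python
import math

def disattenuate(r_xy, rel_x, rel_y, min_reliability=0.50):
    """Correct an observed correlation for unreliability in both measures.

    Standard correction for attenuation:
        r_corrected = r_xy / sqrt(rel_x * rel_y)
    Refuses to correct when either reliability falls below
    `min_reliability` (an assumed cutoff based on the ".50" rule of thumb).
    """
    if min(rel_x, rel_y) < min_reliability:
        raise ValueError("reliability too low to justify a correction")
    return r_xy / math.sqrt(rel_x * rel_y)

# Hypothetical values: observed r = .40, both reliabilities = .80.
corrected = disattenuate(0.40, 0.80, 0.80)
print(round(corrected, 2))  # 0.5
```

Raising an error below the cutoff, rather than returning a corrected value, is a design choice here: a correction built on a reliability of .45 would inflate the correlation by a large and poorly estimated factor.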
Properties
1. The entries in the validity diagonal should be large enough to mean something -- at a minimum they should be significantly different from zero. Note that in Table 8, all the entries in the validity diagonal are .8, which is pretty large and will be significant with any reasonable sample size. This condition is met by the example matrix.
2. Validities in the validity diagonal should be larger than the het-het entries in the same row and column. This should happen to a "reasonable degree" depending on sample size. The appropriate comparisons are shown in Table 9. For example, in the upper left heteromethod block the correlation between clinical anger and psychological test anger is .8. We will compare this to A1G2 (anger by test, guilt by clinician), A1D2 (anger by test, depression by clinician), A2G1 (anger by clinician, guilt by test), and A2D1 (anger by clinician, depression by test). The other comparisons are shown by the shaded portions of Table 9. We can see that this condition appears to be met for the example matrix. In each case the .8 validity diagonal values are greater than the other entries in the same row and column of the hets. The reason to expect such a relation is that SOMETHING SHOULD CORRELATE MORE HIGHLY WITH ITSELF THAN WITH SOMETHING ELSE. Anger should correlate more highly with anger than with guilt or depression, for example.
3. Entries in the validity diagonal should be larger than corresponding entries in the monomethod triangle (see Table 10). If we look at the VD entry for anger from psychological tests and clinical ratings (A2A1), we should compare it to the other entries in the monos dealing with anger. So in the clinical ratings we would look at A2G2 (anger and guilt by clinical ratings) and A2D2 (anger and depression by clinical ratings). We would also compare it to A1G1 (anger and guilt by psychological test) and A1D1 (anger and depression by psychological test). Note which entries in the monos correspond to which entries in the validity diagonal. Here again, the main principle is that SOMETHING SHOULD CORRELATE MORE HIGHLY WITH ITSELF THAN WITH SOMETHING ELSE. Here, however, the "something else" is measured with the same method instead of different methods. This condition is met by the example matrix.
4. Patterns of correlations should be the same in all the triangles, both in the heteromethod triangles and the monomethod triangles. That is, larger correlations should be larger correlations and smaller correlations should be smaller correlations, regardless of where they are found in the matrix. If you took the entries out of the matrix and lined them up in columns, one for each triangle, the larger correlations should fall in the same positions in every column. In the matrix shown in Table 8, the correlation between guilt and depression is always larger than the correlations between anger and guilt and anger and depression (never mind the personality theory behind this) so this condition is met.
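For the first property, "significantly different from zero" can be checked with any standard test of a correlation. The sketch below uses Fisher's z transformation with a normal approximation, which is one common choice and not necessarily the test Campbell and Fiske had in mind. The sample size of 30 is a hypothetical value, and `fisher_z_p_value` is an illustrative name.

```python
import math

def fisher_z_p_value(r, n):
    """Two-sided p-value for H0: rho = 0, using Fisher's z approximation.

    z = atanh(r) * sqrt(n - 3) is treated as standard normal under H0.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the two-sided tail area.
    return math.erfc(abs(z) / math.sqrt(2))

# A validity diagonal entry of .8 with a hypothetical sample of 30 people.
p = fisher_z_p_value(0.8, 30)
print(p < 0.001)  # True
```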
Table 1. MTMM Layout
|  | Psychological Test |  |  | Clinical Rating |  |  | Self Rating |  |  |
|---|---|---|---|---|---|---|---|---|---|
|  | A1 | G1 | D1 | A2 | G2 | D2 | A3 | G3 | D3 |
| Psych Test |  |  |  |  |  |  |  |  |  |
| A1 |  |  |  |  |  |  |  |  |  |
| G1 |  |  |  |  |  |  |  |  |  |
| D1 |  |  |  |  |  |  |  |  |  |
| Clinical Rating |  |  |  |  |  |  |  |  |  |
| A2 |  |  |  |  |  |  |  |  |  |
| G2 |  |  |  |  |  |  |  |  |  |
| D2 |  |  |  |  |  |  |  |  |  |
| Self Rating |  |  |  |  |  |  |  |  |  |
| A3 |  |  |  |  |  |  |  |  |  |
| G3 |  |  |  |  |  |  |  |  |  |
| D3 |  |  |  |  |  |  |  |  |  |
The traits are anger, guilt, and depression, each of which is measured with each of three methods, psychological test, clinical rating, and self rating.
Table 2. Main (Reliability) Diagonal
|  | Psychological Test |  |  | Clinical Rating |  |  | Self Rating |  |  |
|---|---|---|---|---|---|---|---|---|---|
|  | A1 | G1 | D1 | A2 | G2 | D2 | A3 | G3 | D3 |
| Psych Test (PT) |  |  |  |  |  |  |  |  |  |
| A1 | 1 (.80) |  |  |  |  |  |  |  |  |
| G1 |  | 1 (.80) |  |  |  |  |  |  |  |
| D1 |  |  | 1 (.80) |  |  |  |  |  |  |
| Clinical Rating (CR) |  |  |  |  |  |  |  |  |  |
| A2 |  |  |  | 1 (.80) |  |  |  |  |  |
| G2 |  |  |  |  | 1 (.80) |  |  |  |  |
| D2 |  |  |  |  |  | 1 (.80) |  |  |  |
| Self Rating (SR) |  |  |  |  |  |  |  |  |  |
| A3 |  |  |  |  |  |  | 1 (.80) |  |  |
| G3 |  |  |  |  |  |  |  | 1 (.80) |  |
| D3 |  |  |  |  |  |  |  |  | 1 (.80) |
The main diagonal of any correlation matrix contains entries equal to 1.0. This is the correlation of the variable literally with itself, for example, the correlation of the psychological test of anger with itself is by definition 1.0. It is common practice to replace the main diagonal entries with estimates of reliability, such as alpha or other estimates as appropriate.
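As one example of a reliability estimate that could go on the main diagonal, here is a minimal sketch of coefficient alpha computed from item scores. The item data are made up for illustration, and `cronbach_alpha` is a hypothetical helper name rather than a function from any particular package.

```python
from statistics import variance

def cronbach_alpha(items):
    """Coefficient alpha for a scale.

    `items` is a list of per-item score lists, one score per respondent,
    all of the same length:
        alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical 3-item anger scale answered by 4 respondents.
# The items are perfectly correlated, so alpha comes out at its ceiling of 1.
anger_items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 2, 3, 4],
]
alpha = cronbach_alpha(anger_items)
print(round(alpha, 2))  # 1.0
```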
Table 3. Monomethod Blocks
|  | Psychological Test |  |  | Clinical Rating |  |  | Self Rating |  |  |
|---|---|---|---|---|---|---|---|---|---|
|  | Anger 1 | Guilt 1 | Dep 1 | Anger 2 | Guilt 2 | Dep 2 | Anger 3 | Guilt 3 | Dep 3 |
| Psych Test |  |  |  |  |  |  |  |  |  |
| A1 | #1 |  |  |  |  |  |  |  |  |
| G1 |  | mono |  |  |  |  |  |  |  |
| D1 |  |  |  |  |  |  |  |  |  |
| Clinical Rating |  |  |  |  |  |  |  |  |  |
| A2 |  |  |  | #2 |  |  |  |  |  |
| G2 |  |  |  |  | method |  |  |  |  |
| D2 |  |  |  |  |  |  |  |  |  |
| Self Rating |  |  |  |  |  |  |  |  |  |
| A3 |  |  |  |  |  |  | #3 |  |  |
| G3 |  |  |  |  |  |  |  | blocks |  |
| D3 |  |  |  |  |  |  |  |  |  |
The monomethod blocks contain correlations among traits within methods. The first block contains all the correlations of psychological tests with other psychological tests. The second block contains the correlations among the various clinical ratings. The third block contains correlations among the various self ratings.
Table 4. Heteromethod Blocks
|  | Psychological Test |  |  | Clinical Rating |  |  | Self Rating |  |  |
|---|---|---|---|---|---|---|---|---|---|
|  | Anger 1 | Guilt 1 | Dep 1 | Anger 2 | Guilt 2 | Dep 2 | Anger 3 | Guilt 3 | Dep 3 |
| Psych Test |  |  |  |  |  |  |  |  |  |
| A1 |  |  |  |  |  |  |  |  |  |
| G1 |  |  |  |  |  |  |  |  |  |
| D1 |  |  |  |  |  |  |  |  |  |
| Clinical Rating |  |  |  |  |  |  |  |  |  |
| A2 | #1 |  |  |  |  |  |  |  |  |
| G2 |  | hetero |  |  |  |  |  |  |  |
| D2 |  |  |  |  |  |  |  |  |  |
| Self Rating |  |  |  |  |  |  |  |  |  |
| A3 | #2 |  |  | #3 |  |  |  |  |  |
| G3 |  | method |  |  | blocks |  |  |  |  |
| D3 |  |  |  |  |  |  |  |  |  |
The heteromethod blocks contain correlations among traits across methods. The first block contains correlations of traits measured by psychological tests and clinical ratings. The second block contains correlations between psychological tests and self ratings. The third block contains correlations between clinical ratings and self ratings.
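Because the variables are ordered method by method, both kinds of block are just 3 x 3 slices of the full 9 x 9 matrix. The sketch below illustrates this with a grid of cell labels instead of correlations. The names `VARIABLES`, `GRID`, and `block` are illustrative assumptions, and method indices are 0-based.

```python
TRAITS = ["A", "G", "D"]   # anger, guilt, depression
METHODS = [1, 2, 3]        # psychological test, clinical rating, self rating

# Variable order matches the tables: A1, G1, D1, A2, G2, D2, A3, G3, D3.
VARIABLES = [f"{t}{m}" for m in METHODS for t in TRAITS]

# A 9 x 9 grid whose cells hold "row variable + column variable" labels.
GRID = [[row + col for col in VARIABLES] for row in VARIABLES]

def block(matrix, row_method, col_method, n_traits=len(TRAITS)):
    """Return the n_traits x n_traits block for a pair of methods (0-based)."""
    r0, c0 = row_method * n_traits, col_method * n_traits
    return [row[c0:c0 + n_traits] for row in matrix[r0:r0 + n_traits]]

mono_psych = block(GRID, 0, 0)     # monomethod block #1: test with test
hetero_first = block(GRID, 1, 0)   # heteromethod block #1: clinical rating with test

for row in hetero_first:
    print(row)
# ['A2A1', 'A2G1', 'A2D1']
# ['G2A1', 'G2G1', 'G2D1']
# ['D2A1', 'D2G1', 'D2D1']
```

The same `block` function works on a matrix of actual correlations, as long as its rows and columns follow the order in `VARIABLES`.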
Table 5. Monomethod Triangles
|  | Psychological Test |  |  | Clinical Rating |  |  | Self Rating |  |  |
|---|---|---|---|---|---|---|---|---|---|
|  | Anger 1 | Guilt 1 | Dep 1 | Anger 2 | Guilt 2 | Dep 2 | Anger 3 | Guilt 3 | Dep 3 |
| Psych Test |  |  |  |  |  |  |  |  |  |
| A1 | #1 |  |  |  |  |  |  |  |  |
| G1 | .4 | mono |  |  |  |  |  |  |  |
| D1 | .4 | .5 |  |  |  |  |  |  |  |
| Clinical Rating |  |  |  |  |  |  |  |  |  |
| A2 |  |  |  | #2 |  |  |  |  |  |
| G2 |  |  |  | .4 | method |  |  |  |  |
| D2 |  |  |  | .4 | .5 |  |  |  |  |
| Self Rating |  |  |  |  |  |  |  |  |  |
| A3 |  |  |  |  |  |  | #3 |  |  |
| G3 |  |  |  |  |  |  | .4 | blocks |  |
| D3 |  |  |  |  |  |  | .4 | .4 |  |
The monomethod triangles contain the unique information among traits within methods. The monomethod blocks are symmetric, so we only show the bottom triangles.
Table 6. Validity Diagonal
|  | Psychological Test |  |  | Clinical Rating |  |  | Self Rating |  |  |
|---|---|---|---|---|---|---|---|---|---|
|  | Anger 1 | Guilt 1 | Dep 1 | Anger 2 | Guilt 2 | Dep 2 | Anger 3 | Guilt 3 | Dep 3 |
| Psych Test |  |  |  |  |  |  |  |  |  |
| A1 |  |  |  |  |  |  |  |  |  |
| G1 |  |  |  |  |  |  |  |  |  |
| D1 |  |  |  |  |  |  |  |  |  |
| Clinical Rating |  |  |  |  |  |  |  |  |  |
| A2 | .8 |  |  |  |  |  |  |  |  |
| G2 |  | .8 |  |  |  |  |  |  |  |
| D2 |  |  | .8 |  |  |  |  |  |  |
| Self Rating |  |  |  |  |  |  |  |  |  |
| A3 | .8 |  |  | .8 |  |  |  |  |  |
| G3 |  | .8 |  |  | .8 |  |  |  |  |
| D3 |  |  | .8 |  |  | .8 |  |  |  |
The validity diagonal is a particularly interesting part of the MTMM matrix. The VD entries show the correlation of the same trait across different methods. For example, the first entry shows the correlation of anger as measured by a psychological test with anger as measured by a clinical rating. The last of the entries shows the correlation of depression as measured by a clinical rating with depression as measured by a self rating.
Table 7. Heterotrait Heteromethod (het-het) Triangles
|  | Psychological Test |  |  | Clinical Rating |  |  | Self Rating |  |  |
|---|---|---|---|---|---|---|---|---|---|
|  | Anger 1 | Guilt 1 | Dep 1 | Anger 2 | Guilt 2 | Dep 2 | Anger 3 | Guilt 3 | Dep 3 |
| Psych Test |  |  |  |  |  |  |  |  |  |
| A1 |  |  |  |  |  |  |  |  |  |
| G1 |  |  |  |  |  |  |  |  |  |
| D1 |  |  |  |  |  |  |  |  |  |
| Clinical Rating |  |  |  |  |  |  |  |  |  |
| A2 |  | .3 | .3 |  |  |  |  |  |  |
| G2 | .3 |  | .4 |  |  |  |  |  |  |
| D2 | .3 | .4 |  |  |  |  |  |  |  |
| Self Rating |  |  |  |  |  |  |  |  |  |
| A3 |  | .3 | .3 |  | .3 | .3 |  |  |  |
| G3 | .3 |  | .4 | .3 |  | .4 |  |  |  |
| D3 | .3 | .4 |  | .3 | .4 |  |  |  |  |
The het-het triangles contain correlations of different traits measured by different methods. They are not necessarily symmetrical (the correlation of psychological test anger with clinical ratings of guilt need not equal the correlation of the clinical rating of anger with the psychological test of guilt). Now let's review the pieces of the MTMM.
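As a quick review of the pieces, the sketch below sorts every unique cell of a 3-trait by 3-method matrix into the region it belongs to, based only on whether the two variables share a trait and a method. The function name `classify` and the region labels are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations_with_replacement

TRAITS = ["A", "G", "D"]   # anger, guilt, depression
METHODS = [1, 2, 3]        # psychological test, clinical rating, self rating
VARIABLES = [(trait, method) for method in METHODS for trait in TRAITS]

def classify(var_a, var_b):
    """Name the MTMM region for the correlation between two (trait, method) variables."""
    (trait_a, method_a), (trait_b, method_b) = var_a, var_b
    if var_a == var_b:
        return "reliability diagonal"   # same trait, same method
    if method_a == method_b:
        return "monomethod triangle"    # different traits, same method
    if trait_a == trait_b:
        return "validity diagonal"      # same trait, different methods
    return "het-het triangle"           # different traits, different methods

# Count the cells on and below the main diagonal (45 for a 9 x 9 matrix).
counts = Counter(
    classify(a, b) for a, b in combinations_with_replacement(VARIABLES, 2)
)
for region, n_cells in counts.items():
    print(region, n_cells)
```

With three traits and three methods, this gives 9 reliability entries, 9 monomethod entries, 9 validity entries, and 18 het-het entries.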
Table 8. A Sample MTMM
|  | Psychological Test |  |  | Clinical Rating |  |  | Self Rating |  |  |
|---|---|---|---|---|---|---|---|---|---|
|  | Anger 1 | Guilt 1 | Dep 1 | Anger 2 | Guilt 2 | Dep 2 | Anger 3 | Guilt 3 | Dep 3 |
| Psych Test |  |  |  |  |  |  |  |  |  |
| A1 | 1 |  |  |  |  |  |  |  |  |
| G1 | .4 | 1 |  |  |  |  |  |  |  |
| D1 | .4 | .5 | 1 |  |  |  |  |  |  |
| Clinical Rating |  |  |  |  |  |  |  |  |  |
| A2 | .8 | .3 | .3 | 1 |  |  |  |  |  |
| G2 | .3 | .8 | .3 | .4 | 1 |  |  |  |  |
| D2 | .3 | .4 | .8 | .4 | .5 | 1 |  |  |  |
| Self Rating |  |  |  |  |  |  |  |  |  |
| A3 | .8 | .3 | .3 | .8 | .3 | .3 | 1 |  |  |
| G3 | .3 | .8 | .3 | .3 | .8 | .3 | .4 | 1 |  |
| D3 | .3 | .4 | .8 | .3 | .4 | .8 | .4 | .5 | 1 |
Table 9. Campbell & Fiske Criterion 2
|  | Anger 1 | Guilt 1 | Dep 1 | Anger 2 | Guilt 2 | Dep 2 | Anger 3 | Guilt 3 | Dep 3 |
|---|---|---|---|---|---|---|---|---|---|
| Psych Test |  |  |  |  |  |  |  |  |  |
| A1 | 1 |  |  |  |  |  |  |  |  |
| G1 | .4 | 1 |  |  |  |  |  |  |  |
| D1 | .4 | .5 | 1 |  |  |  |  |  |  |
| Clinical Rating |  |  |  |  |  |  |  |  |  |
| A2 | .8 | .3 | .3 | 1 |  |  |  |  |  |
| G2 | .3 | .8 | .3 | .4 | 1 |  |  |  |  |
| D2 | .3 | .4 | .8 | .4 | .5 | 1 |  |  |  |
| Self Rating |  |  |  |  |  |  |  |  |  |
| A3 | .8 | .3 | .3 | .8 | .3 | .3 | 1 |  |  |
| G3 | .3 | .8 | .3 | .3 | .8 | .3 | .4 | 1 |  |
| D3 | .3 | .4 | .8 | .3 | .4 | .8 | .4 | .5 | 1 |
The second criterion states that entries in the validity diagonal should be larger than those in the heteromethod block in the same row and column. This is indicated once for each block to show the proper comparisons. However, all three comparisons should be made within each block, for a total of 9 for this matrix.
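The comparison for the second criterion can be sketched for a single heteromethod block. The values below are the clinical rating by psychological test block of the sample matrix in Table 8. The function name is hypothetical, and the strict "larger than" test ignores sampling error, so it is only a rough stand-in for the "reasonable degree" judgment described earlier.

```python
# Rows: clinical rating (A2, G2, D2). Columns: psychological test (A1, G1, D1).
HETERO_BLOCK = [
    [0.8, 0.3, 0.3],
    [0.3, 0.8, 0.3],
    [0.3, 0.4, 0.8],
]

def criterion_2_failures(block):
    """Return (trait_index, validity, offending_value) for every het-het entry
    in the same row or column that is not smaller than the validity entry."""
    failures = []
    n = len(block)
    for i in range(n):
        validity = block[i][i]
        same_row = [block[i][j] for j in range(n) if j != i]
        same_col = [block[j][i] for j in range(n) if j != i]
        for value in same_row + same_col:
            if value >= validity:
                failures.append((i, validity, value))
    return failures

print(criterion_2_failures(HETERO_BLOCK))  # []
```

An empty list means every validity diagonal entry in the block beats its row and column neighbors. Running the same function on the other two heteromethod blocks completes the nine comparisons.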
Table 10. Campbell & Fiske Criterion 3
|  | A1 | G1 | D1 | A2 | G2 | D2 | A3 | G3 | D3 |
|---|---|---|---|---|---|---|---|---|---|
| A1 | 1 | .4 | .4 | .8 | .3 | .3 | .8 | .3 | .3 |
| G1 | .4 | 1 | .5 | .3 | .8 | .3 | .3 | .8 | .3 |
| D1 | .4 | .5 | 1 | .3 | .4 | .8 | .3 | .4 | .8 |
| A2 | .8 | .3 | .3 | 1 | .4 | .4 | .8 | .3 | .3 |
| G2 | .3 | .8 | .3 | .4 | 1 | .5 | .3 | .8 | .3 |
| D2 | .3 | .4 | .8 | .4 | .5 | 1 | .3 | .4 | .8 |
| A3 | .8 | .3 | .3 | .8 | .3 | .3 | 1 | .4 | .4 |
| G3 | .3 | .8 | .3 | .3 | .8 | .3 | .4 | 1 | .5 |
| D3 | .3 | .4 | .8 | .3 | .4 | .8 | .4 | .5 | 1 |
Campbell and Fiske's third criterion states that the entries in the validity diagonal should be larger than the relevant entries in the monomethod blocks. A thing should correlate more highly with itself than with something else. The relevant entries are determined by the trait of interest. For example, when comparing psychological test anger and clinical rating anger (A1 vs. A2, .8 in the VD, green) we look first in the psychological test mono and find the two correlations that concern anger (shaded green). We notice that .8 is greater than .4 and .4, which is good. We then compare the VD entry to the entries in the mono for clinical ratings that also are concerned with anger (shown again in green). Again we find that .8 is greater than .4 and .4. Next, guilt by clinical rating is compared to guilt by self rating (G2 vs. G3, .8 shaded red). We compare the .8 entry to the relevant monos for clinical rating (shaded red) and find that .8 is greater than .4 and .5 as desired. Then we proceed to the self rating mono and find the relevant entries (red) and find again that .8 is greater than .4 and .5. Finally we consider clinical rating of depression and self rating of depression (D2 vs. D3, yellow). We compare the VD .8 to the monos shaded yellow and find that .8 is larger than .4 and .5 as desired by the Campbell & Fiske criteria. Although I've only shown three sets of comparisons here, you would actually carry out all nine, three traits for each of the three validity diagonals.
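The same comparison can be sketched in code for one validity diagonal, here psychological test with clinical rating. The correlations come from the sample matrix in Table 8. The dictionary layout and the function name `criterion_3_holds` are illustrative assumptions.

```python
# Validity diagonal entries for psychological test x clinical rating.
VALIDITY = {"A": 0.8, "G": 0.8, "D": 0.8}

# Monomethod triangles for the two methods involved, keyed by trait pair.
MONO_TRIANGLES = {
    "psych test": {("A", "G"): 0.4, ("A", "D"): 0.4, ("G", "D"): 0.5},
    "clinical rating": {("A", "G"): 0.4, ("A", "D"): 0.4, ("G", "D"): 0.5},
}

def criterion_3_holds(trait, validity, mono_triangles):
    """True if the trait's validity entry exceeds every monomethod
    correlation that involves that trait."""
    for triangle in mono_triangles.values():
        relevant = [r for pair, r in triangle.items() if trait in pair]
        if any(r >= validity[trait] for r in relevant):
            return False
    return True

results = {t: criterion_3_holds(t, VALIDITY, MONO_TRIANGLES) for t in VALIDITY}
print(results)  # {'A': True, 'G': True, 'D': True}
```

Repeating this for the other two validity diagonals, each paired with its own two monomethod triangles, gives the full set of nine comparisons.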
The last of the Campbell & Fiske criteria is that the pattern is the same throughout the matrix, so that large correlations appear in the same relative places throughout. You can string them out and look, or you can just look for the largest and smallest. In our example matrix, Guilt and Depression always correlate more highly with each other than with Anger, thus satisfying the final requirement.
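One way to "string them out and look" is to check that no two triangles order a pair of correlations in opposite directions. This "no strict reversals" rule is an assumed operationalization of the fourth criterion, not Campbell and Fiske's own wording. The values below are the monomethod triangles and the lower het-het triangles of Table 8.

```python
from itertools import combinations

# Each triangle maps a trait pair to its correlation.
TRIANGLES = {
    "mono psych test":         {"AG": 0.4, "AD": 0.4, "GD": 0.5},
    "mono clinical rating":    {"AG": 0.4, "AD": 0.4, "GD": 0.5},
    "mono self rating":        {"AG": 0.4, "AD": 0.4, "GD": 0.5},
    "het-het test x clinical": {"AG": 0.3, "AD": 0.3, "GD": 0.4},
    "het-het test x self":     {"AG": 0.3, "AD": 0.3, "GD": 0.4},
    "het-het clinical x self": {"AG": 0.3, "AD": 0.3, "GD": 0.4},
}

def pattern_is_consistent(triangles):
    """True if no two triangles rank any two trait pairs in opposite order.

    Ties are allowed; only strict reversals count as a violation.
    """
    for first, second in combinations(triangles.values(), 2):
        for p, q in combinations(first, 2):
            if (first[p] - first[q]) * (second[p] - second[q]) < 0:
                return False
    return True

print(pattern_is_consistent(TRIANGLES))  # True
```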