Elementary Applications of the Variance Covariance Matrix
1. Variance and number of items.
The variance of a test increases as the number of items increases. If the covariances among the items are positive, the variance of the test increases rapidly, because the number of item variances grows as N, but the number of unique covariances grows as N*(N-1)/2, and the total number of covariance terms, those that count in the variance of the composite, grows as N*(N-1), that is, as N^2 - N.
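As a quick check on these counts, here is a minimal sketch in Python. The function name `covariance_counts` is made up for this illustration and is not part of the original notes.

```python
def covariance_counts(n_items):
    """Return (variances, unique covariances, total covariance terms)."""
    unique = n_items * (n_items - 1) // 2
    # Each unique covariance appears twice in the full matrix,
    # once above and once below the main diagonal.
    total = n_items * (n_items - 1)
    return n_items, unique, total

for n in (2, 5, 10, 100):
    variances, unique, total = covariance_counts(n)
    print(f"N={n}: {variances} variances, {unique} unique covariances, "
          f"{total} covariance terms in the composite")
```

The number of variances grows linearly in N, while the covariance terms grow roughly as N squared.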
2. Covariance and the meaning of test scores.
Most of the variance of the composite (total test score distribution) comes from the covariances of the items, not from the variances of the items. (This was suggested in point 1 above by the relative numbers of variances and covariances in a composite.) Let's assume that each item has one unit of variance and that the covariance for each pair of items is .3 (this is based on a typical cognitive test inter-item correlation matrix). Let's look at the variance of several tests of various lengths and note the variance contributed by the item variances and by the item covariances:
| Test length in items | Variance due to item variances | Variance due to item covariances | Proportion of total due to covariances |
|---|---|---|---|
| 5 | 5 | 6 | .55 |
| 10 | 10 | 27 | .73 |
| 25 | 25 | 180 | .88 |
| 50 | 50 | 735 | .94 |
| 100 | 100 | 2970 | .97 |
This table shows that with tests of reasonable length, the vast majority of the variance in the total test score is due to item covariances. The exact proportions will depend on the test length and the actual values of variance and covariance. The results here pertain to cognitive ability tests (e.g., the SAT). Attitude scales usually attain high proportions of total variance due to item covariance in fewer items because the item covariances tend to be much larger in attitude scales than in cognitive ability tests.
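The table entries can be reproduced with a short calculation. The sketch below assumes, as in the text, unit item variances and a covariance of .3 for every pair of items; the function name `composite_variance_parts` is hypothetical.

```python
def composite_variance_parts(n_items, item_var=1.0, item_cov=0.3):
    """Split the composite variance into its variance and covariance parts."""
    from_variances = n_items * item_var
    # N*(N-1) covariance terms count toward the variance of the sum.
    from_covariances = n_items * (n_items - 1) * item_cov
    proportion = from_covariances / (from_variances + from_covariances)
    return from_variances, from_covariances, proportion

for n in (5, 10, 25, 50, 100):
    v, c, p = composite_variance_parts(n)
    print(f"{n:>4} items: variances {v:>6.0f}, covariances {c:>6.0f}, "
          f"proportion {p:.3f}")
```

Raising `item_cov` (as might be appropriate for an attitude scale) shows how the proportion due to covariances becomes large with fewer items.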
This means that test scores are given meaning by item covariances, that is, by what the test items have in common. It is the common part of the items that gets counted over and over again, while the part unique to each item gets averaged out over items and disappears.
3. Nominal and Effective Weights.
Nominal weights are coefficients applied to item scores, usually unit weights. Weights can be numbers like 1, 2, 4 and 10 that indicate the relative importance of the items. For example, in assigning grades in class, a teacher may weight each of two tests 40 and a paper 20 for a total of 100 points.
Effective weights reflect the relative contribution of the item to the position on the composite. In other words, effective weights are proportional to the correlation or regression of the item to the total score.
Variance as a weight -- Suppose three test scores are added to make the final score in a class. The final grade will be based on the curve or distribution of the sum of the three tests. Suppose everyone scores the same (70) on the first two tests, and there is a range of scores on the third, where the mean is 70 and the standard deviation is 5.
Q1: What is the mean on the final score?
Q2: What is the standard deviation of the final score?
--> Note that the final grade is determined solely by the third test; the first two don't count because they add nothing to determining the final rank order of people in the class. Here the nominal weights are 1, 1, 1, but the effective weights are essentially 0, 0, 1. Effective weights are influenced by (are a function of) the variance of items. Items with larger variances have greater effective weights, other things being equal.
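The point can be checked with a small made-up class. The four students and their scores below are hypothetical; they were chosen only so that the third test has mean 70 and (population) standard deviation 5, as in the example.

```python
from statistics import mean, pstdev

# Hypothetical scores for four students (not from the original notes).
# Tests 1 and 2 are constant at 70; test 3 has mean 70 and SD 5.
test1 = [70, 70, 70, 70]
test2 = [70, 70, 70, 70]
test3 = [63, 69, 71, 77]

def composite(weights, tests):
    """Weighted sum of the test scores for each student."""
    return [sum(w * score for w, score in zip(weights, student))
            for student in zip(*tests)]

def rank_order(scores):
    """Student indices sorted from lowest to highest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i])

final = composite((1, 1, 1), [test1, test2, test3])

print("mean of final score:", mean(final))
print("SD of final score:", pstdev(final))
print("same rank order as test 3:", rank_order(final) == rank_order(test3))
```

Because the first two tests have zero variance, they shift every final score by the same constant and leave the rank order to the third test alone.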
A. Picking weights by judgment. Instead of unit weights, you can use other weights that you pick through one method or another. The simplest way is through judgment. For example, you could weight the tests above 1, 2, and 3. The variance of a distribution where each element is multiplied by k is k-squared times the original variance. Notice that in the above example, changing the nominal weights would have no effect on the effective weights, although the mean and variance of the final score would be different.
Suppose we changed the nominal weights from 1, 1, 1 to 1, 2, 3.
Q3: What would the mean of the new final distribution be?
Q4: What would the standard deviation be?
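To experiment with these two questions, the sketch below applies the nominal weights 1, 2, 3 to a hypothetical set of scores (two constant tests and a third test with mean 70 and standard deviation 5; the individual scores are invented for illustration) and checks the k-squared rule for the variance.

```python
from statistics import mean, pstdev

# Hypothetical scores (not from the original notes): two constant tests
# and one test with mean 70 and SD 5.
tests = [[70, 70, 70, 70], [70, 70, 70, 70], [63, 69, 71, 77]]
weights = (1, 2, 3)

# Multiply every score on a test by that test's nominal weight.
weighted = [[w * s for s in scores] for w, scores in zip(weights, tests)]
# The final score for each student is the sum of the weighted scores.
final = [sum(student) for student in zip(*weighted)]

k = weights[2]
sd_test3 = pstdev(tests[2])
sd_weighted_test3 = pstdev(weighted[2])

print("SD of test 3 before and after weighting:", sd_test3, sd_weighted_test3)
print("ratio of variances (k squared):", (sd_weighted_test3 / sd_test3) ** 2)
print("mean and SD of weighted final score:", mean(final), pstdev(final))
```

Only the weight on the third test changes the spread of the final score, since the first two tests have no variance to rescale.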
Remember: Effective weights depend upon item variances.
Effective weights also depend on item covariances.
Q5: If we have two items with variance 1 and perfect correlation 1.0, what will the variance of the composite be?
Suppose we have three items with variance 1, and two of the items are perfectly correlated with each other, but uncorrelated with the third. The variance of the composite will be 5 (three variances of 1, plus two covariance terms of 1 for the perfectly correlated pair). But note that the three items could be considered two items, one with variance 1, and the other with variance 4. In the final composite, either item 1 or item 2 has a strong impact on the rank order, and item 3 has a lesser impact. Item 2 has no impact above and beyond item 1, nor does item 1 have impact above and beyond item 2, but item 3 breaks all the ties and so has impact beyond items 1 and 2.
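The variance of a composite is the sum of every element of the variance covariance matrix, each multiplied by the weights of its row and column. The following sketch applies that rule to the matrices described above; the function name `composite_variance` is an assumption made for this example.

```python
def composite_variance(vcv, weights=None):
    """Variance of a weighted sum, given the variance covariance matrix."""
    n = len(vcv)
    if weights is None:
        weights = [1] * n  # unit (nominal) weights
    return sum(weights[i] * weights[j] * vcv[i][j]
               for i in range(n) for j in range(n))

# Two items with variance 1 and correlation 1.0.
two_items = [[1, 1],
             [1, 1]]

# Three items: items 1 and 2 perfectly correlated, item 3 uncorrelated.
three_items = [[1, 1, 0],
               [1, 1, 0],
               [0, 0, 1]]

# The same composite viewed as two uncorrelated items with variances 4 and 1.
merged = [[4, 0],
          [0, 1]]

print(composite_variance(two_items))
print(composite_variance(three_items))
print(composite_variance(merged))
```

The last two calls give the same value, which is the sense in which the three items behave like two.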
B. Another method for picking weights: multiple regression. This explicitly considers both the variances and the covariances of the items (variables). Unfortunately, you can't use it without some criterion variable (Y). Example of use: Ray Christal and job evaluation. Regression makes the independent variables unit variance and uncorrelated by using R^-1, the inverse of R, the correlation matrix of the independent variables. R times its inverse is I, the identity matrix, which has 1.0 on the main diagonal and zeros elsewhere. I is the matrix implicitly assumed by people when they assign nominal weights. Nominal weights only work as intended when the variables to be transformed have a structure that is essentially the identity matrix.
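As an illustration of how the inverse of R enters, the sketch below uses the standard result that standardized regression weights equal R^-1 times the vector of correlations between each predictor and the criterion. The notes do not state this formula, and the correlations used here (.5 between the two predictors, .6 and .4 with Y) are invented values; the helper names are likewise hypothetical.

```python
def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular matrix: predictors are perfectly correlated")
    return [[d / det, -b / det],
            [-c / det, a / det]]

def mat_vec(m, v):
    """Matrix times vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def mat_mul(a, b):
    """Matrix times matrix."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Hypothetical correlation matrix of two predictors, and their
# hypothetical correlations with the criterion Y.
R = [[1.0, 0.5],
     [0.5, 1.0]]
r_xy = [0.6, 0.4]

R_inv = invert_2x2(R)
betas = mat_vec(R_inv, r_xy)      # standardized regression weights
identity = mat_mul(R, R_inv)      # should be (numerically) the identity

print("standardized weights:", [round(b, 3) for b in betas])
print("R times its inverse:", [[round(x, 3) for x in row] for row in identity])
```

If the off-diagonal entry of `R` is set to 0, `R_inv` is the identity and the weights are simply the correlations with Y, which is the case in which nominal weights behave as intended.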
C. Research with weights -- as the variables become increasingly positively correlated, weights become increasingly meaningless: they do not affect the rank order on the composite, which is, at the limit, a positive monotonic transformation of any one of the variables. As the variables become less correlated, or negatively correlated, weights matter. People are usually surprised about the outcome rank order if they pick nominal weights according to subjective importance, because the effective weights depend on quantities which people do not ordinarily consider, that is, the variances and the covariances.
Remember: Effective weights depend upon the variance covariance matrix (VCV) of the things to be weighted. If the covariances are all large and positive, weights don't make any difference. All weights result in the same rank ordering. If the VCV is essentially an identity matrix, nominal weights will work properly. If the variances and covariances do not fit either of the two preceding patterns (often the case in practice), nominal weights will not usually work well.
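One way to see the first two patterns is to compute the correlation between two composites built from the same variables with different nominal weights. This is a rough proxy for agreement in rank order, not a direct count of rank changes. The matrices and weights below are invented for illustration, and the function names are hypothetical.

```python
from math import sqrt

def bilinear(vcv, w1, w2):
    """Covariance between two weighted sums of the same variables."""
    n = len(vcv)
    return sum(w1[i] * w2[j] * vcv[i][j] for i in range(n) for j in range(n))

def composite_correlation(vcv, w1, w2):
    """Correlation between the composites formed with weights w1 and w2."""
    return bilinear(vcv, w1, w2) / sqrt(
        bilinear(vcv, w1, w1) * bilinear(vcv, w2, w2))

def equicorrelated(n, cov):
    """Unit variances on the diagonal, the same covariance everywhere else."""
    return [[1.0 if i == j else cov for j in range(n)] for i in range(n)]

unit = (1, 1, 1)
lopsided = (10, 1, 1)

for label, cov in (("large positive covariances", 0.9),
                   ("identity matrix", 0.0)):
    r = composite_correlation(equicorrelated(3, cov), unit, lopsided)
    print(f"{label}: correlation between composites = {r:.3f}")
```

With covariances of .9 the two composites correlate at roughly .98, so the choice of weights hardly matters; with an identity matrix the correlation drops to roughly .69, so the weights change the result.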
A1: 210
A2: 5 (there are no covariance terms to add)
A3: 1*70 + 2*70 + 3*70 = 70 + 140 + 210 = 420.
A4: SD = 5, so var = 25; k = 3, so k squared = 9; new var = 25*9 = 225 and SD = sqrt(225) = 15 (equivalently, 5*3 = 15).
A5: 1 + 1 + 2*rho*sigma1*sigma2 = 1 + 1 + 2(1)(1)(1) = 4