Variability
We are interested in variables, things that vary. Variables can take on two or more values over a group of people at one or more times; that is, variables have distributions.
Distributions can be graphed, or they can be summarized by a few numbers. Such summary numbers are called statistics (when they are computed from a sample) or parameters (when they describe a population). The two most important summary numbers describe the two most important characteristics of a distribution, central tendency and variability. Central tendency tells us about the middle of the distribution. Variability tells us about the spread of the distribution about the middle.

Two distributions with the same middle and different spread:
Perhaps ratings by men and by women of other people's taste in dress (women perceive a greater range, both positive and negative). Perhaps two types of sales salaries that differ in the percentage of pay due to commissions. Perhaps yield in bushels per acre for two strains of corn, one of which is more sensitive to soil conditions than the other. Perhaps the lifetimes of two types of lightbulbs made with different filaments. They have the same expected life of use, but one is more of a gamble than the other.
Deviation from the mean (Spread)

Little x is the deviation from the mean. It is found by subtracting the mean from the score, that is, $x = X - \mu$, where $\mu$ is the mean. We subtract the mean from the score, rather than the score from the mean, so that the deviation is positive when the score is larger than the mean and negative when the score is smaller than the mean.
Raw scores (X)
|   |   | 9 |    |    |
|   | 8 | 9 | 10 |    |
| 7 | 8 | 9 | 10 | 11 |
The mean is 9 (81/9). If we subtract 9, we get:
Deviation scores (x)
|    |    | 0 |   |   |
|    | -1 | 0 | 1 |   |
| -2 | -1 | 0 | 1 | 2 |
The mean has the property that the sum of deviations from the mean is zero, that is
$$\sum x = \sum (X - \mu) = 0$$
The mean always has this property. Deviations from the median and mode only sum to zero when the median and mode equal the mean. (That is, by accident; the mode and median usually don't have this property).
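As a quick check of this property, here is a minimal Python sketch using the nine raw scores from the example. The variable names and the second, skewed set of scores are made up for illustration and are not part of the original notes.

```python
from statistics import mean, median

# The nine raw scores from the example above.
scores = [9, 8, 9, 10, 7, 8, 9, 10, 11]

m = mean(scores)                        # 81 / 9 = 9
deviations = [X - m for X in scores]    # little x = X - mean

print(deviations)
print(sum(deviations))                  # 0

# A skewed set of scores (hypothetical): the median is 3 but the mean is 6.
skewed = [1, 2, 3, 4, 20]
dev_from_median = [X - median(skewed) for X in skewed]
print(sum(dev_from_median))             # 15, not zero
```

In the skewed set the deviations from the median do not cancel, because the median and the mean differ.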
Variability aka Dispersion
We will describe 4 measures: the range, the average deviation, the variance, and the standard deviation.
The Range
Range = high score - low score
12 14 14 16 16 18 20 -- range = 20-12 = 8
A familiar example is the pair of extreme ages of graduating seniors mentioned at commencement (e.g., 20 vs. 71, range = 51).
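A minimal Python sketch of the range for the seven scores listed above (the variable names are illustrative):

```python
# The seven scores from the range example.
scores = [12, 14, 14, 16, 16, 18, 20]

# Range = high score - low score
score_range = max(scores) - min(scores)
print(score_range)  # 8
```

Only the two extreme scores enter the calculation, so the range ignores how the other scores are spread out.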
Average Deviation
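The sum of deviations about the mean is zero, that is, $\sum (X - \mu) = 0$. But we can take the absolute values of the deviations, and their sum will be larger than zero for any distribution that varies. The average of these absolute deviations is defined as the average deviation. For ungrouped data:
$$AD = \frac{\sum |X - \mu|}{N}$$
and for grouped data:
$$AD = \frac{\sum f\,|X - \mu|}{N}$$
where $\mu$ is the mean, $f$ is the frequency of each distinct score, and $N$ is the total number of scores.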
The average deviation has intuitive meaning, but it is not widely used because it is not friendly to math types. Instead of taking the absolute value of deviations, math types square the deviations to deal with the problem of negative deviations.
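To make the average deviation concrete, here is a small Python sketch that takes the mean of the absolute deviations from the mean, once for ungrouped scores and once for scores grouped by frequency. The function names are hypothetical, chosen only for this illustration, and the data are the nine raw scores from the earlier example.

```python
from collections import Counter
from statistics import mean

def average_deviation(scores):
    """Mean absolute deviation from the mean, for ungrouped scores."""
    m = mean(scores)
    return sum(abs(X - m) for X in scores) / len(scores)

def average_deviation_grouped(freq):
    """Same quantity for grouped data; freq maps each distinct score to its frequency f."""
    n = sum(freq.values())
    m = sum(f * X for X, f in freq.items()) / n
    return sum(f * abs(X - m) for X, f in freq.items()) / n

scores = [9, 8, 9, 10, 7, 8, 9, 10, 11]
print(average_deviation(scores))                    # 8/9, about 0.89
print(average_deviation_grouped(Counter(scores)))   # same value
```

Both versions give 8/9 here, since grouping only changes how the sum is organized.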
The Variance
The population variance is defined as
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
where $\sigma^2$ is the population variance, $\mu$ is the population mean, and $\Sigma$, $X$ and $N$ have their customary meaning. This says that the variance is equal to the average squared deviation from the mean. To find the variance,
1) take each score and subtract the mean.
2) square the result.
3) find the average over scores.
Example:
Raw scores (X)
|   |   | 9 |    |    |
|   | 8 | 9 | 10 |    |
| 7 | 8 | 9 | 10 | 11 |
The mean is 9 (81/9). If we subtract 9, we get:
Deviation scores (x)
|    |    | 0 |   |   |
|    | -1 | 0 | 1 |   |
| -2 | -1 | 0 | 1 | 2 |
Squared Deviations (in original places)
|   |   |   |   | 0 |   |   |   |   |
|   |   |   | 1 | 0 | 1 |   |   |   |
| 4 |   |   | 1 | 0 | 1 |   |   | 4 |
Squared Deviations (arranged in sequence)
|   |   |   |   |   | 1 |   |   |   |
|   |   |   |   | 0 | 1 |   |   |   |
|   |   |   |   | 0 | 1 |   |   | 4 |
|   |   |   |   | 0 | 1 |   |   | 4 |
Average = 12/9 = 1.33
The variance of the raw scores is 1.33. The variance is the average squared deviation from the mean.
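The three steps can be sketched in Python as follows. This is only an illustration with made-up variable names; the last line cross-checks the result against `statistics.pvariance` from the standard library, which computes the population variance.

```python
from statistics import mean, pvariance

scores = [9, 8, 9, 10, 7, 8, 9, 10, 11]

mu = mean(scores)                         # 81 / 9 = 9
deviations = [X - mu for X in scores]     # 1) subtract the mean from each score
squared = [x ** 2 for x in deviations]    # 2) square the result
variance = sum(squared) / len(scores)     # 3) average over scores: 12 / 9

print(sorted(squared))       # [0, 0, 0, 1, 1, 1, 1, 4, 4]
print(round(variance, 2))    # 1.33
print(round(pvariance(scores), 2))  # 1.33, the library's population variance
```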
The Standard Deviation
The variance is, again, the average squared deviation from the mean. The unit of measure is the squared deviation from the mean. So instead of a deviation of two units, we deal with a squared deviation of 4. Instead of a deviation of 3, we have a squared deviation of 9, and so forth. It would be nice to deal with deviations in their original metric, as the average deviation does, instead of in squared units. To do this, we can take the square root of the variance. This puts the unit back to its unsquared size.
The population standard deviation is:
$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$
(The formula for the SD is found by taking the square root of both sides of the equation for the variance).
This says that the standard deviation, $\sigma$, is equal to the square root of the average squared deviation from the mean. The standard deviation is also known as the root-mean-square deviation from the mean. Sounds like a rap song. The name root-mean-square deviation from the mean actually tells you how to compute it if you read it backward. First you take a deviation from the mean, $(X - \mu)$, then you square it, $(X - \mu)^2$, then you mean it (average it, $\sum (X - \mu)^2 / N$), then you root it ($\sqrt{\sum (X - \mu)^2 / N}$). That's all there is to it. The standard deviation tells a kind of average distance to the mean. If you recall the formula for the distance between two points in geometry, it looked something like this:
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
(the square root of a sum of squared differences)
The standard deviation takes the deviation from the mean for each point in the distribution and squares that. It takes the average and then the square root. It is conceptually analogous to the average distance in geometry. Let's recall our example:
Deviation scores (x)
|    |    | 0 |   |   |
|    | -1 | 0 | 1 |   |
| -2 | -1 | 0 | 1 | 2 |
Squared Deviations (in original places)
|   |   |   |   | 0 |   |   |   |   |
|   |   |   | 1 | 0 | 1 |   |   |   |
| 4 |   |   | 1 | 0 | 1 |   |   | 4 |
Squared Deviations (arranged in sequence)
|   |   |   |   |   | 1 |   |   |   |
|   |   |   |   | 0 | 1 |   |   |   |
|   |   |   |   | 0 | 1 |   |   | 4 |
|   |   |   |   | 0 | 1 |   |   | 4 |
Average = 12/9 = 1.33 = the Variance. The Standard Deviation is $\sqrt{1.33} = 1.15$. This is the root-mean-square distance from the mean for this distribution, which we can think of as a kind of average distance.
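The deviation, square, mean, root sequence can be sketched in Python for the same nine scores. The variable names are illustrative; `statistics.pstdev` is the standard library's population standard deviation and is used here only as a cross-check.

```python
from math import sqrt
from statistics import mean, pstdev

scores = [9, 8, 9, 10, 7, 8, 9, 10, 11]

mu = mean(scores)
squared = [(X - mu) ** 2 for X in scores]   # deviation from the mean, then square it
mean_square = sum(squared) / len(scores)    # then mean it: 12 / 9
sd = sqrt(mean_square)                      # then root it

print(round(sd, 2))              # 1.15
print(round(pstdev(scores), 2))  # 1.15
```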
Example of mean and standard deviation: Age distribution.
Example 2: Test scores
| Score | Mean | Deviation | Dev*Dev |          |
| 78    | 86   | -8        | 64      |          |
| 80    | 86   | -6        | 36      |          |
| 80    | 86   | -6        | 36      |          |
| 82    | 86   | -4        | 16      |          |
| 84    | 86   | -2        | 4       |          |
| 84    | 86   | -2        | 4       |          |
| 86    | 86   | 0         | 0       |          |
| 86    | 86   | 0         | 0       |          |
| 86    | 86   | 0         | 0       |          |
| 88    | 86   | 2         | 4       |          |
| 88    | 86   | 2         | 4       |          |
| 90    | 86   | 4         | 16      |          |
| 90    | 86   | 4         | 16      |          |
| 92    | 86   | 6         | 36      |          |
| 96    | 86   | 10        | 100     |          |
| 86    | 86   | 0         | 22.4    | Variance |
|       |      |           | 4.73286 | SD       |
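The test-score table can be reproduced with a short Python sketch. The variable names are made up for illustration; the numbers are the fifteen scores from the table, and the last row of the table holds the column averages.

```python
from math import sqrt

# The fifteen test scores from the table.
scores = [78, 80, 80, 82, 84, 84, 86, 86, 86, 88, 88, 90, 90, 92, 96]

n = len(scores)
mu = sum(scores) / n                     # 1290 / 15 = 86
deviations = [X - mu for X in scores]    # Deviation column
squared = [x * x for x in deviations]    # Dev*Dev column
variance = sum(squared) / n              # 336 / 15 = 22.4
sd = sqrt(variance)

print(mu, sum(deviations), variance, round(sd, 5))  # 86.0 0.0 22.4 4.73286
```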
