
Taylor-Russell Tables

Cronbach: Validity is not "the degree to which a test measures what it purports to measure." Validity is a property of the inferences based on test scores; it refers to the quality of the decisions or judgments we make given those scores. For example, tests are used to decide whom to hire for a job. The validity of a test for that decision is the degree to which the test is useful in making the hiring decision. A test may be used to decide whether a person is schizophrenic, obsessive-compulsive, or alcoholic; its validity is then evaluated in terms of how well it classifies clients into their respective categories.

Several early psychologists, including a fellow named Hull (who wrote a book called Aptitude Testing, published in 1928), noted that the correlations between test scores and measures of performance were not very large. The small correlations look even smaller when they are squared to indicate the proportion of variance shared between the test and the performance measure.

Taylor and Russell (1939) answered Hull and others in what is one of the most famous papers in industrial and organizational psychology. Their answer was that sometimes tests could be very useful in selecting people even though the correlation between test scores and job performance was not very high. They also noted that sometimes tests are not very useful even when the correlation between test scores and job performance is very high.

According to Taylor and Russell, there are three important factors to consider when judging the usefulness of a test (i.e., its validity by Cronbach's definition): (a) the correlation between the test score and job performance, (b) the base rate of success on the job, and (c) the selection ratio.

But first, let's consider Taylor and Russell's quadrants:

[Figure: Taylor and Russell's quadrants: test score (horizontal axis) by job performance (vertical axis), divided into four quadrants]

On the horizontal axis, we have test scores, which are low on the left and high on the right. On the vertical axis, we have job performance, which is low toward the bottom and high toward the top. Taylor and Russell dichotomized job performance so that anyone above some threshold is considered a success and anyone below it is considered a failure. Applicants are tested and, based on their test scores, either hired or rejected. Take a look at the diagram. We want to hire people who will be successful (top right). We want to reject people who would fail if hired (bottom left). We would like to avoid hiring people who fail (bottom right). And we would like to avoid rejecting people who would have been successful (top left).
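The quadrant proportions are easy to explore by simulation. The sketch below is illustrative only: it assumes test scores and job performance are standardized bivariate-normal variables with correlation r, and the cutoffs and variable names are made up for the example, not taken from Taylor and Russell.

import numpy as np

rng = np.random.default_rng(0)

r = 0.50           # assumed test-performance correlation (illustrative)
n = 100_000        # simulated applicants
test_cut = 0.0     # hire anyone scoring above this (illustrative cutoff)
perf_cut = 0.0     # performance above this counts as "success" (illustrative cutoff)

# Simulate standardized test scores and job performance with correlation r.
cov = [[1.0, r], [r, 1.0]]
test, perf = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

hired = test > test_cut
success = perf > perf_cut

# Proportions falling in each of the four quadrants described above.
print("hired and successful (top right):      ", np.mean(hired & success))
print("rejected and would fail (bottom left): ", np.mean(~hired & ~success))
print("hired but fail (bottom right):         ", np.mean(hired & ~success))
print("rejected but would succeed (top left): ", np.mean(~hired & success))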

The correlation. Other things being equal, the larger the correlation between test scores and job performance, the more useful the test will be. If the correlation is 0, then the test is useless because people with high test scores are no more likely to be successful on the job than people with low test scores. If the correlation is 1.0, then the test predicts job performance perfectly, and picking people with high test scores will always give better job performance.

As the correlation increases, the ellipse formed by the scatter of points becomes thinner. Notice that as the correlation increases, a greater proportion of the cases fall into the two good quadrants and a smaller proportion fall into the two bad quadrants.
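To see this numerically rather than from the picture, here is a small variation of the sketch above (same bivariate-normal assumption, same illustrative 50/50 cutoffs) that loops over several correlations and prints the share of cases in the two good quadrants; the share rises from about .50 toward 1.00 as the correlation grows.

import numpy as np

rng = np.random.default_rng(0)
n = 200_000

for r in (0.0, 0.25, 0.50, 0.95):
    cov = [[1.0, r], [r, 1.0]]
    test, perf = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    hired, success = test > 0.0, perf > 0.0       # illustrative 50/50 cutoffs
    good = np.mean((hired & success) | (~hired & ~success))
    print(f"r = {r:.2f}   proportion in the two good quadrants = {good:.2f}")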

The base rate. Taylor and Russell divided incumbents by their job performance into two groups: successes and failures. If you are doing well enough on the job, you are a success; otherwise you are a failure. The ratio is computed from current incumbents who were hired without the test; it is the base rate of success among applicants when the test is not used. It answers the question "if we hired everyone who applied, what proportion would be successful on the job?" It turns out that, other things being equal, tests are most useful when the ratio of success to failure is 50/50, and they get less useful as the ratio moves toward 100/0 or 0/100. Consider 100/0, where everyone who applies is successful. Then the test is useless because it cannot improve on perfect success. On the other hand, consider 0/100, where everyone who applies fails. The test cannot be useful because it cannot pick anyone who will succeed on the job. If, however, the ratio is about 50/50 and the test can pick better people, it can improve the 50/50 ratio of success on the job considerably.
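As a minimal illustration of the base-rate calculation, suppose (hypothetically) that we have performance ratings for ten incumbents who were hired without the test and a cutoff for "satisfactory" performance; the numbers below are made up.

import numpy as np

# Made-up performance ratings for incumbents hired without the test.
performance = np.array([3.1, 4.5, 2.0, 4.9, 3.8, 2.7, 4.2, 1.9, 3.3, 4.8])
success_cut = 4.0   # hypothetical threshold for "satisfactory" performance

# Base rate: the proportion who would be successful if everyone were hired.
base_rate = np.mean(performance > success_cut)
print(base_rate)    # 0.4 for these made-up ratings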

Selection ratio. The selection ratio is the number hired divided by the number who applied. If 100 people apply and 50 are hired, the selection ratio is .5. If 100 people apply and 10 are hired, the selection ratio is .1. We will assume that the top people (i.e., those who score highest on the test) are selected; for example, if we are selecting 10 of 100, we will take the top 10 scorers. In general, other things being equal, the smaller the selection ratio, the more useful the test becomes.
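A quick sketch of the selection ratio and of top-down selection, using hypothetical applicant scores (the names and numbers are illustrative):

import numpy as np

applicants = 100
n_hired = 10
selection_ratio = n_hired / applicants         # 0.1, as in the example above

# Top-down selection: hire the 10 highest test scorers.
rng = np.random.default_rng(1)
test_scores = rng.normal(size=applicants)      # hypothetical applicant scores
cutoff = np.sort(test_scores)[-n_hired]        # lowest score that still gets hired
selected = test_scores >= cutoff
print(selection_ratio, selected.sum())         # 0.1 10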

As the selection ratio gets smaller (we select fewer people off the top), the proportion of successes among those hired goes up. The number of people who are hired but fail gets smaller; the end result is more success using the test. The price you pay is rejecting people who would have succeeded had they been hired, but this is usually more of a problem for applicants than for hiring managers or businesses.
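The simulation sketch below illustrates the trade-off, again under the illustrative bivariate-normal assumption, with a 50/50 base rate and r = .50. As the selection ratio shrinks, the success rate among hires rises, while the share of applicants who are rejected but would have succeeded also rises.

import numpy as np

rng = np.random.default_rng(0)
r, n = 0.50, 200_000                           # illustrative correlation and sample size
cov = [[1.0, r], [r, 1.0]]
test, perf = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

success = perf > np.quantile(perf, 0.50)       # 50/50 base rate (illustrative)

for sr in (0.90, 0.50, 0.30, 0.10):
    cut = np.quantile(test, 1.0 - sr)          # hire the top `sr` fraction of scorers
    hired = test > cut
    rate_among_hired = success[hired].mean()   # proportion of hires who succeed
    missed = np.mean(~hired & success)         # rejected but would have succeeded
    print(f"SR = {sr:.2f}   success among hires = {rate_among_hired:.2f}   "
          f"rejected but successful = {missed:.2f}")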

Some very abbreviated Taylor-Russell Table entries:

Proportion of Employees Considered Satisfactory = .20 (base rate)

              Selection Ratio
    r      .10     .30     .50     .90
   .00     .20     .20     .20     .20
   .25     .34     .29     .26     .21
   .50     .52     .38     .31     .22
   .95     .97     .64     .40     .22

Proportion of Employees Considered Satisfactory = .50 (base rate)

              Selection Ratio
    r      .10     .30     .50     .90
   .00     .50     .50     .50     .50
   .25     .67     .62     .58     .52
   .50     .84     .74     .67     .54
   .95    1.00     .99     .90     .56

There is one table for each base rate (two tables are shown here, one for base rate = .20 and one for base rate = .50). The correlations appear as rows and the selection ratios as columns. The entries in the table are the proportion of hires expected to be successful if you use the test (a computational sketch of how such entries can be reproduced follows the list below). Notice:

  1. When r = .00, using the test results in a success rate equal to the base rate, which is the same thing as not using the test. If there is no correlation between the test and success on the job, then using the test will not improve selection.
  2. As the correlation gets larger, the success rates go up. For example, in the first column of entries in the first table, the base rate is .20, and the selection ratio is .10. When the correlation is .25, the proportion successful is .34, which is up .14 from .20. When the correlation is .50, the success rate is .52, which is up .32 from .20.
  3. When the selection ratio is small, changes in the size of the correlation make a lot of difference in the success rate. When the selection ratio is large, however, changes in the size of the correlation make little difference. For example, in the first table when the selection ratio is .9 and the correlation is .25, the expected success rate is .21, which is up .01 from .20. When we move from a correlation of .25 to a correlation of .95, the success rate goes from .21 to .22, which is not much. This happens because when the selection ratio is large, we basically have to hire anyone who applies; we cannot be selective.
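Taylor and Russell derived entries like these from the bivariate normal distribution. The sketch below reproduces the idea under that assumption; the function name and code are mine (a hypothetical helper, not from the original paper), and rounding may differ slightly from the published tables.

from scipy.stats import norm, multivariate_normal

def taylor_russell(r, base_rate, selection_ratio):
    """Expected proportion successful among those hired (hypothetical helper).

    Assumes test score X and job performance Y are standard bivariate normal
    with correlation r.  "Success" means Y above the cutoff that leaves
    base_rate of applicants successful; "hired" means X above the cutoff
    that leaves selection_ratio of applicants hired.
    """
    x_cut = norm.ppf(1.0 - selection_ratio)    # test-score cutoff
    y_cut = norm.ppf(1.0 - base_rate)          # performance ("success") cutoff
    bvn = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, r], [r, 1.0]])
    # P(X > x_cut and Y > y_cut) by inclusion-exclusion on the joint CDF.
    p_hired_and_successful = (1.0 - norm.cdf(x_cut) - norm.cdf(y_cut)
                              + bvn.cdf([x_cut, y_cut]))
    return p_hired_and_successful / selection_ratio   # P(success | hired)

# Compare with the first column of the first table (base rate .20, SR .10):
for r in (0.00, 0.25, 0.50, 0.95):
    print(r, round(taylor_russell(r, base_rate=0.20, selection_ratio=0.10), 2))
# Expected to come out near .20, .34, .52, and .97, respectively.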

Two major points:

(1) Validity changes with the decision or the context in which the test is used. A test that is valid in one situation may not be valid in another. Q: Imagine testing for factory jobs in different parts of the country. Can you think of a test that might be valid in one part of the country but not in another?

(2) Tests that do not predict job performance very well can still be highly useful (and hence valid in Cronbach's sense) if the base rate of success is near .5 and the selection ratio is small. For example, in the second table above (base rate = .50), a correlation of only .25 combined with a selection ratio of .10 raises the expected success rate from .50 to .67. Consider the use of the SAT and GRE.