Philosophy of Science; Scaling vs. Psychometrics
An attempt to define Science
Philosophers have had lengthy discussion of what science is. There are two main approaches to defining science: contents and methods. Contents refer to the object of study, such as optical properties of materials. Methods refer to the techniques or means of study, such as collecting observations subject to being blind to the study's hypotheses.
Some years ago, there was a new building being developed to house scientific research here at USF. The dean of the college went to the provost to ask for space for psychology in the building. The provost said "I don't mean to insult you, but the space is reserved for scientific research. You don't do science." This is an example of defining science by content.
Most psychologists consider psychology to be a scientific enterprise (at least in part) because they view science to be defined by methods. Psychology is an empirical enterprise. Empiricism means supported by empirical effort, that is, supported by observation, experiment, and experience. Empiricists believe that knowledge comes from observation rather than expert opinion, authority or inspiration. The beauty of observation is that anyone can do it; this is rather different than a vision or a dream, both of which are quite personal and are not subject to direct observation by others.
A fable (I don't know where it came from; I didn't create it).
Once upon a time there was a monastery and the monks were all sitting down to dinner after a long day of prayer and work. They were arguing about the number of teeth in a cow's mouth. One group said "A cow has 28 teeth. It says so right in Aristotle." The other group said, "No, no, you have it all wrong. A cow has 30 teeth. It says so in Plato." A young monk listened to the debate with interest. He finally said "You know we have some cows out in our pasture. Why don't we go count the number of teeth they have to settle this?" At this point, all the other monks got together and thrashed the young monk soundly, sending him to bed without any supper for such a dumb idea. The point of the story is that there are several different ways of knowing things. The young monk was the empiricist of the bunch.
Methodological Aspects of the Scientific Method
| ASPECT | PURPOSE |
| Records of Observations | Helps to minimize bias of memory and attention |
| Replication | Needed to be sure that experience is reliable |
| Representative Design | Needed for generalizability |
| Margin (quantification) of Error | Needed to communicate confidence of predictions; variance of outcomes |
Schematic of Basic Units of Scientific Entities
Figure 1

Figure 2

We are typically interested in things that we cannot directly observe, such as problem solving ability, feelings of guilt, or empathy. Such things are labeled theoretical entities, hypothetical constructs, or more generally, T-terms (see Figure 1). Some fairly concrete entities such as weight are not directly observable. Length or height appears to be directly observable, but our perceptions of it are often wrong (for example, the vertical-horizontal illusion). [Bishop Berkeley became notorious for his proof that "material substances do not exist," so it is safe to say that T-terms or hypothetical constructs are ubiquitous or at least common enough to catch our interest.]
We take measurements of aspects of objects (that is, measures). Measurement is the assignment of numbers to represent attributes of objects [by rules so that some characteristics of numbers correspond to the attributes...]. Note that we do not measure people or cars or whatever per se. We measure attributes of objects; attributes of people include such items as height or the ability to carry a tune. The measures (numbers) are directly observable, and are referred to as O-terms (see Figure 1). It is our intention to let the observed measure stand for the construct of interest. That is, the O-term will stand for the T-term in our scientific investigation. It is one of life's little difficulties (or something like that) that the O-term never corresponds exactly to the T-term. Instead, variance in the O-term is always attributable in part to things other than the T-term. The extraneous variance is always due to some aspect of the measurement procedure, such as the instrument (e.g., the Heisenberg uncertainty principle when measuring atomic particles; using a thermometer to measure something very cold), the person to be measured (e.g., social desirability and research on sexual practices) or some aspect of the testing situation (e.g., the local environment during the administration of the SAT). The problem is illustrated in Figure 2.
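The split between construct variance and extraneous variance can be illustrated with a small simulation. This is only a sketch: it assumes a simple additive model (O-term = T-term + random error), which the text above does not itself state, and every number and name in it is hypothetical.

```python
import random
import statistics


def simulate_o_terms(true_scores, error_sd, seed=0):
    """Return observed scores: each T-term value plus random measurement error.

    The additive error model is an assumption made for illustration.
    """
    rng = random.Random(seed)
    return [t + rng.gauss(0.0, error_sd) for t in true_scores]


# Hypothetical T-term values (say, problem solving ability) for 1000 people.
population_rng = random.Random(42)
true_scores = [population_rng.gauss(100.0, 15.0) for _ in range(1000)]

# Hypothetical O-terms: the same people as seen through an imperfect measure.
observed = simulate_o_terms(true_scores, error_sd=5.0)
errors = [o - t for o, t in zip(observed, true_scores)]

var_true = statistics.pvariance(true_scores)
var_error = statistics.pvariance(errors)
var_observed = statistics.pvariance(observed)

print(f"variance of T-term (construct):  {var_true:.1f}")
print(f"variance of extraneous error:    {var_error:.1f}")
print(f"variance of O-term (observed):   {var_observed:.1f}")
```

In a run like this the observed variance is larger than the construct variance, because part of what the O-term records is the measurement procedure rather than the T-term.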
Scientific Understanding and the Nomological Net
Figure 3 shows a schematic of the so-called nomological net. The net consists of a series of connected theoretical and observed terms, that is, it represents a theory. For example, in organizational psychology there is a theory that participation in decision making leads to a reduction in role ambiguity and this leads to greater job satisfaction. In other words, employee participation in decision making leads to greater job satisfaction because employees understand their jobs and their place in the organization better through participation.
Figure 3.

The nomological net allows interplay between theory and data. The observed variables support or create a net for the unobserved variables. Some of the theoretical terms may be undefined operationally. The theoretical part of the nomological net allows us to make predictions about the relations of observed variables even before we create the variable, that is, before we operationally define an observed variable. This will be important in understanding validity and the process of validation.
Figure 4 shows another example.

A theory is a set of constructs and relations among them. Theories are interesting in themselves to many of us, but they also have some very practical uses:
1) They serve to reduce observations. They facilitate storage and retrieval of information -- they promote understanding. We don't need to gather information that is not theoretically relevant. For example, in studying hostility among basketball teams, we can ignore aspects of the equipment and focus on the interpersonal relations. Much of the interpersonal stuff wouldn't need to be coded, either.
2) They serve to deduce new effects -- to generate hypotheses. In Figure 3, a change in T1 should produce changes in all other variables in the system. A change in variable 3 should only produce a change in variable 4. In the participation in decision making example, we would predict that reducing role ambiguity through job descriptions, films or meetings would increase job satisfaction, but would not influence perceptions of participation in decision making.
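The way a theory generates predictions can be sketched in code. The example below treats the participation theory as a simple causal chain (participation, then role ambiguity, then job satisfaction). The linear form, the coefficients, and the function name `predict` are all made up for illustration; the theory itself only specifies the direction of the effects.

```python
def predict(participation, ambiguity_intervention=0.0):
    """Propagate values through a hypothetical linear chain.

    participation -> role ambiguity -> job satisfaction
    Coefficients are invented; only the signs follow the theory.
    """
    # More participation means less role ambiguity. An outside intervention
    # (job descriptions, films, meetings) can also shift ambiguity directly.
    role_ambiguity = 10.0 - 0.5 * participation + ambiguity_intervention
    # Less role ambiguity means more job satisfaction.
    job_satisfaction = 20.0 - 1.5 * role_ambiguity
    return {
        "participation": participation,
        "role_ambiguity": role_ambiguity,
        "job_satisfaction": job_satisfaction,
    }


baseline = predict(participation=4.0)
more_participation = predict(participation=8.0)
clearer_jobs = predict(participation=4.0, ambiguity_intervention=-2.0)

for label, result in [
    ("baseline", baseline),
    ("more participation", more_participation),
    ("reduce ambiguity directly", clearer_jobs),
]:
    print(label, result)
```

Changing the variable at the start of the chain moves everything downstream. Intervening in the middle raises job satisfaction but leaves participation where it was, which is the pattern the theory predicts.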
Scientific Explanation
According to the philosophers, all scientific explanation follows the form:
| Rule | Example: Why does an oar in the water appear bent? |
| Law | Optical density |
| Conditions | Air, water, angle of view |
| Therefore Effect | Oar appears bent |
For another example, Gay-Lussac's Law governing gases states that (if you check out observations in a balloon)
$$\frac{V}{T} = \text{constant}$$
where V refers to volume and T refers to temperature. This assumes, of course, that pressure is constant. Pressure is a boundary condition for the operation of Gay-Lussac's Law: the law holds only when pressure is held constant.
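The law-plus-conditions form lends itself to a worked prediction. The sketch below assumes the law in its proportional form (volume divided by absolute temperature stays fixed at constant pressure); the function name and the balloon numbers are hypothetical.

```python
def predict_volume(v1, t1_kelvin, t2_kelvin):
    """Predict a gas volume after a temperature change at constant pressure.

    Law: V / T is constant, so V2 = V1 * T2 / T1.
    Conditions: pressure is held constant, and temperatures are absolute
    (kelvin). The proportionality does not hold for Celsius or Fahrenheit.
    """
    if t1_kelvin <= 0 or t2_kelvin <= 0:
        raise ValueError("temperatures must be absolute (kelvin) and positive")
    return v1 * t2_kelvin / t1_kelvin


# Hypothetical balloon: 2.0 liters at 300 K, warmed to 330 K.
predicted = predict_volume(2.0, 300.0, 330.0)
print(f"predicted volume: {predicted:.2f} liters")
```

The prediction is made before the balloon is warmed, so it can be checked against an observation and shown to be wrong. That is what makes the explanation scientific in the sense described here.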
The main point to notice about all scientific explanation is that there is no scientific explanation without prediction. Scientific explanation comes in the form that there are general rules, and that the particular instance to be explained falls under those rules within specific conditions. This means that we can make predictions ahead of time and check to see whether the predictions are correct. This distinction was developed in Europe when the intellectuals were most interested in ideas from both physics (relativity) and Freudian psychoanalysis. They noticed that psychoanalysts could "explain" why a person acted in a certain way but could not predict how a person would react to an event. They decided that such explanation was not scientific. For example, a person smiles at another. A Freudian might say that the smile was explained by a defense against a hostile feeling. However, the same analyst could not predict how the person would act when feeling hostile impulses. Therefore the explanation of the smile is not scientific because it doesn't follow the form of laws and conditions. (It could be made scientific by developing specific laws and conditions for smiling and perhaps other emotional behavior. However, psychoanalysts have not been very successful in doing so.) Scientific explanations are falsifiable, that is, capable of being proven wrong. As we shall see next, however, it is unfortunate that nothing can be proven right or correct with certainty (that is, science cannot provide the truth).
Rational approaches to proof
There are two main methods for generating proof: induction and deduction. Induction is reasoning from the specific to the general, or from examples to the rule. For example, we might reason that because the sun has come up every morning we can remember, then it will come up tomorrow morning. We might reason that because all the birds we know can fly, therefore all birds can fly (of course this is not true, but most people believe it to be true when they are young children). The problem with induction is that the conclusion or rule doesn't necessarily follow. Just because all the birds you know can fly doesn't necessarily mean that you won't ever find a flightless bird.
Deduction is reasoning from the general to the specific, or from the rule to the example. For example, if all cars have wheels and this is a car, then it must have wheels. The nice thing about deduction is that if the premise (assumption) is true and we use valid reasoning, the conclusion must follow. This is good because the conclusion will then apply to lots of things that we don't have to go and verify. In other words, deduction applies generally, and this is very powerful. The problem with deduction is that the premise or assumption or axiom set (that is, the stuff you assume to be true to start) can be mistaken. For example, the premise might be "it always rains on Tuesday." Therefore if we know it is Tuesday, we know it will rain. However, the premise is mistaken, so the conclusion is not guaranteed to hold.
Notice that there are two rational approaches to proof and that neither of them is certain to be correct. Both can be shown to be wrong. On the other hand, if they are right in our experience this doesn't mean that they will always be right. This is a difficulty for using science as a cornerstone of a personal philosophy of life. Nothing is ever certainly true, only provisionally true, just waiting to be falsified.
Connection of Scientific Theory to Experiential Understanding
We understand concepts like length and speed because of our extensive experience with them. We have a subjective feeling of understanding because the concepts are embedded in our own nomological nets. Although length can be perceived fairly readily, our understanding of it comes from daily activity with it. If you know the length of an object, you know whether you can fit it in your car or put it in an elevator. You will have a pretty good idea whether you will be able to pick it up or use it to knock a ball out of a tree. Because of vast experience with the concept, you understand it. Concepts such as atoms, intelligence or job satisfaction, however, are not so familiar. Science provides a way to understand natural phenomena that is very efficient. That is, we set up specific observations to test rules of behavior of the interesting concept. We could do the same thing with ordinary objects such as investigating their length.
Psychologists Devise a Measure of Weight
I think about measurement by analogy to physical quantities quite a bit. It helps me to know whether I'm on the right track. This example might help you to think about psychological tests.
Imagine that we have affixed a slender metal rod to a plywood base so that the rod is vertical, that is, upright. At the top of the rod is a fairly flat cup holding a marble. Halfway down the rod a pedal is connected by a ring to the rod. The pedal is connected by a hinge to the plywood base. The apparatus is very much like a bass drum pedal except that instead of a drum pad that gets slammed into a drum, we have a plate with a marble in it. What we do is to build several of these rod-and-pedal gizmos where the thickness (diameter) of the rod varies. To measure weight, we have a person step on each pedal. When the pedal is stepped on, the rod will bend; if it bends far enough, the marble will fall out. What we do to measure weight is to count up the number of marbles that fall for one trial of each and every pedal-and-rod set.
A single apparatus might look something like this:

The rods might vary in thickness like this:

Note that instead of counting the total number of marbles as a measure of weight, we could let one specific rod "closest" to a person stand for that person's weight.
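The two scoring rules for the rod-and-pedal apparatus can be sketched as follows. The sketch assumes each rod has a single threshold weight at which it bends far enough to drop its marble, and it reads "closest" as the rod whose threshold is nearest the person's weight. Both are interpretations for illustration, and the thresholds are hypothetical.

```python
def marbles_dropped(weight, rod_thresholds):
    """Count the rods that bend far enough to drop their marble.

    Assumes a rod drops its marble whenever the weight on the pedal is
    at least that rod's threshold.
    """
    return sum(1 for threshold in rod_thresholds if weight >= threshold)


def closest_rod(weight, rod_thresholds):
    """Return the threshold of the rod nearest to the person's weight."""
    return min(rod_thresholds, key=lambda threshold: abs(threshold - weight))


# Hypothetical thresholds in kilograms, from the thinnest rod to the thickest.
ROD_THRESHOLDS_KG = [40, 60, 80, 100, 120]

for person_kg in (55, 85, 95):
    score = marbles_dropped(person_kg, ROD_THRESHOLDS_KG)
    nearest = closest_rod(person_kg, ROD_THRESHOLDS_KG)
    print(f"{person_kg} kg: {score} marbles, closest rod {nearest} kg")
```

With these thresholds, people weighing 85 kg and 95 kg drop the same three marbles but are closest to different rods. The two scoring rules carry different information, and neither can be more precise than the spacing of the rods allows.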
Connections between Science and Psychometrics
1. Observation and replication. Psychometrics is concerned with the meaning of numbers as they are used to stand for attributes of objects.
2. Precision of prediction and error. Precise statements of prediction are rarely possible without quantification. The amount of error is also difficult to specify without quantitative measures. The study of reliability is concerned with the amount of error in an observed variable.
3. Theory development and evaluation. It is difficult to make statements about functional relations among variables without measures (e.g., F = ma; E = mc²). It is very difficult to evaluate theories without measures (e.g., ANOVA, factor analysis, structural equation modeling).
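To make the link between quantification and error concrete, here is a sketch of one common way to put a number on reliability: correlate two administrations of the same measure (a test-retest estimate). The text above does not name a particular estimator, so this choice is an assumption, and the scores are invented.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five people measured on two occasions.
first_occasion = [3, 5, 2, 4, 1]
second_occasion = [3, 4, 2, 5, 1]

reliability = pearson_r(first_occasion, second_occasion)
print(f"test-retest correlation: {reliability:.2f}")
```

A correlation near 1 suggests that replication gives nearly the same ordering of people, so little of the observed variance is error. A correlation near 0 suggests that the scores are mostly error.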
Comparison of Scaling and Psychometric Approaches to Measurement
| | Scaling | Psychometrics |
| History | Laws of Behavior | Individual Differences |
| | Psychophysics | Intelligence |
| | Analytical, Theoretical | Clinical, Practical |
| Object of Measurement | Objects other than people | People |
| | Brightness, loudness, color | Aptitudes, attitudes |
| Place of items (stimuli) in measurement | Central (e.g., place of cars in J.D. Power ratings of automobile satisfaction) | Peripheral or incidental (e.g., "I hate my car" as a measure of hostility) |
| Place of people in measurement | People are peripheral, replicates, error (e.g., differences in people in judgments of loudness of a tone) | People are central, differences in people are the object of study (e.g., differences in people in hostility) |