Exam 1, Spring 1999

  1. What is the function of the Fisher r to z transformation?
  2. Why might you prefer DFBETA rather than leverage as a regression diagnostic?
  3. What is the assumption of homoscedasticity? What is the problem if it is not met?
  4. Describe a concrete example that would lead to a large standard error for a b weight in a regression equation (describe 1 DV and 2 IVs and say why at least 1 IV would have a large standard error). Relate your concrete example to the formula for the standard error of the b weight to show why your example is correct.
  5. How (why) is it possible to have a significant R-square and nonsignificant b weights?

6. For what purpose(s) do we prefer the b weights and for what purpose(s) do we prefer beta (standardized b) weights?

7. Describe the pieces in a linear model. Describe how the regression model allocates the variance (or sum of squares, or mean square) in Y to each piece. Include R-square in your description.

8. With only a single independent variable, under what circumstances will b = r?

9. Draw a graph of a scatterplot with a regression line. Describe in both words and symbols what the slope and intercept are.

10. When might we prefer to report correlation coefficients rather than regression coefficients? When might we prefer to report regression coefficients rather than correlation coefficients?

Exam 2, Spring 1999

  1. Give a concrete example (names of variables, context) in which it makes sense to compute a semipartial correlation. Why a semipartial rather than a partial?
  2. Why is the squared semipartial correlation always less than or equal to the squared partial correlation? Draw a Venn diagram and show how the picture relates to the formulas for the partial and semipartial.
  3. Describe the variance inflation factor. What does it tell you?
  4. What are variable selection routines such as stepwise good for? What are they not suitable for?
  5. What is commonality analysis? What can it be used to accomplish?
  6. What is collinearity in multiple regression? Why is it a problem?
  7. What is dummy coding? What do a and b mean for this model (that is, how are they interpreted after the analysis)?
  8. What are orthogonal polynomials? What is the advantage of using orthogonal polynomials for curvilinear data analysis over ordinary polynomial regression?
  9. What is shrinkage in multiple regression? Why is it important?
  10. What is the difference between a moderator and a mediator?

Exam 3, Spring 1999

  1. When we model a continuous and a categorical independent variable at the same time (e.g., during ANCOVA), what is the equation for the first model we fit to the data? What is the meaning of each of the three b weights in such models?
  2. Why should we avoid dichotomizing continuous IVs? Describe three reasons.
  3. Consider the following path diagram. Define the correlation between 2 and 3 in terms of path coefficients. Group and label the paths into effects that are unanalyzed, direct, indirect and spurious, as appropriate for the correlation between 2 and 3.

  4. What is a path coefficient? How are path coefficients and regression coefficients related?
  5. Describe the lens model equation in regression terms. Define r_a, the achievement index. Then describe G, R_s and R_e as they relate to regressions on the two sides of the lens.
  6. Why is r typically a biased estimator of ρ?
  7. Why do we compute a cross-validation R-square? What does this procedure tell us about our regression equation and its usefulness?
  8. What is the difference between a b weight and a beta weight? When might you prefer b to beta and beta to b?
  9. What is polynomial regression? Describe a concrete example (names of IV(s), DV) in which it would make sense to use polynomial regression and how you would use it.

10. Suppose we have 3 independent variables and, according to our simultaneous regression results, all three b weights are statistically significantly different from zero. We could use hierarchical regression to test whether variable 3 adds significant variance when variables 1 and 2 are entered first (that is, test whether the increase in R-square due to variable 3 is significant), whether variable 2 adds after 1 and 3 are entered first, and whether variable 1 adds after 2 and 3 are entered first. Why don't we need to carry out these hierarchical F tests?