Psychology 6140

The readings and problems assigned here are meant to cover the weeks until the start of next term.

- Cliff, Chapters 7-9

- Modelling and interpreting interactions in multiple regression
- A discussion of why interaction terms in a regression model should use (products of) **centered** variables.
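
The point about centering can be illustrated with a small numerical sketch. The example below uses Python with NumPy rather than APL or SAS, and the two predictors `x1` and `x2` are made-up values (not the class data), chosen only to show how the correlation between a predictor and a product term changes when the variables are centered before multiplying.

```python
import numpy as np

# Made-up illustrative values, NOT the class data.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
x2 = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])


def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


# Product of the raw variables versus product of the centered variables.
raw_product = x1 * x2
c1 = x1 - x1.mean()
c2 = x2 - x2.mean()
centered_product = c1 * c2

r_raw = corr(x1, raw_product)
r_centered = corr(c1, centered_product)

print(f"r(x1, product) using raw variables:      {r_raw:.3f}")
print(f"r(x1, product) using centered variables: {r_centered:.3f}")
```

With these particular numbers the raw product is strongly correlated with `x1`, while the centered product is nearly uncorrelated with it. Other data will give different values, so treat this as a demonstration of the idea rather than a general result.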

You can do the calculations for this problem with APL or SAS. With APL, you can use the function REGRES in the library 6140 GLM for the regressions and SCAT for the plots. With SAS, you can use PROC REG and/or PROC GLM, together with PROC PLOT. The data are available in two forms on the class disk:

- The file **SURVEY SAS** creates the data set SALARY with the variables SALARY, EXPRNC, EDUC, and MGT. You can copy this file to your A-disk and add SAS statements to it to answer the questions below.
- The data are also in the APL library 6140 DATA as the variable SURVEY, which you can get with the command
  COPY '6140 DATA SURVEY'

- Fit a linear regression predicting salary from years of experience alone.
  - Find the overall F* value for this model, and the observed t* for the hypothesis $H_0 : \beta_1 = 0$. In what sense are these values equivalent?
  - Find the fitted values and residuals from this model. Make a scatter plot of residuals vs. years of experience, identifying the 6 different education-management groups with different plotting symbols. [In APL, you will have to color or identify the points by hand; in SAS, use PLOT yvar * xvar = GROUP.] Is there any evidence in this plot that education and/or management group predicts salary after years of experience have been taken into account? Examine a univariate display of the residuals as one batch. Is there any evidence of violations of the assumptions of the model, or of unusual observations?
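
For readers who want to see the arithmetic behind the first model, here is a minimal sketch in Python with NumPy (the assignment itself is to be done in APL or SAS). The `experience` and `salary` values are made up for illustration and are not the SURVEY data. The sketch fits the one-predictor regression, computes fitted values and residuals, and prints the overall F next to the t for the slope so the two can be compared.

```python
import numpy as np

# Made-up illustrative values, NOT the SURVEY data.
experience = np.array([1.0, 2.0, 3.0, 5.0, 8.0, 10.0, 12.0, 15.0])
salary = np.array([14.0, 15.5, 15.0, 18.0, 19.5, 22.0, 21.5, 26.0])

n = len(salary)
X = np.column_stack([np.ones(n), experience])   # intercept and slope columns
b, *_ = np.linalg.lstsq(X, salary, rcond=None)  # least-squares estimates

fitted = X @ b
residuals = salary - fitted

sse = np.sum(residuals ** 2)                    # error sum of squares
ssr = np.sum((fitted - salary.mean()) ** 2)     # regression sum of squares
df_error = n - 2
mse = sse / df_error

# Overall F for the model (1 numerator df) and t for the slope.
f_stat = (ssr / 1) / mse
se_slope = np.sqrt(mse / np.sum((experience - experience.mean()) ** 2))
t_stat = b[1] / se_slope

print(f"F = {f_stat:.3f}, t = {t_stat:.3f}, t squared = {t_stat ** 2:.3f}")
```

The `residuals` array is what would be plotted against experience, with one plotting symbol per education-management group.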

- Construct two dummy (indicator) variables for level of education, and add education and management to the prediction equation.
  - Test the hypothesis that the education and management variables together add significantly (beyond experience) to the prediction of salary.
  - Find 95% confidence intervals for each of the regression weights in this model. Describe verbally what each of the raw regression weights means in terms of the problem situation.
  - Find fitted values and residuals. Make a scatter plot of residuals against years of experience, identifying the 6 groups by different plotting symbols. Is there any evidence in this plot that there remains systematic variation in salaries which is not accounted for in this model? Any indication of unusual observations?
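
The test of a set of added predictors can be sketched as a comparison of error sums of squares between a reduced and a full model. The Python/NumPy example below is only an illustration: the data are made up, and the coding of education as 1, 2, 3 and of management as 0/1 is an assumption for this sketch, not a description of the SURVEY file. The helper `sse` is likewise a hypothetical name introduced here.

```python
import numpy as np

# Made-up illustrative values, NOT the SURVEY data. The coding of EDUC as
# 1, 2, 3 and MGT as 0/1 is an assumption made for this sketch.
exprnc = np.arange(1.0, 13.0)
educ = np.array([1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3])
mgt = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1], dtype=float)
salary = np.array([12.0, 13.5, 15.2, 17.1, 18.4, 20.9,
                   14.8, 16.1, 18.3, 20.2, 21.0, 23.7])

# Two dummy variables for three education levels (level 1 is the reference).
e2 = (educ == 2).astype(float)
e3 = (educ == 3).astype(float)


def sse(X, y):
    """Error sum of squares from the least-squares fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return float(resid @ resid)


n = len(salary)
ones = np.ones(n)
X_reduced = np.column_stack([ones, exprnc])            # experience only
X_full = np.column_stack([ones, exprnc, e2, e3, mgt])  # plus education, management

sse_reduced = sse(X_reduced, salary)
sse_full = sse(X_full, salary)

q = X_full.shape[1] - X_reduced.shape[1]   # number of added predictors
df_error = n - X_full.shape[1]             # error df of the full model
partial_f = ((sse_reduced - sse_full) / q) / (sse_full / df_error)

print(f"partial F({q}, {df_error}) = {partial_f:.3f}")
```

The resulting statistic would be referred to an F distribution with `q` and `df_error` degrees of freedom.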

- Construct variables to represent the interaction between education and management and add these to the model.
  - Do the interaction variables provide a significant improvement in the goodness of fit of the model? Test the marginal (partial) increase in regression SS due to *each* interaction variable; state formally the hypothesis which is being tested.
  - Obtain residuals from this model and plot against experience as before. Is there now any evidence of systematic variation related to group? Any unusual observations? Any high-leverage observations?
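
One way to build the interaction variables is to multiply each education dummy by the management indicator. The Python/NumPy sketch below does this and then computes the marginal test for each product term in two ways: as a t for its coefficient and as a partial F from dropping that one column. As in the earlier illustrations, the data are made up, the 1, 2, 3 and 0/1 codings are assumptions, and the helper names `fit` and `partial_f` are hypothetical.

```python
import numpy as np

# Made-up illustrative values, NOT the SURVEY data; codings are assumed.
exprnc = np.arange(1.0, 13.0)
educ = np.array([1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3])
mgt = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1], dtype=float)
salary = np.array([12.0, 13.5, 15.2, 17.1, 18.4, 20.9,
                   14.8, 16.1, 18.3, 20.2, 21.0, 23.7])

e2 = (educ == 2).astype(float)
e3 = (educ == 3).astype(float)

# Interaction variables: products of the education dummies with management.
e2m = e2 * mgt
e3m = e3 * mgt

n = len(salary)
X = np.column_stack([np.ones(n), exprnc, e2, e3, mgt, e2m, e3m])


def fit(X, y):
    """Return least-squares coefficients and the error sum of squares."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return b, float(resid @ resid)


b, sse_full = fit(X, salary)
df_error = n - X.shape[1]
mse = sse_full / df_error

# t statistic for every coefficient in the full model.
se = np.sqrt(mse * np.diag(np.linalg.inv(X.T @ X)))
t = b / se


def partial_f(col):
    """Partial F (1 numerator df) for dropping a single column of X."""
    _, sse_drop = fit(np.delete(X, col, axis=1), salary)
    return (sse_drop - sse_full) / mse


for col, name in [(5, "E2*MGT"), (6, "E3*MGT")]:
    print(f"{name}: t = {t[col]:.3f}, partial F = {partial_f(col):.3f}")
```

Columns 5 and 6 of `X` are the two product terms; each partial F measures the increase in regression SS from adding that variable last.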

- Write a short (1-2 page) description summarizing the results of these analyses. One's normal expectations are that salary would increase with experience (career progress), education and management responsibility. Given this, which of the three models provides the best account of the data, and how should this model be interpreted?

*© 1995 Michael Friendly*