Problem

Use the data in ELEM94_95 to answer this question. See also Computer Exercise.

Use the data in ELEM94_95 to answer this question. The findings can be compared with those in Table 4.1. The dependent variable lavgsal is the log of average teacher salary and bs is the ratio of average benefits to average salary (by school).

(i) Run the simple regression of lavgsal on bs. Is the estimated slope statistically different from zero? Is it statistically different from -1?

(ii) Add the variables lenrol and lstaff to the regression from part (i). What happens to the coefficient on bs? How does the situation compare with that in Table 4.1?

(iii) How come the standard error on the bs coefficient is smaller in part (ii) than in part (i)? (Hint: What happens to the error variance versus multicollinearity when lenrol and lstaff are added?)

(iv) How come the coefficient on lstaff is negative? Is it large in magnitude?

(v) Now add the variable lunch to the regression. Holding other factors fixed, are teachers being compensated for teaching students from disadvantaged backgrounds? Explain.

(vi) Overall, is the pattern of results that you find with ELEM94_95.RAW consistent with the pattern in Table 4.1?
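
Part (i) of the list above asks for two t tests on the same slope: one against zero and one against -1. The following is a minimal sketch of that calculation, assuming made-up numbers rather than the ELEM94_95 data; the helper names `simple_ols` and `t_stat` are hypothetical and introduced only for illustration.

```python
import math

def simple_ols(x, y):
    """Return (intercept, slope, se_slope) for the simple regression of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sst_x = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
    intercept = ybar - slope * xbar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = ssr / (n - 2)              # error variance estimate, n - 2 degrees of freedom
    se_slope = math.sqrt(sigma2 / sst_x)
    return intercept, slope, se_slope

def t_stat(estimate, se, null_value=0.0):
    """t statistic for H0: coefficient = null_value."""
    return (estimate - null_value) / se

# Made-up illustrative values, NOT taken from ELEM94_95.
bs = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35]
lavgsal = [10.60, 10.58, 10.52, 10.51, 10.45, 10.44]

b0, b1, se1 = simple_ols(bs, lavgsal)
t_zero = t_stat(b1, se1, 0.0)        # H0: slope = 0
t_minus_one = t_stat(b1, se1, -1.0)  # H0: slope = -1

print("slope:", b1, "se:", se1)
print("t (H0: slope = 0): ", t_zero)
print("t (H0: slope = -1):", t_minus_one)
```

Only the null value changes between the two tests; the estimate and its standard error are the same. With a large sample, each statistic would be compared with a two-sided 5% critical value of about 1.96.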

(i) Using all of the data, run the regression lavgsal on bs, lenrol, lstaff, and lunch. Report the coefficient on bs along with its usual and heteroskedasticity-robust standard errors. What do you conclude about the economic and statistical significance of the estimated coefficient on bs?

(ii) Now drop the four observations with bs >.5, that is, where average benefits are (supposedly) more than 50% of average salary. What is the coefficient on bs? Is it statistically significant using the heteroskedasticity-robust standard error?

(iii) Verify that the four observations with bs >.5 are 68, 1,127, 1,508, and 1,670. Define four dummy variables for each of these observations. (You might call them d68, d1127, d1508, and d1670.) Add these to the regression from part (i), and verify that the OLS coefficients and standard errors on the other variables are identical to those in part (ii). Which of the four dummies has a t statistic statistically different from zero at the 5% level?

(iv) Verify that, in this data set, the data point with the largest studentized residual (largest t statistic on the dummy variable) in part (iii) has a large influence on the OLS estimates. (That is, run OLS using all observations except the one with the large studentized residual.) Does dropping, in turn, each of the other observations with bs >.5 have important effects?

(v) What do you conclude about the sensitivity of OLS to a single observation, even with a large sample size?

(vi) Verify that the LAD estimator is not sensitive to the inclusion of the observation identified in part (iii).
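
Parts (iii) through (vi) above turn on three mechanics: a dummy variable for one observation reproduces the coefficients obtained by dropping it, OLS can move a lot when one observation is removed, and LAD may move much less. The following is a rough sketch on made-up data (not ELEM94_95) with a single regressor. The helpers `ols` and `lad_line` are hypothetical names. `lad_line` uses a brute-force search over lines through two data points, which is only practical for tiny samples and relies on the property that some LAD solution for a line passes through at least two observations. Standard errors are not computed here.

```python
from itertools import combinations

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y, via Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))  # partial pivoting
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def lad_line(x, y):
    """LAD fit (intercept, slope): try every line through two points, keep the smallest
    sum of absolute residuals. Brute force, small samples only."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # vertical line, not a regression line
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum(abs(yi - intercept - slope * xi) for xi, yi in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]

# Made-up data, NOT from ELEM94_95: the last point mimics an observation with bs > .5.
x = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.60]
y = [10.59, 10.57, 10.53, 10.49, 10.47, 10.45, 10.41, 10.38, 10.90]
last = len(x) - 1

b_all = ols([[1.0, xi] for xi in x], y)                  # all observations
b_drop = ols([[1.0, xi] for xi in x[:last]], y[:last])   # outlying observation dropped
# All observations plus a dummy equal to 1 only for the outlying observation.
b_dummy = ols([[1.0, xi, 1.0 if i == last else 0.0] for i, xi in enumerate(x)], y)
lad_all = lad_line(x, y)
lad_drop = lad_line(x[:last], y[:last])

print("OLS slope, all observations:", round(b_all[1], 3))
print("OLS slope, outlier dropped: ", round(b_drop[1], 3))
print("OLS slope, with dummy:      ", round(b_dummy[1], 3))
print("LAD slope, all observations:", round(lad_all[1], 3))
print("LAD slope, outlier dropped: ", round(lad_drop[1], 3))
```

In this made-up sample the dummy-variable regression and the dropped-observation regression give the same intercept and slope, and the dummy's coefficient equals the outlying observation's prediction error from the regression without it. The LAD result here should not be read as general: LAD can still be affected by observations with extreme regressor values, so the exercise's claim has to be checked on the actual data.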
