1)
i)
Assumptions
Linear relationship: The model is a roughly linear one. This is slightly different from simple linear regression as we have multiple explanatory variables. This time we want the outcome variable to have a roughly linear relationship with each of the explanatory variables, taking into account the other explanatory variables in the model.
Homoscedasticity: Ahhh, homoscedasticity - that word again (just rolls off the tongue, doesn't it)! As for simple linear regression, this means that the variance of the residuals should be the same at each level of the explanatory variable(s). This can be tested for each explanatory variable separately, though it is more common just to check that the variance of the residuals is constant at all levels of the predicted outcome from the full model (i.e. the model including all the explanatory variables).
Independent errors: This means that the residuals should be uncorrelated with one another (for example, there should be no autocorrelation when observations are collected over time).
As with simple regression, the assumptions are the most important issues to consider but there are also other potential problems you should look out for:
Outliers/influential cases: As with simple linear regression, it is important to look out for cases which may have a disproportionate influence over your regression model.
Variance in all predictors: It is important that your explanatory variables... well, vary! Explanatory variables may be continuous, ordinal or nominal, but each must take at least two distinct values in the data, even if a nominal variable has only two categories.
Multicollinearity: Multicollinearity exists when two or more of the explanatory variables are highly correlated. This is a problem as it can be hard to disentangle which of them best explains any shared variance with the outcome. It also suggests that the correlated variables may actually represent the same underlying factor.
Normally distributed residuals: The residuals should be normally distributed. This can be checked with a histogram or Q-Q plot of the residuals.
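As a rough illustration of the multicollinearity check above, the pairwise correlation between two explanatory variables can be computed with nothing but the standard library. This is only a crude screen, and the predictor data below is made up for the example; in practice a variance inflation factor (VIF) from a proper stats package is the standard diagnostic.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length samples."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)

# Hypothetical predictors: x2 is almost exactly double x1,
# so the two carry nearly the same information.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 8.0, 10.1, 11.9]

r = pearson_r(x1, x2)
if abs(r) > 0.8:  # a common rule-of-thumb threshold
    print(f"possible multicollinearity: r = {r:.3f}")
```

A high |r| between two predictors is a warning sign, not proof of a problem; VIF catches the case where one predictor is a combination of several others, which pairwise correlations can miss.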
ii)
| Source     | SS      | df | MS       | F        | p-value |
|------------|---------|----|----------|----------|---------|
| Regression | 1009.92 | 3  | 336.64   | 8.732774 | 0.00024 |
| Residual   | 1195.02 | 31 | 38.54903 |          |         |
| Total      | 2204.94 | 34 |          |          |         |
Formulas in Excel
| Source     | SS      | df     | MS     | F      | p-value             |
|------------|---------|--------|--------|--------|---------------------|
| Regression | 1009.92 | 3      | =B2/C2 | =D2/D3 | =F.DIST.RT(E2,3,31) |
| Residual   | =B4-B2  | =C4-C2 | =B3/C3 |        |                     |
| Total      | 2204.94 | 34     |        |        |                     |
MS = SS / df
F = MS Regression / MS Residual
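The same arithmetic can be sketched in plain Python, using the SS and df values from the table above. The p-value (which Excel's F.DIST.RT supplies from the F distribution with 3 and 31 degrees of freedom) would need a stats library, so it is omitted here:

```python
# ANOVA quantities from the table above
ss_total, df_total = 2204.94, 34
ss_reg, df_reg = 1009.92, 3

ss_res = ss_total - ss_reg   # residual SS = total SS - regression SS
df_res = df_total - df_reg   # residual df = total df - regression df

ms_reg = ss_reg / df_reg     # MS = SS / df
ms_res = ss_res / df_res

f_stat = ms_reg / ms_res     # F = MS Regression / MS Residual
print(f"MS reg = {ms_reg:.2f}, MS res = {ms_res:.5f}, F = {f_stat:.6f}")
```

This reproduces MS regression = 336.64, MS residual = 38.54903 and F = 8.732774 from the completed table.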
a)
ŷ = 2.96 - 11.02x1 + 5.13x2 - 1.15x3
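Plugging values into that fitted equation is just arithmetic; the sketch below uses made-up input values purely for illustration:

```python
def predict(x1, x2, x3):
    """Predicted y from the fitted equation above."""
    return 2.96 - 11.02 * x1 + 5.13 * x2 - 1.15 * x3

# Hypothetical values for the three explanatory variables
print(round(predict(1, 2, 3), 2))  # -1.25
```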