Question

A statistician wishes to estimate the price of a used car of a certain brand using linear regression based on the variables Age, Mileage, Crash history, and Owner’s plate number. To do so, she randomly selected a certain number of used cars of the brand and measured AGE, MILEAGE, CRASH HISTORY, and NUMBER OF DIGITS ON PLATE NUMBER. The following is the output from R. Based on the R output, answer the following questions.

Call: lm(formula = PRICE ~ AGE + MILEAGE + CRASH + PLATE_DIGITS) Residuals: Min 1Q Median 3Q Max -1.42685 -0.66292 0.00514 0.

f) Which variable would you first eliminate to build a parsimonious model?

g) If we eliminate “PLATE DIGITS”, we get the R output on the next page. So, can we say this model is better than the preceding one? If so, what are the features that make the new model better than the preceding one?

h) Do you think it is reasonable to eliminate another predictor variable from the new model? If so, which predictor should be eliminated?

Call: lm(formula = PRICE ~ AGE + MILEAGE + CRASH) Residuals: Min 1Q Median 3Q Max -1.6105 -0.6479 0.1210 0.3997 1.7880 Coeffi

i) Common sense tells us “AGE”, “MILEAGE”, and “CRASH HISTORY” are important variables for PRICE prediction so why might we eliminate one of them from the model?

Answer #1

Result:

f) Which variable would you first eliminate to build a parsimonious model?

“PLATE DIGITS” is the variable to eliminate first, because it is not statistically significant and has the largest p-value among the predictors.

g) If we eliminate “PLATE DIGITS”, we get the R output on the next page. So, can we say this model is better than the preceding one? If so, what are the features that make the new model better than the preceding one?

The adjusted R-squared is 0.8906 for the first model and 0.894 for the second. The second model has a larger adjusted R-squared than the first while using one fewer predictor. Therefore the new model is better than the preceding one.
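
To see why dropping a useless predictor can raise adjusted R-squared even though plain R-squared can only go down, here is a minimal Python sketch of the standard formula. The sample size and the R-squared values below are hypothetical, chosen only for illustration; they are not taken from the R output in the question.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R-squared for a model with p predictors (plus an intercept)
    fitted to n observations: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than parameters (n > p + 1)")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical values (NOT from the question's output): n = 30 cars,
# and R-squared drops only slightly when one predictor is removed.
n = 30
full_model = adjusted_r2(0.905, n, p=4)     # AGE, MILEAGE, CRASH, PLATE_DIGITS
reduced_model = adjusted_r2(0.904, n, p=3)  # AGE, MILEAGE, CRASH

print(f"full model adjusted R^2:    {full_model:.4f}")
print(f"reduced model adjusted R^2: {reduced_model:.4f}")
print("reduced model preferred:", reduced_model > full_model)
```

The penalty factor (n - 1) / (n - p - 1) shrinks when p decreases, so a predictor that contributes almost nothing to R-squared costs more than it gives.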

h) Do you think it is reasonable to eliminate another predictor variable from the new model? If so, which predictor should be eliminated?

AGE is not significant in the new model (p-value 0.30396, which is greater than the 0.05 level). Therefore it is reasonable to eliminate AGE from the new model.
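
The rule used in parts f) and h) is one step of backward elimination: drop the predictor with the largest p-value, provided it exceeds the significance level. A rough Python sketch follows. The function name `next_to_drop` is made up for this illustration, and only the AGE p-value (0.30396) comes from the answer; the MILEAGE and CRASH p-values are hypothetical placeholders because the output in the question is truncated.

```python
from typing import Dict, Optional


def next_to_drop(p_values: Dict[str, float], alpha: float = 0.05) -> Optional[str]:
    """Return the predictor with the largest p-value above alpha,
    or None if every predictor is significant at level alpha.
    The intercept should not be included in p_values."""
    candidates = {name: p for name, p in p_values.items() if p > alpha}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# AGE is taken from the answer; the other two values are hypothetical.
p_values = {"AGE": 0.30396, "MILEAGE": 0.001, "CRASH": 0.01}
print(next_to_drop(p_values))
```

After each removal the model has to be refitted, since the remaining p-values change once a predictor is gone.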

i) Common sense tells us “AGE”, “MILEAGE”, and “CRASH HISTORY” are important variables for PRICE prediction so why might we eliminate one of them from the model?

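AGE is correlated with MILEAGE: older cars tend to have higher mileage. Because the two predictors carry largely the same information (multicollinearity), one of them adds little once the other is in the model. Therefore we may eliminate one of them from the model.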
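
One way to check this kind of overlap is to compute the correlation between the two predictors and the resulting variance inflation factor (VIF). The Python sketch below uses made-up AGE and MILEAGE values, not data from the question, and the simplified formula VIF = 1 / (1 - r^2), which is exact only when the model has just these two predictors.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(r: float) -> float:
    """VIF for either predictor in a two-predictor model: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r * r)


# Hypothetical data: age in years, mileage in thousands of miles.
age = [1, 2, 3, 4, 5, 6, 7, 8]
mileage = [12, 25, 31, 48, 55, 70, 79, 95]

r = pearson_r(age, mileage)
print(f"correlation between AGE and MILEAGE: {r:.3f}")
print(f"VIF: {vif_two_predictors(r):.1f}")
```

A large VIF means the coefficient's standard error is inflated, which is how a predictor that matters in practice can still show a large p-value.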
