The data set below contains information about the gasoline mileage performance for 32 automob...

Question

Please answer the following using the R code provided.


The data set below contains information about the gasoline mileage performance for 32 automobiles. We are interested in developing a model to predict the miles per gallon (y) using related predictor variables. The variables in the study are:

Dependent variable:
y: miles per gallon

Independent variables:
x2: horsepower (ft-lb)
x3: torque (ft-lb)
x4: horsepower + torque (ft-lb)
x5: carburetor (barrels)

(a) We first start by fitting a model using y and x1, x2, x3, x4, x5, x6. However, the regression fails to fit the model due to perfect multicollinearity, since x4 is a linear combination of x2 and x3. What is a possible remedy for multicollinearity in this example?

(b) Dropping x4 from the model, we fit a model using y and x1, x2, x3, x5 and x6; that is, we fit the following model:

    y = β0 + β1 x1 + β2 x2 + β3 x3 + β5 x5 + β6 x6 + ε

where the description of the response and the predictors is given above. Below is some R output:

    > model1 <- lm(y ~ x1 + x2 + x3 + x5 + x6)
    > summary(model1)

    Call:
    lm(formula = y ~ x1 + x2 + x3 + x5 + x6)

    Residuals:
       Min     1Q Median     3Q    Max
    -6.780 -1.429 -0.332  1.586  6.296

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept) 33.5412568  3.2028581  10.472 8.04e-11 ***
    x1          -0.0876880  0.0424834  -2.064   0.0491 *
    x2          -0.0553033  0.0740766  -0.747   0.4620
    x3           0.0758799  0.0737334   1.029   0.3129
    x5           1.3299840  1.1131248   1.195   0.2429
    x6          -0.0001946  0.0017454  -0.111   0.9121
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 3.122 on 26 degrees of freedom
    Multiple R-squared:  0.7952, Adjusted R-squared:  0.7558
    F-statistic: 20.19 on 5 and 26 DF,  p-value: 3.29e-08

    > anova(model1)
    Analysis of Variance Table

    Response: y
              Df Sum Sq Mean Sq F value    Pr(>F)
    x1         1 955.34  955.34 97.9968 2.616e-10 ***
    x2         1   6.37    6.37  0.6531    0.4263
    x3         1   2.39    2.39  0.2450    0.6248
    x5         1  19.86   19.86  2.0373    0.1654
    x6         1   0.12    0.12  0.0124    0.9121
    Residuals 26 253.47    9.75
i. Give the prediction equation.

ii. Construct the ANOVA table for this model.

    Source       Sum of squares   df   Mean sum of squares   F
    Regression
    Error
    Total

iii. Test for the significance of the regression using α = 0.05.

(c) Using the following output and plots, comment about the model assumptions.

    > shapiro.test(res)

        Shapiro-Wilk normality test

    data:  res
    W = 0.98429, p-value = 0.9101

[Plots: Normal Q-Q plot; Residuals against fitted values]

(d) Using the following output and plot, what possible transformation will you suggest? Give the equation of the transformed model.

    > result <- boxcox(y ~ x1 + x2 + x3 + x5 + x6, lambda = seq(-1.5, 1, by = 0.01))
    > result$x[result$y == max(result$y)]
    [1] -0.41
(e) Applying the above transformation, we observe that the residuals are more randomly scattered, and we choose to use the transformed model. However, we can observe some outliers in the plots of residuals against fitted values. We would like to further investigate those outliers. Using the plot of residuals against leverage, we observe that observations 2 and 17 have leverage values greater than the cut-off point. We further use the other measurements (DFFITS, DFBETAS, Cook's Distance, COVRATIO) and conclude that we should investigate those two observations further. We refit the model excluding those observations and compare some statistics. Below is a table with the comparisons.

    Model                   PRESS    R²
    Full model              0.148    0.820
    Without no. 2           …        0.824
    Without no. 17          0.146    0.816
    Without no. 2 and 17    0.140    0.820

[The remaining columns of the table are not legible in the transcribed image.]

Comment whether those two points are influential or not.
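For part (a), the effect of a predictor that is an exact sum of two others can be seen on simulated data. The sketch below is only an illustration: the data are generated, not taken from the question, and the coefficients used to simulate y are arbitrary. In R, `lm()` typically does not stop with an error in this situation; it reports `NA` for the aliased coefficient, and dropping the redundant predictor (as part (b) does with x4) gives a full-rank fit.

```r
# Simulated stand-in data (NOT the data set from the question)
set.seed(1)
n  <- 32
x2 <- rnorm(n, mean = 150, sd = 30)   # plays the role of horsepower
x3 <- rnorm(n, mean = 200, sd = 40)   # plays the role of torque
x4 <- x2 + x3                         # exact linear combination of x2 and x3
y  <- 40 - 0.05 * x2 - 0.03 * x3 + rnorm(n)  # arbitrary illustrative model

# Fitting with the redundant predictor: the design matrix is rank deficient
fit_all <- lm(y ~ x2 + x3 + x4)
print(coef(fit_all))   # the coefficient of x4 is reported as NA
print(alias(fit_all))  # shows x4 expressed through x2 and x3

# Remedy: drop the redundant predictor and refit
fit_drop <- lm(y ~ x2 + x3)
print(coef(fit_drop))
```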

Answer 1

(i)

ŷ = 33.5413 - 0.0877(x1) - 0.0553(x2) + 0.0759(x3) + 1.3300(x5) - 0.0002(x6)
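As a quick check of how the prediction equation is used, the sketch below plugs one set of predictor values into the fitted coefficients from the `summary(model1)` output. The predictor values in `new_car` are hypothetical and chosen only for illustration; they do not come from the data set.

```r
# Coefficient estimates copied from the summary(model1) output
coefs <- c("(Intercept)" = 33.5412568,
           x1 = -0.0876880,
           x2 = -0.0553033,
           x3 =  0.0758799,
           x5 =  1.3299840,
           x6 = -0.0001946)

# Hypothetical predictor values for one car (illustrative only)
new_car <- c(x1 = 200, x2 = 150, x3 = 250, x5 = 2, x6 = 3500)

# Predicted value: intercept plus the sum of coefficient * predictor
y_hat <- unname(coefs["(Intercept)"] + sum(coefs[names(new_car)] * new_car))
print(y_hat)
```

With the actual data available, `predict(model1, newdata = ...)` would give the same kind of result directly from the fitted object.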

(ii)

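ANOVA

    Source of variation               DF   Sum of Squares   Mean Sum of Squares   F value
    x1                                 1           955.34                955.34   97.9968
    x2                                 1             6.37                  6.37    0.6531
    x3                                 1             2.39                  2.39    0.2450
    x5                                 1            19.86                 19.86    2.0373
    x6                                 1             0.12                  0.12    0.0124
    Regression (x1, x2, x3, x5, x6)    5           984.08                196.82     20.19
    Error (Residuals)                 26           253.47                  9.75
    Total                             31          1237.55

The Regression row is the sum of the five sequential rows: 955.34 + 6.37 + 2.39 + 19.86 + 0.12 = 984.08, with F = 196.82 / 9.75 = 20.19.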
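The Regression, Error and Total rows requested in part ii can be obtained by pooling the sequential sums of squares printed by `anova(model1)`. The following is a minimal sketch with the numbers copied from that output; the variable names are illustrative.

```r
# Sequential (Type I) sums of squares copied from the anova(model1) output
seq_ss   <- c(x1 = 955.34, x2 = 6.37, x3 = 2.39, x5 = 19.86, x6 = 0.12)
ss_error <- 253.47          # residual sum of squares
df_reg   <- length(seq_ss)  # one degree of freedom per predictor
df_error <- 26

# Pool the predictor rows into a single Regression row
ss_reg   <- sum(seq_ss)
ss_total <- ss_reg + ss_error
ms_reg   <- ss_reg / df_reg
ms_error <- ss_error / df_error
f_stat   <- ms_reg / ms_error

anova_table <- data.frame(
  Source = c("Regression", "Error", "Total"),
  SS     = c(ss_reg, ss_error, ss_total),
  df     = c(df_reg, df_error, df_reg + df_error),
  MS     = c(ms_reg, ms_error, NA),
  F      = c(f_stat, NA, NA)
)
print(anova_table, digits = 5)
```

The pooled F value agrees with the `F-statistic: 20.19 on 5 and 26 DF` line of the summary output, which is a useful consistency check.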

(iii)

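    Regressor   Significance (at 0.05)
    x1          Significant
    x2          Not significant
    x3          Not significant
    x5          Not significant
    x6          Not significant

For the regression as a whole, F = 20.19 on 5 and 26 degrees of freedom with p-value 3.29e-08 < 0.05, so we reject H0: β1 = β2 = β3 = β5 = β6 = 0 and conclude that the regression is significant.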
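The overall test of significance in part iii compares the F-statistic with the upper 5% point of the F distribution on 5 and 26 degrees of freedom. A small sketch of that comparison, using the F value reported by `summary(model1)`; the variable names are illustrative.

```r
alpha  <- 0.05
f_stat <- 20.19   # F-statistic reported by summary(model1)
df_reg <- 5       # numerator degrees of freedom
df_err <- 26      # denominator degrees of freedom

# Critical value and p-value from the F distribution
f_crit  <- qf(1 - alpha, df_reg, df_err)
p_value <- pf(f_stat, df_reg, df_err, lower.tail = FALSE)

# Reject H0 (all slopes zero) when the statistic exceeds the critical value
reject_h0 <- f_stat > f_crit
print(c(f_crit = f_crit, p_value = p_value))
print(reject_h0)
```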

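d) The Box-Cox output gives λ = -0.41 as the value that maximizes the log-likelihood, so the suggested transformation is y^(-0.41). The transformed model is y^(-0.41) = β0 + β1 x1 + β2 x2 + β3 x3 + β5 x5 + β6 x6 + ε. A log transformation (λ = 0) is a common simpler choice, but it is appropriate here only if 0 falls inside the confidence interval for λ shown on the Box-Cox plot.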
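The line `result$x[result$y == max(result$y)]` in the question picks the λ on the grid with the largest Box-Cox log-likelihood. The sketch below reproduces that idea in base R so that it runs without extra packages. It is an illustration under assumptions: it uses the built-in `mtcars` data and the predictors `disp`, `hp` and `wt` as a stand-in, since the question's data are not available, so the λ it finds is not expected to equal -0.41. The helper names `bc` and `boxcox_loglik` are made up for this example.

```r
# Stand-in data: mtcars ships with R; it is NOT the data set from the question
dat <- mtcars
X <- model.matrix(mpg ~ disp + hp + wt, data = dat)
y <- dat$mpg
n <- length(y)

# Box-Cox transform; the log is the limiting case as lambda approaches 0
bc <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

# Profile log-likelihood of lambda (up to an additive constant)
boxcox_loglik <- function(lambda) {
  rss <- sum(lm.fit(X, bc(y, lambda))$residuals^2)
  -n / 2 * log(rss / n) + (lambda - 1) * sum(log(y))
}

# Same grid as in the question's boxcox() call
lambdas    <- seq(-1.5, 1, by = 0.01)
loglik     <- vapply(lambdas, boxcox_loglik, numeric(1))
lambda_hat <- lambdas[which.max(loglik)]
print(lambda_hat)

# Refit the model on the transformed response
fit_t <- lm(bc(mpg, lambda_hat) ~ disp + hp + wt, data = dat)
print(coef(fit_t))
```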

e)

    Model                   R²
    Full model              0.820
    Without no. 2           0.824
    Without no. 17          0.816
    Without no. 2 and 17    0.820

Without observation 17, R² falls from 0.820 to 0.816, whereas without observation 2 it rises to 0.824. Both changes are slight, but because removing observation 17 lowers the coefficient of determination, we can say that observation 17 is more influential than observation 2.
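The diagnostics named in part (e) and the refit-and-compare table can be computed in R along the following lines. This is a sketch under stated assumptions: it uses the built-in `mtcars` data with predictors `disp`, `hp` and `wt` as a stand-in for the question's data, the leverage cut-off 2p/n is a common rule of thumb (the question does not say which cut-off was used), and rows 2 and 17 are dropped only to mirror the question's numbering. The helper names `press` and `compare` are made up for this example.

```r
# Stand-in model on mtcars (NOT the data set or model from the question)
fit <- lm(mpg ~ disp + hp + wt, data = mtcars)
n <- nrow(mtcars)
p <- length(coef(fit))

# Leverage and an assumed rule-of-thumb cut-off of 2p/n
lev      <- hatvalues(fit)
cutoff   <- 2 * p / n
high_lev <- names(lev)[lev > cutoff]
print(high_lev)

# Other case diagnostics mentioned in the question
infl <- data.frame(leverage = lev,
                   cooks    = cooks.distance(fit),
                   dffits   = dffits(fit),
                   covratio = covratio(fit))
print(head(infl))

# PRESS: sum of squared leave-one-out prediction errors
press <- function(m) sum((residuals(m) / (1 - hatvalues(m)))^2)

# Refit without the given rows and collect PRESS and R-squared
compare <- function(drop_rows) {
  d <- if (length(drop_rows) > 0) mtcars[-drop_rows, ] else mtcars
  m <- lm(mpg ~ disp + hp + wt, data = d)
  c(PRESS = press(m), R2 = summary(m)$r.squared)
}

comparison <- rbind(full         = compare(integer(0)),
                    without_2    = compare(2),
                    without_17   = compare(17),
                    without_2_17 = compare(c(2, 17)))
print(comparison)
```

Large shifts in PRESS, R² or the coefficient estimates after dropping a row would point to an influential observation; small shifts, like those in the question's table, suggest limited influence.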
