Consider the R builtin dataset cars: data(mtcars) – Divide the data into training and test data...

Question

Consider the R builtin dataset cars: data(mtcars)

– Divide the data into training and test data such that 80% of the data is randomly assigned to the training data and the remaining 20% is assigned to the test data. Use set.seed(100) in your code before performing the split to maintain reproducibility of results. (Hint: use the R function sample.)
– Fit dist vs speed (as the independent variable) using a linear model on the training data and print a summary of this fit.
– Display the residual plots from the fit.
– Obtain the mean squared error of prediction using the test data.


Answer 1

The complete R code for all parts follows.

library(Metrics) # provides mse(actual, predicted)

# mtcars ships with base R (package datasets), so no extra package is loaded for it
data1=mtcars
set.seed(100)

# mtcars has no speed or dist columns; mpg and disp are used in their place here
speed=data1$mpg
distance=data1$disp

n=length(speed) #total sample size = 32
training_size=as.integer(0.8*n) #training sample size = 25
training_index=sample(1:n,size=training_size) #index vector for training sample
tr_speed=speed[training_index] #training sample speed
tr_distance=distance[training_index] #training sample distance
tr_df=data.frame(tr_speed,tr_distance) #training dataset

test_speed=speed[-training_index] #test sample speed
test_distance=distance[-training_index] #test sample distance
test_df=data.frame(test_speed,test_distance) #test dataset

fit=lm(tr_distance~tr_speed,data=tr_df) #linear model
summary(fit)

plot(tr_speed,fit$residuals) #residual plot

#predictions on the test data (newdata must use the predictor name from the formula)
pred_distance=predict(fit,newdata=data.frame(tr_speed=test_speed))

pmse=mse(test_distance,pred_distance) #mean squared error of prediction
pmse
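
The question asks for dist against speed, which are the two columns of the built-in cars data set; mtcars has neither, and the code above uses mpg and disp instead. If cars is the intended data set, the same steps could be sketched in base R as follows. This is only a sketch under that assumption: the names train_idx, fit_cars, pred and mspe are illustrative, and its numerical results will differ from the mtcars output reported in this answer.

```r
# Assumption: the intended data set is `cars` (50 rows; columns speed and dist)
data(cars)
set.seed(100)

n <- nrow(cars)
train_size <- floor(0.8 * n)                        # 80% of the rows
train_idx <- sample(seq_len(n), size = train_size)  # random training rows

train <- cars[train_idx, ]
test  <- cars[-train_idx, ]

# Linear model of dist on speed, fitted on the training rows only
fit_cars <- lm(dist ~ speed, data = train)
print(summary(fit_cars))

# Standard lm diagnostic plots, arranged in a 2 x 2 grid
op <- par(mfrow = c(2, 2))
plot(fit_cars)
par(op)

# Mean squared error of prediction on the held-out rows
pred <- predict(fit_cars, newdata = test)
mspe <- mean((test$dist - pred)^2)
mspe
```

Keeping the data in a data frame and naming the columns in the formula lets predict() match the predictor by name, so no renaming of test columns is needed.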

------------------------------------------------------------------------------------------------------------------------------------------------------

The indices obtained for the training sample are (10 8 17 2 14 28 22 32 27 4 24 19 6 31 26 12 23 20 15 9 7 21 18 30 16)
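
These indices depend on the R version as well as the seed. From R 3.6.0 onward, sample() uses a different default algorithm ("Rejection" instead of "Rounding"), so the same seed can give a different split there. The sketch below, which assumes R 3.6.0 or later, draws the indices under both settings for comparison; the names idx_old and idx_new are illustrative.

```r
# Assumption: R >= 3.6.0, where set.seed() accepts the sample.kind argument

# Pre-3.6.0 sampler; R warns that it is non-uniform, hence suppressWarnings()
suppressWarnings(set.seed(100, sample.kind = "Rounding"))
idx_old <- sample(1:32, size = 25)

# Current default sampler
set.seed(100, sample.kind = "Rejection")
idx_new <- sample(1:32, size = 25)

idx_old
idx_new
```

If the two vectors differ, the fitted coefficients and the prediction error will differ too, which is worth keeping in mind when comparing against the output reported here.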

The summary of fit is

summary(fit)

Call:
lm(formula = tr_distance ~ tr_speed, data = data.frame(tr_speed,
tr_distance))

Residuals:
    Min      1Q  Median      3Q     Max
-89.503 -36.637  -7.361  42.058 120.703

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  568.594     37.888  15.007 2.26e-13 ***
tr_speed     -16.959      1.767  -9.599 1.64e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 58.09 on 23 degrees of freedom
Multiple R-squared: 0.8002, Adjusted R-squared: 0.7916
F-statistic: 92.14 on 1 and 23 DF, p-value: 1.645e-09

[Figure: residual plot of fit$residuals against tr_speed]

The mean squared error of prediction on the test data is 8144.912
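
The mean squared error of prediction is the average of the squared differences between observed and predicted values on the test data, so it can also be computed without the Metrics package. The helper below is a hypothetical illustration, and the toy vectors are made up for the example rather than taken from the data.

```r
# Hypothetical helper: mean of squared differences between actual and predicted
mse_manual <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  mean((actual - predicted)^2)
}

# Toy vectors for illustration only (not from mtcars or cars)
actual <- c(1, 2, 3)
predicted <- c(1, 2, 5)

# Squared errors are 0, 0 and 4, so the result is 4/3
mse_manual(actual, predicted)
```

With the objects from the answer, the equivalent call would be mean((test_distance - pred_distance)^2).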
