The data set in stat3_prob2 presents data collected during a solar energy project at Georgia Tech. (Use RStudio to work on this!)
(a) Fit a simple linear regression model relating total heat flux y (kilowatts) to the radial deflection of the deflected rays x (milliradians) and give the fitted regression line.
(b) Construct the analysis-of-variance table and test for significance of regression using α = 0.05.
(c) Find a 99% confidence interval for the slope.
(d) Calculate R².
(e) Find a 95% CI of the mean heat flux when the radial deflection is 16.5 milliradians.
(f) Suppose that we wish to predict the heat flux obtained when the radial deflection is 16.5
milliradians. Find a 95% prediction interval on the heat flux.
(g) Compare the two intervals obtained in parts e and f. Explain the difference between
them. Which one is wider, and why?
(h) Plot the 95% confidence and prediction bands (Hint: the r-code for this can be found in the lecture notes).
(i) Give a point estimate for σ² and calculate a 95% confidence interval.
Data set:
y=c(271.8, 264, 238.8, 230.7, 251.6, 257.9, 263.9, 266.5, 229.1, 239.3, 258, 257.6, 267.3, 267, 259.6, 240.4, 227.2, 196, 278.7, 272.3, 267.4, 254.5, 224.7, 181.5, 227.5, 253.6, 263, 265.8, 263.8)
x=c(16.66, 16.46, 17.66, 17.5, 16.4, 16.28, 16.06, 15.93, 16.6, 16.41, 16.17, 15.92, 16.04, 16.19, 16.62, 17.37, 18.12, 18.53, 15.54, 15.7, 16.45, 17.62, 18.12, 19.05, 16.51, 16.02, 15.89, 15.83, 16.71)
(a)
Loaded the given data in R studio.
y=c(271.8, 264, 238.8, 230.7, 251.6, 257.9, 263.9, 266.5, 229.1, 239.3, 258, 257.6, 267.3, 267, 259.6, 240.4, 227.2, 196, 278.7, 272.3, 267.4, 254.5, 224.7, 181.5, 227.5, 253.6, 263, 265.8, 263.8)
x=c(16.66, 16.46, 17.66, 17.5, 16.4, 16.28, 16.06, 15.93, 16.6, 16.41, 16.17, 15.92, 16.04, 16.19, 16.62, 17.37, 18.12, 18.53, 15.54, 15.7, 16.45, 17.62, 18.12, 19.05, 16.51, 16.02, 15.89, 15.83, 16.71)
Ran the below code to fit a simple linear regression model.
model = lm(y~x)
The output of the model is,
> model
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept)            x
      607.1        -21.4
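So, the fitted regression line is ŷ = 607.1 - 21.4x, where ŷ is the estimated mean heat flux (kilowatts) and x is the radial deflection (milliradians).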
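As a cross-check on the lm() output, the least-squares estimates can also be computed directly from the textbook formulas b1 = Sxy/Sxx and b0 = ȳ - b1·x̄. The following is a minimal sketch; the names Sxx, Sxy, b0 and b1 are illustrative and do not appear in the original solution.

```r
# Data from the problem statement
y <- c(271.8, 264, 238.8, 230.7, 251.6, 257.9, 263.9, 266.5, 229.1, 239.3, 258, 257.6, 267.3, 267, 259.6, 240.4, 227.2, 196, 278.7, 272.3, 267.4, 254.5, 224.7, 181.5, 227.5, 253.6, 263, 265.8, 263.8)
x <- c(16.66, 16.46, 17.66, 17.5, 16.4, 16.28, 16.06, 15.93, 16.6, 16.41, 16.17, 15.92, 16.04, 16.19, 16.62, 17.37, 18.12, 18.53, 15.54, 15.7, 16.45, 17.62, 18.12, 19.05, 16.51, 16.02, 15.89, 15.83, 16.71)

# Corrected sums of squares and cross-products (illustrative names)
Sxx <- sum((x - mean(x))^2)
Sxy <- sum((x - mean(x)) * (y - mean(y)))

# Least-squares slope and intercept
b1 <- Sxy / Sxx
b0 <- mean(y) - b1 * mean(x)

c(intercept = b0, slope = b1)
```

The two values should agree with coef(lm(y ~ x)) up to rounding.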
(b)
Analysis-of-variance table can be found with the anova command.
anova(model)
Analysis of Variance Table
Response: y
          Df  Sum Sq Mean Sq F value    Pr(>F)
x          1 10578.7   10579  69.609 5.935e-09 ***
Residuals 27  4103.2     152
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
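We test H0: β1 = 0 against H1: β1 ≠ 0. Since the p-value (Pr(>F) = 5.935e-09) is less than the significance level α = 0.05, we reject H0 and conclude that the regression is significant, i.e. there is a significant linear relationship between heat flux and radial deflection.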
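The entries of the ANOVA table can be reproduced by hand, which makes the F test explicit: F0 = MSR/MSE is compared with the critical value of the F distribution with 1 and n - 2 degrees of freedom. This is a sketch under the usual normal-error assumptions; the variable names (SST, SSE, SSR, F0, Fcrit) are illustrative.

```r
# Data from the problem statement
y <- c(271.8, 264, 238.8, 230.7, 251.6, 257.9, 263.9, 266.5, 229.1, 239.3, 258, 257.6, 267.3, 267, 259.6, 240.4, 227.2, 196, 278.7, 272.3, 267.4, 254.5, 224.7, 181.5, 227.5, 253.6, 263, 265.8, 263.8)
x <- c(16.66, 16.46, 17.66, 17.5, 16.4, 16.28, 16.06, 15.93, 16.6, 16.41, 16.17, 15.92, 16.04, 16.19, 16.62, 17.37, 18.12, 18.53, 15.54, 15.7, 16.45, 17.62, 18.12, 19.05, 16.51, 16.02, 15.89, 15.83, 16.71)
model <- lm(y ~ x)
n <- length(y)

# Sums of squares
SST <- sum((y - mean(y))^2)   # total
SSE <- sum(resid(model)^2)    # residual (error)
SSR <- SST - SSE              # regression

# Mean squares and F statistic
MSR <- SSR / 1
MSE <- SSE / (n - 2)
F0  <- MSR / MSE

# Critical value at alpha = 0.05 and the p-value
Fcrit  <- qf(0.95, df1 = 1, df2 = n - 2)
pvalue <- pf(F0, df1 = 1, df2 = n - 2, lower.tail = FALSE)

c(SSR = SSR, SSE = SSE, F0 = F0, Fcrit = Fcrit, pvalue = pvalue)
```

Rejecting H0 when F0 > Fcrit is equivalent to rejecting when the p-value is below 0.05.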
(c)
The summary of the model is,
summary(model)
Call:
lm(formula = y ~ x)
Residuals:
     Min       1Q   Median       3Q      Max
-26.2487  -4.5029   0.5202   7.9093  24.5080
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  607.103     42.906  14.150 5.24e-14 ***
x            -21.402      2.565  -8.343 5.94e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 12.33 on 27 degrees of freedom
Multiple R-squared: 0.7205, Adjusted R-squared: 0.7102
F-statistic: 69.61 on 1 and 27 DF, p-value: 5.935e-09
Store the summary of the model in s variable.
s = summary(model)
The estimate of the slope and its standard error are the (2,1) and (2,2) elements of the coefficient matrix. Thus the slope is s$coefficients[2,1] and its standard error is s$coefficients[2,2].
Degrees of freedom = n - 2 = 27
So, the 99% confidence interval for the slope, estimate ± t(0.995, n-2) × standard error, is
c(s$coefficients[2,1] - qt(0.995, length(x)-2) *
s$coefficients[2,2], s$coefficients[2,1] + qt(0.995, length(x)-2) *
s$coefficients[2,2])
[1] -28.50995 -14.29497
99% confidence interval for the slope is (-28.50995, -14.29497)
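The same interval can be obtained with the built-in confint() function from the stats package, which is a convenient check on the manual calculation. In this sketch the name ci_slope is illustrative.

```r
# Data from the problem statement
y <- c(271.8, 264, 238.8, 230.7, 251.6, 257.9, 263.9, 266.5, 229.1, 239.3, 258, 257.6, 267.3, 267, 259.6, 240.4, 227.2, 196, 278.7, 272.3, 267.4, 254.5, 224.7, 181.5, 227.5, 253.6, 263, 265.8, 263.8)
x <- c(16.66, 16.46, 17.66, 17.5, 16.4, 16.28, 16.06, 15.93, 16.6, 16.41, 16.17, 15.92, 16.04, 16.19, 16.62, 17.37, 18.12, 18.53, 15.54, 15.7, 16.45, 17.62, 18.12, 19.05, 16.51, 16.02, 15.89, 15.83, 16.71)
model <- lm(y ~ x)

# 99% confidence interval for the slope only ("x" is the coefficient name)
ci_slope <- confint(model, parm = "x", level = 0.99)
ci_slope
```

Because the whole interval lies below zero, it is consistent with the significance test in part (b).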
(d)
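From the output of the summary command, Multiple R-squared is 0.7205.
So, R² = 0.7205, i.e. about 72% of the variability in heat flux is explained by the linear regression on radial deflection.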
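R² can also be computed from its definition, R² = SSR/SST = 1 - SSE/SST, and in simple linear regression it equals the squared sample correlation between x and y. A short sketch (the names SST, SSE, R2 and R2_cor are illustrative):

```r
# Data from the problem statement
y <- c(271.8, 264, 238.8, 230.7, 251.6, 257.9, 263.9, 266.5, 229.1, 239.3, 258, 257.6, 267.3, 267, 259.6, 240.4, 227.2, 196, 278.7, 272.3, 267.4, 254.5, 224.7, 181.5, 227.5, 253.6, 263, 265.8, 263.8)
x <- c(16.66, 16.46, 17.66, 17.5, 16.4, 16.28, 16.06, 15.93, 16.6, 16.41, 16.17, 15.92, 16.04, 16.19, 16.62, 17.37, 18.12, 18.53, 15.54, 15.7, 16.45, 17.62, 18.12, 19.05, 16.51, 16.02, 15.89, 15.83, 16.71)
model <- lm(y ~ x)

SST <- sum((y - mean(y))^2)   # total sum of squares
SSE <- sum(resid(model)^2)    # residual sum of squares

R2     <- 1 - SSE / SST       # definition of R-squared
R2_cor <- cor(x, y)^2         # squared correlation (simple regression only)

c(R2 = R2, R2_cor = R2_cor, from_summary = summary(model)$r.squared)
```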
(e)
95% CI of the mean heat flux when the radial deflection is 16.5 milliradians is found as
predict(model, data.frame(x=16.5),
interval="confidence")
fit lwr upr
1 253.9627 249.1468 258.7787
So, 95% CI of the mean heat flux when the radial deflection is 16.5 milliradians is
(249.1468, 258.7787)
(f)
95% prediction interval for the heat flux of a new observation when the radial deflection is 16.5 milliradians is found as
predict(model, data.frame(x=16.5),
interval="prediction")
fit lwr upr
1 253.9627 228.214 279.7114
So, 95% prediction interval for the heat flux when the radial deflection is 16.5 milliradians is
(228.214, 279.7114)
(g)
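The confidence interval in part (e) is for the mean heat flux at a radial deflection of 16.5 milliradians, while the prediction interval in part (f) is for a single new observation of heat flux at that deflection. Both are centered at the same fitted value, 253.9627, but the prediction interval (228.214, 279.7114) is much wider than the confidence interval (249.1468, 258.7787). It is wider because it accounts for the random error of an individual observation around the mean (variance σ²) in addition to the uncertainty in the estimated mean response.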
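The difference in width can be seen directly in the standard errors: the interval for the mean uses sqrt(MSE·(1/n + (x0 - x̄)²/Sxx)), while the prediction interval uses sqrt(MSE·(1 + 1/n + (x0 - x̄)²/Sxx)). The sketch below evaluates both by hand at x0 = 16.5; the names x0, se_mean, se_pred, ci_mean and pi_new are illustrative.

```r
# Data from the problem statement
y <- c(271.8, 264, 238.8, 230.7, 251.6, 257.9, 263.9, 266.5, 229.1, 239.3, 258, 257.6, 267.3, 267, 259.6, 240.4, 227.2, 196, 278.7, 272.3, 267.4, 254.5, 224.7, 181.5, 227.5, 253.6, 263, 265.8, 263.8)
x <- c(16.66, 16.46, 17.66, 17.5, 16.4, 16.28, 16.06, 15.93, 16.6, 16.41, 16.17, 15.92, 16.04, 16.19, 16.62, 17.37, 18.12, 18.53, 15.54, 15.7, 16.45, 17.62, 18.12, 19.05, 16.51, 16.02, 15.89, 15.83, 16.71)
model <- lm(y ~ x)

x0  <- 16.5
n   <- length(y)
MSE <- sum(resid(model)^2) / (n - 2)
Sxx <- sum((x - mean(x))^2)

# Fitted value at x0 and the t critical value for 95% intervals
fit0  <- unname(coef(model)[1] + coef(model)[2] * x0)
tcrit <- qt(0.975, df = n - 2)

# Standard errors: the prediction version has the extra "1 +" term
se_mean <- sqrt(MSE * (1 / n + (x0 - mean(x))^2 / Sxx))
se_pred <- sqrt(MSE * (1 + 1 / n + (x0 - mean(x))^2 / Sxx))

ci_mean <- fit0 + c(-1, 1) * tcrit * se_mean   # interval for the mean response
pi_new  <- fit0 + c(-1, 1) * tcrit * se_pred   # interval for a new observation

rbind(confidence = ci_mean, prediction = pi_new)
```

The extra "1 +" term corresponds to the variance of a single observation, so the prediction interval cannot be narrower than the confidence interval at the same x0.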
(h)
Run the code below to plot the 95% confidence band over the observed range of x.
newx <- seq(min(x), max(x), by=0.05)
ci = predict(model, data.frame(x=newx),
interval="confidence")
plot(x,y)
lines(newx, ci[,2], col="blue", lty=2)
lines(newx, ci[,3], col="blue", lty=2)
abline(model, col="lightblue")
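Then run the code below to plot the 95% prediction band. The result is stored in pred_int rather than pi, so that R's built-in constant pi is not masked.
pred_int = predict(model, data.frame(x=newx),
interval="prediction")
plot(x,y)
lines(newx, pred_int[,2], col="blue", lty=2)
lines(newx, pred_int[,3], col="blue", lty=2)
abline(model, col="lightblue")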
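(i)
Part (i) is not worked in the solution above. A minimal sketch, assuming the usual normal-error model: the point estimate of σ² is the residual mean square MSE = SSE/(n - 2), and since (n - 2)·MSE/σ² follows a chi-square distribution with n - 2 degrees of freedom, a 95% confidence interval is ((n - 2)·MSE/χ²(0.975, n-2), (n - 2)·MSE/χ²(0.025, n-2)). The names sigma2_hat and ci_sigma2 are illustrative.

```r
# Data from the problem statement
y <- c(271.8, 264, 238.8, 230.7, 251.6, 257.9, 263.9, 266.5, 229.1, 239.3, 258, 257.6, 267.3, 267, 259.6, 240.4, 227.2, 196, 278.7, 272.3, 267.4, 254.5, 224.7, 181.5, 227.5, 253.6, 263, 265.8, 263.8)
x <- c(16.66, 16.46, 17.66, 17.5, 16.4, 16.28, 16.06, 15.93, 16.6, 16.41, 16.17, 15.92, 16.04, 16.19, 16.62, 17.37, 18.12, 18.53, 15.54, 15.7, 16.45, 17.62, 18.12, 19.05, 16.51, 16.02, 15.89, 15.83, 16.71)
model <- lm(y ~ x)

n   <- length(y)
df  <- n - 2
SSE <- sum(resid(model)^2)

# Point estimate of sigma^2: the residual mean square
sigma2_hat <- SSE / df

# 95% chi-square interval; the upper quantile gives the lower limit
ci_sigma2 <- c(lower = SSE / qchisq(0.975, df),
               upper = SSE / qchisq(0.025, df))

sigma2_hat
ci_sigma2
```

The point estimate matches the residual Mean Sq of about 152 in the ANOVA table (the square of the residual standard error 12.33), and the interval comes out at roughly 95 to 282. The interval is not symmetric about the estimate because the chi-square distribution is skewed.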