Code in text format:
install.packages("alr4") # Only needs to be run once
library(alr4)
data(wblake)
names(wblake)
# Part (a) solution:
X = wblake$Age; Y = wblake$Length
xbar = mean(X); ybar = mean(Y); n = length(Y)
SSxy = sum((X-xbar)*(Y-ybar))
SSxx = sum((X-xbar)^2); SSyy = sum((Y-ybar)^2)
(b1 = SSxy/SSxx); (b0 = ybar-b1*xbar) # Least-squares estimates of slope and intercept
Y.hat = b0+b1*X # Fitted values: regression of length on age
SSE = sum((Y-Y.hat)^2); SSR = sum((Y.hat-ybar)^2); SST = SSR+SSE
(R.Sq = SSR/SST) # Coefficient of determination
(MSE = SSE/(n-2)) # Estimate of the error variance
(SE.b1 = sqrt(MSE/SSxx)) # SE of slope b1
(SE.b0 = sqrt(MSE*((1/n)+(xbar^2/SSxx)))) # SE of intercept b0
#------------------------------------------------------------------------
#
# Part (b) solution:
# Calculate a 90% confidence interval for beta1.
a = 1-0.90 # alpha = 1 - confidence level (0.90)
t.star = round(qt(1-a/2,n-2),3) # t critical value
ME = t.star*SE.b1
b1-c(ME,-ME) # 90% CI for beta1
#-------------------------------------------------------------------------
#
# Part (c) solution:
# Calculate a prediction and a 90% prediction interval for a smallmouth bass at age 2
x0 = 2
(Y.Pred = b0+b1*x0) # Predicted Y when X=2
ME = t.star*sqrt(MSE)*sqrt(1+(1/n)+((x0-xbar)^2)/SSxx) # Margin of error for a new observation
Y.Pred - c(ME,-ME) # 90% prediction interval at X=2
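One way to sanity-check hand computations like the ones above is to wrap the same formulas in a function and compare the results against lm(), confint() and predict(). The sketch below is not part of the original answer: the helper name slr_by_hand and its arguments are hypothetical, and it runs on R's built-in cars data only so that it works without alr4 installed. It uses the unrounded t critical value.

```r
# Hypothetical helper (not from the original answer): simple linear regression
# computed from sums of squares, mirroring the formulas in the listing above.
slr_by_hand <- function(x, y, x0, level = 0.90) {
  n    <- length(y)
  xbar <- mean(x); ybar <- mean(y)
  SSxx <- sum((x - xbar)^2)
  SSxy <- sum((x - xbar) * (y - ybar))
  b1   <- SSxy / SSxx                               # slope
  b0   <- ybar - b1 * xbar                          # intercept
  MSE  <- sum((y - (b0 + b1 * x))^2) / (n - 2)      # error variance estimate
  t.star  <- qt(1 - (1 - level) / 2, df = n - 2)    # unrounded critical value
  se.b1   <- sqrt(MSE / SSxx)
  y.pred  <- b0 + b1 * x0
  se.pred <- sqrt(MSE * (1 + 1 / n + (x0 - xbar)^2 / SSxx))
  list(coef     = c(b0 = b0, b1 = b1),
       sigma2   = MSE,
       ci.b1    = b1 + c(-1, 1) * t.star * se.b1,
       pred     = y.pred,
       pred.int = y.pred + c(-1, 1) * t.star * se.pred)
}

# Illustration on the built-in cars data (an assumption, chosen so the sketch
# is self-contained); lm() is used here only as a cross-check.
hand <- slr_by_hand(cars$speed, cars$dist, x0 = 10)
fit  <- lm(dist ~ speed, data = cars)

all.equal(unname(hand$coef), unname(coef(fit)))
all.equal(hand$sigma2, summary(fit)$sigma^2)
all.equal(hand$ci.b1, unname(confint(fit, "speed", level = 0.90)[1, ]))
all.equal(c(hand$pred, hand$pred.int),
          unname(predict(fit, data.frame(speed = 10),
                         interval = "prediction", level = 0.90)[1, ]))
```

With alr4 loaded, the same helper could be called as slr_by_hand(wblake$Age, wblake$Length, x0 = 2). The intervals may differ from the listing in the later decimals, since the listing rounds t.star to three places.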
3. R programming 3. This problem uses the wblake data set in the alr4 package. This data set includes samples of smallmouth bass collected in West Bearskin Lake, Minnesota, in 1991. Interest is in predicting length with age. Complete this problem without using lm() in R. (a) Do the regression of length on age, and report the estimates, their standard errors and the estimate of variance. Interpret β0 and (b) Obtain a 90% confidence interval for β1 from the...
1. The data set UN11 in the alr4 package contains several variables, including ppgdp, per capita gross domestic product in US dollars, and fertility, number of children per woman, from the year 2009-2011. The data are for 199 localities, and we will study the regression of ppgdp on fertility. (a) Draw the scatterplot of ppgdp against fertility and describe the relationship between these two variables. Is the trend linear? (b) Replace both variables by their natural logarithms and draw another...
2. The data set prostate in the faraway package is from a study on 97 men with prostate cancer who were due to receive a radical prostatectomy. Interest is in predicting lpsa (log prostate specific antigen) with lcavol (log cancer volume). (a) Draw a scatterplot - does a simple linear regression model seem reasonable? (b) Without using the R function lm(), compute the values x̄, ȳ, Sxx, Syy and Sxy. Compute the ordinary least squares estimates of the...
2. R programming 2. The data set prostate in the faraway package is from a study on 97 men with prostate cancer who were due to receive a radical prostatectomy. Interest is in predicting lpsa (log prostate specific antigen) with lcavol (log cancer volume). (a) Draw a scatterplot - does a simple linear regression model seem reasonable? (b) Without using the R function lm(), compute the values x̄, ȳ, Sxx, Syy and Sxy. Compute the ordinary least squares estimates of the...
Please use RStudio, thanks! 3. This problem uses the prostate data set in the faraway package. (a) Plot lpsa against lcavol. Use the R function lm() to fit the regressions of lpsa on lcavol and lcavol on lpsa. (b) Display both regression lines on the plot. At what point do the two lines intersect? Give a brief explanation.
Exercise 2. [Data analysis, requires R] For this question, use the bac data set from the openintro library. To access this data set, first install the package using install.packages("openintro") (this only needs to be done once). Then load the package into R with the command library(openintro). You can read about this data set in the help menu by entering the command ?openintro or help(openintro). Many people believe that gender, weight, drinking habits, and many other factors are much...
Please help me with the problem 4.7! The reference problem 1.20 is attached, and the data is the full data in question 1.20. Thanks! 20 2 60 4 46 3 41 2 12 1 137 10 68 5 89 5 4 1 32 2 144 9 156 10 93 6 36 3 72 4 100 8 105 7 131 8 127 10 57 4 66 5 101 7 109 7 74 5 134 9 112 7 18 2 73 5...
Using R code: A researcher collects data on the relationship between the amount of daily exercise a person gets and their percent of body fat. She is trying to see if exercise (X) can predict percentage of body fat (Y). The following data were recorded: Individual 1 2 3 4 5 Daily Exercise (min) (X) 10 18 26 33 44 % Fat (Y) 30 25 18 17 14 a. Draw a scatterplot that represents this data set with linear and lowess...
3. (40 points) Use the graph, an output of the least squares prediction equation for the starting salary data (in thousands of dollars) given a graduated student's cumulative GPA, and the table of sampled data below to do the following. Sampled data, as (GPA x, starting salary y) pairs: (3.26, 33.8), (2.60, 29.8), (3.35, 33.5), (2.86, 30.4), (3.82, 36.4), (2.21, 27.6), (3.47, 35.3). Regression plot: Y = 14.8156 + 5.70657x, R-Sq 0.977. (a) Identify and interpret...
R STUDIO Create a simulated bivariate data set consisting of n = 100 (xi, yi) pairs: Generate n random x-coordinates x from N(0, 1). Generate n random errors, ε, from N(0, σ), using σ = 4. Set yi = β0 + β1 xi + εi, where β0 = 2, β1 = 3, and ε ~ N(0, 4). (That is, y is a linear function of x, plus some random noise.) (Now we have simulated data. We'll pretend that we don't know the true y-intercept β0 = 2, the true slope...