Question

Please use the R programming language to answer these questions, and please show me the code as well. Thank you.

1. Problem: dataset: savings; package: faraway
Using R, perform the calculations and answer the following questions.
(a) Calculate the design matrix X and all regression coefficient estimates, as shown in (3).
(b) Calculate the residual standard error, as in (5).
(c) ANOVA table: calculate SST, SSE, SSR, and R^2, as in (6).
     Calculate the ANOVA F-statistic and p-value.
(d) Build a reduced model. Which predictors do you take out? Why? (8)
     Use anova() to compare the 2 models. State your conclusions.
(e) Perform 1 prediction.

Answer #1

(a)

library(faraway)
data(savings)
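# Design matrix: a column of ones for the intercept plus the four predictors
X=model.matrix(sr ~ pop15 + pop75 + dpi + ddpi, savings)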
RM=lm(sr ~ pop15 + pop75 + dpi + ddpi, savings)
RM

Call:
lm(formula = sr ~ pop15 + pop75 + dpi + ddpi, data = savings)

Coefficients:
(Intercept)        pop15        pop75          dpi         ddpi
 28.5660865   -0.4611931   -1.6914977   -0.0003369    0.4096949
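
The question asks for the estimates "as shown in (3)", which is not reproduced here. Assuming (3) is the usual least-squares formula, beta_hat = (X'X)^(-1) X'y, a minimal sketch of the hand calculation follows. The names `beta_hat` and `fit` are introduced only for this illustration.

```r
library(faraway)
data(savings)

# Design matrix: intercept column plus the four predictors
X <- model.matrix(sr ~ pop15 + pop75 + dpi + ddpi, data = savings)
y <- savings$sr

# Solve the normal equations (X'X) b = X'y for the coefficient estimates
beta_hat <- solve(crossprod(X), crossprod(X, y))

# Compare the hand calculation with lm()
fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = savings)
cbind(by_hand = drop(beta_hat), lm = coef(fit))
```

The two columns should agree up to numerical precision. `head(X)` shows the first rows of the design matrix.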

(b)

summary(RM)

Call:
lm(formula = sr ~ pop15 + pop75 + dpi + ddpi, data = savings)

Residuals:
    Min      1Q  Median      3Q     Max
-8.2422 -2.6857 -0.2488  2.4280  9.7509

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) 28.5660865  7.3545161   3.884 0.000334 ***
pop15       -0.4611931  0.1446422  -3.189 0.002603 **
pop75       -1.6914977  1.0835989  -1.561 0.125530
dpi         -0.0003369  0.0009311  -0.362 0.719173
ddpi         0.4096949  0.1961971   2.088 0.042471 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.803 on 45 degrees of freedom
Multiple R-squared: 0.3385,    Adjusted R-squared: 0.2797
F-statistic: 5.756 on 4 and 45 DF, p-value: 0.0007904

The residual standard error is 3.803 on 45 degrees of freedom.
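
The value can also be computed directly rather than read off summary(). Assuming (5) is the usual estimate sqrt(SSE / (n - p)), where p counts the intercept, a short sketch is given below. The names `sse` and `rse` are illustrative.

```r
library(faraway)
data(savings)
fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = savings)

n <- nrow(savings)        # number of observations (50)
p <- length(coef(fit))    # number of coefficients, including the intercept (5)
sse <- sum(resid(fit)^2)  # residual (error) sum of squares

# Residual standard error: square root of SSE over the residual degrees of freedom
rse <- sqrt(sse / (n - p))
rse

# Value stored by summary(), for comparison
summary(fit)$sigma
```

With SSE = 650.71 and 45 degrees of freedom this gives sqrt(650.71 / 45), about 3.803, in agreement with the summary output.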

(c)

anova(RM)

Analysis of Variance Table

Response: sr
          Df Sum Sq Mean Sq F value    Pr(>F)
pop15      1 204.12 204.118 14.1157 0.0004922 ***
pop75      1  53.34  53.343  3.6889 0.0611255 .
dpi        1  12.40  12.401  0.8576 0.3593551
ddpi       1  63.05  63.054  4.3605 0.0424711 *
Residuals 45 650.71  14.460
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
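
anova() prints sequential sums of squares per predictor, not the totals the question asks for. From the table, SSE = 650.71 and SSR = 204.12 + 53.34 + 12.40 + 63.05 = 332.91, so SST is about 983.62 and R^2 = 332.91 / 983.62, about 0.3385. The sketch below computes the same quantities and the overall F-test directly, assuming (6) refers to the standard decomposition SST = SSR + SSE. Variable names are illustrative.

```r
library(faraway)
data(savings)
fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = savings)

y <- savings$sr
sst <- sum((y - mean(y))^2)  # total sum of squares
sse <- sum(resid(fit)^2)     # error (residual) sum of squares
ssr <- sst - sse             # regression sum of squares
r2 <- ssr / sst              # coefficient of determination

df_model <- length(coef(fit)) - 1  # 4 predictors
df_resid <- df.residual(fit)       # 45

# Overall F-statistic and its p-value (upper tail of the F distribution)
f_stat <- (ssr / df_model) / (sse / df_resid)
p_value <- pf(f_stat, df_model, df_resid, lower.tail = FALSE)

c(SST = sst, SSE = sse, SSR = ssr, R2 = r2, F = f_stat, p = p_value)
```

The F-statistic and p-value should match the last line of summary(RM): 5.756 on 4 and 45 DF, p-value 0.0007904.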

(d) The summary of the full model shows that pop15 and ddpi significantly affect the savings rate at the 5% level, while pop75 (p = 0.126) and dpi (p = 0.719) do not. We therefore drop pop75 and dpi and refit the model with the two remaining predictors.

RM1=lm(sr ~ pop15 + ddpi, savings)
summary(RM1)

Call:
lm(formula = sr ~ pop15 + ddpi, data = savings)

Residuals:
    Min      1Q  Median      3Q     Max
-7.5831 -2.8632  0.0453  2.2273 10.4753

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 15.59958    2.33439   6.682 2.48e-08 ***
pop15       -0.21638    0.06033  -3.586 0.000796 ***
ddpi         0.44283    0.19240   2.302 0.025837 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.861 on 47 degrees of freedom
Multiple R-squared: 0.2878,    Adjusted R-squared: 0.2575
F-statistic: 9.496 on 2 and 47 DF, p-value: 0.0003438
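
Part (d) also asks for anova() to compare the two models, which requires passing both fits to anova(). A sketch of that comparison follows; the object names `reduced`, `full`, and `cmp` are illustrative and correspond to RM1 and RM.

```r
library(faraway)
data(savings)

full <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = savings)
reduced <- lm(sr ~ pop15 + ddpi, data = savings)

# Partial F-test: H0 is that the coefficients of pop75 and dpi are both zero
cmp <- anova(reduced, full)
cmp
```

Using the residual sums of squares reported by anova() for each model (700.55 for the reduced model, 650.71 for the full model), the F-statistic is ((700.55 - 650.71) / 2) / 14.460, about 1.72 on 2 and 45 degrees of freedom. This is below the 5% critical value of roughly 3.2, so the test does not reject the reduced model: dropping pop75 and dpi does not significantly worsen the fit.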

anova(RM1)

Analysis of Variance Table

Response: sr
          Df Sum Sq Mean Sq F value    Pr(>F)
pop15      1 204.12 204.118 13.6942 0.0005633 ***
ddpi       1  78.96  78.959  5.2973 0.0258374 *
Residuals 47 700.55  14.905
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
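
(e)

The question does not say which values to predict at, so the sketch below uses a hypothetical country with pop15 = 35 and ddpi = 4, chosen only for illustration (both lie within the range of the data). It uses the reduced model and predict() with a 95% prediction interval.

```r
library(faraway)
data(savings)

reduced <- lm(sr ~ pop15 + ddpi, data = savings)

# Hypothetical new observation; these values are an assumption for illustration
new_country <- data.frame(pop15 = 35, ddpi = 4)

# Point prediction with a 95% prediction interval
pred <- predict(reduced, newdata = new_country,
                interval = "prediction", level = 0.95)
pred
```

From the printed coefficients, the point prediction is 15.59958 - 0.21638(35) + 0.44283(4), about 9.80. Use interval = "confidence" instead to get an interval for the mean savings rate at those predictor values.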
