Question

This problem uses the Wage dataset in the ISLR package in R.

  1. In this part of the problem, we will find a polynomial function of age that best fits the wage data. For each polynomial degree p = 0, 1, 2, ..., 10:

    i. Fit a linear regression to predict wage as a function of age, age^2, ..., age^p (you should include an intercept as well). Note that the p = 0 model is an “intercept-only” model.

    ii. Use 5-fold cross-validation to estimate the test error for this model. Save both the test error and the training error.

    (c) Plot both the test error and training error (on the same plot) for each of the models estimated above as a function of p. What do you observe about the training error as p increases? What about the test error? Based on your results, which model should you select and why?
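
One way the steps above could be coded in R is sketched below. It is a minimal sketch under some assumptions, not a definitive solution: the helper names `poly_cv_errors` and `plot_cv_errors`, the `seed` argument, and the choice of mean squared error as the error measure are all hypothetical (the question does not name them). It uses `poly(age, p)`, which builds an orthogonal basis spanning the same space as age, age^2, ..., age^p, so the fitted values are the same as with raw powers while avoiding numerical problems at high degrees. The same fold assignment is reused for every p so that the models are compared on identical splits.

```r
# Sketch: 5-fold CV error and training error for polynomial fits of wage on age.
# Function names, the seed argument, and the use of MSE are assumptions.

poly_cv_errors <- function(df, max_p = 10, k = 5, seed = 1) {
  set.seed(seed)
  n <- nrow(df)
  # Randomly assign each row to one of k folds of (nearly) equal size.
  fold <- sample(rep(seq_len(k), length.out = n))

  res <- data.frame(p = 0:max_p, train_mse = NA_real_, test_mse = NA_real_)
  for (p in 0:max_p) {
    # p = 0 is the intercept-only model; otherwise a degree-p polynomial in age.
    form <- if (p == 0) {
      wage ~ 1
    } else {
      as.formula(paste0("wage ~ poly(age, ", p, ")"))
    }
    train_err <- numeric(k)
    test_err <- numeric(k)
    for (j in seq_len(k)) {
      held_out <- fold == j
      fit <- lm(form, data = df[!held_out, ])
      # Training error: MSE on the k - 1 folds used for fitting.
      train_err[j] <- mean(residuals(fit)^2)
      # Test error: MSE on the held-out fold.
      pred <- predict(fit, newdata = df[held_out, ])
      test_err[j] <- mean((df$wage[held_out] - pred)^2)
    }
    # Average the per-fold errors for this degree.
    res$train_mse[res$p == p] <- mean(train_err)
    res$test_mse[res$p == p] <- mean(test_err)
  }
  res
}

plot_cv_errors <- function(res) {
  # Draw both error curves on the same axes as a function of p.
  matplot(res$p, as.matrix(res[, c("train_mse", "test_mse")]),
          type = "b", pch = 1:2, lty = 1:2, col = c("black", "red"),
          xlab = "Polynomial degree p", ylab = "Mean squared error")
  legend("topright", legend = c("Training error", "5-fold CV (test) error"),
         pch = 1:2, lty = 1:2, col = c("black", "red"))
}

# Run on the Wage data only if the ISLR package is installed.
if (requireNamespace("ISLR", quietly = TRUE)) {
  res <- poly_cv_errors(ISLR::Wage)
  print(res)
  plot_cv_errors(res)
}
```

When reading the plot, keep in mind that the models are nested, so the training error cannot increase as p grows. The cross-validated error has no such guarantee, and a common choice is the smallest p beyond which it stops improving noticeably.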
