Question

Exercise 1. For this exercise use the bdims data set from the openintro package. Type ?bdims to read about this data set in t

Answer #1

Please find the required R code below.

#-------- Install (only needed once) and load the packages
install.packages("openintro")
library(openintro)
library(ggplot2)  # load ggplot2 explicitly so that ggplot() is available
names(bdims)
#-------- Create a labelled factor for sex (0 = F, 1 = M)
bdims$sex2 <- factor(bdims$sex, levels = c(0, 1), labels = c("F", "M"))
#-------- Scatterplot of weight against height, with one fitted line per sex
ggplot(bdims, aes(x = hgt, y = wgt, colour = sex2)) +
  geom_point() +
  geom_smooth(method = "lm")
#-------- Model 1: additive effects of sex and height
model <- lm(wgt ~ sex + hgt, data = bdims)
summary(model)
# Regression equation: wgt = -56.94 + 8.36 sex + 0.71 hgt
#-------- Model 2: adds the sex-by-height interaction
model2 <- lm(wgt ~ sex + hgt + sex:hgt, data = bdims)
summary(model2)
# Regression equation: wgt = -43.81 - 17.13 sex + 0.63 hgt + 0.14 (sex * hgt)
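
To read the interaction model, it can help to plug in the two sex codes. The sketch below does only that arithmetic. It uses the rounded coefficients quoted in the comment for model 2 (copied, not re-estimated) and assumes the coding 0 = F, 1 = M from the factor definition. The helper name line_for_sex is made up for illustration.

```r
# Rounded coefficients copied from the model 2 comment above (assumed, not re-fitted)
b0    <- -43.81  # intercept
b_sex <- -17.13  # sex main effect
b_hgt <-   0.63  # height slope when sex = 0
b_int <-   0.14  # sex-by-height interaction

# Hypothetical helper: implied intercept and slope for a sex code (0 = F, 1 = M)
line_for_sex <- function(sex) {
  c(intercept = b0 + b_sex * sex,
    slope     = b_hgt + b_int * sex)
}

female_line <- line_for_sex(0)
male_line   <- line_for_sex(1)

# One row per sex: the two lines the interaction model implies
print(rbind(F = female_line, M = male_line))
```

For sex = 0 the line is just the baseline intercept and height slope. For sex = 1 the sex and interaction coefficients are added to them, so under model 2 the two groups get different intercepts and different slopes, whereas model 1 forces a common slope.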

[Figure: scatterplot of weight (wgt) against height (hgt), coloured by sex2, with a fitted regression line for each sex.]
