Help with coding in R:
cyl<-factor(scan(text=
"6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8
4"))
am<-factor(scan(text=
"1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 1 1 1
1"))
## 1)## Using the data `cyl` and `am` (transmission type) from Part II, group vehicles into 8 cylinder and less than 8 cylinder. Test whether there is evidence of association between many cylinders and automatic transmissions. (_Hint:_ use `levels()` to re-level `cyl` and then use `chisq.test()`).
Every time I run the following code, I get an error:
levels(cyl) <-c("8 cyl","less than 8")
Error in `levels<-.factor`(`*tmp*`, value = c("8 cyl", "less than 8")) : number of levels differs
## 2)## The built-in dataset `faithful` records the time between eruptions and the length of the prior eruption for 272 inter-eruption intervals (load the data with `data(faithful)`). Examine the distribution of each of these variables with `stem()` or `hist()`. Plot these variables against each other with the length of each eruption (`eruptions`) on the x axis. How would you describe the relationship?
## 3)## Fit a regression of `waiting` as a function of `eruptions`. What can we say about this regression? Compare the distribution of the residuals (`model$resid` where `model` is your lm object) to the distribution of the variables.
## 4)## Is this data well suited to regression? Create a categorical variable from `eruptions` to separate long eruptions from short eruptions (2 groups) and fit a model (ANOVA) of `waiting` based on this. (_Hint:_ use `cut()` to make the categorical variable, and `lm()` to fit the model). How did you choose the point at which to cut the data? How might changing the cutpoint change the results?
You are defining the new levels in the wrong way: the replacement vector needs one entry for each existing level, and `cyl` has three levels, not two. When you run only levels(cyl), you get
> levels(cyl)
[1] "4" "6" "8"
Here the first and second levels, "4" and "6", are less than 8, and the third level, "8", is equal to 8.
Therefore, assign the new labels in the same order as the existing levels: give "less than 8" twice, once for "4" and once for "6", and then give "8 cyl" as the third entry, as shown below:
levels(cyl) <- c("less than 8", "less than 8", "8 cyl")
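As an aside to the positional assignment above, the same merge can be sketched with the named-list form of `levels<-`, which maps old levels to new ones by name rather than by position. This is a minimal sketch, not part of the original program; it rebuilds `cyl` from the question's data, and the `quiet = TRUE` argument is added here only to suppress the "Read 32 items" message.

```r
# Rebuild the original three-level factor from the question's data.
cyl <- factor(scan(text = "6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4",
                   quiet = TRUE))

# Named-list form of levels<-: each name is a new level,
# and each value lists the old levels merged into it.
levels(cyl) <- list("less than 8" = c("4", "6"), "8 cyl" = "8")

levels(cyl)  # the two merged levels, in the order given in the list
table(cyl)   # counts of vehicles in each group
```

Because the old levels are spelled out explicitly, this form does not depend on remembering the order in which `levels(cyl)` lists them.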
The output of the whole program is given below:
> cyl<-factor(scan(text="6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4"))
Read 32 items
> am<-factor(scan(text="1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1"))
Read 32 items
> levels(cyl) <- c("less than 8", "less than 8", "8 cyl")
> levels(cyl)
[1] "less than 8" "8 cyl"
> cyl
 [1] less than 8 less than 8 less than 8 less than 8 8 cyl       less than 8
 [7] 8 cyl       less than 8 less than 8 less than 8 less than 8 8 cyl
[13] 8 cyl       8 cyl       8 cyl       8 cyl       8 cyl       less than 8
[19] less than 8 less than 8 less than 8 8 cyl       8 cyl       8 cyl
[25] 8 cyl       less than 8 less than 8 less than 8 8 cyl       less than 8
[31] 8 cyl       less than 8
Levels: less than 8 8 cyl
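With `cyl` collapsed to two levels, the test that the exercise asks for can be sketched as follows. This is a minimal sketch rather than a full answer: the object names `tab` and `result` are illustrative, `quiet = TRUE` is added only to silence `scan()`, and the reading of `am` as 0 = automatic, 1 = manual assumes the data follow the usual `mtcars` coding.

```r
# Data from the question (32 vehicles).
cyl <- factor(scan(text = "6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4",
                   quiet = TRUE))
am <- factor(scan(text = "1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1",
                  quiet = TRUE))

# Merge "4" and "6" into one group, in the order reported by levels(cyl).
levels(cyl) <- c("less than 8", "less than 8", "8 cyl")

# 2 x 2 contingency table of cylinder group by transmission type.
# Assumption: am = 0 is automatic and am = 1 is manual (mtcars coding).
tab <- table(cyl, am)
tab

# Pearson chi-squared test of independence; for a 2 x 2 table,
# chisq.test() applies Yates' continuity correction by default.
result <- chisq.test(tab)
result

# Expected counts under independence, to check the approximation is reasonable.
result$expected
```

A small p-value in `result` would indicate evidence of an association between having 8 cylinders and the transmission type; the table `tab` shows the direction of that association.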