3-The population in the city of Houston from 1900 to 2010 is given below:
Year Population
1900 44,633
1910 78,800
1920 138,276
1930 292,352
1940 384,514
1950 596,163
1960 938,219
1970 1,233,505
1980 1,595,138
1990 1,631,766
2000 1,953,631
2010 2,100,263
a. Give a scatter-plot and residual plot of the data.
b. Based on the graphs in part a, propose a model for the data. Show me evidence to support your conclusion. Go through all necessary steps to construct a model of the type you chose
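One possible way to start on parts a and b in R is sketched below. The data frame name houston and the object names linear_fit and log_fit are chosen here for illustration. The log-scale fit is only one candidate model to compare against the straight line, not a stated answer to the problem.

```r
# Houston population by census year (values copied from the table above)
houston <- data.frame(
  year = seq(1900, 2010, by = 10),
  population = c(44633, 78800, 138276, 292352, 384514, 596163,
                 938219, 1233505, 1595138, 1631766, 1953631, 2100263)
)

# Part a: scatter plot, then residual plot from a straight-line fit
plot(houston$year, houston$population, main = "Houston population",
     xlab = "Year", ylab = "Population")
linear_fit <- lm(population ~ year, data = houston)
plot(houston$year, resid(linear_fit), main = "Residuals (linear fit)",
     xlab = "Year", ylab = "Residual")
abline(h = 0, lty = 2)

# Part b (one candidate, an assumption): a straight line on the log scale,
# i.e. an exponential growth model; compare its residual plot with the one above
log_fit <- lm(log(population) ~ year, data = houston)
plot(houston$year, resid(log_fit), main = "Residuals (log fit)",
     xlab = "Year", ylab = "Residual")
abline(h = 0, lty = 2)
```

A residual plot with a clear curved pattern suggests the model form is wrong, while a patternless scatter around zero supports it, so the two residual plots can serve as the evidence part b asks for.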
4-In R Studio use the data cars to determine the following. Hint: The data set is already in R Studio; use the quick reference guide. [In R Studio, use command(file$column), such as mean(cars$speed)]
Description: The data gives the speed of cars and the distances taken to stop. Note that the data were recorded in the 1920s.
Format: A data frame with 50 observations on 2 variables.
speed numeric Speed (mph)
dist numeric Stopping distance (ft)
a. Give a scatter plot of the data. Determine the form, direction and strength of the relationship between speed and stopping distance (dist).
b. Determine the LSRL for predicting stopping distance based on speed of the car.
c. Interpret the slope of this LSRL equation.
d. Determine the correlation. Give an interpretation of the correlation.
e. Determine the coefficient of determination, R². Give an interpretation of R².
f. One of the cars was going 25 mph and had a stopping distance of 85 feet. Determine the residual of this car.
5- The following two-way table describes the age and marital status of American women in 1991. The table entries are in thousands of women.
age | single | married | widowed | divorced |
18-24 | 9008 | 3352 | 8 | 257 |
25-39 | 6658 | 21769 | 248 | 3224 |
40-64 | 19 | 24462 | 2570 | 4755 |
65+ | 900 | 7255 | 8464 | 925 |
a. Construct the marginal distributions of this table in counts.
b. Draw a bar chart to display the marginal distribution of the marital status for all adult women (use percentages).
c. What percent of adult American women under the age of 25 have never married?
d. What percent of 25-39 year old American women are divorced?
e. Compare the conditional distributions of marital status for women aged 18 to 24 and women aged 40 to 64. Briefly describe the most important differences between the two groups of women, and back up your description with percentages.
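The marginal and conditional distributions in problem 5 can be computed in R along the following lines. This is a minimal sketch: the counts are copied as they appear in the table, the object names (counts, age_totals, status_totals, status_pct, row_pct) are made up for illustration, and the first row is labelled "18-24" to follow the wording of part e.

```r
# Counts in thousands of women, copied from the table above
counts <- matrix(
  c(9008,  3352,    8,  257,
    6658, 21769,  248, 3224,
      19, 24462, 2570, 4755,
     900,  7255, 8464,  925),
  nrow = 4, byrow = TRUE,
  dimnames = list(age = c("18-24", "25-39", "40-64", "65+"),
                  status = c("single", "married", "widowed", "divorced"))
)

# a. Marginal distributions in counts
age_totals <- rowSums(counts)
status_totals <- colSums(counts)

# b. Marginal distribution of marital status in percent, as a bar chart
status_pct <- 100 * status_totals / sum(counts)
barplot(status_pct, ylab = "Percent of adult women")

# c, d, e. Conditional distributions of marital status within each age group
row_pct <- 100 * prop.table(counts, margin = 1)
round(row_pct["18-24", "single"], 1)      # part c
round(row_pct["25-39", "divorced"], 1)    # part d
round(row_pct[c("18-24", "40-64"), ], 1)  # part e
```

Using prop.table with margin = 1 divides each row by its own total, which is what a conditional distribution of marital status given age group requires.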
For problems 7 – 10 circle the best answer.
7. In the least-squares regression line, the desired sum of the errors (residuals) should be a. positive b. negative c. zero d. maximized
8. Suppose that a least squares regression line equation is ŷ = 1.65 − 2.20x and the actual y value corresponding to x = 10 is −19. What is the residual value corresponding to y = −19? a. 1.35 b. −1.35 c. 2.10 d. −2.10
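The residual in problem 8 follows from the definition residual = observed y − predicted y. A quick check in R, with variable names picked only for illustration:

```r
x <- 10
y_observed <- -19

# Predicted value from the given line: 1.65 - 2.20 * 10 = -20.35
y_predicted <- 1.65 - 2.20 * x

# Residual = observed - predicted = -19 - (-20.35)
residual <- y_observed - y_predicted
residual   # 1.35
```

The value 1.35 corresponds to choice a.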
Done in R
Solution4A:
a. Give a scatter plot of the data. Determine the form, direction and strength of the relationship between speed and stopping distance (dist).
plot(cars$speed,cars$dist,main="scatterplot",xlab="speed (mph)",ylab="dist (ft)")
Form: linear
Strength: strong
Direction: positive
b. Determine the LSRL for predicting stopping distance based on speed of the car. c. Interpret the slope of this LSRL equation.
LSRL <- lm(cars$dist~cars$speed)
coefficients(LSRL)
Output:
(Intercept) cars$speed
-17.579095 3.932409
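The least squares regression line is
distance = -17.579095 + 3.932409 * speed
slope = 3.932409
y-intercept = -17.579095
Interpretation of the slope: for each additional 1 mph of speed, the predicted stopping distance increases by about 3.93 feet.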
d. Determine the correlation. Give an interpretation of the correlation
cor(cars$speed,cars$dist)
output:
0.8068949
r=0.8068949
There is a strong positive linear relationship between speed and stopping distance: cars travelling at higher speeds tend to have longer stopping distances, and vice versa.
e. Determine the coefficient of determination, R². Give an interpretation of R².
summary(LSRL)
ANSWER:
Multiple R-squared: 0.6511
R² = 0.6511
65.11% of the variation in stopping distance is explained by the linear relationship with speed.
Explained variation = 65.11%
Unexplained variation = 100 - 65.11 = 34.89%
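For a simple linear regression with one predictor, R² equals the square of the correlation r, so the values from parts d and e can be checked against each other. A minimal sketch, where the object names fit, r and r_squared are illustrative and the model is refit with the formula interface:

```r
# Refit the same model using the data = argument (equivalent to the fit above)
fit <- lm(dist ~ speed, data = cars)

# Correlation from part d and R-squared from part e
r <- cor(cars$speed, cars$dist)
r_squared <- summary(fit)$r.squared

round(r^2, 4)        # 0.6511
round(r_squared, 4)  # 0.6511
```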
f. One of the cars was going 25 mph and had a stopping distance of 85 feet. Determine the residual of this car.
We have distance = -17.579095 + 3.932409 * speed
= -17.579095 + 3.932409 * 25
= 80.73113
Residual = observed distance - predicted distance
= 85 - 80.73113
= 4.26887
The car took about 4.27 feet longer to stop than the line predicts.
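The same residual can be obtained without typing the coefficients by hand. The sketch below assumes the model is refit with the formula interface (dist ~ speed, data = cars), since predict() with newdata needs the predictor to be named speed; the names fit, predicted and residual_25 are illustrative.

```r
# Fit with the data = argument so predict() can match the column name "speed"
fit <- lm(dist ~ speed, data = cars)

# Predicted stopping distance at 25 mph
predicted <- predict(fit, newdata = data.frame(speed = 25))

# Residual = observed - predicted, with the observed distance of 85 ft
residual_25 <- 85 - predicted
unname(residual_25)
```

This should agree with the hand calculation above up to rounding of the coefficients.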