For the two variables of interest:
1. Create a scatter plot with Percent Time Asleep as the independent variable x and Longevity as the dependent variable y. The plot must include an informative title, along with correct labels for both axes. Include a plot of the least-squares equation (see #5 below).
2. Calculate the correlation coefficient and the coefficient of determination.
3. Identify any data points on the scatter diagram that appear to be influential. Use Cook's Distance > (4⁄√n) as the criterion for an influential data point. If there are no influential points, say so.
4. Conduct a formal hypothesis test at α=0.05 to determine if there is evidence of linear correlation between the two variables. Present the results in four parts similar to those used for the hypothesis test in Section 4 (omit the distribution type):
   a. Your null and alternate hypotheses in the proper format.
   b. The P-value and its logical relationship to α (≤ or >).
   c. Your decision regarding the null hypothesis: reject or fail to reject.
   d. A statement regarding the sufficiency of the evidence for correlation.
5. Construct the least-squares equation (must be in algebraic format for full credit).
6. Determine if the equation you constructed in #5 above is a valid model. Justify your decision with a detailed analysis that includes an assessment of the coefficient of determination (#2 above), a discussion of the effect of the hypothesis test results (#4 above) on model validity, and an assessment of the residuals, to include a residual histogram, a Q-Q plot, and a plot of the residuals against Percent Time Asleep (with explanatory paragraphs for each graphic).
Pct Time Asleep | Longevity |
22 | 35 |
9 | 37 |
49 | 49 |
1 | 46 |
23 | 63 |
83 | 39 |
23 | 46 |
15 | 56 |
9 | 63 |
81 | 65 |
12 | 56 |
15 | 65 |
37 | 70 |
24 | 63 |
26 | 65 |
17 | 70 |
14 | 77 |
14 | 81 |
6 | 86 |
25 | 70 |
18 | 70 |
26 | 77 |
24 | 77 |
29 | 81 |
27 | 77 |
18 | 40 |
6 | 37 |
19 | 44 |
7 | 47 |
16 | 47 |
13 | 47 |
35 | 68 |
2 | 47 |
35 | 54 |
6 | 61 |
15 | 71 |
14 | 75 |
18 | 89 |
50 | 58 |
25 | 59 |
10 | 62 |
33 | 79 |
43 | 96 |
35 | 58 |
17 | 62 |
27 | 70 |
22 | 72 |
16 | 75 |
20 | 96 |
37 | 75 |
23 | 46 |
4 | 42 |
20 | 65 |
42 | 46 |
9 | 58 |
32 | 42 |
66 | 48 |
28 | 58 |
10 | 50 |
4 | 80 |
12 | 63 |
17 | 65 |
12 | 70 |
23 | 70 |
40 | 72 |
18 | 97 |
10 | 46 |
38 | 56 |
7 | 70 |
23 | 70 |
36 | 72 |
9 | 76 |
21 | 90 |
62 | 76 |
36 | 92 |
23 | 21 |
62 | 40 |
28 | 44 |
18 | 54 |
10 | 36 |
28 | 40 |
22 | 56 |
29 | 60 |
15 | 48 |
73 | 53 |
10 | 60 |
5 | 60 |
13 | 65 |
27 | 68 |
20 | 60 |
21 | 81 |
12 | 81 |
49 | 48 |
17 | 48 |
22 | 56 |
71 | 68 |
17 | 75 |
10 | 81 |
24 | 48 |
18 | 68 |
34 | 16 |
6 | 19 |
4 | 19 |
22 | 32 |
28 | 33 |
31 | 33 |
16 | 30 |
27 | 42 |
8 | 42 |
32 | 33 |
20 | 26 |
35 | 30 |
12 | 40 |
14 | 54 |
17 | 34 |
29 | 34 |
31 | 47 |
6 | 47 |
30 | 42 |
27 | 47 |
40 | 54 |
19 | 54 |
8 | 56 |
8 | 60 |
15 | 44 |
1. Scatter plot:
Steps (Excel): Enter data > select data > Insert > Scatter plot > add trendline and display equation and R-squared > OK
2. Regression:
Steps (Excel): Enter data > Data > Data Analysis > Regression > enter Y range and X range > OK
SUMMARY OUTPUT
Regression Statistics
Statistic | Value | Note |
Multiple R | 0.004178 | correlation |
R Square | 1.75E-05 | determination |
Adjusted R Square | -0.00811 | |
Standard Error | 17.63499 | |
Observations | 125 | |
ANOVA
Source | df | SS | MS | F | Significance F | Note |
Regression | 1 | 0.667661 | 0.667661 | 0.002147 | 0.963119 | > 0.05 (not significant) |
Residual | 123 | 38252.13 | 310.9929 | | | |
Total | 124 | 38252.8 | | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | 57.33157 | 2.822116 | 20.3151 | 4.09E-41 | 51.74536 | 62.91778 |
X Variable 1 | 0.004621 | 0.099735 | 0.046334 | 0.963119 | -0.1928 | 0.202039 |
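Reading the intercept and slope off the coefficients table, the least-squares equation in algebraic form can be written as below, taking x as Percent Time Asleep and ŷ as predicted Longevity. The second line is a quick consistency check on the output: the slope's t statistic is the coefficient divided by its standard error, and in simple regression its square equals the ANOVA F.

```latex
\hat{y} = 57.33157 + 0.004621\,x

t = \frac{b_1}{SE(b_1)} = \frac{0.004621}{0.099735} \approx 0.0463,
\qquad F = t^2 \approx 0.002147
```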
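The model is not significant (Significance F = 0.963119 > 0.05), so there is insufficient evidence of linear correlation between Percent Time Asleep and Longevity.
Only the intercept is significant (p-value = 4.09E-41 < 0.05); the slope is not (p-value = 0.963119 > 0.05).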
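For readers who want to check the Excel figures without the Data Analysis add-in, the following is a minimal Python sketch that recomputes the slope, intercept, correlation coefficient, coefficient of determination, and slope t statistic from the data table. The function name `simple_regression` and the list names `X` and `Y` are illustrative choices, not part of the original answer, and the P-value is not computed because the standard library has no t distribution.

```python
import math

# Pct Time Asleep (X) and Longevity (Y), copied in order from the data table.
X = [22, 9, 49, 1, 23, 83, 23, 15, 9, 81,
     12, 15, 37, 24, 26, 17, 14, 14, 6, 25,
     18, 26, 24, 29, 27, 18, 6, 19, 7, 16,
     13, 35, 2, 35, 6, 15, 14, 18, 50, 25,
     10, 33, 43, 35, 17, 27, 22, 16, 20, 37,
     23, 4, 20, 42, 9, 32, 66, 28, 10, 4,
     12, 17, 12, 23, 40, 18, 10, 38, 7, 23,
     36, 9, 21, 62, 36, 23, 62, 28, 18, 10,
     28, 22, 29, 15, 73, 10, 5, 13, 27, 20,
     21, 12, 49, 17, 22, 71, 17, 10, 24, 18,
     34, 6, 4, 22, 28, 31, 16, 27, 8, 32,
     20, 35, 12, 14, 17, 29, 31, 6, 30, 27,
     40, 19, 8, 8, 15]
Y = [35, 37, 49, 46, 63, 39, 46, 56, 63, 65,
     56, 65, 70, 63, 65, 70, 77, 81, 86, 70,
     70, 77, 77, 81, 77, 40, 37, 44, 47, 47,
     47, 68, 47, 54, 61, 71, 75, 89, 58, 59,
     62, 79, 96, 58, 62, 70, 72, 75, 96, 75,
     46, 42, 65, 46, 58, 42, 48, 58, 50, 80,
     63, 65, 70, 70, 72, 97, 46, 56, 70, 70,
     72, 76, 90, 76, 92, 21, 40, 44, 54, 36,
     40, 56, 60, 48, 53, 60, 60, 65, 68, 60,
     81, 81, 48, 48, 56, 68, 75, 81, 48, 68,
     16, 19, 19, 32, 33, 33, 30, 42, 42, 33,
     26, 30, 40, 54, 34, 34, 47, 47, 42, 47,
     54, 54, 56, 60, 44]


def simple_regression(xs, ys):
    """Least-squares summary for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    sse = syy - slope * sxy                    # residual sum of squares
    se_slope = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return {"n": n, "slope": slope, "intercept": intercept,
            "r": r, "r_squared": r * r, "t_slope": slope / se_slope}


summary = simple_regression(X, Y)
for name, value in summary.items():
    print(f"{name}: {value:.6g}")
```

The printed values can be compared line by line with Multiple R, R Square, the two coefficients, and the t Stat for X Variable 1 in the summary output.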
3.
RESIDUAL OUTPUT
Observation | Predicted Y | Residuals | Standard Residuals |
1 | 57.43323 | -22.4332 | -1.27725 |
2 | 57.37316 | -20.3732 | -1.15996 |
3 | 57.55801 | -8.55801 | -0.48725 |
4 | 57.33619 | -11.3362 | -0.64543 |
5 | 57.43786 | 5.562144 | 0.316683 |
6 | 57.71512 | -18.7151 | -1.06555 |
7 | 57.43786 | -11.4379 | -0.65122 |
8 | 57.40089 | -1.40089 | -0.07976 |
9 | 57.37316 | 5.62684 | 0.320367 |
10 | 57.70588 | 7.294119 | 0.415294 |
11 | 57.38702 | -1.38702 | -0.07897 |
12 | 57.40089 | 7.599113 | 0.432659 |
13 | 57.50255 | 12.49745 | 0.711548 |
14 | 57.44248 | 5.557523 | 0.31642 |
15 | 57.45172 | 7.548281 | 0.429765 |
16 | 57.41013 | 12.58987 | 0.71681 |
17 | 57.39627 | 19.60373 | 1.116148 |
18 | 57.39627 | 23.60373 | 1.34389 |
19 | 57.3593 | 28.6407 | 1.630672 |
20 | 57.4471 | 12.5529 | 0.714706 |
21 | 57.41475 | 12.58525 | 0.716547 |
22 | 57.45172 | 19.54828 | 1.112991 |
23 | 57.44248 | 19.55752 | 1.113517 |
24 | 57.46558 | 23.53442 | 1.339943 |
25 | 57.45634 | 19.54366 | 1.112728 |
26 | 57.41475 | -17.4148 | -0.99152 |
27 | 57.3593 | -20.3593 | -1.15917 |
28 | 57.41937 | -13.4194 | -0.76404 |
29 | 57.36392 | -10.3639 | -0.59007 |
30 | 57.40551 | -10.4055 | -0.59244 |
31 | 57.39164 | -10.3916 | -0.59165 |
32 | 57.49331 | 10.50669 | 0.598204 |
33 | 57.34081 | -10.3408 | -0.58876 |
34 | 57.49331 | -3.49331 | -0.19889 |
35 | 57.3593 | 3.640703 | 0.207285 |
36 | 57.40089 | 13.59911 | 0.774272 |
37 | 57.39627 | 17.60373 | 1.002277 |
38 | 57.41475 | 31.58525 | 1.798321 |
39 | 57.56263 | 0.437374 | 0.024902 |
40 | 57.4471 | 1.552902 | 0.088415 |
41 | 57.37778 | 4.622219 | 0.263168 |
42 | 57.48407 | 21.51593 | 1.22502 |
43 | 57.53028 | 38.46972 | 2.190292 |
44 | 57.49331 | 0.506691 | 0.028849 |
45 | 57.41013 | 4.589871 | 0.261327 |
46 | 57.45634 | 12.54366 | 0.714179 |
47 | 57.43323 | 14.56677 | 0.829366 |
48 | 57.40551 | 17.59449 | 1.001751 |
49 | 57.42399 | 38.57601 | 2.196344 |
50 | 57.50255 | 17.49745 | 0.996226 |
51 | 57.43786 | -11.4379 | -0.65122 |
52 | 57.35005 | -15.3501 | -0.87396 |
53 | 57.42399 | 7.576008 | 0.431344 |
54 | 57.52566 | -11.5257 | -0.65622 |
55 | 57.37316 | 0.62684 | 0.035689 |
56 | 57.47945 | -15.4794 | -0.88133 |
57 | 57.63656 | -9.63656 | -0.54866 |
58 | 57.46096 | 0.539039 | 0.03069 |
59 | 57.37778 | -7.37778 | -0.42006 |
60 | 57.35005 | 22.64995 | 1.289586 |
61 | 57.38702 | 5.612977 | 0.319578 |
62 | 57.41013 | 7.589871 | 0.432133 |
63 | 57.38702 | 12.61298 | 0.718126 |
64 | 57.43786 | 12.56214 | 0.715232 |
65 | 57.51641 | 14.48359 | 0.82463 |
66 | 57.41475 | 39.58525 | 2.253805 |
67 | 57.37778 | -11.3778 | -0.6478 |
68 | 57.50717 | -1.50717 | -0.08581 |
69 | 57.36392 | 12.63608 | 0.719441 |
70 | 57.43786 | 12.56214 | 0.715232 |
71 | 57.49793 | 14.50207 | 0.825682 |
72 | 57.37316 | 18.62684 | 1.060528 |
73 | 57.42861 | 32.57139 | 1.854468 |
74 | 57.61808 | 18.38192 | 1.046584 |
75 | 57.49793 | 34.50207 | 1.964392 |
76 | 57.43786 | -36.4379 | -2.07461 |
77 | 57.61808 | -17.6181 | -1.00309 |
78 | 57.46096 | -13.461 | -0.76641 |
79 | 57.41475 | -3.41475 | -0.19442 |
80 | 57.37778 | -21.3778 | -1.21715 |
81 | 57.46096 | -17.461 | -0.99415 |
82 | 57.43323 | -1.43323 | -0.0816 |
83 | 57.46558 | 2.534417 | 0.144298 |
84 | 57.40089 | -9.40089 | -0.53524 |
85 | 57.66891 | -4.66891 | -0.26583 |
86 | 57.37778 | 2.622219 | 0.149297 |
87 | 57.35468 | 2.645325 | 0.150613 |
88 | 57.39164 | 7.608355 | 0.433185 |
89 | 57.45634 | 10.54366 | 0.600308 |
90 | 57.42399 | 2.576008 | 0.146666 |
91 | 57.42861 | 23.57139 | 1.342048 |
92 | 57.38702 | 23.61298 | 1.344416 |
93 | 57.55801 | -9.55801 | -0.54419 |
94 | 57.41013 | -9.41013 | -0.53577 |
95 | 57.43323 | -1.43323 | -0.0816 |
96 | 57.65967 | 10.34033 | 0.588732 |
97 | 57.41013 | 17.58987 | 1.001488 |
98 | 57.37778 | 23.62222 | 1.344942 |
99 | 57.44248 | -9.44248 | -0.53761 |
100 | 57.41475 | 10.58525 | 0.602676 |
101 | 57.48869 | -41.4887 | -2.36218 |
102 | 57.3593 | -38.3593 | -2.18401 |
103 | 57.35005 | -38.3501 | -2.18348 |
104 | 57.43323 | -25.4332 | -1.44805 |
105 | 57.46096 | -24.461 | -1.3927 |
106 | 57.47482 | -24.4748 | -1.39349 |
107 | 57.40551 | -27.4055 | -1.56035 |
108 | 57.45634 | -15.4563 | -0.88001 |
109 | 57.36854 | -15.3685 | -0.87502 |
110 | 57.47945 | -24.4794 | -1.39375 |
111 | 57.42399 | -31.424 | -1.78914 |
112 | 57.49331 | -27.4933 | -1.56534 |
113 | 57.38702 | -17.387 | -0.98994 |
114 | 57.39627 | -3.39627 | -0.19337 |
115 | 57.41013 | -23.4101 | -1.33287 |
116 | 57.46558 | -23.4656 | -1.33602 |
117 | 57.47482 | -10.4748 | -0.59639 |
118 | 57.3593 | -10.3593 | -0.58981 |
119 | 57.4702 | -15.4702 | -0.8808 |
120 | 57.45634 | -10.4563 | -0.59534 |
121 | 57.51641 | -3.51641 | -0.20021 |
122 | 57.41937 | -3.41937 | -0.19468 |
123 | 57.36854 | -1.36854 | -0.07792 |
124 | 57.36854 | 2.631461 | 0.149824 |
125 | 57.40089 | -13.4009 | -0.76299 |
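Each row of the residual table can be rebuilt from the fitted coefficients. The sketch below does this for the first three observations. The constants are read from the regression output; the scaling by sqrt(SSE / (n - 1)) is an assumption about how Excel computes its "Standard Residuals" column, chosen because it reproduces the tabulated values. The function name `residual_row` is illustrative.

```python
import math

# Values read from the regression output above.
INTERCEPT = 57.33157
SLOPE = 0.004621
SSE = 38252.13   # residual sum of squares
N = 125          # number of observations


def residual_row(x, y):
    """Return (predicted, residual, standard residual) for one observation."""
    predicted = INTERCEPT + SLOPE * x
    residual = y - predicted
    # Assumption: Excel divides by sqrt(SSE / (n - 1)); this matches the table.
    standard = residual / math.sqrt(SSE / (N - 1))
    return predicted, residual, standard


first_rows = [(22, 35), (9, 37), (49, 49)]  # observations 1 to 3
for i, (x, y) in enumerate(first_rows, start=1):
    predicted, residual, standard = residual_row(x, y)
    print(f"{i} | {predicted:.5f} | {residual:.4f} | {standard:.5f}")
```

Small differences in the last digit are expected because the coefficients are rounded in the output.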
4.
Plots:
Q-Q plot:
The points fall close to a straight line, so Y is approximately normal.
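The assignment asks for a Q-Q plot of the residuals, and the coordinates for one can be computed with the standard library alone. This is a sketch under two assumptions: it uses standard normal quantiles with the plotting position (i - 0.5) / n, which is one common convention and not necessarily what Excel's normal probability plot uses, and the function name `qq_points` is illustrative. The sample values are the first five residuals from the residual output.

```python
from statistics import NormalDist


def qq_points(values):
    """Pair each sorted value with a standard normal theoretical quantile."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting position (i - 0.5) / n is an assumed convention.
    return [(std_normal.inv_cdf((i - 0.5) / n), v)
            for i, v in enumerate(ordered, start=1)]


sample = [-22.4332, -20.3732, -8.55801, -11.3362, 5.562144]
points = qq_points(sample)
for theoretical, observed in points:
    print(f"{theoretical:8.4f}  {observed:10.4f}")
```

Plotting the observed values against the theoretical quantiles for all 125 residuals gives the Q-Q plot; points lying near a straight line suggest approximate normality.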
5. Residual vs Predicted:
The residuals scatter randomly with no visible pattern, so there is no evidence of heteroscedasticity.
6.
Cooks distance:
The model is not significant at all, and there does not appear to be any observation whose removal would improve it. We therefore do not refit the model or calculate Cook's distance: no point stands out as influential, and the model would not become significant after such changes.
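If a numerical check against the assignment's criterion is wanted, Cook's distance for a simple linear regression can be computed directly from the residuals and leverages. The sketch below is a minimal stdlib version, not the method used in the answer above: the function names `cooks_distances` and `influential` are illustrative, the cutoff 4/√n is the one stated in the question, and the toy data is hypothetical and only demonstrates the mechanics.

```python
import math


def cooks_distances(xs, ys):
    """Cook's distance for each point of a simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    mse = sum(e * e for e in residuals) / (n - 2)
    p = 2  # fitted parameters: intercept and slope
    distances = []
    for x, e in zip(xs, residuals):
        leverage = 1 / n + (x - mean_x) ** 2 / sxx
        distances.append(e * e / (p * mse) * leverage / (1 - leverage) ** 2)
    return distances


def influential(xs, ys):
    """Indices (0-based) whose Cook's distance exceeds 4 / sqrt(n)."""
    cutoff = 4 / math.sqrt(len(xs))  # criterion as stated in the question
    return [i for i, d in enumerate(cooks_distances(xs, ys)) if d > cutoff]


# Hypothetical toy data: five points on a line plus one far-off point.
toy_x = [1, 2, 3, 4, 5, 20]
toy_y = [2, 4, 6, 8, 10, 0]
flagged = influential(toy_x, toy_y)
print(flagged)
```

Passing the 125 Percent Time Asleep and Longevity values in place of the toy lists would report which observations, if any, exceed the cutoff.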