11.38 Building a multiple linear regression model. Let’s now build a model to predict the life-satisfaction score, LSI.
(a) Consider a simple linear regression using GINI as the explanatory variable. Run the regression and summarize the results. Be sure to check assumptions.
(b) Now consider a model using GINI and LIFE. Run the multiple regression and summarize the results. Again be sure to check assumptions.
(c) Now consider a model using GINI, LIFE, and DEMOCRACY. Run the multiple regression and summarize the results. Again be sure to check assumptions.
(d) Now consider a model using all four explanatory variables. Again summarize the results and check assumptions.
Data:
Country | LSI | GINI | CORRUPT | DEMOCRACY | LIFE |
Algeria | 5.4 | 35.3 | 2.8 | 1.5 | 74.5 |
Argentina | 7.3 | 44.5 | 2.8 | 5.5 | 76.95 |
Armenia | 5 | 31.3 | 2.9 | 3 | 73.23 |
Australia | 7.7 | 35.2 | 8.8 | 6 | 81.81 |
Austria | 7.4 | 29.2 | 8.1 | 6 | 79.78 |
Azerbaijan | 5.3 | 33.7 | 2.1 | 1.5 | 67.36 |
Bangladesh | 5.3 | 32.1 | 1.7 | 3.5 | 69.75 |
Belarus | 5.2 | 26.5 | 2.1 | 1 | 71.2 |
Belgium | 7.3 | 33 | 7.1 | 5.5 | 79.51 |
Bolivia | 6.3 | 56.3 | 2.5 | 5 | 67.57 |
Brazil | 7.5 | 54.7 | 3.7 | 4 | 72.53 |
United Kingdom | 7.2 | 36 | 8.6 | 5.5 | 78.54 |
Bulgaria | 4.4 | 28.2 | 4.1 | 4.5 | 73.59 |
Canada | 7.8 | 32.6 | 8.7 | 6 | 81.38 |
Chile | 6.7 | 52.1 | 7.3 | 5 | 77.7 |
Colombia | 7.7 | 55.9 | 4 | 3 | 74.55 |
Denmark | 8.3 | 24.7 | 9.4 | 6 | 78.63 |
Dominican Republic | 7.5 | 47.2 | 3 | 5 | 77.31 |
Egypt | 5.7 | 30.8 | 3.4 | 1.5 | 72.66 |
El Salvador | 6.7 | 48.3 | 4.2 | 4.5 | 73.44 |
Estonia | 6 | 36 | 6.5 | 5.5 | 73.33 |
Finland | 7.9 | 26.9 | 9.4 | 6 | 79.27 |
France | 6.6 | 32.7 | 7.3 | 5.5 | 81.19 |
Germany | 7.1 | 28.3 | 7.8 | 5.5 | 80.07 |
Ghana | 5.2 | 42.8 | 3.5 | 4.5 | 61 |
Greece | 6.4 | 34.3 | 4.6 | 5 | 79.92 |
Honduras | 7 | 57 | 2.6 | 4 | 70.61 |
Hungary | 5.5 | 31.2 | 5.3 | 5.5 | 74.79 |
Iceland | 8.2 | 35.9 | 9.2 | 6 | 80.9 |
India | 5.5 | 33.9 | 2.9 | 4.5 | 66.8 |
Indonesia | 6.3 | 35.6 | 2.2 | 3.5 | 71.33 |
Iran | 5.9 | 38.3 | 2.9 | 1 | 70.06 |
Ireland | 7.6 | 34.3 | 7.5 | 6 | 80.19 |
Israel | 7 | 39.2 | 6.3 | 5 | 80.96 |
Italy | 6.7 | 36 | 5.2 | 5.5 | 81.77 |
Japan | 6.5 | 24.9 | 7.3 | 5.5 | 82.25 |
Jordan | 5.9 | 35.4 | 5.7 | 3 | 80.05 |
Kenya | 3.7 | 47.7 | 2.1 | 1.5 | 59.48 |
Latvia | 5.4 | 34.8 | 4.8 | 5.5 | 72.68 |
Lithuania | 5.5 | 37.6 | 4.8 | 5.5 | 75.34 |
Mali | 4.7 | 33 | 2.9 | 4.5 | 52.61 |
Mexico | 7.9 | 47.2 | 3.5 | 4.5 | 76.47 |
Moldova | 4.9 | 33 | 2.8 | 4 | 71.37 |
Morocco | 5.4 | 40.9 | 3.2 | 2.5 | 75.9 |
Netherlands | 7.6 | 30.9 | 9 | 6 | 79.68 |
New Zealand | 7.5 | 36.2 | 9.6 | 6 | 80.59 |
Nigeria | 5.7 | 42.9 | 1.9 | 3 | 47.56 |
Norway | 7.9 | 25.8 | 8.7 | 6 | 80.2 |
Pakistan | 5 | 30 | 2.1 | 1.5 | 65.99 |
Peru | 6.2 | 48.1 | 3.5 | 3.5 | 72.47 |
Philippines | 5.9 | 43 | 2.5 | 4.5 | 71.66 |
Poland | 6.4 | 34 | 4.2 | 5.5 | 76.05 |
Portugal | 5.7 | 38.5 | 6.5 | 6 | 78.54 |
Romania | 5.7 | 27.4 | 3.7 | 5 | 73.98 |
Russia | 5.5 | 40.1 | 2.3 | 2 | 66.29 |
Senegal | 4.5 | 40.3 | 3.2 | 3.5 | 59.78 |
Slovakia | 5.9 | 26 | 4.9 | 5.5 | 75.83 |
Slovenia | 6.9 | 31.2 | 6.6 | 5.5 | 77.3 |
South-Africa | 5.8 | 63.1 | 4.5 | 5.5 | 49.33 |
South-Korea | 6 | 31.6 | 5 | 5 | 79.05 |
Spain | 7.2 | 34.7 | 6.7 | 5.5 | 81.17 |
Sweden | 7.8 | 25 | 9.3 | 6 | 81.07 |
Switzerland | 8 | 33.7 | 9 | 6 | 81.07 |
Tanzania | 2.8 | 37.6 | 2.9 | 3 | 52.85 |
Turkey | 5.7 | 40 | 4.1 | 2.5 | 72.5 |
Uganda | 4.8 | 44.3 | 2.5 | 1.5 | 53.24 |
Ukraine | 5 | 25.6 | 2.7 | 3 | 68.58 |
Uruguay | 6.7 | 45.3 | 5.9 | 6 | 76.21 |
USA | 7.4 | 40.8 | 7.2 | 6 | 78.37 |
Uzbekistan | 6 | 36.7 | 2.2 | 0.5 | 72.51 |
Vietnam | 6.1 | 35.6 | 2.6 | 0.5 | 72.18 |
Zimbabwe | 3 | 50.1 | 2.6 | 1.5 | 49.64 |
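The solution below was produced with a statistics package. As a rough cross-check, the simple regression in part (a) can also be sketched directly from the table. The helper names `parse_table` and `simple_ols` are illustrative (not from the exercise), and only the first three rows are pasted here to keep the sketch short, so the printed coefficients will not match the full-data output; paste all 72 rows to attempt a comparison.

```python
from statistics import mean

def parse_table(text):
    """Parse pipe-delimited rows (as in the Data section) into a list of dicts."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = [c.strip() for c in lines[0].split("|") if c.strip()]
    rows = []
    for ln in lines[1:]:
        cells = [c.strip() for c in ln.split("|") if c.strip()]
        row = {"Country": cells[0]}
        for name, cell in zip(header[1:], cells[1:]):
            row[name] = float(cell)  # every column after Country is numeric
        rows.append(row)
    return rows

def simple_ols(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# First three rows of the data table only (illustrative subset).
SAMPLE = """
Country | LSI | GINI | CORRUPT | DEMOCRACY | LIFE |
Algeria | 5.4 | 35.3 | 2.8 | 1.5 | 74.5 |
Argentina | 7.3 | 44.5 | 2.8 | 5.5 | 76.95 |
Armenia | 5 | 31.3 | 2.9 | 3 | 73.23 |
"""

rows = parse_table(SAMPLE)
gini = [r["GINI"] for r in rows]
lsi = [r["LSI"] for r in rows]
b0, b1 = simple_ols(gini, lsi)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(gini, lsi)]
print(f"LSI = {b0:.3f} + {b1:.4f} GINI (subset of {len(rows)} rows)")
```

With an intercept in the model the residuals sum to zero, which is a quick sanity check on any least-squares fit.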
a) Simple linear regression using GINI as the explanatory variable
Regression Equation
LSI = 6.503 - 0.0071 GINI
Coefficients
Term | Coef | SE Coef | T-Value | P-Value | VIF |
Constant | 6.503 | 0.643 | 10.11 | 0.000 | |
GINI | -0.0071 | 0.0168 | -0.42 | 0.675 | 1.00 |
Model Summary
S | R-sq | R-sq(adj) | R-sq(pred) |
1.21649 | 0.25% | 0.00% | 0.00% |
Analysis of Variance
Source | DF | Adj SS | Adj MS | F-Value | P-Value |
Regression | 1 | 0.262 | 0.2622 | 0.18 | 0.675 |
GINI | 1 | 0.262 | 0.2622 | 0.18 | 0.675 |
Error | 70 | 103.589 | 1.4798 | | |
Lack-of-Fit | 60 | 89.586 | 1.4931 | 1.07 | 0.494 |
Pure Error | 10 | 14.003 | 1.4003 | | |
Total | 71 | 103.851 | | | |
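Summary
GINI alone is not a significant predictor of LSI (t = -0.42, P = 0.675), and the model explains only 0.25% of the variation in LSI. The lack-of-fit test (P = 0.494) gives no evidence against a straight-line form.
Assumptions
1) The residual graph (not reproduced here) indicates that the normality assumption is satisfied.
2) The plot of residuals versus fitted values shows that the homoscedasticity (constant variance) assumption is satisfied.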
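The Model Summary figures for part (a) follow from the Analysis of Variance table, and the arithmetic can be sketched as below. The function name `summary_from_anova` is illustrative; the inputs are the regression and error sums of squares from the output above, with n = 72 countries and p = 1 explanatory variable.

```python
import math

def summary_from_anova(ss_reg, ss_err, n, p):
    """Recompute S, R-sq, adjusted R-sq and F from sums of squares."""
    ss_tot = ss_reg + ss_err
    df_err = n - p - 1
    mse = ss_err / df_err
    return {
        "S": math.sqrt(mse),                        # residual standard error
        "R-sq": ss_reg / ss_tot,                    # proportion of variation explained
        "R-sq(adj)": 1 - mse / (ss_tot / (n - 1)),  # penalised for model size
        "F": (ss_reg / p) / mse,                    # overall F statistic
    }

# Part (a): values read from the ANOVA table (0.2622 is the Adj MS on 1 DF).
part_a = summary_from_anova(ss_reg=0.2622, ss_err=103.589, n=72, p=1)
for name, value in part_a.items():
    print(f"{name}: {value:.5f}")
```

The adjusted R-sq computed this way comes out slightly negative, which is consistent with the 0.00% shown in the output if the software reports negative values as zero.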
b) Model using GINI and LIFE
Regression Equation
LSI = -3.82 + 0.0394 GINI + 0.1177 LIFE
Coefficients
Term | Coef | SE Coef | T-Value | P-Value | VIF |
Constant | -3.82 | 1.12 | -3.40 | 0.001 | |
GINI | 0.0394 | 0.0119 | 3.32 | 0.001 | 1.19 |
LIFE | 0.1177 | 0.0119 | 9.88 | 0.000 | 1.19 |
Model Summary
S | R-sq | R-sq(adj) | R-sq(pred) |
0.788438 | 58.70% | 57.50% | 52.94% |
Analysis of Variance
Source | DF | Adj SS | Adj MS | F-Value | P-Value |
Regression | 2 | 60.958 | 30.4791 | 49.03 | 0.000 |
GINI | 1 | 6.844 | 6.8437 | 11.01 | 0.001 |
LIFE | 1 | 60.696 | 60.6961 | 97.64 | 0.000 |
Error | 69 | 42.893 | 0.6216 | | |
Total | 71 | 103.851 | | | |
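Summary
Both GINI (t = 3.32, P = 0.001) and LIFE (t = 9.88, P < 0.001) are significant, and the model explains 58.70% of the variation in LSI. The GINI coefficient changes sign, from -0.0071 in part (a) to 0.0394, once LIFE is in the model.
Assumptions
1) The residual graph (not reproduced here) indicates that the normality assumption is satisfied.
2) The plot of residuals versus fitted values shows that the homoscedasticity (constant variance) assumption is satisfied.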
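The VIF of 1.19 reported for both terms in part (b) can be unpacked with a little arithmetic. With exactly two explanatory variables, VIF = 1 / (1 - r^2), where r is the correlation between them, so the VIF implies the size of the GINI-LIFE correlation. The sketch below (the function name is illustrative) inverts that relationship; it recovers only the magnitude of r, not its sign, and the result is approximate because the VIF is rounded.

```python
import math

def r_squared_from_vif(vif):
    """Invert VIF = 1 / (1 - R^2) to recover R^2 among the predictors."""
    return 1 - 1 / vif

vif_b = 1.19                      # VIF for GINI and LIFE in part (b)
r2 = r_squared_from_vif(vif_b)    # share of one predictor explained by the other
abs_r = math.sqrt(r2)             # magnitude of the GINI-LIFE correlation
print(f"R^2 between predictors: {r2:.3f}, |r|: {abs_r:.2f}")
```

A correlation of roughly 0.4 in magnitude is modest, which fits the usual reading that VIF values close to 1 signal little multicollinearity.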
c) Model using GINI, LIFE, and DEMOCRACY
Regression Equation
LSI = -2.94 + 0.0366 GINI + 0.0945 LIFE + 0.2146 DEMOCRACY
Coefficients
Term | Coef | SE Coef | T-Value | P-Value | VIF |
Constant | -2.94 | 1.07 | -2.75 | 0.008 | |
GINI | 0.0366 | 0.0110 | 3.32 | 0.001 | 1.19 |
LIFE | 0.0945 | 0.0128 | 7.36 | 0.000 | 1.61 |
DEMOCRACY | 0.2146 | 0.0607 | 3.53 | 0.001 | 1.39 |
Model Summary
S | R-sq | R-sq(adj) | R-sq(pred) |
0.730038 | 65.10% | 63.56% | 59.68% |
Analysis of Variance
Source | DF | Adj SS | Adj MS | F-Value | P-Value |
Regression | 3 | 67.610 | 22.5367 | 42.29 | 0.000 |
GINI | 1 | 5.884 | 5.8841 | 11.04 | 0.001 |
LIFE | 1 | 28.873 | 28.8733 | 54.18 | 0.000 |
DEMOCRACY | 1 | 6.652 | 6.6519 | 12.48 | 0.001 |
Error | 68 | 36.241 | 0.5330 | | |
Total | 71 | 103.851 | | | |
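Summary
All three explanatory variables are significant (each P-value is 0.001 or smaller), R-sq rises to 65.10%, and the VIFs (1.19 to 1.61) indicate no serious multicollinearity.
Assumptions
1) The residual graph (not reproduced here) indicates that the normality assumption is satisfied.
2) The plot of residuals versus fitted values shows that the homoscedasticity (constant variance) assumption is satisfied.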
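The contribution of DEMOCRACY can be checked by comparing the error sums of squares of the nested models in parts (b) and (c). The sketch below computes the partial F statistic for the added term; the function name `partial_f` is illustrative, and the inputs are the Error rows of the two ANOVA tables (42.893 on 69 DF without DEMOCRACY, 36.241 on 68 DF with it).

```python
def partial_f(sse_reduced, df_reduced, sse_full, df_full):
    """F statistic for the terms added when moving from the reduced to the full model."""
    extra_ss = sse_reduced - sse_full   # variation explained by the added terms
    extra_df = df_reduced - df_full     # number of added terms
    mse_full = sse_full / df_full       # error mean square of the full model
    return (extra_ss / extra_df) / mse_full

# Reduced model: GINI + LIFE (part b). Full model: GINI + LIFE + DEMOCRACY (part c).
f_democracy = partial_f(sse_reduced=42.893, df_reduced=69,
                        sse_full=36.241, df_full=68)
print(f"Partial F for DEMOCRACY: {f_democracy:.2f}")
```

The drop in error sum of squares equals the Adj SS for DEMOCRACY in part (c), and the resulting statistic agrees with the F-Value on that row, which is also approximately the square of the T-Value 3.53.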
d) Model using GINI, LIFE, DEMOCRACY, and CORRUPT
Regression Equation
LSI = -2.31 + 0.0447 GINI + 0.0782 LIFE + 0.0526 DEMOCRACY + 0.1941 CORRUPT
Coefficients
Term | Coef | SE Coef | T-Value | P-Value | VIF |
Constant | -2.31 | 1.01 | -2.28 | 0.026 | |
GINI | 0.0447 | 0.0105 | 4.25 | 0.000 | 1.26 |
LIFE | 0.0782 | 0.0129 | 6.08 | 0.000 | 1.87 |
DEMOCRACY | 0.0526 | 0.0739 | 0.71 | 0.479 | 2.37 |
CORRUPT | 0.1941 | 0.0571 | 3.40 | 0.001 | 3.02 |
Model Summary
S | R-sq | R-sq(adj) | R-sq(pred) |
0.679202 | 70.24% | 68.46% | 64.07% |
Analysis of Variance
Source | DF | Adj SS | Adj MS | F-Value | P-Value |
Regression | 4 | 72.943 | 18.2357 | 39.53 | 0.000 |
GINI | 1 | 8.317 | 8.3169 | 18.03 | 0.000 |
LIFE | 1 | 17.040 | 17.0396 | 36.94 | 0.000 |
DEMOCRACY | 1 | 0.234 | 0.2338 | 0.51 | 0.479 |
CORRUPT | 1 | 5.333 | 5.3328 | 11.56 | 0.001 |
Error | 67 | 30.908 | 0.4613 | | |
Total | 71 | 103.851 | | | |
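Summary
The model is significant overall (F = 39.53, P < 0.001) and explains 70.24% of the variation in LSI. GINI, LIFE, and CORRUPT are significant, but DEMOCRACY is not (t = 0.71, P = 0.479) once CORRUPT is included. The VIFs (1.26 to 3.02) remain modest.
Assumptions
1) The residual graph (not reproduced here) indicates that the normality assumption is satisfied.
2) The plot of residuals versus fitted values shows that the homoscedasticity (constant variance) assumption is satisfied.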
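To see how the four-variable equation is used, the sketch below plugs one row of the data table (Algeria) into the fitted equation from part (d) and computes the residual. The coefficients are the rounded values printed in the output, so the fitted value is approximate; the dictionary and function names are illustrative.

```python
# Rounded coefficients from the part (d) regression equation.
COEF_D = {"Constant": -2.31, "GINI": 0.0447, "LIFE": 0.0782,
          "DEMOCRACY": 0.0526, "CORRUPT": 0.1941}

def fitted_lsi(row, coef=COEF_D):
    """Predicted LSI for one country from the part (d) equation."""
    return coef["Constant"] + sum(
        coef[name] * row[name] for name in ("GINI", "LIFE", "DEMOCRACY", "CORRUPT")
    )

# Algeria's row from the data table.
algeria = {"LSI": 5.4, "GINI": 35.3, "CORRUPT": 2.8, "DEMOCRACY": 1.5, "LIFE": 74.5}

fitted = fitted_lsi(algeria)
residual = algeria["LSI"] - fitted   # observed minus fitted
print(f"fitted = {fitted:.2f}, residual = {residual:.2f}")
```

Repeating this for every country gives the residuals that the normality and residuals-versus-fitted checks are based on.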