Regression and Correlation Methods: Correlation, ANOVA, and Least Squares
These methods offer another way of assessing the possible association between a normally distributed variable y and a categorical variable x. They are special cases of linear regression methods. The purpose of this assignment is to demonstrate methods of regression and correlation analysis in which two different variables in the same sample are related.
The following are three important statistics, or methodologies, for using correlation and regression: Pearson's correlation coefficient, ANOVA, and least squares.
In this assignment, you will solve problems related to these three methodologies.
Part 1: Pearson's Correlation Coefficient
For the problem that demonstrates Pearson's correlation coefficient, you will use measures that represent characteristics of entire populations to describe disease in relation to some factor of interest, such as age; utilization of health services; or consumption of a particular food, medication, or other product. To describe the pattern of mortality from coronary heart disease (CHD) in year X, hypothetical death rates from ten states were correlated with per capita cigarette sales, in dollars per month. Death rates were highest in the states with the most cigarette sales, lowest in those with the least sales, and intermediate in the remainder. This observation contributed to the formulation of the hypothesis that cigarette smoking causes fatal CHD. The correlation coefficient, denoted by r, is the descriptive measure of association in correlational studies.
Table 1: Hypothetical Analysis of Cigarette Sales and Death Rates Caused by CHD
State | Cigarette sales | Death rate |
1 | 102 | 5 |
2 | 149 | 6 |
3 | 165 | 6 |
4 | 159 | 5 |
5 | 112 | 3 |
6 | 78 | 2 |
7 | 112 | 5 |
8 | 174 | 7 |
9 | 101 | 4 |
10 | 191 | 6 |
Using the Minitab statistical procedure:
In addition to the above:
Refer to the Assignment Resources: Dot Plots and Correlation and Resources: Performing Regression Analysis to view an example of Pearson's correlation coefficient. These same resources are also available under the lecture Correlation and Regression Methods.
Submission Details:
Part 2: ANOVA
Consider hypothetical data on the blood pressure of individuals and their fat intake, classified as high fat intake (less than 3 grams of total fat per serving) or low fat intake (less than 1 gram of saturated fat).
Table 2: Blood Pressure and Fat Intake
Individual | Blood Pressure | Fat Intake |
1 | 135 | 1 |
2 | 130 | 1 |
3 | 135 | 1 |
4 | 128 | 0 |
5 | 121 | 0 |
6 | 133 | 0 |
7 | 145 | 1 |
8 | 137 | 1 |
9 | 148 | 1 |
10 | 134 | 0 |
11 | 150 | 0 |
12 | 121 | 0 |
13 | 117 | 1 |
14 | 128 | 1 |
15 | 121 | 0 |
16 | 124 | 1 |
17 | 132 | 0 |
18 | 121 | 0 |
19 | 120 | 0 |
20 | 124 | 0 |
Using the Minitab statistical procedure:
In addition, in a Microsoft Word document, provide a written interpretation of your results in APA format.
Refer to the media Resources: One-Way ANOVA under the lecture Correlation and Regression Methods to view an example of ANOVA.
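The one-way ANOVA computation for Table 2 can be sketched by hand outside Minitab. The following is a minimal illustration in plain Python, which is an assumption of this sketch and not the tool the assignment requires. It splits the blood pressure values by the fat intake code as given in Table 2 (the table does not say which code means high or low intake, so the groups are left as 0 and 1), then forms the between-group and within-group sums of squares and the F statistic. All variable names are hypothetical.

```python
# Data copied from Table 2: (blood pressure, fat intake code).
records = [
    (135, 1), (130, 1), (135, 1), (128, 0), (121, 0),
    (133, 0), (145, 1), (137, 1), (148, 1), (134, 0),
    (150, 0), (121, 0), (117, 1), (128, 1), (121, 0),
    (124, 1), (132, 0), (121, 0), (120, 0), (124, 0),
]

# Group the blood pressure values by fat intake code.
groups = {}
for blood_pressure, code in records:
    groups.setdefault(code, []).append(blood_pressure)

all_values = [blood_pressure for blood_pressure, _ in records]
grand_mean = sum(all_values) / len(all_values)


def mean(values):
    return sum(values) / len(values)


# Between-group sum of squares: group size times squared distance
# of the group mean from the grand mean.
ss_between = sum(
    len(values) * (mean(values) - grand_mean) ** 2
    for values in groups.values()
)

# Within-group sum of squares: squared deviations from each group's mean.
ss_within = sum(
    sum((value - mean(values)) ** 2 for value in values)
    for values in groups.values()
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# F is the ratio of the between-group and within-group mean squares.
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"SS between = {ss_between:.2f}, df = {df_between}")
print(f"SS within  = {ss_within:.2f}, df = {df_within}")
print(f"F = {f_stat:.2f}")
```

The F value from this sketch should match the F in the Minitab one-way ANOVA table. The p-value still has to come from Minitab or from an F distribution with the degrees of freedom printed above.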
Submission Details:
Part 3: Least Squares
The following are hypothetical data on the number of doctors per 10,000 inhabitants and the rate of prematurely delivered newborns for different countries of the world.
Table 3: Number of Doctors Versus the Rate of Prematurely Delivered Newborns
Country | Doctors per 100,000 | Early births per 100,000 |
1 | 3 | 92 |
2 | 5 | 88 |
3 | 5 | 85 |
4 | 6 | 86 |
5 | 7 | 89 |
6 | 7 | 75 |
7 | 7 | 70 |
8 | 8 | 68 |
9 | 8 | 69 |
10 | 10 | 50 |
11 | 12 | 45 |
12 | 12 | 41 |
13 | 15 | 38 |
14 | 18 | 35 |
15 | 19 | 30 |
16 | 23 | 6 |
Using the Minitab statistical procedure:
In addition, in a Microsoft Word document, provide a written interpretation of your results in APA format.
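As a cross-check on the Minitab regression output, the least squares line for Table 3 can be computed directly from the closed-form formulas. This is a minimal sketch in plain Python, which is an assumption of this sketch and not the assignment's required tool. It treats the number of doctors as the predictor x and the early birth rate as the response y, and the variable names are hypothetical.

```python
# Data copied from Table 3.
doctors = [3, 5, 5, 6, 7, 7, 7, 8, 8, 10, 12, 12, 15, 18, 19, 23]
early_births = [92, 88, 85, 86, 89, 75, 70, 68, 69, 50, 45, 41, 38, 35, 30, 6]

n = len(doctors)
mean_x = sum(doctors) / n
mean_y = sum(early_births) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in doctors)
syy = sum((y - mean_y) ** 2 for y in early_births)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(doctors, early_births))

# Least squares estimates: slope = Sxy / Sxx, and the fitted line
# passes through the point of means.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Proportion of the variation in y explained by the fitted line.
r_squared = sxy ** 2 / (sxx * syy)

print(f"early births = {intercept:.2f} + ({slope:.2f}) * doctors")
print(f"R-squared = {r_squared:.3f}")
```

The slope is negative for these data: a higher number of doctors goes with a lower rate of early births. The coefficients should agree with the Minitab fitted line up to rounding.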
Submission Details:
Additional Materials
Dot Plots and Correlation
Performing Regression Analysis
1)
Graph -> Scatterplot -> Simple
(Scatter plot)
Stat -> Basic Statistics -> Correlation
Correlation: Cigarette sales, Death rate
Pearson correlation of Cigarette sales and Death rate = 0.826
P-Value = 0.003
r = 0.826, which means there is a strong positive correlation between cigarette sales and death rate.
p-value = 0.003 < alpha (assuming the usual alpha = 0.05), hence the relationship is statistically significant.
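The reported r = 0.826 can be checked by hand from the Table 1 data. This is a minimal sketch in plain Python, which is an assumption of this sketch since the answer itself uses Minitab. It computes the sample Pearson coefficient and the t statistic for testing zero correlation. The function name pearson_r is hypothetical.

```python
import math

# Data copied from Table 1.
cigarette_sales = [102, 149, 165, 159, 112, 78, 112, 174, 101, 191]
death_rate = [5, 6, 6, 5, 3, 2, 5, 7, 4, 6]


def pearson_r(x, y):
    """Return the sample Pearson correlation coefficient of x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(cigarette_sales, death_rate)
n = len(cigarette_sales)

# t statistic for H0: no linear correlation, with n - 2 degrees of freedom.
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))

print(f"r = {r:.3f}")
print(f"t = {t_stat:.2f} on {n - 2} degrees of freedom")
```

The sketch reproduces r = 0.826. The t statistic on 8 degrees of freedom is what Minitab converts into the reported P-Value.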
Please rate
Please post one question at a time