7. The following data summarizes the incidence of Coronary Heart Disease (CHD) for people who smoked regularly and for people who did not smoke regularly:

                 CHD   No CHD
Smoked            84       87
Did not Smoke   2916     4913

(a) State the null and alternative hypothesis for testing if CHD status is independent of smoking status.
(b) Calculate the Likelihood-Ratio test-statistic and its corresponding p-value.
(c) State your conclusion in terms of the problem if α = 0.05.
(d) Describe what “more extreme” would mean in terms of this null. What would it mean for our data to be “different” than the null?
(e) Can we say with statistical evidence that smokers have a higher chance of having CHD based on this test? Explain.
Here we have to test the hypothesis that,
H0 : CHD status is independent of smoking status.
H1 : CHD status is dependent on (associated with) smoking status.
Assume alpha = level of significance = 0.05
This is the test of independence.
Under H0, the test statistic approximately follows a chi-square (X^2) distribution with (R-1)(C-1) degrees of freedom,
where R is number of rows
C is number of columns
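The likelihood-ratio test statistic is
$G^2 = 2 \sum O \ln(O / E)$, summed over all cells of the table,
where O is the observed frequency and
E is the expected frequency, E = (row total × column total) / grand total.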
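As a cross-check on the hand calculation, the expected counts and the likelihood-ratio statistic can be sketched in plain Python. This is a minimal illustration, assuming the usual definition of the statistic (twice the sum of O times ln(O/E) over all cells); the variable names are chosen here for illustration and do not come from the original answer.

```python
import math

# Observed counts from the problem's 2x2 table.
# Rows: Smoked, Did not smoke. Columns: CHD, No CHD.
observed = [[84, 87], [2916, 4913]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total * column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Likelihood-ratio statistic (assumed form): 2 * sum(O * ln(O / E)).
g2 = 2 * sum(
    o * math.log(o / e)
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (R - 1)(C - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print("expected counts:", expected)
print("G^2 =", round(g2, 3), "with df =", df)
```

The result should agree with the likelihood-ratio value reported by the software output for this table.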
We can run this test in MINITAB.
Steps:
Enter the data into a MINITAB worksheet --> Stat --> Tables --> Chi-Square Test for Association --> choose "Summarized data in a two-way table" --> Columns containing the table: select all the data columns together --> OK
————— 30-01-2019 18:54:12 ————————————————————
Welcome to Minitab, press F1 for help.
Chi-Square Test for Association: Worksheet rows, Worksheet columns
Rows: Worksheet rows   Columns: Worksheet columns
           C1      C2    All
1          84      87    171
         64.1   106.9
2        2916    4913   7829
       2935.9  4893.1
All      3000    5000   8000
Cell Contents: Count
               Expected count
Pearson Chi-Square = 10.071, DF = 1, P-Value = 0.002
Likelihood Ratio Chi-Square = 9.772, DF = 1, P-Value = 0.002
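The p-values in the output can be checked without statistical software. The sketch below relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal, so its upper-tail probability is erfc(sqrt(x/2)). The helper name `chi2_sf_df1` is hypothetical, and the shortcut applies only when DF = 1.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 1 df.

    Valid only for df = 1, where X = Z^2 for a standard normal Z,
    so P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2))

# Test statistics as reported in the Minitab output.
pearson_stat = 10.071
lr_stat = 9.772

p_pearson = chi2_sf_df1(pearson_stat)
p_lr = chi2_sf_df1(lr_stat)

alpha = 0.05
print("Pearson p-value:", round(p_pearson, 3))
print("Likelihood-ratio p-value:", round(p_lr, 3))
print("Reject H0 at alpha = 0.05:", p_lr < alpha)
```

Both values round to the 0.002 shown in the output, and both fall below 0.05.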
Likelihood-ratio test statistic = 9.772
P-value = 0.002
Since P-value < alpha (0.002 < 0.05),
we reject H0 at the 5% level of significance.
Conclusion : There is sufficient evidence that CHD status is associated with (not independent of) smoking status.
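(d) Under this null, “more extreme” means observed counts that lie farther from the counts expected under independence, in either direction, which gives a larger test statistic. Data that are “different” from the null are data in which the proportion with CHD differs between smokers and non-smokers.
(e) Not from this test alone. The test of independence is non-directional: it shows only that CHD status and smoking status are associated. The observed counts do point in that direction (84 of 171 smokers had CHD, about 49.1%, versus 2916 of 7829 non-smokers, about 37.2%), but a directional claim requires a one-sided comparison of the two proportions.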
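To see which way the association runs, the observed CHD proportions in the two groups can be compared directly. The sketch below is purely descriptive and is not part of the test of independence; the sample odds ratio is included only as a conventional summary of a 2x2 table, and the variable names are illustrative.

```python
# Observed counts from the problem's table.
smoked_chd, smoked_no_chd = 84, 87
nonsmoked_chd, nonsmoked_no_chd = 2916, 4913

# Sample proportion with CHD in each smoking group.
p_smokers = smoked_chd / (smoked_chd + smoked_no_chd)
p_nonsmokers = nonsmoked_chd / (nonsmoked_chd + nonsmoked_no_chd)

# Sample odds ratio of CHD for smokers relative to non-smokers.
odds_ratio = (smoked_chd * nonsmoked_no_chd) / (smoked_no_chd * nonsmoked_chd)

print("P(CHD | smoked)        =", round(p_smokers, 3))
print("P(CHD | did not smoke) =", round(p_nonsmokers, 3))
print("sample odds ratio      =", round(odds_ratio, 3))
```

An odds ratio above 1 means CHD was more common among smokers in this sample; it describes the data and does not by itself constitute a directional test.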