Question 1 (50 pts): Suppose that a client of yours measures the heights (in inches) of...
A client of yours wants to find out the best microbial environment for C. elegans. In previous meetings, the client told you that C. elegans feed on bacteria but may also be killed by certain bacteria. Therefore, it is important to figure out what bacteria are beneficial to C. elegans. In particular, the client was interested in studying the association between the density of Gluconobacter and the density of C. elegans. The client had collected some pilot data for this...
Suppose that you have a five-point sample data set; the observations of (x, y) are given by (8, 3), (10, 3), (6, 2), (2, 0), and (2, 1). Fit a simple linear regression model to this data by first computing the least squares estimate of the slope parameter. Which of the following is the most accurate? Select one: a. 0.3438 b. 0.4728 d. 0.6712
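The slope asked for here is the usual least squares ratio $S_{xy} / S_{xx}$. A minimal Python sketch of that arithmetic on the five points given (the helper name `ols_slope` is only an illustrative choice, not something from the question):

```python
def ols_slope(xs, ys):
    """Least squares slope b1 = Sxy / Sxx for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sxy: sum of cross-products of deviations from the means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Sxx: sum of squared deviations of x from its mean
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

xs = [8, 10, 6, 2, 2]
ys = [3, 3, 2, 0, 1]
slope = ols_slope(xs, ys)
print(f"{slope:.5f}")  # about 0.34375
```

By hand, $S_{xy} = 17.6$ and $S_{xx} = 51.2$, so the slope is 17.6 / 51.2 = 0.34375, which is closest to option a.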
2. The data set prostate in the faraway package is from a study on 97 men with prostate cancer who were due to receive a radical prostatectomy. We are interested in predicting lpsa (log prostate specific antigen) with lcavol (log cancer volume). (a) Draw a scatterplot - does a simple linear regression model seem reasonable? (b) Without using the R function lm(), compute the values $\bar{x}$, $\bar{y}$, $S_{xx}$, $S_{yy}$ and $S_{xy}$. Compute the ordinary least squares estimates of the...
Please solve the question. Simulation: Assume the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + e_i$, where $e_i \sim N(0, \sigma^2)$ for $i = 1, \ldots, n$. Let's set $\beta_0 = 10$, $\beta_1 = -2.5$, and $n = 30$. (a) Set $\sigma = 100$, and $x_i = i$ for $i = 1, \ldots, n$. (b) Your simulation will have 10,000 iterations. Before you start your iterations, set a random seed using your birthday date (MMDD) and report the...
Question 2: Hypothesis testing (30 pts) Consider the following simple linear regression model with $E[\epsilon_i] = 0$ and $\mathrm{var}(\epsilon_i) = \sigma^2$, where $\epsilon_1, \epsilon_2, \ldots, \epsilon_n$ are i.i.d. errors. The output of linear regression from R takes the form:
    Call: lm(formula = y ~ x + 1)
    Residuals:
        Min      1Q  Median      3Q     Max
    -2.0606 -0.3287 -0.1148  0.5902  1.2809
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 0.507932   0.340896    1.49    0.147
    x           0.049656   0.003455   14.37 1.89e-14
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    Residual standard error: 0.7911 on 28 degrees...
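In R's coefficient table, each t value is the estimate divided by its standard error. A short Python sketch that rechecks the two rows quoted above; the second row's label is missing in the pasted output, so calling it `x` here is an assumption based on the formula `y ~ x`:

```python
# Estimates and standard errors copied from the coefficient table above.
# The "x" label for the second row is assumed; the pasted output omits it.
coefficients = {
    "(Intercept)": (0.507932, 0.340896),
    "x": (0.049656, 0.003455),
}

def t_value(estimate, std_error):
    """t statistic for testing H0: coefficient = 0."""
    return estimate / std_error

t_values = {name: t_value(est, se) for name, (est, se) in coefficients.items()}
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}")
```

Both ratios agree with the reported t values (1.49 and 14.37), so the numbers in the garbled table are at least internally consistent.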
If you can't complete it all, I can post more questions, just let me know! There is an old saying in golf: "You drive for show and you putt for dough." The point is that good putting is more important than long driving for shooting low scores and hence winning money. To see if this is the case, data on the top 69 money winners on the PGA tour in 1993 are examined. The average number of putts per hole for...
1. (55 points) The investigators are interested in assessing the relationship between Systolic Blood Pressure (SBP) in mm Hg and Age in years among Hypertensive Patients. Specifically, whether a patient's SBP can be predicted from his or her age. They selected n=122 patients at random from a medical record database in a hospital. Assume that the simple linear regression model is appropriate. The following table shows regression output of a simple linear regression model relating the SBP to the...
QUESTION 1 Consider the following OLS regression line (or sample regression function): wage = -2.10 + 0.50 educ (1), where wage is hourly wage, measured in dollars, and educ is years of formal education. According to (1), a person with no education has a predicted hourly wage of [wagehat] dollars. (NOTE: Write your answer in number format, with 2 decimal places of precision level; do not write your answer as a fraction. Add a leading minus sign symbol, a leading zero and trailing...
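A fitted value from line (1) is just the intercept plus the slope times educ. A minimal Python sketch, with the hypothetical helper `predicted_wage` standing in for equation (1):

```python
def predicted_wage(educ, intercept=-2.10, slope=0.50):
    """Fitted hourly wage (dollars) from regression line (1)."""
    return intercept + slope * educ

# A person with no education has educ = 0, so only the intercept remains.
wage_hat = predicted_wage(0)
print(f"{wage_hat:.2f}")  # -2.10
```

The negative prediction is a reminder that the intercept extrapolates the line to educ = 0, outside the range where it is likely to be meaningful.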
Question 1: (50 marks – 500 words) A colleague is planning to conduct a survey among a sample of university students on a sensitive health topic. He is considering the following methods of data collection: (i) face-to-face interviews; (ii) a self-completion survey using a smartphone app. Recommend the method of data collection that you believe is most appropriate, giving reasons for your choice. Discuss the advantages and disadvantages of each of these methods for this survey. Question 2: (50 marks – 500...
5. (20 pts) Suppose that we have a dataset $\{(y_i, x_{i1}, x_{i2}, x_{i3}),\ i = 1, \ldots, n\}$ together with some general belief on the data that a higher (lower) value of each covariate $x_j$ ($j = 1, 2, 3$) will tend to result in higher (lower) $y$. In this study, we are interested in predicting $y_i$ from the total set of the regressors $x_{i1}, x_{i2}, x_{i3}$. So, we apply the multiple linear regression $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \epsilon_i$ to the data and...