Problem 1 (explore the data):
In this exercise, use the Peruvian blood pressure data set provided in the file peruvian.txt. (Note for repeat students: the data is different from the data I shared last year.) This dataset consists of variables possibly relating to the blood pressure of n = 30 Peruvians who have moved from rural high-altitude areas to urban lower-altitude areas. The variables in this dataset are: Age, Weight, Height, Pulse, Systol and Diastol. Before reading the data into MATLAB, you can view it in a text editor.
a) Use the readtable function, called as readtable('peruvian.txt','Delimiter','tab'), to read the data into MATLAB. Make sure that MATLAB's current folder is set to the location of the data file. Plot ‘Systol’ on the y-axis against ‘Age’ on the x-axis. Repeat this for the other columns, keeping ‘Systol’ on the y-axis.
b) Use MATLAB help to find out which functions to use to answer the following:
i. What is the range of each variable?
ii. What are the mean and standard deviation of each variable?
iii. Compute the matrix of correlations between the variables. State which variables are highly correlated.
Problem 2 (linear regression):
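One possible way to approach Problem 1 is sketched below. It is only an illustration, not the expected solution. It assumes that peruvian.txt is in the current folder, that its header row names the columns as listed in the problem statement, and that every column is numeric. The 0.7 cutoff used to flag "highly correlated" pairs is an arbitrary assumption, not part of the exercise.

```matlab
% Read the tab-delimited file (assumes it is in the current folder).
T = readtable('peruvian.txt', 'Delimiter', 'tab');
names = T.Properties.VariableNames;
predictors = setdiff(names, {'Systol'}, 'stable');  % every column except Systol

% Part (a): one scatter plot per variable, Systol always on the y-axis.
figure;
for k = 1:numel(predictors)
    subplot(ceil(numel(predictors) / 2), 2, k);
    scatter(T.(predictors{k}), T.Systol, 'filled');
    xlabel(predictors{k});
    ylabel('Systol');
end

% Part (b) i-ii: range, mean and standard deviation of each column.
X = table2array(T);            % numeric matrix, one column per variable
rangeVals = max(X) - min(X);   % column-wise range
meanVals  = mean(X);
stdVals   = std(X);
summaryTbl = table(rangeVals', meanVals', stdVals', ...
    'VariableNames', {'Range', 'Mean', 'Std'}, 'RowNames', names);
disp(summaryTbl);

% Part (b) iii: correlation matrix, shown with row and column labels.
R = corrcoef(X);
disp(array2table(R, 'VariableNames', names, 'RowNames', names));

% List pairs whose absolute correlation exceeds the assumed 0.7 cutoff.
[i, j] = find(triu(abs(R) > 0.7, 1));
for p = 1:numel(i)
    fprintf('%s and %s: r = %.2f\n', names{i(p)}, names{j(p)}, R(i(p), j(p)));
end
```

The range is computed as max minus min so that the sketch does not depend on the Statistics and Machine Learning Toolbox; the toolbox function range gives the same result.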
This question involves the use of simple and multiple linear regression methods on the Peru data set.
a) Use the fitlm() function to perform a multiple linear regression with Systol as the response and the other variables as predictors. Comment on the output. For example:
i. Is there a relationship between the predictors and the response?
ii. Which predictors appear to have a statistically significant relationship to the response?
iii. What does the coefficient for the Weight variable suggest?
iv. How well does the model fit the data?
b) For each predictor, fit a simple linear regression model with Systol as the response. Which of these models is the best? How did you decide? Compare the best model to the model in (a).
c) Using the information from the correlation matrix you computed above, develop a rational approach to fit a model. Which predictors have you picked and why? How well does the model fit the data? Compare this model to the models in (a) and the best model in (b).
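Parts (a) and (b) of Problem 2 could be set up roughly as follows. This is a sketch under assumptions: peruvian.txt is in the current folder, all columns are numeric, and fitlm from the Statistics and Machine Learning Toolbox is available. Using adjusted R-squared to rank the single-predictor models is one reasonable criterion chosen here for illustration; the exercise leaves the choice to you.

```matlab
% Read the data (assumes peruvian.txt is in the current folder).
T = readtable('peruvian.txt', 'Delimiter', 'tab');
names = T.Properties.VariableNames;
predictors = setdiff(names, {'Systol'}, 'stable');

% Part (a): multiple regression of Systol on all other variables.
fullModel = fitlm(T, 'ResponseVar', 'Systol');
disp(fullModel);                      % coefficients, p-values, R-squared
disp(anova(fullModel, 'summary'));    % overall F-test for the regression

% Part (b): one simple regression per predictor.
adjR2 = zeros(numel(predictors), 1);
for k = 1:numel(predictors)
    mdl = fitlm(T, ['Systol ~ ' predictors{k}]);
    adjR2(k) = mdl.Rsquared.Adjusted;
    fprintf('%-8s adjusted R^2 = %.3f, RMSE = %.2f\n', ...
        predictors{k}, adjR2(k), mdl.RMSE);
end

% Pick the single-predictor model with the largest adjusted R-squared
% (an assumed criterion) and compare it with the full model.
[~, best] = max(adjR2);
fprintf('Best single predictor: %s (adjusted R^2 = %.3f)\n', ...
    predictors{best}, adjR2(best));
fprintf('Full model adjusted R^2 = %.3f\n', fullModel.Rsquared.Adjusted);
```

For part (c), the same fitlm call can be reused with a formula that names only the predictors you selected from the correlation matrix.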
In this exercise use the Peruvian blood pressure data set, provided in the file peruvian.txt. This dataset consists of variables possibly relating to blood pressures of n = 39 Peruvians who have moved from rural high altitude areas to urban lower altitude areas. The variables in this dataset are: Age, Years, Weight, Height, Calf, Pulse, Systol and Diastol. Before reading the data into MATLAB, it can be viewed in a text editor. This question involves the use of multiple linear regression...
For Questions 4-11, use the swiss dataset, which is built into R. Fit a multiple linear regression model with Fertility as the response and the remaining variables as predictors. You should use ?swiss to learn about the background of this dataset. 9. Report the value of the F statistic for the significance of regression test. 10. Carry out the significance of regression test using α = 0.01. What decision do you...
Exercise 1. For this exercise use the bdims data set from the openintro package. Type ?bdims to read about this data set in the help menu. Of interest are the variables hgt (height in centimeters), wgt (weight in kilograms), and sex (dummy variable with 1 = male, 0 = female). Since ggplot() requires that a categorical variable be coded as a factor type in R, run the following code: library(openintro); bdims$sex2 <- factor(bdims$sex, levels = c(0, 1), labels = c('F', 'M')) (a) Use ggplot2 to make a...
Exercise 2. Consider the iris data set. (a) Fit a linear regression model for Sepal.Width using Sepal.Length and Species as predictors. Recall that Species is a categorical variable with 3 levels (setosa, versicolor, and virginica). Use summary() to print the results. What is the baseline level for Species in the model? (b) Fit a linear regression model for Sepal.Width using Sepal.Length, Species, and the interaction between Sepal.Length and Species as predictors. Use summary() to print the results. (c)...
Please set eval = FALSE in the code chunk of your RMarkdown, as the output for this question will be too lengthy. In R, use set.seed(35135) and then the rnorm() command to generate 80,000 standard normally distributed observations. Put those values into a matrix with 400 rows and 200 columns. Now fit a multiple linear regression where the tenth column of that matrix is the 'response' variable, and the remainder of the columns are considered predictors. Note: the syntax lm(Y...
Explanation and complete code gets thumbs up. For this problem, use MATLAB to plot the data and the best fit. Show the written solutions and the MATLAB graphs. Problem: Use least squares regression to fit a straight line to
x: 0 2 4 6 9 11 12 15 17 19
y: 5 6 7 6 9 8 7 10 12 12
Along with the slope and intercept, compute the correlation coefficient. Plot the data and the regression line. Then repeat the problem, but regress x versus y - that is, switch...
Solve part (b). Note: Do not use MATLAB (or other programming languages) built-in functions for regression. (a) Write a MATLAB (or other programming language) function that accepts n values of xi and yi, performs linear regression, and returns values of r and the model parameters a0 and a1. (b) Write another MATLAB (or other programming language) function that accepts n values of xi and yi (provided as arrays), checks for Linear, Power (y = αx^β) and Saturation growth-rate (y = a*)...
Need help with stats true or false questions. Decide (with short explanations) whether the following statements are true or false. a) We consider the model y = β0 + β1 x + ε. Let (-0.01, 1.5) be a 95% confidence interval for β1. In this case, a t-test with significance level 1% rejects the null hypothesis H0: β1 = 0 against a two-sided alternative. b) Complicated models with a lot of parameters are better for prediction than simple models with just a few parameters. c)...
Decide (with short explanations) whether the following statements are true or false. e) In a simple linear regression model with explanatory variable x and outcome variable y, we have these summary statistics: x̄ = 10, s_x = 3, s_y = 5 and ȳ = 20. For a new data point with x = 13, it is possible that the predicted value is y = 26. f) A standard multiple regression model with continuous predictors x1 and x2, a categorical predictor T with four values, an interaction between x1 and...
Solve the following problem in R and print out the commands and outputs. 3. The data set fancy (you need to load the fpp package with library() to get the dataset) concerns the monthly sales figures of a shop which opened in January 1987 and sells gifts, souvenirs, and novelties. The sales volume varies with the seasonal population of tourists. (a) Produce a time plot of the data and describe the patterns in the graph. Identify any unusual or unexpected fluctuations in...