The following table provides a set of training data that contains 10 observations, 2 characteristics and a qualitative response variable that tells us if the patient should be classified as sick and go immediately to emergency.
Observation | X1 | X2 | Y |
---|---|---|---|
1 | 4 | 1 | Healthy |
2 | 1 | 0 | Healthy |
3 | 2 | 4 | Sick |
4 | 3 | 2 | Healthy |
5 | 1 | 2 | Sick |
6 | 2 | 5 | Sick |
7 | 4 | 3 | Healthy |
8 | 3 | 4 | Sick |
9 | 2 | 3 | Sick |
10 | 6 | 2 | Healthy |
.Suppose we want to create an automatic predictor using a linear classifier for this data set. Using your classifier give the prediction for :
a. X 1 = 3 and X 2 = 4
b. X 1 = 4 and X 2 = 5
c. X 1 = 2 and X 2 = 2
d. What problem could be found with this linear classifier?
data stored in Book1.csv file below where yellow onces are the train observations and green are the prediction parts
Created the proper dataset above
Now done the analysis below in R
data prepared above. Now algo predict below
The predicted values are sick,Healty and sick respectively for a , b and c
d)The only problem we can find in the above naive bayes linear classifier is if we have an unknown level of X1 or X2 we will not be able to classify as this is bayesian method where in the train set we don't have that particular value
Hope the above answer has helped you in understanding the problem. Please upvote the ans if it has really helped you. Good Luck!!
The following table provides a set of training data that contains 10 observations, 2 characteristics and a qualitative r...
The table below provides a training data set containing nine observations. two pedictors α1 and X) and one qualitative response variable (Category) X1 X2 Below is scatter plot which shows the data, with Category indicated 1.0 -0.5 0.0 0.5 1.0 1.5 2.0 X1 Suppose we wish to use this data set to make a prediction for Category when X-0 and X-3 using K nearest neighbors. a Compute the Euclidean distance between each observation and the test point X b. What...
The data set "mtcars" in R has 11 variables with 32 observations. A data frame with 32 observations on 11 variables. [, 1] mpg Miles/(US) gallon [, 2] cyl Number of cylinders [, 3] disp Displacement (cu.in.) [, 4] hp Gross horsepower [, 5] drat Rear axle ratio [, 6] wt Weight (1000 lbs) [, 7] qsec 1/4 mile time [, 8] vs V/S [, 9) am Transmission (0 = automatic, 1 = manual) [,10] gear Number of forward gears...
A dataset with n = 7 observations in p = 2 dimensions is provided as follows. For each observation, there is an associated class label (Red or Blue). If we train this dataset with a maximal margin classifier, how many support vectors will be used in this classifier? Answers 2,3,4,5 observation X1 X2 Y 1 2 2 RED 2 2 3 RED 3 2 4 RED 4 4 4 REd 5 4 1 BLUE 6 3 1 BLUE 7 4...
You are presented with a data-set containing 6 observations, and wish to cluster them using hierarchal clustering. A similarity function s(x,x') is chosen which measures how similar observation r is to observation ', for each pair of observations. The distance function is then defined as d(x, x)-1- s(x,') The similarity scores are give by the below matrix, where the [ijlth entry is the similarty between observations r and a 123456 11.00 0.70 0.65 0.40 0.20 0.05 2 0.70 1.00 0.95...
Consider a data set consisting of values for three variables: x, y, and z. Three observations are made on each of the three variables. The following table shows the values of x, y, z, x2, y2, z2, xy, yz, and xz for each observation. Observation x y z x2 y2 z2 xy yz xz 6 6 2 36 36 4 36 12 12 4 3 8 16 9 64 12 24 32 2 6 5 4 36 25 12 30...
Consider the FLAG data set. The first 10 observations are given for informational purposes. CONTRACT COST DOTEST STATUS 1 1379.43 1386.29 1 2 134.03 85.71 1 3 202.33 248.89 0 4 397.12 467.49 0 5 158.54 117.72 1 6 1128.11 1008.91 1 7 400.33 472.98 1 8 581.64 785.39 0 9 353.96 370.02 0 10 138.71 174.25 0 Use Minitab to fit a model that predicts COST based on DOTEST. Paste your output here: Calculate a confidence and prediction interval...
Q. 3 (14 marks) Consider the following data set where y is the response variable and x is a predictor. 7 14 5 -1 2 -2 a) (1 mark) Write down a linear model for y, and with i = 1,2,3. b) (2 marks) Write out the coefficient matrix X c) (4 marks) Find the hat matrix H = X(X"X)-'XT. d) (4 marks) Find the LSEs of the coefficients. e) (3 marks) Find the residuals by using the hat matrix
2. [40 points) Consider the following pairs of observations: X 5 3 2 6 6 0 1 1 7 5 у 3 3 1 4 1 (a) (15 points] Use the method of least squares to fit a straight line to the seven data points in the table. (b) [5 pointsSpecify the null and alternative hypotheses you would use to test whether the data provide sufficient evidence to indicate that x contributes infor- mation for the (linear) prediction of y....
Year Annual Return Year Annual Return 9% 8% 12 14 10 4 4 12 Mean The mean, also known as the arithmetic mean, tells the The mean adds up all the observations and divides by the number of observations, as shown in the following formula value a variable is expected to take rt The mean (or average) return for this stock over the past 12 years is mean, while μ represents a or notational purposes, x-bar represents a However, they...
Do 2.3.4 Use the data "Orange" in R. You should include the r code as well as the output in your file, with appropriate answer to questions. Answer the following questions in your document 1. Fit a simple linear regression using "circumterence" as response and "age" as predictor. Is there a significant linear relationship between the two variables? State the null and alternative hypothesis, test statistic and p-value 2. Find which observation has the largest residual in absolute value) Give...