Question

In R load the tidyverse package Consider the `USArrests` dataset, which contains statistics, in a...

In R, load the tidyverse package.

Consider the `USArrests` dataset, which contains statistics, in arrests per 100,000 residents, for assault, murder, and rape in each of the 50 US states in 1973. Also given is the percentage of the population living in urban areas.

(a) Perform k-means clustering using all numerical variables in this dataset, scaling the variables before running the clustering algorithm.

(b) Try two different values of $k$ and comment on your results.

(c) Visualize the results of the clustering using the variables `Murder` and `UrbanPop`.

Answer #1
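# Load the data, drop rows with missing values, and standardize every column
data("USArrests")
mydata <- USArrests
mydata <- na.omit(mydata)
mydata <- scale(mydata)  # mean 0, standard deviation 1 per variable
head(mydata, n = 10)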
# Draw a reproducible random subset of 10 states and scale it
set.seed(124)
ss <- sample(1:50, 10)
df <- USArrests[ss, ]
df <- na.omit(df)
head(df, n = 6)
df.scaled <- scale(df)
head(round(df.scaled, 2))
# Descriptive statistics for each variable in the full (unscaled) dataset
desc_stats <- data.frame(
  Min = apply(USArrests, 2, min),
  Max = apply(USArrests, 2, max),
  Med = apply(USArrests, 2, median),
  SD = apply(USArrests, 2, sd),
  Mean = apply(USArrests, 2, mean))
desc_stats <- round(desc_stats, 1)
head(desc_stats)
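library(stats)  # attached by default; provides dist(), cor(), hclust()
# Euclidean distances between the scaled observations
eucl <- dist(df.scaled, method = "euclidean")
round(as.matrix(eucl)[1:6, 1:6], 1)
# Correlation-based distance: 1 - Pearson correlation between observations (rows)
cor_mat <- cor(t(df.scaled), method = "pearson")  # renamed so it does not mask cor()
dist_cor <- as.dist(1 - cor_mat)
round(as.matrix(dist_cor)[1:6, 1:6], 1)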
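# daisy() computes dissimilarity matrices between observations
library(cluster)
library(factoextra)
# Pick a single metric explicitly; "euclidean" is what the default vector resolves to
daisy(df.scaled, metric = "euclidean", stand = FALSE)
# The flower data has mixed variable types, so daisy() falls back to Gower's coefficient
data("flower")
head(flower)
str(flower)
daisy_dist <- as.matrix(daisy(flower))
head(round(daisy_dist[1:6, 1:6], 2))  # round to 2 decimals (parenthesis was misplaced)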
# Visualize the Euclidean distance matrix
library(corrplot)
corrplot(as.matrix(eucl), is.corr = FALSE, method = "color")
corrplot(as.matrix(eucl), is.corr = FALSE, method = "color", order = "hclust", type = "upper")
# Hierarchical clustering (Ward linkage) and a heatmap of the same distances
plot(hclust(eucl, method = "ward.D2"))
heatmap(as.matrix(eucl), symm = TRUE, distfun = function(x) as.dist(x))
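
The code above scales the data and explores distance matrices and hierarchical clustering, but it never calls kmeans(), so parts (a) to (c) of the question remain open. A minimal sketch of one way to answer them follows, assuming the tidyverse is installed. The choices k = 2 and k = 4, the seed, nstart = 25, and all object names (arrests_scaled, km2, km4, plot_data) are illustrative assumptions, not part of the original answer.

```r
library(tidyverse)  # the question asks for it; supplies dplyr, tidyr, tibble, ggplot2

data("USArrests")

# (a) Standardize all four numeric variables, then run k-means.
arrests_scaled <- scale(USArrests)

set.seed(124)  # k-means starts from random centers, so fix the seed
# (b) Two illustrative values of k; nstart = 25 keeps the best of 25 random starts.
km2 <- kmeans(arrests_scaled, centers = 2, nstart = 25)
km4 <- kmeans(arrests_scaled, centers = 4, nstart = 25)

# Cluster sizes and fit: total within-cluster sum of squares and
# the share of total variance explained by the clustering.
km2$size
km4$size
c(k2 = km2$tot.withinss, k4 = km4$tot.withinss)
c(k2 = km2$betweenss / km2$totss, k4 = km4$betweenss / km4$totss)

# Cluster means on the original scale help describe each group in words.
USArrests %>%
  mutate(cluster = km4$cluster) %>%
  group_by(cluster) %>%
  summarise(across(everything(), mean), n = n())

# (c) Plot Murder against UrbanPop, coloured by cluster, one panel per k.
plot_data <- USArrests %>%
  rownames_to_column("State") %>%
  mutate(k2 = as.character(km2$cluster),
         k4 = as.character(km4$cluster)) %>%
  pivot_longer(c(k2, k4), names_to = "k", values_to = "cluster")

p <- ggplot(plot_data, aes(x = UrbanPop, y = Murder, colour = cluster)) +
  geom_point(size = 2) +
  facet_wrap(~ k) +
  labs(x = "Urban population (%)",
       y = "Murder arrests per 100,000",
       colour = "Cluster",
       title = "k-means on scaled USArrests")
print(p)
```

Because kmeans() minimizes the within-cluster sum of squares, the k = 4 solution should fit more tightly than k = 2; whether the extra clusters are meaningful is best judged from the cluster means and the plot. Only two of the four clustering variables appear in the plot, so clusters that are well separated in four dimensions can still overlap on screen.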