Decision Trees and Random Forests (Programming language: R)
To predict room occupancy using the decision tree classification algorithm.
(a) Load the room occupancy data and train a decision tree classifier. Evaluate the predictive performance by reporting the accuracy obtained on the testing dataset.
(b) Output and analyse the tree learned by the decision tree algorithm, i.e. plot the tree structure and discuss it.
(c) Train a random forests classifier, and evaluate the predictive performance by reporting the accuracy obtained on the testing dataset.
(d) Output and analyse the feature importance obtained by the random forests classifier.
Use the following data:
Testing data:
Temperature,Humidity,Light,CO2,HumidityRatio,Occupancy
21.89,31.55,436.5,1047,0.00512966,No
21.89,31.36,434,1031,0.005098515,No
21.89,31.125,432.75,977.5,0.005059998,No
21.7,28.5,279.3333333,585,0.004576247,Yes
20.6,21.865,454,652.5,0.003274764,No
20.6,22.2,442.75,681.75,0.003325206,No
20.6,22.26,444,702.3333333,0.003334241,No
Training data:
Temperature,Humidity,Light,CO2,HumidityRatio,Occupancy
23.18,27.272,426,721.25,0.004792988,Yes
23.15,27.2675,429.5,714,0.004783441,Yes
23.15,27.245,426,713.5,0.004779464,Yes
23.15,27.2,426,708.25,0.004771509,Yes
23.1,27.2,426,704.5,0.004756993,Yes
23.1,27.2,419,701,0.004756993,Yes
23.1,27.2,419,701.6666667,0.004756993,Yes
23.1,27.2,419,699,0.004756993,Yes
23.1,27.2,419,689.3333333,0.004756993,Yes
23.075,27.175,419,688,0.004745351,Yes
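One possible way to approach parts (a) to (d) is sketched below. It is not a definitive solution. It assumes the rpart and randomForest packages are installed. The training rows listed above are all labelled Yes, so they cannot train a meaningful classifier on their own; the sketch therefore uses a synthetic stand-in data frame with the same column names, generated by a hypothetical helper make_toy(). The synthetic values are invented for illustration and say nothing about the real data; in practice they would be replaced by the full training and testing files.

```r
# Sketch only: assumes the 'rpart' and 'randomForest' packages are installed.
library(rpart)
library(randomForest)

set.seed(42)  # arbitrary seed so this sketch is reproducible

# Hypothetical stand-in data with the same columns as the occupancy files.
# The numbers are invented; replace train/test with the real data sets,
# e.g. via read.csv() on the full files.
make_toy <- function(n) {
  occupied <- rep(c(TRUE, FALSE), length.out = n)
  data.frame(
    Temperature   = ifelse(occupied, 22.5, 20.5) + rnorm(n, sd = 0.3),
    Humidity      = 27 + rnorm(n, sd = 2),
    Light         = ifelse(occupied, 450, 50) + rnorm(n, sd = 20),
    CO2           = ifelse(occupied, 900, 500) + rnorm(n, sd = 60),
    HumidityRatio = 0.0045 + rnorm(n, sd = 0.0003),
    Occupancy     = factor(ifelse(occupied, "Yes", "No"),
                           levels = c("No", "Yes"))
  )
}
train <- make_toy(200)
test  <- make_toy(100)

# (a) Train a classification tree and measure accuracy on the test set.
tree_fit  <- rpart(Occupancy ~ ., data = train, method = "class")
tree_pred <- predict(tree_fit, newdata = test, type = "class")
tree_acc  <- mean(tree_pred == test$Occupancy)
cat("Decision tree test accuracy:", tree_acc, "\n")

# (b) Print the learned splits and plot the tree.
# plot() fails on a root-only tree, so only plot when there is a split.
print(tree_fit)
if (nrow(tree_fit$frame) > 1) {
  plot(tree_fit, margin = 0.1)
  text(tree_fit, use.n = TRUE)
}

# (c) Train a random forest and measure accuracy on the test set.
rf_fit  <- randomForest(Occupancy ~ ., data = train,
                        ntree = 200, importance = TRUE)
rf_pred <- predict(rf_fit, newdata = test)
rf_acc  <- mean(rf_pred == test$Occupancy)
cat("Random forest test accuracy:", rf_acc, "\n")

# (d) Inspect feature importance as a table and as a plot.
rf_imp <- importance(rf_fit)
print(rf_imp)
varImpPlot(rf_fit)
```

When reading the output for parts (b) and (d), the printed tree shows which variable and threshold each split uses, and the importance table ranks the predictors by mean decrease in accuracy and in Gini impurity. With the real data, the discussion would compare the variables at the top of the tree with those ranked highest by the forest.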
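library(dplyr)     # provides %>%, filter(), group_by(), sample_n(), select(), rename(), slice()
library(caret)     # provides createDataPartition()
library(HistData)  # provides the GaltonFamilies data set

set.seed(1983)
# Keep one randomly chosen son per family, with his father's height
galton_heights <- GaltonFamilies %>%
  filter(gender == "male") %>%
  group_by(family) %>%
  sample_n(1) %>%
  ungroup() %>%
  select(father, childHeight) %>%
  rename(son = childHeight)

# Split the data in half into training and test sets
y <- galton_heights$son
test_index <- createDataPartition(y, times = 1, p = 0.5, list = FALSE)
train_set <- galton_heights %>% slice(-test_index)
test_set <- galton_heights %>% slice(test_index)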
In this case, if we ignored the father’s height and simply guessed the son’s height, we would guess the average height of the sons.
m <- mean(train_set$son)
m
#> [1] 69.2