SCALA HELP
Use Scala to do your calculations (see the notes on doing this for the first two segmentations):
a. Calculate the overall entropy of this set.
b. Calculate the information gain (IG) from splitting on head shape.
c. Calculate the IG from splitting on body shape.
d. Calculate the IG from splitting on color.
e. Of these three, which is the best attribute to segment on, and why?
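For reference, the quantities asked for above are the standard two-class entropy and the information gain of a split. The notation is an assumption of this note rather than something fixed by the assignment: p is the fraction of '1's in a set S, q is the fraction of '0's, and c_i are the child sets produced by a split.

```latex
H(S) = -\bigl(p \log_2 p + q \log_2 q\bigr), \qquad q = 1 - p

\mathrm{IG}(\text{parent}, \text{children})
  = H(\text{parent}) - \sum_i \frac{|c_i|}{|\text{parent}|}\, H(c_i)
```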
object quiz {
  println("Welcome to the Scala worksheet") //> Welcome to the Scala worksheet
  import scala.math._
  object entropyDStxtStickFigsFig32 {
    println("Entropy calcs for Figure 3.2 of Data Science for Business")
    type D = Double
    // log base 2, built from the natural log in scala.math
    def log2(p: D) = log(p) / log(2.0)
    // Brute force coding :-)
    // total overall parent ones and zeros: seven '1's, five '0's, p = 7/12, q = 5/12
    val parentEntropy = {
      val p = 7.0 / 12.0
      val q = 1 - p
      -(p * log2(p) + q * log2(q))
    }
    // square heads: nine figures, five of them '1's
    val squareHeadEntropy = {
      val p = 5.0 / 9.0
      val q = 1 - p
      -(p * log2(p) + q * log2(q))
    }
    // round heads: three figures, two of them '1's
    val roundHeadEntropy = {
      val p = 2.0 / 3.0
      val q = 1 - p
      -(p * log2(p) + q * log2(q))
    }
    // so IG(parent, headKids) =
    //   parentEntropy -
    //     ( probSquareHead * squareHeadEntropy
    //     + probRoundHead  * roundHeadEntropy )
    val IGHeadShape =
      parentEntropy - (9.0 / 12.0 * squareHeadEntropy + 3.0 / 12.0 * roundHeadEntropy)
    //*************************************
    // now based on the body type attribute: ellipsoids vs. rectangles
    val parentPeopleList = List(1,1,1,1,1,1,1,0,0,0,0,0)
    /** If I know the number of ones in a List and the size of that List,
     *  and assuming the other target values are zeros, I can calc the entropy of the List.
     *  For example, for Fig 3.2 pg 49, parentPeopleList shows the '1's and '0's.
     */
    def entropy(listOfOnesAndZeros: List[Int]) = {
      val listSize = listOfOnesAndZeros.size
      val numberOfOnes = listOfOnesAndZeros.count(i => i == 1)
      println(s"numberOfOnes , $numberOfOnes")
      val p = numberOfOnes.toDouble / listSize
      val q = 1.0 - p
      println(s" p, q , $p, $q")
      // a pure list (all ones or all zeros) has zero entropy;
      // this guard avoids 0 * log2(0), which evaluates to NaN
      if (p == 0.0 || p == 1.0) 0.0 else -(p * log2(p) + q * log2(q))
    } // end entropy()
    val test = entropy(parentPeopleList)
    //********* check out IG by segmenting on the body type ***
    val bodyRectangle = List(1,1,1,1,1,0)
    val bodyEllipsoid = List(1,1,0,0,0,0)
    val bodyRSize = bodyRectangle.size.toDouble
    val bodyESize = bodyEllipsoid.size.toDouble
    val totalSize = bodyRSize + bodyESize
    // parent entropy minus the size-weighted average of the two child entropies
    val IGBodyType =
      entropy(parentPeopleList) -
        (bodyRSize / totalSize * entropy(bodyRectangle) +
         bodyESize / totalSize * entropy(bodyEllipsoid))
  } // end entropyDStxtStickFigsFig32
}
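The weighted-average pattern used above for head shape and body type could be folded into one helper, which would also cover the color split in part d. What follows is a minimal sketch rather than part of the original notes: `InfoGainSketch`, `entropyOf`, and `informationGain` are hypothetical names, and treating an empty or pure list as zero entropy is an assumption. The color lists are not given here, so they would have to be read off Figure 3.2 before calling the helper for part d.

```scala
import scala.math.log

// Hypothetical helper object; none of these names come from the worksheet.
object InfoGainSketch {
  // log base 2, same definition as in the worksheet
  def log2(x: Double): Double = log(x) / log(2.0)

  // Entropy of a list of 1s and 0s.
  // Assumption: an empty or pure list is treated as having zero entropy.
  def entropyOf(labels: List[Int]): Double =
    if (labels.isEmpty) 0.0
    else {
      val p = labels.count(_ == 1).toDouble / labels.size
      val q = 1.0 - p
      if (p == 0.0 || p == 1.0) 0.0 else -(p * log2(p) + q * log2(q))
    }

  // IG = entropy(parent) - sum over children of (childSize / parentSize) * entropy(child)
  def informationGain(parent: List[Int], children: List[List[Int]]): Double = {
    val weighted = children.map(c => c.size.toDouble / parent.size * entropyOf(c)).sum
    entropyOf(parent) - weighted
  }

  def main(args: Array[String]): Unit = {
    // Same lists as the body-type split in the worksheet
    val parent        = List(1,1,1,1,1,1,1,0,0,0,0,0)
    val bodyRectangle = List(1,1,1,1,1,0)
    val bodyEllipsoid = List(1,1,0,0,0,0)
    println(informationGain(parent, List(bodyRectangle, bodyEllipsoid)))
  }
}
```

Taking the children as a list of lists means the same call works for any attribute, however many values it has. For part e, the three results can be compared directly: the attribute with the largest information gain is the best one to segment on.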