Question

Consider the following data set which will be used for a binary classification problem where the goal is to predict whether a

0 0
Add a comment Improve this question Transcribed image text
Answer #1

Initial entropy for variable sold is given by

count number of 0's and 1's in sold column

So 0's are = 7 and 1's are = 5 so total = 12

Formula for entropy = E( S) =- Σ pilogpi

    

here log to the base 2 always SO here entropy is = -p1ogp1-p2logp2=-7/12log7/12-5/12log5/12 = 0.9798

Now we will form decision tree using id5 algorithm

before that we have choose which variable has highest possible information gain.

So let us start from sold and overpriced

Sp E ( S, overpriced)= E( S) - (sum of p(x)*H(x))

P(x) for overpriced=1 = 5/14 so h(x) = -5/5log5/5-0/5log0/5 = 0   and p(x) for overpriced=0 = 7/12 H (x) = -5/7log5/7-2/7log2/7 =0.8631

So E(S ovepriced )= 0.9798 - 0.5034 = 0.4764

Similarly we have to calculate for age , feature and location

So fo age E(S,age ) = 0.25022

E(S,features)= 0.2317

E(S, location) = 0.3580

So the hieghest entropy is for overpriced so the root node is overpriced which has two outcomes 0 an d 1

Information gain asociated with root node = 0.4764

Add a comment
Know the answer?
Add Answer to:
Consider the following data set which will be used for a binary classification problem where the ...
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for? Ask your own homework help question. Our experts will answer your question WITHIN MINUTES for Free.
Similar Homework Help Questions
  • QUESTION 9 Consider the following binary search tree: If the root node, 50, is deleted, which node will become the new root?   A 15 B 24 C 37 D 62    QUESTION 10 In the...

    QUESTION 9 Consider the following binary search tree: If the root node, 50, is deleted, which node will become the new root?   A 15 B 24 C 37 D 62    QUESTION 10 In the following trees EXCEPT______, the left and right subtrees of any node have heights that differ by at most 1. A complete trees B perfect full trees C balanced binary trees D binary search trees    QUESTION 11 A perfect full binary tree whose height is 5 has...

  • 07. [Classification] Consider the following data set for a binary-class problem. [20] Customer ID Gender M...

    07. [Classification] Consider the following data set for a binary-class problem. [20] Customer ID Gender M Class CO CO M M M M Car Type Family Sports Sports Sports Sports Sports Sports Sports Sports Luxury Family Family Family Luxury Luxury Luxury Luxury Luxury Luxury Luxury Shirt Size Small Medium Medium Large Extra Large Extra Large Small Small Medium Large Large Extra Large Medium Extra Large Small Small Medium Medium Medium 888885555555555 Large 1. Compute the Gini index for the overall...

  • Consider the following data set containing examples e1.,.... e5, each comprised of three binary input attributes...

    Consider the following data set containing examples e1.,.... e5, each comprised of three binary input attributes (x1,x2,x3) and one output label Examplex1 x2 x3 Output y 1 0 1 0 0 1 0 0 e4 e5 1 101 Construct a decision tree for this data set. Compute information gain for each attribute. Select the attribute with the highest information gain as the root node. Then recursively create the children by recomputing information gain for each remaining attribute. Repeat this procedure...

  • Additional Problem 1: Consider the following binary phase diagram for the Cu-Ni system at 1 atm...

    Additional Problem 1: Consider the following binary phase diagram for the Cu-Ni system at 1 atm pressure (a) For XNfotal 0.3 at 1500 K, find XNt and Xw°, and estimate f and f (b) Starting from XNfotal = 0.3 at 1500 K, the system is cooled, and a (Cu,Ni) solid solution forms. Find XNi for the last drop of the liquid as it crystallizes Ni 10 TГ Cu - Ni 20 30 40 50 60 70 80 90 wt% 1800...

  • In this assignment you’ll implement a data structure called a trie, which is used to answer...

    In this assignment you’ll implement a data structure called a trie, which is used to answer queries regarding the characteristics of a text file (e.g., frequency of a given word). This write-up introduces the concept of a trie, specifies the API you’re expected to implement, and outlines submission instructions as well as the grading rubric. Please carefully read the entire write-up before you begin coding your submission. Tries A trie is an example of a tree data structure that compactly...

  • Attention!!!!!!! I need python method!!!!!!!!! the part which need to edit is below: i nee...

    attention!!!!!!! I need python method!!!!!!!!! the part which need to edit is below: i need python one!!!!!!!!! the part below is interface for the range search tree which don’t need to modify it. Week 3: Working with a BST TODO: Implement a Binary Search Tree (Class Name: RangesizeTree) Choose one language fromJava or Python. Iin both languages,there is an empty main function in the Range Size Tree file. The main function is not tested, however, it is provided for you...

  • 1: Consider the following set of data ccollected for a reaction of the form A -->...

    1: Consider the following set of data ccollected for a reaction of the form A --> products. time (seconds) [A] (M) 0 1.000 10 0.641 20 0.472 30 0.373 40 0.309 50 0.263 A: What is the average rate of the reaction for the first 10 seconds? rate = ___ M/s B: What is the average rate of the reaction between 30 and 40 seconds? rate = ___ M/s 2: Consider the dimerization of C4H6: 2 C4H6 --> C8H12 for...

ADVERTISEMENT
Free Homework Help App
Download From Google Play
Scan Your Homework
to Get Instant Free Answers
Need Online Homework Help?
Ask a Question
Get Answers For Free
Most questions answered within 3 hours.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT