Briefly explain two ways to limit overfitting in constructing a decision tree. Briefly explain the advantages and the weaknesses of decision trees.
Overfitting is a significant practical difficulty for decision tree models and many other predictive models. Overfitting happens when the learning algorithm continues to develop hypotheses that reduce the training-set error at the cost of increased test-set error. There are two main approaches to avoiding overfitting when building decision trees:
Pre-pruning: stop growing the tree early, before it perfectly classifies the training set.
Post-pruning: allow the tree to perfectly classify the training set, and then prune it back.
In practice, the second approach (post-pruning overfit trees) is more successful, because it is hard to estimate precisely when to stop growing the tree; both approaches are illustrated in the sketch below.
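As an illustration only, here is a minimal sketch of both approaches using scikit-learn. The dataset and the particular hyperparameter values (max_depth, min_samples_leaf, ccp_alpha) are assumptions chosen for the example, not part of the original answer.

# Sketch: pre-pruning vs. post-pruning with scikit-learn (illustrative values only)
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Pre-pruning: stop growth early via depth and leaf-size constraints
pre_pruned = DecisionTreeClassifier(max_depth=4, min_samples_leaf=10, random_state=0)
pre_pruned.fit(X_train, y_train)

# Post-pruning: grow the full tree, then prune it back with cost-complexity pruning
post_pruned = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0)
post_pruned.fit(X_train, y_train)

print("pre-pruned  test accuracy:", pre_pruned.score(X_test, y_test))
print("post-pruned test accuracy:", post_pruned.score(X_test, y_test))

Here pre-pruning constrains growth up front, while ccp_alpha removes branches of the fully grown tree whose contribution does not justify their added complexity.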
The key step in tree pruning is to define a criterion to be used to determine the correct final tree size, using one of the following methods:
Use a dataset distinct from the training set (called a validation set) to evaluate the effect of post-pruning nodes from the tree.
Build the tree using the training set, then apply a statistical test to estimate whether pruning or expanding a particular node is likely to produce an improvement beyond the training set. Two such tests are:
Error estimation
Significance testing (e.g., the Chi-square test; see the sketch after this list)
Minimum Description Length principle: use an explicit measure of the complexity of encoding the training set and the decision tree, stopping growth of the tree when this encoding size, size(tree) + size(misclassifications(tree)), is minimized.
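As a hedged illustration of the significance-testing criterion, the sketch below applies scipy.stats.chi2_contingency to a made-up contingency table for one candidate split (branch membership vs. class label); the counts and the 0.05 threshold are assumptions for the example, not values from the original answer. A large p-value suggests the split reflects noise in the training set, so the node would be pruned or not expanded.

# Sketch: Chi-square test on one candidate split (illustrative counts)
from scipy.stats import chi2_contingency

# Rows: branches of the candidate split; columns: class counts within each branch
contingency = [[30, 10],   # left branch:  30 positive, 10 negative
               [12, 28]]   # right branch: 12 positive, 28 negative

chi2, p_value, dof, expected = chi2_contingency(contingency)
print(f"chi2 = {chi2:.2f}, p-value = {p_value:.4f}")

# Assumed rule of thumb: keep the split only if the association is significant
if p_value < 0.05:
    print("Association looks significant: keep/expand this split.")
else:
    print("No significant association: prune or do not expand this node.")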
The first method is the most common approach. Here, the available data are separated into two sets of examples: a training set, which is used to build the decision tree, and a validation set, which is used to evaluate the impact of pruning the tree; a sketch of this idea follows. The second method, based on error estimation or a significance test such as the Chi-square test shown above, is also common.
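As a rough sketch of the validation-set approach (again assuming scikit-learn; the dataset, the split ratio, and the use of cost-complexity pruning as the pruning mechanism are choices made for this example), the amount of pruning is selected by whichever pruned tree scores best on the held-out validation set:

# Sketch: choose the pruning strength using a held-out validation set (illustrative)
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
# Split off a validation set that is never used for growing the tree
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Candidate pruning strengths derived from the fully grown tree
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)

best_alpha, best_score = 0.0, -1.0
for alpha in path.ccp_alphas:
    alpha = max(alpha, 0.0)  # guard against tiny negative values from floating-point error
    tree = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0).fit(X_train, y_train)
    score = tree.score(X_val, y_val)  # validation accuracy drives the choice
    if score > best_score:
        best_alpha, best_score = alpha, score

print(f"selected ccp_alpha = {best_alpha:.5f}, validation accuracy = {best_score:.3f}")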
The advantages of a decision tree are fairly obvious: it lays out a “path” through possibilities, with alternatives, leading toward a desirable outcome. The tree anticipates dead ends and disastrous missteps, but most importantly it clarifies the difference between controlled and uncontrolled events: which decisions are in the CEO’s power to make, and which must await the outcome of uncontrollable changes. For example, a tree showing ways to use excess capital will show which choices are available now and which must await stock market fluctuations. Another revelation from decision trees is the taxonomy of priorities; for example, is employee maintenance more or less important than stockholder dividends?
The major disadvantage of decision trees is loss of innovation: only past experience and corporate habit go into the “branching” of choices, so new ideas do not get much consideration. There is a tendency with trees to consider only paths that have been successful in the past, which stultifies thought about changing situations. The trees are usually over-simple and not branched enough, with little consideration given to the “thickness” (value and probability) of each branch. Finally, like all metaphors, they invite argument by analogy: phrases like “the roots of the business” or “the seasons of new growth” tend to obfuscate the real debate. So, while decision trees visualize the decisions to be made, they also condense a complex process into discrete steps (which may be a good or a bad thing).