Decision tree construction

Efficient algorithms have been developed to induce a reasonably accurate, although suboptimal, decision tree in a reasonable amount of time. These algorithms usually employ a greedy strategy that makes a series of locally optimal decisions about which attribute to use for partitioning the data.
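As a concrete illustration of such a locally optimal decision, the sketch below selects the attribute whose partition minimizes the weighted Gini impurity of the resulting subsets. Gini is only one common purity measure (the slides do not fix one), and the names gini and best_attribute are illustrative, not from the slides.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a set of class labels (one common purity measure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_attribute(records, labels, attributes):
    """Greedy, locally optimal choice: the attribute whose partition
    minimizes the weighted impurity of the resulting subsets.
    Records are assumed to be dicts mapping attribute -> value."""
    n = len(records)
    best, best_score = None, float("inf")
    for a in attributes:
        groups = {}  # attribute value -> labels of the records with that value
        for r, y in zip(records, labels):
            groups.setdefault(r[a], []).append(y)
        score = sum(len(g) / n * gini(g) for g in groups.values())
        if score < best_score:
            best, best_score = a, score
    return best
```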
A decision tree is grown in a recursive fashion by partitioning the training records into successively purer subsets. We suppose:
◦ Un is the set of training records associated with node n.
◦ C = {c1, c2, …, cK} is the set of class labels.
If all the records in Un belong to the same class ck, then n is a leaf node labeled as ck. If Un contains records that belong to more than one class:
◦ An attribute test condition is selected to partition the records into smaller subsets.
◦ A child node is created for each outcome of the test condition.
◦ The records in Un are distributed to the children based on the outcomes.
The algorithm is then recursively applied to each child node.
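A minimal sketch of this recursive procedure (in the spirit of Hunt's algorithm), reusing best_attribute from the earlier sketch. The dict-based node format and the simplified stopping rules are assumptions for illustration, not the slides' prescription.

```python
from collections import Counter

def grow_tree(records, labels, attributes):
    """Recursive growth: stop at a pure node, otherwise split on the
    greedily chosen attribute and recurse on each child."""
    # Leaf case: all records in Un belong to the same class ck
    if len(set(labels)) == 1:
        return {"leaf": labels[0]}
    # No test conditions left: label the leaf with the majority class
    if not attributes:
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    a = best_attribute(records, labels, attributes)  # from the sketch above
    # One child per outcome of the test condition; distribute the records
    partitions = {}
    for r, y in zip(records, labels):
        partitions.setdefault(r[a], []).append((r, y))
    remaining = [b for b in attributes if b != a]
    children = {}
    for v, pairs in partitions.items():
        sub_records = [r for r, _ in pairs]
        sub_labels = [y for _, y in pairs]
        children[v] = grow_tree(sub_records, sub_labels, remaining)
    return {"attribute": a, "children": children}
```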
For each node, let p(ck) denote the fraction of training records from class k. In most cases, the leaf node is assigned to the class that has the majority of the training records. The fraction p(ck) for a node can also be used to estimate the probability that a record assigned to that node belongs to class k.
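Both the majority label and the fractions p(ck) follow directly from the label counts at the node; a small sketch (leaf_summary is an illustrative name, not from the slides):

```python
from collections import Counter

def leaf_summary(labels):
    """Majority class and class fractions p(ck) at a node; the fractions
    double as probability estimates for records routed to this node."""
    counts = Counter(labels)
    n = len(labels)
    fractions = {c: k / n for c, k in counts.items()}
    return counts.most_common(1)[0][0], fractions

# Example: leaf_summary(["spam", "spam", "ham"])
# -> ("spam", {"spam": 0.666..., "ham": 0.333...})
```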
Decision trees that are too large are susceptible to a phenomenon known as overfitting. A tree pruning step can be performed to reduce the size of the decision tree. Pruning helps by trimming the tree branches in a way that reduces the generalization error.
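The slides do not name a specific pruning method; one common choice is reduced-error pruning against a held-out validation set. The sketch below assumes the dict-based node format from the growth sketch, and labeling the candidate leaf from the validation labels is a simplification (the majority training class is more usual).

```python
from collections import Counter

def predict(node, record):
    """Route a record to a leaf label (dict node format assumed above)."""
    while "leaf" not in node:
        child = node["children"].get(record[node["attribute"]])
        if child is None:   # unseen outcome: no prediction
            return None
        node = child
    return node["leaf"]

def errors(node, records, labels):
    """Number of validation records the (sub)tree misclassifies."""
    return sum(predict(node, r) != y for r, y in zip(records, labels))

def prune(node, records, labels):
    """Bottom-up reduced-error pruning: replace a subtree by a
    majority-class leaf whenever that does not increase validation error."""
    if "leaf" in node or not records:
        return node
    # Prune the children first, sending each child its share of the records
    for v, child in node["children"].items():
        sub = [(r, y) for r, y in zip(records, labels)
               if r[node["attribute"]] == v]
        if sub:
            sub_r, sub_y = zip(*sub)
            node["children"][v] = prune(child, list(sub_r), list(sub_y))
    # Candidate leaf labeled with the majority class at this node
    leaf = {"leaf": Counter(labels).most_common(1)[0][0]}
    if errors(leaf, records, labels) <= errors(node, records, labels):
        return leaf
    return node
```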