5 Accuracy of Classification
The accuracy of a classifier on a given test set is the percentage of test set tuples that the classifier labels correctly. To estimate it, the known class label of each test tuple is compared with the learned classifier's class prediction for that tuple. If accuracy were measured on the training set instead, the estimate would likely be optimistic, because the classifier tends to overfit the training data.
Copyright 2019 by Xiaoyu Li
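The accuracy definition can be illustrated with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `accuracy` and the five sample labels are invented for illustration.

```python
def accuracy(true_labels, predicted_labels):
    """Return the percentage of tuples whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("test set is empty")
    # Count the test tuples where prediction and known label agree.
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * correct / len(true_labels)


# Hypothetical test set of five tuples (labels invented for illustration).
true_labels = ["yes", "no", "yes", "yes", "no"]
predicted_labels = ["yes", "no", "no", "yes", "no"]
print(accuracy(true_labels, predicted_labels))  # 4 of 5 correct -> 80.0
```

Passing labels from tuples that were held out of training keeps the estimate from being inflated by overfitting.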
Decision Tree Induction
1 Decision Tree Induction
Decision tree induction is the learning of decision trees from class-labeled training tuples. A decision tree is a flowchart-like tree structure. Each internal node (non-leaf node) denotes a test on an attribute. Each branch represents an outcome of the test. Each leaf node (or terminal node) holds a class label. The topmost node in a tree is the root node.
2 Example
age?
  youth: student?
    no: buys_computer = no
    yes: buys_computer = yes
  middle_aged: buys_computer = yes
  senior: credit_rating?
    fair: buys_computer = no
    excellent: buys_computer = yes
A decision tree for the concept buys_computer, indicating whether an AllElectronics customer is likely to purchase a computer. Each internal (nonleaf) node represents a test on an attribute. Each leaf node represents a class (either buys_computer = yes or buys_computer = no).
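Classifying a tuple with such a tree means starting at the root, applying each attribute test, and following the matching branch until a leaf is reached. The sketch below encodes the buys_computer example as nested Python dictionaries. The dictionary layout and the names `BUYS_COMPUTER_TREE` and `classify` are assumptions made for illustration, and the leaf labels are read from the slide's figure, so treat them as a best reading rather than authoritative.

```python
# Internal nodes are dicts: {"attribute": name, "branches": {outcome: subtree}}.
# Leaf nodes are plain strings holding the class label for buys_computer.
BUYS_COMPUTER_TREE = {
    "attribute": "age",  # root node
    "branches": {
        "youth": {
            "attribute": "student",
            "branches": {"no": "no", "yes": "yes"},
        },
        "middle_aged": "yes",
        "senior": {
            "attribute": "credit_rating",
            "branches": {"fair": "no", "excellent": "yes"},
        },
    },
}


def classify(tree, record):
    """Walk from the root to a leaf, following the branch for each test outcome."""
    node = tree
    while isinstance(node, dict):  # still at an internal node
        outcome = record[node["attribute"]]
        node = node["branches"][outcome]
    return node  # a leaf: the predicted class label


# Hypothetical customer tuple (values invented for illustration).
customer = {"age": "youth", "student": "yes", "credit_rating": "fair"}
print(classify(BUYS_COMPUTER_TREE, customer))  # yes
```

Only the attributes on the path from root to leaf are consulted: for this customer, credit_rating is never tested.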
3 Advantage and Applications
Advantage: building a decision tree classifier requires no domain knowledge or parameter setting.
Application areas:
● Medicine
● Manufacturing and production
● Financial analysis
● Astronomy
● Molecular biology