Copyright © 2019 by Xiaoyu Li.

2 General Approach to Classification

Classification is a two-step process:

· Step 1: Learning step (training phase). A classification model is constructed from training data. This is supervised learning, as opposed to unsupervised learning (e.g. clustering).
· Step 2: Classification step. The model is used to predict class labels for given data.
Example: Step 1

Learning: Training data are analyzed by a classification algorithm. Here, the class label attribute is loan_decision, and the learned model or classifier is represented in the form of classification rules.

Training data:

name            age           income   loan_decision
Sandy Jones     youth         low      risky
Bill Lee        youth         low      risky
Caroline Fox    middle_aged   high     safe
Rick Field      middle_aged   low      risky
Susan Lake      senior        low      safe
Claire Phips    senior        medium   safe
Joe Smith       middle_aged   high     safe

Classification rules:

IF age = youth THEN loan_decision = risky
IF income = high THEN loan_decision = safe
IF age = middle_aged AND income = low THEN loan_decision = risky
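The three rules from this example can be written as a small function and checked against the training tuples. This is only a sketch: the function name `classify`, the rule ordering, and the choice to return `None` when no rule fires are assumptions, since the slide lists the rules without saying how uncovered tuples are handled.

```python
# Training tuples from the slide: (name, age, income, loan_decision).
TRAINING_DATA = [
    ("Sandy Jones", "youth", "low", "risky"),
    ("Bill Lee", "youth", "low", "risky"),
    ("Caroline Fox", "middle_aged", "high", "safe"),
    ("Rick Field", "middle_aged", "low", "risky"),
    ("Susan Lake", "senior", "low", "safe"),
    ("Claire Phips", "senior", "medium", "safe"),
    ("Joe Smith", "middle_aged", "high", "safe"),
]


def classify(age, income):
    """Apply the three rules from the slide in the order they are listed."""
    if age == "youth":
        return "risky"
    if income == "high":
        return "safe"
    if age == "middle_aged" and income == "low":
        return "risky"
    # Assumption: the slide gives no default rule, so report "no decision".
    return None


# Tuples for which none of the listed rules fires.
uncovered = [name for name, age, income, _ in TRAINING_DATA
             if classify(age, income) is None]

# Tuples for which a rule fires but disagrees with the recorded label.
mismatches = [name for name, age, income, label in TRAINING_DATA
              if classify(age, income) not in (None, label)]

print("uncovered:", uncovered)
print("mismatches:", mismatches)
```

Running the check shows which training tuples the listed rules do not cover, which is a quick way to see whether a rule set is complete for the data it was learned from.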
Example: Step 2

Classification: Test data are used to estimate the accuracy of the classification rules. If the accuracy is considered acceptable, the rules can be applied to the classification of new data tuples.

Test data:

name            age           income   loan_decision
Juan Bello      senior        low      safe
Sylvia Crest    middle_aged   low      risky
Anne Yee        middle_aged   high     safe

New data: (John Henry, middle_aged, low). Loan decision? risky
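Estimating accuracy on the test data and then classifying a new tuple could look like the sketch below. The helper names `classify` and `accuracy` are hypothetical, and counting a tuple that no rule covers as a misclassification is an assumption; the slide does not say how such tuples are scored.

```python
# Test tuples from the slide: (name, age, income, loan_decision).
TEST_DATA = [
    ("Juan Bello", "senior", "low", "safe"),
    ("Sylvia Crest", "middle_aged", "low", "risky"),
    ("Anne Yee", "middle_aged", "high", "safe"),
]


def classify(age, income):
    """The three rules listed on the slide; None means no rule applies."""
    if age == "youth":
        return "risky"
    if income == "high":
        return "safe"
    if age == "middle_aged" and income == "low":
        return "risky"
    return None


def accuracy(classifier, labeled_tuples):
    """Fraction of tuples whose predicted label equals the known label.

    Assumption: a tuple with no prediction (None) counts as incorrect.
    """
    correct = sum(1 for _, age, income, label in labeled_tuples
                  if classifier(age, income) == label)
    return correct / len(labeled_tuples)


test_accuracy = accuracy(classify, TEST_DATA)
print(f"accuracy on test data: {test_accuracy:.2f}")

# Once the accuracy is judged acceptable, classify the new tuple from the slide.
new_label = classify("middle_aged", "low")  # (John Henry, middle_aged, low)
print("John Henry:", new_label)
```

For John Henry the third rule fires, which agrees with the "risky" decision shown on the slide.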
3 Training Set

· A training set is made up of database tuples and their associated class labels.
· A tuple, X, is represented by an n-dimensional attribute vector, X = (x1, x2, ..., xn), depicting n measurements made on the tuple from n database attributes, respectively, A1, A2, ..., An.
· Each tuple, X, is assumed to belong to a predefined class, as determined by another database attribute called the class label attribute.
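One possible way to hold a training tuple in code is to pair the attribute vector X with its class label. The names `LabeledTuple`, `make_tuple`, and `ATTRIBUTES` below are hypothetical, and leaving `name` out of the attribute vector (treating it as an identifier rather than a measurement) is an assumption made for this sketch.

```python
from typing import NamedTuple, Tuple

# Assumed attributes A1, A2 for the loan example.
ATTRIBUTES = ("age", "income")
CLASS_LABEL_ATTRIBUTE = "loan_decision"


class LabeledTuple(NamedTuple):
    x: Tuple[str, ...]  # attribute vector X = (x1, ..., xn)
    label: str          # value of the class label attribute


def make_tuple(record, attributes=ATTRIBUTES,
               class_attribute=CLASS_LABEL_ATTRIBUTE):
    """Build (X, label) from a database record given as a dict."""
    x = tuple(record[a] for a in attributes)
    return LabeledTuple(x, record[class_attribute])


record = {"name": "Sandy Jones", "age": "youth",
          "income": "low", "loan_decision": "risky"}
t = make_tuple(record)
print(t.x, t.label)
```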
4 Test Set

· A test set is made up of test tuples and their associated class labels.
· The test set is independent of the training tuples, meaning that the test tuples were not used to construct the classifier.
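A common way to obtain an independent test set is to hold out part of the labeled data before training. The sketch below is one such holdout split, not a method prescribed by the slides; the function name `holdout_split`, the 30% test fraction, the fixed seed, and the pooling of the ten example tuples into one list are all assumptions.

```python
import random

# Assumption: the ten labeled tuples from the example, pooled into one list.
LABELED_DATA = [
    ("Sandy Jones", "youth", "low", "risky"),
    ("Bill Lee", "youth", "low", "risky"),
    ("Caroline Fox", "middle_aged", "high", "safe"),
    ("Rick Field", "middle_aged", "low", "risky"),
    ("Susan Lake", "senior", "low", "safe"),
    ("Claire Phips", "senior", "medium", "safe"),
    ("Joe Smith", "middle_aged", "high", "safe"),
    ("Juan Bello", "senior", "low", "safe"),
    ("Sylvia Crest", "middle_aged", "low", "risky"),
    ("Anne Yee", "middle_aged", "high", "safe"),
]


def holdout_split(tuples, test_fraction=0.3, seed=0):
    """Shuffle a copy of the data and split it into (training, test) lists."""
    shuffled = list(tuples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The two slices never overlap, so no test tuple is used for training.
    return shuffled[n_test:], shuffled[:n_test]


training_set, test_set = holdout_split(LABELED_DATA)
print(len(training_set), "training tuples,", len(test_set), "test tuples")
```

Only `training_set` would be passed to the learning step; `test_set` is kept aside for estimating accuracy.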