COMP 578 Discovering Classification Rules Keith c.c. chan Department of Computing The Hong Kong Polytechnic University
COMP 578 Discovering Classification Rules Keith C.C. Chan Department of Computing The Hong Kong Polytechnic University
An Example Classification Problem Patient records Recovered Symptoms Treatment Not recovered A B?
2 An Example Classification Problem Patient Records Symptoms & Treatment Recovered Not Recovered A? B?
Classification in Relational DB Patient Symptom Treatment/Recovered Mike Headache TypeAYes Mary Fever typeANo Bill Cough Type B2 No Fever Type C1Yes Dave Doug h Type C1Yes Anne Headache Type B2 Yes Will John, having a headache Class Label and treated with Type CI recover?
3 Classification in Relational DB Patient Symptom TreatmentRecovered Mike Headache Type A Yes Mary Fever Type A No Bill Cough Type B2 No Jim Fever Type C1 Yes Dave Cough Type C1 Yes Anne Headache Type B2 Yes Class Label Will John, having a headache and treated with Type C1, recover?
Discovering of Classification Rules Minins Classification Training Rules Data NAME Symptom Treat. Recover? Mike Headache Type AYes Mary Fever Type ANo Classification Cough Type B2No Rules Jim Fever ype C1 Yes Dave Cough Typ ype C1 YesIF Symptom=Headache Anne Headache Type B2 Yes AND Treatment=Cl Then Recover Yes Based on the classification rule discovered. John will recover
4 Discovering of Classification Rules Training Data NAME Symptom Treat. Recover? Mike Headache Type A Yes Mary Fever Type A No Bill Cough Type B2 No Jim Fever Type C1 Yes Dave Cough Type C1 Yes Anne Headache Type B2 Yes Mining Classification Rules IF Symptom = Headache AND Treatment = C1 THEN Recover = Yes Classification Rules Based on the classification rule discovered, John will recover!!!
The classification problem a Given a database consisting of n records Each record characterized by m attributes Each record pre-classified into p different classes Find A set of classification rules(that constitutes a classification model) that characterizes the different classes so that records not originally in the database can be accurately classified I. e predicting"class labels
5 The Classification Problem Given: – A database consisting of n records. – Each record characterized by m attributes. – Each record pre-classified into p different classes. Find: – A set of classification rules (that constitutes a classification model) that characterizes the different classes – so that records not originally in the database can be accurately classified. – I.e “predicting” class labels