Classification
Classification Methods
▪ Perceptrons (refer to lecture 9.2)
▪ Naïve Bayes
▪ kNN
▪ Support vector machine (SVM)

Naïve Bayes
Bayesian Methods
▪ Learning and classification methods based on probability theory.
▪ Bayes' theorem plays a critical role in probabilistic learning and classification.
▪ Builds a generative model that approximates how data is produced.
▪ Uses the prior probability of each category, given no information about an item.
▪ Categorization produces a posterior probability distribution over the possible categories, given a description of an item.

Bayes' Rule for Classification
▪ For a point $d$ and a class $c$:
$$P(c, d) = P(c \mid d)\,P(d) = P(d \mid c)\,P(c)$$
$$P(c \mid d) = \frac{P(d \mid c)\,P(c)}{P(d)}$$
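
As a small numeric illustration of Bayes' rule above, the sketch below computes $P(c \mid d)$ for two classes. The class names and all probability values are hypothetical, chosen only for the example, and $P(d)$ is obtained by summing $P(d \mid c)\,P(c)$ over the classes (the law of total probability), which the slide does not spell out.

```python
def posterior(likelihoods, priors):
    """Return P(c | d) for every class c using Bayes' rule."""
    # Joint probability P(c, d) = P(d | c) * P(c) for each class.
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    # Evidence P(d): sum of the joint over all classes (assumption: the
    # classes are mutually exclusive and exhaustive).
    evidence = sum(joint.values())
    return {c: joint[c] / evidence for c in joint}


# Hypothetical numbers, not taken from the lecture.
priors = {"spam": 0.3, "ham": 0.7}        # P(c)
likelihoods = {"spam": 0.8, "ham": 0.1}   # P(d | c)

post = posterior(likelihoods, priors)
print(post)
```

Although "ham" has the larger prior here, the much larger likelihood of the point under "spam" makes "spam" the more probable class after observing $d$.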

Naïve Bayes Classifiers
Task: Classify a new point $d$, described by a tuple of attribute values $d = \langle x_1, x_2, \ldots, x_n \rangle$, into one of the classes $c_j \in C$.
$$c_{MAP} = \operatorname*{argmax}_{c_j \in C} P(c_j \mid x_1, x_2, \ldots, x_n)$$
$$= \operatorname*{argmax}_{c_j \in C} \frac{P(x_1, x_2, \ldots, x_n \mid c_j)\,P(c_j)}{P(x_1, x_2, \ldots, x_n)}$$
$$= \operatorname*{argmax}_{c_j \in C} P(x_1, x_2, \ldots, x_n \mid c_j)\,P(c_j)$$
MAP is "maximum a posteriori" = most likely class.
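
The last step of the derivation drops $P(x_1, x_2, \ldots, x_n)$ because it is the same for every class and so cannot change which class wins. The sketch below checks this on three classes; the class names and numbers are hypothetical, and both function names are made up for the illustration.

```python
def map_class(likelihoods, priors):
    """MAP class from unnormalised scores P(x_1..x_n | c_j) * P(c_j)."""
    scores = {c: likelihoods[c] * priors[c] for c in priors}
    return max(scores, key=scores.get)


def map_class_normalised(likelihoods, priors):
    """MAP class from true posteriors, i.e. scores divided by P(x_1..x_n)."""
    scores = {c: likelihoods[c] * priors[c] for c in priors}
    evidence = sum(scores.values())  # P(x_1..x_n), identical for all classes
    posteriors = {c: s / evidence for c, s in scores.items()}
    return max(posteriors, key=posteriors.get)


# Hypothetical values, not taken from the lecture.
priors = {"a": 0.5, "b": 0.3, "c": 0.2}          # P(c_j)
likelihoods = {"a": 0.02, "b": 0.10, "c": 0.05}  # P(x_1..x_n | c_j)

best = map_class(likelihoods, priors)
best_normalised = map_class_normalised(likelihoods, priors)
print(best, best_normalised)
```

Dividing every score by the same positive constant preserves their order, so the two functions always agree on the winning class.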

Naïve Bayes Classifier: Naïve Bayes Assumption
▪ $P(c_j)$
  ▪ Can be estimated from the frequency of classes in the training examples.
▪ $P(x_1, x_2, \ldots, x_n \mid c_j)$
  ▪ $O(|X|^n \cdot |C|)$ parameters.
  ▪ Could only be estimated if a very, very large number of training examples was available.
Naïve Bayes Conditional Independence Assumption:
▪ Assume that the probability of observing the conjunction of attributes is equal to the product of the individual probabilities $P(x_i \mid c_j)$:
$$P(x_1, x_2, \ldots, x_n \mid c_j) = \prod_{i=1}^{n} P(x_i \mid c_j)$$
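
A minimal sketch of a classifier built on this assumption, for categorical attributes: $P(c_j)$ and each $P(x_i \mid c_j)$ are estimated as plain relative frequencies from the training examples, and a new point is scored by their product. The toy data set, the attribute values, and the function names `train` and `classify` are all hypothetical. Because the estimates are raw frequencies, an attribute value never seen with a class gives that class a score of zero.

```python
from collections import Counter, defaultdict


def train(examples):
    """examples: list of (attribute_tuple, class_label) pairs."""
    class_counts = Counter(label for _, label in examples)
    # value_counts[(i, label)][value] = number of training examples of class
    # `label` whose i-th attribute equals `value`.
    value_counts = defaultdict(Counter)
    for attrs, label in examples:
        for i, value in enumerate(attrs):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts


def classify(attrs, class_counts, value_counts):
    """Return the MAP class and the unnormalised score of every class."""
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # P(c_j): class frequency
        for i, value in enumerate(attrs):
            # P(x_i | c_j): frequency of this value within the class
            score *= value_counts[(i, label)][value] / n_label
        scores[label] = score
    return max(scores, key=scores.get), scores


# Hypothetical toy data: (outlook, temperature) -> play?
data = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "hot"), "yes"),
    (("sunny", "mild"), "yes"),
]

class_counts, value_counts = train(data)
label, scores = classify(("rainy", "mild"), class_counts, value_counts)
print(label, scores)
```

With $n$ attributes, the classifier only needs one table of value frequencies per attribute and class, which is why the independence assumption makes estimation feasible with far fewer training examples than the full joint $P(x_1, x_2, \ldots, x_n \mid c_j)$ would require.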