11 [Figure: class-conditional pdfs $p(x|\omega_1)$ and $p(x|\omega_2)$; the feature space is divided into region $R_1 \rightarrow \omega_1$ and region $R_2 \rightarrow \omega_2$.]
12 ❖ Equivalently, in words: divide the feature space into two regions:
➢ If $x \in R_1$ → decide $x$ in $\omega_1$
➢ If $x \in R_2$ → decide $x$ in $\omega_2$
❖ Probability of error
➢ Total shaded area
➢ $P_e = \frac{1}{2}\int_{-\infty}^{x_0} p(x|\omega_2)\,dx + \frac{1}{2}\int_{x_0}^{+\infty} p(x|\omega_1)\,dx$
❖ The Bayesian classifier is OPTIMAL with respect to minimising the classification error probability!
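As a minimal numerical sketch of this formula (not from the slides; the Gaussian parameters, the equal priors, and the threshold shift below are illustrative assumptions), the snippet evaluates $P_e$ at the Bayes threshold $x_0$ and at a shifted one:

```python
# Sketch, assuming two 1-D Gaussian classes with equal priors P(w1)=P(w2)=1/2
# (parameters are hypothetical, chosen only to illustrate the slide's formula).
from scipy.stats import norm

mu1, mu2, sigma = 0.0, 2.0, 1.0   # assumed class-conditional Gaussians

def error_probability(t):
    # Pe(t) = 1/2 * int_{-inf}^{t} p(x|w2) dx + 1/2 * int_{t}^{+inf} p(x|w1) dx,
    # with R1 = (-inf, t] decided as w1 and R2 = (t, +inf) decided as w2.
    return 0.5 * norm.cdf(t, mu2, sigma) + 0.5 * norm.sf(t, mu1, sigma)

# With equal priors and equal variances, the two pdfs cross at the midpoint
# of the means, which is the Bayes threshold x0.
x0 = (mu1 + mu2) / 2
print(error_probability(x0))        # minimal error, ~0.1587 here
print(error_probability(x0 + 0.5))  # any other threshold yields a larger Pe
```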
13 [Figure: $p(x|\omega_1)$ and $p(x|\omega_2)$ over regions $R_1$ and $R_2$, with the threshold moved away from $x_0$.]
➢ Indeed: moving the threshold, the total shaded area INCREASES by the extra "gray" area.
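A quick numerical check of this claim, under the same assumed two-Gaussian, equal-prior setup (all parameters hypothetical): scanning thresholds around $x_0$ shows $P_e$ growing on both sides of the crossing point.

```python
# Sketch: Pe as a function of the threshold, for the assumed setup above.
import numpy as np
from scipy.stats import norm

mu1, mu2, sigma = 0.0, 2.0, 1.0
x0 = (mu1 + mu2) / 2               # Bayes threshold for this assumed setup

def error_probability(t):
    return 0.5 * norm.cdf(t, mu2, sigma) + 0.5 * norm.sf(t, mu1, sigma)

for t in np.linspace(x0 - 1.0, x0 + 1.0, 5):
    print(f"threshold {t:+.2f} -> Pe = {error_probability(t):.4f}")
# The minimum sits at t = x0; each shift adds the extra "gray" area to Pe.
```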
14 ❖ The Bayes classification rule for many (M > 2) classes:
➢ Given $x$, classify it to $\omega_i$ if: $P(\omega_i|x) > P(\omega_j|x) \;\; \forall j \neq i$
➢ Such a choice also minimizes the classification error probability
❖ Minimizing the average risk
➢ For each wrong decision, a penalty term is assigned, since some decisions are more sensitive than others
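A short sketch of the M-class rule (the priors, means, and variances are hypothetical, with 1-D Gaussian class-conditionals assumed): since the evidence $p(x)$ is common to all posteriors, comparing $P(\omega_i|x)$ reduces to comparing $P(\omega_i)\,p(x|\omega_i)$.

```python
# Sketch of the M-class Bayes rule for M = 3 assumed 1-D Gaussian classes.
import numpy as np
from scipy.stats import norm

priors = np.array([0.5, 0.3, 0.2])   # hypothetical priors P(w_i)
means  = np.array([-2.0, 0.0, 3.0])  # hypothetical class means
sigmas = np.array([1.0, 1.0, 1.5])   # hypothetical class std deviations

def bayes_classify(x):
    # P(w_i|x) is proportional to P(w_i) * p(x|w_i); argmax ignores p(x).
    scores = priors * norm.pdf(x, means, sigmas)
    return int(np.argmax(scores))    # index of the winning class

print(bayes_classify(-1.5))  # near the first class mean -> class 0
print(bayes_classify(2.5))   # near the third class mean -> class 2
```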
15 ➢ For M = 2
• Define the loss matrix: $L = \begin{pmatrix} \lambda_{11} & \lambda_{12} \\ \lambda_{21} & \lambda_{22} \end{pmatrix}$
• $\lambda_{12}$ is the penalty term for deciding class $\omega_2$ although the pattern belongs to $\omega_1$, etc.
➢ Risk with respect to $\omega_1$: $r_1 = \lambda_{11}\int_{R_1} p(x|\omega_1)\,dx + \lambda_{12}\int_{R_2} p(x|\omega_1)\,dx$
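A hedged sketch of evaluating $r_1$ (the class-conditional pdf and the loss values are assumed for illustration), taking $R_1 = (-\infty, t]$ and $R_2 = (t, +\infty)$:

```python
# Sketch: r1 = l11 * int_{R1} p(x|w1) dx + l12 * int_{R2} p(x|w1) dx,
# for an assumed Gaussian p(x|w1) and hypothetical loss values.
from scipy.stats import norm

mu1, sigma = 0.0, 1.0   # assumed p(x|w1)
l11, l12 = 0.0, 1.0     # loss row for true class w1 (l11 is typically 0)

def risk_w1(t):
    return l11 * norm.cdf(t, mu1, sigma) + l12 * norm.sf(t, mu1, sigma)

print(risk_w1(1.0))  # mass of p(x|w1) falling in R2, weighted by l12
# Raising l12 makes misclassifying w1 patterns costlier, which pushes the
# risk-minimizing threshold toward enlarging R1.
```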