▶ The length of $d$:
$$\|d\|_2 = \sqrt{d^T d} = \sqrt{\alpha^2 w^T w} = \frac{|w^T x + b|}{\sqrt{w^T w}} = \frac{|w^T x + b|}{\|w\|_2}$$
▶ Margin of $H$ with respect to data set $D$:
$$\bar{\gamma}(w, b) = \min_{x \in D} \frac{|w^T x + b|}{\|w\|_2} = \frac{1}{\|w\|_2} \min_{x \in D} |w^T x + b|$$
$\|w\|_2$ can be moved out of the minimization term since $w$ is not a function of $x$.
▶ By definition, the margin and hyperplane are scale invariant: $\bar{\gamma}(\beta w, \beta b) = \bar{\gamma}(w, b), \ \forall \beta \neq 0$
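The margin formula and its scale invariance can be checked numerically. The following is a minimal sketch in plain Python; the function name `margin`, the toy data set `D`, and the values of `w`, `b`, and the scaling factor are assumptions chosen only for illustration.

```python
import math

def margin(w, b, points):
    """Smallest distance |w^T x + b| / ||w||_2 from any point in `points` to the hyperplane."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    # ||w||_2 does not depend on x, so it is divided out after taking the minimum.
    smallest = min(abs(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in points)
    return smallest / norm_w

# Hypothetical toy data set and hyperplane (not from the slides).
D = [(3.0, 4.0), (1.0, 1.0), (-2.0, 0.0)]
w, b = (3.0, 4.0), -2.0

base = margin(w, b, D)

# Rescale (w, b) by a nonzero factor beta; the margin should be unchanged.
beta = -5.0
scaled = margin(tuple(beta * wi for wi in w), beta * b, D)

print(base, scaled)
```

Both calls should return the same value, because the factor $|\beta|$ appears in the numerator and in $\|\beta w\|_2$ and cancels.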
Outline (Level 2-3)
Three Types of Margin
▶ Distance of point to hyperplane
▶ Functional margin
▶ Geometric margin
2.3.2. Functional margin
▶ In general, the distance between a point and the separation hyperplane can indicate the degree of confidence in the classification prediction.
▶ Functional margin
Given training dataset $T$ and separation hyperplane $(w, b)$:
Functional margin between $(w, b)$ and sample $(x_i, y_i)$ is:
$$\hat{\gamma}_i = y_i (w \cdot x_i + b)$$
Functional margin between $(w, b)$ and training dataset $T$ is:
$$\hat{\gamma} = \min_{i=1,\dots,N} \hat{\gamma}_i = \min_{i=1,\dots,N} y_i (w \cdot x_i + b)$$
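As an illustration of the two definitions, the sketch below computes $\hat{\gamma}_i$ for each sample and then $\hat{\gamma}$ as their minimum. The helper names `functional_margins` and `dataset_functional_margin`, the samples in `T`, and the hyperplane `(w, b)` are hypothetical and serve only as an example.

```python
def functional_margins(w, b, samples):
    """Return y_i * (w . x_i + b) for every (x_i, y_i) in samples."""
    return [y * (sum(wj * xj for wj, xj in zip(w, x)) + b) for x, y in samples]

def dataset_functional_margin(w, b, samples):
    """Functional margin of the whole data set: the smallest per-sample value."""
    return min(functional_margins(w, b, samples))

# Hypothetical labelled samples (x_i, y_i) with y_i in {+1, -1}.
T = [((2.0, 2.0), +1), ((0.0, 0.0), -1), ((1.0, 0.0), +1)]
w, b = (1.0, 1.0), -1.5

per_sample = functional_margins(w, b, T)
gamma_hat = dataset_functional_margin(w, b, T)

print(per_sample, gamma_hat)
```

In this toy example the third sample gets a negative value, which means it lies on the wrong side of the hyperplane; that sample therefore determines the functional margin of the data set.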
▶ The functional margin indicates both the correctness and the confidence of a classification prediction. For choosing a separation hyperplane, however, the functional margin alone is not enough.
▶ If $w$ and $b$ are scaled proportionally, for example to $2w$ and $2b$, the hyperplane does not change, but the functional margin becomes twice as large.
▶ We can therefore add a constraint on the normal vector $w$ of the separation hyperplane, such as the normalization $\|w\| = 1$, so that the margin is uniquely determined. The functional margin then becomes the geometric margin.
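The effect of rescaling can be seen in a small numerical sketch. It assumes that dividing the functional margin by $\|w\|$ is used as an equivalent way of enforcing $\|w\| = 1$; the function names, the samples in `T`, and the hyperplane values are hypothetical.

```python
import math

def functional_margin(w, b, samples):
    """min over samples of y_i * (w . x_i + b)."""
    return min(y * (sum(wj * xj for wj, xj in zip(w, x)) + b) for x, y in samples)

def geometric_margin(w, b, samples):
    """Functional margin of the rescaled pair (w / ||w||, b / ||w||)."""
    norm_w = math.sqrt(sum(wj * wj for wj in w))
    return functional_margin(w, b, samples) / norm_w

# Hypothetical samples and hyperplane, chosen only for illustration.
T = [((3.0, 4.0), +1), ((0.0, 0.0), -1)]
w, b = (3.0, 4.0), -5.0

# Same hyperplane, described by (2w, 2b).
w2, b2 = tuple(2 * wj for wj in w), 2 * b

print(functional_margin(w, b, T), functional_margin(w2, b2, T))
print(geometric_margin(w, b, T), geometric_margin(w2, b2, T))
```

The first line of output should show the functional margin doubling, while the second should show the normalized value staying the same for both descriptions of the hyperplane.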