Landmark Detection using Support Vector Machines (SVMs)
False Acceptance vs. False Rejection Errors, TIMIT, per 10 ms frame
SVM Stop Release Detector: Half the Error of an HMM
(1) Delta-Energy ("Deriv"): Equal Error Rate = 0.2%
(2) HMM (*): False Rejection Error = 0.3%
(3) Linear SVM: Equal Error Rate = 0.15%
(4) Radial Basis Function SVM: Equal Error Rate = 0.13%
Niyogi & Burges, 1999, 2002
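The equal error rate (EER) quoted for each detector is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one simple way to estimate it from per-frame detector scores. The function names, the threshold sweep, and the toy scores are illustrative assumptions, not the evaluation code behind the numbers above.

```python
def error_rates(scores, labels, threshold):
    """Return (false_acceptance_rate, false_rejection_rate) at one threshold.

    scores: per-frame detector outputs (higher means "landmark present").
    labels: 1 for frames that contain the landmark, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    far = sum(s >= threshold for s in neg) / len(neg)  # non-landmarks accepted
    frr = sum(s < threshold for s in pos) / len(pos)   # landmarks rejected
    return far, frr


def equal_error_rate(scores, labels):
    """Sweep the observed scores as thresholds; return (eer, threshold)
    at the point where the two error rates are closest."""
    best = None
    for th in sorted(set(scores)):
        far, frr = error_rates(scores, labels, th)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, th)
    return best[1], best[2]


# Toy data (hypothetical): four landmark frames, four non-landmark frames.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
eer, th = equal_error_rate(toy_scores, toy_labels)
print(eer, th)
```

With few frames the two rates may never be exactly equal, which is why the sketch reports the average of the two rates at the closest crossing.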
Dynamic Programming Smooths SVMs
• Maximize $\prod_i p(\mathrm{features}(t_i) \mid X(t_i)) \, p(t_{i+1} - t_i \mid \mathrm{features}(t_i))$
• Soft-decision "smoothing" mode: p( acoustics | landmarks ) computed, fed to pronunciation model
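The maximization above can be carried out with a Viterbi-style dynamic program over landmark times. The following is a minimal sketch under simplifying assumptions: the landmark sequence is fixed and known, `frame_logp` holds hypothetical per-frame log-likelihoods for each landmark, and the duration term is conditioned on the landmark index rather than on the full feature vector. None of these names come from the slides.

```python
import math


def smooth_landmarks(frame_logp, dur_logp):
    """Pick strictly increasing landmark times maximizing
    sum_i log p(features(t_i) | X(t_i)) + sum_i log p(t_{i+1} - t_i | landmark i).

    frame_logp[i][t]: log-likelihood of landmark i at frame t (assumed input).
    dur_logp(i, d):   log-probability that landmark i+1 follows landmark i
                      after d frames (assumed duration model).
    Returns (best_log_score, list_of_times).
    """
    n, T = len(frame_logp), len(frame_logp[0])
    best = [[-math.inf] * T for _ in range(n)]
    back = [[-1] * T for _ in range(n)]
    best[0] = list(frame_logp[0])
    for i in range(1, n):
        for t in range(i, T):
            for s in range(i - 1, t):  # candidate time of the previous landmark
                cand = best[i - 1][s] + dur_logp(i - 1, t - s) + frame_logp[i][t]
                if cand > best[i][t]:
                    best[i][t] = cand
                    back[i][t] = s
    # Trace back from the best final-landmark time.
    t = max(range(T), key=lambda u: best[n - 1][u])
    score = best[n - 1][t]
    times = [t]
    for i in range(n - 1, 0, -1):
        t = back[i][t]
        times.append(t)
    times.reverse()
    return score, times


# Toy example (hypothetical numbers): two landmarks over five frames.
frame_p = [[0.1, 0.7, 0.1, 0.05, 0.05],
           [0.05, 0.05, 0.3, 0.1, 0.5]]
frame_logp = [[math.log(p) for p in row] for row in frame_p]
dur = lambda i, d: math.log(0.6) if d == 2 else math.log(0.1)
score, times = smooth_landmarks(frame_logp, dur)
print(times, math.exp(score))
```

In this toy case the frame-wise maxima fall at frames 1 and 4, but the duration model prefers a gap of two frames, so the smoothed solution places the second landmark at frame 3.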
Cues for Place of Articulation: MFCC + formants + ratescale, within 150 ms of landmark
[Figure: Example 1 and Example 2; horizontal axis: Time (ms)]
Soft-Decision Landmark Probabilities
Kernel: Transform to Infinite-Dimensional Hilbert Space
SVM Extracts a Discriminant Dimension
SVM Discriminant Dimension = argmin(error(margin) + 1/width(margin))
Niyogi & Burges, 2002: p(class|acoustics) ≈ Sigmoid Model in Discriminant Dimension
OR
Juneja & Espy-Wilson, 2003: p(class|acoustics) ≈ Histogram in Discriminant Dimension
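Both options map the scalar SVM discriminant value g to an estimate of p(class|acoustics). The sketch below illustrates the two ideas on toy data. The sigmoid is fitted here by plain gradient descent on log-loss, and the histogram uses hand-picked bin edges; both choices, along with all function names and numbers, are assumptions for illustration and not necessarily the procedures used in the cited papers.

```python
import math


def sigmoid_posterior(g, a, b):
    """p(class | g) modeled as a sigmoid in the discriminant dimension."""
    return 1.0 / (1.0 + math.exp(-(a * g + b)))


def fit_sigmoid(gs, ys, lr=0.1, steps=2000):
    """Fit slope a and offset b by gradient descent on the mean log-loss."""
    a, b = 1.0, 0.0
    n = len(gs)
    for _ in range(steps):
        da = db = 0.0
        for g, y in zip(gs, ys):
            err = sigmoid_posterior(g, a, b) - y
            da += err * g
            db += err
        a -= lr * da / n
        b -= lr * db / n
    return a, b


def histogram_posterior(gs, ys, edges):
    """Per-bin fraction of positive examples; None where a bin is empty.

    Bin k covers the half-open interval [edges[k], edges[k + 1]).
    """
    pos = [0] * (len(edges) - 1)
    tot = [0] * (len(edges) - 1)
    for g, y in zip(gs, ys):
        for k in range(len(edges) - 1):
            if edges[k] <= g < edges[k + 1]:
                tot[k] += 1
                pos[k] += y
                break
    return [p / t if t else None for p, t in zip(pos, tot)]


# Toy discriminant values and class labels (hypothetical, not separable).
gs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
a, b = fit_sigmoid(gs, ys)
print(sigmoid_posterior(2.0, a, b), sigmoid_posterior(-2.0, a, b))
print(histogram_posterior(gs, ys, [-2.5, -1.0, 0.0, 1.0, 2.5]))
```

The sigmoid gives a smooth, monotone posterior with only two parameters, while the histogram makes no shape assumption but needs enough data in every bin to be reliable.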