Center of Speech Technology, Tsinghua University, Slide 11
[Figure: two rooms — switches A, B, C in Room 1; bulbs X, Y, Z unseen in Room 2]
❑ The Answer (Step 3)
❖ Conclusions:
➢ The BRIGHT one (among the three) → Switch B.
➢ The HOT one (among the left two) → Switch A.
➢ The left one → Switch C.
❑ Those Behind the Answer
❖ "Turning Switch A on and off, then turning Switch B on in Room 1" is a feature extraction procedure, and "checking the status in Room 2" is a feature selection and recognition procedure.
❖ Feature vector = (hot, bright)^T.
❖ Feature selecting vector: W = (w_h, w_b)^T, where w_h, w_b ∈ {0, 1}.
❖ A hierarchical feature selecting procedure:
▪ Step 1: W = (0, 1)^T (the "bright" component) to tell "B" from the others ("A" & "C").
▪ Step 2: W = (1, 0)^T (the "hot" component) to tell "A" from "C".
➢ Or alternatively:
▪ Step 1: W = (1, 0)^T (the "hot" component) to tell "C" from "A" & "B".
▪ Step 2: W = (0, 1)^T (the "bright" component) to tell "B" from "A".
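The hierarchical selection can be sketched in Python; the bulb labels and the {0, 1} encoding of (hot, bright) are illustrative choices, not from the slides:

```python
import numpy as np

# Feature vectors (hot, bright) for the three bulbs in the puzzle:
# A was switched on then off (hot, dark), B is still on (hot, bright),
# C was never touched (cold, dark).
features = {
    "A": np.array([1, 0]),  # hot, not bright
    "B": np.array([1, 1]),  # hot, bright
    "C": np.array([0, 0]),  # cold, not bright
}

def select(x, w):
    """Apply a binary feature-selecting vector W as an element-wise mask."""
    return w * x

# Step 1: W = (0, 1)^T -- the "bright" component tells B from A and C.
w1 = np.array([0, 1])
bright = {k: select(v, w1)[1] for k, v in features.items()}

# Step 2: W = (1, 0)^T -- the "hot" component tells A from C (the left two).
w2 = np.array([1, 0])
hot = {k: select(v, w2)[0] for k, v in features.items()}

assert bright["B"] == 1 and bright["A"] == 0 and bright["C"] == 0
assert hot["A"] == 1 and hot["C"] == 0
```

Each step zeroes out all but one component, so the classifier at that step only "sees" the feature relevant to that decision.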
❑ Is this idea suitable for ASR?
❖ It is a good idea.
❖ But in MFCC/LPCC features, the components are not so clearly separable from each other, because each component contributes to the recognition of every unit. A solution is to generalize feature selection to feature weighting, so that the different contributions of each feature component to different speech recognition units are reflected.
❖ In "Feature Weighting", the value range of W's elements is [0, 1] instead of {0, 1}.
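A minimal sketch of this generalization, using a weighted distance as the recognition score; the templates, weights, and the weighted-Euclidean form are illustrative assumptions, not the slides' actual classifier:

```python
import numpy as np

def weighted_distance(x, template, w):
    """Weighted Euclidean distance: a larger w_i means component i
    contributes more to discriminating the recognition units."""
    return np.sqrt(np.sum(w * (x - template) ** 2))

x = np.array([1.0, 0.3, 0.7])                 # an MFCC-like feature vector
templates = {"unit1": np.array([0.9, 0.2, 0.1]),
             "unit2": np.array([0.2, 0.8, 0.7])}

# Weights now lie in [0, 1], not just {0, 1}: component 0 matters most,
# component 2 is down-weighted rather than discarded outright.
w = np.array([0.8, 0.5, 0.2])

scores = {u: weighted_distance(x, t, w) for u, t in templates.items()}
best = min(scores, key=scores.get)
```

Setting a weight to 0 or 1 recovers the hard feature selection of the bulb puzzle as a special case.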
❑ What we have experimented with
❖ Hierarchical feature weighting:
➢ Each speech recognition unit (SRU) sub-set shares the same fixed feature weighting vector (i.e., uses the same feature components);
➢ The SRU set is divided into sub-sets according to a Minimum Classification Error (MCE) criterion.
❖ Sub-band feature weighting inside the MFCC calculation:
➢ The weight of each sub-band is based on its SNR level;
➢ The method is combined with noise spectral subtraction.
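The second experiment can be sketched as follows. This is a hedged illustration only: the sigmoid SNR-to-weight mapping, the noise-floor constant, and the per-band power values are assumptions, not the method's actual parameters:

```python
import numpy as np

def spectral_subtraction(noisy_power, noise_power, floor=0.01):
    """Basic power spectral subtraction with a small noise floor
    to avoid negative band powers."""
    return np.maximum(noisy_power - noise_power, floor * noise_power)

def subband_weights(signal_power, noise_power, alpha=1.0):
    """Map each sub-band's estimated SNR (dB) to a weight in [0, 1]
    via a sigmoid: high-SNR bands get weights near 1."""
    snr_db = 10.0 * np.log10(np.maximum(signal_power, 1e-10)
                             / np.maximum(noise_power, 1e-10))
    return 1.0 / (1.0 + np.exp(-alpha * snr_db))

# Illustrative per-band (mel filter-bank channel) powers.
noisy = np.array([4.0, 1.2, 0.5, 3.0])   # noisy speech power per band
noise = np.array([0.5, 1.0, 0.4, 0.2])   # estimated noise power per band

clean = spectral_subtraction(noisy, noise)
w = subband_weights(clean, noise)
# Noise-dominated bands (1, 2) get weights near 0; clean bands near 1.
```

The weights would then scale the filter-bank outputs before the cepstral transform, so unreliable sub-bands contribute little to the final MFCCs.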
I. A Hierarchical Feature Weighting Method Based on Minimum Classification Error (MCE)