Automatic Speech Recognition
YING SHEN
SCHOOL OF SOFTWARE ENGINEERING
TONGJI UNIVERSITY
Outline
Introduction
Speech recognition based on HMM
• Acoustic processing
• Acoustic modeling: Hidden Markov Model
• Language modeling
1/28/2021 HUMAN COMPUTER INTERACTION
What is speech recognition?
Automatic speech recognition (ASR) is the process by which a computer maps an acoustic speech signal to text.
Challenges for researchers
• Linguistic factors
• Physiological factors
• Environmental factors
Classification of speech recognition systems
Users
• Speaker-dependent system
• Speaker-independent system
• Speaker-adaptive system
Vocabulary
• Small vocabulary: tens of words
• Medium vocabulary: hundreds of words
• Large vocabulary: thousands of words
• Very large vocabulary: tens of thousands of words
Word pattern
• Isolated-word system: single words, one at a time
• Continuous speech system: words are connected together
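The three classification axes above can be written down as a small data structure. The following is only a minimal sketch: the names `Users`, `WordPattern`, `vocabulary_class`, and `SystemProfile` are hypothetical, and the numeric cutoffs (100, 1,000, 10,000) are an assumption, since the slide only says "tens", "hundreds", "thousands", and "tens of thousands".

```python
from dataclasses import dataclass
from enum import Enum


class Users(Enum):
    SPEAKER_DEPENDENT = "speaker-dependent"
    SPEAKER_INDEPENDENT = "speaker-independent"
    SPEAKER_ADAPTIVE = "speaker-adaptive"


class WordPattern(Enum):
    ISOLATED_WORD = "isolated-word"
    CONTINUOUS_SPEECH = "continuous speech"


def vocabulary_class(num_words: int) -> str:
    """Map a vocabulary size to one of the four classes on the slide.

    The cutoffs are assumed order-of-magnitude boundaries, not values
    given in the lecture.
    """
    if num_words < 1:
        raise ValueError("vocabulary must contain at least one word")
    if num_words < 100:        # tens of words
        return "small"
    if num_words < 1_000:      # hundreds of words
        return "medium"
    if num_words < 10_000:     # thousands of words
        return "large"
    return "very large"        # tens of thousands of words


@dataclass(frozen=True)
class SystemProfile:
    """One recognizer described along the three axes."""
    users: Users
    num_words: int
    word_pattern: WordPattern

    def describe(self) -> str:
        return (f"{self.users.value}, {vocabulary_class(self.num_words)} "
                f"vocabulary, {self.word_pattern.value} system")


# Hypothetical example: a ten-word digit dialer trained for a single user.
dialer = SystemProfile(Users.SPEAKER_DEPENDENT, 10, WordPattern.ISOLATED_WORD)
print(dialer.describe())
# prints: speaker-dependent, small vocabulary, isolated-word system
```

Keeping the three axes as separate fields reflects that they are independent: any user type can, in principle, be combined with any vocabulary size and either word pattern.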
How do humans do it?
Articulation produces sound waves, which the ear conveys to the brain for processing.
[Figure: the auditory pathway from the ear to the brain; legible labels include middle ear, Eustachian tube, cochlea, signal from left ear, cochlear nucleus, and superior olive]