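Neural Network Language Models

Example: the girl opened her _ (candidate next words: books, laptops, a, zoo)
Figure: fixed-window neural language model. The inputs x(1), x(2), x(3), x(4) are the context words "the girl opened her"; W and U are the weight matrices of the hidden and output layers.

Advantages compared with n-gram models
• No data sparsity problem
• No need to store the observed n-grams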

Problems
• The fixed window is too small
• W grows as the window size increases
• The window can never be large enough
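
The window-size problem can be made concrete with a small sketch. The code below is an illustrative fixed-window model in NumPy, not the implementation behind these slides: the dimensions, the random initialisation, and the names `init_params` and `next_word_probs` are assumptions made for the example. It concatenates the embeddings of the context words, applies a hidden layer W and an output layer U, and shows that the shape of W depends directly on the window size.

```python
import numpy as np

def init_params(vocab_size, embed_dim, hidden_dim, window, seed=0):
    """Randomly initialised parameters (hypothetical sizes, untrained)."""
    rng = np.random.default_rng(seed)
    return {
        "E": rng.normal(0.0, 0.1, (vocab_size, embed_dim)),            # word embeddings
        "W": rng.normal(0.0, 0.1, (hidden_dim, window * embed_dim)),   # hidden layer
        "b1": np.zeros(hidden_dim),
        "U": rng.normal(0.0, 0.1, (vocab_size, hidden_dim)),           # output layer
        "b2": np.zeros(vocab_size),
    }

def next_word_probs(params, context_ids):
    """Distribution over the next word given exactly `window` context word ids."""
    e = params["E"][context_ids].reshape(-1)        # concatenate the embeddings
    h = np.tanh(params["W"] @ e + params["b1"])     # hidden layer
    logits = params["U"] @ h + params["b2"]
    logits = logits - logits.max()                  # numerically stable softmax
    p = np.exp(logits)
    return p / p.sum()

vocab = ["the", "girl", "opened", "her", "books", "laptops", "a", "zoo"]
word_to_id = {w: i for i, w in enumerate(vocab)}

params = init_params(len(vocab), embed_dim=4, hidden_dim=5, window=4)
context = [word_to_id[w] for w in ["the", "girl", "opened", "her"]]
probs = next_word_probs(params, context)

# W has window * embed_dim columns, so it grows with the window size.
w_shapes = {k: init_params(len(vocab), 4, 5, window=k)["W"].shape for k in (2, 4, 8)}
```

Because W has `window * embed_dim` columns, doubling the window doubles the number of parameters in W, and a context of any other length cannot be fed to the same model.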

A neural network could process sequences of different lengths.

Neural Network Language Models

The groundbreaking work on neural language models:
Bengio, Yoshua, et al. "A neural probabilistic language model." Journal of Machine Learning Research 3 (Feb. 2003): 1137-1155.
[Figure: model architecture, bottom to top: input words, input embeddings, hidden layers, output.]

Outline
1. NNLM
2. CBOW
3. Skip-gram
4. Hierarchical softmax & Negative sampling
5. GloVe