n-gram Models

the girl opened her _

• Let's have a simple assumption first: the prediction of $x^{(t+1)}$ depends on only the previous $n-1$ words (hypothetically):

$$P(x^{(t+1)} \mid x^{(t)}, \ldots, x^{(1)}) = P(x^{(t+1)} \mid x^{(t)}, \ldots, x^{(t-n+2)})$$

• This conditional probability is the ratio of an $n$-gram probability to an $(n-1)$-gram probability:

$$P(x^{(t+1)} \mid x^{(t)}, \ldots, x^{(t-n+2)}) = \frac{P(x^{(t+1)}, x^{(t)}, \ldots, x^{(t-n+2)})}{P(x^{(t)}, \ldots, x^{(t-n+2)})}$$

• Problem: How to compute the probability of the $n$-gram and the $(n-1)$-gram?
• Method: Estimate each probability from counts over a huge corpus:

$$P(x^{(t+1)} \mid x^{(t)}, \ldots, x^{(t-n+2)}) \approx \frac{\mathrm{count}(x^{(t+1)}, x^{(t)}, \ldots, x^{(t-n+2)})}{\mathrm{count}(x^{(t)}, \ldots, x^{(t-n+2)})}$$
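As a minimal sketch, the count-based estimate above can be implemented with a tokenized corpus and a `Counter` over consecutive word tuples (the function names and the toy corpus here are illustrative, not from the slides):

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every n-gram (tuple of n consecutive tokens) in the corpus."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_prob(tokens, context, word):
    """Estimate P(word | context) as count(context + word) / count(context)."""
    n = len(context) + 1
    numerator = ngram_counts(tokens, n)[tuple(context) + (word,)]
    denominator = ngram_counts(tokens, n - 1)[tuple(context)]
    return numerator / denominator if denominator else 0.0

# Toy corpus: "opened her" occurs 3 times, "opened her book" 2 times.
corpus = ("the girl opened her book . the girl opened her laptop . "
          "the girl opened her book").split()
print(ngram_prob(corpus, ("opened", "her"), "book"))  # 2/3
```

Note that the estimate is zero whenever the full $n$-gram never appears in the corpus, which is why real n-gram models add smoothing on top of these raw counts.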
n-gram Models

• Assume we train a 4-gram model.

Before her mum arrives, the girl opened her _

The condition is the previous three words, "girl opened her":

$$P(w \mid \text{girl opened her}) = \frac{\mathrm{count}(\text{girl opened her } w)}{\mathrm{count}(\text{girl opened her})}$$

• For example, within the corpus:
  § "girl opened her" appears 1000 times
  § "girl opened her book" appears 400 times, so $P(\textbf{book} \mid \text{girl opened her}) = 0.4$
  § "girl opened her laptop" appears 100 times, so $P(\textbf{laptop} \mid \text{girl opened her}) = 0.1$
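The 4-gram arithmetic on this slide can be sketched directly from the example counts (the `counts` table below just hard-codes the slide's corpus statistics):

```python
# Counts taken from the slide's example corpus.
counts = {
    ("girl", "opened", "her"): 1000,
    ("girl", "opened", "her", "book"): 400,
    ("girl", "opened", "her", "laptop"): 100,
}

def p_next(word, context=("girl", "opened", "her")):
    """4-gram estimate: P(w | girl opened her) = count(context + w) / count(context)."""
    return counts[context + (word,)] / counts[context]

print(p_next("book"))    # 400 / 1000 = 0.4
print(p_next("laptop"))  # 100 / 1000 = 0.1
```

Note that the model conditions only on the last three words, so "Before her mum arrives" has no effect on the prediction, even though it strongly suggests the next word should not be "book".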