POS tagging

Information Source of POS tagging
• Lexical Information: definition of the word
• Syntactical Information: context of the word, i.e., its relationship with adjacent and related words in a phrase, sentence, or paragraph

Measure of POS tagging
• State-of-the-art methods achieve ~97% accuracy
• Baseline methods can achieve 90%
• Baseline method: choose the most frequent POS tag for each word; other words are annotated as nouns
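The baseline described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the toy corpus, the tag names (`DET`, `NOUN`, `VERB`, `PRON`), and the function names are all hypothetical, and "other words" is read here as words that never occurred in the training data.

```python
from collections import Counter, defaultdict

def train_baseline(tagged_sentences):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)  # counts[word][tag] = frequency
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def tag_baseline(words, lexicon, default_tag="NOUN"):
    """Tag known words with their most frequent tag, all others as nouns."""
    return [(w, lexicon.get(w, default_tag)) for w in words]

def accuracy(gold_sentences, lexicon):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    correct = total = 0
    for sentence in gold_sentences:
        predicted = tag_baseline([w for w, _ in sentence], lexicon)
        for (_, gold), (_, pred) in zip(sentence, predicted):
            correct += gold == pred
            total += 1
    return correct / total if total else 0.0

# Hypothetical toy corpus, only for illustration.
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("run", "NOUN"), ("ends", "VERB")],
    [("dogs", "NOUN"), ("run", "VERB")],
    [("they", "PRON"), ("run", "VERB")],
]
LEXICON = train_baseline(TOY_CORPUS)

if __name__ == "__main__":
    print(tag_baseline(["the", "cat", "run"], LEXICON))
```

Because this tagger uses lexical information only, an ambiguous word such as "run" always receives the same tag regardless of its neighbours. Closing that gap is what syntactical (context) information is for.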
MM POS tagger

Principle
• W: words, T: tags, O: corpus
• MM (Markov model) POS tagger: assume that $t_i$ conditionally depends only on $t_{i-1}$ and that this dependency does not change over time

$$
\arg\max_{t_{1,n}} P(t_{1,n} \mid w_{1,n})
= \arg\max_{t_{1,n}} \frac{P(w_{1,n} \mid t_{1,n})\, P(t_{1,n})}{P(w_{1,n})}
= \arg\max_{t_{1,n}} P(w_{1,n} \mid t_{1,n})\, P(t_{1,n})
$$
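MM POS tagger

Principle (continued)

$$
\begin{aligned}
P(w_{1,n} \mid t_{1,n})\, P(t_{1,n})
&= \prod_{i=1}^{n} P(w_i \mid t_{1,n}) \times P(t_n \mid t_{1,n-1}) \times P(t_{n-1} \mid t_{1,n-2}) \times \cdots \times P(t_2 \mid t_1) \\
&= \prod_{i=1}^{n} P(w_i \mid t_i) \times P(t_n \mid t_{n-1}) \times P(t_{n-1} \mid t_{n-2}) \times \cdots \times P(t_2 \mid t_1) \\
&= \prod_{i=1}^{n} \left[ P(w_i \mid t_i) \times P(t_i \mid t_{i-1}) \right]
\end{aligned}
$$

In the last line, $P(t_1 \mid t_0)$ is a notational convention for the first tag, which has no predecessor.

The most likely sequence of POS tags for a given sequence of words:

$$
\hat{t}_{1,n} = \arg\max_{t_{1,n}} P(t_{1,n} \mid w_{1,n})
= \arg\max_{t_{1,n}} \prod_{i=1}^{n} P(w_i \mid t_i)\, P(t_i \mid t_{i-1})
$$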
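The slides do not say how the argmax over tag sequences is computed. A common choice is Viterbi dynamic programming, sketched below under several assumptions that are not in the source: $P(w_i \mid t_i)$ and $P(t_i \mid t_{i-1})$ are estimated as add-one smoothed relative frequencies from a tagged corpus, a hypothetical start symbol `<s>` stands in for $t_0$, and the toy corpus and all function names are made up for illustration.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical start symbol playing the role of t_0

def train(tagged_sentences):
    """Count emissions (tag -> word) and transitions (previous tag -> tag)."""
    emit = defaultdict(Counter)   # emit[tag][word]
    trans = defaultdict(Counter)  # trans[prev_tag][tag]
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            emit[tag][word] += 1
            trans[prev][tag] += 1
            prev = tag
    return emit, trans

def log_prob(counter, key, n_outcomes, alpha=1.0):
    """Add-alpha smoothed log relative frequency (smoothing is an assumption)."""
    total = sum(counter.values())
    return math.log((counter[key] + alpha) / (total + alpha * n_outcomes))

def viterbi(words, emit, trans):
    """Return the tag sequence maximising prod_i P(w_i | t_i) P(t_i | t_{i-1})."""
    if not words:
        return []
    tags = sorted(emit)
    n_tags = len(tags)
    # Vocabulary size plus one slot reserved for unseen words.
    n_words = len({w for c in emit.values() for w in c}) + 1
    # best[i][t] = (log score of the best path ending in tag t at i, backpointer)
    best = [{}]
    for t in tags:
        score = log_prob(trans[START], t, n_tags) + log_prob(emit[t], words[0], n_words)
        best[0][t] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            e = log_prob(emit[t], words[i], n_words)
            best[i][t] = max(
                (best[i - 1][p][0] + log_prob(trans[p], t, n_tags) + e, p)
                for p in tags
            )
    # Follow the backpointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return path[::-1]

# Hypothetical toy corpus, only for illustration.
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("run", "NOUN"), ("ends", "VERB")],
    [("dogs", "NOUN"), ("run", "VERB")],
    [("they", "PRON"), ("run", "VERB")],
]
EMIT, TRANS = train(TOY_CORPUS)

if __name__ == "__main__":
    print(viterbi(["the", "run", "ends"], EMIT, TRANS))
    print(viterbi(["they", "run"], EMIT, TRANS))
```

On this toy corpus the word "run" is tagged as a noun after "the" and as a verb after "they", which a most-frequent-tag baseline cannot do. Working in log space keeps the product of many small probabilities from underflowing. The running time is proportional to the sentence length times the square of the number of tags.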