Bayesian Disambiguation Algorithm

Training:
for all senses sk of w do
    for all vj in vocabulary do
        P(vj | sk) = C(vj, sk) / C(sk)
    end
end
for all senses sk of w do
    P(sk) = C(sk) / C(w)
end

Disambiguation:
for all senses sk of w do
    score(sk) = log P(sk)
    for all vj in context window c do
        score(sk) = score(sk) + log P(vj | sk)
    end
end
choose argmax_sk score(sk)

Gale, Church, and Yarowsky obtain 90% correct disambiguation on 6 ambiguous nouns in the Hansard corpus using this approach (e.g., drug as a medication vs. an illicit substance).
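The training and disambiguation loops above translate almost directly into code. The Python sketch below is an illustration rather than the authors' implementation: the sense-tagged training pairs, the variable names, and the add-one smoothing of P(vj | sk) are assumptions added here to keep the toy example self-contained and runnable.

import math
from collections import Counter, defaultdict

def train_naive_bayes(instances):
    """instances: list of (context_words, sense) pairs for one ambiguous word w.
    Returns estimates of the prior P(sk) and the conditionals P(vj|sk).
    Add-one smoothing is an assumption beyond the slide's raw relative frequencies."""
    sense_counts = Counter()              # C(sk)
    word_counts = defaultdict(Counter)    # C(vj, sk)
    vocab = set()
    for context, sense in instances:
        sense_counts[sense] += 1
        for v in context:
            word_counts[sense][v] += 1
            vocab.add(v)
    total = sum(sense_counts.values())    # C(w)
    prior = {s: c / total for s, c in sense_counts.items()}
    cond = {}
    for s in sense_counts:
        denom = sum(word_counts[s].values()) + len(vocab)
        cond[s] = {v: (word_counts[s][v] + 1) / denom for v in vocab}
    return prior, cond, vocab

def disambiguate(context, prior, cond, vocab):
    """score(sk) = log P(sk) + sum of log P(vj|sk) over context words; pick the argmax."""
    best_sense, best_score = None, float("-inf")
    for s in prior:
        score = math.log(prior[s])
        for v in context:
            if v in vocab:
                score += math.log(cond[s][v])
        if score > best_score:
            best_sense, best_score = s, score
    return best_sense

# Toy data echoing the drug example (the contexts are invented for illustration).
train = [
    (["prices", "prescription", "patients"], "medication"),
    (["doctor", "prescription", "dose"], "medication"),
    (["illegal", "trafficking", "abuse"], "illicit substance"),
    (["abuse", "dealers", "street"], "illicit substance"),
]
prior, cond, vocab = train_naive_bayes(train)
print(disambiguate(["prescription", "dose"], prior, cond, vocab))  # -> medication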
Supervised Disambiguation: An Information-Theoretic Approach

◼ (Brown et al., 1991) attempt to find a single contextual feature that reliably indicates which sense of an ambiguous word is being used.
◼ For example, the French verb prendre has two different readings that are affected by the word appearing in object position (mesure → to take, décision → to make), but the verb vouloir's reading is affected by tense (present → to want, conditional → to like).
◼ To make good use of an informant, its values need to be categorized as to which sense they indicate (e.g., mesure → to take, décision → to make); Brown et al. use the Flip-Flop algorithm to do this.
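To make the notion of an informant concrete, the toy lookup below is assumed here for illustration (it is not Brown et al.'s system): each ambiguous word is paired with a single informant feature and a table that categorizes the informant's values by the sense they indicate.

# One informant per ambiguous word, with its values categorized by sense.
# Words, features, and mappings are taken from the prendre/vouloir example above.
informants = {
    "prendre": ("object", {"mesure": "to take", "décision": "to make"}),
    "vouloir": ("tense",  {"present": "to want", "conditional": "to like"}),
}

def sense_of(word, context_features):
    """context_features: dict of feature name -> observed value, e.g. {"object": "mesure"}."""
    feature, value_to_sense = informants[word]
    return value_to_sense.get(context_features.get(feature), "unknown")

print(sense_of("prendre", {"object": "décision"}))    # -> to make
print(sense_of("vouloir", {"tense": "conditional"}))  # -> to like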
Supervised Disambiguation: An Information-Theoretic Approach

◼ Let t1, …, tm be translations for an ambiguous word and x1, …, xn be possible values of the indicator.
◼ The Flip-Flop algorithm is used to disambiguate between the different senses of a word using mutual information:
        I(X; Y) = Σx∈X Σy∈Y p(x, y) log [ p(x, y) / (p(x) p(y)) ]
◼ See Brown et al. for an extension to more than two senses.
◼ The algorithm works by searching for partitions of the translations and of the indicator values that maximize the mutual information between them; it stops when the increase in mutual information becomes insignificant.
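The sketch below illustrates the Flip-Flop idea on toy (translation, indicator-value) pairs. It is a deliberately simplified, assumption-laden version: exhaustive search over two-way partitions stands in for the efficient splitting step used in practice, and a fixed iteration count replaces the stopping test on the increase in mutual information.

from itertools import combinations
from math import log2
from collections import Counter

def mutual_information(pairs, t_part, x_part):
    """I between the two-way partition of translations (t_part = one group) and the
    two-way partition of indicator values (x_part), estimated from (t, x) pairs:
    I = sum over cells of p(a, b) log2[ p(a, b) / (p(a) p(b)) ]."""
    n = len(pairs)
    joint = Counter((t in t_part, x in x_part) for t, x in pairs)
    pt = Counter(t in t_part for t, _ in pairs)
    px = Counter(x in x_part for _, x in pairs)
    return sum((c / n) * log2((c / n) / ((pt[a] / n) * (px[b] / n)))
               for (a, b), c in joint.items())

def proper_subsets(items):
    """All non-empty proper subsets, i.e. candidate two-way partitions."""
    items = list(items)
    return [set(s) for r in range(1, len(items)) for s in combinations(items, r)]

def flip_flop(pairs, iterations=10):
    """Alternately choose the indicator partition and the translation partition
    that maximize mutual information (brute force; fine for toy data)."""
    translations = {t for t, _ in pairs}
    values = {x for _, x in pairs}
    t_part = proper_subsets(translations)[0]   # arbitrary initial partition
    x_part = None
    for _ in range(iterations):
        x_part = max(proper_subsets(values),
                     key=lambda q: mutual_information(pairs, t_part, q))
        t_part = max(proper_subsets(translations),
                     key=lambda p: mutual_information(pairs, p, x_part))
    return t_part, x_part

# Toy data: prendre's translation paired with the word in object position.
pairs = [("to take", "mesure"), ("to take", "initiative"), ("to take", "mesure"),
         ("to make", "décision"), ("to make", "parole"), ("to make", "décision")]
print(flip_flop(pairs))  # groups {mesure, initiative} with "to take", {décision, parole} with "to make"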
Mutual Information

◼ I(X; Y) = H(X) - H(X|Y) = H(Y) - H(Y|X), the mutual information between X and Y, is the reduction in uncertainty of one random variable due to knowing about another, or, in other words, the amount of information one random variable contains about another.
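A quick numeric check, using an illustrative joint distribution assumed here, shows that the double-sum definition of I(X; Y) and the entropy-difference form H(X) - H(X|Y) give the same value.

from math import log2

# Illustrative joint distribution p(x, y) over two binary variables.
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
px = {x: sum(v for (a, _), v in p.items() if a == x) for x in (0, 1)}
py = {y: sum(v for (_, b), v in p.items() if b == y) for y in (0, 1)}

def H(dist):
    """Entropy (in bits) of a distribution given as {outcome: probability}."""
    return -sum(q * log2(q) for q in dist.values() if q > 0)

# H(X|Y) = sum over y of p(y) * H(X | Y = y)
H_X_given_Y = sum(py[y] * H({x: p[(x, y)] / py[y] for x in (0, 1)}) for y in (0, 1))

mi_from_entropy = H(px) - H_X_given_Y
mi_from_sum = sum(p[(x, y)] * log2(p[(x, y)] / (px[x] * py[y]))
                  for x in (0, 1) for y in (0, 1))
print(round(mi_from_entropy, 6), round(mi_from_sum, 6))  # both ~ 0.278072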
Mutual Information (cont)

I(X; Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)

◼ I(X; Y) is a symmetric, non-negative measure of the common information of two variables.
◼ Some see it as a measure of dependence between two variables, but it is better to think of it as a measure of independence.
◼ I(X; Y) is 0 only when X and Y are independent: H(X|Y) = H(X).
◼ For two dependent variables, I grows not only according to the degree of dependence but also according to the entropy of the two variables.
◼ H(X) = H(X) - H(X|X) = I(X; X) ➔ This is why entropy is called self-information.
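Both the independence property and the self-information identity can be checked numerically; the distributions in the sketch below are illustrative assumptions, not taken from the slides.

from math import log2

def mi(p, xs, ys):
    """I(X; Y) from a joint distribution {(x, y): prob}; zero cells contribute nothing."""
    px = {x: sum(p.get((x, y), 0) for y in ys) for x in xs}
    py = {y: sum(p.get((x, y), 0) for x in xs) for y in ys}
    return sum(q * log2(q / (px[x] * py[y])) for (x, y), q in p.items() if q > 0)

# Independent X and Y: p(x, y) = p(x) p(y)  ->  I(X; Y) = 0
independent = {(x, y): 0.5 * 0.5 for x in (0, 1) for y in (0, 1)}
print(mi(independent, (0, 1), (0, 1)))  # 0.0

# Y is an exact copy of X: I(X; X) = H(X) = 1 bit for a fair coin (self-information)
copy = {(0, 0): 0.5, (1, 1): 0.5}
print(mi(copy, (0, 1), (0, 1)))         # 1.0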