Representation of Probabilistic Graphical Models

Structure: a graph G(V, E). Parameters: CPTs.

A Bayesian network (BN) represents the joint distribution of a set of n (discrete) variables, X1, X2, ..., Xn, as a directed acyclic graph (DAG) and a set of conditional probability tables (CPTs). Each node, which corresponds to a variable, has an associated CPT that contains the probability of each state of the variable given its parents in the graph. The structure of the network implies a set of conditional independence assertions, which give this representation its power.

A PGM is specified by two aspects: (i) a graph, G(V, E), that defines the structure of the model; and (ii) a set of local functions, f(Yi), that define the parameters, where Yi is a subset of X. The joint probability is obtained as the product of the local functions:

    P(X1, X2, ..., XN) = K ∏_{i=1}^{M} f(Yi)

where K is a normalization constant. This representation in terms of a graph and a set of local functions (called potentials) is the basis for inference and learning in PGMs.

Luis Enrique Sucar, Probabilistic Graphical Models: Principles and Applications, Advances in Computer Vision and Pattern Recognition, 2015.
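To make the factorized representation concrete, here is a minimal sketch in plain Python. The three-variable chain A → B → C and all CPT numbers are invented for illustration; they are not from the slides or the book. Note that for a Bayesian network the CPTs are already normalized, so the constant K is 1 (general potentials would need explicit normalization).

```python
from itertools import product

# Hypothetical toy network A -> B -> C. The graph structure is encoded
# by each CPT's conditioning set; the joint is a product of CPT entries.
P_A = {True: 0.3, False: 0.7}                      # P(A)
P_B_given_A = {True:  {True: 0.8, False: 0.2},     # P(B | A): one row
               False: {True: 0.1, False: 0.9}}     # per parent state
P_C_given_B = {True:  {True: 0.5, False: 0.5},     # P(C | B)
               False: {True: 0.4, False: 0.6}}

def joint(a, b, c):
    """P(A=a, B=b, C=c) = P(a) * P(b|a) * P(c|b)."""
    return P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c]

# Sanity check: the product of normalized CPTs already sums to 1 (K = 1).
total = sum(joint(*v) for v in product([True, False], repeat=3))
assert abs(total - 1.0) < 1e-9
```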
Inference in Probabilistic Graphical Models

A Bayesian network with 5 variables and the joint distribution it represents (figure: a DAG with edges A→B, B→C, B→D, C→E, D→E):

    P(A, B, C, D, E) = P(A) P(B|A) P(C|B) P(D|B) P(E|C,D)

If the variable E = e is observed, i.e., the evidence is given, and we want to compute the conditional probability P(c|e) of the variable C = c (inference), then

    P(c|e) = (1/Z) Σ_{a,b,d} P(a, b, c, d, e),    where Z = Σ_{a,b,c,d} P(a, b, c, d, e).

Inference methods are either exact or approximate.
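Below is a minimal sketch of this query as exact inference by enumeration, in plain Python. Only the network structure and the form of the query come from the slide; every CPT number is a made-up placeholder.

```python
from itertools import product

# Placeholder CPTs for P(A,B,C,D,E) = P(A) P(B|A) P(C|B) P(D|B) P(E|C,D).
P_A = {0: 0.6, 1: 0.4}
P_B_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # P_B_A[a][b]
P_C_B = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}   # P_C_B[b][c]
P_D_B = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.3, 1: 0.7}}   # P_D_B[b][d]
P_E_CD = {(c, d): {0: 0.5 + 0.1 * (c - d), 1: 0.5 - 0.1 * (c - d)}
          for c in (0, 1) for d in (0, 1)}            # P_E_CD[(c,d)][e]

def joint(a, b, c, d, e):
    return P_A[a] * P_B_A[a][b] * P_C_B[b][c] * P_D_B[b][d] * P_E_CD[(c, d)][e]

def posterior_C(e):
    """P(C | E=e): sum out A, B, D, then normalize by Z = P(E=e)."""
    unnorm = {c: sum(joint(a, b, c, d, e)
                     for a, b, d in product((0, 1), repeat=3))
              for c in (0, 1)}
    Z = sum(unnorm.values())     # Z = sum over a,b,c,d of P(a,b,c,d,e)
    return {c: p / Z for c, p in unnorm.items()}

print(posterior_C(e=1))
```

Enumeration is the simplest exact method, but its cost grows exponentially with the number of variables; exact algorithms such as variable elimination exploit the factorization to sum variables out more cheaply, and approximate methods (e.g., sampling) are used when exact inference is intractable.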
Learning in Probabilistic Graphical Models

• Known graph structure: learning means estimating the parameters.
  • Two families of methods are common: maximum likelihood estimation (MLE) and Bayesian estimation. The former treats the model parameters as fixed values; the latter treats them as random variables.
  • MLE: with complete data, parameter learning reduces to computing sufficient statistics; with incomplete data, the EM algorithm is used to iteratively maximize p(x | θ).
  • Bayesian estimation: with complete data, different error criteria induce either maximum a posteriori (MAP) estimation or posterior-mean estimation; with incomplete data, θ can be treated as a special hidden variable, reducing learning to an inference problem that can be solved approximately with variational Bayes.
• Unknown graph structure:
  • With complete data, a good approach is to define a score function that measures how well a structure fits the data, and then search for the highest-scoring structure. In practice, following Occam's razor, one chooses the simplest model that fits the data. If the structure is assumed to be a tree (each node has at most one parent), the search can be completed in polynomial time; otherwise the problem is NP-hard.
  • With incomplete data, the structural EM algorithm is needed.

For example, a Naive Bayes text classifier learned this way labels a document d (with tokens t1, ..., t_{n_d}) by the MAP rule (see the sketch below):

    c_MAP = argmax_{c∈C} P(c|d) = argmax_{c∈C} P(c) ∏_{1≤k≤n_d} P(t_k | c)
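A minimal sketch tying the pieces together: with complete data, MLE reduces to counting sufficient statistics (class counts and per-class token counts), and the learned parameters feed the c_MAP rule above. The toy corpus is invented, and add-one (Laplace) smoothing is an extra assumption, not something stated on the slide.

```python
import math
from collections import Counter, defaultdict

# Invented toy corpus: (document text, class label).
docs = [("china beijing china", "zh"),
        ("china china shanghai", "zh"),
        ("tokyo japan", "jp")]

# Sufficient statistics under complete data: just counts.
class_count = Counter()
token_count = defaultdict(Counter)
for text, c in docs:
    class_count[c] += 1
    token_count[c].update(text.split())

vocab = {t for counts in token_count.values() for t in counts}
N = sum(class_count.values())

def c_map(doc):
    """argmax_c log P(c) + sum_k log P(t_k | c), with add-one smoothing."""
    best_c, best_score = None, -math.inf
    for c in class_count:
        score = math.log(class_count[c] / N)                   # log P(c)
        denom = sum(token_count[c].values()) + len(vocab)
        for t in doc.split():
            score += math.log((token_count[c][t] + 1) / denom) # log P(t|c)
        if score > best_score:
            best_c, best_score = c, score
    return best_c

print(c_map("china china tokyo"))   # -> 'zh' on this toy corpus
```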
Summary: Representation, Inference, Learning

• Representation
  • a graph ← structure: G(V, E)
  • a set of local functions (called potentials) ← parameters: CPTs
• Inference
  • answering different probabilistic queries based on the model and some evidence
  • obtaining the marginal or conditional probabilities of any subset of variables Z given any other subset Y
• Learning
  • given a set of data values for X (possibly incomplete), estimating the structure (graph) and parameters (local functions) of the model
Common Types of Probabilistic Graphical Models

PGMs fall into two structural families: those built on a directed acyclic graph and those built on an undirected graph.

Table 1.3 Main types of probabilistic graphical models

  Type                               Directed/Undirected   Static/Dynamic   Prob./Decisional
  Bayesian classifiers               D                     S                P
  Markov chains                      D                     D                P
  Hidden Markov models               D                     D                P
  Markov random fields               U                     S                P
  Bayesian networks                  D                     S                P
  Dynamic Bayesian networks          D                     D                P
  Influence diagrams                 D                     S                D
  Markov decision processes (MDPs)   D                     D                D
  Partially observable MDPs          D                     D                D

Luis Enrique Sucar, Probabilistic Graphical Models: Principles and Applications, Advances in Computer Vision and Pattern Recognition, 2015.