Latent Wishart Processes for Relational Kernel Learning

Wu-Jun Li
Department of Computer Science and Engineering
Hong Kong University of Science and Technology
Hong Kong, China

Joint work with Zhihua Zhang and Dit-Yan Yeung

Li, Zhang and Yeung (CSE, HKUST), LWP, AISTATS 2009
Contents

1 Introduction
2 Preliminaries
  - Gaussian Processes
  - Wishart Processes
3 Latent Wishart Processes
  - Model Formulation
  - Learning
  - Out-of-Sample Extension
4 Relation to Existing Work
5 Experiments
6 Conclusion and Future Work
Introduction: Relational Learning

Traditional machine learning models:
  - Assumption: i.i.d.
  - Advantage: simple

Many real-world applications:
  - Relational: instances are related (linked) to each other
  - Autocorrelation: statistical dependency between the values of a random variable on related objects (non-i.i.d.)
  - E.g., web pages, protein-protein interaction data

Relational learning:
  - An emerging research area attempting to represent, reason, and learn in domains with complex relational structure [Getoor & Taskar, 2007].
  - Application areas: web mining, social network analysis, bioinformatics, marketing, etc.
Introduction: Relational Kernel Learning

Kernel function:
  - Characterizes the similarity between data instances: K(xi, xj)
  - E.g., K(cat, tiger) > K(cat, elephant)
  - Must satisfy positive semidefiniteness (p.s.d.)

Kernel learning:
  - To learn an appropriate kernel matrix or kernel function for a kernel-based learning method.

Relational kernel learning (RKL):
  - To learn an appropriate kernel matrix or kernel function for relational data by incorporating the relational information between instances into the learning process.
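The p.s.d. requirement above can be checked numerically. As a minimal sketch (the RBF kernel and the instance data here are illustrative choices, not part of the paper), the Gaussian RBF kernel produces a symmetric matrix whose eigenvalues are all non-negative:

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))   # 5 instances in R^3
K = rbf_kernel(X)

# A valid kernel matrix is symmetric p.s.d.: all eigenvalues
# are non-negative (up to numerical tolerance).
print(np.allclose(K, K.T))
print(np.linalg.eigvalsh(K).min() >= -1e-10)
```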
Preliminaries: Gaussian Processes

Stochastic Processes and Gaussian Processes

Stochastic processes:
  - A stochastic process (or random process) y(x) is specified by giving the joint distribution for any finite set of instances {x1, ..., xn} in a consistent manner.

Gaussian processes:
  - A Gaussian process is a distribution over functions y(x) such that the values of y(x) evaluated at an arbitrary set of points {x1, ..., xn} jointly have a Gaussian distribution.
  - Assuming y(x) has zero mean, the specification of a Gaussian process is completed by giving the covariance function of y(x) evaluated at any two values of x, given by the kernel function K(·, ·):

      E[y(xi) y(xj)] = K(xi, xj).
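This finite-dimensional view suggests a direct way to sample from a zero-mean GP prior: evaluate the kernel on a grid of points and draw from the resulting multivariate Gaussian. A sketch, assuming a squared-exponential kernel (the length scale and jitter term are illustrative choices):

```python
import numpy as np

def sq_exp(x1, x2, length_scale=1.0):
    """Squared-exponential covariance: K(x, x') = exp(-(x - x')^2 / (2 l^2))."""
    return np.exp(-(x1 - x2) ** 2 / (2 * length_scale ** 2))

# Evaluate the GP at a finite set of points {x1, ..., xn}: the values
# y(x1), ..., y(xn) are jointly Gaussian with zero mean and
# covariance K[i, j] = K(xi, xj).
x = np.linspace(0, 5, 50)
K = sq_exp(x[:, None], x[None, :])

rng = np.random.default_rng(1)
# Small jitter on the diagonal for numerical stability of the Cholesky factor.
L = np.linalg.cholesky(K + 1e-8 * np.eye(len(x)))
y = L @ rng.standard_normal(len(x))   # one draw from the GP prior
print(y.shape)                        # -> (50,)
```

Repeating the final two lines with fresh Gaussian noise yields independent function draws, each with the smoothness dictated by the kernel's length scale.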