Preliminaries: Wishart Processes

Wishart Processes

• Wishart distribution: An $n \times n$ random symmetric positive definite matrix $A$ is said to have a Wishart distribution with parameters $n$, $q$, and $n \times n$ scale matrix $\Sigma \succ 0$, written as $A \sim W_n(q, \Sigma)$, if its p.d.f. is given by

$$\frac{|A|^{(q-n-1)/2}}{2^{qn/2}\,\Gamma_n(q/2)\,|\Sigma|^{q/2}} \exp\left(-\frac{1}{2}\operatorname{tr}(\Sigma^{-1}A)\right), \quad q \ge n.$$

Here $\Sigma \succ 0$ means that $\Sigma$ is positive definite (p.d.).

• Wishart processes: Given an input space $\mathcal{X} = \{x_1, x_2, \ldots\}$, the kernel function $\{A(x_i, x_j) \mid x_i, x_j \in \mathcal{X}\}$ is said to be a Wishart process (WP) if for any $n \in \mathbb{N}$ and $\{x_1, \ldots, x_n\} \subseteq \mathcal{X}$, the $n \times n$ random matrix $A = [A(x_i, x_j)]_{i,j=1}^{n}$ follows a Wishart distribution.

Li, Zhang and Yeung (CSE, HKUST), LWP, AISTATS 2009
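As a rough numerical illustration of the density above, the following sketch evaluates its logarithm with NumPy. The function names `log_multigamma` and `wishart_logpdf` are hypothetical helpers introduced here, not part of the slides, and $\Gamma_n$ is assumed to be the standard multivariate gamma function, $\Gamma_n(a) = \pi^{n(n-1)/4} \prod_{j=1}^{n} \Gamma(a - (j-1)/2)$.

```python
import math
import numpy as np

def log_multigamma(a, n):
    """Log of the multivariate gamma function Gamma_n(a) (assumed definition)."""
    return n * (n - 1) / 4 * math.log(math.pi) + sum(
        math.lgamma(a - j / 2) for j in range(n)
    )

def wishart_logpdf(A, q, Sigma):
    """Log density of W_n(q, Sigma) at a positive definite matrix A, for q >= n."""
    A = np.asarray(A, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    n = A.shape[0]
    if q < n:
        raise ValueError("the density is only defined for q >= n")
    # Log-determinants are computed stably via slogdet.
    _, logdet_A = np.linalg.slogdet(A)
    _, logdet_S = np.linalg.slogdet(Sigma)
    # tr(Sigma^{-1} A) without forming the inverse explicitly.
    trace_term = np.trace(np.linalg.solve(Sigma, A))
    return (
        0.5 * (q - n - 1) * logdet_A
        - 0.5 * q * n * math.log(2)
        - log_multigamma(q / 2, n)
        - 0.5 * q * logdet_S
        - 0.5 * trace_term
    )
```

For $n = 1$, $q = 2$, $\Sigma = 1$ the density reduces to $\tfrac{1}{2} e^{-a/2}$, which gives a simple sanity check for the helper.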
Relationship between GP and WP

• For any kernel function $A : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$, there exists a function $B : \mathcal{X} \to \mathcal{F}$ s.t. $A(x_i, x_j) = B(x_i)' B(x_j)$, where $\mathcal{X}$ is the input space and $\mathcal{F} \subset \mathbb{R}^q$ is some latent (feature) space (in general the feature space may also be infinite-dimensional).

• Our previous result: $A(x_i, x_j)$ is a Wishart process iff $\{B_k(x)\}_{k=1}^{q}$ are $q$ mutually independent Gaussian processes.

• Let $A = [A(x_i, x_j)]_{i,j=1}^{n}$ and $B = [B(x_1), \ldots, B(x_n)]' = [b_1, \ldots, b_n]'$. Then the $b_i$ are the latent vectors, and $A = BB'$ is a linear kernel in the latent space but is a nonlinear kernel w.r.t. the input space.

Theorem. Let $\Sigma$ be an $n \times n$ positive definite matrix. Then $A$ is distributed according to the Wishart distribution $W_n(q, \Sigma)$ if and only if $B$ is distributed according to the (matrix-variate) Gaussian distribution $N_{n,q}(0, \Sigma \otimes I_q)$.
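The construction $A = BB'$ can be checked numerically. The sketch below draws the $q$ columns of $B$ as independent Gaussian process samples at $n$ inputs, forms $A = BB'$, and compares the Monte Carlo mean of $A$ with $q\Sigma$, the mean of $W_n(q, \Sigma)$. The RBF covariance, its lengthscale, the jitter term, and the names `rbf_kernel` and `sample_wishart_via_gp` are assumptions made for illustration only; the slides do not specify them.

```python
import numpy as np

def rbf_kernel(x, lengthscale=0.3):
    """Squared-exponential covariance on 1-D inputs (an illustrative choice)."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def sample_wishart_via_gp(Sigma, q, rng):
    """Draw B ~ N_{n,q}(0, Sigma (x) I_q) and return (A, B) with A = B B'."""
    n = Sigma.shape[0]
    L = np.linalg.cholesky(Sigma)      # Sigma = L L'
    Z = rng.standard_normal((n, q))    # i.i.d. standard normal entries
    B = L @ Z                          # each column is an independent N(0, Sigma) draw
    return B @ B.T, B

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 4)                 # n = 4 input points
Sigma = rbf_kernel(x) + 1e-6 * np.eye(4)     # small jitter keeps Sigma positive definite
q = 6                                        # number of latent GPs, q >= n

draws = np.stack([sample_wishart_via_gp(Sigma, q, rng)[0] for _ in range(20000)])
mean_A = draws.mean(axis=0)                  # should be close to q * Sigma
```

Each row $b_i$ of `B` plays the role of the latent vector for input $x_i$, so `B @ B.T` is the linear kernel in the latent space described above.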
GP and WP in a Nutshell

• Gaussian distribution: Each sampled instance is a finite-dimensional vector, $v = (v_1, \ldots, v_d)'$.

• Wishart distribution: Each sampled instance is a finite-dimensional p.s.d. matrix, $M \succeq 0$.

• Gaussian process: Each sampled instance is an infinite-dimensional function, $f(\cdot)$.

• Wishart process: Each sampled instance is an infinite-dimensional p.s.d. function, $g(\cdot, \cdot)$.