http://parnec.nuaa.edu.cn
Incorporating prior knowledge on features into learning (AISTATS'07)
⚫ Motivation
⚫ Kernel design by meta-features
⚫ A toy example
⚫ Handwritten digit recognition aided by meta-features
⚫ Towards a theory of meta-features
Kernel design by meta-features
In the standard approach of linear SVM, we solve
$$\min_{w} \|w\|^2 \quad \text{s.t.} \quad (w \cdot x_i)\, y_i \ge 1, \; i = 1, \dots, m$$
which can be viewed as finding the maximum a posteriori hypothesis, under the above constraints, where we have a Gaussian prior on w:
$$P(w) \propto e^{-\frac{1}{2} w^\top C^{-1} w}$$
The covariance matrix C equals the unit matrix, i.e., all weights are assumed to be independent and to have the same variance.
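A small numerical check of this reading (illustration only, not from the paper): with C equal to the identity, the Gaussian log-prior reduces to $-\frac{1}{2}\|w\|^2$, so minimizing $\|w\|^2$ under the margin constraints is exactly maximizing this prior.

```python
import numpy as np

# With C = I, the Gaussian log-prior -0.5 * w^T C^{-1} w equals
# -0.5 * ||w||^2, so the standard SVM objective min ||w||^2 is the
# MAP hypothesis under this prior (subject to the margin constraints).
rng = np.random.default_rng(0)
w = rng.normal(size=5)
C = np.eye(5)

log_prior = -0.5 * w @ np.linalg.inv(C) @ w  # -0.5 * w^T C^{-1} w

assert np.isclose(log_prior, -0.5 * np.sum(w ** 2))
```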
Kernel design by meta-features
We can use meta-features to create a better prior on w: features with similar meta-features are expected to have similar weights, i.e., the weights should be a smooth function of the meta-features.
Use a Gaussian prior on w, defined by a covariance matrix C, where the covariance between a pair of weights is taken to be a decreasing function of the distance between their meta-features:
$$C_{ij} = C(u_i, u_j)$$
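A minimal sketch of such a covariance, assuming a squared-exponential decreasing function and a hypothetical bandwidth sigma (the slides do not fix a particular form):

```python
import numpy as np

# Sketch: covariance between weights i and j as a decreasing function of
# the distance between their meta-features u_i, u_j, here the (assumed)
# squared-exponential form C_ij = exp(-||u_i - u_j||^2 / (2 * sigma^2)).
def meta_feature_covariance(U, sigma=1.0):
    """U: (d, k) array, one k-dimensional meta-feature vector per feature."""
    sq_dists = np.sum((U[:, None, :] - U[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2 * sigma ** 2))

# Example: 1-D meta-features (e.g. a feature's pixel position). Nearby
# features get strongly correlated weights; distant ones are nearly
# independent, which encodes the smoothness assumption in the prior.
U = np.array([[0.0], [0.1], [5.0]])
C = meta_feature_covariance(U)
```

Features 0 and 1 (meta-feature distance 0.1) end up with covariance near 1, while feature 2 (distance 5) is almost uncorrelated with both.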
Kernel design by meta-features
$$\operatorname*{argmin}_{w:\,(w \cdot x_i)\, y_i \ge 1,\; i=1,\dots,m} w^\top C^{-1} w \;\;\xrightarrow{\;\tilde{w} = C^{-1/2} w,\;\; \tilde{x} = C^{1/2} x\;}\;\; \operatorname*{argmin}_{\tilde{w}:\,(\tilde{w} \cdot \tilde{x}_i)\, y_i \ge 1,\; i=1,\dots,m} \|\tilde{w}\|^2$$
◼ The invariance is incorporated by the assumption of smoothness of the weights in the meta-feature space.
◼ Gaussian process: x → y, smoothness of y in the feature space. This work: u → w, smoothness of the weight w in the meta-feature space.
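The reduction above can be checked numerically (a sketch, not the paper's code): substituting $\tilde{w} = C^{-1/2} w$ and $\tilde{x} = C^{1/2} x$ leaves both the objective and the margin constraints unchanged, so one can simply transform the inputs and run an ordinary SVM.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
A = rng.normal(size=(d, d))
C = A @ A.T + d * np.eye(d)        # an arbitrary positive-definite covariance

# Matrix square root via eigendecomposition (C is symmetric PD).
vals, vecs = np.linalg.eigh(C)
C_half = vecs @ np.diag(np.sqrt(vals)) @ vecs.T
C_inv_half = vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

w = rng.normal(size=d)
x = rng.normal(size=d)
w_t, x_t = C_inv_half @ w, C_half @ x  # the substitution

assert np.isclose(w @ np.linalg.inv(C) @ w, w_t @ w_t)  # objective unchanged
assert np.isclose(w @ x, w_t @ x_t)                     # constraints unchanged
```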
Incorporating prior knowledge on features into learning (AISTATS'07)
⚫ Motivation
⚫ Kernel design by meta-features
⚫ A toy example
⚫ Handwritten digit recognition aided by meta-features
⚫ Towards a theory of meta-features