Outline
• Primal SVM
• Model selection
1) Bilevel program for CV
2) Two optimization methods: implicit and explicit
3) Experiments
4) Conclusions
Primal SVM
• Advantages:
1) Simple to implement, theoretically sound, and easy to customize to different tasks such as classification, regression, and ranking.
2) Very fast: linear in the number of samples.
• Difficulty: model selection
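To make the "simple to implement" point concrete, here is a minimal sketch of a primal linear SVM trained by full-batch subgradient descent on the regularized hinge loss. It is an illustration only, not the solver used in the talk: the function names, the toy data, the step size, and the omission of a bias term are all assumptions. Each pass over the data costs time linear in the number of samples.

```python
# Minimal primal linear SVM sketch (assumption: hinge loss + L2 penalty,
# no bias term, fixed step size; not the solver from the slides).

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_primal_svm(X, y, lam=0.01, step=0.1, iters=100):
    """Minimize lam/2 * ||w||^2 + mean(max(0, 1 - y_i * w.x_i)) over w."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        # Subgradient of the regularizer.
        grad = [lam * wj for wj in w]
        # Subgradient of the hinge loss: only margin violators contribute.
        for xi, yi in zip(X, y):
            if yi * dot(w, xi) < 1.0:
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - step * gj for wj, gj in zip(w, grad)]
    return w

def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1

# Hypothetical toy data: two positive and two negative points.
X = [[2.0, 2.0], [3.0, 1.0], [-2.0, -1.0], [-1.0, -3.0]]
y = [1, 1, -1, -1]
w = train_primal_svm(X, y)
predictions = [predict(w, x) for x in X]
```

The regularization weight `lam` is exactly the kind of hyperparameter that model selection has to choose.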
Model selection
• An often-adopted approach: cross-validation (CV) over a grid
• Advantage: simple and almost universal
• Weakness: high computational cost. The number of grid evaluations is the product of the number of grid points for each hyperparameter, so it grows exponentially with the number of hyperparameters.
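The cost of grid-based CV can be sketched with a short count of model fits. The hyperparameter names (`C`, `epsilon`), their values, and the fold count below are illustrative assumptions, not settings from the talk.

```python
from itertools import product

def count_cv_fits(grid, n_folds):
    """Number of model fits for exhaustive grid search with k-fold CV."""
    n_points = 1
    for values in grid.values():
        n_points *= len(values)  # grid size is a product over hyperparameters
    return n_points * n_folds

# Hypothetical grid: two hyperparameters with 5 and 3 candidate values.
grid = {"C": [0.01, 0.1, 1.0, 10.0, 100.0], "epsilon": [0.01, 0.1, 1.0]}
grid_points = list(product(*grid.values()))  # every combination to evaluate
fits_small = count_cv_fits(grid, n_folds=5)

# Four hyperparameters with 10 candidate values each.
big_grid = {f"h{i}": list(range(10)) for i in range(4)}
fits_big = count_cv_fits(big_grid, n_folds=5)
```

Adding one more hyperparameter with 10 candidate values multiplies the number of fits by 10, which is why the grid approach becomes impractical beyond a few hyperparameters.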
Motivation
• CV is naturally and precisely formulated as a bilevel program (BP), shown as follows.
Bilevel CV Problem (BCP)
• Leader (outer level), over the model hyperparameters γ: min_γ L_val
• Follower (inner level), over the weights w: min_w L_trn(w, γ)
Bilevel CV Problem (BCP)
(1) BCP for a single validation and training split:
• The outer-level (leader) problem selects the hyperparameters, γ, to perform well on a validation set.
• The inner-level (follower) problem trains an optimal model for the given hyperparameters and returns a weight vector, w, for validation.
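The leader/follower structure for a single split can be sketched on a toy problem. As a loudly stated assumption, the inner model here is one-dimensional ridge regression, chosen because its solution w(γ) has a closed form; it stands in for the primal SVM and is not the method from the talk. The data, the step size, and all function names are hypothetical. The leader updates γ by projected gradient descent, using the chain rule through the follower's solution.

```python
# Toy single-split bilevel problem. Assumption: 1-D ridge regression stands in
# for the inner-level model; it is not the primal SVM from the slides.
x_trn, y_trn = [1.0, 2.0, 3.0, 4.0], [1.2, 1.9, 3.2, 3.8]
x_val, y_val = [1.5, 2.5], [1.3, 2.2]

SXX = sum(x * x for x in x_trn)
SXY = sum(x * y for x, y in zip(x_trn, y_trn))

def follower(gamma):
    """Inner level: w(gamma) = argmin_w sum((w*x - y)^2) + gamma * w^2."""
    return SXY / (SXX + gamma)

def follower_sensitivity(gamma):
    """Derivative dw/dgamma of the closed-form inner solution."""
    return -SXY / (SXX + gamma) ** 2

def leader_loss(w):
    """Outer level: mean squared error on the validation split."""
    return sum((w * x - y) ** 2 for x, y in zip(x_val, y_val)) / len(x_val)

def leader_gradient(gamma):
    """Chain rule: dL_val/dgamma = dL_val/dw * dw/dgamma."""
    w = follower(gamma)
    dloss_dw = sum(2 * (w * x - y) * x for x, y in zip(x_val, y_val)) / len(x_val)
    return dloss_dw * follower_sensitivity(gamma)

def solve_bilevel(gamma=0.0, step=100.0, iters=200):
    """Projected gradient descent on gamma >= 0 (step size tuned for this toy)."""
    for _ in range(iters):
        gamma = max(0.0, gamma - step * leader_gradient(gamma))
    return gamma, follower(gamma)

gamma_star, w_star = solve_bilevel()
```

Unlike a grid search, the leader never enumerates candidate values of γ: each outer step solves the follower problem once and moves γ in the direction that reduces the validation loss.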