Bilevel CV Problem (BCP) (2)
More specifically: model selection via T-fold CV → BCP!
1) The inner-level problems minimize the regularized training error to determine the best function for the given hyperparameters, one problem per fold.
2) The hyperparameters are the outer-level control variables. The outer-level objective is to minimize the validation error based on the optimal parameters (w) returned for each fold.
Formal Formulation for BCP (1)
• Given a training sample $\Omega := \{(x_j, y_j)\}_{j=1}^{\ell}$, with each $(x_j, y_j) \in \mathbb{R}^{n+1}$.
• T-fold CV: partition $\Omega$ into $T$ equally sized divisions; then for each fold $t = 1, \dots, T$, one of the divisions is used as the validation set, $\Omega_{val}^t$, and the remaining $T-1$ divisions are assigned to the training set, $\Omega_{trn}^t$.
• Let $\gamma \in \mathbb{R}^m$ be the vector of $m$ model hyperparameters and $w^t$ be the model weights for the $t$-th fold.
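The partitioning step above can be sketched in a few lines of NumPy. This is only an illustration: the function name `t_fold_partition`, the random shuffling, and the seed are assumptions, not part of the slides, and folds are only near-equal when T does not divide the sample size.

```python
import numpy as np

def t_fold_partition(num_samples, T, seed=0):
    """Split indices 0..num_samples-1 into T (near-)equal divisions.

    Returns a list of (train_idx, val_idx) pairs, one per fold t.
    Hypothetical helper for illustration; requires T >= 2.
    """
    if T < 2:
        raise ValueError("T-fold CV needs at least two divisions")
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_samples)      # shuffle the sample indices
    divisions = np.array_split(perm, T)      # T divisions of (near-)equal size
    folds = []
    for t in range(T):
        val_idx = divisions[t]               # plays the role of Omega_val^t
        train_idx = np.concatenate(          # remaining T-1 divisions: Omega_trn^t
            [divisions[s] for s in range(T) if s != t]
        )
        folds.append((train_idx, val_idx))
    return folds

folds = t_fold_partition(num_samples=12, T=3)
```

Each sample index lands in exactly one validation set and in the training sets of the other T-1 folds.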
Formal Formulation for BCP (2)
Let $\mathcal{L}_{trn}^t(w^t, \gamma \mid (x_j, y_j) \in \Omega_{trn}^t)$ be the inner-level training function given the $t$-th fold training dataset, and let $\mathcal{L}_{val}^t(w^t, \gamma \mid (x_j, y_j) \in \Omega_{val}^t)$ be the $t$-th outer-level validation loss function given its validation dataset.
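Formal Formulation for BCP (3)
The bilevel program for T-fold CV is:
$$
\begin{aligned}
\min_{w^1, \dots, w^T,\, \gamma} \quad & \sum_{t=1}^{T} \mathcal{L}_{val}^t\big(w^t, \gamma \mid (x_j, y_j) \in \Omega_{val}^t\big) && \text{(outer level)} \\
\text{subject to} \quad & \gamma \in \Gamma, \\
& \text{for } t = 1, \dots, T: \quad w^t \in \arg\min_{w} \big\{ \mathcal{L}_{trn}^t\big(w, \gamma \mid (x_j, y_j) \in \Omega_{trn}^t\big) \big\} && \text{(inner level)}
\end{aligned}
\tag{2}
$$
The BCP is challenging to solve in this form.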
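To make the two levels concrete, here is a minimal sketch under stated assumptions: the inner problem is taken to be ridge regression (so each fold's optimal weights have a closed form), the validation loss is mean squared error, the data are synthetic, and the feasible set Γ is replaced by a finite grid that is searched exhaustively. None of these choices come from the slides, and all function names are hypothetical.

```python
import numpy as np

def inner_solve(X_trn, y_trn, gamma):
    """Inner level: weights minimizing ||X w - y||^2 + gamma * ||w||^2 (ridge, assumed)."""
    n = X_trn.shape[1]
    return np.linalg.solve(X_trn.T @ X_trn + gamma * np.eye(n), X_trn.T @ y_trn)

def outer_objective(X, y, folds, gamma):
    """Outer level: sum over folds of the validation MSE at the inner optimum."""
    total = 0.0
    for train_idx, val_idx in folds:
        w_t = inner_solve(X[train_idx], y[train_idx], gamma)   # w^t for this gamma
        residual = X[val_idx] @ w_t - y[val_idx]
        total += np.mean(residual ** 2)
    return total

# Synthetic regression data (assumption, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.5 * rng.normal(size=60)

# T-fold partition of the sample indices.
T = 3
divisions = np.array_split(rng.permutation(len(y)), T)
folds = [
    (np.concatenate([divisions[s] for s in range(T) if s != t]), divisions[t])
    for t in range(T)
]

# A finite grid stands in for the feasible set Gamma.
gamma_grid = np.logspace(-3, 2, 20)
scores = [outer_objective(X, y, folds, g) for g in gamma_grid]
best_gamma = gamma_grid[int(np.argmin(scores))]
```

Grid search sidesteps the difficulty of the bilevel form only because there is a single hyperparameter here; the cost grows exponentially with the number of hyperparameters m.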
Formal Formulation for BCP (4)
$$
\min_{w,\, \gamma} \; \mathcal{L}_{out}(w, \gamma) \quad \text{s.t.} \quad \gamma \in \Gamma, \quad w \in \arg\min_{w} \mathcal{L}_{in}(w, \gamma) \tag{3}
$$
Two solution methods: 1) implicit and 2) explicit.
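The slides do not spell out the two methods at this point. One common reading of "implicit" is to treat w as an implicit function of γ through the inner optimality condition and to differentiate the outer loss with respect to γ. The sketch below illustrates that reading for a single fold, again assuming a ridge inner problem and a mean-squared-error outer loss; the function names are hypothetical. For ridge, the inner condition (XᵀX + γI) w = Xᵀy gives dw/dγ = -(XᵀX + γI)⁻¹ w. The explicit route, which would typically replace the inner problem by its optimality conditions as constraints, is not sketched here.

```python
import numpy as np

def ridge_weights(X_trn, y_trn, gamma):
    """Inner optimum w(gamma) for ||X w - y||^2 + gamma * ||w||^2 (assumed inner problem)."""
    n = X_trn.shape[1]
    return np.linalg.solve(X_trn.T @ X_trn + gamma * np.eye(n), X_trn.T @ y_trn)

def val_loss(X_val, y_val, w):
    """Outer loss: mean squared validation error."""
    return np.mean((X_val @ w - y_val) ** 2)

def hypergradient(X_trn, y_trn, X_val, y_val, gamma):
    """d(val_loss)/d(gamma) via implicit differentiation of the inner optimality condition."""
    n = X_trn.shape[1]
    H = X_trn.T @ X_trn + gamma * np.eye(n)
    w = np.linalg.solve(H, X_trn.T @ y_trn)
    dw_dgamma = -np.linalg.solve(H, w)                          # from H w = X^T y
    grad_w = 2.0 * X_val.T @ (X_val @ w - y_val) / len(y_val)   # d(val_loss)/dw
    return grad_w @ dw_dgamma                                   # chain rule

# Synthetic single-fold data (assumption, for illustration only).
rng = np.random.default_rng(1)
w_true = rng.normal(size=4)
X_trn = rng.normal(size=(40, 4))
y_trn = X_trn @ w_true + 0.3 * rng.normal(size=40)
X_val = rng.normal(size=(20, 4))
y_val = X_val @ w_true + 0.3 * rng.normal(size=20)

gamma = 1.0
g = hypergradient(X_trn, y_trn, X_val, y_val, gamma)
```

A gradient like `g` could then drive a descent step on γ, with the inner problem re-solved after each update.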