Model Selection Criteria

Bayesian score: posterior probability P(m|D)
  P(m|D) = P(m) P(D|m) / P(D) = P(m) ∫ P(D|m, θ) P(θ|m) dθ / P(D)

BIC score: large-sample approximation of the Bayesian score
  BIC(m|D) = log P(D|m, θ*) − (d/2) log N
  d: number of free parameters; N: sample size.
  θ*: MLE of θ, estimated using the EM algorithm.

Likelihood term of BIC: measures how well the model fits the data.
Second term: penalty for model complexity.
The use of the BIC score indicates that we are looking for a model that fits the data well and, at the same time, is not overly complex.
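As a quick illustration, here is a minimal Python sketch of the BIC computation, assuming the log-likelihood of the data at the MLE θ* (e.g. obtained via EM), the parameter count d, and the sample size N are already available; the function name and arguments are placeholders, not part of the tutorial's software.

```python
import math

def bic_score(loglik_mle, num_free_params, sample_size):
    """BIC(m|D) = log P(D|m, theta*) - (d/2) * log N.

    loglik_mle      -- log P(D|m, theta*), log-likelihood at the MLE theta*
    num_free_params -- d, the number of free parameters of model m
    sample_size     -- N, the number of data cases
    """
    return loglik_mle - 0.5 * num_free_params * math.log(sample_size)
```

Under this sign convention a higher BIC is better: the first term rewards fit, while the second term penalizes complexity and grows with the sample size.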
Model Selection Criteria

AIC (Akaike, 1974):
  AIC(m|D) = log P(D|m, θ*) − d

Holdout likelihood:
  Data => training set, validation set.
  Model parameters are estimated on the training set.
  Model quality is measured by the likelihood on the validation set.

Cross validation: too expensive.
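A minimal sketch of these two alternatives, again with higher values indicating better models; fit_model and loglik stand in for whatever estimation (e.g. EM) and scoring routines are used, and are hypothetical names rather than the tutorial's API.

```python
import random

def aic_score(loglik_mle, num_free_params):
    # AIC(m|D) = log P(D|m, theta*) - d
    return loglik_mle - num_free_params

def holdout_loglik(data, fit_model, loglik, train_fraction=0.8, seed=0):
    """Holdout likelihood: estimate parameters on a training split,
    measure model quality on the held-out validation split."""
    rng = random.Random(seed)
    cases = list(data)
    rng.shuffle(cases)
    cut = int(train_fraction * len(cases))
    train, validation = cases[:cut], cases[cut:]
    model = fit_model(train)          # parameters from the training set only
    return loglik(model, validation)  # likelihood on the validation set
```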
Search Algorithms

Double hill climbing (DHC) (Zhang 2002, 2004): 7 manifest variables.
Single hill climbing (SHC) (Zhang and Kocka 2004): 12 manifest variables.
Heuristic SHC (HSHC) (Zhang and Kocka 2004): 50 manifest variables.
EAST (Chen et al. 2011): 100+ manifest variables.
Double Hill Climbing (DHC)

Two search procedures:
  One for the model structure.
  One for the cardinalities of the latent variables.
Very inefficient: tested only on data sets with 7 or fewer variables (Zhang 2004).

DHC was tested on synthetic and real-world data sets with BIC, AIC, and holdout likelihood in turn. The best models were found when BIC was used, so subsequent work is based on BIC.
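Both procedures are greedy searches over candidate models scored by the chosen criterion. The skeleton below is a generic hill-climbing loop of that kind, offered as a hedged sketch: neighbors and score are hypothetical placeholders (e.g. structural edits or cardinality changes, and BIC), not the tutorial's operators or code.

```python
def hill_climb(initial_model, data, neighbors, score):
    """Greedy hill climbing: repeatedly move to a better-scoring
    neighbor until no candidate improves the current score.

    neighbors(model)   -- yields candidate models (placeholder)
    score(model, data) -- model selection criterion, e.g. BIC (placeholder)
    """
    current = initial_model
    current_score = score(current, data)
    improved = True
    while improved:
        improved = False
        for candidate in neighbors(current):
            s = score(candidate, data)
            if s > current_score:
                current, current_score = candidate, s
                improved = True
    return current
```

Scoring each candidate requires estimating its parameters (e.g. by EM), which is one reason running two such searches becomes expensive as the number of variables grows.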
Single Hill Climbing (SHC)

Determines both the model structure and the cardinalities of the latent variables using a single search procedure.
Uses five search operators (see the sketch after this list):
  Node Introduction (NI)
  Node Deletion (ND)
  Node Relocation (NR)
  State Introduction (SI)
  State Deletion (SD)
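One way to read "a single search procedure" is that, at each step, the candidates produced by all five operators compete in one pool and the best-scoring candidate is taken; the generator below sketches that reading, with the operator functions as hypothetical placeholders rather than the tutorial's implementation.

```python
def shc_candidates(model, operators):
    """Pool the candidates produced by every search operator.

    operators is assumed to be a list of functions, one per operator
    (NI, ND, NR, SI, SD), each yielding modified copies of the model.
    """
    for op in operators:
        yield from op(model)

# Hypothetical usage with the generic hill-climbing loop sketched earlier
# (all names below are placeholders, not the tutorial's API):
# best = hill_climb(
#     initial_model, data,
#     neighbors=lambda m: shc_candidates(m, [node_introduction, node_deletion,
#                                            node_relocation, state_introduction,
#                                            state_deletion]),
#     score=bic_of_model)
```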