Bandit Convex Optimization (BCO)

• BCO with one-point feedback: the learner submits a single point w_t ∈ W, and then receives only the function value f_t(w_t). [Flaxman et al., SODA 2005; Bubeck et al., STOC 2017]

• BCO with two-point feedback: the learner submits two points w_t^1, w_t^2 ∈ W, and then receives only their function values, namely f_t(w_t^1) and f_t(w_t^2). [Agarwal et al., COLT 2010; Shamir, JMLR 2017]

[figure: screenshot of an online recommendation page, as a motivating application]
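To make the two feedback models concrete, here is a minimal protocol sketch in Python. Everything in it (the quadratic losses, the random queries) is a hypothetical stand-in, used only to illustrate what information each model reveals to the learner.

import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 3
targets = rng.standard_normal((T, d))          # hidden from the learner
loss = lambda t, w: float(np.sum((w - targets[t]) ** 2))

# One-point feedback: one query per round, one scalar loss value back.
for t in range(T):
    w = rng.standard_normal(d)                 # learner's single query w_t
    value = loss(t, w)                         # only f_t(w_t) is revealed
    # no gradient and no functional form -- just this one number

# Two-point feedback: two queries per round, two scalar values back.
for t in range(T):
    w1, w2 = rng.standard_normal(d), rng.standard_normal(d)
    v1, v2 = loss(t, w1), loss(t, w2)          # f_t(w_t^1) and f_t(w_t^2)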
A Gentle Start

Online Gradient Descent (OGD)
for t = 1 to T do
    Play model w_t and suffer loss f_t(w_t)
    Update the model: w_{t+1} = Π_W[w_t − η∇f_t(w_t)]   (descend, then project)
end for

[figure: descend-and-project illustration of OGD; image credit: https://www.nature.com/articles/s41534-017-0043-1]

Challenge: with only bandit feedback, the learner cannot evaluate the gradient ∇f_t(w_t).

FKM estimator [Flaxman et al., SODA 2005]: construct the submitted point using the perturbation technique
    ŵ_t ≜ w_t + δs_t,
where s_t is a random unit vector sampled uniformly from the sphere {v : ||v|| = 1}, and define
    g_t ≜ (d/δ) f_t(w_t + δs_t) · s_t
as the gradient estimator. Then E[g_t] = ∇f̂_t(w_t) [proved via Stokes' theorem], where f̂_t(w) ≜ E_{v∈B}[f_t(w + δv)] is the smoothed function and B = {v : ||v|| ≤ 1} is the unit ball.
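A minimal sketch of the FKM estimator described above, assuming the loss is given as a black-box function; the helper name and the quadratic test function are illustrative, not from the slides.

import numpy as np

def fkm_gradient_estimate(f, w, delta, rng):
    """One-point FKM estimator g = (d/delta) * f(w + delta*s) * s, with s
    uniform on the unit sphere; E[g] is the gradient of the smoothed f."""
    d = w.shape[0]
    s = rng.standard_normal(d)
    s /= np.linalg.norm(s)                     # uniform direction on the sphere
    return (d / delta) * f(w + delta * s) * s

# Averaging many one-point estimates approaches the true gradient 2*(w - 1)
# of f(w) = ||w - 1||^2 (smoothing a quadratic does not change its gradient).
rng = np.random.default_rng(0)
f = lambda w: float(np.sum((w - 1.0) ** 2))
w = np.zeros(3)
g = np.mean([fkm_gradient_estimate(f, w, 0.5, rng) for _ in range(100000)], axis=0)
print(g)   # approximately [-2, -2, -2]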
A Gentle Start (cont'd)

To see why the FKM estimator works, consider the 1-dim case (d = 1). The smoothed function is
    f̂(w) = E_{v∈[−1,1]}[f(w + δv)] = (1/(2δ)) ∫_{w−δ}^{w+δ} f(u) du,
so its gradient is
    ∇f̂(w) = (1/(2δ)) (f(w + δ) − f(w − δ)).
On the other hand, with s drawn uniformly from {−1, +1}, the estimator g = (1/δ) f(w + δs) · s has expectation
    E[g] = (1/(2δ)) (f(w + δ) − f(w − δ)) = ∇f̂(w),
so g is an unbiased estimator of the smoothed gradient.
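The one-dimensional identity above is easy to verify numerically; the test function f(w) = (w − 3)² below is an arbitrary choice.

# d = 1: the expectation over s in {-1, +1} of (1/delta) f(w + delta*s) * s
# should equal (f(w + delta) - f(w - delta)) / (2*delta).
f = lambda w: (w - 3.0) ** 2
w, delta = 1.0, 0.1

lhs = 0.5 * ((1 / delta) * f(w + delta) * (+1) + (1 / delta) * f(w - delta) * (-1))
rhs = (f(w + delta) - f(w - delta)) / (2 * delta)
print(lhs, rhs)   # both -4.0, matching f'(1) = 2*(1 - 3) for this quadratic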
Base Algorithm: BGD

• Gradient estimator: g_t = (d/δ) f_t(w_t + δs_t) · s_t
• Perform Online Gradient Descent using this gradient estimator.

Bandit Gradient Descent (BGD)
for t = 1 to T do
    Select a unit vector s_t uniformly at random
    Submit ŵ_t = w_t + δs_t
    Receive f_t(ŵ_t) as the feedback
    Construct the gradient estimator g_t = (d/δ) f_t(ŵ_t) · s_t   (so that E[g_t] = ∇f̂_t(w_t))
    Update w_{t+1} = Π_{(1−α)W}[w_t − ηg_t]
end for
where f̂_t(w) ≜ E_{v∈B}[f_t(w + δv)] is the smoothed loss.
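A runnable sketch of the BGD loop, assuming for simplicity that the domain W is the Euclidean ball of radius R (so the inradius r equals R and projection is a rescaling); all names and default parameters are illustrative.

import numpy as np

def bgd(loss_fns, d, R=1.0, delta=0.1, eta=0.01, seed=0):
    """Bandit Gradient Descent on W = {w : ||w||_2 <= R}, one-point feedback."""
    rng = np.random.default_rng(seed)
    alpha = delta / R                           # shrinkage parameter alpha = delta/r
    w = np.zeros(d)
    total_loss = 0.0
    for f_t in loss_fns:
        s = rng.standard_normal(d)
        s /= np.linalg.norm(s)                  # unit vector s_t, uniform on the sphere
        w_hat = w + delta * s                   # submitted point w_hat_t
        value = f_t(w_hat)                      # the only feedback received
        total_loss += value
        g = (d / delta) * value * s             # one-point gradient estimator
        w = w - eta * g                         # descend
        radius = (1 - alpha) * R                # project onto the shrunk set (1 - alpha)W
        norm = np.linalg.norm(w)
        if norm > radius:
            w *= radius / norm
    return total_loss

# Example: 1000 stationary quadratic losses over the unit ball.
losses = [lambda w: float(np.sum((w - 0.3) ** 2)) for _ in range(1000)]
print(bgd(losses, d=2))

Projecting onto the shrunk set (1 − α)W rather than W itself guarantees that the perturbed query ŵ_t = w_t + δs_t always stays inside the feasible domain W.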
Base Algorithm: Dynamic Regret

Theorem 1. Under certain standard assumptions, for any perturbation parameter δ > 0, step size η > 0, and shrinkage parameter α = δ/r, the expected dynamic regret of BGD(T, δ, α, η) for the one-point feedback model satisfies
    E[Σ_{t=1}^T f_t(ŵ_t)] − Σ_{t=1}^T f_t(u_t) ≤ O( (1 + P_T)/η + η d²T/δ² + δT ),
where P_T = Σ_{t=2}^T ||u_t − u_{t−1}||_2 measures the non-stationarity level of the comparator sequence u_1, …, u_T.

Optimal parameter setting:
    step size η = O(T^{−3/4}(1 + P_T)^{3/4}), perturbation parameter δ = O(T^{−1/4}(1 + P_T)^{1/4})
    ⟹ expected dynamic regret of order O(T^{3/4}(1 + P_T)^{1/4})
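As a quick sanity check (ignoring constants and the dimension d), plugging the stated parameters into the three terms of the bound shows that they balance:

\[
\text{with } \eta = T^{-3/4}(1+P_T)^{3/4} \text{ and } \delta = T^{-1/4}(1+P_T)^{1/4}:\qquad
\frac{1+P_T}{\eta} \;=\; \frac{\eta T}{\delta^2} \;=\; \delta T \;=\; T^{3/4}(1+P_T)^{1/4},
\]

so the expected dynamic regret is O(T^{3/4}(1 + P_T)^{1/4}); when P_T = 0 this recovers the classical O(T^{3/4}) bound of Flaxman et al. for a fixed comparator.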