Reduction for MAB

MAB problem → (reduction) → Hedge for PEA:
- $p_t \in \Delta_K$ denotes the distribution over arms; sample an arm $a_t \sim p_t$.
- $\hat{\ell}_t$ is the estimated loss fed to Hedge.

Question: how to construct the loss estimator $\hat{\ell}_t$?

Advanced Optimization (Fall 2023), Lecture 11: Adversarial Bandits
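The reduction above can be sketched as a loop in which Hedge maintains the distribution $p_t$ over arms and the bandit layer supplies estimated losses. A minimal sketch, assuming a hypothetical helper `mab_via_hedge` where the estimator construction (the open question above) is passed in as a function:

```python
import numpy as np

def mab_via_hedge(K, T, get_loss, make_estimator, eta, seed=0):
    """Sketch of the reduction: Hedge runs on full-information
    *estimated* losses; the bandit layer supplies those estimates."""
    rng = np.random.default_rng(seed)
    L_hat = np.zeros(K)                     # cumulative estimated losses
    total_loss = 0.0
    for t in range(T):
        p = np.exp(-eta * L_hat)
        p /= p.sum()                        # Hedge's play: p_t in the simplex
        a = rng.choice(K, p=p)              # sample arm a_t ~ p_t
        loss = get_loss(t, a)               # bandit feedback: only ell_{t,a_t}
        L_hat += make_estimator(p, a, loss, K)  # hat-ell_t fed to Hedge
        total_loss += loss
    return total_loss
```

The names `get_loss` and `make_estimator` are illustrative; the rest of the lecture is about which `make_estimator` makes this reduction sound.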
Loss Estimator

Idea: ensure $\mathbb{E}_{a_t \sim p_t}\big[\langle p_t, \hat{\ell}_t \rangle\big] = \langle p_t, \ell_t \rangle$ in order to re-use Hedge's regret guarantee.

Importance-Weighted (IW) Loss Estimator:
$$\hat{\ell}_{t,a} = \frac{\ell_{t,a_t}}{p_{t,a_t}} \mathbb{1}\{a = a_t\} = \begin{cases} \ell_{t,a_t}/p_{t,a_t} & \text{if } a = a_t, \\ 0 & \text{else.} \end{cases}$$
Loss Estimator (properties)

IW Loss Estimator: $\hat{\ell}_{t,a} = \ell_{t,a_t}/p_{t,a_t}$ if $a = a_t$; $0$ else.

- Property 1. $\langle p_t, \hat{\ell}_t \rangle = \ell_{t,a_t}$.
- Property 2 (unbiasedness). $\mathbb{E}_{a_t \sim p_t}\big[\hat{\ell}_{t,a}\big] = \ell_{t,a}$, $\forall a \in [K]$.

Proof of Property 2:
$$\mathbb{E}_{a_t \sim p_t}\big[\hat{\ell}_{t,a}\big] = \sum_{k \in [K]} p_{t,k} \cdot \frac{\ell_{t,a}}{p_{t,a}} \mathbb{1}\{k = a\} = p_{t,a} \cdot \frac{\ell_{t,a}}{p_{t,a}} = \ell_{t,a}.$$
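Both properties can be checked numerically by summing over all realizations of $a_t$. A minimal sketch with a randomly drawn $p_t$ and loss vector (names and setup are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)
K = 4
p = rng.dirichlet(np.ones(K))       # p_t: a distribution over K arms
ell = rng.uniform(size=K)           # true loss vector ell_t (unknown to learner)

def iw_estimate(a_t):
    """IW estimator hat-ell_t when arm a_t was played: nonzero only at a_t."""
    est = np.zeros(K)
    est[a_t] = ell[a_t] / p[a_t]
    return est

# Property 1: <p_t, hat-ell_t> = ell_{t,a_t} for every realized arm a_t
for a_t in range(K):
    assert np.isclose(p @ iw_estimate(a_t), ell[a_t])

# Property 2 (unbiasedness): sum_k p_{t,k} * hat-ell_t(k) = ell_t, entrywise
expectation = sum(p[k] * iw_estimate(k) for k in range(K))
assert np.allclose(expectation, ell)
```

Note that Property 1 holds for every realization, not just in expectation: the scalar loss Hedge suffers coincides exactly with the loss of the arm actually played.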
Other Choice

Another estimator coming to mind:
$$\hat{\ell}_t = [0, \ldots, 0, \ell_{t,a_t}, 0, \ldots, 0]^\top \quad (a_t\text{-th entry}).$$

But then
$$\langle p_t, \hat{\ell}_t \rangle = p_{t,a_t} \ell_{t,a_t} \neq \ell_{t,a_t},$$
i.e., the loss fed to Hedge differs from the loss suffered in MAB, so Hedge's regret guarantee cannot be applied.
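The failure of this naive estimator can be seen numerically: its expectation is $p_{t,a}\,\ell_{t,a}$, a shrunken (biased) version of the true loss. A minimal sketch, with illustrative random values:

```python
import numpy as np

rng = np.random.default_rng(2)
K = 4
p = rng.dirichlet(np.ones(K))           # p_t over K arms
ell = rng.uniform(0.1, 1.0, size=K)     # true losses, bounded away from 0

def naive_estimate(a_t):
    """Plug the observed loss in directly, without importance weighting."""
    est = np.zeros(K)
    est[a_t] = ell[a_t]
    return est

# E_{a_t~p_t}[hat-ell_{t,a}] = p_{t,a} * ell_{t,a}: biased whenever p_{t,a} < 1
expectation = sum(p[k] * naive_estimate(k) for k in range(K))
assert np.allclose(expectation, p * ell)
assert not np.allclose(expectation, ell)
```

Dividing each observed loss by $p_{t,a_t}$ is precisely what cancels this $p_{t,a}$ factor, which is how the IW estimator restores unbiasedness.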
Importance-Weighted Loss Estimator

Importance weighting estimator:
$$\hat{\ell}_t = \Big[0, \ldots, 0, \frac{\ell_{t,a_t}}{p_{t,a_t}}, 0, \ldots, 0\Big]^\top \quad (a_t\text{-th entry}),$$
balancing exploitation ($\ell_{t,a_t}$) and exploration ($p_{t,a_t}$).
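One way to see why the $p_{t,a_t}$ term in the denominator creates an exploration pressure: although the IW estimator is unbiased, its second moment is $\mathbb{E}\big[\hat{\ell}_{t,a}^2\big] = \ell_{t,a}^2 / p_{t,a}$, which blows up for arms played with small probability. A minimal numeric sketch (the probability values are illustrative assumptions, not from the lecture):

```python
import numpy as np

p = np.array([0.90, 0.09, 0.01])    # hypothetical p_t over three arms
ell = np.array([0.5, 0.5, 0.5])     # identical true loss on every arm

# Second moment of the IW estimator at arm a:
#   E[hat-ell_{t,a}^2] = sum_k p_k * (ell_a / p_a)^2 * 1{k = a} = ell_a^2 / p_a
second_moment = ell ** 2 / p        # rarely played arms dominate the variance
```

Identical losses thus yield wildly different estimator variances, so the distribution $p_t$ must keep every arm's probability from collapsing; this is the exploration side of the balance.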