Advanced Optimization (Fall 2023) Lecture 12. Stochastic Bandits
Upper Confidence Bound
• ETE: relies on the estimates from the previous $T_0$ rounds. There is no way to revise the estimate!
[Figure: true means $\mu(a)$ and empirical estimates $\hat{\mu}(a)$ for arms $a = 1, 2, 3$.]
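To make the limitation of ETE concrete, here is a minimal sketch, assuming ETE stands for explore-then-exploit with a round-robin exploration phase of $T_0$ rounds. The Bernoulli rewards, the function name `explore_then_exploit`, and its parameters are illustrative assumptions, not taken from the slides.

```python
import random


def explore_then_exploit(means, t0, horizon, seed=0):
    """Explore round-robin for t0 rounds, then commit to the empirical best arm.

    `means` holds hypothetical Bernoulli reward means, unknown to the learner.
    """
    k = len(means)
    assert k <= t0 <= horizon, "each arm must be explored at least once"
    rng = random.Random(seed)
    counts = [0] * k   # number of pulls per arm
    sums = [0.0] * k   # cumulative reward per arm

    def pull(arm):
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    # Exploration phase: the only rounds that shape the estimates.
    for t in range(t0):
        pull(t % k)

    estimates = [sums[a] / counts[a] for a in range(k)]
    committed = max(range(k), key=lambda a: estimates[a])

    # Exploitation phase: the choice is frozen, so a wrong estimate
    # from the first t0 rounds is never revised.
    for _ in range(t0, horizon):
        pull(committed)

    return committed, estimates, counts
```

Whatever arm looks best after the first `t0` rounds receives every remaining pull, even if later rewards would contradict the early estimate.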
Upper Confidence Bound
• UCB: with high probability, $\mu(a) < \mathrm{UCB}_t(a) = \hat{\mu}_t(a) + \beta_t(a)$.
• Optimism in the Face of Uncertainty: $a_t = \arg\max_{a \in [K]} \mathrm{UCB}_t(a)$.
[Figure: for each arm $a = 1, 2, 3$, the true mean $\mu(a)$, the empirical estimate $\hat{\mu}_t(a)$, and the upper confidence bound $\mathrm{UCB}_t(a)$ lying $\beta_t(a)$ above the estimate.]
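The optimism rule can be sketched in a few lines. The slides do not specify the confidence radius here, so the code assumes the common choice $\beta_t(a) = \sqrt{2 \ln t / n_t(a)}$, where $n_t(a)$ is the number of pulls of arm $a$ so far. The Bernoulli rewards and the names `ucb_index` and `run_ucb` are likewise illustrative assumptions.

```python
import math
import random


def ucb_index(mu_hat, n_pulls, t):
    """Empirical mean plus an assumed radius sqrt(2 ln t / n_pulls)."""
    return mu_hat + math.sqrt(2.0 * math.log(t) / n_pulls)


def run_ucb(means, horizon, seed=0):
    """Play the arm with the largest upper confidence bound in every round.

    `means` holds hypothetical Bernoulli reward means, unknown to the learner.
    """
    k = len(means)
    rng = random.Random(seed)
    counts = [0] * k   # n_t(a): pulls of each arm so far
    sums = [0.0] * k   # cumulative reward of each arm

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once so each estimate is defined
        else:
            arm = max(
                range(k),
                key=lambda a: ucb_index(sums[a] / counts[a], counts[a], t),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward  # the estimate is revised after every pull

    return counts
```

Unlike ETE, every pull updates both the estimate and the radius: a rarely pulled arm keeps a wide bound and is eventually tried again, while a frequently pulled arm has its bound shrink toward its empirical mean.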