Lecture 3. Gradient Descent Method
Advanced Optimization (Fall 2023)
Peng Zhao
zhaop@lamda.nju.edu.cn
School of Artificial Intelligence, Nanjing University
Outline
• Gradient Descent
• Convex and Lipschitz
• Polyak Step Size
• Convergence without Optimal Value
• Optimal Time-Varying Step Sizes
• Strongly Convex and Lipschitz
Part 1. Gradient Descent
• Convex Optimization Problem
• Gradient Descent
• Performance Measure
• The First Gradient Descent Lemma
Convex Optimization Problem
• We adopt a minimization language:

  $\min_{x} \; f(x) \quad \text{s.t.} \quad x \in \mathcal{X}$

• optimization variable $x \in \mathbb{R}^d$
• objective function $f: \mathbb{R}^d \to \mathbb{R}$: convex and continuously differentiable
• feasible domain $\mathcal{X} \subseteq \mathbb{R}^d$: convex
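As a concrete instance of these three ingredients, the sketch below sets up an illustrative problem (an assumed example for exposition, not one taken from the lecture): a least-squares objective $f(x) = \frac{1}{2}\|Ax - b\|^2$, which is convex and continuously differentiable, over the convex feasible domain $\mathcal{X} = \{x : \|x\| \leq R\}$, together with the Euclidean projection onto $\mathcal{X}$ that a constrained method would use.

```python
import numpy as np

# Illustrative convex problem (an assumed example, not from the slides):
#   min_x  f(x) = 0.5 * ||A x - b||^2    s.t.  ||x|| <= R
# f is convex and continuously differentiable; the ball X is a convex set.

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))     # problem data defining the objective
b = rng.standard_normal(20)
R = 1.0                              # radius of the feasible ball X

def f(x):
    return 0.5 * np.linalg.norm(A @ x - b) ** 2

def grad_f(x):
    return A.T @ (A @ x - b)

def project_onto_X(x):
    """Euclidean projection onto X = {x : ||x|| <= R}."""
    norm = np.linalg.norm(x)
    return x if norm <= R else (R / norm) * x
```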
Goal
To output a sequence $\{x_t\}$ such that $\bar{x}_T$ approximates $x^*$ as $T$ grows larger.
• Function-value level: $f(\bar{x}_T) - f(x^*) \leq \varepsilon(T)$
• Optimizer-value level: $\|\bar{x}_T - x^*\| \leq \varepsilon(T)$
where $\bar{x}_T$ can be a statistic of the original sequence $\{x_t\}_{t=1}^T$ (e.g., the last iterate or the average of the iterates), and $\varepsilon(T)$ is the approximation error, a function of the number of iterations $T$.
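To see how the two performance measures are evaluated in practice, here is a minimal sketch under stated assumptions: the same least-squares objective as above but unconstrained (i.e., $\mathcal{X} = \mathbb{R}^d$), so that $x^*$ is available in closed form; a fixed illustrative step size $\eta = 1/L$ with $L$ the smoothness constant; and the last iterate used as $\bar{x}_T$. It runs the standard gradient descent update $x_{t+1} = x_t - \eta \nabla f(x_t)$ and reports both the function-value gap and the optimizer-value gap.

```python
import numpy as np

# Minimal sketch (assumed setup): track both performance measures for
# gradient descent on the unconstrained least-squares objective
#   f(x) = 0.5 * ||A x - b||^2,
# whose minimizer x* has a closed form, so both gaps can be computed exactly.

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

def f(x):
    return 0.5 * np.linalg.norm(A @ x - b) ** 2

def grad_f(x):
    return A.T @ (A @ x - b)

x_star = np.linalg.lstsq(A, b, rcond=None)[0]    # minimizer of f over R^d

L = np.linalg.norm(A, 2) ** 2                    # smoothness constant of f
eta = 1.0 / L                                    # illustrative fixed step size
x = np.zeros(A.shape[1])
T = 500
for t in range(T):
    x = x - eta * grad_f(x)                      # x_{t+1} = x_t - eta * grad f(x_t)

# Here the last iterate x_T plays the role of the statistic \bar{x}_T.
print("function-value gap  f(x_T) - f(x*):", f(x) - f(x_star))
print("optimizer-value gap ||x_T - x*||  :", np.linalg.norm(x - x_star))
```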