Example
▶ Suppose the training set contains 3 instance points $(x_1, y_1), (x_2, y_2), (x_3, y_3)$, where $x_1 = -1, y_1 = 2$, $x_2 = 0, y_2 = 3$, $x_3 = 1, y_3 = 5$. Using a linear regression model with the squared loss, find the analytic solution of $y = wx + b$.
▶ Construct the design matrix and target vector

$$X = \begin{pmatrix} -1 & 1 \\ 0 & 1 \\ 1 & 1 \end{pmatrix}, \qquad \vec{d} = \begin{pmatrix} 2 \\ 3 \\ 5 \end{pmatrix}$$

Minimizing the squared loss by least squares gives

$$W = (X^\top X)^{-1} X^\top \vec{d} = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}^{-1} \begin{pmatrix} -1 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \\ 5 \end{pmatrix} = \begin{pmatrix} 3/2 \\ 10/3 \end{pmatrix}$$

Hence $w = 3/2$, $b = 10/3$.
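The closed-form solution above can be checked numerically. A minimal sketch using NumPy (variable names `w`, `b`, `X`, `d` mirror the slide):

```python
import numpy as np

# Training data from the example: (x_i, y_i) pairs.
x = np.array([-1.0, 0.0, 1.0])
d = np.array([2.0, 3.0, 5.0])

# Design matrix with a column of ones appended for the bias term b.
X = np.column_stack([x, np.ones_like(x)])

# Closed-form least-squares solution W = (X^T X)^{-1} X^T d.
W = np.linalg.inv(X.T @ X) @ X.T @ d
w, b = W  # w = 3/2, b = 10/3
```

In practice one would call `np.linalg.lstsq` rather than inverting $X^\top X$ explicitly, but the explicit form matches the derivation on the slide.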
3.3. From Linear to Nonlinear: Using Linear Models
Least Squares Regression
▶ Least Squares Regression:

$$[\hat{\beta}_0, \hat{\beta}_1] = \arg\min_{\beta_0, \beta_1} \sum_{i=1}^{N} \left(Y_i - (\beta_0 + \beta_1 X_i)\right)^2$$

▶ Statistical model:

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad (i = 1, 2, \cdots, N)$$

where $\varepsilon_i$ is zero-mean noise for $i = 1, 2, \cdots, N$. Ideally the noise should be i.i.d. zero-mean Gaussian, $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$, for some unknown $\sigma^2$.
Remark: least squares regression is sensitive to outliers and is not robust if $\varepsilon_i$ is heavier-tailed than Gaussian.
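The statistical model can be simulated to verify that least squares recovers the parameters. A sketch with assumed true values $\beta_0 = 1, \beta_1 = 2$ and Gaussian noise (all names and constants here are illustrative, not from the slide):

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1, sigma = 1.0, 2.0, 0.1  # assumed true parameters for this sketch
N = 200

# Generate data from the statistical model Y_i = beta0 + beta1*X_i + eps_i,
# with i.i.d. zero-mean Gaussian noise eps_i ~ N(0, sigma^2).
X = rng.uniform(-1.0, 1.0, N)
eps = rng.normal(0.0, sigma, N)
Y = beta0 + beta1 * X + eps

# Least-squares estimate via the design matrix [X, 1].
A = np.column_stack([X, np.ones(N)])
(beta1_hat, beta0_hat), *_ = np.linalg.lstsq(A, Y, rcond=None)
```

With light-tailed Gaussian noise the estimates land close to the true values; replacing `eps` with a heavy-tailed draw (e.g. Cauchy) would illustrate the robustness remark above.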
[Figure: residual plots against Year, before and after adding a quadratic term $X_i^2$ and a piecewise linear term $\max(0, X_i - 70)$]
▶ The residual $r_i = Y_i - (\hat{\beta}_0 + \hat{\beta}_1 X_i)$ should approximate $\varepsilon_i$ and look random.
▶ If not, we may add additional features to improve the model.
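This residual diagnostic can be sketched on synthetic data: fit a purely linear model to data with a quadratic trend, observe structured residuals, then add the missing feature. The data-generating constants below are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.linspace(0.0, 10.0, 100)
# Hypothetical data with a quadratic trend plus Gaussian noise.
Y = 1.0 + 0.5 * X + 0.3 * X**2 + rng.normal(0.0, 0.5, 100)

def fit_residuals(features, Y):
    """Least-squares fit on the given feature columns (plus intercept);
    return the residuals r_i = Y_i - fitted value."""
    A = np.column_stack(features + [np.ones_like(Y)])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return Y - A @ coef

r_lin = fit_residuals([X], Y)         # linear model only: residuals show curvature
r_quad = fit_residuals([X, X**2], Y)  # add quadratic feature X_i^2: residuals look random
# A piecewise linear feature such as np.maximum(0, X - c) can be added the same way.
```

The residual spread shrinks sharply once the quadratic feature is included, which is exactly the signal that the linear model was missing a feature.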