@ -139,13 +139,8 @@ Support vector machines
> At the center of the figure, the distance from each dashed line to the solid line is denoted γ. Our goal is to maximize γ: adjusting the hyperplane until γ reaches its maximum is equivalent to finding the optimal hyperplane.

**The formulation is as follows:**

$$
\max_{w,b} \quad γ
$$

$$
s.t.\quad y_i\left(\frac{w}{||w||} \cdot x_i+\frac{b}{||w||}\right) \ge γ, \quad i=1,2,...,N
$$

> γ: the geometric margin
>
@ -164,47 +159,68 @@ $$
Since the geometric margin is ultimately $\frac{\hat{γ}}{||w||}$, the constraint can be simplified to

$$
y_i(w \cdot x_i+b) \ge \hat{γ}, \quad \text{where } \hat{γ} \text{ is the functional margin}
$$

The objective being maximized is still the geometric margin, so the s.t. constraint still steers the solution toward the geometric margin; the advantage of this form is that the $||w||$ in the denominator disappears from the constraint.

**Simplified form:**

$$
\max_{w,b} \quad \frac{\hat{γ}}{||w||}
$$

$$
s.t.\quad y_i(w \cdot x_i+b) \ge \hat{γ}, \quad i=1,2,...,N
$$
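
The relation between the two margins can be checked numerically. The sketch below uses a hypothetical toy dataset and hypothetical helper names (`functional_margin`, `geometric_margin`), neither of which comes from the original notes. It scales $w$ and $b$ by a common factor and prints both margins: the functional margin scales with the factor, while the geometric margin $\frac{\hat{γ}}{||w||}$ stays the same.

```python
import math

def functional_margin(w, b, points, labels):
    """Smallest value of y_i * (w . x_i + b) over the dataset."""
    return min(
        y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
        for x, y in zip(points, labels)
    )

def geometric_margin(w, b, points, labels):
    """Functional margin divided by the Euclidean norm ||w||."""
    norm = math.sqrt(sum(wj * wj for wj in w))
    return functional_margin(w, b, points, labels) / norm

# Hypothetical toy data: two positive and two negative points in 2-D.
points = [(3.0, 3.0), (4.0, 3.0), (1.0, 1.0), (0.0, 1.0)]
labels = [1, 1, -1, -1]
w, b = (1.0, 1.0), -4.0  # an assumed separating hyperplane, not an optimized one

for scale in (0.5, 1.0, 2.0, 10.0):
    ws = tuple(scale * wj for wj in w)  # scale w and b by the same factor
    bs = scale * b
    print(scale,
          functional_margin(ws, bs, points, labels),
          geometric_margin(ws, bs, points, labels))
```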

As noted earlier, scaling $w$ and $b$ by the same factor rescales the functional margin, so we can choose the scale that makes $\hat{γ}=1$.

**Simplified further:**

$$
\max_{w,b} \quad \frac{1}{||w||}
$$

$$
s.t.\quad y_i(w \cdot x_i+b) \ge 1, \quad i=1,2,...,N
$$

Maximizing $\frac{1}{||w||}$ is equivalent to minimizing $||w||$. Because the Lagrange multiplier method is used later, we replace $\frac{1}{||w||}$ with $\frac{1}{2}||w||^2$. The two problems share the same optimal $w$ and $b$; the square and the factor $\frac{1}{2}$ only make the derivative cleaner.

**Simplified further:**

$$
\min_{w,b} \quad \frac{1}{2}||w||^2
$$

$$
s.t.\quad y_i(w \cdot x_i+b)-1 \ge 0, \quad i=1,2,...,N
$$

> Introducing a Lagrange multiplier $α_i \ge 0$ for each constraint, the Lagrange multiplier method gives the following function

$$
L(w,b,α)=\frac{1}{2}||w||^2-\sum^N_{i=1}α_iy_i(w \cdot x_i+b)+\sum^N_{i=1}α_i
$$

$$
\text{Goal: } \min_{w,b}\max_{α}L(w,b,α)
$$

$$
\text{Converted to: } \max_{α}\min_{w,b}L(w,b,α)
$$

Take the partial derivatives of the Lagrangian $L(w,b,α)$ with respect to $w$ and $b$ and set them to zero.
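
For reference, a sketch of these two conditions for $L(w,b,α)=\frac{1}{2}||w||^2-\sum^N_{i=1}α_iy_i(w \cdot x_i+b)+\sum^N_{i=1}α_i$. These are the standard results, written in the notation used here:

$$
\frac{\partial L}{\partial w}=w-\sum^N_{i=1}α_iy_ix_i=0 \quad\Rightarrow\quad w=\sum^N_{i=1}α_iy_ix_i
$$

$$
\frac{\partial L}{\partial b}=-\sum^N_{i=1}α_iy_i=0 \quad\Rightarrow\quad \sum^N_{i=1}α_iy_i=0
$$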

Carrying out the derivation:
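
As a sketch of this step, assume the standard stationarity conditions $w=\sum_iα_iy_ix_i$ and $\sum_iα_iy_i=0$. Substituting them back into $L(w,b,α)$ eliminates $w$ and $b$ (here $x_i \cdot x_j$ denotes the inner product):

$$
\min_{w,b}L(w,b,α)=\frac{1}{2}\sum^N_{i=1}\sum^N_{j=1}α_iα_jy_iy_j(x_i \cdot x_j)-\sum^N_{i=1}\sum^N_{j=1}α_iα_jy_iy_j(x_i \cdot x_j)-b\sum^N_{i=1}α_iy_i+\sum^N_{i=1}α_i
$$

$$
=-\frac{1}{2}\sum^N_{i=1}\sum^N_{j=1}α_iα_jy_iy_j(x_i \cdot x_j)+\sum^N_{i=1}α_i
$$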

2. Maximize $\min_{w,b}L(w,b,α)$ over $α$; this is the dual problem
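
Written out, and assuming the standard form of the inner minimum, $\min_{w,b}L(w,b,α)=-\frac{1}{2}\sum_i\sum_jα_iα_jy_iy_j(x_i \cdot x_j)+\sum_iα_i$, the dual problem would read:

$$
\max_{α} \quad -\frac{1}{2}\sum^N_{i=1}\sum^N_{j=1}α_iα_jy_iy_j(x_i \cdot x_j)+\sum^N_{i=1}α_i
$$

$$
s.t.\quad \sum^N_{i=1}α_iy_i=0, \quad α_i \ge 0, \quad i=1,2,...,N
$$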

3. Convert the maximization into a minimization:
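
A sketch of the converted problem, assuming the standard dual objective $-\frac{1}{2}\sum_i\sum_jα_iα_jy_iy_j(x_i \cdot x_j)+\sum_iα_i$. Negating the objective turns the maximization into a minimization and leaves the constraints unchanged:

$$
\min_{α} \quad \frac{1}{2}\sum^N_{i=1}\sum^N_{j=1}α_iα_jy_iy_j(x_i \cdot x_j)-\sum^N_{i=1}α_i
$$

$$
s.t.\quad \sum^N_{i=1}α_iy_i=0, \quad α_i \ge 0, \quad i=1,2,...,N
$$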

What remains is to solve for $α$.
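
To make this last step concrete, here is a minimal sketch for the smallest possible case: one positive and one negative point. It assumes the standard dual (maximize $\sum_iα_i-\frac{1}{2}\sum_i\sum_jα_iα_jy_iy_j(x_i \cdot x_j)$ subject to $\sum_iα_iy_i=0$, $α_i \ge 0$) and the recovery rule $w=\sum_iα_iy_ix_i$. With two points the constraint forces $α_1=α_2=a$, so the dual reduces to $2a-\frac{1}{2}a^2||x_+-x_-||^2$, which is maximized at $a=\frac{2}{||x_+-x_-||^2}$. The function name `solve_two_point_svm` is hypothetical, and general datasets would need a quadratic-programming solver rather than this closed form.

```python
def solve_two_point_svm(x_pos, x_neg):
    """Closed-form hard-margin SVM for one positive and one negative point.

    Returns (alpha, w, b), where alpha is the shared multiplier of both points.
    """
    diff = [p - n for p, n in zip(x_pos, x_neg)]
    sq_dist = sum(d * d for d in diff)
    if sq_dist == 0:
        raise ValueError("the two points must be distinct")
    alpha = 2.0 / sq_dist                      # maximizer of 2a - 0.5 * a^2 * sq_dist
    w = [alpha * d for d in diff]              # w = alpha * (x_pos - x_neg)
    # The positive point is a support vector, so w . x_pos + b = 1.
    b = 1.0 - sum(wj * xj for wj, xj in zip(w, x_pos))
    return alpha, w, b

alpha, w, b = solve_two_point_svm((2.0, 2.0), (0.0, 0.0))
print(alpha, w, b)  # prints: 0.25 [0.5, 0.5] -1.0
```

Both points then satisfy $y_i(w \cdot x_i+b)=1$, and the margin $\frac{1}{||w||}$ is half the distance between them.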