@ -90,3 +90,54 @@ How to chose K

**In practice**: start with a relatively small value of K, then use cross-validation to select the best K.
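
The selection procedure can be sketched as follows. This is a minimal illustration, not a reference implementation: the helper names (`knn_predict`, `cv_error`, `choose_k`), the Euclidean distance, the fold layout, and the `toy_data` points are all assumptions made for the example.

```python
from collections import Counter
import math


def knn_predict(train, x, k):
    """Classify x by majority vote among its k nearest training points."""
    neighbors = sorted(train, key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


def cv_error(data, k, n_folds):
    """Misclassification rate of k-NN, averaged over n_folds held-out folds."""
    errors = 0
    for fold in range(n_folds):
        # Hold out every n_folds-th point, train on the rest.
        test = data[fold::n_folds]
        train = [p for i, p in enumerate(data) if i % n_folds != fold]
        errors += sum(knn_predict(train, x, k) != y for x, y in test)
    return errors / len(data)


def choose_k(data, candidates, n_folds=3):
    """Return the candidate K with the lowest cross-validation error."""
    return min(candidates, key=lambda k: cv_error(data, k, n_folds))


# Hypothetical toy data: two well-separated clusters.
toy_data = [
    ((0.0, 0.0), "red"), ((0.1, 0.2), "red"), ((0.2, 0.1), "red"),
    ((1.0, 1.0), "blue"), ((0.9, 1.1), "blue"), ((1.1, 0.9), "blue"),
]
best_k = choose_k(toy_data, candidates=[1, 3])
```

On real data the points should be shuffled before splitting into folds, and ties between candidates are resolved here in favor of the first (smallest) K in the list.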

### Majority voting rule

Classification decision rule: majority voting

Loss function:
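
The formula itself did not survive at this point in the notes. Assuming the usual 0-1 loss formulation of k-NN (an assumption, not recovered from the source), the misclassification rate when every point in the neighborhood $N_k(x)$ is assigned class $c_j$ would be:

$$
\frac{1}{k}\sum_{x_i \in N_k(x)} I(y_i \neq c_j) \;=\; 1 - \frac{1}{k}\sum_{x_i \in N_k(x)} I(y_i = c_j)
$$

Here $I(\cdot)$ is the indicator function. Minimizing this rate is the same as maximizing $\sum_{x_i \in N_k(x)} I(y_i = c_j)$, which is exactly what majority voting does.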

Loss when every point inside the solid circle is classified as red

Loss when every point inside the solid circle is classified as blue
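
To make the two losses concrete, the sketch below computes the misclassification rate for each choice of label. The figure's actual point counts are not available here, so the neighborhood of 3 red and 2 blue points is purely hypothetical, as is the helper name `misclassification_loss`.

```python
from collections import Counter


def misclassification_loss(neighbor_labels, assigned_label):
    """Fraction of neighbors whose true label differs from assigned_label."""
    wrong = sum(label != assigned_label for label in neighbor_labels)
    return wrong / len(neighbor_labels)


# Hypothetical neighborhood: 3 red points and 2 blue points.
neighbor_labels = ["red", "red", "red", "blue", "blue"]

loss_red = misclassification_loss(neighbor_labels, "red")    # 2 of 5 are wrong
loss_blue = misclassification_loss(neighbor_labels, "blue")  # 3 of 5 are wrong

# The majority label is the one with the smaller loss.
majority = Counter(neighbor_labels).most_common(1)[0][0]
```

With these assumed counts, labeling the neighborhood red gives the lower loss, which matches the majority vote.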

### K-nearest neighbor algorithm

Input: training data T = [(x1, y1), ..., (xn, yn)] and the feature vector x of the instance to classify.
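
1. Using the given distance metric, find the k points in the training set that are closest to x; the neighborhood covering these k points is denoted Nk(x).
2. Within Nk(x), determine the class y of x according to the classification decision rule (e.g., majority voting).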

Output: the class y to which instance x belongs.
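
The two steps can be sketched in a few lines of Python. This is an illustrative version only: the function name `knn_classify`, the choice of Euclidean distance, and the sample points in `T` are assumptions, and the brute-force sort is the simplest possible neighbor search.

```python
from collections import Counter
import math


def knn_classify(T, x, k):
    """Return the majority class among the k training points nearest to x."""
    # Step 1: the k nearest points under Euclidean distance form N_k(x).
    N_k = sorted(T, key=lambda pair: math.dist(pair[0], x))[:k]
    # Step 2: majority vote over the labels in N_k(x).
    votes = Counter(y_i for _, y_i in N_k)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (feature vector, class label) pairs.
T = [((1.0, 1.0), "red"), ((1.2, 0.8), "red"), ((0.9, 1.3), "red"),
     ((3.0, 3.0), "blue"), ((3.2, 2.9), "blue")]
y = knn_classify(T, (1.1, 1.0), k=3)
```

Nothing is fitted ahead of time: all the work happens at query time, when the distances to the training points are computed.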

### Summary
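
1. The idea behind K-nearest neighbors: birds of a feather flock together, so nearby instances tend to share a class.
2. K-nearest neighbors has no explicit training phase.
3. Distance metrics: Euclidean distance, Manhattan distance, Chebyshev distance.
4. Classification method: majority voting rule.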
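
For reference, the three distance metrics named in item 3 can be written as below. This is a minimal sketch using their standard definitions; the function names are chosen for this example.

```python
def euclidean(a, b):
    """L2 distance: square root of the sum of squared coordinate differences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5


def manhattan(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(p - q) for p, q in zip(a, b))


def chebyshev(a, b):
    """L-infinity distance: largest absolute coordinate difference."""
    return max(abs(p - q) for p, q in zip(a, b))


a, b = (0, 0), (3, 4)
distances = (euclidean(a, b), manhattan(a, b), chebyshev(a, b))
```

For the points (0, 0) and (3, 4) the three metrics give 5, 7, and 4 respectively, so the choice of metric can change which training points count as nearest.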