@ -4,15 +4,19 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Building Energy Usage Data\n",
"[Source](https://github.com/WillKoehrsen/machine-learning-project-walkthrough)\n",
"## Introduction: Machine Learning Project, Part 1\n",
"\n",
"### Project Goals\n",
"* Identify the predictors of the Energy Star Score in the dataset.\n",
"* Use the provided building energy data to develop a model that predicts a building's Energy Star Score (a continuous value from 0 to 100).\n",
"* Interpret the model's results.\n",
"\n",
"Given these goals, what we need to build is a regression model."
"Building energy data [source](https://github.com/WillKoehrsen/machine-learning-project-walkthrough)\n",
"\n",
"* Supervised problem: we have both the features and the target\n",
"* Regression problem: the target is a continuous variable, here ranging from 0 to 100\n",
"\n",
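"During training, we want the model to learn the relationship between the features and the score, so we give it both the features and the answers. Then, to test how well the model has learned, we evaluate it on a test set for which it has never seen the answers!"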
]
},
{
@ -22,12 +26,12 @@
"## Machine Learning Workflow\n",
"1. Data cleaning and formatting\n",
"2. Exploratory data analysis\n",
"3. Feature engineering\n",
"4. Establish a baseline model and try several algorithms\n",
"5. Model tuning\n",
"6. Evaluation and testing\n",
"7. Interpret the model\n",
"8. Submit the answer\n",
"3. Feature engineering and selection\n",
"4. Establish a baseline model and compare several models on a performance metric\n",
"5. Tune the model's hyperparameters to optimize it for the problem\n",
"6. Evaluate the best model on the test set\n",
"7. Interpret the model results as far as possible\n",
"8. Draw conclusions and submit the answer\n",
"\n",
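"These steps are not strictly sequential: while building models in step 4, for example, you may find a problem with the data cleaning from step 1 and have to go back and redo it. The project consists of 3 notebooks."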
]