{ "cells": [ { "cell_type": "markdown", "id": "33127151", "metadata": {}, "source": [ "# 自动化特征工程" ] }, { "cell_type": "markdown", "id": "66dfb30d", "metadata": {}, "source": [ "### 结论:效果一般\n", "搬运参考:https://www.kaggle.com/liananapalkova/automated-feature-engineering-for-titanic-dataset" ] }, { "cell_type": "markdown", "id": "91896713", "metadata": {}, "source": [ "### 1.介绍\n", "如果您曾经为您的ML项目手动创建过数百个特性(我相信您做到了),那么您将乐于了解名为“featuretools”的Python包如何帮助完成这项任务。好消息是这个软件包很容易使用。它的目标是自动化特征工程。当然,人类的专业知识是无法替代的,但是“featuretools”可以自动化大量的日常工作。出于探索目的,这里使用fetch_covtype数据集。\n", "\n", "本笔记本的主要内容包括:\n", "\n", "首先,使用自动特征工程(“featuretools”包),从54个特征总数增加到N个。\n", "\n", "其次,应用特征约简和选择方法,从N个特征中选择X个最相关的特征。" ] }, { "cell_type": "code", "execution_count": 1, "id": "522eb443", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3.8.8 (default, Apr 13 2021, 15:08:03) [MSC v.1916 64 bit (AMD64)]\n" ] } ], "source": [ "import sys\n", "print(sys.version) # 版本信息" ] }, { "cell_type": "code", "execution_count": 5, "id": "51e62bae", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simpleNote: you may need to restart the kernel to use updated packages.\n", "Collecting featuretools\n", " Downloading https://pypi.tuna.tsinghua.edu.cn/packages/8f/32/b5d02df152aff86f720524540ae516a8e15d7a8c53bd4ee06e2b1ed0c263/featuretools-0.26.2-py3-none-any.whl (327 kB)\n", "Requirement already satisfied: numpy>=1.16.6 in d:\\programdata\\anaconda3\\lib\\site-packages (from featuretools) (1.19.5)\n", "Requirement already satisfied: dask[dataframe]>=2.12.0 in d:\\programdata\\anaconda3\\lib\\site-packages (from featuretools) (2021.4.0)\n", "Requirement already satisfied: pyyaml>=5.4 in d:\\programdata\\anaconda3\\lib\\site-packages (from featuretools) (5.4.1)\n", "Requirement already satisfied: tqdm>=4.32.0 in d:\\programdata\\anaconda3\\lib\\site-packages (from featuretools) (4.59.0)\n", "Requirement already 
satisfied: scipy>=1.3.2 in d:\\programdata\\anaconda3\\lib\\site-packages (from featuretools) (1.6.2)\n", "Requirement already satisfied: click>=7.0.0 in d:\\programdata\\anaconda3\\lib\\site-packages (from featuretools) (7.1.2)\n", "Requirement already satisfied: pandas<2.0.0,>=1.2.0 in d:\\programdata\\anaconda3\\lib\\site-packages (from featuretools) (1.2.4)\n", "Requirement already satisfied: psutil>=5.6.6 in d:\\programdata\\anaconda3\\lib\\site-packages (from featuretools) (5.8.0)\n", "Requirement already satisfied: distributed>=2.12.0 in d:\\programdata\\anaconda3\\lib\\site-packages (from featuretools) (2021.4.0)\n", "Requirement already satisfied: cloudpickle>=0.4.0 in d:\\programdata\\anaconda3\\lib\\site-packages (from featuretools) (1.6.0)\n", "Requirement already satisfied: partd>=0.3.10 in d:\\programdata\\anaconda3\\lib\\site-packages (from dask[dataframe]>=2.12.0->featuretools) (1.2.0)\n", "Requirement already satisfied: fsspec>=0.6.0 in d:\\programdata\\anaconda3\\lib\\site-packages (from dask[dataframe]>=2.12.0->featuretools) (0.9.0)\n", "Requirement already satisfied: toolz>=0.8.2 in d:\\programdata\\anaconda3\\lib\\site-packages (from dask[dataframe]>=2.12.0->featuretools) (0.11.1)\n", "Requirement already satisfied: tblib>=1.6.0 in d:\\programdata\\anaconda3\\lib\\site-packages (from distributed>=2.12.0->featuretools) (1.7.0)\n", "Requirement already satisfied: zict>=0.1.3 in d:\\programdata\\anaconda3\\lib\\site-packages (from distributed>=2.12.0->featuretools) (2.0.0)\n", "Requirement already satisfied: sortedcontainers!=2.0.0,!=2.0.1 in d:\\programdata\\anaconda3\\lib\\site-packages (from distributed>=2.12.0->featuretools) (2.3.0)\n", "Requirement already satisfied: tornado>=6.0.3 in d:\\programdata\\anaconda3\\lib\\site-packages (from distributed>=2.12.0->featuretools) (6.1)\n", "Requirement already satisfied: msgpack>=0.6.0 in d:\\programdata\\anaconda3\\lib\\site-packages (from distributed>=2.12.0->featuretools) (1.0.2)\n", "Requirement 
already satisfied: setuptools in d:\\programdata\\anaconda3\\lib\\site-packages (from distributed>=2.12.0->featuretools) (52.0.0.post20210125)\n", "Requirement already satisfied: python-dateutil>=2.7.3 in d:\\programdata\\anaconda3\\lib\\site-packages (from pandas<2.0.0,>=1.2.0->featuretools) (2.8.1)\n", "Requirement already satisfied: pytz>=2017.3 in d:\\programdata\\anaconda3\\lib\\site-packages (from pandas<2.0.0,>=1.2.0->featuretools) (2021.1)\n", "Requirement already satisfied: locket in d:\\programdata\\anaconda3\\lib\\site-packages\\locket-0.2.1-py3.8.egg (from partd>=0.3.10->dask[dataframe]>=2.12.0->featuretools) (0.2.1)\n", "Requirement already satisfied: six>=1.5 in d:\\programdata\\anaconda3\\lib\\site-packages (from python-dateutil>=2.7.3->pandas<2.0.0,>=1.2.0->featuretools) (1.15.0)\n", "Requirement already satisfied: heapdict in d:\\programdata\\anaconda3\\lib\\site-packages (from zict>=0.1.3->distributed>=2.12.0->featuretools) (1.0.1)\n", "Installing collected packages: featuretools\n", "Successfully installed featuretools-0.26.2\n", "\n" ] } ], "source": [ "%pip install featuretools" ] }, { "cell_type": "code", "execution_count": 19, "id": "43cc9a46", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import time\n", "import gc\n", "import pandas as pd\n", "\n", "import featuretools as ft\n", "from featuretools.primitives import *\n", "from featuretools.variable_types import Numeric\n", "from sklearn.svm import LinearSVC\n", "from sklearn.feature_selection import SelectFromModel\n", "# Import the required models; install any missing package with pip install xxx\n", "\n", "from sklearn.ensemble import RandomForestClassifier\n", "\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.metrics import accuracy_score\n", "from sklearn.preprocessing import OrdinalEncoder\n", "from sklearn.metrics import log_loss" ] }, { "cell_type": "code", "execution_count": 2, "id": "4c17c0bc", "metadata": {}, "outputs": [], "source": [ "from sklearn.datasets import 
fetch_covtype\n", "data = fetch_covtype()" ] }, { "cell_type": "code", "execution_count": 3, "id": "bcce5a3d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Seven-class task, before processing: [1 2 3 4 5 6 7]\n", "[5 5 2 ... 3 3 3]\n", "Seven-class task, after processing: [0. 1. 2. 3. 4. 5. 6.]\n", "[4. 4. 1. ... 2. 2. 2.]\n" ] } ], "source": [ "# Preprocessing\n", "X, y = data['data'], data['target']\n", "# Model labels must start at 0, so OrdinalEncoder remaps 1..7 to 0..6 (equivalent to subtracting 1)\n", "print('Seven-class task, before processing:',np.unique(y))\n", "print(y)\n", "encoder = OrdinalEncoder() # named encoder to avoid shadowing the built-in ord\n", "y = encoder.fit_transform(y.reshape(-1, 1))\n", "y = y.reshape(-1, )\n", "print('Seven-class task, after processing:',np.unique(y))\n", "print(y)" ] }, { "cell_type": "code", "execution_count": 4, "id": "4afeeca5", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " index Elevation Aspect Slope Horizontal_Distance_To_Hydrology \\\n", "0 0 2596.0 51.0 3.0 258.0 \n", "1 1 2590.0 56.0 2.0 212.0 \n", "\n", " Vertical_Distance_To_Hydrology Horizontal_Distance_To_Roadways \\\n", "0 0.0 510.0 \n", "1 -6.0 390.0 \n", "\n", " Hillshade_9am Hillshade_Noon Hillshade_3pm \\\n", "0 221.0 232.0 148.0 \n", "1 220.0 235.0 151.0 \n", "\n", " Horizontal_Distance_To_Fire_Points Wilderness_Area_0 Wilderness_Area_1 \\\n", "0 6279.0 1.0 0.0 \n", "1 6225.0 1.0 0.0 \n", "\n", " Wilderness_Area_2 Wilderness_Area_3 Soil_Type_0 Soil_Type_1 \\\n", "0 0.0 0.0 0.0 0.0 \n", "1 0.0 0.0 0.0 0.0 \n", "\n", " Soil_Type_2 Soil_Type_3 Soil_Type_4 \n", "0 0.0 0.0 0.0 \n", "1 0.0 0.0 0.0 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = pd.DataFrame(X,columns=data.feature_names)\n", "X = X.reset_index()\n", "X = X.iloc[:,:20] # 数据集过大,这里仅用前20列做演示\n", "X.head(2)" ] }, { "cell_type": "code", "execution_count": 5, "id": "af6722f2", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " index Cover_Type\n", "0 0 4.0\n", "1 1 4.0" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y = pd.DataFrame(y, columns=data.target_names)\n", "y = y.reset_index()\n", "y.head(2)" ] }, { "cell_type": "code", "execution_count": 6, "id": "2d34ab5c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 581012 entries, 0 to 581011\n", "Data columns (total 20 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 index 581012 non-null int64 \n", " 1 Elevation 581012 non-null float64\n", " 2 Aspect 581012 non-null float64\n", " 3 Slope 581012 non-null float64\n", " 4 Horizontal_Distance_To_Hydrology 581012 non-null float64\n", " 5 Vertical_Distance_To_Hydrology 581012 non-null float64\n", " 6 Horizontal_Distance_To_Roadways 581012 non-null float64\n", " 7 Hillshade_9am 581012 non-null float64\n", " 8 Hillshade_Noon 581012 non-null float64\n", " 9 Hillshade_3pm 581012 non-null float64\n", " 10 Horizontal_Distance_To_Fire_Points 581012 non-null float64\n", " 11 Wilderness_Area_0 581012 non-null float64\n", " 12 Wilderness_Area_1 581012 non-null float64\n", " 13 Wilderness_Area_2 581012 non-null float64\n", " 14 Wilderness_Area_3 581012 non-null float64\n", " 15 Soil_Type_0 581012 non-null float64\n", " 16 Soil_Type_1 581012 non-null float64\n", " 17 Soil_Type_2 581012 non-null float64\n", " 18 Soil_Type_3 581012 non-null float64\n", " 19 Soil_Type_4 581012 non-null float64\n", "dtypes: float64(19), int64(1)\n", "memory usage: 88.7 MB\n" ] } ], "source": [ "X.info()" ] }, { "cell_type": "code", "execution_count": 7, "id": "1551c241", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 581012 entries, 0 to 581011\n", "Data columns (total 20 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 index 581012 non-null int32 \n", " 1 
Elevation 581012 non-null float32\n", " 2 Aspect 581012 non-null float32\n", " 3 Slope 581012 non-null float32\n", " 4 Horizontal_Distance_To_Hydrology 581012 non-null float32\n", " 5 Vertical_Distance_To_Hydrology 581012 non-null float32\n", " 6 Horizontal_Distance_To_Roadways 581012 non-null float32\n", " 7 Hillshade_9am 581012 non-null float32\n", " 8 Hillshade_Noon 581012 non-null float32\n", " 9 Hillshade_3pm 581012 non-null float32\n", " 10 Horizontal_Distance_To_Fire_Points 581012 non-null float32\n", " 11 Wilderness_Area_0 581012 non-null float32\n", " 12 Wilderness_Area_1 581012 non-null float32\n", " 13 Wilderness_Area_2 581012 non-null float32\n", " 14 Wilderness_Area_3 581012 non-null float32\n", " 15 Soil_Type_0 581012 non-null float32\n", " 16 Soil_Type_1 581012 non-null float32\n", " 17 Soil_Type_2 581012 non-null float32\n", " 18 Soil_Type_3 581012 non-null float32\n", " 19 Soil_Type_4 581012 non-null float32\n", "dtypes: float32(19), int32(1)\n", "memory usage: 44.3 MB\n" ] } ], "source": [ "# Downcast dtypes to reduce memory usage\n", "for col in X.columns:\n", "    if X[col].dtype=='float64': X[col] = X[col].astype('float32')\n", "    if X[col].dtype=='int64': X[col] = X[col].astype('int32')\n", "X.info() # memory usage is halved (88.7 MB -> 44.3 MB)" ] }, { "cell_type": "markdown", "id": "f68429bf", "metadata": {}, "source": [ "### 2. Performing automated feature engineering\n", "First check whether the data contains NaN values; for handling NaN values, see:" ] }, { "cell_type": "code", "execution_count": 8, "id": "06f24545", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Object `es.entity_from_dataframe` not found.\n" ] } ], "source": [ "es.entity_from_dataframe?" 
] }, { "cell_type": "markdown", "id": "e3f82b96", "metadata": {}, "source": [ "Once the entity set has been created, new features can be generated with so-called feature primitives.\n", "\n", "Primitives fall into two categories:\n", "\n", "* Aggregation: these functions group the child data points of each parent and then compute statistics such as the mean, minimum, maximum, or standard deviation. Aggregations work across multiple tables, using the relationships between them.\n", "\n", "* Transformation: these functions operate on one or more columns of a single table.\n", "\n", "We can use the \"normalize_entity\" function to create dummy tables, so that both aggregation and transformation functions can be applied to generate new features. To create such tables, we will use categorical, boolean, and integer variables." ] }, { "cell_type": "code", "execution_count": 9, "id": "f2c69a94", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Entityset: fetch_covtype_data\n", " Entities:\n", " X [Rows: 581012, Columns: 20]\n", " Relationships:\n", " No relationships" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "es = ft.EntitySet(id = 'fetch_covtype_data')\n", "es = es.entity_from_dataframe(entity_id = 'X', dataframe = X, \n", "    variable_types = \n", "    {\n", "        'Aspect': ft.variable_types.Categorical,\n", "        'Slope': ft.variable_types.Categorical,\n", "        'Hillshade_9am': ft.variable_types.Categorical,\n", "        'Hillshade_Noon': ft.variable_types.Categorical,\n", "        'Hillshade_3pm': ft.variable_types.Categorical,\n", "        'Wilderness_Area_0': ft.variable_types.Boolean,\n", "        'Wilderness_Area_1': ft.variable_types.Boolean,\n", "        'Wilderness_Area_2': ft.variable_types.Boolean,\n", "        'Wilderness_Area_3': ft.variable_types.Boolean,\n", "        'Soil_Type_0': ft.variable_types.Boolean,\n", "        'Soil_Type_1': ft.variable_types.Boolean,\n", "        'Soil_Type_2': ft.variable_types.Boolean,\n", "        'Soil_Type_3': ft.variable_types.Boolean,\n", "        'Soil_Type_4': ft.variable_types.Boolean\n", "    },\n", "    index = 'index')\n", "\n", "es" ] }, { "cell_type": "code", "execution_count": 10, "id": "770130bc", "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/plain": [ "Entityset: fetch_covtype_data\n", " Entities:\n", " X [Rows: 581012, Columns: 20]\n", " Wilderness_Area_0 [Rows: 2, Columns: 1]\n", " Wilderness_Area_1 [Rows: 2, Columns: 1]\n", " Wilderness_Area_2 [Rows: 2, Columns: 1]\n", " Wilderness_Area_3 [Rows: 2, Columns: 1]\n", 
" Soil_Type_0 [Rows: 2, Columns: 1]\n", " Soil_Type_1 [Rows: 2, Columns: 1]\n", " Soil_Type_2 [Rows: 2, Columns: 1]\n", " Soil_Type_3 [Rows: 2, Columns: 1]\n", " Soil_Type_4 [Rows: 2, Columns: 1]\n", " Relationships:\n", " X.Wilderness_Area_0 -> Wilderness_Area_0.Wilderness_Area_0\n", " X.Wilderness_Area_1 -> Wilderness_Area_1.Wilderness_Area_1\n", " X.Wilderness_Area_2 -> Wilderness_Area_2.Wilderness_Area_2\n", " X.Wilderness_Area_3 -> Wilderness_Area_3.Wilderness_Area_3\n", " X.Soil_Type_0 -> Soil_Type_0.Soil_Type_0\n", " X.Soil_Type_1 -> Soil_Type_1.Soil_Type_1\n", " X.Soil_Type_2 -> Soil_Type_2.Soil_Type_2\n", " X.Soil_Type_3 -> Soil_Type_3.Soil_Type_3\n", " X.Soil_Type_4 -> Soil_Type_4.Soil_Type_4" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "es = es.normalize_entity(base_entity_id='X', new_entity_id='Wilderness_Area_0', index='Wilderness_Area_0')\n", "es = es.normalize_entity(base_entity_id='X', new_entity_id='Wilderness_Area_1', index='Wilderness_Area_1')\n", "es = es.normalize_entity(base_entity_id='X', new_entity_id='Wilderness_Area_2', index='Wilderness_Area_2')\n", "es = es.normalize_entity(base_entity_id='X', new_entity_id='Wilderness_Area_3', index='Wilderness_Area_3')\n", "es = es.normalize_entity(base_entity_id='X', new_entity_id='Soil_Type_0', index='Soil_Type_0')\n", "es = es.normalize_entity(base_entity_id='X', new_entity_id='Soil_Type_1', index='Soil_Type_1')\n", "es = es.normalize_entity(base_entity_id='X', new_entity_id='Soil_Type_2', index='Soil_Type_2')\n", "es = es.normalize_entity(base_entity_id='X', new_entity_id='Soil_Type_3', index='Soil_Type_3')\n", "es = es.normalize_entity(base_entity_id='X', new_entity_id='Soil_Type_4', index='Soil_Type_4')\n", "es" ] }, { "cell_type": "code", "execution_count": 11, "id": "352fa085", "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " name type dask_compatible koalas_compatible \\\n", "0 sum aggregation True True \n", "1 first aggregation False False \n", "2 last aggregation False False \n", "3 trend aggregation False False \n", "4 n_most_common aggregation False False \n", "5 time_since_last aggregation False False \n", "6 std aggregation True True \n", "7 median aggregation False False \n", "8 count aggregation True True \n", "9 percent_true aggregation True False \n", "10 time_since_first aggregation False False \n", "11 max aggregation True True \n", "12 any aggregation True False \n", "13 mode aggregation False False \n", "14 entropy aggregation False False \n", "15 min aggregation True True \n", "16 all aggregation True False \n", "17 skew aggregation False False \n", "18 mean aggregation True True \n", "19 avg_time_between aggregation False False \n", "20 num_unique aggregation True True \n", "21 num_true aggregation True False \n", "\n", " description \\\n", "0 Calculates the total addition, ignoring `NaN`. \n", "1 Determines the first value in a list. \n", "2 Determines the last value in a list. \n", "3 Calculates the trend of a variable over time. \n", "4 Determines the `n` most common elements. \n", "5 Calculates the time elapsed since the last datetime (default in seconds). \n", "6 Computes the dispersion relative to the mean value, ignoring `NaN`. \n", "7 Determines the middlemost number in a list of values. \n", "8 Determines the total number of values, excluding `NaN`. \n", "9 Determines the percent of `True` values. \n", "10 Calculates the time elapsed since the first datetime (in seconds). \n", "11 Calculates the highest value, ignoring `NaN` values. \n", "12 Determines if any value is 'True' in a list. \n", "13 Determines the most commonly repeated value. \n", "14 Calculates the entropy for a categorical variable \n", "15 Calculates the smallest value, ignoring `NaN` values. \n", "16 Calculates if all values are 'True' in a list. 
\n", "17 Computes the extent to which a distribution differs from a normal distribution. \n", "18 Computes the average for a list of values. \n", "19 Computes the average number of seconds between consecutive events. \n", "20 Determines the number of distinct values, ignoring `NaN` values. \n", "21 Counts the number of `True` values. \n", "\n", " valid_inputs return_type \n", "0 Numeric Numeric \n", "1 Variable None \n", "2 Variable None \n", "3 DatetimeTimeIndex, Numeric Numeric \n", "4 Discrete Discrete \n", "5 DatetimeTimeIndex Numeric \n", "6 Numeric Numeric \n", "7 Numeric Numeric \n", "8 Index Numeric \n", "9 Boolean Numeric \n", "10 DatetimeTimeIndex Numeric \n", "11 Numeric Numeric \n", "12 Boolean Boolean \n", "13 Discrete None \n", "14 Categorical Numeric \n", "15 Numeric Numeric \n", "16 Boolean Boolean \n", "17 Numeric Numeric \n", "18 Numeric Numeric \n", "19 DatetimeTimeIndex Numeric \n", "20 Discrete Numeric \n", "21 Boolean Numeric " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "primitives = ft.list_primitives()\n", "pd.options.display.max_colwidth = 100\n", "primitives[primitives['type'] == 'aggregation'].head(primitives[primitives['type'] == 'aggregation'].shape[0])" ] }, { "cell_type": "code", "execution_count": 12, "id": "7762885f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " name type dask_compatible koalas_compatible \\\n", "22 greater_than transform True False \n", "23 less_than transform True True \n", "24 and transform True True \n", "25 less_than_scalar transform True True \n", "26 modulo_numeric transform True True \n", ".. ... ... ... ... \n", "79 is_weekend transform True True \n", "80 num_characters transform True True \n", "81 latitude transform False False \n", "82 cum_sum transform False False \n", "83 subtract_numeric_scalar transform True True \n", "\n", " description \\\n", "22 Determines if values in one list are greater than another list. \n", "23 Determines if values in one list are less than another list. \n", "24 Element-wise logical AND of two lists. \n", "25 Determines if values are less than a given scalar. \n", "26 Element-wise modulo of two lists. \n", ".. ... \n", "79 Determines if a date falls on a weekend. \n", "80 Calculates the number of characters in a string. \n", "81 Returns the first tuple value in a list of LatLong tuples. \n", "82 Calculates the cumulative sum. \n", "83 Subtract a scalar from each element in the list. \n", "\n", " valid_inputs return_type \n", "22 Ordinal, Datetime, Numeric Boolean \n", "23 Ordinal, Datetime, Numeric Boolean \n", "24 Boolean Boolean \n", "25 Ordinal, Datetime, Numeric Boolean \n", "26 Numeric Numeric \n", ".. ... ... \n", "79 Datetime Boolean \n", "80 NaturalLanguage Numeric \n", "81 LatLong Numeric \n", "82 Numeric Numeric \n", "83 Numeric Numeric \n", "\n", "[62 rows x 7 columns]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "primitives[primitives['type'] == 'transform'].head(primitives[primitives['type'] == 'transform'].shape[0])" ] }, { "cell_type": "markdown", "id": "2a1baf81", "metadata": {}, "source": [ "1. 
Now we apply Deep Feature Synthesis (DFS), which generates new features by automatically applying suitable aggregations; a depth of 2 is chosen here. The higher the depth, the more primitives are stacked." ] }, { "cell_type": "code", "execution_count": 14, "id": "6d3df2f7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Wall time: 1min 3s\n" ] } ], "source": [ "%%time\n", "features, feature_names = ft.dfs(entityset = es, \n", "                                 target_entity = 'X', \n", "                                 max_depth = 2)" ] }, { "cell_type": "markdown", "id": "3c16b6f0", "metadata": {}, "source": [ "This is the list of new features. For example, \"Wilderness_Area_0.MEAN(X.Elevation)\" is the mean of Elevation for each unique value of Wilderness_Area_0, i.e. the mean Elevation over all rows that share the same Wilderness_Area_0 value." ] }, { "cell_type": "code", "execution_count": 15, "id": "9a44a98a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[list of generated Feature definitions; the individual entries are not recoverable]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "feature_names" ] }, { "cell_type": "code", "execution_count": 16, "id": "d5036e65", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| index | Wilderness_Area_0.MEAN(X.Elevation) | Elevation | Wilderness_Area_0 |
|---|---|---|---|
| 0 | 3000.267334 | 2596.0 | 1.0 |
| 561 | 3000.267334 | 2596.0 | 1.0 |
| 2062 | 2926.053223 | 2596.0 | 0.0 |
| 6946 | 2926.053223 | 2596.0 | 0.0 |
| 6976 | 2926.053223 | 2596.0 | 0.0 |
\n", "
" ], "text/plain": [ " Wilderness_Area_0.MEAN(X.Elevation) Elevation Wilderness_Area_0\n", "index \n", "0 3000.267334 2596.0 1.0\n", "561 3000.267334 2596.0 1.0\n", "2062 2926.053223 2596.0 0.0\n", "6946 2926.053223 2596.0 0.0\n", "6976 2926.053223 2596.0 0.0" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "features[features['Elevation'] == 2596][[\"Wilderness_Area_0.MEAN(X.Elevation)\",\"Elevation\",\"Wilderness_Area_0\"]].head()" ] }, { "cell_type": "code", "execution_count": 17, "id": "ec8b7ccd", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(581012, 532)" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "features.shape" ] }, { "cell_type": "markdown", "id": "6b988d39", "metadata": {}, "source": [ "通过使用“featuretools”,我们能够在瞬间生成512个特征。\n", "\n", "“featuretools”是一个功能强大的软件包,它可以节省从多个数据表创建新功能的时间。然而,它并不能完全替代人类领域的知识。此外,现在我们面临另一个问题,称为“维度灾难”。" ] }, { "cell_type": "markdown", "id": "c92f91f5", "metadata": {}, "source": [ "### 3.“维度灾难”:特征约简与选择" ] }, { "cell_type": "markdown", "id": "75b7cc64", "metadata": {}, "source": [ "为了解决“维数灾难”,有必要应用特征约简和选择,这意味着从数据中去除低值特征。但请记住,特征选择可能会影响ML模型的性能。棘手的是,ML模型的设计包含一个艺术元素。这绝对不是一个具有严格规则的确定性过程,要想取得成功就必须遵循这些规则。为了得到一个精确的模型,有必要应用、组合和比较几十种方法。在本notebook中,我不会解释所有可能的方法来处理“维度灾难”。我将集中讨论以下方法:\n", "\n", "* 确定共线特征\n", "\n", "* 使用L1范数惩罚的线性模型检测最相关的特征" ] }, { "cell_type": "markdown", "id": "20b48cb1", "metadata": {}, "source": [ "#### 3.1 确认共线特征\n", "\n", "共线性意味着独立特征之间的高度相关性。如果我们在模式中保持这些特征,可能很难评估独立特征对目标变量的影响。因此,我们将检测这些功能并删除它们,尽管在删除之前会应用手动修订。" ] }, { "cell_type": "code", "execution_count": 37, "id": "eda67b1a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | Elevation | Horizontal_Distance_To_Hydrology | Vertical_Distance_To_Hydrology | Horizontal_Distance_To_Roadways | Horizontal_Distance_To_Fire_Points | Aspect | Slope | Hillshade_9am | Hillshade_Noon | Hillshade_3pm | ... | Soil_Type_4.STD(X.Elevation) | Soil_Type_4.STD(X.Horizontal_Distance_To_Fire_Points) | Soil_Type_4.STD(X.Horizontal_Distance_To_Hydrology) | Soil_Type_4.STD(X.Horizontal_Distance_To_Roadways) | Soil_Type_4.STD(X.Vertical_Distance_To_Hydrology) | Soil_Type_4.SUM(X.Elevation) | Soil_Type_4.SUM(X.Horizontal_Distance_To_Fire_Points) | Soil_Type_4.SUM(X.Horizontal_Distance_To_Hydrology) | Soil_Type_4.SUM(X.Horizontal_Distance_To_Roadways) | Soil_Type_4.SUM(X.Vertical_Distance_To_Hydrology) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Elevation | NaN | 0.306229 | 0.093306 | 0.365559 | 0.148022 | 0.015735 | 0.242697 | 0.112179 | 0.205887 | 0.059148 | ... | 0.150376 | 0.150376 | 0.150376 | 0.150376 | 0.150376 | 0.150376 | 0.150376 | 0.150376 | 0.150376 | 0.150376 |
| Horizontal_Distance_To_Hydrology | NaN | NaN | 0.606236 | 0.072030 | 0.051874 | 0.017376 | 0.010607 | 0.027088 | 0.046790 | 0.052330 | ... | 0.009370 | 0.009370 | 0.009370 | 0.009370 | 0.009370 | 0.009370 | 0.009370 | 0.009370 | 0.009370 | 0.009370 |
| Vertical_Distance_To_Hydrology | NaN | NaN | NaN | 0.046372 | 0.069913 | 0.070305 | 0.274976 | 0.166333 | 0.110957 | 0.034902 | ... | 0.026772 | 0.026772 | 0.026772 | 0.026772 | 0.026772 | 0.026772 | 0.026772 | 0.026772 | 0.026772 | 0.026772 |
| Horizontal_Distance_To_Roadways | NaN | NaN | NaN | NaN | 0.331580 | 0.025121 | 0.215914 | 0.034349 | 0.189461 | 0.106119 | ... | 0.061607 | 0.061607 | 0.061607 | 0.061607 | 0.061607 | 0.061607 | 0.061607 | 0.061607 | 0.061607 | 0.061607 |
| Horizontal_Distance_To_Fire_Points | NaN | NaN | NaN | NaN | NaN | 0.109172 | 0.185662 | 0.132669 | 0.057329 | 0.047981 | ... | 0.051845 | 0.051845 | 0.051845 | 0.051845 | 0.051845 | 0.051845 | 0.051845 | 0.051845 | 0.051845 | 0.051845 |
| Aspect | NaN | NaN | NaN | NaN | NaN | NaN | 0.078728 | 0.579273 | 0.336103 | 0.646944 | ... | 0.008938 | 0.008938 | 0.008938 | 0.008938 | 0.008938 | 0.008938 | 0.008938 | 0.008938 | 0.008938 | 0.008938 |
| Slope | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.327199 | 0.526911 | 0.175854 | ... | 0.072311 | 0.072311 | 0.072311 | 0.072311 | 0.072311 | 0.072311 | 0.072311 | 0.072311 | 0.072311 | 0.072311 |
| Hillshade_9am | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.010037 | 0.780296 | ... | 0.046514 | 0.046514 | 0.046514 | 0.046514 | 0.046514 | 0.046514 | 0.046514 | 0.046514 | 0.046514 | 0.046514 |
| Hillshade_Noon | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.594274 | ... | 0.062044 | 0.062044 | 0.062044 | 0.062044 | 0.062044 | 0.062044 | 0.062044 | 0.062044 | 0.062044 | 0.062044 |
| Hillshade_3pm | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.006900 | 0.006900 | 0.006900 | 0.006900 | 0.006900 | 0.006900 | 0.006900 | 0.006900 | 0.006900 | 0.006900 |
| Wilderness_Area_0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.012225 | 0.012225 | 0.012225 | 0.012225 | 0.012225 | 0.012225 | 0.012225 | 0.012225 | 0.012225 | 0.012225 |
| Wilderness_Area_2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.046166 | 0.046166 | 0.046166 | 0.046166 | 0.046166 | 0.046166 | 0.046166 | 0.046166 | 0.046166 | 0.046166 |
| Wilderness_Area_3 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.201401 | 0.201401 | 0.201401 | 0.201401 | 0.201401 | 0.201401 | 0.201401 | 0.201401 | 0.201401 | 0.201401 |
| Soil_Type_0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.003802 | 0.003802 | 0.003802 | 0.003802 | 0.003802 | 0.003802 | 0.003802 | 0.003802 | 0.003802 | 0.003802 |
| Soil_Type_1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.006014 | 0.006014 | 0.006014 | 0.006014 | 0.006014 | 0.006014 | 0.006014 | 0.006014 | 0.006014 | 0.006014 |
| Soil_Type_2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.004803 | 0.004803 | 0.004803 | 0.004803 | 0.004803 | 0.004803 | 0.004803 | 0.004803 | 0.004803 | 0.004803 |
| Soil_Type_3 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.007752 | 0.007752 | 0.007752 | 0.007752 | 0.007752 | 0.007752 | 0.007752 | 0.007752 | 0.007752 | 0.007752 |
| Soil_Type_4 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
| Wilderness_Area_0.COUNT(X) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MAX(X.Elevation) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MEAN(X.Elevation) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MIN(X.Elevation) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MODE(X.Aspect) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MODE(X.Hillshade_3pm) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MODE(X.Hillshade_9am) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MODE(X.Hillshade_Noon) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MODE(X.Slope) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MODE(X.Soil_Type_0) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Wilderness_Area_0.MODE(X.Soil_Type_1) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Wilderness_Area_0.MODE(X.Soil_Type_2) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Wilderness_Area_0.MODE(X.Soil_Type_3) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Wilderness_Area_0.MODE(X.Soil_Type_4) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Wilderness_Area_0.MODE(X.Wilderness_Area_1) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Wilderness_Area_0.MODE(X.Wilderness_Area_2) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
| Wilderness_Area_0.MODE(X.Wilderness_Area_3) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Wilderness_Area_0.NUM_UNIQUE(X.Aspect) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 | 0.047379 |
\n", "

50 rows × 532 columns

\n", "
" ], "text/plain": [ " Elevation \\\n", "Elevation NaN \n", "Horizontal_Distance_To_Hydrology NaN \n", "Vertical_Distance_To_Hydrology NaN \n", "Horizontal_Distance_To_Roadways NaN \n", "Horizontal_Distance_To_Fire_Points NaN \n", "Aspect NaN \n", "Slope NaN \n", "Hillshade_9am NaN \n", "Hillshade_Noon NaN \n", "Hillshade_3pm NaN \n", "Wilderness_Area_0 NaN \n", "Wilderness_Area_1 NaN \n", "Wilderness_Area_2 NaN \n", "Wilderness_Area_3 NaN \n", "Soil_Type_0 NaN \n", "Soil_Type_1 NaN \n", "Soil_Type_2 NaN \n", "Soil_Type_3 NaN \n", "Soil_Type_4 NaN \n", "Wilderness_Area_0.COUNT(X) NaN \n", "Wilderness_Area_0.MAX(X.Elevation) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Elevation) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Elevation) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MODE(X.Aspect) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) NaN \n", "Wilderness_Area_0.MODE(X.Slope) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", 
"Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) NaN \n", "\n", " Horizontal_Distance_To_Hydrology \\\n", "Elevation 0.306229 \n", "Horizontal_Distance_To_Hydrology NaN \n", "Vertical_Distance_To_Hydrology NaN \n", "Horizontal_Distance_To_Roadways NaN \n", "Horizontal_Distance_To_Fire_Points NaN \n", "Aspect NaN \n", "Slope NaN \n", "Hillshade_9am NaN \n", "Hillshade_Noon NaN \n", "Hillshade_3pm NaN \n", "Wilderness_Area_0 NaN \n", "Wilderness_Area_1 NaN \n", "Wilderness_Area_2 NaN \n", "Wilderness_Area_3 NaN \n", "Soil_Type_0 NaN \n", "Soil_Type_1 NaN \n", "Soil_Type_2 NaN \n", "Soil_Type_3 NaN \n", "Soil_Type_4 NaN \n", "Wilderness_Area_0.COUNT(X) NaN \n", "Wilderness_Area_0.MAX(X.Elevation) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Elevation) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Elevation) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MODE(X.Aspect) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) NaN \n", 
"Wilderness_Area_0.MODE(X.Hillshade_9am) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) NaN \n", "Wilderness_Area_0.MODE(X.Slope) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) NaN \n", "\n", " Vertical_Distance_To_Hydrology \\\n", "Elevation 0.093306 \n", "Horizontal_Distance_To_Hydrology 0.606236 \n", "Vertical_Distance_To_Hydrology NaN \n", "Horizontal_Distance_To_Roadways NaN \n", "Horizontal_Distance_To_Fire_Points NaN \n", "Aspect NaN \n", "Slope NaN \n", "Hillshade_9am NaN \n", "Hillshade_Noon NaN \n", "Hillshade_3pm NaN \n", "Wilderness_Area_0 NaN \n", "Wilderness_Area_1 NaN \n", "Wilderness_Area_2 NaN \n", "Wilderness_Area_3 NaN \n", "Soil_Type_0 NaN \n", "Soil_Type_1 NaN \n", "Soil_Type_2 NaN \n", "Soil_Type_3 NaN \n", "Soil_Type_4 NaN \n", "Wilderness_Area_0.COUNT(X) NaN \n", "Wilderness_Area_0.MAX(X.Elevation) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Elevation) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Elevation) NaN \n", 
"Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MODE(X.Aspect) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) NaN \n", "Wilderness_Area_0.MODE(X.Slope) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) NaN \n", "\n", " Horizontal_Distance_To_Roadways \\\n", "Elevation 0.365559 \n", "Horizontal_Distance_To_Hydrology 0.072030 \n", "Vertical_Distance_To_Hydrology 0.046372 \n", "Horizontal_Distance_To_Roadways NaN \n", "Horizontal_Distance_To_Fire_Points NaN \n", "Aspect NaN \n", "Slope NaN \n", "Hillshade_9am NaN \n", "Hillshade_Noon NaN \n", "Hillshade_3pm NaN \n", "Wilderness_Area_0 NaN \n", "Wilderness_Area_1 NaN \n", "Wilderness_Area_2 NaN \n", "Wilderness_Area_3 NaN \n", "Soil_Type_0 NaN \n", "Soil_Type_1 NaN \n", "Soil_Type_2 NaN \n", "Soil_Type_3 NaN \n", "Soil_Type_4 NaN \n", "Wilderness_Area_0.COUNT(X) NaN \n", "Wilderness_Area_0.MAX(X.Elevation) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Elevation) NaN \n", 
"Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Elevation) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MODE(X.Aspect) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) NaN \n", "Wilderness_Area_0.MODE(X.Slope) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) NaN \n", "\n", " Horizontal_Distance_To_Fire_Points \\\n", "Elevation 0.148022 \n", "Horizontal_Distance_To_Hydrology 0.051874 \n", "Vertical_Distance_To_Hydrology 0.069913 \n", "Horizontal_Distance_To_Roadways 0.331580 \n", "Horizontal_Distance_To_Fire_Points NaN \n", "Aspect NaN \n", "Slope NaN \n", "Hillshade_9am NaN \n", "Hillshade_Noon NaN \n", "Hillshade_3pm NaN \n", "Wilderness_Area_0 NaN \n", "Wilderness_Area_1 NaN \n", "Wilderness_Area_2 NaN \n", "Wilderness_Area_3 NaN \n", "Soil_Type_0 NaN \n", "Soil_Type_1 NaN \n", "Soil_Type_2 NaN \n", "Soil_Type_3 NaN \n", "Soil_Type_4 NaN \n", "Wilderness_Area_0.COUNT(X) NaN \n", "Wilderness_Area_0.MAX(X.Elevation) NaN 
\n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Elevation) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Elevation) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MODE(X.Aspect) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) NaN \n", "Wilderness_Area_0.MODE(X.Slope) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) NaN \n", "\n", " Aspect \\\n", "Elevation 0.015735 \n", "Horizontal_Distance_To_Hydrology 0.017376 \n", "Vertical_Distance_To_Hydrology 0.070305 \n", "Horizontal_Distance_To_Roadways 0.025121 \n", "Horizontal_Distance_To_Fire_Points 0.109172 \n", "Aspect NaN \n", "Slope NaN \n", "Hillshade_9am NaN \n", "Hillshade_Noon NaN \n", "Hillshade_3pm NaN \n", 
"Wilderness_Area_0 NaN \n", "Wilderness_Area_1 NaN \n", "Wilderness_Area_2 NaN \n", "Wilderness_Area_3 NaN \n", "Soil_Type_0 NaN \n", "Soil_Type_1 NaN \n", "Soil_Type_2 NaN \n", "Soil_Type_3 NaN \n", "Soil_Type_4 NaN \n", "Wilderness_Area_0.COUNT(X) NaN \n", "Wilderness_Area_0.MAX(X.Elevation) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Elevation) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Elevation) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MODE(X.Aspect) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) NaN \n", "Wilderness_Area_0.MODE(X.Slope) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) NaN \n", "\n", " Slope \\\n", "Elevation 0.242697 
\n", "Horizontal_Distance_To_Hydrology 0.010607 \n", "Vertical_Distance_To_Hydrology 0.274976 \n", "Horizontal_Distance_To_Roadways 0.215914 \n", "Horizontal_Distance_To_Fire_Points 0.185662 \n", "Aspect 0.078728 \n", "Slope NaN \n", "Hillshade_9am NaN \n", "Hillshade_Noon NaN \n", "Hillshade_3pm NaN \n", "Wilderness_Area_0 NaN \n", "Wilderness_Area_1 NaN \n", "Wilderness_Area_2 NaN \n", "Wilderness_Area_3 NaN \n", "Soil_Type_0 NaN \n", "Soil_Type_1 NaN \n", "Soil_Type_2 NaN \n", "Soil_Type_3 NaN \n", "Soil_Type_4 NaN \n", "Wilderness_Area_0.COUNT(X) NaN \n", "Wilderness_Area_0.MAX(X.Elevation) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Elevation) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Elevation) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MODE(X.Aspect) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) NaN \n", "Wilderness_Area_0.MODE(X.Slope) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", 
"Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) NaN \n", "\n", " Hillshade_9am \\\n", "Elevation 0.112179 \n", "Horizontal_Distance_To_Hydrology 0.027088 \n", "Vertical_Distance_To_Hydrology 0.166333 \n", "Horizontal_Distance_To_Roadways 0.034349 \n", "Horizontal_Distance_To_Fire_Points 0.132669 \n", "Aspect 0.579273 \n", "Slope 0.327199 \n", "Hillshade_9am NaN \n", "Hillshade_Noon NaN \n", "Hillshade_3pm NaN \n", "Wilderness_Area_0 NaN \n", "Wilderness_Area_1 NaN \n", "Wilderness_Area_2 NaN \n", "Wilderness_Area_3 NaN \n", "Soil_Type_0 NaN \n", "Soil_Type_1 NaN \n", "Soil_Type_2 NaN \n", "Soil_Type_3 NaN \n", "Soil_Type_4 NaN \n", "Wilderness_Area_0.COUNT(X) NaN \n", "Wilderness_Area_0.MAX(X.Elevation) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Elevation) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Elevation) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MODE(X.Aspect) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) NaN \n", 
"Wilderness_Area_0.MODE(X.Hillshade_Noon) NaN \n", "Wilderness_Area_0.MODE(X.Slope) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) NaN \n", "\n", " Hillshade_Noon \\\n", "Elevation 0.205887 \n", "Horizontal_Distance_To_Hydrology 0.046790 \n", "Vertical_Distance_To_Hydrology 0.110957 \n", "Horizontal_Distance_To_Roadways 0.189461 \n", "Horizontal_Distance_To_Fire_Points 0.057329 \n", "Aspect 0.336103 \n", "Slope 0.526911 \n", "Hillshade_9am 0.010037 \n", "Hillshade_Noon NaN \n", "Hillshade_3pm NaN \n", "Wilderness_Area_0 NaN \n", "Wilderness_Area_1 NaN \n", "Wilderness_Area_2 NaN \n", "Wilderness_Area_3 NaN \n", "Soil_Type_0 NaN \n", "Soil_Type_1 NaN \n", "Soil_Type_2 NaN \n", "Soil_Type_3 NaN \n", "Soil_Type_4 NaN \n", "Wilderness_Area_0.COUNT(X) NaN \n", "Wilderness_Area_0.MAX(X.Elevation) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Elevation) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Elevation) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", 
"Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MODE(X.Aspect) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) NaN \n", "Wilderness_Area_0.MODE(X.Slope) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) NaN \n", "\n", " Hillshade_3pm \\\n", "Elevation 0.059148 \n", "Horizontal_Distance_To_Hydrology 0.052330 \n", "Vertical_Distance_To_Hydrology 0.034902 \n", "Horizontal_Distance_To_Roadways 0.106119 \n", "Horizontal_Distance_To_Fire_Points 0.047981 \n", "Aspect 0.646944 \n", "Slope 0.175854 \n", "Hillshade_9am 0.780296 \n", "Hillshade_Noon 0.594274 \n", "Hillshade_3pm NaN \n", "Wilderness_Area_0 NaN \n", "Wilderness_Area_1 NaN \n", "Wilderness_Area_2 NaN \n", "Wilderness_Area_3 NaN \n", "Soil_Type_0 NaN \n", "Soil_Type_1 NaN \n", "Soil_Type_2 NaN \n", "Soil_Type_3 NaN \n", "Soil_Type_4 NaN \n", "Wilderness_Area_0.COUNT(X) NaN \n", "Wilderness_Area_0.MAX(X.Elevation) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Elevation) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) 
NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Elevation) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MODE(X.Aspect) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) NaN \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) NaN \n", "Wilderness_Area_0.MODE(X.Slope) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) NaN \n", "\n", " ... \\\n", "Elevation ... \n", "Horizontal_Distance_To_Hydrology ... \n", "Vertical_Distance_To_Hydrology ... \n", "Horizontal_Distance_To_Roadways ... \n", "Horizontal_Distance_To_Fire_Points ... \n", "Aspect ... \n", "Slope ... \n", "Hillshade_9am ... \n", "Hillshade_Noon ... \n", "Hillshade_3pm ... \n", "Wilderness_Area_0 ... \n", "Wilderness_Area_1 ... \n", "Wilderness_Area_2 ... \n", "Wilderness_Area_3 ... \n", "Soil_Type_0 ... \n", "Soil_Type_1 ... \n", "Soil_Type_2 ... \n", "Soil_Type_3 ... \n", "Soil_Type_4 ... \n", "Wilderness_Area_0.COUNT(X) ... \n", "Wilderness_Area_0.MAX(X.Elevation) ... \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) ... 
\n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) ... \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) ... \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) ... \n", "Wilderness_Area_0.MEAN(X.Elevation) ... \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) ... \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) ... \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) ... \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) ... \n", "Wilderness_Area_0.MIN(X.Elevation) ... \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) ... \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) ... \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) ... \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) ... \n", "Wilderness_Area_0.MODE(X.Aspect) ... \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) ... \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) ... \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) ... \n", "Wilderness_Area_0.MODE(X.Slope) ... \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) ... \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) ... \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) ... \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) ... \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) ... \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) ... \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) ... \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) ... \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) ... \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) ... 
\n", "\n", " Soil_Type_4.STD(X.Elevation) \\\n", "Elevation 0.150376 \n", "Horizontal_Distance_To_Hydrology 0.009370 \n", "Vertical_Distance_To_Hydrology 0.026772 \n", "Horizontal_Distance_To_Roadways 0.061607 \n", "Horizontal_Distance_To_Fire_Points 0.051845 \n", "Aspect 0.008938 \n", "Slope 0.072311 \n", "Hillshade_9am 0.046514 \n", "Hillshade_Noon 0.062044 \n", "Hillshade_3pm 0.006900 \n", "Wilderness_Area_0 0.047379 \n", "Wilderness_Area_1 0.012225 \n", "Wilderness_Area_2 0.046166 \n", "Wilderness_Area_3 0.201401 \n", "Soil_Type_0 0.003802 \n", "Soil_Type_1 0.006014 \n", "Soil_Type_2 0.004803 \n", "Soil_Type_3 0.007752 \n", "Soil_Type_4 1.000000 \n", "Wilderness_Area_0.COUNT(X) 0.047379 \n", "Wilderness_Area_0.MAX(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MIN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MODE(X.Aspect) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) 0.047379 \n", "Wilderness_Area_0.MODE(X.Slope) 0.047379 \n", 
"Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) 0.047379 \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) 0.047379 \n", "\n", " Soil_Type_4.STD(X.Horizontal_Distance_To_Fire_Points) \\\n", "Elevation 0.150376 \n", "Horizontal_Distance_To_Hydrology 0.009370 \n", "Vertical_Distance_To_Hydrology 0.026772 \n", "Horizontal_Distance_To_Roadways 0.061607 \n", "Horizontal_Distance_To_Fire_Points 0.051845 \n", "Aspect 0.008938 \n", "Slope 0.072311 \n", "Hillshade_9am 0.046514 \n", "Hillshade_Noon 0.062044 \n", "Hillshade_3pm 0.006900 \n", "Wilderness_Area_0 0.047379 \n", "Wilderness_Area_1 0.012225 \n", "Wilderness_Area_2 0.046166 \n", "Wilderness_Area_3 0.201401 \n", "Soil_Type_0 0.003802 \n", "Soil_Type_1 0.006014 \n", "Soil_Type_2 0.004803 \n", "Soil_Type_3 0.007752 \n", "Soil_Type_4 1.000000 \n", "Wilderness_Area_0.COUNT(X) 0.047379 \n", "Wilderness_Area_0.MAX(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MIN(X.Elevation) 0.047379 \n", 
"Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MODE(X.Aspect) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) 0.047379 \n", "Wilderness_Area_0.MODE(X.Slope) 0.047379 \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) 0.047379 \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) 0.047379 \n", "\n", " Soil_Type_4.STD(X.Horizontal_Distance_To_Hydrology) \\\n", "Elevation 0.150376 \n", "Horizontal_Distance_To_Hydrology 0.009370 \n", "Vertical_Distance_To_Hydrology 0.026772 \n", "Horizontal_Distance_To_Roadways 0.061607 \n", "Horizontal_Distance_To_Fire_Points 0.051845 \n", "Aspect 0.008938 \n", "Slope 0.072311 \n", "Hillshade_9am 0.046514 \n", "Hillshade_Noon 0.062044 \n", "Hillshade_3pm 0.006900 \n", "Wilderness_Area_0 0.047379 \n", "Wilderness_Area_1 0.012225 \n", "Wilderness_Area_2 0.046166 \n", "Wilderness_Area_3 0.201401 \n", "Soil_Type_0 0.003802 \n", "Soil_Type_1 0.006014 \n", "Soil_Type_2 0.004803 \n", "Soil_Type_3 0.007752 \n", "Soil_Type_4 1.000000 \n", "Wilderness_Area_0.COUNT(X) 0.047379 \n", "Wilderness_Area_0.MAX(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", 
"Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MIN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MODE(X.Aspect) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) 0.047379 \n", "Wilderness_Area_0.MODE(X.Slope) 0.047379 \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) 0.047379 \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) 0.047379 \n", "\n", " Soil_Type_4.STD(X.Horizontal_Distance_To_Roadways) \\\n", "Elevation 0.150376 \n", "Horizontal_Distance_To_Hydrology 0.009370 \n", "Vertical_Distance_To_Hydrology 0.026772 \n", "Horizontal_Distance_To_Roadways 0.061607 \n", "Horizontal_Distance_To_Fire_Points 0.051845 \n", "Aspect 0.008938 \n", "Slope 0.072311 \n", "Hillshade_9am 0.046514 \n", "Hillshade_Noon 0.062044 \n", "Hillshade_3pm 
0.006900 \n", "Wilderness_Area_0 0.047379 \n", "Wilderness_Area_1 0.012225 \n", "Wilderness_Area_2 0.046166 \n", "Wilderness_Area_3 0.201401 \n", "Soil_Type_0 0.003802 \n", "Soil_Type_1 0.006014 \n", "Soil_Type_2 0.004803 \n", "Soil_Type_3 0.007752 \n", "Soil_Type_4 1.000000 \n", "Wilderness_Area_0.COUNT(X) 0.047379 \n", "Wilderness_Area_0.MAX(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MIN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MODE(X.Aspect) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) 0.047379 \n", "Wilderness_Area_0.MODE(X.Slope) 0.047379 \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) 0.047379 \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN 
\n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) 0.047379 \n", "\n", " Soil_Type_4.STD(X.Vertical_Distance_To_Hydrology) \\\n", "Elevation 0.150376 \n", "Horizontal_Distance_To_Hydrology 0.009370 \n", "Vertical_Distance_To_Hydrology 0.026772 \n", "Horizontal_Distance_To_Roadways 0.061607 \n", "Horizontal_Distance_To_Fire_Points 0.051845 \n", "Aspect 0.008938 \n", "Slope 0.072311 \n", "Hillshade_9am 0.046514 \n", "Hillshade_Noon 0.062044 \n", "Hillshade_3pm 0.006900 \n", "Wilderness_Area_0 0.047379 \n", "Wilderness_Area_1 0.012225 \n", "Wilderness_Area_2 0.046166 \n", "Wilderness_Area_3 0.201401 \n", "Soil_Type_0 0.003802 \n", "Soil_Type_1 0.006014 \n", "Soil_Type_2 0.004803 \n", "Soil_Type_3 0.007752 \n", "Soil_Type_4 1.000000 \n", "Wilderness_Area_0.COUNT(X) 0.047379 \n", "Wilderness_Area_0.MAX(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MIN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MODE(X.Aspect) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) 
0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) 0.047379 \n", "Wilderness_Area_0.MODE(X.Slope) 0.047379 \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) 0.047379 \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) 0.047379 \n", "\n", " Soil_Type_4.SUM(X.Elevation) \\\n", "Elevation 0.150376 \n", "Horizontal_Distance_To_Hydrology 0.009370 \n", "Vertical_Distance_To_Hydrology 0.026772 \n", "Horizontal_Distance_To_Roadways 0.061607 \n", "Horizontal_Distance_To_Fire_Points 0.051845 \n", "Aspect 0.008938 \n", "Slope 0.072311 \n", "Hillshade_9am 0.046514 \n", "Hillshade_Noon 0.062044 \n", "Hillshade_3pm 0.006900 \n", "Wilderness_Area_0 0.047379 \n", "Wilderness_Area_1 0.012225 \n", "Wilderness_Area_2 0.046166 \n", "Wilderness_Area_3 0.201401 \n", "Soil_Type_0 0.003802 \n", "Soil_Type_1 0.006014 \n", "Soil_Type_2 0.004803 \n", "Soil_Type_3 0.007752 \n", "Soil_Type_4 1.000000 \n", "Wilderness_Area_0.COUNT(X) 0.047379 \n", "Wilderness_Area_0.MAX(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) 0.047379 \n", 
"Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MIN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MODE(X.Aspect) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) 0.047379 \n", "Wilderness_Area_0.MODE(X.Slope) 0.047379 \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) 0.047379 \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) 0.047379 \n", "\n", " Soil_Type_4.SUM(X.Horizontal_Distance_To_Fire_Points) \\\n", "Elevation 0.150376 \n", "Horizontal_Distance_To_Hydrology 0.009370 \n", "Vertical_Distance_To_Hydrology 0.026772 \n", "Horizontal_Distance_To_Roadways 0.061607 \n", "Horizontal_Distance_To_Fire_Points 0.051845 \n", "Aspect 0.008938 \n", "Slope 0.072311 \n", "Hillshade_9am 0.046514 \n", "Hillshade_Noon 0.062044 \n", "Hillshade_3pm 0.006900 \n", "Wilderness_Area_0 0.047379 \n", "Wilderness_Area_1 0.012225 \n", "Wilderness_Area_2 0.046166 \n", "Wilderness_Area_3 0.201401 \n", "Soil_Type_0 0.003802 \n", "Soil_Type_1 0.006014 \n", "Soil_Type_2 0.004803 \n", "Soil_Type_3 0.007752 \n", "Soil_Type_4 1.000000 \n", "Wilderness_Area_0.COUNT(X) 0.047379 \n", "Wilderness_Area_0.MAX(X.Elevation) 0.047379 \n", 
"Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MIN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MODE(X.Aspect) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) 0.047379 \n", "Wilderness_Area_0.MODE(X.Slope) 0.047379 \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) 0.047379 \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) 0.047379 \n", "\n", " Soil_Type_4.SUM(X.Horizontal_Distance_To_Hydrology) \\\n", "Elevation 0.150376 \n", "Horizontal_Distance_To_Hydrology 0.009370 \n", "Vertical_Distance_To_Hydrology 0.026772 \n", "Horizontal_Distance_To_Roadways 0.061607 \n", 
"Horizontal_Distance_To_Fire_Points 0.051845 \n", "Aspect 0.008938 \n", "Slope 0.072311 \n", "Hillshade_9am 0.046514 \n", "Hillshade_Noon 0.062044 \n", "Hillshade_3pm 0.006900 \n", "Wilderness_Area_0 0.047379 \n", "Wilderness_Area_1 0.012225 \n", "Wilderness_Area_2 0.046166 \n", "Wilderness_Area_3 0.201401 \n", "Soil_Type_0 0.003802 \n", "Soil_Type_1 0.006014 \n", "Soil_Type_2 0.004803 \n", "Soil_Type_3 0.007752 \n", "Soil_Type_4 1.000000 \n", "Wilderness_Area_0.COUNT(X) 0.047379 \n", "Wilderness_Area_0.MAX(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MIN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MODE(X.Aspect) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) 0.047379 \n", "Wilderness_Area_0.MODE(X.Slope) 0.047379 \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN 
\n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) 0.047379 \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) 0.047379 \n", "\n", " Soil_Type_4.SUM(X.Horizontal_Distance_To_Roadways) \\\n", "Elevation 0.150376 \n", "Horizontal_Distance_To_Hydrology 0.009370 \n", "Vertical_Distance_To_Hydrology 0.026772 \n", "Horizontal_Distance_To_Roadways 0.061607 \n", "Horizontal_Distance_To_Fire_Points 0.051845 \n", "Aspect 0.008938 \n", "Slope 0.072311 \n", "Hillshade_9am 0.046514 \n", "Hillshade_Noon 0.062044 \n", "Hillshade_3pm 0.006900 \n", "Wilderness_Area_0 0.047379 \n", "Wilderness_Area_1 0.012225 \n", "Wilderness_Area_2 0.046166 \n", "Wilderness_Area_3 0.201401 \n", "Soil_Type_0 0.003802 \n", "Soil_Type_1 0.006014 \n", "Soil_Type_2 0.004803 \n", "Soil_Type_3 0.007752 \n", "Soil_Type_4 1.000000 \n", "Wilderness_Area_0.COUNT(X) 0.047379 \n", "Wilderness_Area_0.MAX(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MIN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", 
"Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MODE(X.Aspect) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) 0.047379 \n", "Wilderness_Area_0.MODE(X.Slope) 0.047379 \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) 0.047379 \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) 0.047379 \n", "\n", " Soil_Type_4.SUM(X.Vertical_Distance_To_Hydrology) \n", "Elevation 0.150376 \n", "Horizontal_Distance_To_Hydrology 0.009370 \n", "Vertical_Distance_To_Hydrology 0.026772 \n", "Horizontal_Distance_To_Roadways 0.061607 \n", "Horizontal_Distance_To_Fire_Points 0.051845 \n", "Aspect 0.008938 \n", "Slope 0.072311 \n", "Hillshade_9am 0.046514 \n", "Hillshade_Noon 0.062044 \n", "Hillshade_3pm 0.006900 \n", "Wilderness_Area_0 0.047379 \n", "Wilderness_Area_1 0.012225 \n", "Wilderness_Area_2 0.046166 \n", "Wilderness_Area_3 0.201401 \n", "Soil_Type_0 0.003802 \n", "Soil_Type_1 0.006014 \n", "Soil_Type_2 0.004803 \n", "Soil_Type_3 0.007752 \n", "Soil_Type_4 1.000000 \n", "Wilderness_Area_0.COUNT(X) 0.047379 \n", "Wilderness_Area_0.MAX(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MAX(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MAX(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Elevation) 0.047379 \n", 
"Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Fire_Points) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Horizontal_Distance_To_Roadways) 0.047379 \n", "Wilderness_Area_0.MEAN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MIN(X.Elevation) 0.047379 \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Fire_Points) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Hydrology) NaN \n", "Wilderness_Area_0.MIN(X.Horizontal_Distance_To_Roadways) NaN \n", "Wilderness_Area_0.MIN(X.Vertical_Distance_To_Hydrology) 0.047379 \n", "Wilderness_Area_0.MODE(X.Aspect) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_3pm) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_9am) 0.047379 \n", "Wilderness_Area_0.MODE(X.Hillshade_Noon) 0.047379 \n", "Wilderness_Area_0.MODE(X.Slope) 0.047379 \n", "Wilderness_Area_0.MODE(X.Soil_Type_0) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_1) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_2) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_3) NaN \n", "Wilderness_Area_0.MODE(X.Soil_Type_4) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_1) NaN \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_2) 0.047379 \n", "Wilderness_Area_0.MODE(X.Wilderness_Area_3) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Aspect) NaN \n", "Wilderness_Area_0.NUM_UNIQUE(X.Hillshade_3pm) 0.047379 \n", "\n", "[50 rows x 532 columns]" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Features whose correlation with another feature exceeds this threshold are dropped\n", "threshold = 0.95\n", "\n", "# Absolute-value correlation matrix; keep only the strict upper triangle so each pair is counted once\n", "corr_matrix = features.corr().abs()\n", "upper = corr_matrix.where(np.triu(np.ones(corr_matrix.shape), k=1).astype(bool))\n", "upper.head(50)" ] }, { "cell_type": "code", "execution_count": 38, "id": "b437e62c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "There are 407 features to remove.\n" ] } ], "source": [ "# Collect the features whose correlation with an earlier feature exceeds the threshold (these get dropped)\n", "collinear_features = [column 
for column in upper.columns if any(upper[column] > threshold)]\n", "\n", "print('There are %d features to remove.' % (len(collinear_features)))" ] }, { "cell_type": "code", "execution_count": 39, "id": "db09c8dc", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The number of features that passed the collinearity threshold: 125\n" ] } ], "source": [ "features_filtered = features.drop(columns = collinear_features)\n", "\n", "print('The number of features that passed the collinearity threshold: ', features_filtered.shape[1])" ] }, { "cell_type": "markdown", "id": "b174e3b5", "metadata": {}, "source": [ "Note, however, that dropping features purely by correlation, without understanding what is being removed, is not a good idea. Two features can be very highly correlated and still differ in meaningful ways, so such pairs may need extra handling. Manual inspection is therefore necessary, but that topic is beyond the scope of this notebook." ] }, { "cell_type": "markdown", "id": "613b437a", "metadata": {}, "source": [ "#### 3.2 Detecting the most relevant features with an L1-penalized linear model\n", "The next step is to fit a linear model with an L1-norm penalty." ] }, { "cell_type": "markdown", "id": "6d522273", "metadata": {}, "source": [ "Note that in practice the test-set labels are unknown, so we first split the data into a training set and a test set." ] }, { "cell_type": "code", "execution_count": 46, "id": "7afc5013", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " index Elevation Horizontal_Distance_To_Hydrology \\\n", "0 0 2596.0 258.0 \n", "1 1 2590.0 212.0 \n", "\n", " Vertical_Distance_To_Hydrology Horizontal_Distance_To_Roadways \\\n", "0 0.0 510.0 \n", "1 -6.0 390.0 \n", "\n", " Horizontal_Distance_To_Fire_Points Aspect Slope Hillshade_9am \\\n", "0 6279.0 51.0 3.0 221.0 \n", "1 6225.0 56.0 2.0 220.0 \n", "\n", " Hillshade_Noon ... Soil_Type_4.MIN(X.Horizontal_Distance_To_Fire_Points) \\\n", "0 232.0 ... 0.0 \n", "1 235.0 ... 0.0 \n", "\n", " Soil_Type_4.MIN(X.Horizontal_Distance_To_Hydrology) \\\n", "0 0.0 \n", "1 0.0 \n", "\n", " Soil_Type_4.MODE(X.Soil_Type_0) Soil_Type_4.MODE(X.Soil_Type_1) \\\n", "0 0.0 0.0 \n", "1 0.0 0.0 \n", "\n", " Soil_Type_4.MODE(X.Soil_Type_2) Soil_Type_4.MODE(X.Soil_Type_3) \\\n", "0 0.0 0.0 \n", "1 0.0 0.0 \n", "\n", " Soil_Type_4.MODE(X.Wilderness_Area_0) \\\n", "0 0.0 \n", "1 0.0 \n", "\n", " Soil_Type_4.MODE(X.Wilderness_Area_1) \\\n", "0 0.0 \n", "1 0.0 \n", "\n", " Soil_Type_4.MODE(X.Wilderness_Area_2) Cover_Type \n", "0 0.0 4.0 \n", "1 0.0 4.0 \n", "\n", "[2 rows x 127 columns]" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.merge(features_filtered, y, on=['index'])\n", "df.head(2)" ] }, { "cell_type": "code", "execution_count": 48, "id": "54d2eefe", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " index Elevation Horizontal_Distance_To_Hydrology \\\n", "442216 442216 2833.0 60.0 \n", "20198 20198 3008.0 339.0 \n", "\n", " Vertical_Distance_To_Hydrology Horizontal_Distance_To_Roadways \\\n", "442216 26.0 1890.0 \n", "20198 7.0 6427.0 \n", "\n", " Horizontal_Distance_To_Fire_Points Aspect Slope Hillshade_9am \\\n", "442216 1211.0 258.0 26.0 148.0 \n", "20198 2971.0 45.0 2.0 220.0 \n", "\n", " Hillshade_Noon ... \\\n", "442216 244.0 ... \n", "20198 234.0 ... \n", "\n", " Soil_Type_4.MIN(X.Horizontal_Distance_To_Fire_Points) \\\n", "442216 0.0 \n", "20198 0.0 \n", "\n", " Soil_Type_4.MIN(X.Horizontal_Distance_To_Hydrology) \\\n", "442216 0.0 \n", "20198 0.0 \n", "\n", " Soil_Type_4.MODE(X.Soil_Type_0) Soil_Type_4.MODE(X.Soil_Type_1) \\\n", "442216 0.0 0.0 \n", "20198 0.0 0.0 \n", "\n", " Soil_Type_4.MODE(X.Soil_Type_2) Soil_Type_4.MODE(X.Soil_Type_3) \\\n", "442216 0.0 0.0 \n", "20198 0.0 0.0 \n", "\n", " Soil_Type_4.MODE(X.Wilderness_Area_0) \\\n", "442216 0.0 \n", "20198 0.0 \n", "\n", " Soil_Type_4.MODE(X.Wilderness_Area_1) \\\n", "442216 0.0 \n", "20198 0.0 \n", "\n", " Soil_Type_4.MODE(X.Wilderness_Area_2) Cover_Type \n", "442216 0.0 1.0 \n", "20198 0.0 1.0 \n", "\n", "[2 rows x 127 columns]" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "train_df, test_df = train_test_split(df,random_state=42)\n", "train_df.head(2)" ] }, { "cell_type": "code", "execution_count": 53, "id": "7c1ae694", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " index Elevation Horizontal_Distance_To_Hydrology \\\n", "250728 250728 3351.0 726.0 \n", "246788 246788 2732.0 212.0 \n", "\n", " Vertical_Distance_To_Hydrology Horizontal_Distance_To_Roadways \\\n", "250728 124.0 3813.0 \n", "246788 1.0 1082.0 \n", "\n", " Horizontal_Distance_To_Fire_Points Aspect Slope Hillshade_9am \\\n", "250728 2271.0 206.0 27.0 192.0 \n", "246788 912.0 129.0 7.0 231.0 \n", "\n", " Hillshade_Noon ... Soil_Type_3.NUM_UNIQUE(X.Wilderness_Area_3) \\\n", "250728 252.0 ... 2.0 \n", "246788 236.0 ... 2.0 \n", "\n", " Soil_Type_4.MIN(X.Horizontal_Distance_To_Fire_Points) \\\n", "250728 0.0 \n", "246788 0.0 \n", "\n", " Soil_Type_4.MIN(X.Horizontal_Distance_To_Hydrology) \\\n", "250728 0.0 \n", "246788 0.0 \n", "\n", " Soil_Type_4.MODE(X.Soil_Type_0) Soil_Type_4.MODE(X.Soil_Type_1) \\\n", "250728 0.0 0.0 \n", "246788 0.0 0.0 \n", "\n", " Soil_Type_4.MODE(X.Soil_Type_2) Soil_Type_4.MODE(X.Soil_Type_3) \\\n", "250728 0.0 0.0 \n", "246788 0.0 0.0 \n", "\n", " Soil_Type_4.MODE(X.Wilderness_Area_0) \\\n", "250728 0.0 \n", "246788 0.0 \n", "\n", " Soil_Type_4.MODE(X.Wilderness_Area_1) \\\n", "250728 0.0 \n", "246788 0.0 \n", "\n", " Soil_Type_4.MODE(X.Wilderness_Area_2) \n", "250728 0.0 \n", "246788 0.0 \n", "\n", "[2 rows x 126 columns]" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "features_positive = features_filtered.loc[:, features_filtered.ge(0).all()]\n", "\n", "# Separate the features from the Cover_Type label\n", "train_X = train_df.drop(columns='Cover_Type')\n", "train_y = train_df['Cover_Type']\n", "\n", "test_X = test_df.drop(columns='Cover_Type')\n", "test_y = test_df['Cover_Type']\n", "test_X.head(2)" ] }, { "cell_type": "code", "execution_count": 54, "id": "0c224d19", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "D:\\ProgramData\\Anaconda3\\lib\\site-packages\\sklearn\\svm\\_base.py:985: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.\n", " 
 warnings.warn(\"Liblinear failed to converge, increase \"\n" ] }, { "data": { "text/plain": [ "(435759, 36)" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "lsvc = LinearSVC(C=0.01, penalty=\"l1\", dual=False).fit(train_X, train_y)\n", "model = SelectFromModel(lsvc, prefit=True)\n", "X_new = model.transform(train_X)\n", "X_selected_df = pd.DataFrame(X_new, columns=train_X.columns[model.get_support()])\n", "X_selected_df.shape" ] }, { "cell_type": "code", "execution_count": 55, "id": "9679358f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['Elevation', 'Horizontal_Distance_To_Hydrology',\n", " 'Vertical_Distance_To_Hydrology', 'Horizontal_Distance_To_Roadways',\n", " 'Horizontal_Distance_To_Fire_Points', 'Aspect', 'Slope',\n", " 'Hillshade_9am', 'Hillshade_Noon', 'Hillshade_3pm', 'Wilderness_Area_0',\n", " 'Wilderness_Area_1', 'Wilderness_Area_2', 'Wilderness_Area_3',\n", " 'Soil_Type_0', 'Soil_Type_1', 'Soil_Type_2', 'Soil_Type_3',\n", " 'Soil_Type_4', 'Wilderness_Area_0.NUM_UNIQUE(X.Aspect)',\n", " 'Wilderness_Area_2.MODE(X.Aspect)',\n", " 'Wilderness_Area_2.NUM_UNIQUE(X.Aspect)',\n", " 'Wilderness_Area_2.NUM_UNIQUE(X.Soil_Type_1)',\n", " 'Wilderness_Area_2.NUM_UNIQUE(X.Soil_Type_2)',\n", " 'Wilderness_Area_2.NUM_UNIQUE(X.Soil_Type_3)',\n", " 'Wilderness_Area_3.NUM_UNIQUE(X.Aspect)',\n", " 'Wilderness_Area_3.NUM_UNIQUE(X.Soil_Type_1)',\n", " 'Wilderness_Area_3.NUM_UNIQUE(X.Soil_Type_2)',\n", " 'Wilderness_Area_3.NUM_UNIQUE(X.Soil_Type_3)',\n", " 'Soil_Type_1.MODE(X.Hillshade_9am)',\n", " 'Soil_Type_1.NUM_UNIQUE(X.Wilderness_Area_2)',\n", " 'Soil_Type_1.NUM_UNIQUE(X.Wilderness_Area_3)',\n", " 'Soil_Type_2.NUM_UNIQUE(X.Wilderness_Area_2)',\n", " 'Soil_Type_2.NUM_UNIQUE(X.Wilderness_Area_3)',\n", " 'Soil_Type_3.NUM_UNIQUE(X.Wilderness_Area_2)',\n", " 'Soil_Type_3.NUM_UNIQUE(X.Wilderness_Area_3)'],\n", " dtype='object')" ] }, "execution_count": 55, 
"metadata": {}, "output_type": "execute_result" } ], "source": [ "X_selected_df.columns" ] }, { "cell_type": "markdown", "id": "ece30629", "metadata": {}, "source": [ "### 4. Training and testing a single model" ] }, { "cell_type": "markdown", "id": "b1ce758e", "metadata": {}, "source": [ "Finally, we build a basic random forest classifier. Note that I skip some standard steps here, such as cross-validation and learning-curve analysis." ] }, { "cell_type": "code", "execution_count": 59, "id": "e7c55d5d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Wall time: 12min 18s\n" ] }, { "data": { "text/plain": [ "RandomForestClassifier(n_estimators=500, oob_score=True)" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%time\n", "random_forest = RandomForestClassifier(n_estimators=500,oob_score=True)\n", "random_forest.fit(X_selected_df, train_y)" ] }, { "cell_type": "markdown", "id": "644cacf6", "metadata": {}, "source": [ "### 5. Evaluating the results" ] }, { "cell_type": "code", "execution_count": 60, "id": "6345b91b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.9439598493662782\n" ] } ], "source": [ "# Evaluate on the held-out test set, using only the selected columns\n", "Y_pred = random_forest.predict(test_X[X_selected_df.columns])\n", "print(accuracy_score(Y_pred,test_y)) # RF" ] }, { "cell_type": "code", "execution_count": 67, "id": "be8025ef", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "51238" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"\"\"\n", "del features_filtered\n", "del features_positive\n", "del fetch_covtype\n", "del df, X,y, X_selected_df,train,test,train_df,test_df,train_X,train_y\n", "\"\"\"\n", "gc.collect()" ] }, { "cell_type": "markdown", "id": "8406d7c5", "metadata": {}, "source": [ "### 5.1 Comparison: score with the original features" ] }, { "cell_type": "code", "execution_count": 8, "id": "b7241552", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " index Elevation Aspect Slope Horizontal_Distance_To_Hydrology \\\n", "250728 250728 3351.0 206.0 27.0 726.0 \n", "246788 246788 2732.0 129.0 7.0 212.0 \n", "\n", " Vertical_Distance_To_Hydrology Horizontal_Distance_To_Roadways \\\n", "250728 124.0 3813.0 \n", "246788 1.0 1082.0 \n", "\n", " Hillshade_9am Hillshade_Noon Hillshade_3pm \\\n", "250728 192.0 252.0 180.0 \n", "246788 231.0 236.0 137.0 \n", "\n", " Horizontal_Distance_To_Fire_Points Wilderness_Area_0 \\\n", "250728 2271.0 1.0 \n", "246788 912.0 0.0 \n", "\n", " Wilderness_Area_1 Wilderness_Area_2 Wilderness_Area_3 Soil_Type_0 \\\n", "250728 0.0 0.0 0.0 0.0 \n", "246788 0.0 1.0 0.0 0.0 \n", "\n", " Soil_Type_1 Soil_Type_2 Soil_Type_3 Soil_Type_4 \n", "250728 0.0 0.0 0.0 0.0 \n", "246788 0.0 0.0 0.0 0.0 " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "org_df = pd.merge(X, y, on=['index'])\n", "org_train_df, org_test_df = train_test_split(org_df,random_state=42)\n", "org_train_X = org_train_df.drop(columns='Cover_Type')\n", "org_train_y = org_train_df['Cover_Type']\n", "\n", "org_test_X = org_test_df.drop(columns='Cover_Type')\n", "org_test_y = org_test_df['Cover_Type']\n", "org_test_X.head(2)" ] }, { "cell_type": "code", "execution_count": 9, "id": "db3d3b92", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.9673328605949619\n", "Wall time: 14min 30s\n" ] } ], "source": [ "%%time\n", "random_forest = RandomForestClassifier(n_estimators=500,oob_score=True)\n", "random_forest.fit(org_train_X, org_train_y)\n", "pred_org_test_y = random_forest.predict(org_test_X)\n", "print(accuracy_score(pred_org_test_y,org_test_y)) # RF" ] }, { "cell_type": "markdown", "id": "50b5f988", "metadata": {}, "source": [ "### 5.2 Score with all generated features (no reduction or selection)" ] }, { "cell_type": "code", "execution_count": 18, "id": "0dc54e8c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " index Elevation Horizontal_Distance_To_Hydrology \\\n", "0 0 2596.0 258.0 \n", "1 1 2590.0 212.0 \n", "\n", " Vertical_Distance_To_Hydrology Horizontal_Distance_To_Roadways \\\n", "0 0.0 510.0 \n", "1 -6.0 390.0 \n", "\n", " Horizontal_Distance_To_Fire_Points Aspect Slope Hillshade_9am \\\n", "0 6279.0 51.0 3.0 221.0 \n", "1 6225.0 56.0 2.0 220.0 \n", "\n", " Hillshade_Noon ... Soil_Type_4.STD(X.Horizontal_Distance_To_Fire_Points) \\\n", "0 232.0 ... 1324.050751 \n", "1 235.0 ... 1324.050751 \n", "\n", " Soil_Type_4.STD(X.Horizontal_Distance_To_Hydrology) \\\n", "0 212.689925 \n", "1 212.689925 \n", "\n", " Soil_Type_4.STD(X.Horizontal_Distance_To_Roadways) \\\n", "0 1558.361956 \n", "1 1558.361956 \n", "\n", " Soil_Type_4.STD(X.Vertical_Distance_To_Hydrology) \\\n", "0 58.279989 \n", "1 58.279989 \n", "\n", " Soil_Type_4.SUM(X.Elevation) \\\n", "0 1.715981e+09 \n", "1 1.715981e+09 \n", "\n", " Soil_Type_4.SUM(X.Horizontal_Distance_To_Fire_Points) \\\n", "0 1.149499e+09 \n", "1 1.149499e+09 \n", "\n", " Soil_Type_4.SUM(X.Horizontal_Distance_To_Hydrology) \\\n", "0 156171328.0 \n", "1 156171328.0 \n", "\n", " Soil_Type_4.SUM(X.Horizontal_Distance_To_Roadways) \\\n", "0 1.364632e+09 \n", "1 1.364632e+09 \n", "\n", " Soil_Type_4.SUM(X.Vertical_Distance_To_Hydrology) Cover_Type \n", "0 26848308.0 4.0 \n", "1 26848308.0 4.0 \n", "\n", "[2 rows x 534 columns]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.merge(features, y, on=['index'])\n", "df.head(2)" ] }, { "cell_type": "code", "execution_count": 20, "id": "637b3a7e", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3256" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "del features, X\n", "gc.collect()" ] }, { "cell_type": "code", "execution_count": 22, "id": "4ac537b8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " index Elevation Horizontal_Distance_To_Hydrology \\\n", "250728 250728 3351.0 726.0 \n", "246788 246788 2732.0 212.0 \n", "\n", " Vertical_Distance_To_Hydrology Horizontal_Distance_To_Roadways \\\n", "250728 124.0 3813.0 \n", "246788 1.0 1082.0 \n", "\n", " Horizontal_Distance_To_Fire_Points Aspect Slope Hillshade_9am \\\n", "250728 2271.0 206.0 27.0 192.0 \n", "246788 912.0 129.0 7.0 231.0 \n", "\n", " Hillshade_Noon ... Soil_Type_4.STD(X.Elevation) \\\n", "250728 252.0 ... 277.045517 \n", "246788 236.0 ... 277.045517 \n", "\n", " Soil_Type_4.STD(X.Horizontal_Distance_To_Fire_Points) \\\n", "250728 1324.050751 \n", "246788 1324.050751 \n", "\n", " Soil_Type_4.STD(X.Horizontal_Distance_To_Hydrology) \\\n", "250728 212.689925 \n", "246788 212.689925 \n", "\n", " Soil_Type_4.STD(X.Horizontal_Distance_To_Roadways) \\\n", "250728 1558.361956 \n", "246788 1558.361956 \n", "\n", " Soil_Type_4.STD(X.Vertical_Distance_To_Hydrology) \\\n", "250728 58.279989 \n", "246788 58.279989 \n", "\n", " Soil_Type_4.SUM(X.Elevation) \\\n", "250728 1.715981e+09 \n", "246788 1.715981e+09 \n", "\n", " Soil_Type_4.SUM(X.Horizontal_Distance_To_Fire_Points) \\\n", "250728 1.149499e+09 \n", "246788 1.149499e+09 \n", "\n", " Soil_Type_4.SUM(X.Horizontal_Distance_To_Hydrology) \\\n", "250728 156171328.0 \n", "246788 156171328.0 \n", "\n", " Soil_Type_4.SUM(X.Horizontal_Distance_To_Roadways) \\\n", "250728 1.364632e+09 \n", "246788 1.364632e+09 \n", "\n", " Soil_Type_4.SUM(X.Vertical_Distance_To_Hydrology) \n", "250728 26848308.0 \n", "246788 26848308.0 \n", "\n", "[2 rows x 533 columns]" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "train_df, test_df = train_test_split(df,random_state=42)\n", "train_X = train_df.drop(columns='Cover_Type')\n", "train_y = train_df['Cover_Type']\n", "\n", "test_X = test_df.drop(columns='Cover_Type')\n", "test_y = test_df['Cover_Type']\n", "test_X.head(2)" ] }, { "cell_type": "code", "execution_count": 23, 
"id": "24c7b22f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "45" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "del df, train_df, test_df\n", "gc.collect()" ] }, { "cell_type": "code", "execution_count": 24, "id": "869777ba", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.9442352309418738\n", "Wall time: 30min 31s\n" ] } ], "source": [ "%%time\n", "random_forest = RandomForestClassifier(n_estimators=500,oob_score=True)\n", "random_forest.fit(train_X, train_y)\n", "pred_y = random_forest.predict(test_X)\n", "print(accuracy_score(pred_y,test_y)) # RF" ] }, { "cell_type": "markdown", "id": "3739a43c", "metadata": {}, "source": [ "Judging from these results, on this dataset both the full generated feature set (accuracy 0.9442) and the generated-then-filtered set (0.9440) perform worse than the original features (0.9673). Friends I asked who tried the approach also reported mediocre results, yet it has many upvotes on Kaggle. If you have seen it improve results on some dataset, please let me know." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 5 }