You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
ML-For-Beginners/translations/ko/7-TimeSeries/3-SVR/working/notebook.ipynb

699 lines
16 KiB

{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "fv9OoQsMFk5A"
},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"이 노트북에서는 다음을 시연합니다:\n",
"\n",
"- SVM 회귀 모델 훈련을 위해 2D 시계열 데이터를 준비하는 방법 \n",
"- RBF 커널을 사용하여 SVR을 구현하는 방법 \n",
"- 플롯과 MAPE를 사용하여 모델을 평가하는 방법 \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 모듈 가져오기\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"sys.path.append('../../')"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "M687KNlQFp0-"
},
"outputs": [],
"source": [
"import os\n",
"import warnings\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas as pd\n",
"import datetime as dt\n",
"import math\n",
"\n",
"from sklearn.svm import SVR\n",
"from sklearn.preprocessing import MinMaxScaler\n",
"from common.utils import load_data, mape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Cj-kfVdMGjWP"
},
"source": [
"## 데이터 준비\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8fywSjC6GsRz"
},
"source": [
"### 데이터 로드\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 363
},
"id": "aBDkEB11Fumg",
"outputId": "99cf7987-0509-4b73-8cc2-75d7da0d2740"
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>load</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2012-01-01 00:00:00</th>\n",
" <td>2698.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2012-01-01 01:00:00</th>\n",
" <td>2558.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2012-01-01 02:00:00</th>\n",
" <td>2444.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2012-01-01 03:00:00</th>\n",
" <td>2402.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2012-01-01 04:00:00</th>\n",
" <td>2403.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" load\n",
"2012-01-01 00:00:00 2698.0\n",
"2012-01-01 01:00:00 2558.0\n",
"2012-01-01 02:00:00 2444.0\n",
"2012-01-01 03:00:00 2402.0\n",
"2012-01-01 04:00:00 2403.0"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"energy = load_data('../../data')[['load']]\n",
"energy.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "O0BWP13rGnh4"
},
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 486
},
"id": "hGaNPKu_Gidk",
"outputId": "7f89b326-9057-4f49-efbe-cb100ebdf76d"
},
"outputs": [],
"source": [
"energy.plot(y='load', subplots=True, figsize=(15, 8), fontsize=12)\n",
"plt.xlabel('timestamp', fontsize=12)\n",
"plt.ylabel('load', fontsize=12)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IPuNor4eGwYY"
},
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ysvsNyONGt0Q"
},
"outputs": [],
"source": [
"train_start_dt = '2014-11-01 00:00:00'\n",
"test_start_dt = '2014-12-30 00:00:00'"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 548
},
"id": "SsfdLoPyGy9w",
"outputId": "d6d6c25b-b1f4-47e5-91d1-707e043237d7"
},
"outputs": [],
"source": [
"energy[(energy.index < test_start_dt) & (energy.index >= train_start_dt)][['load']].rename(columns={'load':'train'}) \\\n",
" .join(energy[test_start_dt:][['load']].rename(columns={'load':'test'}), how='outer') \\\n",
" .plot(y=['train', 'test'], figsize=(15, 8), fontsize=12)\n",
"plt.xlabel('timestamp', fontsize=12)\n",
"plt.ylabel('load', fontsize=12)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XbFTqBw6G1Ch"
},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"이제 데이터를 필터링하고 스케일링하여 훈련을 위한 데이터를 준비해야 합니다.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "cYivRdQpHDj3",
"outputId": "a138f746-461c-4fd6-bfa6-0cee094c4aa1"
},
"outputs": [],
"source": [
"train = energy.copy()[(energy.index >= train_start_dt) & (energy.index < test_start_dt)][['load']]\n",
"test = energy.copy()[energy.index >= test_start_dt][['load']]\n",
"\n",
"print('Training data shape: ', train.shape)\n",
"print('Test data shape: ', test.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"데이터를 (0, 1) 범위로 조정하세요.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 363
},
"id": "3DNntGQnZX8G",
"outputId": "210046bc-7a66-4ccd-d70d-aa4a7309949c"
},
"outputs": [],
"source": [
"scaler = MinMaxScaler()\n",
"train['load'] = scaler.fit_transform(train)\n",
"train.head(5)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206
},
"id": "26Yht-rzZexe",
"outputId": "20326077-a38a-4e78-cc5b-6fd7af95d301"
},
"outputs": [],
"source": [
"test['load'] = scaler.transform(test)\n",
"test.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "x0n6jqxOQ41Z"
},
"source": [
"### 시간 단계로 데이터 생성\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fdmxTZtOQ8xs"
},
"source": [
"우리의 SVR에서는 입력 데이터를 `[batch, timesteps]` 형식으로 변환합니다. 따라서 기존의 `train_data`와 `test_data`를 재구성하여 timesteps를 나타내는 새로운 차원을 추가합니다. 예를 들어, 우리는 `timesteps = 5`로 설정합니다. 따라서 모델에 대한 입력은 첫 4개의 timesteps에 대한 데이터이고, 출력은 5번째 timestep에 대한 데이터가 됩니다.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Rpju-Sc2HFm0"
},
"outputs": [],
"source": [
"# Converting to numpy arrays\n",
"\n",
"train_data = train.values\n",
"test_data = test.values"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Selecting the timesteps\n",
"\n",
"timesteps=None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "O-JrsrsVJhUQ",
"outputId": "c90dbe71-bacc-4ec4-b452-f82fe5aefaef"
},
"outputs": [],
"source": [
"# Converting data to 2D tensor\n",
"\n",
"train_data_timesteps=None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "exJD8AI7KE4g",
"outputId": "ce90260c-f327-427d-80f2-77307b5a6318"
},
"outputs": [],
"source": [
"# Converting test data to 2D tensor\n",
"\n",
"test_data_timesteps=None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "2u0R2sIsLuq5"
},
"outputs": [],
"source": [
"x_train, y_train = None\n",
"x_test, y_test = None\n",
"\n",
"print(x_train.shape, y_train.shape)\n",
"print(x_test.shape, y_test.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8wIPOtAGLZlh"
},
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "EhA403BEPEiD"
},
"outputs": [],
"source": [
"# Create model using RBF kernel\n",
"\n",
"model = None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "GS0UA3csMbqp",
"outputId": "d86b6f05-5742-4c1d-c2db-c40510bd4f0d"
},
"outputs": [],
"source": [
"# Fit model on training data"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Rz_x8S3UrlcF"
},
"source": [
"### 모델 예측 만들기\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "XR0gnt3MnuYS",
"outputId": "157e40ab-9a23-4b66-a885-0d52a24b2364"
},
"outputs": [],
"source": [
"# Making predictions\n",
"\n",
"y_train_pred = None\n",
"y_test_pred = None"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_2epncg-SGzr"
},
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Scaling the predictions\n",
"\n",
"y_train_pred = scaler.inverse_transform(y_train_pred)\n",
"y_test_pred = scaler.inverse_transform(y_test_pred)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "xmm_YLXhq7gV",
"outputId": "18392f64-4029-49ac-c71a-a4e2411152a1"
},
"outputs": [],
"source": [
"# Scaling the original values\n",
"\n",
"y_train = scaler.inverse_transform(y_train)\n",
"y_test = scaler.inverse_transform(y_test)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "u3LBj93coHEi",
"outputId": "d4fd49e8-8c6e-4bb0-8ef9-ca0b26d725b4"
},
"outputs": [],
"source": [
"# Extract the timesteps for x-axis\n",
"\n",
"train_timestamps = None\n",
"test_timestamps = None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plt.figure(figsize=(25,6))\n",
"# plot original output\n",
"# plot predicted output\n",
"plt.legend(['Actual','Predicted'])\n",
"plt.xlabel('Timestamp')\n",
"plt.title(\"Training data prediction\")\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "LnhzcnYtXHCm",
"outputId": "f5f0d711-f18b-4788-ad21-d4470ea2c02b"
},
"outputs": [],
"source": [
"print('MAPE for training data: ', mape(y_train_pred, y_train)*100, '%')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 225
},
"id": "53Q02FoqQH4V",
"outputId": "53e2d59b-5075-4765-ad9e-aed56c966583"
},
"outputs": [],
"source": [
"plt.figure(figsize=(10,3))\n",
"# plot original output\n",
"# plot predicted output\n",
"plt.legend(['Actual','Predicted'])\n",
"plt.xlabel('Timestamp')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "clOAUH-SXCJG",
"outputId": "a3aa85ff-126a-4a4a-cd9e-90b9cc465ef5"
},
"outputs": [],
"source": [
"print('MAPE for testing data: ', mape(y_test_pred, y_test)*100, '%')"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DHlKvVCId5ue"
},
"source": [
"## 전체 데이터셋 예측\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "cOFJ45vreO0N",
"outputId": "35628e33-ecf9-4966-8036-f7ea86db6f16"
},
"outputs": [],
"source": [
"# Extracting load values as numpy array\n",
"data = None\n",
"\n",
"# Scaling\n",
"data = None\n",
"\n",
"# Transforming to 2D tensor as per model input requirement\n",
"data_timesteps=None\n",
"\n",
"# Selecting inputs and outputs from data\n",
"X, Y = None, None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ESSAdQgwexIi"
},
"outputs": [],
"source": [
"# Make model predictions\n",
"\n",
"# Inverse scale and reshape\n",
"Y_pred = None\n",
"Y = None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 328
},
"id": "M_qhihN0RVVX",
"outputId": "a89cb23e-1d35-437f-9d63-8b8907e12f80"
},
"outputs": [],
"source": [
"plt.figure(figsize=(30,8))\n",
"# plot original output\n",
"# plot predicted output\n",
"plt.legend(['Actual','Predicted'])\n",
"plt.xlabel('Timestamp')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "AcN7pMYXVGTK",
"outputId": "7e1c2161-47ce-496c-9d86-7ad9ae0df770"
},
"outputs": [],
"source": [
"print('MAPE: ', mape(Y_pred, Y)*100, '%')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**면책 조항**: \n이 문서는 AI 번역 서비스 [Co-op Translator](https://github.com/Azure/co-op-translator)를 사용하여 번역되었습니다. 정확성을 위해 최선을 다하고 있으나, 자동 번역에는 오류나 부정확성이 포함될 수 있습니다. 원본 문서를 해당 언어로 작성된 상태에서 권위 있는 자료로 간주해야 합니다. 중요한 정보의 경우, 전문적인 인간 번역을 권장합니다. 이 번역 사용으로 인해 발생하는 오해나 잘못된 해석에 대해 당사는 책임을 지지 않습니다. \n"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"name": "Recurrent_Neural_Networks.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.1"
},
"coopTranslator": {
"original_hash": "e86ce102239a14c44585623b9b924a74",
"translation_date": "2025-09-04T01:56:04+00:00",
"source_file": "7-TimeSeries/3-SVR/working/notebook.ipynb",
"language_code": "ko"
}
},
"nbformat": 4,
"nbformat_minor": 1
}