ML-For-Beginners/translations/ru/7-TimeSeries/3-SVR/working/notebook.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "fv9OoQsMFk5A"
   },
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "В этом блокноте мы демонстрируем, как:\n",
    "\n",
    "- подготовить двумерные временные ряды для обучения модели регрессора SVM\n",
    "- реализовать SVR с использованием RBF ядра\n",
    "- оценить модель с помощью графиков и MAPE\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Импорт модулей\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "sys.path.append('../../')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "id": "M687KNlQFp0-"
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import warnings\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import datetime as dt\n",
    "import math\n",
    "\n",
    "from sklearn.svm import SVR\n",
    "from sklearn.preprocessing import MinMaxScaler\n",
    "from common.utils import load_data, mape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Cj-kfVdMGjWP"
   },
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "8fywSjC6GsRz"
   },
   "source": [
    "### Загрузить данные\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 363
    },
    "id": "aBDkEB11Fumg",
    "outputId": "99cf7987-0509-4b73-8cc2-75d7da0d2740"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>load</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2012-01-01 00:00:00</th>\n",
       "      <td>2698.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2012-01-01 01:00:00</th>\n",
       "      <td>2558.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2012-01-01 02:00:00</th>\n",
       "      <td>2444.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2012-01-01 03:00:00</th>\n",
       "      <td>2402.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2012-01-01 04:00:00</th>\n",
       "      <td>2403.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                       load\n",
       "2012-01-01 00:00:00  2698.0\n",
       "2012-01-01 01:00:00  2558.0\n",
       "2012-01-01 02:00:00  2444.0\n",
       "2012-01-01 03:00:00  2402.0\n",
       "2012-01-01 04:00:00  2403.0"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "energy = load_data('../../data')[['load']]\n",
    "energy.head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "O0BWP13rGnh4"
   },
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 486
    },
    "id": "hGaNPKu_Gidk",
    "outputId": "7f89b326-9057-4f49-efbe-cb100ebdf76d"
   },
   "outputs": [],
   "source": [
    "energy.plot(y='load', subplots=True, figsize=(15, 8), fontsize=12)\n",
    "plt.xlabel('timestamp', fontsize=12)\n",
    "plt.ylabel('load', fontsize=12)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "IPuNor4eGwYY"
   },
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ysvsNyONGt0Q"
   },
   "outputs": [],
   "source": [
    "train_start_dt = '2014-11-01 00:00:00'\n",
    "test_start_dt = '2014-12-30 00:00:00'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 548
    },
    "id": "SsfdLoPyGy9w",
    "outputId": "d6d6c25b-b1f4-47e5-91d1-707e043237d7"
   },
   "outputs": [],
   "source": [
    "energy[(energy.index < test_start_dt) & (energy.index >= train_start_dt)][['load']].rename(columns={'load':'train'}) \\\n",
    "    .join(energy[test_start_dt:][['load']].rename(columns={'load':'test'}), how='outer') \\\n",
    "    .plot(y=['train', 'test'], figsize=(15, 8), fontsize=12)\n",
    "plt.xlabel('timestamp', fontsize=12)\n",
    "plt.ylabel('load', fontsize=12)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "XbFTqBw6G1Ch"
   },
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Теперь вам нужно подготовить данные для обучения, выполняя фильтрацию и масштабирование ваших данных.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "cYivRdQpHDj3",
    "outputId": "a138f746-461c-4fd6-bfa6-0cee094c4aa1"
   },
   "outputs": [],
   "source": [
    "train = energy.copy()[(energy.index >= train_start_dt) & (energy.index < test_start_dt)][['load']]\n",
    "test = energy.copy()[energy.index >= test_start_dt][['load']]\n",
    "\n",
    "print('Training data shape: ', train.shape)\n",
    "print('Test data shape: ', test.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Масштабируйте данные, чтобы они находились в диапазоне (0, 1).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 363
    },
    "id": "3DNntGQnZX8G",
    "outputId": "210046bc-7a66-4ccd-d70d-aa4a7309949c"
   },
   "outputs": [],
   "source": [
    "scaler = MinMaxScaler()\n",
    "train['load'] = scaler.fit_transform(train)\n",
    "train.head(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "26Yht-rzZexe",
    "outputId": "20326077-a38a-4e78-cc5b-6fd7af95d301"
   },
   "outputs": [],
   "source": [
    "test['load'] = scaler.transform(test)\n",
    "test.head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "x0n6jqxOQ41Z"
   },
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "fdmxTZtOQ8xs"
   },
   "source": [
    "Для нашего SVR мы преобразуем входные данные в форму `[batch, timesteps]`. Таким образом, мы изменяем форму существующих `train_data` и `test_data`, добавляя новое измерение, которое соответствует временным шагам. В нашем примере мы берем `timesteps = 5`. Таким образом, входными данными для модели будут данные за первые 4 временных шага, а выходом будут данные за 5 временной шаг.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Rpju-Sc2HFm0"
   },
   "outputs": [],
   "source": [
    "# Converting to numpy arrays\n",
    "\n",
    "train_data = train.values\n",
    "test_data = test.values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Selecting the timesteps\n",
    "\n",
    "timesteps=None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "O-JrsrsVJhUQ",
    "outputId": "c90dbe71-bacc-4ec4-b452-f82fe5aefaef"
   },
   "outputs": [],
   "source": [
    "# Converting data to 2D tensor\n",
    "\n",
    "train_data_timesteps=None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "exJD8AI7KE4g",
    "outputId": "ce90260c-f327-427d-80f2-77307b5a6318"
   },
   "outputs": [],
   "source": [
    "# Converting test data to 2D tensor\n",
    "\n",
    "test_data_timesteps=None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "2u0R2sIsLuq5"
   },
   "outputs": [],
   "source": [
    "x_train, y_train = None\n",
    "x_test, y_test = None\n",
    "\n",
    "print(x_train.shape, y_train.shape)\n",
    "print(x_test.shape, y_test.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "8wIPOtAGLZlh"
   },
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "EhA403BEPEiD"
   },
   "outputs": [],
   "source": [
    "# Create model using RBF kernel\n",
    "\n",
    "model = None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "GS0UA3csMbqp",
    "outputId": "d86b6f05-5742-4c1d-c2db-c40510bd4f0d"
   },
   "outputs": [],
   "source": [
    "# Fit model on training data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Rz_x8S3UrlcF"
   },
   "source": [
    "### Сделать прогноз модели\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "XR0gnt3MnuYS",
    "outputId": "157e40ab-9a23-4b66-a885-0d52a24b2364"
   },
   "outputs": [],
   "source": [
    "# Making predictions\n",
    "\n",
    "y_train_pred = None\n",
    "y_test_pred = None"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "_2epncg-SGzr"
   },
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Scaling the predictions\n",
    "\n",
    "y_train_pred = scaler.inverse_transform(y_train_pred)\n",
    "y_test_pred = scaler.inverse_transform(y_test_pred)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "xmm_YLXhq7gV",
    "outputId": "18392f64-4029-49ac-c71a-a4e2411152a1"
   },
   "outputs": [],
   "source": [
    "# Scaling the original values\n",
    "\n",
    "y_train = scaler.inverse_transform(y_train)\n",
    "y_test = scaler.inverse_transform(y_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "u3LBj93coHEi",
    "outputId": "d4fd49e8-8c6e-4bb0-8ef9-ca0b26d725b4"
   },
   "outputs": [],
   "source": [
    "# Extract the timesteps for x-axis\n",
    "\n",
    "train_timestamps = None\n",
    "test_timestamps = None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(25,6))\n",
    "# plot original output\n",
    "# plot predicted output\n",
    "plt.legend(['Actual','Predicted'])\n",
    "plt.xlabel('Timestamp')\n",
    "plt.title(\"Training data prediction\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "LnhzcnYtXHCm",
    "outputId": "f5f0d711-f18b-4788-ad21-d4470ea2c02b"
   },
   "outputs": [],
   "source": [
    "print('MAPE for training data: ', mape(y_train_pred, y_train)*100, '%')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 225
    },
    "id": "53Q02FoqQH4V",
    "outputId": "53e2d59b-5075-4765-ad9e-aed56c966583"
   },
   "outputs": [],
   "source": [
    "plt.figure(figsize=(10,3))\n",
    "# plot original output\n",
    "# plot predicted output\n",
    "plt.legend(['Actual','Predicted'])\n",
    "plt.xlabel('Timestamp')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "clOAUH-SXCJG",
    "outputId": "a3aa85ff-126a-4a4a-cd9e-90b9cc465ef5"
   },
   "outputs": [],
   "source": [
    "print('MAPE for testing data: ', mape(y_test_pred, y_test)*100, '%')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "DHlKvVCId5ue"
   },
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "cOFJ45vreO0N",
    "outputId": "35628e33-ecf9-4966-8036-f7ea86db6f16"
   },
   "outputs": [],
   "source": [
    "# Extracting load values as numpy array\n",
    "data = None\n",
    "\n",
    "# Scaling\n",
    "data = None\n",
    "\n",
    "# Transforming to 2D tensor as per model input requirement\n",
    "data_timesteps=None\n",
    "\n",
    "# Selecting inputs and outputs from data\n",
    "X, Y = None, None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ESSAdQgwexIi"
   },
   "outputs": [],
   "source": [
    "# Make model predictions\n",
    "\n",
    "# Inverse scale and reshape\n",
    "Y_pred = None\n",
    "Y = None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 328
    },
    "id": "M_qhihN0RVVX",
    "outputId": "a89cb23e-1d35-437f-9d63-8b8907e12f80"
   },
   "outputs": [],
   "source": [
    "plt.figure(figsize=(30,8))\n",
    "# plot original output\n",
    "# plot predicted output\n",
    "plt.legend(['Actual','Predicted'])\n",
    "plt.xlabel('Timestamp')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "AcN7pMYXVGTK",
    "outputId": "7e1c2161-47ce-496c-9d86-7ad9ae0df770"
   },
   "outputs": [],
   "source": [
    "print('MAPE: ', mape(Y_pred, Y)*100, '%')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n---\n\n**Отказ от ответственности**:  \nЭтот документ был переведен с помощью сервиса автоматического перевода [Co-op Translator](https://github.com/Azure/co-op-translator). Хотя мы стремимся к точности, пожалуйста, учитывайте, что автоматические переводы могут содержать ошибки или неточности. Оригинальный документ на его исходном языке следует считать авторитетным источником. Для получения критически важной информации рекомендуется профессиональный перевод человеком. Мы не несем ответственности за любые недоразумения или неправильные интерпретации, возникшие в результате использования данного перевода.\n"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "Recurrent_Neural_Networks.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.1"
  },
  "coopTranslator": {
   "original_hash": "e86ce102239a14c44585623b9b924a74",
   "translation_date": "2025-08-29T23:24:15+00:00",
   "source_file": "7-TimeSeries/3-SVR/working/notebook.ipynb",
   "language_code": "ru"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}