You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
ML-For-Beginners/translations/bg/7-TimeSeries/3-SVR/working/notebook.ipynb

705 lines
17 KiB

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "fv9OoQsMFk5A"
},
"source": [
"# Прогнозиране на времеви редове с помощта на регресор с опорни вектори\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"В този тефтер ще демонстрираме как да:\n",
"\n",
"- подготвим двумерни времеви редове за обучение на SVM регресионен модел\n",
"- реализираме SVR с помощта на RBF ядро\n",
"- оценим модела чрез графики и MAPE\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Импортиране на модули\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"sys.path.append('../../')"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "M687KNlQFp0-"
},
"outputs": [],
"source": [
"import os\n",
"import warnings\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas as pd\n",
"import datetime as dt\n",
"import math\n",
"\n",
"from sklearn.svm import SVR\n",
"from sklearn.preprocessing import MinMaxScaler\n",
"from common.utils import load_data, mape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Cj-kfVdMGjWP"
},
"source": [
"## Подготовка на данни\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8fywSjC6GsRz"
},
"source": [
"### Зареди данни\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 363
},
"id": "aBDkEB11Fumg",
"outputId": "99cf7987-0509-4b73-8cc2-75d7da0d2740"
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>load</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2012-01-01 00:00:00</th>\n",
" <td>2698.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2012-01-01 01:00:00</th>\n",
" <td>2558.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2012-01-01 02:00:00</th>\n",
" <td>2444.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2012-01-01 03:00:00</th>\n",
" <td>2402.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2012-01-01 04:00:00</th>\n",
" <td>2403.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" load\n",
"2012-01-01 00:00:00 2698.0\n",
"2012-01-01 01:00:00 2558.0\n",
"2012-01-01 02:00:00 2444.0\n",
"2012-01-01 03:00:00 2402.0\n",
"2012-01-01 04:00:00 2403.0"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"energy = load_data('../../data')[['load']]\n",
"energy.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "O0BWP13rGnh4"
},
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 486
},
"id": "hGaNPKu_Gidk",
"outputId": "7f89b326-9057-4f49-efbe-cb100ebdf76d"
},
"outputs": [],
"source": [
"energy.plot(y='load', subplots=True, figsize=(15, 8), fontsize=12)\n",
"plt.xlabel('timestamp', fontsize=12)\n",
"plt.ylabel('load', fontsize=12)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IPuNor4eGwYY"
},
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ysvsNyONGt0Q"
},
"outputs": [],
"source": [
"train_start_dt = '2014-11-01 00:00:00'\n",
"test_start_dt = '2014-12-30 00:00:00'"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 548
},
"id": "SsfdLoPyGy9w",
"outputId": "d6d6c25b-b1f4-47e5-91d1-707e043237d7"
},
"outputs": [],
"source": [
"energy[(energy.index < test_start_dt) & (energy.index >= train_start_dt)][['load']].rename(columns={'load':'train'}) \\\n",
" .join(energy[test_start_dt:][['load']].rename(columns={'load':'test'}), how='outer') \\\n",
" .plot(y=['train', 'test'], figsize=(15, 8), fontsize=12)\n",
"plt.xlabel('timestamp', fontsize=12)\n",
"plt.ylabel('load', fontsize=12)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XbFTqBw6G1Ch"
},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Сега трябва да подготвите данните за обучение, като извършите филтриране и мащабиране на данните си.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "cYivRdQpHDj3",
"outputId": "a138f746-461c-4fd6-bfa6-0cee094c4aa1"
},
"outputs": [],
"source": [
"train = energy.copy()[(energy.index >= train_start_dt) & (energy.index < test_start_dt)][['load']]\n",
"test = energy.copy()[energy.index >= test_start_dt][['load']]\n",
"\n",
"print('Training data shape: ', train.shape)\n",
"print('Test data shape: ', test.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Мащабирайте данните, за да бъдат в диапазона (0, 1).\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 363
},
"id": "3DNntGQnZX8G",
"outputId": "210046bc-7a66-4ccd-d70d-aa4a7309949c"
},
"outputs": [],
"source": [
"scaler = MinMaxScaler()\n",
"train['load'] = scaler.fit_transform(train)\n",
"train.head(5)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206
},
"id": "26Yht-rzZexe",
"outputId": "20326077-a38a-4e78-cc5b-6fd7af95d301"
},
"outputs": [],
"source": [
"test['load'] = scaler.transform(test)\n",
"test.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "x0n6jqxOQ41Z"
},
"source": [
"### Създаване на данни с времеви стъпки\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fdmxTZtOQ8xs"
},
"source": [
"За нашия SVR преобразуваме входните данни във формата `[batch, timesteps]`. Затова променяме формата на съществуващите `train_data` и `test_data`, така че да има ново измерение, което се отнася до времевите стъпки. В нашия пример вземаме `timesteps = 5`. Така че входовете към модела са данните за първите 4 времеви стъпки, а изходът ще бъдат данните за 5-тата времева стъпка.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Rpju-Sc2HFm0"
},
"outputs": [],
"source": [
"# Converting to numpy arrays\n",
"\n",
"train_data = train.values\n",
"test_data = test.values"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Selecting the timesteps\n",
"\n",
"timesteps=None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "O-JrsrsVJhUQ",
"outputId": "c90dbe71-bacc-4ec4-b452-f82fe5aefaef"
},
"outputs": [],
"source": [
"# Converting data to 2D tensor\n",
"\n",
"train_data_timesteps=None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "exJD8AI7KE4g",
"outputId": "ce90260c-f327-427d-80f2-77307b5a6318"
},
"outputs": [],
"source": [
"# Converting test data to 2D tensor\n",
"\n",
"test_data_timesteps=None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "2u0R2sIsLuq5"
},
"outputs": [],
"source": [
"x_train, y_train = None\n",
"x_test, y_test = None\n",
"\n",
"print(x_train.shape, y_train.shape)\n",
"print(x_test.shape, y_test.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8wIPOtAGLZlh"
},
"source": [
"## Създаване на SVR модел\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "EhA403BEPEiD"
},
"outputs": [],
"source": [
"# Create model using RBF kernel\n",
"\n",
"model = None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "GS0UA3csMbqp",
"outputId": "d86b6f05-5742-4c1d-c2db-c40510bd4f0d"
},
"outputs": [],
"source": [
"# Fit model on training data"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Rz_x8S3UrlcF"
},
"source": [
"### Направете прогноза на модела\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "XR0gnt3MnuYS",
"outputId": "157e40ab-9a23-4b66-a885-0d52a24b2364"
},
"outputs": [],
"source": [
"# Making predictions\n",
"\n",
"y_train_pred = None\n",
"y_test_pred = None"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_2epncg-SGzr"
},
"source": [
"## Анализиране на производителността на модела\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Scaling the predictions\n",
"\n",
"y_train_pred = scaler.inverse_transform(y_train_pred)\n",
"y_test_pred = scaler.inverse_transform(y_test_pred)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "xmm_YLXhq7gV",
"outputId": "18392f64-4029-49ac-c71a-a4e2411152a1"
},
"outputs": [],
"source": [
"# Scaling the original values\n",
"\n",
"y_train = scaler.inverse_transform(y_train)\n",
"y_test = scaler.inverse_transform(y_test)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "u3LBj93coHEi",
"outputId": "d4fd49e8-8c6e-4bb0-8ef9-ca0b26d725b4"
},
"outputs": [],
"source": [
"# Extract the timesteps for x-axis\n",
"\n",
"train_timestamps = None\n",
"test_timestamps = None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plt.figure(figsize=(25,6))\n",
"# plot original output\n",
"# plot predicted output\n",
"plt.legend(['Actual','Predicted'])\n",
"plt.xlabel('Timestamp')\n",
"plt.title(\"Training data prediction\")\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "LnhzcnYtXHCm",
"outputId": "f5f0d711-f18b-4788-ad21-d4470ea2c02b"
},
"outputs": [],
"source": [
"print('MAPE for training data: ', mape(y_train_pred, y_train)*100, '%')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 225
},
"id": "53Q02FoqQH4V",
"outputId": "53e2d59b-5075-4765-ad9e-aed56c966583"
},
"outputs": [],
"source": [
"plt.figure(figsize=(10,3))\n",
"# plot original output\n",
"# plot predicted output\n",
"plt.legend(['Actual','Predicted'])\n",
"plt.xlabel('Timestamp')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "clOAUH-SXCJG",
"outputId": "a3aa85ff-126a-4a4a-cd9e-90b9cc465ef5"
},
"outputs": [],
"source": [
"print('MAPE for testing data: ', mape(y_test_pred, y_test)*100, '%')"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DHlKvVCId5ue"
},
"source": [
"## Прогноза за пълния набор от данни\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "cOFJ45vreO0N",
"outputId": "35628e33-ecf9-4966-8036-f7ea86db6f16"
},
"outputs": [],
"source": [
"# Extracting load values as numpy array\n",
"data = None\n",
"\n",
"# Scaling\n",
"data = None\n",
"\n",
"# Transforming to 2D tensor as per model input requirement\n",
"data_timesteps=None\n",
"\n",
"# Selecting inputs and outputs from data\n",
"X, Y = None, None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ESSAdQgwexIi"
},
"outputs": [],
"source": [
"# Make model predictions\n",
"\n",
"# Inverse scale and reshape\n",
"Y_pred = None\n",
"Y = None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 328
},
"id": "M_qhihN0RVVX",
"outputId": "a89cb23e-1d35-437f-9d63-8b8907e12f80"
},
"outputs": [],
"source": [
"plt.figure(figsize=(30,8))\n",
"# plot original output\n",
"# plot predicted output\n",
"plt.legend(['Actual','Predicted'])\n",
"plt.xlabel('Timestamp')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "AcN7pMYXVGTK",
"outputId": "7e1c2161-47ce-496c-9d86-7ad9ae0df770"
},
"outputs": [],
"source": [
"print('MAPE: ', mape(Y_pred, Y)*100, '%')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Отказ от отговорност**: \nТози документ е преведен с помощта на AI услуга за превод [Co-op Translator](https://github.com/Azure/co-op-translator). Въпреки че се стремим към точност, моля, имайте предвид, че автоматизираните преводи може да съдържат грешки или неточности. Оригиналният документ на неговия роден език трябва да се счита за авторитетен източник. За критична информация се препоръчва професионален човешки превод. Ние не носим отговорност за недоразумения или погрешни интерпретации, произтичащи от използването на този превод.\n"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"name": "Recurrent_Neural_Networks.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.1"
},
"coopTranslator": {
"original_hash": "e86ce102239a14c44585623b9b924a74",
"translation_date": "2025-09-04T07:38:36+00:00",
"source_file": "7-TimeSeries/3-SVR/working/notebook.ipynb",
"language_code": "bg"
}
},
"nbformat": 4,
"nbformat_minor": 1
}