In the previous lesson, you learned how to use the ARIMA model to make time series predictions. Now you'll look at the Support Vector Regressor, a regression model used to predict continuous data.
## Introduction
In this lesson, you will discover a specific way to build models with [**SVM**: **S**upport **V**ector **M**achine](https://en.wikipedia.org/wiki/Support-vector_machine) for regression, or **SVR: Support Vector Regressor**.
### SVR in the context of time series
Before understanding the importance of SVR in time series prediction, here are some of the important concepts that you need to know:
@ -45,7 +43,6 @@ The first few steps for data preparation are the same as that of the previous le
```python
energy = load_data('./data')[['load']]
energy.head(10)
```
5. Plot all the available energy data from January 2012 to December 2014. There should be no surprises as we saw this data in the last lesson:
@ -57,7 +54,7 @@ The first few steps for data preparation are the same as that of the previous le
plt.show()
```


Now, let's build our SVR model.
@ -101,26 +98,22 @@ Now, you need to prepare the data for training by performing filtering and scali
print('Test data shape: ', test.shape)
```
You can see the shape of the data:
```output
Training data shape: (1416, 1)
Test data shape: (48, 1)
```
2. Scale the training data to be in the range (0, 1).
```python
scaler = MinMaxScaler()
train['load'] = scaler.fit_transform(train)
train.head(10)
```
3. Now that the scaler has been fitted on the training data, use it to scale the test data:
```python
test['load'] = scaler.transform(test)
test.head()
```
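To see why the test data is transformed with the already-fitted scaler rather than refitted, here is a minimal NumPy sketch of min-max scaling. It assumes the default `MinMaxScaler` behaviour (feature range 0 to 1); the toy values and the helper `min_max_scale` are illustrative and not part of the lesson's code.

```python
import numpy as np

# Toy stand-ins for the train/test 'load' columns (values are made up).
train_load = np.array([2000.0, 2500.0, 3000.0, 4000.0])
test_load = np.array([2200.0, 4400.0])

# "Fit": learn the minimum and maximum from the training data only.
train_min, train_max = train_load.min(), train_load.max()

def min_max_scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1 (hypothetical helper)."""
    return (values - lo) / (hi - lo)

train_scaled = min_max_scale(train_load, train_min, train_max)
# "Transform": reuse the training statistics on the test data.
test_scaled = min_max_scale(test_load, train_min, train_max)

print(train_scaled)  # [0.   0.25 0.5  1.  ]
print(test_scaled)   # [0.1 1.2]
```

Note that a test value above the training maximum maps to a number greater than 1. That is expected, because the scaling range is defined by the training data alone.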
### Create data with time-steps
@ -129,7 +122,6 @@ For the SVR, you transform the input data to be of the form `[batch, timesteps]`
```python
# Converting to numpy arrays
train_data = train.values
test_data = test.values
```
@ -137,27 +129,34 @@ test_data = test.values
For this example, we take `timesteps = 5`. So, the inputs to the model are the data for the first 4 timesteps, and the output will be the data for the 5th timestep.
```python
# Selecting the timesteps
timesteps=5
```
Convert the training data to a 2D tensor of shape `[batch, timesteps]` using a nested list comprehension:
```python
train_data_timesteps=np.array([[j for j in train_data[i:i+timesteps]] for i in range(0,len(train_data)-timesteps+1)])[:,:,0]
train_data_timesteps.shape
```
```output
(1412, 5)
```
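The shape follows from the window count: 1416 training rows with `timesteps = 5` give 1416 - 5 + 1 = 1412 overlapping windows. The sketch below applies the same list comprehension to a small toy array (not the lesson's data) so the windows can be inspected by eye, and compares the result with NumPy's `sliding_window_view`, which is available in NumPy 1.20 and later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

timesteps = 5
# Toy column vector shaped like train_data, i.e. (n_samples, 1).
toy_data = np.arange(8, dtype=float).reshape(-1, 1)

# Same nested list comprehension as above, applied to the toy data.
windows = np.array(
    [[j for j in toy_data[i:i + timesteps]]
     for i in range(0, len(toy_data) - timesteps + 1)]
)[:, :, 0]

print(windows.shape)  # (4, 5): 8 - 5 + 1 = 4 windows
print(windows[0])     # [0. 1. 2. 3. 4.]
print(windows[1])     # [1. 2. 3. 4. 5.]

# Equivalent result using NumPy's built-in sliding window helper.
windows_alt = sliding_window_view(toy_data[:, 0], timesteps)
print(np.array_equal(windows, windows_alt))  # True
```

Each row is shifted by one step relative to the previous row, so consecutive windows share `timesteps - 1` values.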
Convert the testing data to a 2D tensor in the same way:
```python
test_data_timesteps=np.array([[j for j in test_data[i:i+timesteps]] for i in range(0,len(test_data)-timesteps+1)])[:,:,0]
test_data_timesteps.shape
```
```output
(44, 5)
```
Selecting inputs and outputs from training and testing data:
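A minimal sketch of this split on toy data, assuming (as described earlier) that the first `timesteps-1` columns of each window are the inputs and the last column is the output. The names `toy_windows`, `x_toy` and `y_toy` are illustrative only:

```python
import numpy as np

timesteps = 5
# Toy windowed data shaped [batch, timesteps], standing in for train_data_timesteps.
toy_windows = np.array([
    [0.1, 0.2, 0.3, 0.4, 0.5],
    [0.2, 0.3, 0.4, 0.5, 0.6],
    [0.3, 0.4, 0.5, 0.6, 0.7],
])

# Inputs: the first timesteps-1 columns; output: the last column (kept 2D).
x_toy = toy_windows[:, :timesteps - 1]
y_toy = toy_windows[:, [timesteps - 1]]

print(x_toy.shape, y_toy.shape)  # (3, 4) (3, 1)
```

Indexing with the list `[timesteps - 1]` keeps the output as a column of shape `(batch, 1)` instead of flattening it to one dimension.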
You've built your SVR! Now we need to evaluate it.
@ -207,74 +210,145 @@ You've built your SVR! Now we need to evaluate it.
For evaluation, we first scale the data back to its original scale. Then, to check the performance, we plot the original and predicted time series and print the MAPE result.
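Scaling back is the inverse of the min-max transformation applied earlier. With scikit-learn this is what `scaler.inverse_transform` does; the NumPy sketch below spells out the arithmetic, using hypothetical training minimum and maximum values and an illustrative helper name.

```python
import numpy as np

# Hypothetical training minimum and maximum learned when the scaler was fitted.
train_min, train_max = 2000.0, 4000.0

def inverse_min_max(scaled, lo, hi):
    """Undo min-max scaling: 0 -> lo and 1 -> hi (hypothetical helper)."""
    return scaled * (hi - lo) + lo

scaled_predictions = np.array([0.0, 0.25, 1.0])
original_units = inverse_min_max(scaled_predictions, train_min, train_max)
print(original_units)  # [2000. 2500. 4000.]
```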
#### Check model performance on training and testing data
We extract the timestamps from the dataset to show on the x-axis of our plot. Note that we are using the first `timesteps-1` values as our input for the first output, so the timestamps for the output will start after that.
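The offset can be checked on a small example. This sketch uses made-up hourly timestamps and illustrative variable names; it only shows that dropping the first `timesteps-1` timestamps leaves exactly one timestamp per window.

```python
import numpy as np

timesteps = 5
# Hypothetical hourly timestamps for 8 observations.
timestamps = np.arange('2014-11-01T00', '2014-11-01T08', dtype='datetime64[h]')

# Number of sliding windows of length `timesteps` over the series.
n_windows = len(timestamps) - timesteps + 1
# The first output corresponds to the last value of the first window.
output_timestamps = timestamps[timesteps - 1:]

print(len(output_timestamps) == n_windows)  # True
print(output_timestamps[0])                 # 2014-11-01T04
```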
```python
# Plot the actual values (Y) against the model's predictions (Y_pred)
plt.plot(Y, color = 'red', linewidth=2.0, alpha = 0.6)
plt.plot(Y_pred, color = 'blue', linewidth=0.8)
plt.legend(['Actual','Predicted'])
plt.xlabel('Timestamp')
plt.show()
```

```python
print('MAPE: ', mape(Y_pred, Y)*100, '%')
```
```output
MAPE: 2.0572089029888656 %
```
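MAPE is the mean absolute percentage error. The sketch below assumes the standard definition, the mean of |prediction - actual| / |actual|; the lesson's own `mape` helper may differ in detail, and `mape_sketch` is a hypothetical name used only here.

```python
import numpy as np

def mape_sketch(predictions, actuals):
    """Mean absolute percentage error, returned as a fraction (not a percent)."""
    predictions = np.asarray(predictions, dtype=float)
    actuals = np.asarray(actuals, dtype=float)
    return np.mean(np.abs(predictions - actuals) / np.abs(actuals))

actual = np.array([100.0, 200.0, 400.0])
predicted = np.array([110.0, 190.0, 400.0])

# Relative errors are 10%, 5% and 0%, so the mean is 5%.
print(round(mape_sketch(predicted, actual) * 100, 2), '%')  # 5.0 %
```

Because the error is divided by the actual value, MAPE is only meaningful when the actual values are not zero or close to zero.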
🏆 Very nice plots, showing a model with good accuracy. Well done!
---
## 🚀Challenge
- Try tweaking the hyperparameters (gamma, C, epsilon) when creating the model, and evaluate on the data to see which set of hyperparameters gives the best results on the testing data.
- Try to use different kernel functions for the model and analyze their performances on the dataset. A helpful document can be found [here](https://scikit-learn.org/stable/modules/svm.html#kernel-functions).
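One possible starting point for the first challenge is a small loop over hyperparameter combinations. This is a sketch under stated assumptions, not the lesson's solution: it uses a synthetic sine series in place of the energy data, an arbitrary grid of values, and an inline MAPE calculation.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic, already-scaled series standing in for the energy load data.
series = 0.5 + 0.4 * np.sin(np.linspace(0, 12 * np.pi, 300))

timesteps = 5
windows = np.array([series[i:i + timesteps]
                    for i in range(len(series) - timesteps + 1)])
x, y = windows[:, :timesteps - 1], windows[:, timesteps - 1]

# Chronological split: the last 48 windows act as the test set.
x_train, y_train, x_test, y_test = x[:-48], y[:-48], x[-48:], y[-48:]

results = []
for gamma in (0.1, 0.5):          # arbitrary example values
    for C in (1, 10):             # arbitrary example values
        for epsilon in (0.01, 0.05):
            model = SVR(kernel='rbf', gamma=gamma, C=C, epsilon=epsilon)
            model.fit(x_train, y_train)
            y_pred = model.predict(x_test)
            test_mape = np.mean(np.abs(y_pred - y_test) / np.abs(y_test))
            results.append(((gamma, C, epsilon), test_mape))

# Show the combinations from lowest to highest test MAPE.
for params, score in sorted(results, key=lambda item: item[1]):
    print(params, round(score * 100, 2), '%')
```

For the second challenge, the `kernel` argument can be changed in the same loop, for example to `'linear'` or `'poly'`.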