## Original link: http://tecdat.cn/?p=23544

Below is an example of how to use a Long Short-Term Memory (LSTM) network to fit an unstable time series.

Annual rainfall data can be quite volatile. Unlike temperature, which usually shows a clear trend across the four seasons, rainfall as a time series can be quite erratic. It is common to see as much rainfall in summer as in winter.

Below is an illustration of the rainfall for a region in November 2020.

As a recurrent neural network, the LSTM model can prove advantageous in capturing the volatility of a time series.

Using the Ljung-Box test, a p-value below 0.05 means the null hypothesis of independently distributed residuals is rejected: the residuals of this time series exhibit significant autocorrelation rather than a random pattern, pointing to pronounced volatility.

```
>>> sm.stats.acorr_ljungbox(res.resid, lags=[10])
```

### Ljung-Box test

### Dickey-Fuller test

# Data manipulation and model configuration

The dataset consists of 722 months of rainfall data.

712 data points were selected for training and validation, i.e. for building the LSTM model. The remaining 10 months were then held out as test data to compare against the LSTM model's predictions.

Below is a snippet of the dataset.

A matrix of datasets is then formed to regress the time series against past values.

```
# form the dataset matrix: regress each value on the `previous` values before it
dataX, dataY = [], []
for i in range(len(df) - previous - 1):
    a = df[i:(i + previous), 0]
    dataX.append(a)
    dataY.append(df[i + previous, 0])
```

The data were then normalized with MinMaxScaler.
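
A minimal sketch of this normalization step, using toy values in place of the article's rainfall series (`df` assumed to be a 2-D array as in the code above):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# toy rainfall values (mm); the real `df` holds 722 monthly observations
df = np.array([[120.0], [85.5], [200.1], [60.2], [150.0]])

scaler = MinMaxScaler(feature_range=(0, 1))
df_scaled = scaler.fit_transform(df)            # values now lie in [0, 1]
restored = scaler.inverse_transform(df_scaled)  # recover the original scale
```

Keeping a reference to the fitted scaler matters: the same `inverse_transform` is needed later to map the model's predictions back to millimetres of rainfall.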

Setting the previous parameter to 120, the training and validation datasets are established. For reference, previous = 120 indicates that the model uses past values from t – 120 to t – 1 to predict rainfall values at time t.

The choice of the previous parameter was a matter of experimentation, but 120 time periods were chosen to ensure that volatility or extreme values in the time series were captured.

```
# Split the training and validation data
train_size = int(len(df) * 0.8)
val_size = len(df) - train_size
train, val = df[0:train_size, :], df[train_size:len(df), :]

# Number of previous periods
previous = 120
```

Then, the input is transformed into the format of samples, time steps, features.

```
# Convert input to [samples, timesteps, features]
X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))
X_val = np.reshape(X_val, (X_val.shape[0], 1, X_val.shape[1]))
```

# Model training and prediction

The model was trained for 100 epochs with a specified batch size of 712 (equal to the number of data points in the training and validation sets).

```
# Generate the LSTM network (layer sizes are illustrative; the article does not show them)
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(4, input_shape=(1, previous)))
model.add(tf.keras.layers.Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(X_train, Y_train, validation_data=(X_val, Y_val),
                    epochs=100, batch_size=712)
# list all metrics recorded in history
print(history.history.keys())
# Summarize loss changes
plt.plot(history.history['loss'])
```

Below is a graph of the model loss for the training and validation sets.

A plot of predicted versus actual rainfall is also generated.

```
# draw all forecasts
plt.plot(valpredPlot)
```

The prediction results are compared to the validation set on the basis of mean directional accuracy (MDA), root mean square error (RMSE), and mean forecast error (MFE).

```
>>> mda(Y_val, predictions)
0.9090909090909091
>>> mse = mean_squared_error(Y_val, predictions)
>>> rmse = sqrt(mse)
>>> forecast_error = Y_val - predictions
>>> mean_forecast_error = np.mean(forecast_error)
```
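Note that `mda` is not a standard library function; a common implementation (a sketch, not necessarily the article's exact definition) scores the fraction of steps where the predicted direction of change matches the actual one:

```python
import numpy as np

def mda(actual, predicted):
    """Fraction of steps where the predicted change has the same sign as the actual change."""
    actual = np.asarray(actual)
    predicted = np.asarray(predicted)
    return float(np.mean(np.sign(actual[1:] - actual[:-1])
                         == np.sign(predicted[1:] - predicted[:-1])))

# toy check: both series move up, down, up
a = np.array([1.0, 2.0, 1.5, 3.0])
p = np.array([1.1, 1.9, 1.7, 2.5])
print(mda(a, p))  # -> 1.0
```

MDA is useful alongside RMSE because a model can have small absolute errors yet still miss turning points in a volatile series.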

**MDA:** 0.909 · **RMSE:** 48.5 · **MFE:** -1.77

# Make predictions on test data

While the validation set results are quite impressive, it is only by comparing the model predictions to test (or unseen) data that we can have reasonable confidence in the predictive power of the LSTM model.

As mentioned earlier, rainfall data from the last 10 months was used as the test set. The LSTM model was then used to predict rainfall 10 months into the future, and the predictions were compared with the actual values.

As in training, the previous 120 values (t − 120 to t − 1) are used to predict the value at time t.

```
# test (unseen) predictions: one window of `previous` values per test month
X_test = np.array([tseries.iloc[t - previous:t, 0]
                   for t in range(len(tseries) - 10, len(tseries))])
```
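Each test window is then reshaped to the same [samples, timesteps, features] layout used in training before being fed to `model.predict`. A hedged sketch with stand-in data (the names `tseries` and `previous` follow the earlier snippets):

```python
import numpy as np

previous, n_test = 120, 10
# stand-in for the scaled rainfall series of 722 months
tseries = np.arange(722, dtype=float)

# one `previous`-length window per test month
X_test = np.array([tseries[t - previous:t]
                   for t in range(len(tseries) - n_test, len(tseries))])
Y_test = tseries[-n_test:]

# reshape to [samples, timesteps, features] as in training
X_test = X_test.reshape((X_test.shape[0], 1, X_test.shape[1]))
# predictions = model.predict(X_test) would then yield the 10 forecasts,
# which must be passed through scaler.inverse_transform to get millimetres
```

This mirrors the training pipeline exactly, which is essential: a window length or scaling mismatch between training and test inputs would silently degrade the forecasts.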

The results obtained are as follows:

**MDA:** 0.8 · **RMSE:** 49.57 · **MFE:** -6.94

The average rainfall over the last 10 months was 148.93 mm, and prediction accuracy on the test set was similar to that on the validation set, with an error that is low relative to this average.

# Conclusion

In this example, you have seen:

- How to prepare data for an LSTM model
- How to build an LSTM model
- How to test the prediction accuracy of an LSTM
- The advantages of using LSTMs to model unstable time series
