Tuoduan tecdat|Python uses LSTM long short-term memory neural network to predict and analyze unstable rainfall time series

Time:2022-8-7

Original link:http://tecdat.cn/?p=23544 

Below is an example of how to use a Long Short-Term Memory (LSTM) network to fit an unstable time series.

Annual rainfall data can be quite volatile. Unlike temperature, which usually shows a clear trend across the four seasons, rainfall as a time series can be quite erratic. It is common to see as much rainfall in summer as in winter.

Below is an illustration of the rainfall for a region in November 2020.


As a sequential (recurrent) neural network, an LSTM model can prove advantageous in accounting for the volatility of a time series.

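A Ljung-Box test on the residuals gives a p-value below 0.05. This rejects the null hypothesis that the residuals are independently distributed: they remain autocorrelated, which is consistent with significant volatility that the fitted model does not capture.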

>>> sm.stats.acorr_ljungbox(res.resid, lags=[10])
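For intuition about what the test measures, the Ljung-Box statistic can be computed by hand as Q = n(n+2) * sum over k = 1..h of rho_k^2 / (n - k), where rho_k is the sample autocorrelation at lag k. The sketch below is a plain NumPy illustration on synthetic data, not the article's code: the helper name ljung_box_q and both synthetic series are assumptions made for the example. Under the null hypothesis of no autocorrelation, Q is approximately chi-squared with h degrees of freedom, so a large Q corresponds to a small p-value.

```python
import numpy as np

def ljung_box_q(x, lags=10):
    """Ljung-Box Q statistic for lags 1..lags (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    total = 0.0
    for k in range(1, lags + 1):
        rho_k = np.sum(xc[k:] * xc[:-k]) / denom  # sample autocorrelation at lag k
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total

rng = np.random.default_rng(0)
white_noise = rng.normal(size=500)     # independent draws, no autocorrelation
random_walk = np.cumsum(white_noise)   # strongly autocorrelated by construction

q_noise = ljung_box_q(white_noise)
q_walk = ljung_box_q(random_walk)
print(q_noise, q_walk)
```

The autocorrelated series should produce a far larger Q than the white noise, which is the situation a p-value below 0.05 points to.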

Ljung-Box test


Dickey-Fuller test


Data manipulation and model configuration

The dataset consists of 722 months of rainfall data.

712 data points were selected for training and validation, i.e. for building the LSTM model. The data from the last 10 months was then held out as test data to compare with the predictions of the LSTM model.

Below is a snippet of the dataset.


A dataset matrix is then formed so that the time series can be regressed against its own past values.

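# form the dataset matrix (wrapped in a helper; the name create_dataset is illustrative)
import numpy as np

def create_dataset(df, previous=1):
    dataX, dataY = [], []
    for i in range(len(df)-previous-1):
        a = df[i:(i+previous), 0]          # window of `previous` past values
        dataX.append(a)
        dataY.append(df[i + previous, 0])  # value that follows the window
    return np.array(dataX), np.array(dataY)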
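To see what this loop produces, here is a small worked example on a toy series. It is a sketch only: the helper name create_dataset and the six-value input are assumptions made for illustration, and the loop bounds mirror the snippet above.

```python
import numpy as np

def create_dataset(data, previous=1):
    """Build (window, next value) pairs from a 2-D column array."""
    X, y = [], []
    for i in range(len(data) - previous - 1):
        X.append(data[i:i + previous, 0])   # `previous` consecutive past values
        y.append(data[i + previous, 0])     # the value right after the window
    return np.array(X), np.array(y)

toy = np.arange(6, dtype=float).reshape(-1, 1)   # column vector 0, 1, ..., 5
X, y = create_dataset(toy, previous=3)
print(X)   # windows [0, 1, 2] and [1, 2, 3]
print(y)   # targets 3 and 4
```

Note that with the `- 1` in the range, the last possible pair (window [2, 3, 4], target 5) is not generated, so the matrix has one row fewer than the series would allow.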

The data were then normalized with MinMaxScaler.
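Min-max scaling maps each value x to (x - min) / (max - min). The sketch below reproduces that arithmetic in plain NumPy, which is what scikit-learn's MinMaxScaler computes with its default feature range of 0 to 1. The rainfall values are made up, and taking the minimum and maximum from the training portion only is an assumption here (a common practice that the article does not spell out).

```python
import numpy as np

train = np.array([[10.0], [50.0], [30.0]])   # made-up training values
val = np.array([[20.0], [60.0]])             # made-up validation values

# Scaling statistics come from the training data only (assumption).
lo, hi = train.min(axis=0), train.max(axis=0)

def scale(x):
    return (x - lo) / (hi - lo)

def unscale(x_scaled):
    # Inverse transform, needed to report predictions in original units.
    return x_scaled * (hi - lo) + lo

train_scaled = scale(train)
val_scaled = scale(val)
print(train_scaled.ravel())   # 0.0, 1.0, 0.5
print(val_scaled.ravel())     # 0.25, 1.25 (values outside the training range exceed 1)
```

Predictions made on scaled inputs have to be passed through the inverse transform before error metrics in rainfall units are computed.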


Setting the previous parameter to 120, the training and validation datasets are established. For reference, previous = 120 means that the model uses the past values from t - 120 to t - 1 to predict the rainfall value at time t.

The value of the previous parameter was chosen by experimentation; 120 time periods were used to ensure that volatility and extreme values in the time series are captured.

# Split the training and validation data
train_size = int(len(df) * 0.8)
val_size = len(df) - train_size
train, val = df[0:train_size, :], df[train_size:len(df), :]

# Number of previous periods
previous = 120

Then, the input is transformed into the format of samples, time steps, features.

# Convert input to [samples, timesteps, features].
X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))
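To make the target shape concrete, the sketch below reshapes a dummy window matrix. The array of zeros is a stand-in (an assumption for illustration); only the shapes matter. Each sample is presented to the network as a single time step carrying 120 features.

```python
import numpy as np

previous = 120
X_train = np.zeros((5, previous))   # 5 dummy samples, each with 120 lagged values

# [samples, features] -> [samples, timesteps, features] with one time step
X_train = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
print(X_train.shape)   # (5, 1, 120)
```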

Model training and prediction

The model was trained for 100 epochs with a specified batch size of 712 (equal to the number of data points in the training and validation sets).

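# Generate LSTM network (layer sizes, loss and optimizer are illustrative assumptions; the source omits them)
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(4, input_shape=(1, previous)))
model.add(tf.keras.layers.Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(X_train, Y_train, epochs=100, batch_size=712,
                    validation_data=(X_val, Y_val), verbose=2)
# list all data in history
print(history.history.keys())
# Plot training and validation loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])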

Below is a graph of the model loss for the training and validation sets.


A plot of predicted versus actual rainfall is also generated.

# draw all forecasts
plt.plot(valpredPlot)


The predictions are compared with the validation set on the basis of Mean Directional Accuracy (MDA), Root Mean Squared Error (RMSE), and Mean Forecast Error (MFE).

>>> mda(Y_val, predictions)
0.9090909090909091
>>> mse = mean_squared_error(Y_val, predictions)
>>> rmse = sqrt(mse)
>>> forecast_error
>>> mean_forecast_error = np.mean(forecast_error)


  • MDA: 0.909
  • RMSE: 48.5
  • MFE: -1.77
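The three metrics can be sketched in a few lines of NumPy. These are common definitions, not necessarily the exact ones behind the figures above: in particular, the sign convention for the forecast error (predicted minus actual, so a negative MFE means the model under-forecasts on average) and the MDA formulation (direction of change measured against the previous actual value) are assumptions. The sample numbers are made up.

```python
import numpy as np

def mda(actual, predicted):
    """Share of steps where the predicted direction of change matches the actual one."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    actual_dir = np.sign(actual[1:] - actual[:-1])
    predicted_dir = np.sign(predicted[1:] - actual[:-1])
    return float(np.mean(actual_dir == predicted_dir))

def rmse(actual, predicted):
    """Root mean squared error."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def mfe(actual, predicted):
    """Mean forecast error, predicted minus actual (assumed sign convention)."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.mean(diff))

actual = [100.0, 120.0, 90.0, 110.0]      # made-up observations
predicted = [105.0, 115.0, 125.0, 100.0]  # made-up forecasts

print(mda(actual, predicted))    # 2 of 3 directions correct
print(rmse(actual, predicted))
print(mfe(actual, predicted))    # 6.25
```

MDA ignores the size of the error and RMSE ignores its sign, so reporting MFE alongside them shows whether the forecasts are biased in one direction.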

Make predictions on test data

While the validation set results are quite impressive, it is only by comparing the model predictions to test (or unseen) data that we can have reasonable confidence in the predictive power of the LSTM model.

As mentioned earlier, rainfall data from the last 10 months was held out as the test set. The LSTM model was then used to forecast those 10 months, and the predictions were compared with the actual values.


The previous 120 values, from t - 120 to t - 1, are used to predict the value at time t.

# test (unseen) predictions
# (an array of tseries.iloc windows; the slice bounds are cut off in this snippet)
np.array([tseries.iloc...
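One way to build such windows is sketched below. It is an assumption-laden illustration rather than the article's code: a dummy 722-value array stands in for the rainfall series, the names series, horizon and X_test are hypothetical, and each window is filled with observed values (one-step-ahead forecasting) rather than with earlier predictions. Scaling is omitted for brevity.

```python
import numpy as np

previous, horizon = 120, 10
series = np.arange(722, dtype=float)   # stand-in for the 722 monthly values

# One window per test month: the 120 values immediately before each target.
first_target = len(series) - horizon   # index 712, the first held-out month
X_test = np.array([series[t - previous:t]
                   for t in range(first_target, len(series))])
y_test = series[first_target:]

# Same [samples, timesteps, features] layout as the training input.
X_test = X_test.reshape(X_test.shape[0], 1, previous)
print(X_test.shape, y_test.shape)   # (10, 1, 120) (10,)
```

In a real run the windows would be scaled with the same scaler as the training data before being passed to model.predict, and the outputs would be inverse-transformed back to millimetres.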

The results obtained are as follows:

  • MDA: 0.8
  • RMSE: 49.57
  • MFE: -6.94

Mean rainfall over the 10 test months was 148.93 mm. Prediction accuracy was similar to that on the validation set, and the error was low relative to this test-set mean.

Conclusion

In this example, you have seen:

  • How to prepare data for LSTM models
  • How to build an LSTM model
  • How to test the prediction accuracy of an LSTM
  • Advantages of using LSTMs for modeling unstable time series
