Time: 2020-10-30

# Backpropagation in Practice

This implementation is based on the backpropagation (BP) formulas for neural networks derived in the previous article. If you are not familiar with those formulas, we strongly recommend reading that article first.
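For reference, the two gradient-propagation rules the code below relies on can be written as follows (using $o^{(l)}$ for the output of layer $l$, $\sigma'$ for the activation derivative evaluated at the output, and $\eta$ for the learning rate):

```latex
% Output layer L: error times activation derivative
\delta^{(L)} = \left(y - o^{(L)}\right) \odot \sigma'\!\left(o^{(L)}\right)

% Hidden layer l: propagate the next layer's delta back through its weights
\delta^{(l)} = \left(W^{(l+1)} \, \delta^{(l+1)}\right) \odot \sigma'\!\left(o^{(l)}\right)

% Weight update; delta already carries the negative sign of the MSE gradient,
% which is why the code below updates with a plus sign
W^{(l)} \leftarrow W^{(l)} + \eta \; o^{(l-1)} \, \delta^{(l)\top}
```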

We will implement a `4`-layer fully connected network to solve a binary classification task. The network has `2` input nodes; the three hidden layers have `25`, `50`, and `25` nodes respectively; and the output layer has two nodes, representing the probabilities of class `1` and class `2`. A `Softmax` function is not used here to constrain the sum of the output probabilities; instead, the mean squared error between the network output and the `one-hot` encoded ground-truth labels is computed directly. All layers use the `Sigmoid` activation function. These design choices let us apply our gradient-propagation formulas directly.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
```

## 1. Prepare data

```python
X, y = datasets.make_moons(n_samples=1000, noise=0.2, random_state=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
print(X.shape, y.shape)  # (1000, 2) (1000,)
```
```
(1000, 2) (1000,)
```
```python
def make_plot(X, y, plot_name):
    plt.figure(figsize=(12, 8))
    plt.title(plot_name, fontsize=30)
    plt.scatter(X[y==0, 0], X[y==0, 1])
    plt.scatter(X[y==1, 0], X[y==1, 1])
```
```python
make_plot(X, y, "Classification Dataset Visualization")
```

## 2. Network layer

• A new `Layer` class implements a single network layer; it takes the number of input nodes, the number of output nodes, the activation function type, and other parameters
• The weight tensor `weights` and bias tensor `bias` are generated and initialized automatically from the numbers of input and output nodes
```python
class Layer:
    # Fully connected network layer
    def __init__(self, n_input, n_output, activation=None, weights=None, bias=None):
        """
        :param int n_input: number of input nodes
        :param int n_output: number of output nodes
        :param str activation: activation function type
        :param weights: weight tensor, generated internally by the class by default
        :param bias: bias vector, generated internally by the class by default
        """
        self.weights = weights if weights is not None else np.random.randn(n_input, n_output) * np.sqrt(1 / n_output)
        self.bias = bias if bias is not None else np.random.rand(n_output) * 0.1
        self.activation = activation  # activation function type, e.g. 'sigmoid'
        self.activation_output = None  # output value o of the activation function
        self.error = None  # intermediate variable used to compute the current layer's delta
        self.delta = None  # records the current layer's delta, used to compute the gradient

    def activate(self, X):
        # Forward computation
        r = np.dot(X, self.weights) + self.bias  # X@W + b
        # Pass r through the activation function to get the layer output o (activation_output)
        self.activation_output = self._apply_activation(r)
        return self.activation_output

    def _apply_activation(self, r):
        # Compute the output of the activation function
        if self.activation is None:
            return r  # no activation function, return directly
        elif self.activation == 'relu':
            return np.maximum(r, 0)
        elif self.activation == 'tanh':
            return np.tanh(r)
        elif self.activation == 'sigmoid':
            return 1 / (1 + np.exp(-r))

        return r

    def apply_activation_derivative(self, r):
        # Compute the derivative of the activation function
        # Note: r here is the activation *output*, so the derivatives below
        # are expressed in terms of the output value
        # No activation function: the derivative is 1
        if self.activation is None:
            return np.ones_like(r)
        # Derivative of the relu function
        elif self.activation == 'relu':
            grad = np.array(r, copy=True)
            grad[r > 0] = 1.
            grad[r <= 0] = 0.
            return grad
        # Derivative of the tanh function
        elif self.activation == 'tanh':
            return 1 - r ** 2
        # Derivative of the sigmoid function
        elif self.activation == 'sigmoid':
            return r * (1 - r)
        return r
```
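As a quick sanity check (an addition, not part of the original article), we can verify numerically the sigmoid derivative identity σ'(r) = o(1 − o) that `apply_activation_derivative` relies on, where o is the activation output:

```python
import numpy as np

# Sigmoid, with its derivative expressed in terms of the *output* value,
# matching Layer.apply_activation_derivative
sigmoid = lambda r: 1 / (1 + np.exp(-r))

r = np.linspace(-3, 3, 7)           # pre-activations
o = sigmoid(r)                      # activation outputs
analytic = o * (1 - o)              # sigma'(r) written as o * (1 - o)

# Numerical derivative via central differences on the pre-activation
eps = 1e-5
numeric = (sigmoid(r + eps) - sigmoid(r - eps)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))  # close to 0
```

The two agree to roughly machine precision, which confirms why the class can compute derivatives from the cached `activation_output` without storing the pre-activation `r`.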

## 3. Network model

• After creating the single-layer class, we implement the `NeuralNetwork` class
• It internally maintains a list of `Layer` objects; network layers can be appended through the `add_layer` function, so that different network models can be created
```python
y_test.flatten().shape  # (300,)
```
```
(300,)
```
```python
class NeuralNetwork:
    def __init__(self):
        self._layers = []  # list of network layer objects

    def add_layer(self, layer):
        self._layers.append(layer)

    def feed_forward(self, X):
        # Forward propagation
        for layer in self._layers:
            X = layer.activate(X)
        return X

    def backpropagation(self, X, y, learning_rate):
        # Implementation of the backpropagation algorithm
        # Run a forward pass to get the final output value
        output = self.feed_forward(X)
        for i in reversed(range(len(self._layers))):  # loop backwards over the layers
            layer = self._layers[i]
            if layer == self._layers[-1]:  # output layer
                layer.error = y - output
                # Compute the delta of the last layer; see the output-layer gradient formula
                layer.delta = layer.error * layer.apply_activation_derivative(output)
            else:  # hidden layer
                next_layer = self._layers[i + 1]
                layer.error = np.dot(next_layer.weights, next_layer.delta)
                layer.delta = layer.error * layer.apply_activation_derivative(layer.activation_output)

        # Loop over the layers and update the weights
        for i in range(len(self._layers)):
            layer = self._layers[i]
            # o_i is the output of the previous network layer
            o_i = np.atleast_2d(X if i == 0 else self._layers[i - 1].activation_output)
            # delta already carries the negative sign of the gradient descent formula,
            # so a plus sign is used here
            layer.weights += layer.delta * o_i.T * learning_rate

    def train(self, X_train, X_test, y_train, y_test, learning_rate, max_epochs):
        # Network training function
        # One-hot encoding
        y_onehot = np.zeros((y_train.shape[0], 2))
        y_onehot[np.arange(y_train.shape[0]), y_train] = 1
        mses = []
        for i in range(max_epochs):  # train for max_epochs epochs
            for j in range(len(X_train)):  # train on one sample at a time
                self.backpropagation(X_train[j], y_onehot[j], learning_rate)
            if i % 10 == 0:
                # Print the MSE loss
                mse = np.mean(np.square(y_onehot - self.feed_forward(X_train)))
                mses.append(mse)
                print('Epoch: #%s, MSE: %f, Accuracy: %.2f%%' %
                      (i, float(mse), self.accuracy(self.predict(X_test), y_test.flatten()) * 100))

        return mses

    def accuracy(self, y_predict, y_test):
        # Compute the accuracy
        return np.sum(y_predict == y_test) / len(y_test)

    def predict(self, X_predict):
        # y_predict has shape [300, 2]; the second dimension holds the two output probabilities
        y_predict = self.feed_forward(X_predict)
        y_predict = np.argmax(y_predict, axis=1)
        return y_predict
```
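A useful way to validate the delta rule used in `backpropagation` is a numerical gradient check. The sketch below (an addition, not from the original article) rebuilds a minimal one-layer sigmoid network inline and compares the delta-based analytic gradient of the MSE loss against central-difference estimates:

```python
import numpy as np

np.random.seed(0)

# Minimal one-layer sigmoid network: o = sigmoid(x @ W + b)
sigmoid = lambda r: 1 / (1 + np.exp(-r))
W = np.random.randn(2, 2) * 0.5
b = np.zeros(2)

x = np.array([0.3, -0.8])   # one input sample
y = np.array([1.0, 0.0])    # one-hot target

# Analytic gradient of the per-sample loss mean((y - o)^2) w.r.t. W,
# using the same output-layer delta as NeuralNetwork.backpropagation.
# delta points in the *descent* direction, so dL/dW = -(2/n) * outer(x, delta).
o = sigmoid(x @ W + b)
delta = (y - o) * o * (1 - o)
grad_analytic = -(2 / len(y)) * np.outer(x, delta)

def loss(W_):
    o_ = sigmoid(x @ W_ + b)
    return np.mean((y - o_) ** 2)

# Central-difference estimate of the same gradient, entry by entry
eps = 1e-5
grad_numeric = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp, Wm = W.copy(), W.copy()
        Wp[i, j] += eps
        Wm[i, j] -= eps
        grad_numeric[i, j] = (loss(Wp) - loss(Wm)) / (2 * eps)

print(np.max(np.abs(grad_analytic - grad_numeric)))  # close to 0
```

The agreement also explains the plus sign in the article's update `W += delta * o.T * lr`: `delta` is the negated gradient direction, with the constant factor `2/n` absorbed into the learning rate.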

## 4. Network training

```python
nn = NeuralNetwork()  # instantiate the network class
nn.add_layer(Layer(2, 25, 'sigmoid'))   # hidden layer 1, 2 => 25
nn.add_layer(Layer(25, 50, 'sigmoid'))  # hidden layer 2, 25 => 50
nn.add_layer(Layer(50, 25, 'sigmoid'))  # hidden layer 3, 50 => 25
nn.add_layer(Layer(25, 2, 'sigmoid'))   # output layer, 25 => 2
```
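For a sense of the model size, the layer shapes above imply the following parameter count (a quick computation added here, not from the original article):

```python
# (n_input, n_output) pairs for the four layers: 2 -> 25 -> 50 -> 25 -> 2
shapes = [(2, 25), (25, 50), (50, 25), (25, 2)]

# Each layer holds an n_input x n_output weight matrix plus n_output biases
params = sum(n_in * n_out + n_out for n_in, n_out in shapes)
print(params)  # 2702
```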
``# nn.train(X_train, X_test, y_train, y_test, learning_rate=0.01, max_epochs=50)``
```python
def plot_decision_boundary(model, axis):
    # axis = [x_min, x_max, y_min, y_max]
    x0, x1 = np.meshgrid(
        np.linspace(axis[0], axis[1], int((axis[1] - axis[0]) * 100)).reshape(1, -1),
        np.linspace(axis[2], axis[3], int((axis[3] - axis[2]) * 100)).reshape(-1, 1)
    )
    X_new = np.c_[x0.ravel(), x1.ravel()]

    y_predict = model.predict(X_new)
    zz = y_predict.reshape(x0.shape)

    from matplotlib.colors import ListedColormap
    custom_cmap = ListedColormap(['#EF9A9A', '#FFF590', '#90CAF9'])

    plt.contourf(x0, x1, zz, cmap=custom_cmap)
```
```python
plt.figure(figsize=(12, 8))
plot_decision_boundary(nn, [-2, 2.5, -1, 2])
plt.scatter(X[y==0, 0], X[y==0, 1])
plt.scatter(X[y==1, 0], X[y==1, 1])
```
```python
y_predict = nn.predict(X_test)
```
```python
y_predict[:10]  # array([1, 1, 0, 1, 0, 0, 0, 1, 1, 1], dtype=int64)
```
```
array([1, 1, 0, 1, 0, 0, 0, 1, 1, 1], dtype=int64)
```
```python
y_test[:10]  # array([1, 1, 0, 1, 0, 0, 0, 1, 1, 1], dtype=int64)
```
```
array([1, 1, 0, 1, 0, 0, 0, 1, 1, 1], dtype=int64)
```
```python
nn.accuracy(y_predict, y_test.flatten())  # 0.86
```
```
0.86
```
