Sigmoid function



The sigmoid function (also known as the logistic function) is one of the most common activation functions in neural networks. Let's take a closer look at it today.

Function form

σ(x) = 1 / (1 + e^(−x))

Function image

[Figure: the S-shaped curve of the sigmoid function over (−10, 10)]

Code implementation

You can run the code on Colab:

import matplotlib.pyplot as plt
import numpy as np

# Sample 100 evenly spaced points in [-10, 10]
x = np.linspace(-10, 10, 100)
# Sigmoid: 1 / (1 + e^(-x))
z = 1 / (1 + np.exp(-x))

plt.plot(x, z)
plt.xlabel("x")
plt.ylabel("sigmoid(x)")
plt.show()
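One practical caveat with the direct formula: for large negative inputs, `np.exp(-x)` overflows (NumPy emits a RuntimeWarning, even though the result still rounds to the correct value). A common numerically stable variant, sketched below, branches on the sign of the input so the exponent is never positive:

```python
import numpy as np

def stable_sigmoid(x):
    # For x >= 0, e^(-x) <= 1 and cannot overflow.
    # For x < 0, rewrite as e^x / (1 + e^x) so the exponent stays non-positive.
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0
    out[pos] = 1 / (1 + np.exp(-x[pos]))
    exp_x = np.exp(x[~pos])
    out[~pos] = exp_x / (1 + exp_x)
    return out

print(stable_sigmoid(np.array([-1000.0, 0.0, 1000.0])))  # values: 0.0, 0.5, 1.0
```

Both branches are algebraically the same function; the rewrite only changes which exponential is evaluated.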


Properties and problems

  • The range of σ(x) is (0, 1), so it is often used for binary classification and for producing probability-like outputs.
  • The function is smooth and easy to differentiate; its derivative has the simple form σ'(x) = σ(x)(1 − σ(x)).
  • As an activation function, it is relatively expensive to compute because of the exponential.
  • When the input approaches positive or negative infinity, the function saturates and the gradient approaches 0, which causes the vanishing-gradient problem: as the number of network layers increases, the gradients computed by backpropagation shrink dramatically from the output layer toward the first few layers, so the derivatives of the overall loss with respect to the early-layer weights become very small. With gradient descent, the early layers then update very slowly and may fail to learn useful features.
  • Because σ(x) > 0 everywhere, the outputs are not zero-centered, so the weight gradients within a layer all share the same sign. Updates can therefore only move in one direction at a time, which may slow convergence.
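The saturation can be checked numerically. A small sketch, reusing the NumPy definition from above, evaluates the derivative σ'(x) = σ(x)(1 − σ(x)):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: sigma'(x) = sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1 - s)

x = np.linspace(-10, 10, 1000)
grad = sigmoid_grad(x)

# The derivative peaks at x = 0 with value 0.25 ...
print(grad.max())            # ~ 0.25
# ... and is nearly zero in the saturated regions
print(sigmoid_grad(10.0))    # ~ 4.5e-05
```

Since each layer multiplies the backpropagated gradient by a factor of at most 0.25, the gradient shrinks at least geometrically with depth, which is exactly the vanishing-gradient behavior described above.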


The sigmoid function is a very common activation function in neural networks, and it is also widely used in logistic regression and, more broadly, in statistics and machine learning.
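As a minimal illustration of its role in logistic regression, the sigmoid maps a linear score to a probability. The weights below are made-up numbers for demonstration, not trained values:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical parameters for a 2-feature logistic regression model
w = np.array([0.8, -0.5])
b = 0.1

def predict_proba(features):
    # The linear score passed through the sigmoid gives P(y = 1 | features)
    return sigmoid(features @ w + b)

p = predict_proba(np.array([2.0, 1.0]))
print(p)                 # a probability strictly between 0 and 1
print(int(p >= 0.5))     # classify with a 0.5 threshold
```

Training would fit `w` and `b` by minimizing the cross-entropy loss; here only the prediction step is shown.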

  • Original from: rais