Summary: This article explains how many hidden layers to use in an artificial neural network (ANN), how many neurons each hidden layer should contain, what purpose hidden layers and neurons serve, and what happens when their number is increased.

Beginners in artificial neural networks (ANNs) are likely to run into a few questions. How many hidden layers should be used? How many neurons should each hidden layer contain? What is the purpose of using hidden layers and neurons? Does increasing the number of hidden layers and neurons always lead to better results?

It’s my pleasure to tell you that all these questions are answerable.

Artificial neural networks are inspired by biological neural networks. For simplicity, in computer science they are represented as a sequence of layers. These layers fall into three categories: input layer, hidden layer and output layer.

The number of input and output layers, and the number of neurons in each, is easy to determine. Every ANN has exactly one input layer and one output layer. The number of neurons in the input layer equals the number of input variables in the data being processed. The number of neurons in the output layer equals the number of outputs associated with each input. The real challenge is determining the number of hidden layers and the number of neurons each one contains.
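As a minimal sketch of the counting rule for the input and output layers, the snippet below reads both sizes straight off a dataset. The function name `input_output_sizes` and the toy data are assumptions made for illustration only.

```python
def input_output_sizes(samples, targets):
    """Return (input neurons, output neurons) implied by one dataset.

    samples: list of input vectors, one per sample
    targets: list of output vectors, one per sample
    """
    n_inputs = len(samples[0])    # one input neuron per input variable
    n_outputs = len(targets[0])   # one output neuron per output value
    return n_inputs, n_outputs


# Hypothetical toy data: two input variables, one class label per sample.
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
targets = [[0], [1], [1], [0]]

print(input_output_sizes(samples, targets))  # prints (2, 1)
```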

Here are some guidelines for determining the number of hidden layers, and the number of neurons in each hidden layer, for a classification problem:

1. Based on the data, draw an expected decision boundary that separates the classes.

2. Express the decision boundary as a set of straight lines. Note that the combination of these lines must follow the decision boundary.

3. The number of selected lines equals the number of hidden neurons in the first hidden layer.

4. To connect the lines created by the previous layer, add a new hidden layer. A new hidden layer is added each time connections need to be made among the lines of the previous hidden layer.

5. The number of hidden neurons in each new hidden layer equals the number of connections to be made.
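The guidelines above can be condensed into a small helper that turns the counts they produce into a list of layer sizes. This is only a sketch: the function name `layer_sizes` and its parameters are hypothetical, and the counts passed in the usage lines are the ones derived in Example 1 and Example 2 of this article.

```python
def layer_sizes(n_inputs, n_lines, connections_per_layer, n_outputs=1):
    """Assemble layer sizes from the guideline counts.

    n_lines: number of straight lines approximating the decision boundary
             (neurons in the first hidden layer)
    connections_per_layer: connections made by each additional hidden layer
             (empty if the output layer can make the final connection)
    """
    return [n_inputs, n_lines, *connections_per_layer, n_outputs]


# Example 1 of this article: two lines, merged directly by the output neuron.
print(layer_sizes(2, 2, []))    # prints [2, 2, 1]

# Example 2 of this article: four lines, then two connections, then the output.
print(layer_sizes(2, 4, [2]))   # prints [2, 4, 2, 1]
```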

In order to make things clearer, let’s use the previous methods and principles to give a few examples.

## Example 1

Let’s start with the simple classification problem in the following figure. Each sample has two inputs and one output to represent the class label, which is very similar to the XOR problem.

Figure 1

The first question to answer is whether a hidden layer is necessary at all. To answer it, apply the following rule:

In artificial neural networks, the hidden layer is necessary only when the data must be separated by non-linearity.

Looking at Figure 2, it appears that the classes must be separated non-linearly; a single line will not do. We therefore need a hidden layer to get the best decision boundary. We could still omit the hidden layer in this case, but classification accuracy would suffer, so it is better to use one.
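To see what a single line costs in accuracy, here is a brute-force sketch. It assumes the four XOR points as a stand-in for the data in the figure (the article only says the problem is very similar to XOR), and it searches a coarse grid of weights, an arbitrary choice made for illustration. No line on the grid classifies all four points correctly.

```python
from itertools import product

# Assumed stand-in for the figure's data: the four XOR points and their labels.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def linear_predict(w1, w2, b, x):
    """Classify x with a single straight line: w1*x1 + w2*x2 + b > 0."""
    return 1 if w1 * x[0] + w2 * x[1] + b > 0 else 0


def count_correct(w1, w2, b):
    """Number of samples the line classifies correctly."""
    return sum(linear_predict(w1, w2, b, x) == label for x, label in data)


# Candidate values -3.0, -2.5, ..., 3.0 for each weight and for the bias.
grid = [i / 2 for i in range(-6, 7)]
best = max(count_correct(w1, w2, b) for w1, w2, b in product(grid, repeat=3))

print(best)  # prints 3: at most 3 of the 4 points are classified correctly
```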

Knowing that we need a hidden layer, we must now answer two important questions:

1. How many hidden layers are needed?

2. What is the number of hidden neurons in each hidden layer?

The first step in the process is to draw the decision boundary that divides the two classes. At least one possible decision boundary correctly separates the data, as shown in the figure:

Figure 2

The idea of representing the decision boundary with a set of straight lines comes from the fact that any artificial neural network is built from single-layer perceptrons as building blocks. A single-layer perceptron is a linear classifier that separates the classes using a line given by the following equation:

y = w_1*x_1 + w_2*x_2 + ⋯ + w_n*x_n + b

Here x_i denotes the i-th input, w_i its weight, b the bias, y the output, and n the number of inputs. Because each added hidden neuron increases the number of weights, it is recommended to use the fewest hidden neurons that accomplish the task. Using more hidden neurons than needed only adds complexity.
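The perceptron equation and the growth in weights can both be checked with a few lines of code. This is a sketch with hypothetical function names; `count_parameters` assumes fully connected layers with one bias per neuron, which the article does not state explicitly.

```python
def perceptron_output(weights, bias, inputs):
    """Compute y = w_1*x_1 + ... + w_n*x_n + b for one perceptron."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def count_parameters(layer_sizes):
    """Total weights and biases, assuming fully connected layers."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


print(perceptron_output([0.5, -1.0], 0.25, [2.0, 1.0]))  # prints 0.25

# Adding a single hidden neuron to a 2-2-1 network adds four parameters.
print(count_parameters([2, 2, 1]))  # prints 9
print(count_parameters([2, 3, 1]))  # prints 13
```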

Back to our example: saying that an artificial neural network is built from multiple perceptrons is the same as saying that it is built from multiple straight lines.

In this example, the decision boundary is replaced by a set of straight lines. The lines start at the points where the boundary curve changes direction. At each such point, two lines are placed, each in a different direction.

Because the boundary curve changes direction at only one point, marked by the gray circle in the figure below, only two straight lines are required. In other words, there are two single-layer perceptrons, each of which produces one straight line.

Figure 3

Only two straight lines are needed to represent the decision boundary, which tells us that the first hidden layer will have two hidden neurons.

So far, we have one hidden layer with two hidden neurons. Each hidden neuron can be regarded as a linear classifier, represented as a straight line as shown in the figure above. The hidden layer therefore has two outputs, one from each classifier (that is, from each hidden neuron). However, we want a single classifier with one output representing the class label, not two classifiers. As a result, the outputs of the two hidden neurons must be merged into one output. In other words, the two lines will be connected by another neuron, as shown in the following figure.

Figure 4

Fortunately, we do not need to add another hidden layer with a single neuron to do this job; the output layer neuron does it. This neuron merges the two previously generated lines so that the artificial neural network produces only one output.

Now that we know the number of hidden layers and their neurons, the network architecture is complete, as shown in the following figure:

Figure 5
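The 2-2-1 architecture can be tried out with hand-picked weights. The sketch below makes several assumptions that are not in the article: a step activation, the four XOR points as data, and two specific lines (x1 + x2 = 0.5 and x1 + x2 = 1.5) chosen for convenience rather than taken from the figures. The output neuron merges the two hidden outputs into one class label.

```python
def step(z):
    """Assumed activation: 1 on the positive side of the line, else 0."""
    return 1 if z > 0 else 0


def neuron(weights, bias, inputs):
    """One perceptron followed by the step activation."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


# Hand-picked (weights, bias) pairs; illustrative values, not trained.
HIDDEN = [([1.0, 1.0], -0.5),   # line 1: x1 + x2 = 0.5
          ([1.0, 1.0], -1.5)]   # line 2: x1 + x2 = 1.5
OUTPUT = ([1.0, -1.0], -0.5)    # fires only between the two lines


def predict(x1, x2):
    """Forward pass through the 2-2-1 network."""
    hidden_out = [neuron(w, b, [x1, x2]) for w, b in HIDDEN]
    w, b = OUTPUT
    return neuron(w, b, hidden_out)


for point in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(point, predict(*point))   # labels printed: 0, 1, 1, 0
```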

## Example 2

Another classification example is shown in Figure 6. It is similar to the previous example: there are two classes, and each sample has two inputs and one output. The difference between the two examples is the decision boundary, which is more complex in Example 2 than in the previous example.

Figure 6

According to the guidelines, the first step is to draw the decision boundary, which is shown in Figure 7 (a) in our discussion.

The next step is to split the decision boundary into a set of straight lines, each of which will be modeled as a perceptron in the artificial neural network. Before drawing these lines, the points where the decision boundary changes direction should be marked, as shown in Figure 7 (b).

Figure 7

So how many straight lines do we need? The top and bottom points each have two lines associated with them, for a total of four. The middle point shares its two lines with the other points. The lines to be created are shown in Figure 8.

Figure 8
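The line counts in both examples follow a simple pattern, sketched below under the assumption that the boundary is a single open curve approximated by straight segments joined end to end. In that case neighbouring turning points share a segment, so the number of lines is one more than the number of turning points. The function name is hypothetical.

```python
def lines_for_boundary(n_turning_points):
    """Straight lines needed for an open boundary curve that changes
    direction n_turning_points times (adjacent points share a line)."""
    if n_turning_points < 0:
        raise ValueError("number of turning points cannot be negative")
    return n_turning_points + 1


print(lines_for_boundary(1))  # Example 1: one turning point, prints 2
print(lines_for_boundary(3))  # Example 2: three turning points, prints 4
```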

Because the first hidden layer has as many neurons as there are straight lines, it has four neurons. In other words, it contains four classifiers, each created by a single-layer perceptron. At this point the artificial neural network produces four outputs, one from each classifier. The next step is to connect these classifiers so that the network produces only a single output. In other words, the lines are connected by further hidden layers to form a single curve.

The layout of the network is up to the model designer. One feasible architecture is to build a second hidden layer with two hidden neurons. The first of these neurons connects the first two lines, and the second connects the last two lines. The result of the second hidden layer is shown in Figure 9.

Figure 9

At this point there are two separate curves, so the network still has two outputs. The next step is to link these curves together so that the whole artificial neural network produces only one output. In this case, the output layer neuron can make the final connection, so no new hidden layer is needed. The final result is shown in Figure 10.

Figure 10

After the completion of the network structure design, the complete network architecture is shown in Figure 11.

Figure 11
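To make the 2-4-2-1 layout concrete, the sketch below builds a fully connected network with those layer sizes and runs one forward pass. The weights are random and untrained, and the sigmoid activation is an assumption (the article does not name one), so only the shape of each layer's output is meaningful here, not the values.

```python
import math
import random


def build_network(layer_sizes, seed=0):
    """Random, untrained weights: one row per neuron, bias stored last."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]


def forward(network, inputs):
    """Return the activations of every layer, starting with the inputs."""
    activations = [list(inputs)]
    for layer in network:
        previous = activations[-1]
        outputs = []
        for neuron in layer:
            *weights, bias = neuron
            z = sum(w * x for w, x in zip(weights, previous)) + bias
            outputs.append(1 / (1 + math.exp(-z)))  # assumed sigmoid
        activations.append(outputs)
    return activations


network = build_network([2, 4, 2, 1])
activations = forward(network, [0.3, 0.7])
print([len(a) for a in activations])  # prints [2, 4, 2, 1]
```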

Author: [Direction]

This article is the original content of Yunqi Community, which can not be reproduced without permission.