Introduction: Academic, industrial, and media circles hold different views on how far artificial intelligence has developed. I often hear it said that artificial intelligence based on big data and deep learning is a completely new technological form, and that its emergence will comprehensively change the future shape of human society, because it can "learn" autonomously and thus largely replace human labor.
I think there are two misunderstandings here. First, deep learning is not a new technology; second, the "learning" involved in deep learning is not the same thing as human learning, because such a system cannot truly "deeply" understand the information it processes.
Deep learning is not a new technology
From the perspective of the history of technology, the predecessor of deep learning is "artificial neural network" technology (also known as "connectionism"), which enjoyed a wave of popularity in the 1980s.
The essence of this technology is to build, by mathematical modeling, a simple artificial neural network structure. A typical structure has three layers: an input unit layer, an intermediate unit layer, and an output unit layer. After the input layer receives information from the outside, each unit "decides," according to its built-in aggregation algorithm and activation function, whether to pass data on to the intermediate layer. The process resembles the way a human neuron, after receiving electrical pulses from other neurons, "decides" according to the change of potential within its own cell body whether to transmit a pulse to other neurons.
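The three-layer structure just described can be sketched in a few lines of code. This is only a minimal illustration, not any real system: the weights are arbitrary toy values, and a sigmoid stands in for each unit's built-in activation function.

```python
import math

def sigmoid(x):
    # Activation function: squashes the aggregated input into (0, 1),
    # modeling how strongly a unit "decides" to fire.
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_hidden, w_output):
    # Intermediate units aggregate the weighted inputs and apply the activation.
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs))) for ws in w_hidden]
    # Output units do the same with the intermediate activations.
    return [sigmoid(sum(w * h for w, h in zip(ws, hidden))) for ws in w_output]

# Two input units, two intermediate units, one output unit -- weights arbitrary.
out = forward([1.0, 0.0],
              [[0.5, -0.5], [0.3, 0.8]],
              [[1.0, -1.0]])
```

Note that the single output value reveals nothing about what task (face recognition, language, etc.) the weights were meant to serve, which is exactly the point made below.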
It should be noted that whether the overall task of the system is image recognition or natural language processing, an observer cannot tell the nature of that overall task from the operating state of any single computing unit. Rather, the system "breaks the whole into parts," decomposing the macro-level recognition task into micro-level information-transmission activities among its components, and the general trend reflected by these micro activities simulates the symbolic-level information processing of the human mind.
The basic method by which engineers adjust the trend of the system's micro information-transmission activities is as follows. First, the system is made to process the input information randomly, and the result is compared with the ideal output. If the two do not match well, the system triggers its "back-propagation algorithm" to adjust the connection weights between its computing units, so that the next output differs from the previous one. The larger the connection weight between two units, the more likely "co-excitation" is to occur between them, and vice versa. The system then compares the actual output with the ideal output again; if the agreement is still poor, it runs the back-propagation algorithm once more, and so on until the actual output and the ideal output match.
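The compare-and-adjust loop described above can be illustrated with a deliberately tiny example. For brevity the sketch below trains a single sigmoid unit rather than a multi-layer network, so the weight adjustment reduces to a plain gradient step rather than full back-propagation; the task (learning logical AND), the learning rate, and the epoch count are all arbitrary choices for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Training samples for logical AND: inputs (with a bias term) and ideal outputs.
samples = [([float(a), float(b), 1.0], float(a and b))
           for a in (0, 1) for b in (0, 1)]

weights = [0.0, 0.0, 0.0]  # "historical knowledge" starts at zero
lr = 0.5                   # learning rate: how strongly each error adjusts weights

for epoch in range(5000):
    for inputs, target in samples:
        actual = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
        error = target - actual  # compare actual output with the ideal output
        # Adjust each connection weight in proportion to its contribution
        # to the error, so the next output differs from the previous one.
        for i, x in enumerate(inputs):
            weights[i] += lr * error * actual * (1 - actual) * x

def predict(inp):
    # After training, outputs should be close to the ideal ones.
    return sigmoid(sum(w * x for w, x in zip(weights, inp + [1.0])))
```

The loop embodies the text's description: process, compare with the ideal output, adjust the weights, and repeat until the two agree.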
A system that has completed this training process can not only classify the training samples accurately, but also classify, with reasonable accuracy, input information that is close to the training samples. For example, if a system has been trained to recognize which photos in an existing library show Zhang San's face, then even a new photo of Zhang San that has never entered the library can quickly be recognized by the system as his face.
If the reader still finds the above technical description hard to follow, the following analogy may help clarify how artificial neural network technology operates. Suppose a foreigner who knows no Chinese goes to Shaolin Temple to learn martial arts; how should teaching proceed between master and apprentice? There are two situations. In the first, the two can communicate in language (the foreigner knows Chinese, or the Shaolin master knows a foreign language). The master can then teach his foreign apprentice directly by "stating rules." This method of instruction can be compared to the rule-based approach to artificial intelligence.
The other situation is that master and apprentice share no common language at all. How, then, can the apprentice learn martial arts? The only way is this: the apprentice first observes the master's movements and then imitates them, while the master indicates through simple bodily communication whether a movement is right (for example, smiling when it is right and scolding the apprentice when it is wrong). If the master approves a movement, the apprentice remembers it and continues learning; if not, the apprentice has to guess what went wrong, produce a new movement based on that guess, and again wait for the master's feedback, until the master is finally satisfied.

Obviously, learning martial arts this way is very inefficient, because the apprentice wastes a great deal of time guessing where his movements went wrong. Yet "blind guessing" is precisely the essence of how an artificial neural network operates. In short, such an AI system does not know what the input information it receives actually means; in other words, the system's designer cannot communicate with the system at the symbolic level, just as the master in the example cannot communicate with the apprentice in language. The reason computers can tolerate such inefficient learning is that they enjoy an enormous advantage over natural human beings: a computer can make a vast number of "guesses" in a very short span of physical time and then select a relatively correct solution. Once this mechanism is clearly understood, it is not hard to see that the working principle of artificial neural networks is actually rather clumsy.
"Deep learning" should rather be "deep-layered learning"
So why has "neural network technology" now acquired the successor name "deep learning"? What does the new name mean?
We have to admit that "deep learning" is a misleading term, because it induces many laypeople to believe that such artificial intelligence systems can, like human beings, "deeply" understand what they learn. The truth is that, by the standard of human "understanding," such a system cannot achieve even the shallowest understanding of the input information.
To avoid such misunderstanding, I favor renaming "deep learning" as "deep-layered learning." What the technology really amounts to is an upgrade of traditional artificial neural network technology, namely a great increase in the number of hidden unit layers. The advantage of this approach is that it refines the information-processing mechanism of the whole system, allowing more object features to settle in more intermediate layers.
For example, in a deep learning system for face recognition, the additional intermediate layers can handle features at different levels of abstraction (raw pixels, edges of color blocks, combinations of lines, the contours of facial features) with greater delicacy. Such fine-grained processing can greatly improve the recognition ability of the whole system.
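Mechanically, "more hidden layers" is just the same layer transformation applied repeatedly, as a short sketch makes clear. The weights below are arbitrary toy values, and the comments pairing layers with pixel/edge/line/contour features are only a loose analogy to the face-recognition example, not something this tiny network actually computes.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(values, weight_matrix):
    # One layer: each unit aggregates weighted inputs and applies the activation.
    return [sigmoid(sum(w * v for w, v in zip(row, values))) for row in weight_matrix]

def deep_forward(inputs, layers):
    # "Deep-layered": chain many hidden layers between input and output,
    # so that each layer can settle on progressively more abstract features.
    values = inputs
    for weight_matrix in layers:
        values = layer(values, weight_matrix)
    return values

# Four stacked layers with toy weights; a real face-recognition network
# would be vastly larger in both width and depth.
toy_layers = [
    [[0.2, -0.4], [0.7, 0.1]],   # analogy: pixel-level features
    [[0.5, 0.5], [-0.3, 0.9]],   # analogy: color-block edges
    [[0.6, -0.6], [0.4, 0.4]],   # analogy: line combinations
    [[1.0, -1.0]],               # analogy: contour-level verdict
]
out = deep_forward([1.0, 0.0], toy_layers)
```

Each extra entry in `toy_layers` is another hidden layer, which is why depth raises both the mathematical complexity and the hardware and data demands discussed next.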
But we must also see that the mathematical complexity and data diversity brought by such "depth" naturally place high demands on computer hardware and on the amount of training data. This explains why deep learning technology only became popular in the 21st century: the rapid development of computer hardware over the past decade, together with the huge volumes of data brought by the popularity of the Internet, provided the basic guarantee for implementing deep learning.
However, there are two bottlenecks hindering the further intellectualization of deep learning technology.
- First, once the system converges after training, its learning ability declines: it can no longer adjust its weights in response to new input. This falls short of our ideal, which is that even if the network converges prematurely because of limitations of the training sample library itself, it could still independently revise the original input-output mapping when facing new samples, in a way that takes both the old history and the new data into account. Existing technology cannot support this seemingly grand idea. All designers can do at present is reset the system's historical knowledge to zero, add the new samples to the sample library, and start training over from the beginning. Here, undoubtedly, we see once again the chilling "cycle of Sisyphus."
- Second, as the earlier example shows, in neural network deep learning for pattern recognition, designers spend great effort on the feature extraction of the original samples. Obviously, different designers will extract different features from the same original samples, leading to different modeling directions for the network. For human programmers this is a welcome opportunity to exercise creativity, but for the system itself it means being deprived of the chance to engage in creative activity. Imagine: could a deep learning architecture instead observe the original samples by itself, find an appropriate feature-extraction pattern, and design its own topology? That seems difficult, for it would require a meta-structure behind the structure, one capable of giving a reflective representation of the structure itself; and we are still in the dark about how such a meta-structure could be programmed, since at present it is human beings who perform its function. It is disappointing that, despite these basic defects of deep learning technology, the mainstream AI community has come to treat deep learning as if it were the whole of AI. An artificial intelligence technology based on small data, more flexible and more general, clearly requires further effort, and from a purely academic point of view we are still far from that goal.
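The workaround described under the first bottleneck, retraining from scratch, can be made concrete with a sketch. The trainer below is a hypothetical single-unit learner, not any real system; the point is only the shape of the "cycle of Sisyphus": when new samples arrive, the weights are reset to zero and training restarts over the merged sample library, instead of the converged network being revised incrementally.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(samples, epochs=5000, lr=0.5):
    # Minimal single-unit trainer: repeated gradient steps toward the
    # ideal outputs, standing in for a full network's training procedure.
    weights = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for inputs, target in samples:
            actual = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
            error = target - actual
            for i, x in enumerate(inputs):
                weights[i] += lr * error * actual * (1 - actual) * x
    return weights

def retrain_from_scratch(old_samples, new_samples):
    # The "cycle of Sisyphus": no incremental revision of a converged network.
    # The new samples are merged into the sample library, the historical
    # knowledge (the weights) is returned to zero inside train(), and
    # training starts over from the beginning.
    return train(old_samples + new_samples)
```

Every call to `retrain_from_scratch` discards everything previously learned, which is exactly the inefficiency the first bottleneck complains about.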