As one of the hottest topics of recent years, artificial intelligence has been a powerful force in the technology world. With each iteration of smart hardware, smart home products have entered thousands of households, and related AI technologies such as speech recognition and image recognition have developed in rapid strides. How should we understand the essence of artificial intelligence? What stages has its rapid development passed through? This article introduces several core concepts in the field of artificial intelligence and its development history from a technical perspective.
1、 Concepts related to artificial intelligence
1、Artificial intelligence (AI): the goal of making machines as intelligent as people — able to think and learn — realized in practice through the application of machine learning and deep learning. Artificial intelligence is best understood as an industry, referring broadly to the production of more intelligent software and hardware. Machine learning is the means by which artificial intelligence is realized.
2、Data mining: the non-trivial process of extracting valid, novel, potentially useful, and ultimately understandable patterns from large amounts of data.
Data mining draws on statistics, machine learning, databases, and other technologies to solve problems. It is not merely statistical analysis but an extension and expansion of statistical methodology; many mining algorithms originate in statistics.
3、Machine learning: the study of how to simulate or realize human learning behavior in order to acquire new knowledge or skills. More formally, machine learning is the study of computer algorithms that improve automatically through experience.
Machine learning grew out of data mining technology and can be seen as a newer branch and subdivision of that field, but with the rise of big data it has become the mainstream approach to learning from data. It is the core of artificial intelligence and the fundamental way to give computers intelligence; its applications span every area of AI.
4、Deep learning: in contrast to earlier shallow learning, it is a newer field of machine learning research whose motivation is to build and train neural networks that simulate the human brain for analysis and learning. It mimics the mechanisms of the human brain to interpret data such as images, sounds, and text. The concept of deep learning comes from research on artificial neural networks. Deep learning combines low-level features to form more abstract, high-level representations of attribute categories or features, discovering distributed feature representations of the data.
At present, machines trained with deep learning are no worse than human beings at image recognition tasks such as recognizing cats, identifying the features of cancer cells in blood, and spotting tumors in MRI scans. And in areas such as Go — witness Google's AlphaGo — AI has already exceeded the current limits of human ability.
To make these ideas easier to grasp, the diagram below expresses the relationship between the four concepts above. Note that the diagram shows only a rough subordination: in particular, data mining and artificial intelligence do not fully contain one another.
2、 History of artificial intelligence
(picture from Internet)
As the figure shows, deep learning went through two troughs before its rise in 2006. These two troughs divide the development of neural networks into several distinct stages, described below.
1. First generation neural network (1958-1969)
The earliest idea of the neural network originated from the MP (McCulloch-Pitts) artificial neuron model of 1943, which aimed to use computers to simulate the firing process of a human neuron. The model simplified the neuron into three steps: linear weighting of the input signals, summation, and nonlinear activation (thresholding). As shown in the figure below:
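As a minimal sketch (the function and variable names here are my own, not from the original model's notation), the three steps of the MP neuron can be written as:

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts neuron: linearly weight the inputs,
    sum them, then apply a threshold (step) activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0

# With fixed weights and threshold, the neuron realizes an AND gate:
print(mp_neuron([1, 1], [1, 1], threshold=2))  # fires: 1
print(mp_neuron([1, 0], [1, 1], threshold=2))  # does not fire: 0
```

Note that the MP model has no learning rule: the weights and threshold are set by hand. Learning them from data is exactly what the perceptron added.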
In 1958, Rosenblatt invented the perceptron algorithm. The algorithm uses the MP model to classify multi-dimensional input data and can automatically learn and update the weights from training samples via a gradient-descent-style update rule. In 1962, the method was proved to converge, and the combination of theoretical and practical results set off the first wave of neural network enthusiasm.
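A sketch of the perceptron learning rule (toy data and names of my own choosing): each misclassified sample nudges the weights toward the correct side of the separating hyperplane, and the convergence theorem guarantees this stops after finitely many mistakes when the data is linearly separable.

```python
def train_perceptron(samples, labels, lr=0.1, epochs=20):
    """Rosenblatt's perceptron rule: update weights only on mistakes,
    moving the hyperplane toward each misclassified sample."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):  # y is +1 or -1
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
            if pred != y:                  # mistake-driven update
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

# Linearly separable toy data: points with x0 + x1 > 1 are positive.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]
y = [-1, -1, -1, 1, 1]
w, b = train_perceptron(X, y)
```

After training, all five toy points are classified correctly by the sign of `w·x + b`.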
2. Second generation neural network (1986-1998)
The first to break the nonlinearity curse was Hinton, a leading figure of modern deep learning. In 1986, Hinton proposed a BP (back-propagation) algorithm suitable for MLPs, using the sigmoid function for nonlinear mapping, which effectively solved the problem of nonlinear classification and learning. This method set off the second wave of neural network enthusiasm.
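The sigmoid nonlinearity and its derivative are the two pieces BP needs: the forward pass applies the squashing function, and the backward pass multiplies in its gradient at every layer. A minimal sketch:

```python
import math

def sigmoid(z):
    """The nonlinear mapping used in BP-trained MLPs: squashes any
    real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative sigma'(z) = sigma(z) * (1 - sigma(z)) -- the factor
    back-propagation multiplies in at each layer on the backward pass."""
    s = sigmoid(z)
    return s * (1.0 - s)

# The gradient peaks at 0.25 (at z = 0) and decays for large |z|;
# chaining many such factors is what later caused vanishing gradients.
print(sigmoid_grad(0.0))   # 0.25
```

This also foreshadows why very deep sigmoid networks were hard to train, a problem discussed below in the 2006-and-later sections.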
In 1989, Robert Hecht-Nielsen proved a universal approximation theorem for MLPs: any continuous function f on a closed interval can be approximated arbitrarily well by a BP network with a single hidden layer. The discovery of this theorem greatly encouraged neural network researchers.
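Stated informally (this rendering of the theorem is mine, not quoted from the text): for any continuous f on a closed interval and any tolerance, a single hidden layer of sigmoidal units suffices.

```latex
% Universal approximation, informal statement:
% for any continuous f on [a, b] and any \epsilon > 0, there exist
% a width N and parameters v_i, w_i, b_i such that
\left| f(x) - \sum_{i=1}^{N} v_i \,\sigma(w_i x + b_i) \right| < \epsilon
\quad \text{for all } x \in [a, b],
% where \sigma is a sigmoidal activation function.
```

The theorem says nothing about how large N must be or how to find the parameters — which is why, in practice, training remained the hard part.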
In the same year, LeCun invented the convolutional neural network LeNet and applied it to digit recognition with good results, but at the time it did not attract enough attention.
It is worth emphasizing that after 1989, because no breakthrough method was proposed and neural networks (NNs) lacked rigorous mathematical theory to support them, enthusiasm for neural networks gradually cooled.
In 1997, the LSTM model was invented. Although its characteristics for sequence modeling are outstanding, it arrived while NNs were in decline and did not attract enough attention.
3. Spring of statistical modeling (1986-2006)
In 1986, the ID3 decision tree method was put forward, and improved decision tree methods such as C4.5 and CART soon followed.
In 1995, the linear SVM was proposed by the statistician Vapnik. The method has two notable characteristics: it is derived from elegant mathematical theory (statistical learning theory, convex optimization, etc.), and it matches intuition (maximum margin). Most importantly, it achieved the best results on linear classification at the time.
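The "maximum margin" intuition has a standard formulation (this is the textbook hard-margin objective, not an equation from the original article): maximize the margin between the classes by minimizing the norm of the weight vector subject to every sample being classified with margin at least 1.

```latex
\min_{w,\,b}\ \frac{1}{2}\,\|w\|^{2}
\qquad \text{s.t.}\quad y_i\,(w^{\top} x_i + b) \ge 1,
\quad i = 1, \dots, n
```

Because the objective is convex and the constraints are linear, the problem has a unique global optimum — part of why the method was seen as mathematically cleaner than neural networks.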
In 1997, AdaBoost was proposed. The method is the representative of PAC (probably approximately correct) theory in machine learning practice, and it gave birth to the family of ensemble methods. AdaBoost combines a series of weak classifiers to achieve the effect of a strong classifier.
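A minimal sketch of the weak-to-strong idea, using 1-D threshold stumps as the weak learners (the data and helper names are my own illustration, not from the text): each round fits the best stump on re-weighted samples, then up-weights the mistakes so the next stump focuses on them; the strong classifier is the sign of the alpha-weighted vote.

```python
import math

def adaboost_stumps(xs, ys, rounds=5):
    """Minimal AdaBoost on 1-D data with threshold stumps."""
    n = len(xs)
    d = [1.0 / n] * n                      # sample weights
    ensemble = []                          # (alpha, threshold, sign)
    for _ in range(rounds):
        best = None
        for thr in xs:                     # try every threshold...
            for s in (1, -1):              # ...and both orientations
                preds = [s if x > thr else -s for x in xs]
                err = sum(w for w, p, y in zip(d, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, thr, s, preds)
        err, thr, s, preds = best
        err = max(err, 1e-10)              # guard against log of zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, s))
        # up-weight mistakes, down-weight correct samples, renormalize
        d = [w * math.exp(-alpha * y * p) for w, p, y in zip(d, preds, ys)]
        z = sum(d)
        d = [w / z for w in d]
    return ensemble

def predict(ensemble, x):
    """Strong classifier: sign of the alpha-weighted vote of the stumps."""
    score = sum(a * (s if x > thr else -s) for a, thr, s in ensemble)
    return 1 if score >= 0 else -1
```

On separable toy data such as `xs = [1, 2, 3, 4, 5, 6]` with labels `[1, 1, 1, -1, -1, -1]`, the ensemble classifies every point correctly after a few rounds.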
In 2000, the kernel SVM was put forward. Using a clever trick, the kernel SVM maps a linearly non-separable problem in the original space into a linearly separable problem in a high-dimensional space, successfully solving nonlinear classification with very good results. This also ended the NN era for a time.
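The trick is that the kernel computes inner products in the high-dimensional space without ever constructing it. A toy illustration of my own (not from the text): in 2-D, the polynomial kernel K(x, y) = (x·y)² gives exactly the same value as an explicit 3-D feature map.

```python
import math

def poly_kernel(x, y):
    """K(x, y) = (x . y)^2 -- computed entirely in the 2-D input space."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def phi(x):
    """The 3-D feature map the kernel implicitly corresponds to:
    phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x, y = (1.0, 2.0), (3.0, -1.0)
print(poly_kernel(x, y))        # 1.0
print(dot(phi(x), phi(y)))      # 1.0 -- same value, never leaving 2-D
```

For richer kernels (e.g. the RBF kernel) the implicit feature space is infinite-dimensional, so computing the inner product through the kernel is not just convenient but the only option.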
In 2001, the random forest was proposed, another representative ensemble method. Its theory is solid, it suppresses overfitting better than AdaBoost, and it performs very well in practice.
In 2001, graphical models were proposed as a new unifying framework. This line of work attempts to unify the scattered methods of machine learning — naive Bayes, SVMs, hidden Markov models, and so on — under a single descriptive framework.
4. Rapid development period (2006-2012)
2006 is often called the first year of deep learning (DL). That year, Hinton proposed a solution to the vanishing-gradient problem in deep network training: unsupervised pre-training to initialize the weights, followed by supervised fine-tuning. The main idea is to first learn the structure of the training data in a self-supervised way (with autoencoders), then fine-tune the network with supervision. However, for lack of particularly convincing experimental validation, the paper did not attract wide attention at first.
In 2011, the ReLU activation function was proposed, which can effectively suppress the vanishing-gradient problem.
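Why ReLU helps can be seen in one number: the gradient of a sigmoid unit never exceeds 0.25, so chaining many layers shrinks the signal geometrically, while ReLU's gradient is exactly 1 for any positive input. A small numeric sketch (the 10-layer chain is my own illustration):

```python
import math

def relu(z):
    """relu(z) = max(0, z)"""
    return max(0.0, z)

def relu_grad(z):
    """Gradient is exactly 1 for positive inputs, so it does not
    shrink when multiplied back through many layers."""
    return 1.0 if z > 0 else 0.0

def sigmoid_grad(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

# Product of 10 layer-gradients at z = 2: sigmoid vanishes, ReLU does not.
print(sigmoid_grad(2.0) ** 10)   # roughly 1e-10
print(relu_grad(2.0) ** 10)      # 1.0
```

The flip side is that ReLU's gradient is 0 for negative inputs ("dead" units), which later variants such as leaky ReLU were designed to address.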
In 2011, Microsoft first applied DL to speech recognition and made a significant breakthrough.
5. Outbreak period (2012 ~ now)
In 2012, to demonstrate the potential of deep learning, Hinton's group entered the ImageNet image recognition competition for the first time and won the championship with the CNN AlexNet, far surpassing the classification performance of the runner-up (an SVM-based method). Thanks to that competition, CNNs drew the attention of many researchers.
Innovations of AlexNet:
(1) It used the ReLU activation function, which greatly increased the convergence rate and largely solved the vanishing-gradient problem;
(2) Because ReLU effectively suppresses vanishing gradients, AlexNet abandoned the "pre-training + fine-tuning" scheme in favor of purely supervised training. Partly because of this, purely supervised learning became the mainstream approach in DL;
(3) It extended the LeNet-5 structure, adding dropout layers to reduce overfitting and LRN layers to enhance generalization;
(4) GPUs were used for the first time to accelerate the computation.
Conclusion: as one of the most influential technologies of the 21st century, AI not only beats us at tasks we are not naturally good at, such as playing Go and mining massive data, but also challenges us in fields such as image recognition and speech recognition. Today, artificial intelligence is converging and co-evolving with the Internet of Things, quantum computing, cloud computing, and many other technologies, developing at a speed beyond our imagination. And all of this has happened and evolved in just a few decades.