Tag: gradient

Time: 2021-6-18
Published by: Walker AI. Reinforcement learning (RL) refers to a class of problems in which an agent learns continuously from interaction with the environment, as well as the methods for solving such problems. Reinforcement learning can be described as an agent learning continuously from its interaction with the environment to achieve a […]

Time: 2021-6-12
By Rashida Nasrin Sucky, compiled by VK. Source: Medium. Neural networks were developed to simulate the human brain. Although we have not achieved that yet, neural networks are very effective at machine learning. They were popular in the 1980s and 1990s and have recently become popular again, now that computers are fast enough to run a […]

Time: 2021-5-10
By Vagif Aliyev, compiled by VK. Source: Towards Data Science. Linear regression is probably one of the most common algorithms, one that every machine learning practitioner must know. It is usually the first machine learning algorithm that beginners encounter, and understanding how it works is essential to understanding it well. So, […]

Time: 2021-5-9
The purpose of neural network training is to find parameters that make the value of the loss function as small as possible. The process of solving this problem is called optimization, and the algorithm used to solve it is called an optimizer. 1. BGD, SGD, MSGD. BGD: the most basic gradient descent algorithm, which computes the loss of […]
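The excerpt above frames optimization as driving a loss function down via its gradient. As a minimal sketch (not the article's code), here is vanilla gradient descent on a one-dimensional toy loss; the function and learning rate are illustrative choices:

```python
# Minimal sketch: vanilla gradient descent minimizing the toy loss
# f(w) = (w - 3)^2, whose gradient is f'(w) = 2 * (w - 3).

def gradient_descent(lr=0.1, steps=100):
    w = 0.0  # initial parameter guess
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # gradient of the loss at the current w
        w -= lr * grad          # step along the negative gradient
    return w

w_opt = gradient_descent()
print(round(w_opt, 4))  # approaches the minimizer w = 3
```

SGD and mini-batch variants follow the same update rule but estimate the gradient from one sample or a small batch instead of the full dataset.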

Time: 2021-5-5
How the gradient descent algorithm works in machine learning. By Nikil_Reddy, compiled by VK. Source: Analytics Vidhya. Introduction: Gradient descent is one of the most commonly used machine learning algorithms in industry, but it confuses many newcomers. If you are new to machine learning, the math behind gradient descent is not easy. In this article, […]

Time: 2021-4-29
Link to the original text: https://blog.csdn.net/doufangzheng/article/details/104023161 Contents: Preface; Receptive field computation (one-dimensional, two-dimensional); Effective receptive field (intuitive understanding, theoretical proof); The distribution of the effective receptive field differs across activation functions; The effect of training on the effective receptive field; Other: how to deal with the Gaussian distribution of the effective receptive field; Reflections; Anchor […]

Time: 2021-4-16
Preface: We often set batch_size when we train a network. What is this batch_size for? For a dataset of 10,000 images, how large should it be? What is the difference between setting it to 1, 10, 100, or 10000? #Network training method for handwritten digit recognition network.fit(train_images, train_labels, epochs=5, batch_size=128) Batch gradient […]
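The excerpt asks what batch_size actually does. As an illustrative sketch (not the article's code), here is how a dataset gets split into mini-batches of the given size, which is what `fit` does internally each epoch:

```python
# Minimal sketch: splitting a dataset into mini-batches, as a call
# like network.fit(..., batch_size=128) does under the hood.
import numpy as np

def iter_minibatches(data, batch_size):
    """Yield successive mini-batches; the last one may be smaller."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

data = np.arange(10000)  # stand-in for 10,000 training images
batches = list(iter_minibatches(data, 128))
print(len(batches), len(batches[0]), len(batches[-1]))  # 79 128 16
```

With batch_size=1 this degenerates to stochastic (per-sample) updates, and with batch_size=10000 to full-batch gradient descent; 128 is a common middle ground.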

Time: 2021-4-13
This article comes from the official account "Machine Learning Alchemy"; a large collection of learning materials is available by replying "alchemy" there. This chapter covers PyTorch's dynamic graph mechanism and TensorFlow's static graph mechanism (the latest versions of TF also support dynamic graphs). A preliminary derivation of dynamic graphs: computational graphs are used to […]

Time: 2021-4-1
Sketch: The basic idea of normalization is actually quite intuitive. The distribution of a deep neural network's activation inputs before the nonlinear transformation (that is, x = Wu + b, where u is the input) gradually shifts or changes as the network deepens or as training progresses. The reason why […]
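To make the idea concrete, here is a minimal sketch (not from the article) of the core normalization step: rescaling each pre-activation feature over the batch to zero mean and unit variance. The learnable scale and shift parameters (gamma, beta) of full batch normalization are omitted:

```python
# Minimal sketch: normalize pre-activations x = Wu + b over the
# batch dimension to zero mean and unit variance, the core step
# of batch normalization (learnable gamma/beta omitted).
import numpy as np

def normalize_batch(x, eps=1e-5):
    """Normalize each feature column over the batch (axis 0)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)  # eps avoids division by zero

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(64, 3))  # shifted activations
x_hat = normalize_batch(x)
print(x_hat.mean(axis=0).round(6))  # each feature re-centered near 0
```

This counteracts the shifting input distribution the excerpt describes, keeping the values fed into the nonlinearity in a stable range.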

Time: 2021-3-31
1. RNN overview: Artificial neural networks and convolutional neural networks assume that the input elements are independent of one another. But in many real-life cases this assumption does not hold; for example, when you write a meaningful sentence such as "It takes only one second to meet someone, three seconds to like someone, and one minute to […]

Time: 2021-3-18
The goal of activation functions is to make neural networks nonlinear. An activation function should be continuous and differentiable. Continuous: when the input changes slightly, the output also changes slightly. Differentiable: a derivative exists everywhere in the domain. Common activation functions: sigmoid, tanh, ReLU. Sigmoid: sigmoid is a smooth step function […]
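As a minimal illustrative sketch (not the article's code), here is the sigmoid function from the excerpt along with its derivative, which exists everywhere, matching the continuity and differentiability properties described above:

```python
# Minimal sketch: the sigmoid activation sigma(x) = 1 / (1 + e^(-x))
# and its derivative sigma'(x) = sigma(x) * (1 - sigma(x)).
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

print(sigmoid(0.0))       # 0.5, the midpoint of the smooth step
print(sigmoid_grad(0.0))  # 0.25, the slope is steepest at x = 0
```

The derivative never exceeding 0.25 is one reason sigmoid can cause vanishing gradients in deep networks, which is part of why ReLU is often preferred.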

Time: 2021-3-16
Dictum: Life is just a series of trying to make up your mind. — T. Fuller. Unlike value-based RL methods, which approximate the value function and derive a deterministic policy from it, policy-based RL methods shift the learning of a policy from the probability set \(P(a|s)\) to a policy function \(\pi(a|s)\), and the optimal policy is obtained by […]