• ## Federated learning: dividing non-IID samples according to a Dirichlet distribution

Time：2022-7-14

In "Random sampling and probability distribution in Python (II)", we introduced how to sample a probability distribution with Python's existing libraries. The Dirichlet distribution among them should be familiar to you. The probability density function of this distribution is \[P(\bm{x}; \bm{\alpha}) \propto \prod_{i=1}^{k} x_{i}^{\alpha_{i}-1} \\ \bm{x}=(x_1,x_2,\dots,x_k),\quad x_i > 0 , \quad \sum_{i=1}^k x_i = […]
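As a rough illustration of sampling from this distribution, the sketch below draws one point from a Dirichlet distribution by normalizing independent Gamma draws, using only the Python standard library. The function name `sample_dirichlet` and the concentration parameters are hypothetical choices for this example and are not taken from the article.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha).

    Uses the standard construction: draw g_i ~ Gamma(alpha_i, 1)
    independently, then normalize so the components sum to 1.
    """
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


rng = random.Random(0)          # fixed seed so the run is repeatable
alpha = [1.0, 2.0, 3.0]         # hypothetical concentration parameters
x = sample_dirichlet(alpha, rng)

print(x)       # k positive components
print(sum(x))  # the components sum to 1 up to floating-point error
```

Smaller values of the concentration parameters tend to give more uneven proportions, which is why this distribution is a common choice for simulating skewed data splits.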

• ## An intuitive, accessible explanation: from entropy to cross-entropy loss

Time：2022-7-9

Beginners in machine learning and data science must be clear about the concepts of entropy and cross-entropy. They are the key basis for building decision trees, dimensionality reduction and image classification. In this article, I will try to explain the concept of entropy from the perspective of information theory. When I first tried […]

• ## Why cross-entropy and KL divergence are approximately equal as loss functions

Time：2022-2-13

In this article, we will introduce the concepts of entropy, cross-entropy and Kullback-Leibler divergence, and learn why they can be treated as approximately equal. Although KL divergence was initially recommended, it is common practice to use cross-entropy in the loss function when building generative adversarial networks. This often causes […]
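The relationship between the three quantities can be checked numerically. The sketch below uses the standard identity H(p, q) = H(p) + KL(p || q); because H(p) does not depend on q, minimizing cross-entropy over q also minimizes the KL divergence. The two distributions are hypothetical numbers chosen only for illustration.

```python
import math


def entropy(p):
    """Shannon entropy H(p) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Cross-entropy H(p, q) in nats."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.7, 0.2, 0.1]  # hypothetical "true" distribution
q = [0.5, 0.3, 0.2]  # hypothetical model distribution

print(cross_entropy(p, q))
print(entropy(p) + kl_divergence(p, q))  # matches the line above up to rounding
```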

• ## Knowledge distillation

Time：2022-2-9

1. Introduction https://zhuanlan.zhihu.com/p/258721998 https://zhuanlan.zhihu.com/p/90049906 https://zhuanlan.zhihu.com/p/353472061 (1) Definition: Knowledge distillation uses the soft labels output by a teacher network as labels to train a student network. (Figure: image.png) For example, in the figure above, we train the student network to have the same output as the teacher network. The advantage of this […]

• ## Detailed interpretation and plotting of the QQ-plot in 10x single-cell (10x spatial transcriptome) Seurat analysis

Time：2022-2-6

The following picture should be familiar to everyone. (Figure: jsplots-1.png) This graph is produced by the function JackStrawPlot(). We all know it is used to decide how many principal components (PCs) to use for downstream analysis. Let's take a look at the explanation of this figure. Plots the results of the JackStraw analysis for PCA significance. For each PC, plots a QQ-plot comparing the […]

• ## Exercises in Chapter 1 of Statistical Learning Methods

Time：2021-12-21

Exercise 1.1 The three elements of a statistical learning method are: model, strategy and algorithm. The model is expressed as a function $$Y=f_\theta(X)$$ or a conditional probability distribution $$P_\theta(Y|X)$$. The strategy is to find an appropriate loss function to represent the error between the predicted value and the real value, and then construct the risk function. The risk function is the objective […]

• ## Exercises in Chapter 4 of Statistical Learning Methods

Time：2021-12-14

Exercise 4.1 Derive the prior probability and conditional probability of naive Bayes by maximum likelihood estimation. Suppose the data set is $$T = \{(x^{(1)} , y^{(1)}), (x^{(2)} , y^{(2)}), \dots , (x^{(M)} , y^{(M)})\}$$ and assume $$P(Y=c_k) = \theta_k$$; then $$P(Y \ne c_k) = 1 - \theta_k$$. Assume that, in the data set, the number of samples taking the value $$c_k$$ […]

• ## Attention mechanism in deep learning

Time：2021-11-26

RNNs have a weakness in machine translation, and attention emerged to overcome it. Therefore, to understand attention, we must understand two things: (1) What are the weaknesses of RNNs in machine translation? (2) How does attention overcome them? This article attempts to understand the attention mechanism by answering these two questions. Contents […]

• ## Statistical learning 1: naive Bayes model

Time：2021-11-8

Model: introduction to generative models. We define the sample space as $$\mathcal{X} \subseteq \mathbb{R}^n$$, and the output space is $$\mathcal{Y} = \{c_1, c_2, \dots, c_K\}$$. $$\textbf{X}$$ is a random vector in the input space, and its value is $$\textbf{x}$$, satisfying $$\textbf{x} \in \mathcal{X}$$; $$Y$$ is a random variable in the output space, and its value is $$y$$, satisfying $$y \in \mathcal{Y}$$. We will have a capacity of $$m$$ The […]

• ## Cross-entropy loss function nn.CrossEntropyLoss()

Time：2021-6-22

nn.CrossEntropyLoss() 1. Introduction: When using the PyTorch deep learning framework for multi-class classification, the cross-entropy loss function nn.CrossEntropyLoss() is used. 2. Information content and entropy. Information content: it is used to measure the uncertainty of an event. The greater the probability of an event, the smaller the uncertainty and the smaller the amount of information […]

• ## Visualizing the probability distribution of data in MATLAB

Time：2021-3-13

When visualizing data in MATLAB, summary statistics alone sometimes do not tell us clearly what kind of distribution the data follows. For that situation, this article shows a simple visualization of the data's probability distribution; interested readers can take a look. […]

• ## Algorithm Engineering 3. Mathematical basis, probability theory and statistics

Time：2021-1-28

Traditional machine learning uses probability theory almost everywhere. Probability theory. 1. Law of total probability and Bayes' formula. Law of total probability: $$P(A)=\sum\limits_{j=1}^{n}P(B_j)P(A|B_j)$$ Bayes' formula: $$P(B_i|A)=\dfrac{P(A,B_i)}{P(A)}=\dfrac{P(B_i)P(A|B_i)}{\sum\limits_{j=1}^{n}P(B_j)P(A|B_j)}$$ Bayes' formula is the core tool of Bayesian statistics. The Bayesian school holds that the probability of an event is not as simple as the frequentist school takes it to be, and that human priors should be added, so that […]
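The two formulas can be worked through with a small numeric example. In the sketch below, the priors $$P(B_j)$$ and likelihoods $$P(A|B_j)$$ are hypothetical numbers chosen only for illustration, and the function names are not from the article.

```python
# Hypothetical example: three events B_1..B_3 that partition the sample space.
priors = [0.5, 0.3, 0.2]       # P(B_j), must sum to 1
likelihoods = [0.1, 0.4, 0.8]  # P(A | B_j)


def total_probability(priors, likelihoods):
    """P(A) = sum_j P(B_j) * P(A | B_j)."""
    return sum(p * l for p, l in zip(priors, likelihoods))


def bayes_posteriors(priors, likelihoods):
    """P(B_i | A) = P(B_i) * P(A | B_i) / P(A) for every i."""
    evidence = total_probability(priors, likelihoods)
    return [p * l / evidence for p, l in zip(priors, likelihoods)]


p_a = total_probability(priors, likelihoods)
posteriors = bayes_posteriors(priors, likelihoods)

print(p_a)         # 0.05 + 0.12 + 0.16, about 0.33
print(posteriors)  # the posteriors sum to 1
```

In this example the event with the smallest prior, B_3, ends up with the largest posterior because its likelihood is much higher, which shows how the evidence A reweights the priors.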