# Algorithm Engineer 5. The Exponential Family of Distributions

Time: 2020-10-27

#### 1. Definition

The exponential family refers to a class of distributions whose density (or mass) functions have the following specific form:
$$p(y|\eta)=b(y)e^{\eta^TT(y)-a(\eta)}=\dfrac{b(y)e^{\eta^TT(y)}}{e^{a(\eta)}}\qquad\begin{cases}\eta:\text{natural parameter (parameter vector), usually a real number}\\a(\eta):\text{log-partition function / log normalizer}\\T(y):\text{sufficient statistic, usually }T(y)=y\\b(y):\text{base measure}\end{cases}$$
Given $a$, $b$, and $T$, this form defines a set of probability distributions indexed by the parameter $\eta$.
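As a concrete sanity check of this form, the sketch below writes the Poisson distribution in exponential-family terms. The Poisson example and its decomposition ($b(y)=1/y!$, $T(y)=y$, $\eta=\log\lambda$, $a(\eta)=e^\eta$) are an illustrative choice, not taken from the text:

```python
import math

# Poisson pmf: p(y) = lambda^y e^{-lambda} / y!
# Exponential-family decomposition (illustrative, standard result):
#   b(y) = 1/y!,  T(y) = y,  eta = log(lambda),  a(eta) = e^eta

lam = 2.5
eta = math.log(lam)

def poisson_pmf(y, lam):
    return lam**y * math.exp(-lam) / math.factorial(y)

def expfam_pmf(y, eta):
    b = 1.0 / math.factorial(y)   # base measure b(y)
    T = y                          # sufficient statistic T(y) = y
    a = math.exp(eta)              # log-partition function a(eta) = e^eta
    return b * math.exp(eta * T - a)

# the two parameterizations agree pointwise
for y in range(10):
    assert abs(poisson_pmf(y, lam) - expfam_pmf(y, eta)) < 1e-12
```

The same pattern (identify $b$, $T$, $\eta$, $a$ by matching terms) is what the derivations in section 4 do by hand.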

#### 2. The log-partition function

Rearranging the formula above:
$$P(y|\eta)e^{a(\eta)}=b(y)e^{\eta^TT(y)}$$
Integrating both sides with respect to $y$:
$$\int P(y|\eta)e^{a(\eta)}dy=\int b(y)e^{\eta^TT(y)}dy$$
On the left, the integral of the conditional density is exactly 1, so this reduces to:
$$e^{a(\eta)}=\int b(y)e^{\eta^TT(y)}dy$$
Taking the logarithm:
$$a(\eta)=\ln\int b(y)e^{\eta^TT(y)}dy$$
Now it is clear at a glance why $a(\eta)$ is called the log normalizer: it is the logarithm of the normalization constant.
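For a family with discrete support, the integral becomes a sum, and $a(\eta)$ can be checked numerically. The sketch below uses the Bernoulli family ($b(y)=1$, $T(y)=y$, $y\in\{0,1\}$), for which the standard closed form is $a(\eta)=\log(1+e^\eta)$; this closed form is a well-known result, not derived in the text above:

```python
import math

def log_partition_numeric(eta):
    # a(eta) = ln sum_y b(y) e^{eta*T(y)}, summed over the support {0, 1}
    return math.log(sum(math.exp(eta * y) for y in (0, 1)))

def log_partition_closed(eta):
    # standard Bernoulli log normalizer: log(1 + e^eta)
    return math.log(1.0 + math.exp(eta))

for eta in (-2.0, 0.0, 1.5):
    assert abs(log_partition_numeric(eta) - log_partition_closed(eta)) < 1e-12
```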

#### 3. Common members of the exponential family

- Normal distribution (models noise, e.g. in linear regression)
- Bernoulli distribution (logistic regression with 0/1 outcomes)
- Beta distribution
- Dirichlet distribution

#### 4. Examples: writing distributions in exponential-family form

##### Gaussian distribution

Consider a zero-mean Gaussian. Its density can be rewritten as:
$$P(y|\eta)=\dfrac{1}{\sqrt{2\pi}\,\sigma}e^{-\frac{y^2}{2\sigma^2}}=\dfrac{1}{\sqrt{2\pi}}e^{-\log\sigma}\cdot e^{-\frac{y^2}{2\sigma^2}}=\dfrac{1}{\sqrt{2\pi}}e^{-\frac{1}{2\sigma^2}y^2-\log\sigma}$$
This matches the exponential-family form, with $b(y)=\frac{1}{\sqrt{2\pi}}$, $\eta=-\frac{1}{2\sigma^2}$, $T(y)=y^2$, and $a(\eta)=\log\sigma$.
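A quick numeric check of the rewrite above, with $\eta=-\frac{1}{2\sigma^2}$ and $T(y)=y^2$ (the value of $\sigma$ is an arbitrary illustrative choice):

```python
import math

sigma = 1.7  # illustrative value

def gaussian_pdf(y, sigma):
    # standard zero-mean Gaussian density
    return math.exp(-y**2 / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)

def expfam_form(y, sigma):
    eta = -1.0 / (2 * sigma**2)   # natural parameter
    # b(y) * exp(eta * T(y) - a(eta)) with T(y) = y^2, a(eta) = log(sigma)
    return (1.0 / math.sqrt(2 * math.pi)) * math.exp(eta * y**2 - math.log(sigma))

for y in (-1.0, 0.0, 2.3):
    assert abs(gaussian_pdf(y, sigma) - expfam_form(y, sigma)) < 1e-12
```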

##### Bernoulli distribution

$$\begin{aligned} P(y|\eta) &= \phi^y(1-\phi)^{1-y} \\ &= e^{y\log\phi+(1-y)\log(1-\phi)} \\ &= e^{y\log\frac{\phi}{1-\phi}+\log(1-\phi)}\end{aligned}$$
Here the natural parameter is $\eta=\log\frac{\phi}{1-\phi}$, the logit; inverting it gives $\phi=\frac{1}{1+e^{-\eta}}$, the sigmoid.

#### 5. Maximum entropy

The exponential family satisfies the maximum-entropy principle: the maximum-entropy distribution consistent with the empirical constraints is a member of the exponential family.
For any function $f$, suppose the empirical expectation is $E_{\tilde{P}}(f(x))=\Delta$. Then, since $H(P)=-\sum_k p_k\log p_k$, maximizing entropy is equivalent to:
$$\max\{H(P)\}=\min\Big\{\sum\limits_{k=1}^{K}p_k\log p_k\Big\},\quad \mathrm{s.t.}\ \sum\limits_{k=1}^{K}p_k=1,\ E_{\tilde{P}}(f(x))=\Delta$$
Construct the generalized Lagrangian:
$$L=\sum\limits_{k=1}^{K}p_k\log p_k+\lambda_0(1-\sum\limits_{k=1}^{K}p_k)+\lambda^T(\Delta-E_pf(x))$$
Taking the derivative with respect to $P(x)$ and setting it to zero:
$$\frac{\partial L}{\partial P(x)}=\log P(x)+1-\lambda_0-\lambda^Tf(x)=0$$
Solving gives:
$$P(x)=e^{\lambda^Tf(x)+\lambda_0-1}$$
which is exactly the exponential-family form, with $f(x)$ playing the role of the sufficient statistic.
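The result can be illustrated numerically: on a finite support, the max-entropy distribution under a mean constraint has the form $p(x)\propto e^{\lambda f(x)}$, and $\lambda$ can be found by bisection. The support $\{0,1,2,3\}$, $f(x)=x$, and $\Delta=1.2$ below are illustrative choices, not from the text:

```python
import math

xs = [0, 1, 2, 3]   # illustrative finite support
delta = 1.2         # illustrative target mean, E[f(x)] with f(x) = x

def dist(lmbda):
    # exponential-family form p(x) proportional to e^{lambda * x}
    w = [math.exp(lmbda * x) for x in xs]
    z = sum(w)
    return [wi / z for wi in w]

def mean(p):
    return sum(x * pi for x, pi in zip(xs, p))

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# bisection on lambda: mean(dist(lambda)) is monotone increasing
lo, hi = -10.0, 10.0
for _ in range(100):
    mid = (lo + hi) / 2
    if mean(dist(mid)) < delta:
        lo = mid
    else:
        hi = mid
p_star = dist((lo + hi) / 2)

# any other distribution with the same mean has strictly lower entropy
q = [0.4, 0.2, 0.2, 0.2]           # mean = 0.2 + 0.4 + 0.6 = 1.2
assert abs(mean(p_star) - delta) < 1e-9
assert entropy(p_star) > entropy(q)
```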

#### 6. Generalized linear model (GLM)

Generalized linear models include linear regression, logistic regression (LR), and softmax regression. The reason we mention GLMs here is that they are derived from the exponential family.

- Assumption: given $x$ and parameters $\theta$, $y$ follows an exponential-family distribution with natural parameter $\eta$ (typically $\eta=\theta^Tx$)
- Hypothesis: $h(x)=E(T(y)|x)$
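Instantiating these assumptions with a Bernoulli response recovers logistic regression: $h(x)=E(T(y)|x)=E(y|x)=\phi$, and inverting the logit gives $\phi=\text{sigmoid}(\eta)$ with $\eta=\theta^Tx$. A minimal sketch, with illustrative values of $\theta$ and $x$:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def h(theta, x):
    # GLM hypothesis for a Bernoulli response:
    # natural parameter eta = theta^T x, and E[y|x] = sigmoid(eta)
    eta = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(eta)

theta = [0.5, -1.0]   # illustrative parameters
x = [2.0, 1.0]        # illustrative input; eta = 0.5*2 - 1.0*1 = 0
p = h(theta, x)       # predicted P(y=1|x) = sigmoid(0) = 0.5
assert abs(p - 0.5) < 1e-12
```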
