Solutions to the Chapter 10 Exercises of Statistical Learning Methods

Time: 2021-11-29

Exercise 10.1

From the problem statement, \(T=4\), \(N=3\), \(M=2\).

Apply Algorithm 10.3 (the backward algorithm).

The first step is to initialize \(\beta\) at the final time \(t=4\):

\(\beta_4(1) = 1, \beta_4(2) = 1, \beta_4(3) = 1\)

The second step is to compute \(\beta\) recursively for the earlier times \(t=3,2,1\):

\(\beta_3(1) = a_{11}b_1(o_4)\beta_4(1) + a_{12}b_2(o_4)\beta_4(2) + a_{13}b_3(o_4)\beta_4(3) = 0.46\)

\(\beta_3(2) = a_{21}b_1(o_4)\beta_4(1) + a_{22}b_2(o_4)\beta_4(2) + a_{23}b_3(o_4)\beta_4(3) = 0.51\)

\(\beta_3(3) = a_{31}b_1(o_4)\beta_4(1) + a_{32}b_2(o_4)\beta_4(2) + a_{33}b_3(o_4)\beta_4(3) = 0.43\)

\(\beta_2(1) = a_{11}b_1(o_3)\beta_3(1) + a_{12}b_2(o_3)\beta_3(2) + a_{13}b_3(o_3)\beta_3(3) = 0.2461\)

\(\beta_2(2) = a_{21}b_1(o_3)\beta_3(1) + a_{22}b_2(o_3)\beta_3(2) + a_{23}b_3(o_3)\beta_3(3) = 0.2312\)

\(\beta_2(3) = a_{31}b_1(o_3)\beta_3(1) + a_{32}b_2(o_3)\beta_3(2) + a_{33}b_3(o_3)\beta_3(3) = 0.2577\)

\(\beta_1(1) = a_{11}b_1(o_2)\beta_2(1) + a_{12}b_2(o_2)\beta_2(2) + a_{13}b_3(o_2)\beta_2(3) = 0.112462\)

\(\beta_1(2) = a_{21}b_1(o_2)\beta_2(1) + a_{22}b_2(o_2)\beta_2(2) + a_{23}b_3(o_2)\beta_2(3) = 0.121737\)

\(\beta_1(3) = a_{31}b_1(o_2)\beta_2(1) + a_{32}b_2(o_2)\beta_2(2) + a_{33}b_3(o_2)\beta_2(3) = 0.104881\)

The third step is to compute \(P(O|\lambda)\):

\(P(O|\lambda) = \pi_1b_1(o_1)\beta_1(1) + \pi_2b_2(o_1)\beta_1(2) + \pi_3b_3(o_1)\beta_1(3) = 0.0600908\)
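The backward recursion above can be checked with a short script. This is a minimal sketch, not a reference implementation. The values of `A`, `B`, `pi` and the observation sequence `O` are not stated in this exercise; they are assumptions read off the arithmetic in Exercise 10.3, with the two observation symbols encoded as 0 and 1.

```python
# Model values are assumptions inferred from the arithmetic in Exercise 10.3.
A = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.2],
     [0.2, 0.3, 0.5]]          # A[i][j] = a_{ij}, 0-based
B = [[0.5, 0.5],
     [0.4, 0.6],
     [0.7, 0.3]]               # B[i][k] = b_i(symbol k)
pi = [0.2, 0.4, 0.4]
O = [0, 1, 0, 1]               # o_1..o_4 as symbol indices (assumed encoding)


def backward(A, B, pi, O):
    """Return (beta, P(O|lambda)); beta[t][i] holds beta_{t+1}(i+1) of the text."""
    N, T = len(A), len(O)
    beta = [[1.0] * N for _ in range(T)]      # step 1: beta_T(i) = 1
    for t in range(T - 2, -1, -1):            # step 2: t = T-1, ..., 1
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][O[t + 1]] * beta[t + 1][j]
                             for j in range(N))
    # step 3: P(O|lambda) = sum_i pi_i * b_i(o_1) * beta_1(i)
    prob = sum(pi[i] * B[i][O[0]] * beta[0][i] for i in range(N))
    return beta, prob


beta, prob = backward(A, B, pi, O)
for t, row in enumerate(beta, start=1):
    print(f"beta_{t} =", [round(x, 6) for x in row])
print("P(O|lambda) =", round(prob, 7))
```

Under these assumed parameters the printed \(\beta\) rows should match the hand calculation line by line.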

Exercise 10.2

By definition, \(P(i_4 = q_3|O,\lambda) = \gamma_4(3)\).

According to the formula, \(\gamma_4(3) = \frac{\alpha_4(3) \beta_4(3)}{P(O|\lambda)} = \frac{\alpha_4(3) \beta_4(3)}{\sum \limits_{j=1}^N \alpha_4(j) \beta_4(j)}\)

Computing the forward and backward probabilities with a program gives \(P(i_4 = q_3|O,\lambda) = \gamma_4(3) = 0.536952\)
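The program itself is not shown, so the following is only one possible sketch of such a calculation. The parameter values are assumed to match the textbook's statement of Exercise 10.2, which is not reproduced in this write-up; verify them against the book before relying on the output. The function names `forward`, `backward` and `gamma` are illustrative.

```python
# Assumed parameters for Exercise 10.2 (not reproduced in this write-up).
A = [[0.5, 0.1, 0.4],
     [0.3, 0.5, 0.2],
     [0.2, 0.2, 0.6]]
B = [[0.5, 0.5],
     [0.4, 0.6],
     [0.7, 0.3]]
pi = [0.2, 0.3, 0.5]
O = [0, 1, 0, 0, 1, 0, 1, 1]   # T = 8, symbols encoded as 0 and 1 (assumed)


def forward(A, B, pi, O):
    """alpha[t][i] holds alpha_{t+1}(i+1) of the text."""
    N = len(A)
    alpha = [[pi[i] * B[i][O[0]] for i in range(N)]]
    for o in O[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[j] * A[j][i] for j in range(N)) * B[i][o]
                      for i in range(N)])
    return alpha


def backward(A, B, O):
    """beta[t][i] holds beta_{t+1}(i+1) of the text."""
    N, T = len(A), len(O)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][O[t + 1]] * beta[t + 1][j]
                             for j in range(N))
    return beta


def gamma(A, B, pi, O, t, i):
    """P(i_t = q_i | O, lambda), with t and i counted from 1 as in the text."""
    alpha, beta = forward(A, B, pi, O), backward(A, B, O)
    joint = [a * b for a, b in zip(alpha[t - 1], beta[t - 1])]
    return joint[i - 1] / sum(joint)   # the denominator equals P(O|lambda)


g43 = gamma(A, B, pi, O, t=4, i=3)
print("gamma_4(3) =", round(g43, 6))
```

With these assumed inputs the result should agree with the value quoted above to six decimal places.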

Exercise 10.3

Apply Algorithm 10.5 (the Viterbi algorithm).

The first step is initialization

\(\delta_1(1) = \pi_1 b_1(o_1) = 0.2*0.5=0.1\), \(\psi_1(1) = 0\)

\(\delta_1(2) = \pi_2 b_2(o_1) = 0.4*0.4=0.16\), \(\psi_1(2) = 0\)

\(\delta_1(3) = \pi_3 b_3(o_1) = 0.4*0.7=0.28\), \(\psi_1(3) = 0\)

The second step is recursion

\(\delta_2(1) = \max \limits_j [\delta_1(j)a_{j1}] b_1(o_2) = \max\{0.1*0.5, 0.16*0.3, 0.28*0.2\}*0.5=0.028\), \(\psi_2(1) = 3\)

\(\delta_2(2) = \max \limits_j [\delta_1(j)a_{j2}] b_2(o_2) = \max\{0.1*0.2, 0.16*0.5, 0.28*0.3\}*0.6=0.0504\), \(\psi_2(2) = 3\)

\(\delta_2(3) = \max \limits_j [\delta_1(j)a_{j3}] b_3(o_2) = \max\{0.1*0.3, 0.16*0.2, 0.28*0.5\}*0.3=0.042\), \(\psi_2(3) = 3\)

\(\delta_3(1) = \max \limits_j [\delta_2(j)a_{j1}] b_1(o_3) = \max\{0.028*0.5, 0.0504*0.3, 0.042*0.2\}*0.5=0.00756\), \(\psi_3(1) = 2\)

\(\delta_3(2) = \max \limits_j [\delta_2(j)a_{j2}] b_2(o_3) = \max\{0.028*0.2, 0.0504*0.5, 0.042*0.3\}*0.4=0.01008\), \(\psi_3(2) = 2\)

\(\delta_3(3) = \max \limits_j [\delta_2(j)a_{j3}] b_3(o_3) = \max\{0.028*0.3, 0.0504*0.2, 0.042*0.5\}*0.7=0.0147\), \(\psi_3(3) = 3\)

\(\delta_4(1) = \max \limits_j [\delta_3(j)a_{j1}] b_1(o_4) = \max\{0.00756*0.5, 0.01008*0.3, 0.0147*0.2\}*0.5=0.00189\), \(\psi_4(1) = 1\)

\(\delta_4(2) = \max \limits_j [\delta_3(j)a_{j2}] b_2(o_4) = \max\{0.00756*0.2, 0.01008*0.5, 0.0147*0.3\}*0.6=0.003024\), \(\psi_4(2) = 2\)

\(\delta_4(3) = \max \limits_j [\delta_3(j)a_{j3}] b_3(o_4) = \max\{0.00756*0.3, 0.01008*0.2, 0.0147*0.5\}*0.3=0.002205\), \(\psi_4(3) = 3\)

The third step is termination

\(P^* = \max \limits_i \delta_4(i) = 0.003024\)

\(i_4^* = \mathop{\arg\max} \limits_i [\delta_4(i)] = 2\)

The fourth step is optimal path backtracking

\(i_3^* = \psi_4(i_4^*) = 2\)

\(i_2^* = \psi_3(i_3^*) = 2\)

\(i_1^* = \psi_2(i_2^*) = 3\)

Therefore, the optimal path is \(I^* = (i_1^*,i_2^*,i_3^*,i_4^*)=(3,2,2,2)\)
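The four steps (initialization, recursion, termination, backtracking) can be sketched in a few lines of Python. The model values are taken from the arithmetic shown in this exercise; the 0/1 encoding of the observation symbols and the function name `viterbi` are assumptions made for illustration.

```python
# Model values as they appear in the hand calculation for this exercise.
A = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.2],
     [0.2, 0.3, 0.5]]
B = [[0.5, 0.5],
     [0.4, 0.6],
     [0.7, 0.3]]
pi = [0.2, 0.4, 0.4]
O = [0, 1, 0, 1]               # o_1..o_4 as symbol indices (assumed encoding)


def viterbi(A, B, pi, O):
    """Return (P*, best path), with states numbered from 1 as in the text."""
    N, T = len(A), len(O)
    delta = [[0.0] * N for _ in range(T)]
    psi = [[0] * N for _ in range(T)]
    # Step 1: initialization
    for i in range(N):
        delta[0][i] = pi[i] * B[i][O[0]]
    # Step 2: recursion, keeping the best predecessor in psi
    for t in range(1, T):
        for i in range(N):
            best_j = max(range(N), key=lambda j: delta[t - 1][j] * A[j][i])
            psi[t][i] = best_j
            delta[t][i] = delta[t - 1][best_j] * A[best_j][i] * B[i][O[t]]
    # Step 3: termination
    last = max(range(N), key=lambda i: delta[T - 1][i])
    # Step 4: backtracking through psi
    path = [last]
    for t in range(T - 1, 0, -1):
        path.append(psi[t][path[-1]])
    path.reverse()
    return delta[T - 1][last], [s + 1 for s in path]


p_star, best_path = viterbi(A, B, pi, O)
print("P* =", round(p_star, 6))
print("I* =", best_path)
```

The sketch breaks ties in favour of the lowest state index; no ties occur in this example, so that choice does not affect the result here.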

Exercise 10.4

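Use the forward and backward probabilities to prove that \(P(O|\lambda) = \sum \limits_{i=1}^N \sum \limits_{j=1}^N \alpha_t(i)a_{ij}b_j(o_{t+1})\beta_{t+1}(j)\) for \(t=1,2,\dots,T-1\).

The derivation splits the observation sequence at time \(t\), sums over the states at times \(t\) and \(t+1\), and uses the two independence assumptions of the hidden Markov model: the state at \(t+1\) depends only on the state at \(t\), and each observation depends only on the state that emits it.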

\(\begin{aligned} P(O|\lambda) &= P(o_1,o_2,\dots,o_T|\lambda) \\ &= \sum_{i=1}^N P(o_1,\dots,o_t,i_t=q_i|\lambda) P(o_{t+1},\dots,o_T|i_t=q_i,\lambda) \\ &= \sum_{i=1}^N \sum_{j=1}^N P(o_1,\dots,o_t,i_t=q_i|\lambda) P(o_{t+1},i_{t+1}=q_j|i_t=q_i,\lambda)P(o_{t+2},\dots,o_T|i_{t+1}=q_j,\lambda) \\ &= \sum_{i=1}^N \sum_{j=1}^N [P(o_1,\dots,o_t,i_t=q_i|\lambda) P(o_{t+1}|i_{t+1}=q_j,\lambda) P(i_{t+1}=q_j|i_t=q_i,\lambda) \\ & \quad \quad \quad \quad P(o_{t+2},\dots,o_T|i_{t+1}=q_j,\lambda)] \\ &= \sum_{i=1}^N \sum_{j=1}^N \alpha_t(i) a_{ij} b_j(o_{t+1}) \beta_{t+1}(j),{\quad}t=1,2,\dots,T-1 \end{aligned}\)
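A numerical check is not a proof, but it is a quick way to confirm that the double sum gives the same value at every \(t\). The sketch below evaluates it for \(t=1,2,3\) on the model used in Exercises 10.1 and 10.3; the parameter values are read off the arithmetic in Exercise 10.3, and the 0/1 symbol encoding is an assumption.

```python
# Model values inferred from the arithmetic in Exercise 10.3 (assumption).
A = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.2],
     [0.2, 0.3, 0.5]]
B = [[0.5, 0.5],
     [0.4, 0.6],
     [0.7, 0.3]]
pi = [0.2, 0.4, 0.4]
O = [0, 1, 0, 1]
N, T = len(A), len(O)

# Forward probabilities: alpha[t][i] holds alpha_{t+1}(i+1) of the text.
alpha = [[pi[i] * B[i][O[0]] for i in range(N)]]
for o in O[1:]:
    prev = alpha[-1]
    alpha.append([sum(prev[j] * A[j][i] for j in range(N)) * B[i][o]
                  for i in range(N)])

# Backward probabilities: beta[t][i] holds beta_{t+1}(i+1) of the text.
beta = [[1.0] * N for _ in range(T)]
for t in range(T - 2, -1, -1):
    for i in range(N):
        beta[t][i] = sum(A[i][j] * B[j][O[t + 1]] * beta[t + 1][j]
                         for j in range(N))

# Double sum of the identity, one value per t = 1, ..., T-1.
values = []
for t in range(T - 1):
    total = sum(alpha[t][i] * A[i][j] * B[j][O[t + 1]] * beta[t + 1][j]
                for i in range(N) for j in range(N))
    values.append(total)

for t, v in enumerate(values, start=1):
    print(f"t={t}: {v:.7f}")
```

Each printed value should coincide with \(P(O|\lambda)\) from Exercise 10.1, up to floating-point rounding.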

Exercise 10.5

Viterbi algorithm:

Initialization: \(\delta_1(i) = \pi_ib_i(o_1)\)

Recurrence: \(\delta_{t+1}(i) = \max \limits_j [\delta_t(j)a_{ji}]b_i(o_{t+1})\)

Forward algorithm:

Initial value:\(\alpha_1(i) = \pi_ib_i(o_1)\)

Recurrence:\(\alpha_{t+1}(i) = [\sum \limits_j \alpha_t(j)a_{ji}]b_i(o_{t+1})\)

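The two recursions have the same form and differ only in how they combine the results of the previous time step. The Viterbi algorithm takes the maximum over the previous states \(j\), so \(\delta_t(i)\) is the probability of the single most likely path that ends in state \(q_i\) at time \(t\).

The forward algorithm sums over the previous states \(j\), so \(\alpha_t(i)\) is the total probability of all paths that end in state \(q_i\) at time \(t\).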
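To make the comparison concrete, the sketch below writes both recursions as one function and passes in the combining operation, `sum` for the forward algorithm and `max` for Viterbi. The helper name `recurse` is hypothetical, and the model values are the ones read off the arithmetic in Exercise 10.3.

```python
# Model values inferred from the arithmetic in Exercise 10.3 (assumption).
A = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.2],
     [0.2, 0.3, 0.5]]
B = [[0.5, 0.5],
     [0.4, 0.6],
     [0.7, 0.3]]
pi = [0.2, 0.4, 0.4]
O = [0, 1, 0, 1]


def recurse(A, B, pi, O, combine):
    """Run the shared recursion and return the values at the final time T.

    combine=sum gives alpha_T(i); combine=max gives delta_T(i).
    """
    N = len(A)
    v = [pi[i] * B[i][O[0]] for i in range(N)]       # same initialization
    for o in O[1:]:
        # Only the way previous-step results are combined differs.
        v = [combine(v[j] * A[j][i] for j in range(N)) * B[i][o]
             for i in range(N)]
    return v


alpha_T = recurse(A, B, pi, O, sum)
delta_T = recurse(A, B, pi, O, max)
print("sum_i alpha_T(i) =", round(sum(alpha_T), 7))   # P(O|lambda)
print("max_i delta_T(i) =", round(max(delta_T), 6))   # P* of the best path
```

Because a maximum over non-negative terms never exceeds their sum, \(\delta_t(i) \le \alpha_t(i)\) holds for every state and time.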
