

Recursion for conditional distributions: The Baum-Welch Algorithm

The recursions of section 1.3 have serious underflow problems in numerical applications, and $ \alpha$ and $ \beta$ themselves are not the quantities that are ultimately needed. Introduce normalised forward and backward variables $ \tilde\alpha$ and $ \tilde\beta$ :

$\displaystyle \boxed{ \tilde\alpha_t(s) = p(s_t=s\vert x^t) }$ (13)

which allows us to write
$\displaystyle \tilde\alpha_t(s_t)$ $\displaystyle =$ $\displaystyle p(s_t\vert x^t) = \frac{1}{p(x^t)} p(s_t,x^t)$  
  $\displaystyle =$ $\displaystyle \frac{1}{p(x^t)}\sum_{s_{t-1}} \psi(s_t,s_{t-1},x_t) p(s_{t-1},x^{t-1})$  
  $\displaystyle =$ $\displaystyle \frac{p(x^{t-1})}{p(x^t)} \sum_{s_{t-1}} \psi(s_t,s_{t-1},x_t) p(s_{t-1}\vert x^{t-1})$  

so the recursion is
$\displaystyle \tilde\alpha_t(s)$ $\displaystyle =$ $\displaystyle \frac{1}{\kappa_t} \sum_{s^\prime} \psi(s,s^\prime,x_t) \, \tilde\alpha_{t-1}(s^\prime)$ (14)
$\displaystyle \kappa_t$ $\displaystyle =$ $\displaystyle \frac{p(x^t)}{p(x^{t-1})} = p(x_t\vert x^{t-1}) \quad .$ (15)
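As an aside (not needed for the recursion itself, but useful in practice), the normalising constants also yield the data likelihood: since $ \kappa_t = p(x^t)/p(x^{t-1})$ the product over $ t$ telescopes, and, assuming the recursion is started at $ t=1$ with the empty conditioning set $ x^0$ (so that $ p(x^0)\equiv 1$),

$\displaystyle \log p(x^T) = \sum_{t=1}^{T} \log\kappa_t \quad ,$

which can be accumulated in the log domain without underflow.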

The new variable $ \kappa_t$ is just the normalising constant of $ \sum_{s^\prime} \psi(s,s^\prime,x_t)\tilde\alpha_{t-1}(s^\prime)$, so the recursion can be carried out in three steps:
$\displaystyle a_t(s)$ $\displaystyle =$ $\displaystyle \sum_{s^\prime} \psi(s,s^\prime,x_t) \, \tilde\alpha_{t-1}(s^\prime)$ (16)
$\displaystyle \kappa_t$ $\displaystyle =$ $\displaystyle \sum_s a_t(s)$ (17)
$\displaystyle \tilde\alpha_t(s)$ $\displaystyle =$ $\displaystyle \frac{1}{\kappa_t} \, a_t(s) \quad .$ (18)
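To make the three steps (16)-(18) concrete, here is a minimal numerical sketch of the scaled forward pass for a discrete HMM. It assumes $ \psi(s,s^\prime,x_t)=p(x_t\vert s_t=s)\,p(s_t=s\vert s_{t-1}=s^\prime)$ and represents the model by a transition matrix A with A[s',s] $ = p(s\vert s^\prime)$, an emission matrix B with B[s,x] $ = p(x\vert s)$, and an initial distribution pi; these names and the 0-based time index are illustrative choices, not part of the text above.

import numpy as np

def forward_pass(pi, A, B, obs):
    # Scaled forward recursion (16)-(18) for a discrete HMM (illustrative sketch).
    # pi  : (S,)  assumed initial state distribution
    # A   : (S,S) A[s_prev, s] = p(s | s_prev)
    # B   : (S,M) B[s, x]      = p(x | s)
    # obs : (T,)  integer observations x_1 .. x_T
    T, S = len(obs), len(pi)
    alpha = np.zeros((T, S))   # alpha[t, s] = p(s_t = s | x^t)
    kappa = np.zeros(T)        # kappa[t]    = p(x_t | x^{t-1})
    a = B[:, obs[0]] * pi                        # unnormalised first step
    kappa[0] = a.sum()
    alpha[0] = a / kappa[0]
    for t in range(1, T):
        a = B[:, obs[t]] * (A.T @ alpha[t - 1])  # eq (16)
        kappa[t] = a.sum()                       # eq (17)
        alpha[t] = a / kappa[t]                  # eq (18)
    return alpha, kappa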

Next, normalise $ \beta$ as follows:

$\displaystyle \boxed{ \tilde\beta_t(s^\prime) = \frac{p(x^{[t+1}\vert s_t=s^\prime)}{p(x^{[t+1}\vert x^t)} }$ (19)

and, using (19) and (12), the backward recursion for $ \tilde\beta$ becomes
$\displaystyle \tilde\beta_t(s^\prime)$ $\displaystyle =$ $\displaystyle \frac{1}{p(x^{[t+1}\vert x^t)} \, \beta_t(s^\prime)$ (20)
  $\displaystyle =$ $\displaystyle \frac{p(x^{[t+2}\vert x^{t+1})}{p(x^{[t+1}\vert x^t)} \sum_s \psi(s,s^\prime,x_{t+1}) \,\tilde\beta_{t+1}(s)$ (21)

Because of
$\displaystyle p(x^{[t+1}\vert x^t)$ $\displaystyle =$ $\displaystyle \frac{p(x)}{p(x^t)} = \frac{p(x^{[t+2}\vert x^{t+1}) \, p(x^{t+1})}{p(x^t)}$  
  $\displaystyle =$ $\displaystyle p(x^{[t+2}\vert x^{t+1}) \, p(x_{t+1}\vert x^t) = \kappa_{t+1} \, p(x^{[t+2}\vert x^{t+1})$  

the recursion becomes

$\displaystyle \boxed{ \tilde\beta_t(s^\prime) = \frac{1}{\kappa_{t+1}} \sum_s \psi(s,s^\prime,x_{t+1}) \,\tilde\beta_{t+1}(s) }$ (22)

Since
$\displaystyle \tilde\beta_{T-1}(s_{T-1})$ $\displaystyle =$ $\displaystyle \frac{p(x_T\vert s_{T-1})}{p(x_T\vert x^{T-1})}$  
  $\displaystyle =$ $\displaystyle \frac{\int ds_T p(x_T\vert s_T) \, p(s_T\vert s_{T-1})}{\kappa_T}$  

the backward iteration is properly initialised by defining

$\displaystyle \tilde\beta_T(s_T) = 1$ (23)
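A matching sketch of the scaled backward pass, under the same assumed conventions as the forward sketch above and reusing the $ \kappa_t$ it returns:

import numpy as np

def backward_pass(A, B, obs, kappa):
    # Scaled backward recursion (22) with initialisation (23); conventions as in forward_pass.
    T, S = len(obs), A.shape[0]
    beta = np.ones((T, S))     # eq (23): all entries of the final beta are one
    for t in range(T - 2, -1, -1):
        # eq (22): beta_t(s') = (1/kappa_{t+1}) * sum_s psi(s, s', x_{t+1}) beta_{t+1}(s)
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1]) / kappa[t + 1]
    return beta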

Finally, the quantities $ \gamma $ and $ \Omega $ needed in the M-step are related to $ \tilde\alpha$ and $ \tilde\beta$ via

$\displaystyle \gamma_t(s_t) = p(s_t\vert x)$ $\displaystyle =$ $\displaystyle \frac{p(s_t x^t x^{[t+1})}{p(x)} = \frac{p(x^{[t+1}\vert s_t) \, p(s_t\vert x^t) \, p(x^t)}{p(x)}$  
  $\displaystyle =$ $\displaystyle \frac{p(x^{[t+1}\vert s_t)}{p(x^{[t+1}\vert x^t)} \, p(s_t\vert x^t) = \tilde\alpha_t(s_t) \, \tilde\beta_t(s_t)$  

and
$\displaystyle \Omega_t(s_{t-1}s_t)$ $\displaystyle =$ $\displaystyle p(s_{t-1}s_t\vert x) = \frac{1}{p(x)} \, p(s_{t-1}s_tx^{t-1}x^{[t})$  
  $\displaystyle =$ $\displaystyle \frac{1}{p(x)} \, p(s_{t-1},x^{t-1}) \, p(s_t\vert s_{t-1}) \, p(x_t\vert s_t) \, p(x^{[t+1}\vert s_t)$  
  $\displaystyle =$ $\displaystyle \frac{1}{\kappa_t} \, \tilde\alpha_{t-1}(s_{t-1}) \, p(s_t\vert s_{t-1}) \, p(x_t\vert s_t) \, \tilde\beta_t(s_t) \quad ,$  

where $ p(x) = p(x^{t-1}) \, \kappa_t \, p(x^{[t+1}\vert x^t)$ has been used in the last step.
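Putting the pieces together, the E-step quantities can be read off the scaled variables directly. The sketch below (same assumed conventions and helper functions as above) also verifies the normalisations $ \sum_{s_t}\gamma_t(s_t)=1$ and $ \sum_{s_{t-1}s_t}\Omega_t(s_{t-1}s_t)=1$ numerically:

import numpy as np

def e_step_quantities(alpha, beta, A, B, obs, kappa):
    # gamma[t, s]    = p(s_t = s | x)                (product of the scaled variables)
    # Omega[t, i, j] = p(s_{t-1} = i, s_t = j | x)   (defined from the second step on)
    T, S = alpha.shape
    gamma = alpha * beta
    Omega = np.zeros((T, S, S))
    for t in range(1, T):
        Omega[t] = (alpha[t - 1][:, None] * A       # alpha_{t-1}(s_{t-1}) p(s_t | s_{t-1})
                    * B[:, obs[t]][None, :]         # p(x_t | s_t)
                    * beta[t][None, :]) / kappa[t]  # beta_t(s_t) / kappa_t
    assert np.allclose(gamma.sum(axis=1), 1.0)
    assert np.allclose(Omega[1:].sum(axis=(1, 2)), 1.0)
    return gamma, Omega

In an EM iteration one would run forward_pass, then backward_pass, then e_step_quantities on the current parameter estimates, and feed $ \gamma $ and $ \Omega $ into the M-step updates.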

