Hidden Markov Model Problem 1: Computing the Probability of an Observation Sequence

Background

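The first of the three problems studied for hidden Markov models (HMMs) is computing the probability of an observation sequence under a given model.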

Brute-force solution

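Suppose the HMM parameters $\lambda = (\mathbf{A}, \mathbf{B}, \boldsymbol{\pi})$ are known, where $\mathbf{A}$ is the hidden-state transition probability matrix, $\mathbf{B}$ is the observation probability matrix, and $\boldsymbol{\pi}$ is the initial probability distribution over the hidden states. Given an observation sequence $O = \{o_1, o_2, \cdots, o_t, \cdots, o_T\}$, we want the conditional probability $P(O|\lambda)$. Since $\mathbf{B}$ gives the probability of each observation given a hidden state and $\mathbf{A}$ gives the transition probabilities between hidden states, the probability can be computed by brute force: enumerate every hidden state sequence $I = \{i_1, i_2, \cdots, i_T\}$ and proceed as follows: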
(1) The probability of any hidden state sequence $I$ is:
$$P(I|\lambda) = \pi_{i_1} a_{i_1 i_2} a_{i_2 i_3} \cdots a_{i_{T-1} i_T}$$
(2) Given the hidden sequence, the probability of the observation sequence is:
$$P(O|I,\lambda) = b_{i_1}(o_1) b_{i_2}(o_2) \cdots b_{i_T}(o_T)$$
(3) The joint probability of the hidden sequence and the observation sequence is:
$$P(O,I|\lambda) = P(I|\lambda) P(O|I,\lambda) = \pi_{i_1} b_{i_1}(o_1) a_{i_1 i_2} b_{i_2}(o_2) \cdots a_{i_{T-1} i_T} b_{i_T}(o_T)$$
(4) Marginalize over all hidden sequences to obtain the probability of the observation sequence $O$ under the model $\lambda$:
$$P(O|\lambda) = \sum_I P(O,I|\lambda) = \sum_{i_1, i_2, \cdots, i_T} \pi_{i_1} b_{i_1}(o_1) a_{i_1 i_2} b_{i_2}(o_2) \cdots a_{i_{T-1} i_T} b_{i_T}(o_T)$$
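The brute-force method is only practical for models with few hidden states and short sequences. Because the hidden states are unknown, every possible hidden sequence has to be considered. With $N$ hidden states and a sequence of length $T$ there are $N^T$ hidden sequences, and evaluating each one takes $O(T)$ multiplications, so the total time complexity is $O(T N^T)$.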
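The enumeration in steps (1) to (4) can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `brute_force_probability` and the integer encoding of observations (0 = red, 1 = white) are assumptions made here, and the model values are the ones used in the box-and-ball example later in this article.

```python
from itertools import product

def brute_force_probability(pi, A, B, obs):
    """Sum P(O, I | lambda) over every hidden state sequence I (assumes len(obs) >= 1)."""
    n_states = len(pi)
    total = 0.0
    # Enumerate all N**T hidden state sequences.
    for states in product(range(n_states), repeat=len(obs)):
        # First factor: pi_{i_1} * b_{i_1}(o_1)
        p = pi[states[0]] * B[states[0]][obs[0]]
        # Remaining factors: a_{i_{t-1} i_t} * b_{i_t}(o_t)
        for t in range(1, len(obs)):
            p *= A[states[t - 1]][states[t]] * B[states[t]][obs[t]]
        total += p
    return total

# Model values from the box-and-ball example in this article.
pi = [0.2, 0.4, 0.4]
A = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.2],
     [0.2, 0.3, 0.5]]
B = [[0.5, 0.5],
     [0.4, 0.6],
     [0.7, 0.3]]
obs = [0, 1, 0]  # assumed encoding: 0 = red, 1 = white

prob = brute_force_probability(pi, A, B, obs)
print(round(prob, 6))  # 0.130218
```

With 3 states and 3 observations the loop visits only 27 sequences, but the count grows as $N^T$, which is why this approach does not scale.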

Computing the HMM observation probability with the forward algorithm

The forward algorithm is a dynamic programming method: it builds the final answer from the solutions of subproblems, here called local states. For the forward algorithm the local state is the forward probability $\alpha_t(i)$, the joint probability of the observations $o_1, o_2, \cdots, o_t$ and of being in hidden state $i$ at time $t$, that is, $\alpha_t(i) = P(o_1, o_2, \cdots, o_t, i_t = i \,|\, \lambda)$. At time $t = 1$ it is initialized as:
$$\alpha_1(i) = \pi_i b_i(o_1), \quad i = 1, 2, \cdots, N$$
Then, for $t = 1, 2, \cdots, T-1$, the forward probabilities at times $2, 3, \cdots, T$ follow from the recursion:
$$\alpha_{t+1}(i) = \left[ \sum_{j=1}^{N} \alpha_t(j) a_{ji} \right] b_i(o_{t+1}), \quad i = 1, 2, \cdots, N$$
Finally, the probability of the observation sequence under the given model is:
$$P(O|\lambda) = \sum_{i=1}^{N} \alpha_T(i)$$
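The three steps above (initialization, recursion, termination) translate almost line for line into code. The sketch below is one possible pure-Python version; the function name `forward` and the two-state model are hypothetical and chosen only for illustration. Each time step costs $O(N^2)$ operations, so the whole pass is $O(N^2 T)$ rather than $O(T N^T)$.

```python
def forward(pi, A, B, obs):
    """Return (alpha, prob); alpha[t][i] is the forward probability at 0-based step t."""
    n_states = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n_states)]]
    # Recursion: alpha_{t+1}(i) = [sum_j alpha_t(j) * a_ji] * b_i(o_{t+1})
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([
            sum(prev[j] * A[j][i] for j in range(n_states)) * B[i][o]
            for i in range(n_states)
        ])
    # Termination: P(O | lambda) = sum_i alpha_T(i)
    return alpha, sum(alpha[-1])

# Hypothetical two-state, two-symbol model (not from the article).
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.9, 0.1],
     [0.2, 0.8]]

alpha, prob = forward(pi, A, B, [0, 1])

# Sanity check: the probabilities of all length-2 observation sequences sum to 1.
total = sum(forward(pi, A, B, [o1, o2])[1] for o1 in range(2) for o2 in range(2))
print(round(prob, 6), round(total, 6))
```

The sanity check at the end is a cheap way to catch indexing mistakes such as using `A[i][j]` where `A[j][i]` is needed.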

Worked example of the forward algorithm

Consider three boxes, each containing red and white balls. The number of balls in each box is:

| Box | Box 1 | Box 2 | Box 3 |
| --- | --- | --- | --- |
| Red balls | 5 | 4 | 7 |
| White balls | 5 | 6 | 3 |

Balls are drawn from the boxes. The first ball is drawn from box 1 with probability 0.2, from box 2 with probability 0.4, and from box 3 with probability 0.4, which sum to 1. The initial state distribution is therefore:
$$\boldsymbol{\pi} = (0.2, 0.4, 0.4)^{\top}$$
The state transition probability matrix is:
$$\mathbf{A} = \begin{pmatrix} 0.5 & 0.2 & 0.3 \\ 0.3 & 0.5 & 0.2 \\ 0.2 & 0.3 & 0.5 \end{pmatrix}$$
The observation probability matrix, with one row per box and the columns ordered (red, white), is:
$$\mathbf{B} = \begin{pmatrix} 0.5 & 0.5 \\ 0.4 & 0.6 \\ 0.7 & 0.3 \end{pmatrix}$$
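The observation matrix is simply the ball counts normalized per box. As a quick check, the sketch below derives it from the counts and confirms that $\boldsymbol{\pi}$ and every row of $\mathbf{A}$ and $\mathbf{B}$ sum to 1; the variable names are illustrative assumptions.

```python
# Ball counts per box as (red, white), taken from the table in this example.
counts = [(5, 5), (4, 6), (7, 3)]

# Row i of B is the colour distribution when drawing from box i + 1.
B = [[red / (red + white), white / (red + white)] for red, white in counts]

pi = [0.2, 0.4, 0.4]
A = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.2],
     [0.2, 0.3, 0.5]]

def is_distribution(row, tol=1e-12):
    """True if the entries are non-negative and sum to 1 within a tolerance."""
    return all(x >= 0 for x in row) and abs(sum(row) - 1.0) < tol

valid = is_distribution(pi) and all(is_distribution(r) for r in A + B)
print(B)      # [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]
print(valid)  # True
```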
Problem: given the observation sequence {red, white, red}, what is the probability of this sequence?
The solution proceeds as follows:
(1) At time 1 a red ball is observed. The forward probabilities of the three states are:
Forward probability of box 1:
$$\alpha_1(1) = \pi_1 b_1(o_1) = 0.2 \times 0.5 = 0.1$$
Forward probability of box 2:
$$\alpha_1(2) = \pi_2 b_2(o_1) = 0.4 \times 0.4 = 0.16$$
Forward probability of box 3:
$$\alpha_1(3) = \pi_3 b_3(o_1) = 0.4 \times 0.7 = 0.28$$

(2) Apply the recursion to get the forward probabilities of the three states at time 2, where a white ball is observed.
Forward probability of box 1:
$$\alpha_2(1) = \left[ \sum_{i=1}^{3} \alpha_1(i) a_{i1} \right] b_1(o_2) = [0.1 \times 0.5 + 0.16 \times 0.3 + 0.28 \times 0.2] \times 0.5 = 0.077$$
Forward probability of box 2:
$$\alpha_2(2) = \left[ \sum_{i=1}^{3} \alpha_1(i) a_{i2} \right] b_2(o_2) = [0.1 \times 0.2 + 0.16 \times 0.5 + 0.28 \times 0.3] \times 0.6 = 0.1104$$
Forward probability of box 3:
$$\alpha_2(3) = \left[ \sum_{i=1}^{3} \alpha_1(i) a_{i3} \right] b_3(o_2) = [0.1 \times 0.3 + 0.16 \times 0.2 + 0.28 \times 0.5] \times 0.3 = 0.0606$$

(3) Apply the recursion again to get the forward probabilities of the three states at time 3, where a red ball is observed.
Forward probability of box 1:
$$\alpha_3(1) = \left[ \sum_{i=1}^{3} \alpha_2(i) a_{i1} \right] b_1(o_3) = [0.077 \times 0.5 + 0.1104 \times 0.3 + 0.0606 \times 0.2] \times 0.5 = 0.04187$$
Forward probability of box 2:
$$\alpha_3(2) = \left[ \sum_{i=1}^{3} \alpha_2(i) a_{i2} \right] b_2(o_3) = [0.077 \times 0.2 + 0.1104 \times 0.5 + 0.0606 \times 0.3] \times 0.4 = 0.03551$$
Forward probability of box 3:
$$\alpha_3(3) = \left[ \sum_{i=1}^{3} \alpha_2(i) a_{i3} \right] b_3(o_3) = [0.077 \times 0.3 + 0.1104 \times 0.2 + 0.0606 \times 0.5] \times 0.7 = 0.05284$$
Finally, the probability of the observation sequence {red, white, red} is:
$$P(O|\lambda) = \sum_{i=1}^{3} \alpha_3(i) = 0.04187 + 0.03551 + 0.05284 = 0.13022$$
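The hand calculation can be checked numerically. The sketch below runs the same three steps on the box-and-ball model and prints the forward probabilities at each time step, rounded to five decimals; the encoding 0 = red, 1 = white and the variable names are assumptions for this illustration.

```python
pi = [0.2, 0.4, 0.4]
A = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.2],
     [0.2, 0.3, 0.5]]
B = [[0.5, 0.5],   # box 1: P(red), P(white)
     [0.4, 0.6],   # box 2
     [0.7, 0.3]]   # box 3
obs = [0, 1, 0]    # red, white, red (assumed encoding)

# Step (1): alpha_1(i) = pi_i * b_i(o_1)
alpha = [pi[i] * B[i][obs[0]] for i in range(3)]
history = [alpha]

# Steps (2) and (3): alpha_{t+1}(i) = [sum_j alpha_t(j) * a_ji] * b_i(o_{t+1})
for o in obs[1:]:
    alpha = [sum(alpha[j] * A[j][i] for j in range(3)) * B[i][o] for i in range(3)]
    history.append(alpha)

for t, row in enumerate(history, start=1):
    print(t, [round(x, 5) for x in row])
# 1 [0.1, 0.16, 0.28]
# 2 [0.077, 0.1104, 0.0606]
# 3 [0.04187, 0.03551, 0.05284]

prob = sum(history[-1])
print(round(prob, 5))  # 0.13022
```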

Computing the HMM observation probability with the backward algorithm

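The backward algorithm computes the same observation probability in much the same way as the forward algorithm. The main difference is that the dynamic programming recursion runs in the opposite direction, from time $T$ back to time 1.
The backward probability $\beta_t(i)$ is the probability of the remaining observations $o_{t+1}, o_{t+2}, \cdots, o_T$ given that the hidden state at time $t$ is $i$. At time $T$ it is initialized as:
$$\beta_T(i) = 1, \quad i = 1, 2, \cdots, N$$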

Then the backward probabilities of each hidden state at times $T-1, T-2, \cdots, 1$ are obtained recursively:
$$\beta_t(i) = \sum_{j=1}^{N} a_{ij} b_j(o_{t+1}) \beta_{t+1}(j), \quad i = 1, 2, \cdots, N$$

Finally, the probability of the observation sequence under the given model is:
$$P(O|\lambda) = \sum_{i=1}^{N} \pi_i b_i(o_1) \beta_1(i)$$
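As a cross-check, the backward recursion can be run on the box-and-ball model from the forward example; it should give the same probability. The sketch below is an illustrative pure-Python version, with the function name `backward` and the encoding 0 = red, 1 = white assumed here.

```python
def backward(pi, A, B, obs):
    """Return (beta_1, prob) computed with the backward recursion."""
    n = len(pi)
    # Initialization: beta_T(i) = 1
    beta = [1.0] * n
    # Recursion for t = T-1, ..., 1 (0-based indices T-2 down to 0):
    # beta_t(i) = sum_j a_ij * b_j(o_{t+1}) * beta_{t+1}(j)
    for t in range(len(obs) - 2, -1, -1):
        beta = [
            sum(A[i][j] * B[j][obs[t + 1]] * beta[j] for j in range(n))
            for i in range(n)
        ]
    # Termination: P(O | lambda) = sum_i pi_i * b_i(o_1) * beta_1(i)
    prob = sum(pi[i] * B[i][obs[0]] * beta[i] for i in range(n))
    return beta, prob

pi = [0.2, 0.4, 0.4]
A = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.2],
     [0.2, 0.3, 0.5]]
B = [[0.5, 0.5],
     [0.4, 0.6],
     [0.7, 0.3]]
obs = [0, 1, 0]  # red, white, red (assumed encoding)

beta1, prob = backward(pi, A, B, obs)
print([round(x, 4) for x in beta1])  # [0.2451, 0.2622, 0.2277]
print(round(prob, 6))                # 0.130218
```

The result agrees with the forward pass, which is a useful consistency test when implementing both recursions.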

Posted by 乘风 (admin team)