1 Probability formulas
Conditional probability: ![P(A|B)=\frac{P(AB)}{P(B)}](https://latex.csdn.net/eq?P%28A%7CB%29%3D%5Cfrac%7BP%28AB%29%7D%7BP%28B%29%7D)
Law of total probability: ![P(A)=\sum_{i}P(A|B_{i})P(B_{i})](https://latex.csdn.net/eq?P%28A%29%3D%5Csum_%7Bi%7DP%28A%7CB_%7Bi%7D%29P%28B_%7Bi%7D%29)
Bayes' formula: ![P(B_{i}|A) =\frac{P(A|B_{i})P(B_{i})}{\sum_{j}P(A|B_{j})P(B_{j})}](https://latex.csdn.net/eq?P%28B_%7Bi%7D%7CA%29%20%3D%5Cfrac%7BP%28A%7CB_%7Bi%7D%29P%28B_%7Bi%7D%29%7D%7B%5Csum_%7Bj%7DP%28A%7CB_%7Bj%7D%29P%28B_%7Bj%7D%29%7D)
2 Bayes' formula
2.1 What Bayes' formula suggests
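Given a sample set $D$, we want the probability that each candidate conclusion $A_{1},A_{2},\ldots,A_{n}$ holds under those samples, i.e. ![P(A_{i}|D)](https://latex.csdn.net/eq?P%28A_%7Bi%7D%7CD%29)
Bayes' formula gives $P(A_{i}|D)=\frac{P(D|A_{i})P(A_{i})}{P(D)}$.
Once the samples are given, $P(D)$ is the same constant for every $A_{i}$; it is only a normalization factor. Maximizing $P(A_{i}|D)$ over $i$ is therefore the same as maximizing $P(D|A_{i})P(A_{i})$.
Ignoring ![P(A_{i})](https://latex.csdn.net/eq?P%28A_%7Bi%7D%29): if the conclusions $A_{1},A_{2},\ldots,A_{n}$ have equal (or approximately equal) prior probabilities, the maximization reduces further to maximizing $P(D|A_{i})$.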
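As a small illustration of the reasoning above, the sketch below compares three ways of picking the best conclusion: by posterior, by likelihood times prior, and by likelihood alone. The priors and likelihoods are made-up numbers chosen for the example (equal priors, as the last step assumes); they do not come from any data set.

```python
from fractions import Fraction

# Hypothetical numbers: three conclusions A_1..A_3 with equal priors P(A_i)
# and invented likelihoods P(D | A_i).
priors = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]
likelihoods = [Fraction(1, 10), Fraction(1, 2), Fraction(1, 4)]

# P(D) from the law of total probability; it does not depend on i.
evidence = sum(l * p for l, p in zip(likelihoods, priors))

# Posterior P(A_i | D) from Bayes' formula.
posteriors = [l * p / evidence for l, p in zip(likelihoods, priors)]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


best_by_posterior = argmax(posteriors)
best_by_unnormalized = argmax([l * p for l, p in zip(likelihoods, priors)])
best_by_likelihood = argmax(likelihoods)

print(best_by_posterior, best_by_unnormalized, best_by_likelihood)
```

With equal priors all three indices agree. If the priors were very unequal, the likelihood-only choice could differ from the posterior choice, which is why the equal-prior condition matters.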
2.2 Applying Bayes' formula
The gold bar problem:
![](https://img-blog.csdnimg.cn/489a96ccbdea48a5a52c7e26170165db.png)
Label the three boxes B=1, B=2, B=3, and the two kinds of precious metal M=G (gold bar) and M=S (silver bar).
We are given: ![P(B=1)=P(B=2)=P(B=3)=\frac{1}{3}](https://latex.csdn.net/eq?P%28B%3D1%29%3DP%28B%3D2%29%3DP%28B%3D3%29%3D%5Cfrac%7B1%7D%7B3%7D)
![P(M=G|B=1)=1,P(M=S|B=1)=0](https://latex.csdn.net/eq?P%28M%3DG%7CB%3D1%29%3D1%2CP%28M%3DS%7CB%3D1%29%3D0)
![P(M=G|B=2)=0,P(M=S|B=2)=1](https://latex.csdn.net/eq?P%28M%3DG%7CB%3D2%29%3D0%2CP%28M%3DS%7CB%3D2%29%3D1)
![P(M=G|B=3)=\frac{1}{2},P(M=S|B=3)=\frac{1}{2}](https://latex.csdn.net/eq?P%28M%3DG%7CB%3D3%29%3D%5Cfrac%7B1%7D%7B2%7D%2CP%28M%3DS%7CB%3D3%29%3D%5Cfrac%7B1%7D%7B2%7D)
The problem then becomes finding ![P(B=1|M=G)=?](https://latex.csdn.net/eq?P%28B%3D1%7CM%3DG%29%3D%3F)
Solution: ![P(B=1|M=G)=\frac{P(B=1,M=G)}{P(M=G)}=\frac{\frac{1}{3}}{\frac{1}{3}+0+\frac{1}{3}\cdot \frac{1}{2}}=\frac{2}{3}](https://latex.csdn.net/eq?P%28B%3D1%7CM%3DG%29%3D%5Cfrac%7BP%28B%3D1%2CM%3DG%29%7D%7BP%28M%3DG%29%7D%3D%5Cfrac%7B%5Cfrac%7B1%7D%7B3%7D%7D%7B%5Cfrac%7B1%7D%7B3%7D+0+%5Cfrac%7B1%7D%7B3%7D%5Ccdot%20%5Cfrac%7B1%7D%7B2%7D%7D%3D%5Cfrac%7B2%7D%7B3%7D)
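The same calculation can be checked with exact fractions. This is only a sketch of the arithmetic above; the dictionary and function names are chosen for the example.

```python
from fractions import Fraction

# Prior over boxes and P(M=G | B=b), taken from the problem statement.
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
p_gold_given_box = {1: Fraction(1), 2: Fraction(0), 3: Fraction(1, 2)}


def posterior_box_given_gold(box):
    """P(B=box | M=G) via Bayes' formula."""
    # Denominator: P(M=G) from the law of total probability.
    evidence = sum(p_gold_given_box[b] * prior[b] for b in prior)
    return p_gold_given_box[box] * prior[box] / evidence


print(posterior_box_given_gold(1))  # 2/3
```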
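2.3 Bayesian networks
- Take the random variables involved in a system under study and draw them in a directed graph according to whether they are conditionally independent; the result is a Bayesian network.
- A Bayesian network, also called a directed acyclic graph (DAG) model, is a kind of probabilistic graphical model. Based on the topology of the graph, it describes a set of random variables $\{X_{1},X_{2},\ldots,X_{n}\}$ and their $n$ conditional probability distributions.
- Probabilistic graphical models divide into Markov networks (undirected graphs) and Bayesian networks (directed graphs).
- In general, the nodes of a Bayesian network's directed acyclic graph represent random variables, which may be observable variables, latent variables, unknown parameters, and so on. An arrow connecting two nodes means the two random variables are causally related (or not conditionally independent). If two nodes are joined by a single arrow, one node is the "cause" (parent) and the other is the "effect" (child), and the pair gives rise to a conditional probability value.
- A simple Bayesian network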
2.4 Fully connected Bayesian networks
Every pair of nodes is connected by an edge.
![p(x_{1},...,x_{K})=p(x_{K}|x_{1},...,x_{K-1})...p(x_{2}|x_{1})p(x_{1})](https://latex.csdn.net/eq?p%28x_%7B1%7D%2C...%2Cx_%7BK%7D%29%3Dp%28x_%7BK%7D%7Cx_%7B1%7D%2C...%2Cx_%7BK-1%7D%29...p%28x_%7B2%7D%7Cx_%7B1%7D%29p%28x_%7B1%7D%29)
![P(X_{1}=x_{1},...,X_{n}=x_{n})=\prod_{i=1}^{n}P(X_{i}=x_{i}|X_{i+1}=x_{i+1},...,X_{n}=x_{n})](https://latex.csdn.net/eq?P%28X_%7B1%7D%3Dx_%7B1%7D%2C...%2CX_%7Bn%7D%3Dx_%7Bn%7D%29%3D%5Cprod_%7Bi%3D1%7D%5E%7Bn%7DP%28X_%7Bi%7D%3Dx_%7Bi%7D%7CX_%7Bi+1%7D%3Dx_%7Bi+1%7D%2C...%2CX_%7Bn%7D%3Dx_%7Bn%7D%29)
Example: for K=5, $p(x_{1},\ldots,x_{5})=p(x_{5}|x_{1},\ldots,x_{4})\,p(x_{4}|x_{1},\ldots,x_{3})\,p(x_{3}|x_{1},x_{2})\,p(x_{2}|x_{1})\,p(x_{1})$
![](https://img-blog.csdnimg.cn/32d62f02aec74c59a196bc7ccb7e7713.png)
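The chain-rule factorization holds for any joint distribution, which is why a fully connected network makes no independence assumptions. The sketch below checks this numerically on a random joint distribution over K binary variables; the distribution and the helper names are invented for the example.

```python
import itertools
import random

random.seed(0)
K = 4  # number of binary variables in this toy example

# A random joint distribution p(x_1, ..., x_K), stored as {assignment: probability}.
assignments = list(itertools.product([0, 1], repeat=K))
weights = [random.random() for _ in assignments]
total = sum(weights)
joint = {x: w / total for x, w in zip(assignments, weights)}


def marginal(prefix):
    """p(x_1..x_k) for a prefix assignment, summing the joint over the rest."""
    k = len(prefix)
    return sum(p for x, p in joint.items() if x[:k] == prefix)


def chain_rule_product(x):
    """p(x_1) * p(x_2|x_1) * ... * p(x_K|x_1..x_{K-1})."""
    product = 1.0
    for k in range(1, K + 1):
        # p(x_k | x_1..x_{k-1}) = p(x_1..x_k) / p(x_1..x_{k-1})
        product *= marginal(x[:k]) / marginal(x[: k - 1])
    return product


max_error = max(abs(chain_rule_product(x) - joint[x]) for x in assignments)
print(max_error)  # tiny floating-point error only
```

For binary variables, a fully connected network over K nodes needs 2^K - 1 free parameters in total, the same as a full joint table; the savings come only when edges are removed.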
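2.5 "Ordinary" Bayesian networks
- Some edges are missing.
- As the figure below shows, the missing edges encode independence: intuitively, some variables are independent outright, while others are independent only once a third variable is given.
- The joint distribution of the variables in the figure is: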
![](https://img-blog.csdnimg.cn/8200bc91774a4e5eadb24edd2a049cff.png)
Examples:
Example 1:
![](https://img-blog.csdnimg.cn/575ab950c66c4fab97276f4538f16bb5.png)
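Dyspnea (D) can be caused by both lung cancer (C) and bronchitis (B), which is why the conditional probability table (CPD) for D above is conditioned on both C and B.
Example 2: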
![](https://img-blog.csdnimg.cn/c219953ab6044bbeb0bf1e1b09241215.png)
The joint distribution over all the random variables, evaluated at one assignment, is:
![P(j,m,a,\overline{b},\overline{e})=P(j|a)P(m|a)P(a|\overline{b},\overline{e})P(\overline{b})P(\overline{e})=0.9\times 0.7\times 0.001\times 0.999\times 0.998\approx 0.00063](https://latex.csdn.net/eq?P%28j%2Cm%2Ca%2C%5Coverline%7Bb%7D%2C%5Coverline%7Be%7D%29%3DP%28j%7Ca%29P%28m%7Ca%29P%28a%7C%5Coverline%7Bb%7D%2C%5Coverline%7Be%7D%29P%28%5Coverline%7Bb%7D%29P%28%5Coverline%7Be%7D%29%3D0.9%5Ctimes%200.7%5Ctimes%200.001%5Ctimes%200.999%5Ctimes%200.998%5Capprox%200.00063)
In practice, computing the joint distribution needs only the graph topology and the conditional probability table of each random variable.
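The product in Example 2 can be reproduced in a few lines. The sketch below uses only the five factor values that appear in the worked equation; the variable names are chosen for the example, and the rest of the tables in the figure is not needed for this one assignment.

```python
import math

# Factor values as they appear in the worked product above.
p_j_given_a = 0.9                # P(j | a)
p_m_given_a = 0.7                # P(m | a)
p_a_given_not_b_not_e = 0.001    # P(a | not b, not e)
p_not_b = 0.999                  # P(not b)
p_not_e = 0.998                  # P(not e)

# Each node contributes one factor: its probability given its parents.
factors = [p_j_given_a, p_m_given_a, p_a_given_not_b_not_e, p_not_b, p_not_e]
joint = math.prod(factors)

print(f"{joint:.5f}")  # 0.00063
```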
2.6 "Special" Bayesian networks
![](https://img-blog.csdnimg.cn/675f47025858466d8e3c59ad7fe1a44e.png)
Determining conditional independence from a Bayesian network:
(1) Case 1: tail-to-tail
From the figure: ![P(a,b,c)=P(c)\cdot P(a|c)\cdot P(b|c)](https://latex.csdn.net/eq?P%28a%2Cb%2Cc%29%3DP%28c%29%5Ccdot%20P%28a%7Cc%29%5Ccdot%20P%28b%7Cc%29)
Therefore: ![P(a,b,c)/P(c)=P(a|c)P(b|c)](https://latex.csdn.net/eq?P%28a%2Cb%2Cc%29/P%28c%29%3DP%28a%7Cc%29P%28b%7Cc%29)
And since: ![P(a,b|c)=P(a,b,c)/P(c)](https://latex.csdn.net/eq?P%28a%2Cb%7Cc%29%3DP%28a%2Cb%2Cc%29/P%28c%29)
we get: ![P(a,b|c)=P(a|c)P(b|c)](https://latex.csdn.net/eq?P%28a%2Cb%7Cc%29%3DP%28a%7Cc%29P%28b%7Cc%29)
That is, given c, the path between a and b is blocked, and a and b are independent.
![](https://img-blog.csdnimg.cn/9b00336602de4af88776730609e842ca.png)
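The tail-to-tail result can be checked numerically. The sketch below builds a joint distribution from made-up tables for binary a, b, c with the structure a <- c -> b, then measures how far a and b are from independent, first given c and then with c summed out. All numbers are hypothetical.

```python
import itertools

# Hypothetical tables for the structure a <- c -> b (all variables binary).
p_c = {0: 0.6, 1: 0.4}
p_a_given_c = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}  # p_a_given_c[c][a]
p_b_given_c = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.5, 1: 0.5}}  # p_b_given_c[c][b]

# P(a,b,c) = P(c) P(a|c) P(b|c)
joint = {
    (a, b, c): p_c[c] * p_a_given_c[c][a] * p_b_given_c[c][b]
    for a, b, c in itertools.product([0, 1], repeat=3)
}

NAMES = ("a", "b", "c")


def prob(**fixed):
    """Marginal probability of the given values, e.g. prob(a=1, c=0)."""
    return sum(
        p for x, p in joint.items()
        if all(x[NAMES.index(n)] == v for n, v in fixed.items())
    )


# Largest gap between P(a,b|c) and P(a|c)P(b|c) over all values.
conditional_gap = max(
    abs(prob(a=a, b=b, c=c) / prob(c=c)
        - (prob(a=a, c=c) / prob(c=c)) * (prob(b=b, c=c) / prob(c=c)))
    for a, b, c in itertools.product([0, 1], repeat=3)
)

# Largest gap between P(a,b) and P(a)P(b) when c is not given.
marginal_gap = max(
    abs(prob(a=a, b=b) - prob(a=a) * prob(b=b))
    for a, b in itertools.product([0, 1], repeat=2)
)

print(conditional_gap, marginal_gap)
```

With these numbers the conditional gap is zero up to rounding, while the marginal gap is clearly nonzero: a and b are independent given c, but not once c is summed out.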
(2) Case 2: head-to-tail
Since ![P(a,b,c)=P(a)\cdot P(c|a)\cdot P(b|c)](https://latex.csdn.net/eq?P%28a%2Cb%2Cc%29%3DP%28a%29%5Ccdot%20P%28c%7Ca%29%5Ccdot%20P%28b%7Cc%29)
we have:
![P(a,b|c)=P(a,b,c)/P(c)=[P(a)\cdot P(c|a)\cdot P(b|c)] /P(c)=[P(a,c)\cdot P(b|c)]/P(c)=P(a|c)\cdot P(b|c)](https://latex.csdn.net/eq?P%28a%2Cb%7Cc%29%3DP%28a%2Cb%2Cc%29/P%28c%29%3D%5BP%28a%29%5Ccdot%20P%28c%7Ca%29%5Ccdot%20P%28b%7Cc%29%5D%20/P%28c%29%3D%5BP%28a%2Cc%29%5Ccdot%20P%28b%7Cc%29%5D/P%28c%29%3DP%28a%7Cc%29%5Ccdot%20P%28b%7Cc%29)
That is, given c, the path between a and b is blocked, and a and b are independent.
![](https://img-blog.csdnimg.cn/370b41eb80984db682f38f5c29d71ad6.png)
(3) Case 3: head-to-head
Since ![P(a,b,c)=P(a)\cdot P(b)\cdot P(c|a,b)](https://latex.csdn.net/eq?P%28a%2Cb%2Cc%29%3DP%28a%29%5Ccdot%20P%28b%29%5Ccdot%20P%28c%7Ca%2Cb%29)
summing both sides over c gives: ![\sum_{c}P(a,b,c)=\sum_{c}P(a)\cdot P(b)\cdot P(c|a,b)](https://latex.csdn.net/eq?%5Csum_%7Bc%7DP%28a%2Cb%2Cc%29%3D%5Csum_%7Bc%7DP%28a%29%5Ccdot%20P%28b%29%5Ccdot%20P%28c%7Ca%2Cb%29)
Because P(c|a,b) sums to 1 over c, it follows that: ![P(a,b)=P(a)\cdot P(b)](https://latex.csdn.net/eq?P%28a%2Cb%29%3DP%28a%29%5Ccdot%20P%28b%29)
That is, when c is unobserved, the path between a and b is blocked, and a and b are independent.
![](https://img-blog.csdnimg.cn/388d17fd8190462285f080f7fc8147c0.png)
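The head-to-head case behaves the opposite way from the other two, which a numeric check makes concrete. The sketch below uses made-up tables for binary a, b, c with the structure a -> c <- b, and compares the independence gap with c unobserved against the gap once c is observed. All numbers are hypothetical.

```python
import itertools

# Hypothetical tables for the structure a -> c <- b (all variables binary).
p_a = {0: 0.7, 1: 0.3}
p_b = {0: 0.6, 1: 0.4}
p_c1_given_ab = {(0, 0): 0.05, (0, 1): 0.6, (1, 0): 0.7, (1, 1): 0.95}  # P(c=1|a,b)


def p_c_given_ab(c, a, b):
    """P(c | a, b) for c in {0, 1}."""
    p1 = p_c1_given_ab[(a, b)]
    return p1 if c == 1 else 1.0 - p1


# P(a,b,c) = P(a) P(b) P(c|a,b)
joint = {
    (a, b, c): p_a[a] * p_b[b] * p_c_given_ab(c, a, b)
    for a, b, c in itertools.product([0, 1], repeat=3)
}

NAMES = ("a", "b", "c")


def prob(**fixed):
    """Marginal probability of the given values, e.g. prob(a=1, c=1)."""
    return sum(
        p for x, p in joint.items()
        if all(x[NAMES.index(n)] == v for n, v in fixed.items())
    )


# c unobserved: compare P(a,b) with P(a)P(b).
marginal_gap = max(
    abs(prob(a=a, b=b) - prob(a=a) * prob(b=b))
    for a, b in itertools.product([0, 1], repeat=2)
)

# c observed: compare P(a,b|c) with P(a|c)P(b|c).
conditional_gap = max(
    abs(prob(a=a, b=b, c=c) / prob(c=c)
        - (prob(a=a, c=c) / prob(c=c)) * (prob(b=b, c=c) / prob(c=c)))
    for a, b, c in itertools.product([0, 1], repeat=3)
)

print(marginal_gap, conditional_gap)
```

With these numbers the marginal gap is zero up to rounding, while the conditional gap is not: observing c makes a and b dependent, so a head-to-head node blocks the path only while it is unobserved.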
2.7 Generalizing from nodes to node sets
![](https://img-blog.csdnimg.cn/f3c13845a94641179371949fd81c84c3.png)
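PS: By D-separation, given $X_{n}$, the distribution of $X_{n+1}$ is conditionally independent of $X_{0},X_{1},\ldots,X_{n-1}$. That is, the state of $X_{n+1}$ depends only on $X_{n}$ and is conditionally independent of the other variables. A stochastic process model that evolves step by step in this way is called a Markov model.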
![P(X_{n+1}=x|X_{0},X_{1},X_{2},...,X_{n})=P(X_{n+1}=x|X_{n})](https://latex.csdn.net/eq?P%28X_%7Bn+1%7D%3Dx%7CX_%7B0%7D%2CX_%7B1%7D%2CX_%7B2%7D%2C...%2CX_%7Bn%7D%29%3DP%28X_%7Bn+1%7D%3Dx%7CX_%7Bn%7D%29)
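To make the Markov property concrete, here is a minimal simulation sketch. The two states and the transition probabilities are invented for the example. The point is that `step` receives only the current state, so the next state cannot depend on anything earlier in the path.

```python
import random

# Hypothetical two-state chain: transition[s][t] = P(X_{n+1}=t | X_n=s).
transition = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}


def step(state, rng):
    """Draw X_{n+1} using only the current state X_n."""
    next_states = list(transition[state])
    weights = [transition[state][t] for t in next_states]
    return rng.choices(next_states, weights=weights, k=1)[0]


def simulate(start, n_steps, seed=0):
    """Return the path X_0, X_1, ..., X_{n_steps}."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        path.append(step(path[-1], rng))
    return path


path = simulate("sunny", 10)
print(path)
```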
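- The hidden Markov model (HMM) can be used for tagging (sequence labeling) problems, and has proven effective in practice in speech recognition, NLP, bioinformatics, pattern recognition, and other fields.
- An HMM is a probabilistic model of sequential data: a hidden Markov chain randomly generates an unobservable sequence of states, and each state then generates one observation, producing a random sequence of observations.
- The sequence of states generated by the hidden Markov chain is called the state sequence; each state generates one observation, and the resulting random sequence is called the observation sequence. Each position in the sequence can be viewed as a time step. The model also applies to spatial sequences, for example in DNA analysis.
2.8 Uses of Bayesian networks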
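The generative process described in the bullets above can be sketched as follows. The two hidden states, the DNA-letter emissions, and every probability here are invented for illustration; the sketch only samples from an HMM and does not do inference or training.

```python
import random

# Hypothetical HMM: two hidden states emitting DNA letters.
start = {"H": 0.5, "L": 0.5}                        # P(first state)
transition = {"H": {"H": 0.7, "L": 0.3},            # P(next state | state)
              "L": {"H": 0.4, "L": 0.6}}
emission = {"H": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},   # P(observation | state)
            "L": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}


def draw(dist, rng):
    """Sample one key from a {value: probability} table."""
    items = list(dist)
    return rng.choices(items, weights=[dist[i] for i in items], k=1)[0]


def sample_hmm(length, seed=0):
    """Generate a hidden state sequence and its observation sequence."""
    rng = random.Random(seed)
    hidden, observed = [], []
    state = draw(start, rng)
    for _ in range(length):
        hidden.append(state)                          # state sequence (not observable)
        observed.append(draw(emission[state], rng))   # one observation per state
        state = draw(transition[state], rng)          # Markov step on hidden states
    return hidden, observed


hidden, observed = sample_hmm(8)
print("".join(hidden))
print("".join(observed))
```

In a real tagging problem only `observed` would be available, and the task would be to recover a likely `hidden` sequence from it.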
![](https://img-blog.csdnimg.cn/8acba2cabae847feaa1fcda5dfaca289.png)