|
|
|
@ -341,6 +341,247 @@
|
|
|
|
|
"plt.legend(frameon=True, framealpha=0.9,fontsize=16)"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"### PYMC3概述\n",
|
|
|
|
|
"**马氏链的平稳性**\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"对于先验分布,如果能找到一个转移矩阵,那么在n步之后就会收敛到一个平稳分布,而这个分布就是我们要的后验分布。得到平稳分布后,根据平稳性,继续乘上这个转移概率矩阵,平稳分布依然不会改变,所以我们就从得到平稳分布开始每次对其进行抽样。\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"<img src=\"assets/20201128211430.png\" width=\"30%\">\n",
|
|
|
|
|
"<img src=\"assets/20201128211519.png\" width=\"30%\">\n",
|
|
|
|
|
"越到后面几代人,值越平稳不变"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [
|
|
|
|
|
{
|
|
|
|
|
"data": {
|
|
|
|
|
"text/html": [
|
|
|
|
|
"\n",
|
|
|
|
|
" <div>\n",
|
|
|
|
|
" <style>\n",
|
|
|
|
|
" /* Turns off some styling */\n",
|
|
|
|
|
" progress {\n",
|
|
|
|
|
" /* gets rid of default border in Firefox and Opera. */\n",
|
|
|
|
|
" border: none;\n",
|
|
|
|
|
" /* Needs to be in here for Safari polyfill so background images work as expected. */\n",
|
|
|
|
|
" background-size: auto;\n",
|
|
|
|
|
" }\n",
|
|
|
|
|
" .progress-bar-interrupted, .progress-bar-interrupted::-webkit-progress-bar {\n",
|
|
|
|
|
" background: #F44336;\n",
|
|
|
|
|
" }\n",
|
|
|
|
|
" </style>\n",
|
|
|
|
|
" <progress value='6' class='' max='6' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
|
|
|
|
|
" 100.00% [6/6 00:00<00:00 logp = -3,398.7, ||grad|| = 1,989]\n",
|
|
|
|
|
" </div>\n",
|
|
|
|
|
" "
|
|
|
|
|
],
|
|
|
|
|
"text/plain": [
|
|
|
|
|
"<IPython.core.display.HTML object>"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"output_type": "display_data"
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"name": "stdout",
|
|
|
|
|
"output_type": "stream",
|
|
|
|
|
"text": [
|
|
|
|
|
"\n"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"name": "stderr",
|
|
|
|
|
"output_type": "stream",
|
|
|
|
|
"text": [
|
|
|
|
|
"Multiprocess sampling (2 chains in 2 jobs)\n",
|
|
|
|
|
"Metropolis: [mu]\n"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"data": {
|
|
|
|
|
"text/html": [
|
|
|
|
|
"\n",
|
|
|
|
|
" <div>\n",
|
|
|
|
|
" <style>\n",
|
|
|
|
|
" /* Turns off some styling */\n",
|
|
|
|
|
" progress {\n",
|
|
|
|
|
" /* gets rid of default border in Firefox and Opera. */\n",
|
|
|
|
|
" border: none;\n",
|
|
|
|
|
" /* Needs to be in here for Safari polyfill so background images work as expected. */\n",
|
|
|
|
|
" background-size: auto;\n",
|
|
|
|
|
" }\n",
|
|
|
|
|
" .progress-bar-interrupted, .progress-bar-interrupted::-webkit-progress-bar {\n",
|
|
|
|
|
" background: #F44336;\n",
|
|
|
|
|
" }\n",
|
|
|
|
|
" </style>\n",
|
|
|
|
|
" <progress value='370422' class='' max='402000' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
|
|
|
|
|
" 92.14% [370422/402000 1:06:04<05:38 Sampling 2 chains, 0 divergences]\n",
|
|
|
|
|
" </div>\n",
|
|
|
|
|
" "
|
|
|
|
|
],
|
|
|
|
|
"text/plain": [
|
|
|
|
|
"<IPython.core.display.HTML object>"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"output_type": "display_data"
|
|
|
|
|
}
|
|
|
|
|
],
|
|
|
|
|
"source": [
|
|
|
|
|
"with pm.Model() as model: # 构造模型\n",
|
|
|
|
|
" mu = pm.Uniform('mu', lower=0, upper=60) # 指定一个均值,服从正态分布且是0-60之间\n",
|
|
|
|
|
" # 指定泊松分布,并把数据放进去\n",
|
|
|
|
|
" likelihood = pm.Poisson('likelihood', mu=mu, observed=messages['time_delay_seconds'].values)\n",
|
|
|
|
|
"\n",
|
|
|
|
|
" start = pm.find_MAP()\n",
|
|
|
|
|
" step = pm.Metropolis()\n",
|
|
|
|
|
" trace = pm.sample(200000, step, start=start, progressbar=True)"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"上面的代码通过遍历 μ 的后验分布的高概率区域,收集了 200,000 个 μ 的可信样本。下图(左)显示这些值的分布,平均值与频率论的估值(红线)几乎一样。但是,我们同时也得到了不确定度,μ 值介于 17 到 19 之间都是可信的。我们稍后会看到这个不确定度很有价值。\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"<img src=\"assets/20201128211719.png\" width=\"50%\">"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"### PyMC3概述\n",
|
|
|
|
|
"PyMC3是一个用于概率编程的Python库,提供了一套非常简洁直观的语法,非常接近统计学中描述概率模型的语法,可读性很高。其核心计算部分基于NumPy和Theano。Theano是用于深度学习的Python库,可以高效地定义、优化和求解多维数组的数学表达式。"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [],
|
|
|
|
|
"source": [
|
|
|
|
|
"# 定义抛硬币问题,假设知道真实的参数θ,用theta_real来表示\n",
|
|
|
|
|
"np.random.seed(1)\n",
|
|
|
|
|
"n_experiments = 100\n",
|
|
|
|
|
"theta_real = 0.35\n",
|
|
|
|
|
"data = stats.bernoulli.rvs(p=theta_real, size=n_experiments)\n",
|
|
|
|
|
"print(data)"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"现在有了数据,需要再指定模型。前面我们通过似然和先验的概率分布完成。对于似然可以用二项分布来描述,对于先验可以用参数为α=β=1的beta分布描述。这个beta分布再[0,1]区间内均匀分布是一样的。可以用数学表达式描述如下:\n",
|
|
|
|
|
"<img src=\"assets/20201128213453.png\" width=\"30%\">"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [],
|
|
|
|
|
"source": [
|
|
|
|
|
"with pm.Model() as our_first_model: # 构造模型\n",
|
|
|
|
|
" theta = pm.Beta('theta', alpha=1, beta=1) # 指定先验\n",
|
|
|
|
|
" y = pm.Bernoulli('y', p=theta, observed=data) # 用和先验相同的语法描述似然,唯一不同的是用observed变量传递观测到的数据\n",
|
|
|
|
|
"\n",
|
|
|
|
|
" start = pm.find_MAP() # 返回最大后验,为采样方法提供一个初始点\n",
|
|
|
|
|
" step = pm.Metropolis() # 定义采样方法,用Metropolis-Hastings算法,PyMC3也会根据不同参数的特征自动赋予一个采样器\n",
|
|
|
|
|
" trace = pm.sample(1000, step=step, start=start) # 执行推断,其中第一个参数是采样次数,2、3分别是采样方法和初始点"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"**诊断采样过程**\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"现在我们根据有限数量的样本对后验做出近似,接下来就是坚持我们的近似是否合理。我们可以做一些测试,可视化、定量等。这些测试会尝试从样本中发现问题,但并不能证明我们得到的分布是正确的,我们只能提供证据证明样本看起来是合理的。如果我们通过样本发现问题,解决办法有如下几种。\n",
|
|
|
|
|
"<ul>\n",
|
|
|
|
|
" <li>增加样本次数。\n",
|
|
|
|
|
" <li>从样本链的前面部分去掉一定数量的样本,称为老化(Burn-in)\n",
|
|
|
|
|
" <li>重新参数化模型,也就是换一种不同但却等价的方式描述模型。\n",
|
|
|
|
|
"</ul>\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"**收敛性**\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"通常,我们要做的第一件事就是查看结果长什么样,利用traceplot函数"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [],
|
|
|
|
|
"source": [
|
|
|
|
|
"burnin = 100 # 指定老化,前100个\n",
|
|
|
|
|
"chain = trace[burnin:]\n",
|
|
|
|
|
"pm.traceplot(chain, lines={'theta':theta_real})"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"对于未观測到的变量,我们得到了两幅图。左图是一个核密度估计( Kernel Density Estimation,KDE)图。右图描绘的是每一步采样过程中得到的采样值。注意图中红色的线表示变量 theta real的值。\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"在得到这些图之后,我们需要观察什么呢? 首先,KDE图看起来应该是光滑的曲线。通常,随着数据的增加,根据中心极限定理,参数的分布会趋近于高斯分布。右侧的图看起来应该像白噪声,也就是说有很好的混合度( mixing),我们看不到任何可以识别的模式,也看不到向上或者向下的曲线,相反,我们希望看到曲线在某个值附近震荡。对于多峰分布或者离散分布,我们希望曲线不要在某个值或区域停留过多时间,我们希望看到采样值在多个区间自由移动。此外,我们希望迹表现出稳定的相似性,也就是说,前10%看起来跟后50%或者10%差不多。再次强调,我们不希望看到规律的模式,相反我们期望看到的是噪声。下图展示了一些迹呈现较好混合度(右侧)与较差混合度(左侧)的对比。\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"<img src=\"assets/20201128215037.png\" width=\"50%\">\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"如果迹的前面部分跟其他部分看起来不太一样,那就意味着需要进行老化处理,如果其他部分没有呈现稳定的相似性或者可以看到某种模式,那这意味着需要更多的采样,或者需要更换采样方法或者参数化方法。对于一些复杂的模型,我们可能需要结合使用前面所有的策略。"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"### 模型决策\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"一种定量地检测收敛性的方法是 Gelman- Rubin检验。该检验的思想是比较不同迹之间的差异和迹内部的差异,因此,需要多组迹来进行该检验。理想状态下,我们希望得到R=1,根据经验,如果得到的值低于1.1,那么可以认为是收做的了,更高的值则意味着没有收敛。"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [],
|
|
|
|
|
"source": [
|
|
|
|
|
"pm.gelman_rubin(chain)"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"函数 summary提供了对后验的文字描述,它可以提供后验的均值、标准差和HPD区间(最大后验密度可信区间)"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [],
|
|
|
|
|
"source": [
|
|
|
|
|
"pm.summary(chain)"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"其中,返回值之一是 mc_error,这是对采样引入误差的估计值,该值考虑的是所有的采样值并非真的彼此独立。 mc error是迹中不同块的均值的标准差,每一块是迹中的一部分。"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|