Add chi-square test

pull/2/head
benjas 4 years ago
parent c2b55f79f8
commit 582043dbb5

@ -34,7 +34,7 @@
"metadata": {},
"source": [
"### Basic idea of hypothesis testing\n",
"<img src=\"assets/20201114091803.png\" width=\"70%\">"
"<img src=\"assets/20201114091803.png\" width=\"50%\">"
]
},
{
@ -130,7 +130,7 @@
"metadata": {},
"source": [
"### Left-tailed and right-tailed tests\n",
"<img src=\"assets/20201114095426.png\" width=\"100%\">"
"<img src=\"assets/20201114095426.png\" width=\"70%\">"
]
},
{
@ -147,7 +147,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"assets/20201114100216.png\" width=\"70%\">"
"<img src=\"assets/20201114100216.png\" width=\"30%\">"
]
},
{
@ -202,7 +202,7 @@
"source": [
"### Testing a population mean\n",
"When to use the Z-test and when to use the t-test\n",
"<img src=\"assets/20201114101438.png\" width=\"70%\">\n",
"<img src=\"assets/20201114101438.png\" width=\"50%\">\n",
"In practice the t-test is usually used"
]
},
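@ -320,10 +320,10 @@
"source": [
"This is a two-sided test, so each tail has area α/2; look up 1-α/2 = 0.975 in the table\n",
"<br>Search online for: critical values of statistical distributions\n",
"<img src=\"assets/20201115111943.png\" width=\"70%\">\n",
"<img src=\"assets/20201115111943.png\" width=\"50%\">\n",
"<br><br>\n",
"Row 1.9 and column 0.06 of the table give the critical value 1.96. The test statistic is 10.4, which is larger than 1.96, so the tail area beyond it must be smaller than the tail area beyond the critical value 1.96 (which is α/2)\n",
"<img src=\"assets/20201115112520.png\" width=\"70%\">\n",
"<img src=\"assets/20201115112520.png\" width=\"50%\">\n",
"For a two-sided test, reject H0 if the one-tailed p-value is less than α/2"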
]
},
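The table lookup described above can also be reproduced numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative, and the statistic 10.4 is simply the value quoted in the text.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

alpha = 0.05
z_stat = 10.4  # test statistic quoted in the text

# Two-sided test: each tail holds alpha/2, so take the 1 - alpha/2 quantile.
z_crit = std_normal.inv_cdf(1 - alpha / 2)

# Two-sided p-value: area in both tails beyond |z_stat|.
# For a statistic this extreme the result may underflow to 0.0.
p_two_sided = 2 * (1 - std_normal.cdf(abs(z_stat)))

reject_h0 = abs(z_stat) > z_crit
print(f"critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```

Comparing `abs(z_stat)` with `z_crit` and comparing `p_two_sided` with `alpha` lead to the same decision; the second form avoids the table entirely.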
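@ -366,7 +366,7 @@
"<br>-2.83 lies to the left of -1.96, so the tail area is less than α/2; reject H0 at the α = 0.05 level\n",
"<br>Conclusion:\n",
"<br>There is evidence that the ellipticity of parts machined by the new machine tool differs significantly from before\n",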
"<img src=\"assets/20201115151902.png\" width=\"50%\">"
"<img src=\"assets/20201115151902.png\" width=\"30%\">"
]
},
{
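@ -394,7 +394,7 @@
"<br>2.4 lies to the right of 1.645, so the p-value (the tail area) is less than α; reject H0 at the α = 0.05 level\n",
"<br>Conclusion:\n",
"<br>There is evidence that the service life of the newly produced light bulbs has increased significantly\n",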
"<img src=\"assets/20201115152913.png\" width=\"50%\">"
"<img src=\"assets/20201115152913.png\" width=\"30%\">"
]
},
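For the right-tailed case, the whole of α sits in one tail, which is why the critical value is 1.645 rather than 1.96. A small sketch, again assuming only the standard library and using the statistic 2.4 quoted in the text:

```python
from statistics import NormalDist

std_normal = NormalDist()

alpha = 0.05
z_stat = 2.4  # test statistic quoted in the text

# Right-tailed test: all of alpha is in the upper tail.
z_crit = std_normal.inv_cdf(1 - alpha)

# p-value: area to the right of the observed statistic.
p_right = 1 - std_normal.cdf(z_stat)

reject_h0 = z_stat > z_crit
print(f"critical value = {z_crit:.3f}, p = {p_right:.4f}, reject H0: {reject_h0}")
```

For a left-tailed test the same idea applies with `std_normal.inv_cdf(alpha)` as the critical value and `std_normal.cdf(z_stat)` as the p-value.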
{
@ -464,7 +464,7 @@
"metadata": {},
"source": [
"Search online for a t-distribution critical value table\n",
"<img src=\"assets/20201116212408.png\" width=\"70%\">"
"<img src=\"assets/20201116212408.png\" width=\"50%\">"
]
},
{
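@ -514,7 +514,7 @@
"**Example 1**\n",
"<br>\n",
"Twelve children vaccinated with BCG were tested 8 weeks later with two batches of tuberculin, one standard and one newly prepared, each injected into the child's forearms. The mean diameter (mm) of the skin infiltration reaction to each tuberculin is shown in the table. Do the two tuberculins differ in reactivity?\n",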
"<img src=\"assets/20201116215635.png\" width=\"70%\">"
"<img src=\"assets/20201116215635.png\" width=\"50%\">"
]
},
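Because each child receives both tuberculins, this is a paired design, and the usual statistic is the mean of the within-pair differences divided by its standard error. The sketch below shows that calculation; the function name is illustrative and the numbers are hypothetical, NOT the values from the table in the notebook.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for two paired samples."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    # t = mean difference / (sample standard deviation of differences / sqrt(n))
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical diameters in mm, for illustration only.
standard = [12.0, 14.5, 15.5, 12.0]
new = [10.0, 13.5, 12.5, 10.0]

t_value, dof = paired_t_statistic(standard, new)
print(f"t = {t_value:.3f} with {dof} degrees of freedom")
```

The resulting `t_value` would then be compared with the t-distribution critical value for `dof` degrees of freedom.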
{
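@ -593,7 +593,7 @@
"**Example 2**\n",
"<br>\n",
"Twenty-five diabetic patients were randomly divided into two groups. Group A received drug treatment alone; group B received drug treatment combined with diet therapy. Fasting blood glucose (mmol/L) measured after two months is shown in the table. Are the patients' blood glucose levels the same after the two treatments?\n",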
"<img src=\"assets/20201116221815.png\" width=\"70%\">"
"<img src=\"assets/20201116221815.png\" width=\"50%\">"
]
},
{
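@ -641,9 +641,104 @@
" <li>Graphical methods: the common ones are the P-P plot and the Q-Q plot. If the points fall along a straight line, the data can be regarded as normally distributed; if they do not, the data can be regarded as skewed\n",
" <li>Skewness test: computes the skewness coefficient. H0: G1=0, the population distribution is symmetric; H1: G1≠0, the population distribution is asymmetric\n",
"\n",
"<img src=\"assets/20201117073547.png\" width=\"70%\">\n",
"<img src=\"assets/20201117073547.png\" width=\"30%\">\n",
" <li>Kurtosis test: computes the kurtosis coefficient. H0: G2=0, the population distribution has normal kurtosis; H1: G2≠0, the population distribution does not have normal kurtosis\n",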
" <img src=\"assets/20201117075130.png\" width=\"70%\">"
" <img src=\"assets/20201117075130.png\" width=\"30%\">"
]
},
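The coefficients G1 and G2 are defined by formulas that appear only as images in the notebook, so the exact definitions are not visible here. As a rough sketch, assuming the simple moment-based definitions (third and fourth central moments scaled by powers of the second, with 3 subtracted from the kurtosis so that a normal distribution gives 0), they could be computed like this:

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness g1 and excess kurtosis g2 (assumed definitions)."""
    n = len(values)
    center = sum(values) / n
    # Central moments of order 2, 3 and 4 (divided by n, not n - 1).
    m2 = sum((v - center) ** 2 for v in values) / n
    m3 = sum((v - center) ** 3 for v in values) / n
    m4 = sum((v - center) ** 4 for v in values) / n
    g1 = m3 / m2 ** 1.5
    g2 = m4 / m2 ** 2 - 3
    return g1, g2


g1, g2 = skewness_and_excess_kurtosis([1, 2, 3, 4, 5])
print(f"g1 = {g1:.3f}, g2 = {g2:.3f}")
```

Textbook versions of G1 and G2 often add small-sample correction factors, so values from this sketch may differ slightly from the ones produced by the formulas in the images.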
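{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Chi-square test\n",
"Used to test whether the difference between two (or more) rates or proportions is statistically significant. The paired chi-square test checks whether the difference in paired count data is statistically significant.\n",
"\n",
"### Basic idea\n",
"Test whether the difference between the observed frequencies (A) and the theoretical frequencies (T) is caused by sampling error. In other words, the sample rate (or sample proportion) is used to make inferences about the population rate or proportion"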
]
},
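{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Example 1**\n",
"<br>\n",
"Comparison of the effectiveness rates of two drugs for treating ulcers\n",
"<img src=\"assets/20201117223133.png\" width=\"30%\">\n",
"Difference between theoretical and observed frequencies:\n",
"$$\n",
"\\chi^2 = \\sum \\frac{(A_{RC}-T_{RC})^2}{T_{RC}}\n",
"$$\n",
"$A_{RC}$ is the observed frequency in the cell at row R and column C, and $T_{RC}$ is the theoretical frequency in that cell. $(A_{RC}-T_{RC})$ reflects the gap between the observed and theoretical frequencies; dividing by $T_{RC}$ accounts for the relative size of that gap. The $\\chi^2$ value therefore reflects how well the observed frequencies agree with the theoretical ones: a large $\\chi^2$ means a large gap. Besides the size of these differences, the $\\chi^2$ value also depends on the number of rows and columns, that is, on the degrees of freedom."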
]
},
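The summation in the formula above translates directly into a loop over the cells of the table. A minimal sketch follows; the function name is illustrative and the counts are hypothetical, since the observed counts for this example are only shown in an image.

```python
def chi_square_statistic(observed, expected):
    """Sum (A - T)^2 / T over every cell of two equally shaped tables."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for a, t in zip(obs_row, exp_row):
            total += (a - t) ** 2 / t
    return total


# Hypothetical 2x2 counts, for illustration only.
observed = [[10, 20], [20, 10]]
expected = [[15.0, 15.0], [15.0, 15.0]]

chi2 = chi_square_statistic(observed, expected)
print(f"chi-square = {chi2:.3f}")
```

Each cell contributes its squared gap relative to the theoretical frequency, so tables with more cells tend to give larger totals, which is why the degrees of freedom matter when judging the result.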
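{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Theoretical frequencies are computed from the hypothesis\n",
"The null hypothesis is that drug group A and drug group B have the same population effectiveness rate, both equal to the pooled positive rate (110/165). In theory, among the 85 cases in group A the number of positives should be 85×(110/165)=56.67 and the number of negatives 85×(55/165)=28.33. Likewise, among the 80 cases in group B the number of positives should be 80×(110/165)=53.33 and the number of negatives 80×(55/165)=26.67.\n",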
"<img src=\"assets/20201117224924.png\" width=\"30%\">"
]
},
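The arithmetic above follows one rule: each theoretical frequency is the row total times the column total divided by the grand total. A short sketch that reproduces the four values from the marginal totals given in the text (the function name is illustrative):

```python
def expected_frequencies(row_totals, col_totals):
    """Theoretical frequency of each cell: row total * column total / n."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must add up to the same n")
    return [[r * c / n for c in col_totals] for r in row_totals]


# Marginal totals from the text: groups of 85 and 80; 110 positive, 55 negative.
expected = expected_frequencies([85, 80], [110, 55])

for row in expected:
    print([round(t, 2) for t in row])
```

The rows correspond to drug groups A and B, and the columns to positive and negative outcomes.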
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Basic formula: \n",
"$$\n",
"\\chi^2 = \\sum \\frac{(A-T)^2}{T}\n",
"$$\n",
"$$\n",
"= \\frac{[a-\\frac{(a+b)(a+c)}{a+b+c+d}]^2}{\\frac{(a+b)(a+c)}{a+b+c+d}}\n",
"+ \\frac{[b-\\frac{(a+b)(b+d)}{a+b+c+d}]^2}{\\frac{(a+b)(b+d)}{a+b+c+d}}\n",
"+...\n",
"+ \\frac{[d-\\frac{(c+d)(b+d)}{a+b+c+d}]^2}{\\frac{(c+d)(b+d)}{a+b+c+d}}\n",
"$$\n",
"$$\n",
"= \\frac{(ad-bc)^2 \\, n}{(a+b)(c+d)(a+c)(b+d)}, \\quad n = a+b+c+d\n",
"$$\n",
"$$\n",
"\\nu = 1\n",
"$$"
]
},
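The closed form for a 2×2 table can be checked against the cell-by-cell sum numerically. The sketch below implements both; the function names are illustrative and the counts a, b, c, d are hypothetical.

```python
from math import isclose


def chi_square_2x2_general(a, b, c, d):
    """Cell-by-cell sum of (A - T)^2 / T for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    cells = [
        (a, (a + b) * (a + c) / n),
        (b, (a + b) * (b + d) / n),
        (c, (c + d) * (a + c) / n),
        (d, (c + d) * (b + d) / n),
    ]
    return sum((obs - exp) ** 2 / exp for obs, exp in cells)


def chi_square_2x2_shortcut(a, b, c, d):
    """Closed form (ad - bc)^2 * n / ((a+b)(c+d)(a+c)(b+d))."""
    n = a + b + c + d
    return (a * d - b * c) ** 2 * n / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts, for illustration only.
general = chi_square_2x2_general(12, 30, 25, 9)
shortcut = chi_square_2x2_shortcut(12, 30, 25, 9)
print(isclose(general, shortcut))
```

The shortcut needs only the four counts, which is why it is convenient for hand calculation; a 2×2 table has (2-1)×(2-1) = 1 degree of freedom.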
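{
"cell_type": "markdown",
"metadata": {},
"source": [
"If the null hypothesis H0: π1=π2 holds, the observed frequencies A in the four cells should not differ much from the theoretical frequencies T, so the statistic $\\chi^2$ should not be large. If $\\chi^2$ is large, the corresponding P value is small. If P≤α, we infer that A and T differ by more than sampling error allows, doubt that H0 is correct, and therefore reject H0 and accept the alternative hypothesis H1: π1≠π2."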
]
},
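For one degree of freedom the P value can be computed without a table, because a chi-square variable with one degree of freedom is the square of a standard normal variable, which gives P = erfc(sqrt(χ²/2)). A minimal sketch of the decision rule under that identity (function names are illustrative):

```python
from math import erfc, sqrt


def chi_square_p_value_df1(chi2):
    """Upper-tail probability for a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))


def reject_h0(chi2, alpha=0.05):
    """Reject H0 when P <= alpha."""
    return chi_square_p_value_df1(chi2) <= alpha


# 3.841 is the familiar 0.05 critical value for 1 degree of freedom.
print(f"P at 3.841: {chi_square_p_value_df1(3.841):.4f}")
print(f"reject at 10.0: {reject_h0(10.0)}, reject at 1.0: {reject_h0(1.0)}")
```

This identity only holds for one degree of freedom; larger tables need the general chi-square distribution.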
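{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Example 2**\n",
"<br>\n",
"A drug testing institute randomly sampled 574 adults to study antibiotic resistance. Is the resistance rate the same in the two groups of people?\n",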
"\n",
"<img src=\"assets/20201117225550.png\" width=\"30%\">\n",
"<img src=\"assets/20201117225642.png\" width=\"30%\">"
]
},
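{
"cell_type": "markdown",
"metadata": {},
"source": [
"(1) State the hypotheses and set the significance level\n",
"<ul>\n",
" <li>H0: the two groups have the same resistance rate to this antibiotic, i.e. π1=π2 (the two population rates are equal)\n",
" <li>H1: the two groups have different resistance rates to this antibiotic, i.e. π1≠π2 (the two population rates are not equal)\n",
" <li>α=0.05\n",
"</ul>\n",
"(2) Compute the test statistic\n",
"$$\n",
"\\chi^2 = \\frac{(180-174.10)^2}{174.10}+\\frac{(215-220.90)^2}{220.90}+...\n",
"+\\frac{(106-100.10)^2}{100.10} = 1.15\n",
"$$\n",
"(3) Draw the conclusion\n",
"Look up the P value in the table: P>0.05. At the 0.05 level H0 is not rejected, so the difference in resistance rates to this antibiotic between the two groups is not statistically significant."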
]
},
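The three steps can be run end to end as a check on the hand calculation. This sketch assumes a 2×2 table: the observed counts 180, 215 and 106 and the total 574 come from the text, and the fourth cell is inferred as 574 - 180 - 215 - 106 = 73, which is an assumption because the full table is only shown as an image.

```python
from math import erfc, sqrt

n = 574                    # total sample size from the text
a, b, d = 180, 215, 106    # observed counts that appear in the formula
c = n - a - b - d          # inferred fourth cell (assumption)

observed = [[a, b], [c, d]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# Theoretical frequency of each cell: row total * column total / n.
expected = [[r * k / n for k in col_totals] for r in row_totals]

chi2 = sum(
    (obs - exp) ** 2 / exp
    for obs_row, exp_row in zip(observed, expected)
    for obs, exp in zip(obs_row, exp_row)
)

# With 1 degree of freedom the upper-tail P value is erfc(sqrt(chi2 / 2)).
p_value = erfc(sqrt(chi2 / 2))

print(f"chi-square = {chi2:.3f}, P = {p_value:.3f}, reject H0: {p_value <= 0.05}")
```

The value this sketch produces can be compared with the hand calculation above, and the resulting P value with the stated conclusion that H0 is not rejected at the 0.05 level.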
{


