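Normal Distribution and Medical Reference Range

Table: Serum total protein (g/L) of 108 normal adult women in a certain area

67.3  75.4  73.1  70.9  75.1  72.6  78.2  68.8  73.8  71.5  66.5  75.1
70.7  68.9  73.3  72.3  76.5  74.3  75.9  75.4  67.2  71.8  76.2  70.6
70.7  75.6  73.3  72.4  76.6  67.3  80.8  74.3  73.9  71.6  79.9  69.3
80.3  75.7  73.5  81.2  74.4  72.5  77.1  67.3  74.1  68.0  76.4  70.4
71.0  75.8  73.6  78.1  68.7  72.6  77.6  72.2  74.2  72.1  76.3  69.7
71.1  75.7  73.5  72.7  78.3  72.5  77.2  68.2  74.2  72.3  76.5  70.5
71.2  83.7  73.7  75.8  74.7  72.6  69.5  66.0  76.1  77.7  80.5  83.1
64.1  75.1  76.3  77.8  65.2  75.0  72.7  78.8  71.1  71.8  72.9  76.1
71.2  75.2  72.9  79.5  73.9  75.2  73.1  79.5  81.8  74.5  81.6  74.5

Table: Frequency distribution of serum total protein (g/L) in 108 normal adult women

Class interval   Frequency f   Midpoint X   f·X           f·X^2
(1)              (2)           (3)          (4)=(2)(3)    (5)=(2)(3)^2
64.0~              2           65.0           130.0          8450.0
66.0~              6           67.0           402.0         26934.0
68.0~              8           69.0           552.0         38088.0
70.0~             15           71.0          1065.0         75615.0
72.0~             25           73.0          1825.0        133225.0
74.0~             23           75.0          1725.0        129375.0
76.0~             14           77.0          1078.0         83006.0
78.0~              7           79.0           553.0         43687.0
80.0~              6           81.0           486.0         39366.0
82.0~84.0          2           83.0           166.0         13778.0
Total            108            -            7982.0        591524.0

[Figure: Histogram of serum total protein (g/L) in 108 normal adult women in a certain area. x-axis: serum total protein (64 to 84 g/L).]

[Figure: Frequency distribution of red blood cell count (10^12/L) in 150 normal adult men in a certain area. x-axis: red blood cell count (10^12/L); y-axis: number of people.]

[Figure: two density curves, f(x) plotted against x.]

Normal distribution

The normal distribution, also called the Gaussian distribution, is the most common and most important continuous distribution.

I. Mathematical form of the normal distribution
II. Standard normal distribution
III. Area under the curve
IV. Normality test
V. Applications of the normal distribution

I. Mathematical form of the normal distribution

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty$$

- $f(X)$ is the density of the random variable $X$ at a given value, called the probability density function.
- $\sigma^2$ is the population variance and $\mu$ is the population mean.
- $X \sim N(\mu, \sigma^2)$.
- The curve drawn with $X$ on the horizontal axis and $f(X)$ on the vertical axis is the normal curve.

Characteristics of the normal distribution

1. The normal curve lies above the horizontal axis, is symmetric about $X = \mu$, and approaches the $X$ axis asymptotically at both ends.
2. $f(x)$ attains its maximum at $x = \mu$, where $f(\mu) = \dfrac{1}{\sigma\sqrt{2\pi}}$.
3. The farther $x$ is from $\mu$, the smaller $f(x)$ becomes.

$\mu$ is the location parameter and $\sigma$ is the shape parameter.

[Figure 4-5: Location shift of the normal distribution, showing $N(0, \sigma^2)$, $N(1, \sigma^2)$ and $N(1.5, \sigma^2)$.]

[Figure 4-5: Shape change of the normal distribution, showing $N(0, 1)$, $N(0, 2^2)$ and $N(0, 0.5^2)$.]

II. Standard normal distribution

The standard normal distribution has the two parameters $\mu = 0$ and $\sigma = 1$, and is written $N(0, 1)$.

A general normal distribution is converted to the standard normal distribution by the transformation

$$u = \frac{X - \mu}{\sigma}, \qquad X \sim N(\mu, \sigma^2) \;\Rightarrow\; u \sim N(0, 1),$$

$$f(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^2}{2}\right), \qquad -\infty < u < \infty.$$

[Figure: the general normal distribution $N(\mu, \sigma^2)$ on the $X$ axis and the standard normal distribution $N(0, 1)$ on the $u$ axis.]

The area under the normal curve follows a definite pattern. The area over any interval can be obtained from the integral

$$F(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{(t-\mu)^2}{2\sigma^2}}\, dt.$$

Probability is the area under the curve:

$$F(X) = \int_{-\infty}^{X} f(x)\, dx$$

$$P(a < X < b) = \int_{a}^{b} f(x)\, dx = F(b) - F(a)$$

For the standard normal distribution, $f(x) \sim N(0, 1)$, symmetry about 0 gives

$$F(a) = \int_{-\infty}^{a} f(x)\, dx, \qquad F(-a) = \int_{-\infty}^{-a} f(x)\, dx = 1 - F(a).$$

Appendix Table 1. Area under the standard normal curve, Φ(z)

u       0.00     0.01     0.02     …    0.06     0.07     0.08     0.09
-3.0    0.0013   0.0013   0.0012   …    0.0011   0.0011   0.0010   0.0010
-2.9    0.0019   0.0018   0.0018   …    0.0015   0.0015   0.0014   0.0014
…       …        …        …        …    …        …        …        …
-2.5    0.0062   0.0060   0.0059   …    0.0052   0.0051   0.0049   0.0048
…       …        …        …        …    …        …        …        …
-1.9    0.0287   0.0281   0.0274   …    0.0250   0.0244   0.0239   0.0233
…       …        …        …        …    …        …        …        …
-0.1    0.4602   0.4562   0.4522   …    0.4364   0.4325   0.4286   0.4247
-0.0    0.5000   0.4960   0.4920   …    0.4761   0.4721   0.4681   0.4641

III. Area under the curve

See Appendix Table 1 (p. 466).

$$P(-1.96 < u < 1.96) = \Phi(1.96) - \Phi(-1.96) = [1 - \Phi(-1.96)] - \Phi(-1.96) = 1 - 2\Phi(-1.96) = 1 - 2 \times 0.0250 = 0.95$$

$$P(-2.58 < u < 2.58) = \Phi(2.58) - \Phi(-2.58) = [1 - \Phi(-2.58)] - \Phi(-2.58) = 1 - 2\Phi(-2.58) = 1 - 2 \times 0.0049 \approx 0.99$$

[Figure: Distribution of the area under the curve for $N(0, 1)$ and for $N(\mu, \sigma^2)$, marking the central 68.27%, 95.00% and 99.00% regions.]

Standard normal distribution   Normal distribution   Area or probability
-1 ~ 1                         μ ± σ                 68.27%
-1.96 ~ 1.96                   μ ± 1.96σ             95.00%
-2.58 ~ 2.58                   μ ± 2.58σ             99.00%

Example. The serum total protein (g/L) of 108 normal adult women in a certain area is given in the raw data table and frequency table at the start of this section. Estimate the percentage of normal women in this area whose serum total protein is below 68.0 g/L, below 78.0 g/L, and at or above 78.0 g/L.

Solution:

1. Judging from the frequency distribution, the data are approximately normally distributed.

2. Compute the mean and standard deviation:

$$\bar{X} = \frac{7982.0}{108} = 73.9\ (\text{g/L}), \qquad S = \sqrt{\frac{591524.0 - 7982.0^2/108}{108 - 1}} = 3.9\ (\text{g/L})$$

3. Apply the u transformation. Because the sample is fairly large, the sample mean $\bar{X}$ is used in place of $\mu$ and $S$ in place of $\sigma$:

$$u = \frac{X - \bar{X}}{S}, \qquad u_1 = \frac{68.0 - 73.9}{3.9} = -1.51, \qquad u_2 = \frac{78.0 - 73.9}{3.9} = 1.05$$

4. Look up the distribution function values for $u_1$ and $u_2$ in Appendix Table 1:

$$\Phi(-1.51) = 0.0655, \quad \text{so } P(X < 68.0) = 0.0655$$

$$\Phi(1.05) = 1 - \Phi(-1.05) = 1 - 0.1469 = 0.8531, \quad \text{so } P(X < 78.0) = 0.8531 \text{ and } P(X \ge 78.0) = 0.1469$$

5. State the conclusion: about 6.55% of normal women in this area are estimated to have serum total protein below 68.0 g/L, 85.31% below 78.0 g/L, and 14.69% at or above 78.0 g/L.

IV. Normality test

Two features of the normal distribution are examined:
1. Symmetry (skewness)
2. Peakedness (kurtosis)

Methods:
1. Graphical methods: Q-Q plot, P-P plot
2. Computational methods

[Figure: Normal Q-Q plot of BLOOD for the 108 raw observations. x-axis: observed value; y-axis: expected normal value.]

[Figure: Normal P-P plot of BLOOD. x-axis: observed cumulative probability; y-axis: expected cumulative probability.]

V. Applications of the normal distribution

1. Estimating medical reference ranges, using the pattern of areas under the normal curve.
2. Quality control, for example controlling random error in experiments.
3. The normal distribution is the theoretical basis of many statistical methods: the t distribution, the χ² distribution and the F distribution are all derived from it.

Medical reference range

The reference range (reference value range, traditionally called the "normal range") is the range over which the observed values of individuals are spread. For example, the adult white blood cell count is 4000 to 10000 per mm³.

Extensions of the concept: setting hygiene standards for food, air, water and soil; in epidemiology, setting the observation period for contacts according to the incubation period.

Steps for establishing a reference range:
1. Sample from the population of "normal" people, with the study population clearly defined.
2. Use a uniform measurement method to control systematic error.
3. Decide whether reference ranges need to be set separately by group (for example by sex or age).
4. Decide from subject-matter knowledge whether the range is one-sided or two-sided.

[Figure: one-sided upper limit (normal below, abnormal above); two-sided lower and upper limits (abnormal on either side, normal in between); one-sided lower limit (abnormal below, normal above).]

Methods for calculating a reference range:
1. Normal distribution method
2. Percentile method

1. Normal distribution method: suitable for normally distributed data

Two-sided $(1 - \alpha)$ reference range:

$$\bar{X} \pm u_{\alpha/2}\, S$$

One-sided $(1 - \alpha)$ reference range:

$$\bar{X} + u_{\alpha}\, S \ \text{(upper limit)} \qquad \text{or} \qquad \bar{X} - u_{\alpha}\, S \ \text{(lower limit)}$$

Two-sided 95% reference range: $\bar{X} \pm 1.96\,S$

One-sided 95% reference range: $\bar{X} + 1.64\,S$ (upper limit) or $\bar{X} - 1.64\,S$ (lower limit)

Example. Estimate the 95% reference range of serum total protein for the 108 adult women in a certain area (mean 73.9 g/L, standard deviation 3.9 g/L).

Solution: Because both too much and too little serum total protein are abnormal, the 95% reference range for normal adult women is estimated two-sided.

$$\text{Lower limit: } \bar{X} - 1.96\,S = 73.9 - 1.96 \times 3.9 = 66.3\ (\text{g/L})$$

$$\text{Upper limit: } \bar{X} + 1.96\,S = 73.9 + 1.96 \times 3.9 = 81.5\ (\text{g/L})$$

The 95% reference range of serum total protein for normal adult women in this area is therefore 66.3 to 81.5 g/L.

2. Percentile method: suitable for skewed data

Two-sided 95% reference range: $P_{2.5}$ ~ $P_{97.5}$
One-sided 95% reference range: $P_{95}$ (upper limit) or $P_{5}$ (lower limit)

Approaches:
1. Direct calculation
2. Frequency table method

Myoglobin content (μg/mL)   Number of people   Cumulative frequency   Cumulative percentage (%)
0~                            2                  2                      1.54
5~                            3                  5                      3.85
10~                           9                 14                     10.77
15~                          12                 26                     20.00
20~                          15                 41                     31.54
25~                          27                 68                     52.31
3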