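Chapter 19 Survival Analysis

Fang Jiqian
School of Public Health, Sun Yat-sen University
May 2013

Section 1 Basic Concepts of Survival Analysis

I. Survival time

Survival time: the length of time from a defined starting point of observation to the occurrence of a specified endpoint event.

Three elements: the starting point of observation, the endpoint event, and the measure of time.

For example:
the time from surgical resection to death in bladder tumor patients;
the time from chemotherapy to complete remission in acute leukemia patients;
the time from disappearance of positive signs to first recurrence in women with mammary gland hyperplasia.

Table 19-1 Variable coding for the bladder tumor survival data

| Variable (1) | Factor (2) | Grouping and coding (3) |
|---|---|---|
| age | Age | years |
| grade | Tumor grade | Grade I: 1; Grade II: 2; Grade III: 3 |
| size | Tumor size (cm) | <3.0: 0; ≥3.0: 1 |
| relapse | Relapse | no relapse: 0; relapse: 1 |
| start | Date of surgery | month/day/year |
| end | Date observation ended | month/day/year |
| t | Survival time | months |
| status | Survival outcome | censored: 0; death: 1 |

Table 19-2 Raw survival records for 6 bladder tumor patients

| id (1) | age (2) | grade (3) | size (4) | relapse (5) | start (6) | end (7) | t (8) | status (9) | Outcome (10) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 62 | 1 | 0 | 0 | 02/10/1996 | 12/30/2000 | 59 | 0 | censored (alive) |
| 2 | 64 | 1 | 0 | 0 | 03/05/1996 | 08/12/2000 | 54 | 1 | death |
| 3 | 52 | 2 | 0 | 1 | 04/09/1996 | 12/03/1999 | 44 | 0 | censored (lost to follow-up) |
| 4 | 60 | 1 | 0 | 0 | 06/06/1996 | 10/27/2000 | 53 | 0 | censored (died of other causes) |
| 5 | 59 | 2 | 1 | 0 | 07/20/1996 | 06/21/1998 | 23 | 1 | death |
| 6 | 59 | 1 | 1 | 1 | 08/19/1996 | 09/10/1999 | 37 | 1 | death |

1. Complete data
The endpoint event was observed during follow-up, so the survival time is known exactly. By follow-up outcome, these are patients 2, 5, and 6.

2. Censored data
The endpoint event was not observed, so the exact survival time is unknown.
Causes of censoring:
(1) the endpoint event had not yet occurred when the study ended, e.g. patient 1;
(2) loss to follow-up, e.g. patient 3;
(3) death from another cause, e.g. patient 4.
A censored observation is usually marked with a "+" at its upper right.
This chapter assumes that censoring occurs at random.

II. Probability of death and probability of survival

Probability of death: for an individual alive at the start of a time interval, the probability of dying within that interval.
Annual probability of death: the probability that a person alive at the start of the year dies within the following year.

$$q = \frac{\text{number of deaths during the year}}{\text{population at the start of the year}}$$

Probability of survival:

$$p = 1 - q$$

III. Survival rate

Survival function: the probability that an individual is still alive at time $t_k$.

With no censored data, the survival rate is computed directly:

$$\hat{S}(t_k) = P(T > t_k) = \frac{\text{number still alive at time } t_k}{\text{total number observed}}$$

With censored data, the follow-up period must be divided into intervals $(0, t_1), (t_1, t_2), \ldots, (t_{k-1}, t_k)$, and the survival probabilities $p_1, p_2, \ldots, p_k$ are computed for each interval:

$$\hat{S}(t_k) = P(T > t_k) = p_1 p_2 \cdots p_k = \hat{S}(t_{k-1})\, p_k$$

This is also called the cumulative probability of survival.

Section 2 Estimation of the Survival Curve

Nonparametric estimation of the survival rate:
Life-table method: suitable for coarsely recorded survival times and large samples.
Product-limit method: suitable for exactly recorded survival times.

I. Life-table method

Example 19-1 Follow-up data for 374 patients with a malignant tumor.

Table 19-3 Life-table calculation of the survival rate

| No. $i$ (1) | Years after diagnosis $t_i$ (2) | Deaths in interval $d_i$ (3) | Censored in interval $c_i$ (4) | Cases at start $n'_i$ (5) | Effective cases at start $n_i$ (6) | Probability of death $q_i$ (7) | Probability of survival $p_i$ (8) | Survival rate $\hat{S}(t_i)$ (9) | Standard error $S_{\hat{S}(t_i)}$ (10) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0~ | 90 | 0 | 374 | 374.0 | 90/374.0 = 0.2406 | 0.7594 | 0.7594 | 0.0221 |
| 2 | 1~ | 76 | 0 | 284 | 284.0 | 76/284.0 = 0.2676 | 0.7324 | 0.7594×0.7324 = 0.5562 | 0.0257 |
| 3 | 2~ | 51 | 0 | 208 | 208.0 | 51/208.0 = 0.2452 | 0.7548 | 0.5562×0.7548 = 0.4198 | 0.0255 |
| 4 | 3~ | 25 | 12 | 157 | 151.0 | 25/151.0 = 0.1656 | 0.8344 | 0.4198×0.8344 = 0.3503 | 0.0248 |
| 5 | 4~ | 20 | 5 | 120 | 117.5 | 20/117.5 = 0.1702 | 0.8298 | 0.3503×0.8298 = 0.2907 | 0.0239 |
| 6 | 5~ | 7 | 9 | 95 | 90.5 | 7/90.5 = 0.0773 | 0.9227 | 0.2907×0.9227 = 0.2682 | 0.0235 |
| 7 | 6~ | 4 | 9 | 79 | 74.5 | 4/74.5 = 0.0537 | 0.9463 | 0.2682×0.9463 = 0.2538 | 0.0233 |
| 8 | 7~ | 1 | 3 | 66 | 64.5 | 1/64.5 = 0.0155 | 0.9845 | 0.2538×0.9845 = 0.2499 | 0.0233 |
| 9 | 8~ | 3 | 5 | 62 | 59.5 | 3/59.5 = 0.0504 | 0.9496 | 0.2499×0.9496 = 0.2373 | 0.0232 |
| 10 | 9~10 | 2 | 5 | 54 | 51.5 | 2/51.5 = 0.0388 | 0.9612 | 0.2373×0.9612 = 0.2281 | 0.0232 |

Note: 47 patients survived longer than 10 years.

$$n'_i = n'_{i-1} - d_{i-1} - c_{i-1}, \qquad n_i = n'_i - 0.5\,c_i$$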
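
The life-table recursion above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `life_table` and the dictionary keys are hypothetical, and the death and censoring counts are copied from the table. Because the code carries full precision between intervals while the table multiplies rounded values, the fourth decimal place may differ slightly.

```python
# Interval data from the 374-patient life table (yearly intervals after diagnosis).
deaths = [90, 76, 51, 25, 20, 7, 4, 1, 3, 2]    # d_i
censored = [0, 0, 0, 12, 5, 9, 9, 3, 5, 5]      # c_i
n_start = 374                                   # cases at the start of interval 1


def life_table(deaths, censored, n_start):
    """Return one dict per interval with n'_i, n_i, q_i, p_i and S(t_i)."""
    rows = []
    n_prime = n_start      # n'_i: cases at the start of the interval
    surv = 1.0             # cumulative survival rate
    for d, c in zip(deaths, censored):
        n_eff = n_prime - 0.5 * c   # effective cases: censored count as half
        q = d / n_eff               # probability of death in the interval
        p = 1.0 - q                 # probability of survival in the interval
        surv *= p                   # S(t_i) = S(t_{i-1}) * p_i
        rows.append({"n_prime": n_prime, "n_eff": n_eff, "q": q, "p": p, "S": surv})
        n_prime = n_prime - d - c   # n'_{i+1} = n'_i - d_i - c_i
    return rows


rows = life_table(deaths, censored, n_start)
for i, r in enumerate(rows, start=1):
    print(f"{i:2d}  n'={r['n_prime']:3d}  n={r['n_eff']:6.1f}  "
          f"q={r['q']:.4f}  p={r['p']:.4f}  S={r['S']:.4f}")
```

Subtracting the deaths and censored cases of the last interval from its starting count leaves the patients who survived beyond 10 years, which offers a quick consistency check against the note under the table.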

Survival curve: the curve obtained by plotting survival time on the horizontal axis and survival rate on the vertical axis, and connecting the survival rates at successive time points.

II. Product-limit method (Kaplan-Meier method)

Example 19-2 The survival times (months) of 14 bladder tumor patients with tumors <3.0 cm are:
14 19 26 28 29 32 36 40 42 44+ 45 53+ 54 59+
Estimate the survival function.

| No. $i$ (1) | Time (months) $t_i$ (2) | Deaths $d_i$ (3) | Censored $c_i$ (4) | Cases at start $n_i$ (5) | Probability of death $q_i$ (6) | Probability of survival $p_i$ (7) | Survival rate $\hat{S}(t_i)$ (8) | Standard error $S_{\hat{S}(t_i)}$ (9) |
|---|---|---|---|---|---|---|---|---|
| 1 | 14 | 1 | 0 | 14 | 1/14 = 0.0714 | 0.9286 | 0.9286 | 0.0688 |
| 2 | 19 | 1 | 0 | 13 | 1/13 = 0.0769 | 0.9231 | 0.9286×0.9231 = 0.8572 | 0.0935 |
| 3 | 26 | 1 | 0 | 12 | 1/12 = 0.0833 | 0.9167 | 0.8572×0.9167 = 0.7858 | 0.1097 |
| 4 | 28 | 1 | 0 | 11 | 1/11 = 0.0909 | 0.9091 | 0.7858×0.9091 = 0.7144 | 0.1207 |
| 5 | 29 | 1 | 0 | 10 | 1/10 = 0.1000 | 0.9000 | 0.7144×0.9000 = 0.6429 | 0.1281 |
| 6 | 32 | 1 | 0 | 9 | 1/9 = 0.1111 | 0.8889 | 0.6429×0.8889 = 0.5715 | 0.1323 |
| 7 | 36 | 1 | 0 | 8 | 1/8 = 0.1250 | 0.8750 | 0.5715×0.8750 = 0.5001 | 0.1336 |
| 8 | 40 | 1 | 0 | 7 | 1/7 = 0.1429 | 0.8571 | 0.5001×0.8571 = 0.4286 | 0.1323 |
| 9 | 42 | 1 | 0 | 6 | 1/6 = 0.1667 | 0.8333 | 0.4286×0.8333 = 0.3571 | 0.1281 |
| 10 | 44+ | 0 | 1 | 5 | 0/5 = 0.0000 | 1.0000 | 0.3571×1.0000 = 0.3571 | 0.1281 |
| 11 | 45 | 1 | 0 | 4 | 1/4 = 0.2500 | 0.7500 | 0.3571×0.7500 = 0.2678 | 0.1233 |
| 12 | 53+ | 0 | 1 | 3 | 0/3 = 0.0000 | 1.0000 | 0.2678×1.0000 = 0.2678 | 0.1233 |
| 13 | 54 | 1 | 0 | 2 | 1/2 = 0.5000 | 0.5000 | 0.2678×0.5000 = 0.1339 | 0.1130 |
| 14 | 59+ | 0 | 1 | 1 | 0/1 = 0.0000 | 1.0000 | 0.1339×1.0000 = 0.1339 | 0.1130 |

$$n_i = n_{i-1} - d_{i-1} - c_{i-1}$$

The Kaplan-Meier survival curve is a step-shaped curve.

Median survival time, also called the half survival time: the time at which exactly 50% of individuals are still alive.
Tumor <3.0 cm group: median survival time ≈ 36 months.
Tumor ≥3.0 cm group: median survival time ≈ 20 months.
Think: if the survival rate at every time point is greater than 50%, what is the median survival time?

III. Interval estimation of the survival rate

1. Method 1

Standard error:

$$S_{\hat{S}(t_i)} = \hat{S}(t_i)\sqrt{\sum_{t_j \le t_i} \frac{d_j}{n_j (n_j - d_j)}}$$

In large samples the survival rate is approximately normally distributed, so the $(1-\alpha)$ confidence interval for the population survival rate is

$$\hat{S}(t_i) \pm Z_{\alpha/2}\, S_{\hat{S}(t_i)}$$

Example: at $t_i = 28$,

$$S_{\hat{S}(t_4)} = \hat{S}(t_4)\sqrt{\sum_{t_j \le t_4} \frac{d_j}{n_j (n_j - d_j)}} = 0.7144\sqrt{\frac{1}{14 \times 13} + \frac{1}{13 \times 12} + \frac{1}{12 \times 11} + \frac{1}{11 \times 10}} = 0.1207$$

The 95% confidence interval for the population survival rate is 0.7144 ± 1.96 × 0.1207 = (0.4778, 0.9510).
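
The product-limit estimate, its standard error, and the normal-approximation confidence interval can be reproduced with a short Python sketch. The names `kaplan_meier` and `median_survival` are hypothetical helpers written for this illustration, not functions from any library. The sketch assumes distinct observation times (one patient per time point), as in this example, and it does not truncate confidence limits to the range 0 to 1.

```python
import math

# Survival times (months) for the tumor < 3.0 cm group.
# event = 1 means death, event = 0 means censored (marked "+" in the text).
data = [(14, 1), (19, 1), (26, 1), (28, 1), (29, 1), (32, 1), (36, 1),
        (40, 1), (42, 1), (44, 0), (45, 1), (53, 0), (54, 1), (59, 0)]


def kaplan_meier(data, z=1.96):
    """Product-limit survival rates with standard errors and confidence limits.

    Assumes one subject per observation time (no ties).
    """
    n = len(data)      # cases at risk just before the current time
    surv = 1.0         # cumulative survival rate
    var_sum = 0.0      # running sum of d_j / (n_j * (n_j - d_j))
    rows = []
    for t, event in sorted(data):
        d = event
        surv *= 1.0 - d / n
        if d and n > d:                # skip the term when it would divide by zero
            var_sum += d / (n * (n - d))
        se = surv * math.sqrt(var_sum)
        rows.append({"t": t, "n": n, "d": d, "S": surv, "se": se,
                     "lower": surv - z * se, "upper": surv + z * se})
        n -= 1                         # one subject leaves the risk set
    return rows


def median_survival(rows, tol=1e-9):
    """First time at which the survival rate is at most 50%, or None if never."""
    for r in rows:
        if r["S"] <= 0.5 + tol:        # tolerance guards against rounding error
            return r["t"]
    return None


km_rows = kaplan_meier(data)
for r in km_rows:
    print(f"t={r['t']:2d}  n={r['n']:2d}  d={r['d']}  S={r['S']:.4f}  "
          f"SE={r['se']:.4f}  CI=({r['lower']:.4f}, {r['upper']:.4f})")
print("median survival time:", median_survival(km_rows))
```

When the estimated survival rate never falls to 50%, `median_survival` returns `None`: the median survival time cannot be estimated from such data, which bears on the question posed above.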
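
Section 3 Comparison of Survival Curves

Example 19-2 Compare the survival curves of bladder tumor patients with tumors smaller than 3.0 cm and with tumors greater than or equal to 3.0 cm: do they differ in the population?

<3.0 cm: 14 19 26 28 29 32 36 40 42 44+ 45 53+ 54 59+
≥3.0 cm: 6 7 9 10 11 12 13 20 23 25 27 30 34 37 43 50

I. Log-rank test

$H_0$: $S_1(t) = S_2(t)$, i.e. the two population survival curves are the same.
$H_1$: $S_1(t) \ne S_2(t)$, i.e. the two population survival curves differ.
$\alpha = 0.05$

1. Pool the two groups and sort by survival time ($t_i$) in ascending order:
6 7 9 10 11 12 13 14 19 20 23 25 26 27 28 29 30 32 34 36 37 40 42 43 44+ 45 50 53+ 54 59+

At each time point, the expected number of deaths in group $g$ is $T_{gi} = n_{gi}\, d_i / n_i$, where $n_{gi}$ is the number at risk in group $g$, and $n_i$ and $d_i$ are the pooled number at risk and number of deaths.

| No. $i$ (1) | Time (months) $t_i$ (2) | <3.0 cm: $n_{1i}$ (3) | <3.0 cm: $d_{1i}$ (4) | <3.0 cm: $T_{1i}$ (5) | ≥3.0 cm: $n_{2i}$ (6) | ≥3.0 cm: $d_{2i}$ (7) | ≥3.0 cm: $T_{2i}$ (8) | Total: $n_i$ (9) | Total: $d_i$ (10) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 6 | 14 | 0 | 0.4667 | 16 | 1 | 0.5333 | 30 | 1 |
| 2 | 7 | 14 | 0 | 0.4827 | 15 | 1 | 0.5172 | 29 | 1 |
| 3 | 9 | 14 | 0 | 0.5000 | 14 | 1 | 0.5000 | 28 | 1 |
| 4 | 10 | 14 | 0 | 14×1/27 | 13 | 1 | 13×1/27 | 27 | 1 |
| 5 | 11 | 14 | 0 | 14×1/26 | 12 | 1 | 12×1/26 | 26 | 1 |
| 6 | 12 | 14 | 0 | 14×1/25 | 11 | 1 | 11×1/25 | 25 | 1 |
| 7 | 13 | 14 | 0 | 14×1/24 | 10 | 1 | 10×1/24 | 24 | 1 |
| 8 | 14 | 14 | 1 | 14×1/23 | 9 | 0 | 9×1/23 | 23 | 1 |
| 9 | 19 | 13 | 1 | 13×1/22 | 9 | 0 | 9×1/22 | 22 | 1 |
| … | … | … | … | … | … | … | … | … | … |
| Total | - | - | 11 | 17.5416 | - | 16 | 9.4584 | - | 27 |

With $A_g$ the observed and $T_g$ the expected total number of deaths in group $g$, the test statistic is

$$\chi^2 = \sum_g \frac{(A_g - T_g)^2}{T_g} = \frac{(11 - 17.5416)^2}{17.5416} + \frac{(16 - 9.4584)^2}{9.4584} = 6.96$$

From the $\chi^2$ critical value table with $\nu = 1$, $0.005 < P < 0.010$. $H_0$ is rejected, and the two survival curves can be considered different.

The log-rank test is also known as the time-sequence test (时序检验).

II. Applications and points to note

1. The log-rank test also applies to life-table data and to comparisons of more than two groups.

2. Relative death ratio:

$$R = \frac{A}{T}$$

Relative risk (RR):

$$RR = \frac{A_1 / T_1}{A_2 / T_2} = \frac{11 / 17.5416}{16 / 9.4584} = 0.37$$

3. The log-rank test is a single-factor (univariate) method.
Condition for use: confounding factors that affect the survival rate are balanced between the groups.

4. What was introduced above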