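Spearman and Pearson
Presenter: 屠星月

Contents
I. Meaning of the Spearman correlation coefficient
II. Characteristics of Spearman correlation analysis
III. Comparison of Spearman and Pearson correlation analysis
IV. Interpreting the results
V. Applicability to my research data

I. Meaning of the Spearman correlation coefficient
Spearman rank-order correlation coefficient
• Measures the monotone association between two continuous or ordinal (ranked) variables.
• Indicates how well the relationship between two variables can be described using a monotonic function.
• A special case of the Pearson correlation coefficient, in which X and Y are replaced by their ranks:
  \rho = \frac{\operatorname{cov}(R_x, R_y)}{\sigma_{R_x} \sigma_{R_y}}
  or, when there are no tied ranks,
  \rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n (n^2 - 1)}
  where R_x and R_y are the ranks of the x and y variables, respectively, d_i^2 is the square of the difference between the corresponding ranks of x_i and y_i, and n is the number of observations.

II. Characteristics of Spearman correlation analysis
• Makes no assumption about the distribution of the variables: it is non-parametric (distribution-free).
• Can be used with ordinal data ("The most fundamental requisite is to be able to measure our observed correspondence by a plain numerical symbol").
• Used when the distribution of the data makes Pearson's correlation coefficient undesirable or misleading.
• Especially suitable when the data are ordinal (ranked categories) or contain outliers.

III. Comparison of Spearman and Pearson correlation analysis
1. For the same data, a statistically significant Spearman correlation can go together with either a significant or a non-significant Pearson correlation coefficient, even for big sets of data.
2. For the same data, the two coefficients can have opposite signs: it is possible to meet a situation where Pearson's coefficient is negative while Spearman's coefficient is positive.
3. Note: make sure not to overinterpret Spearman's rank correlation coefficient as a significant measure of the strength of the association between two variables.

IV. Interpreting the results
Effect size (biological relevance / correlation coefficient / r)
• The extent to which two variables tend to change together (strength and direction).

Statistical significance (P)
• The probability of obtaining the observed data, or more extreme data, if the null hypothesis H0 is true. If this probability is below α (0.05), the data are treated as a low-probability event under H0, and H0 is therefore rejected.
• α: the probability of a Type I error (rejecting H0 when H0 is true).

Statistical power
• The probability of rejecting the null hypothesis when it is false: the ability of a test to correctly reject the null hypothesis.
• Power = 1 − β, where β is the probability of a Type II error (failing to reject H0 when H0 is false).
• Purpose of a power analysis: calculating the power of a test beforehand helps ensure that the sample size is large enough for the purpose of the test. Power should be at least 80%.

r and the P value
Information to include when presenting results:
1. The correlation coefficient r (value and sign)
2. The (1 − α) confidence interval
3. The P value
When a result is statistically significant but the correlation coefficient is small, other variables may be influencing the relationship.

Magnitude of r
(Source: Benedict K, The Correlation Coefficient - Explained in Three Steps, =ugd4k3dC_8Y)

Notes on wording and interpretation:
1. Do not use "significant" or "significantly" in a non-statistical sense.
2. Write "statistically significant" instead, to make clear that you do not intend to make any value judgement on magnitude when you use the term "significant".
3. When you refer to biological significance (when using effect sizes), use the expression "biological relevance".
4. Statistical significance does not mean a strong correlation. Keep in mind that a statistically significant result does not entail that the magnitude of the effect is relevant biologically (and that on top of that it is supported by statistics).
5. "Statistically significant" only means that the P value is below the threshold α set in advance.
6. When the P value is above α, treat the null hypothesis as supported only if a power analysis was carried out beforehand.
7. r does not have to be large to indicate a relevant correlation. Biological relevance is decided a priori and it is not necessarily a large value. You have to judge the magnitude you consider biologically relevant on a case by case basis.
8. Uncertainty: provide confidence intervals for all estimated parameters (this also complements the P value).
9. Remember that the P value is not a direct measure of evidence against the null hypothesis. A smaller P value is not "better". (Only using Bayesian statistics do you obtain a true measure of evidence.)
10. The P value only states how probable the data are if the null hypothesis is true. Remember that all your p-value provides you with is the probability of having obtained your data, or more extreme data, and only if the null hypothesis that you stated is absolutely true. No other interpretation of what a p-value means is correct.

VI. References
1. Sullivan, G. M. and R. Feinn (2012). Using effect size - or why the P value is not enough. Journal of Graduate Medical Education 4(3): 279-282.
2. Support, M. E. A comparison of the Pearson and Spearman correlation methods.
3. Hauke, J. and T. Kossowski (2011). Comparison of values of Pearson's and Spearman's correlation coefficients on the same sets of data. Quaestiones Geographicae 30(2): 87-93.
4. Martinez-Abrain, A. (2008). Statistical significance and biological relevance: A call for a more cautious interpretation of results in ecology. Acta Oecologica 34(1): 9-11.
5. Power of a Statistical Test - MoreSteam.com
6. Benedict K, The Correlation Coefficient - Explained in Three Steps, =ugd4k3dC_8Y
7. Amanda Rockinson-Szapkiw, How to Calculate Statistical Power Using SPSS, =sAkew2lK-co

Thank you!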
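Supplementary sketch for sections I and III: the statement that Spearman's coefficient is Pearson's coefficient computed on ranks can be checked with a few lines of standard-library Python. The function names (`average_ranks`, `pearson`, `spearman`, `spearman_shortcut`) and the two small data sets are made up for illustration and are not taken from the slides. The sketch assumes tied values receive the mean of their ranks, and the d_i^2 shortcut is only exact when there are no ties.

```python
from math import sqrt


def average_ranks(values):
    """Rank values 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the ranks of x and y."""
    return pearson(average_ranks(x), average_ranks(y))


def spearman_shortcut(x, y):
    """1 - 6 * sum(d_i^2) / (n (n^2 - 1)); exact only without ties."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Illustrative data only (not from the slides).
x = [1, 2, 3, 4, 5, 6]
y_monotone = [v ** 3 for v in x]     # monotone but not linear
y_outlier = [2, 4, 6, 8, 10, -50]    # increasing trend plus one extreme value

print("monotone:", spearman(x, y_monotone), pearson(x, y_monotone))
print("outlier: ", spearman(x, y_outlier), pearson(x, y_outlier))
```

On the monotone data Spearman's coefficient reaches its maximum while Pearson's stays below it. On the outlier data the two coefficients have opposite signs, which is the situation described in section III, point 2.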
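Supplementary sketch for section IV: the advice to report a (1 − α) confidence interval and to check power before testing can be illustrated with the Fisher z-transformation. This is a rough approximation that assumes a Pearson correlation on bivariate normal data; for Spearman's coefficient it should be read only as an order-of-magnitude guide. The function names (`fisher_ci`, `approx_power`, `n_for_power`) and the example values (r = 0.3, n = 85) are hypothetical and do not come from the slides.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def fisher_ci(r, n, alpha=0.05):
    """Approximate (1 - alpha) confidence interval for a correlation r."""
    z = atanh(r)                  # Fisher z-transform of r
    se = 1 / sqrt(n - 3)          # approximate standard error of z
    zcrit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return tanh(z - zcrit * se), tanh(z + zcrit * se)


def approx_power(r_true, n, alpha=0.05):
    """Approximate power of the two-sided test of H0: rho = 0."""
    zcrit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(atanh(r_true)) * sqrt(n - 3)
    # Probability that the test statistic falls in either rejection region.
    return _STD_NORMAL.cdf(shift - zcrit) + _STD_NORMAL.cdf(-shift - zcrit)


def n_for_power(r_true, power=0.80, alpha=0.05):
    """Smallest n whose approximate power reaches the target."""
    if r_true == 0 or abs(r_true) >= 1:
        raise ValueError("r_true must be non-zero and strictly between -1 and 1")
    n = 4
    while approx_power(r_true, n, alpha) < power:
        n += 1
    return n


# Hypothetical planning values, chosen only for illustration.
n_needed = n_for_power(0.3)
low, high = fisher_ci(0.3, n_needed)
print("n for 80% power at r = 0.3:", n_needed)
print("95% CI for r = 0.3 at that n:", low, high)
```

The sample size is found by a simple upward search rather than a closed-form expression, which keeps the link to the 80% power threshold explicit.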