Understanding Hypothesis Tests: Significance Levels (Alpha) and P values in Statistics

What do significance levels and P values mean in hypothesis tests? What is statistical significance anyway? In this post, I’ll continue to focus on concepts and graphs to help you gain a more intuitive understanding of how hypothesis tests work in statistics.

To bring it to life, I’ll add the significance level and P value to the graph in my previous post in order to perform a graphical version of the 1 sample t-test. It’s easier to understand when you can see what statistical significance truly means!

Here’s where we left off in my last post. We want to determine whether our sample mean (330.6) indicates that this year's average energy cost is significantly different from last year’s average energy cost of $260.

The probability distribution plot above shows the distribution of sample means we’d obtain under the assumption that the null hypothesis is true (population mean = 260) and we repeatedly drew a large number of random samples.

I left you with a question: where do we draw the line for statistical significance on the graph? Now we'll add in the significance level and the P value, which are the decision-making tools we'll need.

We'll use these tools to test the following hypotheses:

Null hypothesis: The population mean equals the hypothesized mean (260).
Alternative hypothesis: The population mean differs from the hypothesized mean (260).

What Is the Significance Level (Alpha)?

The significance level, also denoted as alpha or α, is the probability of rejecting the null hypothesis when it is true. For example, a significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no actual difference.

These types of definitions can be hard to understand because of their technical nature. A picture makes the concepts much easier to comprehend!

The significance level determines how far out from the null hypothesis value we'll draw that line on the graph. To graph a significance level of 0.05, we need to shade the 5% of the distribution that is furthest away from the null hypothesis.

In the graph above, the two shaded areas are equidistant from the null hypothesis value and each area has a probability of 0.025, for a total of 0.05. In statistics, we call these shaded areas the critical region for a two-tailed test. If the population mean is 260, we’d expect to obtain a sample mean that falls in the critical region 5% of the time. The critical region defines how far away our sample statistic must be from the null hypothesis value before we can say it is unusual enough to reject the null hypothesis.

Our sample mean (330.6) falls within the critical region, which indicates it is statistically significant at the 0.05 level.

We can also see if it is statistically significant using the other common significance level of 0.01.

The two shaded areas each have a probability of 0.005, which adds up to a total probability of 0.01. This time our sample mean does not fall within the critical region and we fail to reject the null hypothesis. This comparison shows why you need to choose your significance level before you begin your study. It protects you from choosing a significance level because it conveniently gives you significant results!

Thanks to the graph, we were able to determine that our results are statistically significant at the 0.05 level without using a P value. However, when you use the numeric output produced by statistical software, you’ll need to compare the P value to your significance level to make this determination.

What Are P values?

P-values are the probability of obtaining an effect at least as extreme as the one in your sample data, assuming the truth of the null hypothesis.

This definition of P values, while technically correct, is a bit convoluted. It’s easier to understand with a graph!

To graph the P value for our example data set, we need to determine the distance between the sample mean and the null hypothesis value (330.6 - 260 = 70.6). Next, we can graph the probability of obtaining a sample mean that is at least as extreme in both tails of the distribution (260 +/- 70.6).

In the graph above, the two shaded areas each have a probability of 0.01556, for a total probability of 0.03112. This probability represents the likelihood of obtaining a sample mean that is at least as extreme as our sample mean in both tails of the distribution if the population mean is 260. That’s our P value!

When a P value is less than or equal to the significance level, you reject the null hypothesis. If we take the P value for our example and compare it to the common significance levels, it matches the previous graphical results. The P value of 0.03112 is statistically significant at an alpha level of 0.05, but not at the 0.01 level.

If we stick to a significance level of 0.05, we can conclude that the average energy cost for the population is greater than 260.

A common mistake is to interpret the P-value as the probability that the null hypothesis is true. To understand why this interpretation is incorrect, please read my blog post How to Correctly Interpret P Values.

Discussion about Statistically Significant Results

A hypothesis test evaluates two mutually exclusive statements about a population to determine which statement is best supported by the sample data. A test result is statistically significant when the sample statistic is unusual enough relative to the null hypothesis that we can reject the null hypothesis for the entire population. “Unusual enough” in a hypothesis test is defined by:

The assumption that the null hypothesis is true: the graphs are centered on the null hypothesis value.
The significance level: how far out do we draw the line for the critical region?
Our sample statistic: does it fall in the critical region?

Keep in mind that there is no magic significance level that distinguishes between the studies that have a true effect and those that don’t with 100% accuracy.
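
If you'd like to reproduce the graphical reasoning numerically, here is a minimal Python sketch of the two decision tools discussed above: the critical region for a chosen alpha and the two-tailed P value. This post does not state the sample size or the standard error of the mean, so the values used here (24 degrees of freedom and a standard error of 30.8) are assumptions chosen for illustration. The helper names are hypothetical as well. To stay dependency-free, the sketch integrates the t distribution numerically instead of calling statistical software, so treat it as an approximation and not as the method behind the graphs.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=1000):
    """Approximate P(T <= x) with Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def critical_t(alpha, df):
    """Bisection search for the t value that leaves alpha/2 in the upper tail."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > alpha / 2:
            lo = mid  # tail still too large, move further out
        else:
            hi = mid
    return (lo + hi) / 2

def critical_region(alpha, null_mean, std_err, df):
    """Bounds of the two-tailed critical region; sample means outside are significant."""
    margin = critical_t(alpha, df) * std_err
    return null_mean - margin, null_mean + margin

def two_tailed_p(sample_mean, null_mean, std_err, df):
    """Probability of a sample mean at least this far from the null, in both tails."""
    t_stat = (sample_mean - null_mean) / std_err
    return 2 * (1 - t_cdf(abs(t_stat), df))

# Values from the post: sample mean 330.6, null hypothesis value 260.
# ASSUMED for illustration (not given in the post): df = 24, standard error = 30.8.
SAMPLE_MEAN, NULL_MEAN = 330.6, 260.0
STD_ERR, DF = 30.8, 24

p_value = two_tailed_p(SAMPLE_MEAN, NULL_MEAN, STD_ERR, DF)
print(f"P value: {p_value:.4f}")

for alpha in (0.05, 0.01):
    lower, upper = critical_region(alpha, NULL_MEAN, STD_ERR, DF)
    in_region = SAMPLE_MEAN < lower or SAMPLE_MEAN > upper
    decision = "reject" if p_value <= alpha else "fail to reject"
    print(f"alpha={alpha}: critical region is below {lower:.1f} or above {upper:.1f}; "
          f"sample mean in region: {in_region}; {decision} the null hypothesis")
```

Both routes should always agree: the sample mean falls in the critical region exactly when the P value is less than or equal to alpha. That is why the graph and the software output lead to the same conclusion.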