Estimating Multilevel Models using SPSS, Stata, SAS, and R

Jeremy J. Albright and Dani M. Marinova

July 14, 2010

Multilevel data are pervasive in the social sciences.[1] Students may be nested within schools, voters within districts, or workers within firms, to name a few examples. Statistical methods that explicitly take into account hierarchically structured data have gained popularity in recent years, and there now exist several special-purpose statistical programs designed specifically for estimating multilevel models (e.g. HLM, MLwiN). In addition, the increasing use of multilevel models, also known as hierarchical linear and mixed effects models, has led general purpose packages such as SPSS, Stata, SAS, and R to introduce their own procedures for handling nested data.

[1] Jeremy wrote the original document. Dani wrote the section on R and rewrote parts of the section on Stata.

Nonetheless, researchers may face two challenges when attempting to determine the appropriate syntax for estimating multilevel/mixed models with general purpose software. First, many users from the social sciences come to multilevel modeling with a background in regression models, whereas much of the software documentation utilizes examples from experimental disciplines [due to the fact that multilevel modeling methodology evolved out of ANOVA methods for analyzing experiments with random effects (Searle, Casella, and McCulloch, 1992)]. Second, notation for multilevel models is often inconsistent across disciplines (Ferron 1997).

The purpose of this document is to demonstrate how to estimate multilevel models using SPSS, Stata, SAS, and R. It first seeks to clarify the vocabulary of multilevel models by defining what is meant by fixed effects, random effects, and variance components. It then compares the model building notation frequently employed in applications from the social sciences with the more general matrix notation found in much of the software documentation. The syntax for centering variables and estimating multilevel models is then presented for each package.

1 Vocabulary of Mixed and Multilevel Models

Models for multilevel data have developed out of methods for analyzing experiments with random effects. Thus it is important for those interested in using hierarchical linear models to have a minimal understanding of the language experimental researchers use to differentiate between effects considered to be random or fixed.

In an ideal experiment, the researcher is interested in whether or not the presence or absence of one factor affects scores on an outcome variable.[2] Does a particular pill reduce cholesterol more than a placebo? Can behavioral modification reduce a particular phobia better than psychoanalysis or no treatment? The factors in these experiments are said to be fixed because "the same, fixed levels would be included in replications of the study" (Maxwell and Delaney, pg. 469). That is, the researcher is only interested in the exact categories of the factor that appear in the experiment. The typical model for a one-factor experiment is:

$$y_{ij} = \mu + \alpha_j + e_{ij} \tag{1}$$

where the score on the dependent variable for individual $i$ is the sum of the grand mean of the sample ($\mu$), the effect of receiving treatment $j$ ($\alpha_j$), and an individual error term $e_{ij}$. In general, some kind of constraint is placed on the alpha values, such that they sum to zero and the model is identified. In addition, it is assumed that the errors are independent and normally distributed with constant variance.

[2] In the parlance of experiments, a factor is a categorical variable. The term covariate refers to continuous independent variables.

In some experiments, however, a particular factor may not be fixed and perfectly replicable across experiments. Instead, the distinct categories present in the experiment represent a random sample from a larger population. For example, different nurses may administer an experimental drug to subjects. Usually the effect of a specific nurse is not of theoretical interest, but the researcher will want to control for the possibility that an independent caregiver effect is present beyond the fixed drug effect being investigated. In such cases the researcher may add a term to control for the random effect:

$$y_{ij} = \mu + \alpha_j + \beta_k + (\alpha\beta)_{jk} + e_{ij} \tag{2}$$

where $\beta_k$ represents the effect of the $k$th level of the random effect, and $(\alpha\beta)_{jk}$ represents the interaction between the random and fixed effects.

A model that contains only fixed effects and no random effects, such as equation 1, is known as a fixed effects model. One that includes only random effects and no fixed effects is termed a random effects model. Equation 2 is actually an example of a mixed effects model because it contains both random and fixed effects.

While the notation in equation 2 for the random effect is the same as for the fixed effect (that is, both are denoted by subscripted Greek letters), an important difference exists in the tests for the drug and nurse factors. For the fixed effect, the researcher is interested in only those levels included in the experiment, and the null hypothesis is that there are no differences in the means of each treatment group:

$$H_0: \alpha_1 = \alpha_2 = \ldots = \alpha_j$$
$$H_1: \alpha_j \neq \alpha_{j'}$$

For the random effect in the drug example, the researcher is not interested in the particular nurses per se but instead wishes to generalize about the potential effects of drawing different nurses from the larger population. The null hypothesis for the random effect is therefore that its variance is equal to zero:

$$H_0: \sigma^2_{\beta} = 0$$
$$H_1: \sigma^2_{\beta} > 0$$

The estimated variance is known as a variance component, and its estimation is an essential step in mixed effects models.

Oftentimes in experimental settings, the random effects are nuisances that necessitate statistical controls. In the above example, the effect of the drug was the primary interest, whereas the nurse factor was potentially confounding but theoretically uninteresting. It is nonetheless necessary to include the relevant ran