Causal Effects in Nonexperimental Studies: Reevaluating the Evaluation of Training Programs

Author(s): Rajeev H. Dehejia and Sadek Wahba
Source: Journal of the American Statistical Association, Vol. 94, No. 448 (Dec., 1999), pp. 1053-1062
Published by: American Statistical Association

[...] (NSW) Demonstration, a labor training program, on postintervention earnings. We use data from Lalonde's evaluation of nonexperimental methods that combine the treated units from a randomized evaluation of the NSW with nonexperimental comparison units drawn from survey datasets. We apply propensity score methods to this composite dataset and demonstrate that, relative to the estimators that Lalonde evaluates, propensity score estimates of the treatment impact are much closer to the experimental benchmark estimate. Propensity score methods assume that the variables associated with assignment to treatment are observed (referred to as ignorable treatment assignment, or selection on observables). Even under this assumption, it is difficult to control for differences between the treatment and comparison groups when they are dissimilar and when there are many preintervention variables. The estimated propensity score (the probability of assignment to treatment, conditional on preintervention variables) summarizes the preintervention variables. This offers a diagnostic on the comparability of the treatment and comparison groups, because one has only to compare the estimated propensity score across the two groups. We discuss several methods (such as stratification and matching) that use the propensity score to estimate the treatment impact. When the range of estimated propensity scores of the treatment and comparison groups overlap, these methods can estimate the treatment impact for the treatment group. A sensitivity analysis shows that our estimates are not sensitive to the specification of the estimated propensity score, but are sensitive to the assumption of selection on observables. We conclude that
when the treatment and comparison groups overlap, and when the variables determining assignment to treatment are observed, these methods provide a means to estimate the treatment impact. Even though propensity score methods are not always applicable, they offer a diagnostic on the quality of nonexperimental comparison groups in terms of observable preintervention variables.

KEY WORDS: Matching; Program evaluation; Propensity score.

1. INTRODUCTION

This article discusses the estimation of treatment effects in observational studies. This issue has been the focus of much attention because randomized experiments cannot always be implemented and has been addressed inter alia by Lalonde (1986), whose data we use herein. Lalonde estimated the impact of the National Supported Work (NSW) Demonstration, a labor training program, on postintervention income levels. He used data from a randomized evaluation of the program and examined the extent to which nonexperimental estimators can replicate the unbiased experimental estimate of the treatment impact when applied to a composite dataset of experimental treatment units and nonexperimental comparison units. He concluded that standard nonexperimental estimators such as regression, fixed-effects, and latent variable selection models are either inaccurate relative to the experimental benchmark or sensitive to the specification used in the regression. Lalonde's results have been influential in renewing the debate on experimental versus nonexperimental evaluations (see Manski and Garfinkel 1992) and in spurring a search for alternative estimators and specification tests (see, e.g., Heckman and Hotz 1989; Manski, Sandefur, McLanahan, and Powers 1992).

Rajeev Dehejia is Assistant Professor, Department of Economics and SIPA, Columbia University, New York, NY 10027 (E-mail: dehejia@columbia.edu). Sadek Wahba is Vice President, Morgan Stanley & Co. Incorporated, New York, NY 10036 (E-mail: wahbas@ms.com). This work was partially supported by a grant from the Social Sciences and Humanities Research Council of Canada (first author) and a World Bank Fellowship (second author). The authors gratefully acknowledge an associate editor, two anonymous referees, Gary Chamberlain, Guido Imbens, and Donald Rubin, whose detailed comments and suggestions greatly improved the article. They thank Robert Lalonde for providing the data from his 1986 study and substantial help in recreating the original dataset, and also thank Joshua Angrist, George Cave, David Cutler, Lawrence Katz, Caroline Minter-Hoxby, and Jeffrey Smith.

In this article we apply propensity score methods (Rosenbaum and Rubin 1983) to Lalonde's dataset. The propensity score is defined as the probability of assignment to treatment, conditional on cova