

Robust Estimation and Outlier Detection for Overdispersed Multinomial Models of Count Data*

Walter R. Mebane, Jr.†
Jasjeet S. Sekhon‡

July 24, 2003

*Earlier versions of this paper were presented in seminars at Harvard University, Washington University and Binghamton University–SUNY, at the 2002 Annual Meeting of the American Political Science Association, the 2002 Political Methodology Summer Meeting, and the 2002 Annual Meeting of the Midwest Political Science Association, and significantly different versions of some parts were presented at the 2001 Joint Statistical Meetings, and at the 2001 Political Methodology Summer Meeting. We thank Jonathan Wand for contributions to earlier versions of this work, Todd Rice and Lamarck, Inc., for generous support and provision of computing resources, John Jackson for giving us his FORTRAN code and Poland data, and Gary King for helpful comments. The authors share equal responsibility for all errors.

†Professor, Department of Government, Cornell University. 217 White Hall, Ithaca, NY 14853-4601 (Phone: 607-255-2868; Fax: 607-255-4530; E-mail: wrm1@cornell.edu).

‡Assistant Professor, Department of Government, Harvard University. 34 Kirkland Street, Cambridge, MA 02138 (Phone: 617-496-2426; Fax: 617-496-5149; E-mail: jasjeet_sekhon@harvard.edu).

Abstract

We develop a robust estimator, the hyperbolic tangent (tanh) estimator, for overdispersed multinomial regression models of count data. The tanh estimator provides accurate estimates and reliable inferences even when the specified model is not good for as much as half of the data. Seriously ill-fitted counts (outliers) are identified as part of the estimation. A Monte Carlo sampling experiment shows that the tanh estimator produces good results at practical sample sizes even when ten percent of the data are generated by a significantly different process. The experiment shows that, with contaminated data, estimation fails using four other estimators: the nonrobust maximum likelihood estimator, the additive logistic model and two SUR models. Using the tanh estimator to analyze data from Florida for the 2000 presidential election matches well-known features of the election that the other four estimators fail to capture. In an analysis of data from the 1993 Polish parliamentary election, the tanh estimator gives sharper inferences than does a previously proposed heteroscedastic SUR model.

Introduction

Regression models for vectors of counts are commonly used in a variety of substantive fields. Count models have been used in international relations (Schrodt 1995) and to analyze domestic political violence (Wang, Dixon, Muller, and Seligson 1993). Other social science applications include research on labor relations (Card 1990), the relationship between patents and R&D (Hausman, Hall, and Griliches 1984), and models of household fertility decisions (Famoye and Wang 1997). Recent work analyzing counts in political science includes studies of child care services (Bratton and Ray 2002), gender in legislatures (McDonagh 2002), gender and educational outcomes (Keiser, Wilkins, Meier, and Holland 2002), negative campaigning (Kahn and Kenney 2002; Lau and Pomper 2002), and votes (Canes-Wrone, Brady, and Cogan 2002; Monroe and Rose 2002). In most of these cases the most natural model for the counts is the basic multinomial regression model (e.g. Cameron and Trivedi 1998, 270; McCullagh and Nelder 1989, 164–174).

Counts of this kind measure the distribution of events among a finite set of alternatives, where each event generates one outcome. For vote counts, the alternatives are the candidates or parties that are competing for a particular office, and the multinomial model is relevant when each voter casts one vote. The model does not examine each individual separately but instead analyzes aggregates that correspond to the unit of observation. For vote counts the aggregates are usually legally defined voting districts, such as precincts, or larger units such as legislative districts, counties or provinces. Observations in this model measure the number of individuals in each unit who choose each alternative.

A multinomial model treats the number of individuals in each observational unit as fixed, and estimation focuses on how the proportion expected to choose each alternative depends on the regressors. Each expected proportion corresponds to the probability of making each choice according to the multinomial model. Usually these probabilities are defined as logistic functions of linear combinations of the regressors (see equation (1) below), and the problem is to estimate values for the unknown coefficient parameters in those linear combinations. In the basic model the probabilities and the total of the counts for each observation are both necessary to define the statistical distribution of the data, including the mean and the variance.

One of the most important reasons to use a multinomial model is that the counts are heteroscedastic: the variance of the counts and consequently statistical properties of parameter estimates, such as the estimates' standard errors, depend on both the probabilities and the observation totals. The situation is analogous to the reasons why one should use a logit or probit model and not ordinary least squares (the linear probability model) with binary choice data. Unfortunately, recent analyses of count data in political science, such as Bratton and Ray (2002), Canes-Wrone et al. (2002), Keiser et al. (2002), Kahn and Kenney (2002), Lau and Pomper (2002), McDonagh (2002) and Monroe and Rose (2002), reduce the counts to percentages or proportions and ignore heteroscedasticity. As we shall illustrate in a sampling experiment, ignoring heteroscedasticity generally results in incorrect statistical inferences.

In practice the basic model has proved to be inadequate for vote counts. A problem that has been widely recognized is that aggregate vote data usually exhibit grea...
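
To make the heteroscedasticity argument in the Introduction concrete, the following is a minimal Python sketch of the basic multinomial model. It is an illustration, not the authors' code. The paper's equation (1) is not reproduced in this excerpt, so the parameterization (last alternative as the reference category), the function names, and all numeric values are assumptions chosen for the example.

```python
import math


def multinomial_logit_probs(x, betas):
    """Choice probabilities for J alternatives; the last one is the reference.

    x     : regressor values for one observational unit
    betas : J-1 coefficient lists, one per non-reference alternative
            (hypothetical values, not estimates from the paper)
    """
    # Linear predictors; the reference alternative's predictor is fixed at 0.
    etas = [sum(b * v for b, v in zip(beta, x)) for beta in betas] + [0.0]
    top = max(etas)  # subtract the maximum for numerical stability
    weights = [math.exp(e - top) for e in etas]
    total = sum(weights)
    return [w / total for w in weights]


def count_variances(m, probs):
    """Variance of each count under the basic multinomial model: m * p * (1 - p)."""
    return [m * p * (1.0 - p) for p in probs]


def proportion_variances(m, probs):
    """Variance of each observed proportion (count / m): p * (1 - p) / m."""
    return [p * (1.0 - p) / m for p in probs]


if __name__ == "__main__":
    x = [1.0, 0.5]                     # intercept plus one regressor (made up)
    betas = [[0.2, 1.0], [-0.4, 0.3]]  # made-up coefficients, 3 alternatives
    probs = multinomial_logit_probs(x, betas)
    print("probabilities:", [round(p, 4) for p in probs])
    for m in (100, 10000):             # a small unit and a large unit
        print("m =", m, "proportion variances:", proportion_variances(m, probs))
```

With the same probabilities, the variance of an observed proportion in the unit with 100 voters is 100 times that in the unit with 10,000 voters. A regression on percentages that assumes constant variance ignores this dependence on both the probabilities and the unit totals.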
