Summary Bayesian Inference for Categorical Data An


Bayesian Inference for Categorical Data Analysis

Alan Agresti
Department of Statistics
University of Florida
Gainesville, Florida, USA 32611-8545
Phone USA (352) 392-1941, Fax (352) 392-5175
e-mail aa@stat.ufl.edu

David B. Hitchcock
Department of Statistics
University of South Carolina
Columbia, SC, USA 29208
e-mail hitchcock@stat.sc.edu

Summary

This article surveys Bayesian methods for categorical data analysis, with primary emphasis on contingency table analysis. Early innovations were proposed by Good (1953, 1956, 1965) for smoothing proportions in contingency tables and by Lindley (1964) for inference about odds ratios. These approaches primarily used conjugate beta and Dirichlet priors. Altham (1969, 1971) presented Bayesian analogs of small-sample frequentist tests for 2×2 tables using such priors. An alternative approach using normal priors for logits received considerable attention in the 1970s by Leonard and others (e.g., Leonard 1972). Adopted usually in a hierarchical form, the logit-normal approach allows greater flexibility and scope for generalization. The 1970s also saw considerable interest in loglinear modeling. The advent of modern computational methods since the mid-1980s has led to a growing literature on fully Bayesian analyses with models for categorical data, with main emphasis on generalized linear models such as logistic regression for binary and multi-category response variables.

Keywords: Beta distribution; Binomial distribution; Dirichlet distribution; Empirical Bayes; Graphical models; Hierarchical models; Logistic regression; Loglinear models; Markov chain Monte Carlo; Matched pairs; Multinomial distribution; Odds ratio; Smoothing.

1 Introduction

1.1 A brief history up to 1965

The purpose of this article is to survey Bayesian methods for analyzing categorical data. The starting place is the landmark work by Bayes (1763) and by Laplace (1774) on estimating a binomial parameter. They both used a uniform prior distribution for the binomial parameter. Dale (1999) and Stigler (1986, pp. 100-136) summarized this work, Stigler (1982) discussed what Bayes implied by his use of a uniform prior, and Hald (1998) discussed later developments.

For contingency tables, the sample proportions are ordinary maximum likelihood (ML) estimators of multinomial cell probabilities. When data are sparse, these can have undesirable features. For instance, for a cell with a sampling zero, 0.0 is usually an unappealing estimate. Early applications of Bayesian methods to contingency tables involved smoothing cell counts to improve estimation of cell probabilities with small samples.

Much of this appeared in various works by I. J. Good. Good (1953) used a uniform prior distribution over several categories in estimating the population proportions of animals of various species. Good (1956) used log-normal and gamma priors in estimating association factors in contingency tables. For a particular cell, the association factor is defined to be the probability of that cell divided by its probability assuming independence (i.e., the product of the marginal probabilities). Good's (1965) monograph summarized the use of Bayesian methods for estimating multinomial probabilities in contingency tables, using a Dirichlet prior distribution. Good also was innovative in his early use of hierarchical and empirical Bayesian approaches. His interest in this area apparently evolved out of his service as the main statistical assistant in 1941 to Alan Turing on intelligence issues during World War II (e.g., see Good 1980).

In an influential article, Lindley (1964) focused on estimating summary measures of association in contingency tables. For instance, using a Dirichlet prior distribution for the multinomial probabilities, he found the posterior distribution of contrasts of log probabilities, such as the log odds ratio.

Early critics of the Bayesian approach included R. A. Fisher. For instance, in his book Statistical Methods and Scientific Inference in 1956, Fisher challenged the use of a uniform prior for the binomial parameter, noting that uniform priors on other scales would lead to different results. (Interestingly, Fisher was the first to use the term "Bayesian," starting in 1950. See Fienberg (2005) for a detailed discussion of the evolution of the term. Fienberg notes that the modern growth of Bayesian methods followed the popularization in the 1950s of the term "Bayesian" by, in particular, L. J. Savage, I. J. Good, H. Raiffa and R. Schlaifer.)

1.2 Outline of this article

Leonard and Hsu (1994) selectively reviewed the growth of Bayesian approaches to categorical data analysis since the groundbreaking work by Good and by Lindley. Much of this review focused on research in the 1970s by Leonard that evolved naturally out of Lindley (1964). An encyclopedia article by Albert (2004) focused on more recent developments, such as model selection issues. Of the many books published in recent years on the Bayesian approach, the most complete coverage of categorical data analysis is the chapter of O'Hagan and Forster (2004) on discrete data models and the text by Congdon (2005).

The purpose of our article is to provide a somewhat broader overview, in terms of covering a much wider variety of topics than these published surveys. We do this by organizing the sections according to the structure of the categorical data. Section 2 begins with estimation of binomial and multinomial parameters, continuing into estimation of cell probabilities in contingency tables and related parameters for loglinear models (Section 3). Section 4 discusses Bayesian analogs of some classical confidence intervals and significance tests. Section 5 deals with extensions to the regression modeling of categorical response variables. Computational aspects are d
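
As a small illustration of the smoothing idea in Section 1.1, the sketch below compares the ML estimates (sample proportions) with posterior means under a symmetric Dirichlet prior for a table containing a sampling zero. It is not taken from the article. The function names, the example counts, and the choice alpha = 1 (a uniform prior over the cell probabilities) are assumptions made for illustration only.

```python
from fractions import Fraction


def ml_estimates(counts):
    """Sample proportions: the ML estimates of multinomial cell probabilities."""
    n = sum(counts)
    return [Fraction(c, n) for c in counts]


def dirichlet_posterior_means(counts, alpha=1):
    """Posterior means under a symmetric Dirichlet(alpha, ..., alpha) prior.

    For counts n_i with total n over k cells, the posterior is
    Dirichlet(n_1 + alpha, ..., n_k + alpha), and the posterior mean of
    cell i is (n_i + alpha) / (n + k * alpha).
    """
    n = sum(counts)
    k = len(counts)
    return [Fraction(c + alpha, n + k * alpha) for c in counts]


# Hypothetical counts with a sampling zero in the first cell.
counts = [0, 3, 7]
print("ML estimates:   ", ml_estimates(counts))
print("Posterior means:", dirichlet_posterior_means(counts))
```

With two cells and alpha = 1 this reduces to (y + 1) / (n + 2), the posterior mean of a binomial parameter under the uniform prior used by Bayes and Laplace. The zero cell receives a small positive estimate instead of 0.0, and the estimates still sum to one.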
