Monte Carlo Implementation of Gaussian Process Mod

整理文档很辛苦,赏杯茶钱您下走!

免费阅读已结束,点击下载阅读编辑剩下 ...

阅读已结束,您可以下载文档离线阅读编辑

资源描述

arXiv:physics/9701026v2[physics.data-an]28Jan1997TechnicalReportNo.9702,DepartmentofStatistics,UniversityofTorontoMonteCarloImplementationofGaussianProcessModelsforBayesianRegressionandClassificationRadfordM.NealDepartmentofStatisticsandDepartmentofComputerScienceUniversityofToronto,Toronto,Ontario,Canadaradford@stat.utoronto.ca20January1997Abstract.Gaussianprocessesareanaturalwayofdefiningpriordistributionsoverfunc-tionsofoneormoreinputvariables.Inasimplenonparametricregressionproblem,wheresuchafunctiongivesthemeanofaGaussiandistributionforanobservedresponse,aGaussianprocessmodelcaneasilybeimplementedusingmatrixcomputationsthatarefeasiblefordatasetsofuptoaboutathousandcases.HyperparametersthatdefinethecovariancefunctionoftheGaussianprocesscanbesampledusingMarkovchainmethods.Regressionmodelswherethenoisehasatdistributionandlogisticorprobitmodelsforclassificationapplicationscanbeimplementedbysamplingaswellforlatentvaluesun-derlyingtheobservations.Softwareisnowavailablethatimplementsthesemethodsusingcovariancefunctionswithhierarchicalparameterizations.Modelsdefinedinthiswaycandiscoverhigh-levelpropertiesofthedata,suchaswhichinputsarerelevanttopredictingtheresponse.1IntroductionAnonparametricBayesianregressionmodelmustbebasedonapriordistributionovertheinfinite-dimensionalspaceofpossibleregressionfunctions.IthasbeenknownformanyyearsthatsuchpriorsoverfunctionscanbedefinedusingGaussianprocesses(O’Hagan1978),andessentiallythesamemodelhaslongbeenusedinspatialstatisticsunderthenameof“kriging”.Gaussianprocessesseemtohavebeenlargelyignoredasgeneral-purposere-gressionmodels,however,apartfromthespecialcaseofsmoothingsplines(Wahba1978),andsomeapplicationstomodelingnoise-freedatafromcomputerexperiments(eg,Sack,Welch,Mitchell,andWynn1989).Recently,IhaveshownthatmanyBayesianregressionmodelsbasedonneuralnetworksconvergetoGaussianprocessesinthelimitofaninfinitenetwork(Neal1996).ThishasmotivatedexaminationofGaussianprocessmodelsforthehigh-dimensionalapplicationstowhichneuralnetworksaretypicallyapplied(WilliamsandRasmussen1996).TheempiricalworkofRasmussen(1996)hasdemonstratedthatGaus-sianprocessmodelshavebetterpredictiveperformancethanseveralothernonparametric1regressionmethodsoverarangeoftaskswithvaryingcharacteristics.Theconceptualsim-plicity,flexibility,andgoodperformanceofGaussianprocessmodelsshouldmakethemveryattractiveforawiderangeofproblems.OnereasonforthepreviousneglectofGaussianprocessregressionmaybethatinastraightforwardimplementationitinvolvesmatrixoperationswhosetimerequirementsgrowasthecubeofthenumberofcases,andwhosespacerequirementsgrowasthesquareofthenumberofcases.Twentyyearsago,thismayhavelimiteduseofsuchmodelstodatasetswithlessthanaboutahundredcases,butwithmoderncomputers,itisfeasibletoapplyGaussianprocessmodelstodatasetswithathousandormorecases.Itmayalsobepossibletoreducethetimerequirementsusingmoresophisticatedalgorithms(GibbsandMacKay1997a).ThecharacteristicsofaGaussianprocessmodelcaneasilybecontrolledbywritingthecovariancefunctionintermsof“hyperparameters”.Oneapproachtoadaptingthesehyper-parameterstotheobserveddataistoestimatethembymaximumlikelihood(ormaximumpenalizedlikelihood),ashaslongbeendoneinthecontextofspatialstatistics(eg,MardiaandMarshall1984).InafullyBayesianapproach,thehyperparametersaregivenpriordistributions.Predictionsarethenmadebyaveragingovertheposteriordistributionforthehyperparameters,whichcanbedoneusingMarkovchainMonteCarlomethods.Thesetwoapproachesoftengivesimilarresults(WilliamsandRasmussen1996,Rasmussen1996),butthefullyBayesianapproachmaybemorerobustwhenthemodelsareelaborate.ApplyingGaussianprocessmodelstoclassificationproblemspresentsnewcomputationalproblems,sincethejointdistributionofallquantitiesisnolongerGaussian.ApproximatemethodsofBayesianinferenceforsuchmodelshavebeenproposedbyBarberandWilliams(1997)andbyGibbsandMacKay(1997b).Ageneralapproachtoexactlyhandlingclassi-ficationandothergeneralizedmodels(eg,foraPoissonresponse)istouseaMarkovchainMonteCarloschemeinwhichunobserved“latentvalues”associatedwitheachcaseareexplicitlyrepresented.Thispaperappliesthisapproachtoclassificationusinglogisticorprobitmodels,andtoregressionmodelsinwhichthenoisefollowsatdistribution.IhavewrittensoftwareinCforUnixsystemsthatimplementsGaussianprocessmethodsforregressionandclassification,withinthesameframeworkasisusedbymyBayesianneuralnetworksoftware.Thissoftwareisisfreelyavailableforresearchandeducationaluse.1Thecovariancefunctionssupportedmayconsistofseveralparts,andmaybespecifiedintermsofhyperparameters,asdescribedindetailinSection3.Thesecovariancefunctionsprovidefunctionalitysimilartothatoftheneuralnetworkmodels.ThesoftwareimplementsfullBayesianinferenceforthesehierarchicalmodelsusingmatrixcomputationsandMarkovchainsamplingmethods,asdescribedinSections4and5.InSections6and7,Idemonstratetheuseofthesoftwareonathree-wayclassificationproblem,usingamodelthatcanidentifywhichoftheinputsarerelevanttopredictingtheclass,andonaregressionproblemwithoutliers.Iconcludebydiscussingsomeareasforfutureresearch.First,however,Iwill1Foll

1 / 24
下载文档,编辑使用

©2015-2020 m.777doc.com 三七文档.

备案号:鲁ICP备2024069028号-1 客服联系 QQ:2149211541

×
保存成功