Bootstrap Variable-Selection and Confidence Sets

Rudolf Beran
University of California, Berkeley

This paper analyzes estimation by bootstrap variable-selection in a simple Gaussian model where the dimension of the unknown parameter may exceed that of the data. A naive use of the bootstrap in this problem produces risk estimators for candidate variable-selections that have a strong upward bias. Resampling from a less overfitted model removes the bias and leads to bootstrap variable-selections that minimize risk asymptotically. A related bootstrap technique generates confidence sets that are centered at the best bootstrap variable-selection and have two further properties: the asymptotic coverage probability for the unknown parameter is as desired; and the confidence set is geometrically smaller than a classical competitor. The results suggest a possible approach to confidence sets in other inverse problems where a regularization technique is used.

Keywords and phrases. Coverage probability, geometric loss, $C_p$-estimator.

1. Introduction

Certain statistical estimation problems, such as curve estimation, signal recovery, or image reconstruction, share two distinctive features: the dimension of the parameter space exceeds that of the data; and each component of the unknown parameter may be important. In such problems, ordinary least squares or maximum likelihood estimation typically overfits the model. One general approach to estimation in such problems has three stages. First, devise a promising class of candidate estimators, such as penalized maximum likelihood estimators corresponding to a family of penalty functions or Bayes estimators generated by a family of prior distributions; this step is sometimes called using a regularization technique. Second, estimate the risk of each candidate estimator. Third, use the candidate estimator with the smallest estimated risk.

Largely unresolved to date is the question of constructing accurate confidence sets based on such adaptive, regularized estimators. Even obtaining reliable estimators of risk can be difficult. This paper treats both matters in the following problem, which is relatively simple to analyze explicitly, yet sufficiently general to indicate potential directions for other problems that involve a regularization technique. Suppose that $X_n$ is an observation on a discretized signal that is measured with error at $n$ time points. The errors are independent, identically distributed, Gaussian random variables with means zero. Thus, $X_n$ is a random vector whose distribution is $N(\xi_n, \sigma_n^2 I_n)$. Both $\xi_n$ and $\sigma_n^2$ are unknown. The problem is to estimate the signal $\xi_n$.

The integrated squared error of an estimator $\hat{\xi}_n$ is
$$
L_n(\hat{\xi}_n, \xi_n) = n^{-1} |\hat{\xi}_n - \xi_n|^2, \qquad (1.1)
$$
where $|\cdot|$ denotes the Euclidean norm. Under this loss, Stein (1956) showed that $X_n$, the maximum likelihood or least squares estimator of $\xi_n$, is inadmissible for $n \ge 3$. Better estimators for $\xi_n$ include the James-Stein (1961) estimator, locally smoothed estimators such as the kernel variety treated by Rice (1984), and variable-selection estimators, to be described in the next paragraph. Each of these improved estimators accepts some bias in return for a greater reduction in variance.

A variable-selection approach to estimating $\xi_n$ consists of three steps: first, transform $X_n$ orthogonally to $X'_n = O X_n$; second, replace selected components of $X'_n$ with zero; and third, apply the inverse rotation $O^{-1}$ to the outcome of step two. The vector generated by such a process will be called a variable-selection estimator of $\xi_n$.
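To make this construction concrete, here is a minimal NumPy sketch of the three steps. An orthonormal discrete cosine basis stands in for the generic orthogonal transform $O$, and the toy signal, noise level, function names, and the particular set of retained coordinates are illustrative assumptions rather than choices made in the paper.

```python
import numpy as np

def variable_selection_estimate(x, O, keep):
    """Variable-selection estimator of xi_n (illustrative sketch).

    x    : observed vector X_n of length n
    O    : n x n orthogonal matrix (e.g. a Fourier, ANOVA, or wavelet transform)
    keep : boolean mask; rotated coordinates with keep[i] == False are zeroed.
    """
    x_rot = O @ x                       # step 1: rotate, X'_n = O X_n
    x_rot = np.where(keep, x_rot, 0.0)  # step 2: zero out the de-selected components
    return O.T @ x_rot                  # step 3: invert the rotation (O^{-1} = O^T)

def loss(est, xi):
    """Integrated squared error L_n(est, xi) = n^{-1} |est - xi|^2, as in (1.1)."""
    return np.mean((est - xi) ** 2)

# Toy data: a smooth signal observed with Gaussian noise at n time points.
rng = np.random.default_rng(0)
n = 64
t = np.arange(n)
xi = np.sin(2 * np.pi * t / n) + 0.5 * np.cos(6 * np.pi * t / n)
sigma = 0.5
x = xi + sigma * rng.standard_normal(n)

# A concrete choice of O: the orthonormal DCT-II basis.
k = t[:, None]
O = np.sqrt(2.0 / n) * np.cos(np.pi * k * (t[None, :] + 0.5) / n)
O[0, :] /= np.sqrt(2.0)

# One candidate selection: keep only the 8 lowest-frequency rotated coordinates.
keep = np.zeros(n, dtype=bool)
keep[:8] = True
est = variable_selection_estimate(x, O, keep)
print(loss(x, xi), loss(est, xi))  # least squares vs. variable selection
```

Comparing the two printed losses illustrates the trade-off described above: zeroing out the high-frequency coordinates introduces some bias but removes far more variance, so the variable-selection estimator typically beats the raw estimator $X_n$ under loss (1.1).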
How shall we choose the orthogonal matrix $O$? Ideally, the components of the rotated mean vector $O\xi_n$ would be either very large or very small relative to measurement error. The nature of the experiment that generated $X_n$ may suggest that $O$ be a finite Fourier transform, an analysis of variance transform, an orthogonal polynomial transform, or a wavelet transform. Important though it is, we will not deal further, in this paper, with the choice of $O$.

Having rotated $X_n$, how shall we choose which components of $X'_n$ to zero out? Thereafter, how shall we construct, around the variable-selection estimator, an accurate confidence set for $\xi_n$? A plausible answer is to compare candidate variable-selections through their bootstrap risks, and then bootstrap the empirically best candidate estimator to obtain a confidence set for $\xi_n$. Efron and Tibshirani (1993, Chapter 17) discussed simple bootstrap estimators of mean squared prediction error. However, Freedman et al. (1988) and Breiman (1992) showed that simple bootstrap estimators of mean squared prediction error can be untrustworthy for variable-selection.

This paper treats variable-selection for estimation rather than prediction and allows the dimension of the unknown parameter to increase with the sample size $n$. The second point is very important. A stronger model assumption used by Speed and Yu (1993) and others, namely that the dimension of the parameter space is fixed for all $n$, restricts the possible bias induced by candidate variable-selections. In such restricted models, variable-selection by $C_p$ does not choose well. On the other hand, $C_p$ can be asymptotically correct when the dimension of the parameter space increases quickly with $n$ and the selection class is not too large (cf. Section 2). Rice (1984, Section 3) and Speed and Yu (1993, Section 4) discuss other instances and aspects of this phenomenon.

Section 2 of this paper proves for our estimation problem that naive bootstrapping, that is, resampling from a $N(X_n, \hat{\sigma}_n^2 I_n)$ model where $\hat{\sigma}_n^2$ estimates $\sigma_n^2$, yields upwardly biased risk estimators for candidate variable-selections. However, resampling from a $N(\tilde{\xi}_n, \hat{\sigma}_n^2 I_n)$ distribution, where $\tilde{\xi}_n$ is obtained by suitably shrinking some of the components of $X'_n$ toward zero, corrects the bias and generates a good bootstrap variable-selection estimator $\hat{\xi}_{n,B}$ for $\xi_n$. Using a related ...
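Continuing the sketch above, the following code contrasts the two resampling schemes just described: a parametric bootstrap centered at $X_n$ itself versus one centered at a shrunken vector $\tilde{\xi}_n$. The Monte Carlo loop, the helper names, the crude variance estimate, and the positive-part shrinkage rule used to form $\tilde{\xi}_n$ here are assumptions made only for illustration; the paper's own shrinkage rule and the bias analysis are developed in Section 2.

```python
def bootstrap_risk(center, sigma_hat, O, keep, n_boot=500, rng=None):
    """Monte Carlo bootstrap estimate of the risk of one candidate selection.

    Draws X* ~ N(center, sigma_hat^2 I_n), applies the candidate
    variable-selection to each draw, and averages the loss measured
    against the resampling center, which plays the role of xi_n in
    the bootstrap world.
    """
    if rng is None:
        rng = np.random.default_rng()
    n = center.size
    risks = []
    for _ in range(n_boot):
        x_star = center + sigma_hat * rng.standard_normal(n)
        est_star = variable_selection_estimate(x_star, O, keep)
        risks.append(loss(est_star, center))
    return float(np.mean(risks))

# Rough estimate of sigma_n, assuming the high-frequency rotated
# coordinates are essentially pure noise (an illustrative shortcut).
x_rot = O @ x
sigma_hat = np.std(x_rot[n // 2:])

# Naive bootstrap: resample around X_n itself (upwardly biased risk estimate).
naive_risk = bootstrap_risk(x, sigma_hat, O, keep, rng=rng)

# Corrected bootstrap: resample around a less overfitted center xi_tilde,
# here formed by a simple positive-part shrinkage of the rotated coordinates
# toward zero (a stand-in for the shrinkage rule of Section 2).
shrink = np.maximum(1.0 - sigma_hat**2 / np.maximum(x_rot**2, 1e-12), 0.0)
xi_tilde = O.T @ (shrink * x_rot)
corrected_risk = bootstrap_risk(xi_tilde, sigma_hat, O, keep, rng=rng)

print(naive_risk, corrected_risk)
```

In practice one would compute such a bootstrap risk for every candidate selection in the class under consideration and retain the selection with the smallest estimated risk; the point of Section 2 is that the choice of resampling center determines whether that comparison is trustworthy.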
