
Technical Report No. 9607, Department of Statistics, University of Toronto

Factor Analysis Using Delta-Rule Wake-Sleep Learning

Radford M. Neal
Department of Statistics and Department of Computer Science
University of Toronto
radford@stat.utoronto.ca

Peter Dayan
Department of Brain and Cognitive Sciences
Massachusetts Institute of Technology
dayan@ai.mit.edu

24 July 1996

We describe a linear network that models correlations between real-valued visible variables using one or more real-valued hidden variables -- a factor analysis model. This model can be seen as a linear version of the "Helmholtz machine", and its parameters can be learned using the "wake-sleep" method, in which learning of the primary "generative" model is assisted by a "recognition" model, whose role is to fill in the values of hidden variables based on the values of visible variables. The generative and recognition models are jointly learned in "wake" and "sleep" phases, using just the delta rule. This learning procedure is comparable in simplicity to Oja's version of Hebbian learning, which produces a somewhat different representation of correlations in terms of principal components. We argue that the simplicity of wake-sleep learning makes factor analysis a plausible alternative to Hebbian learning as a model of activity-dependent cortical plasticity.

1 Introduction

Activity-dependent plasticity in the vertebrate brain has typically been modeled in terms of Hebbian learning (Hebb 1949), in which weight changes are based on the covariance of pre-synaptic and post-synaptic activity (e.g., von der Malsburg 1973; Linsker 1986; Miller, Keller, and Stryker 1989). These models derive support from neurobiological evidence of long-term potentiation (see, for example, Collingridge and Bliss (1987), and for a recent review, Baudry and Davis (1994)). They have also been seen as performing a reasonable function, namely extracting the statistical structure amongst a collection of inputs in terms of principal components (Linsker 1988). In this paper, we suggest the statistical technique of factor analysis as an interesting alternative to principal components analysis, and show how to implement it using an algorithm whose demands on synaptic plasticity are as local as those of the Hebb rule.

Factor analysis is a model for real-valued data in which correlations are "explained" by postulating the presence of one or more underlying "factors". These factors play the role of "latent" or "hidden" variables, which are not directly observable, but which allow the dependencies between the "visible" variables to be expressed in a convenient way. Everitt (1984) gives a good introduction to latent variable models in general, and to factor analysis in particular. These models are widely used in psychology and the social sciences as a way of exploring whether observed patterns in data might be explainable in terms of a small number of unobserved factors. Our interest in these models stems from their potential as a way of building high-level representations from sensory data.

Oja's version of Hebbian learning (Oja and Karhunen 1985; Oja 1989, 1992) is a particularly convenient counterpoint. This rule applies to a linear unit with weight vector w that computes an output y = w^T x when presented with a real-valued input vector x (which, for convenience, is assumed to have mean zero). After each presentation of an input vector, the weights for the unit are changed by an amount given by the following proportionality:

    Δw ∝ y(x − yw) = yx − y^2 w                                    (1)

The first term in this weight increment, yx, is of Hebbian form. The second term, −y^2 w, tends to push the weights towards zero, balancing the positive feedback in plain Hebbian learning, which would otherwise increase the magnitude of the weights without bound. Wyatt and Elfadel (1995) give an explicit analysis of learning based on equation (1), showing that with reasonable starting conditions, w converges to the principal eigenvector of the covariance matrix of the inputs -- that is, it converges to a unit vector pointing in the direction of highest variance in the input space. Extracting the subsidiary eigenvectors of the covariance matrix of the inputs is somewhat more challenging, requiring some form of inhibition between successive output units (Sanger 1989; Földiák 1989; Plumbley 1993).
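To make equation (1) concrete, the following is a minimal sketch of Oja's rule in Python with NumPy. Everything about the setup (the synthetic data, learning rate, and number of presentations) is our illustrative choice rather than anything from the paper; only the weight update itself is equation (1).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic zero-mean inputs whose first coordinate has the largest variance,
# so the principal eigenvector of the covariance matrix is near that axis.
n, dim = 5000, 5
scales = np.array([3.0, 1.0, 0.5, 0.2, 0.1])
X = rng.standard_normal((n, dim)) * scales
X -= X.mean(axis=0)                 # the rule assumes mean-zero inputs

w = 0.1 * rng.standard_normal(dim)  # small random initial weights
eta = 0.01                          # learning rate (the proportionality constant)

for x in X:
    y = w @ x                       # the unit's output, y = w^T x
    w += eta * y * (x - y * w)      # Oja's rule, equation (1)

# Compare the learned weights with the principal eigenvector of the
# input covariance matrix (aligning signs, since -w is equally valid).
evals, evecs = np.linalg.eigh(np.cov(X.T))
principal = evecs[:, np.argmax(evals)]
print("learned w :", np.round(w, 3))
print("principal :", np.round(np.sign(principal @ w) * principal, 3))
```

Note that the change to each weight involves only the unit's output, the corresponding input, and the current weight value; it is this locality that makes the rule plausible as a model of synaptic plasticity.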
Linsker (1988) views Hebbian learning as a way of maximising the information retained by y about x. Under the simplifying assumption that the distribution of the inputs is Gaussian, setting the output of a unit to the projection of its input onto the first principal component of the input covariance matrix conveys as much information as possible on average (see also Plumbley 1993). This goal seems reasonable for the very early stages of sensory processing, where information bottlenecks such as the optic nerve may plausibly be present. Note, however, that it implicitly assumes that all information is equally important. Maximizing information transfer seems less compelling as a goal for subsequent levels of processing, once sensory signals have reached cortex. Several other computational goals have been suggested from this stage upwards, including factorial coding (Barlow 1989), sparsification (Olshausen and Field 1995), and various methods for encouraging the cortex to respect reasonable invariances, such as translation or scale invariance for visual processing (Li and Atick 1994).

In this paper, we pursue the suggestion of Hinton and Zemel (1994) (see also Grenander 1976-1981; Mumford 1994; Dayan, Hinton, Neal, and Zemel 1995) that the cortex might be constructing a hierarchical stochastic "generative" model of its input in the top-down connections, while implementing in the bottom-up connections a "recognition" model that in a sense is the inverse of the generative model. The recognition model provides high-
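The abstract states that the generative and recognition models of the linear Helmholtz machine are learned jointly in wake and sleep phases using just the delta rule. As a hedged sketch of what this can look like for a single-factor model, the code below pairs a generative model x = gy + Gaussian noise with a recognition model y = r·x + Gaussian noise. All specifics (the names, the learning rate, and in particular the choice to hold the noise standard deviations fixed rather than learn them) are our assumptions for illustration, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 5

# Training data from a "true" one-factor model: x = g_true * y + sensor noise.
g_true = np.array([2.0, -1.0, 0.5, 1.5, 0.0])
def sample_data(n):
    y = rng.standard_normal(n)
    return y[:, None] * g_true + 0.3 * rng.standard_normal((n, dim))

g = 0.1 * rng.standard_normal(dim)  # generative (top-down) weights
r = 0.1 * rng.standard_normal(dim)  # recognition (bottom-up) weights
eta = 0.01
s_rec, s_gen = 0.1, 0.3             # noise std devs, held fixed in this sketch

for x in sample_data(20000):
    # Wake phase: fill in a hidden value for a real input using the recognition
    # model, then use the delta rule to train the generative weights to
    # reconstruct the input from that hidden value.
    y = r @ x + s_rec * rng.standard_normal()
    g += eta * y * (x - y * g)

    # Sleep phase: "dream" a hidden value and a fantasy input from the
    # generative model, then use the delta rule to train the recognition
    # weights to recover the hidden value from the fantasy input.
    y_d = rng.standard_normal()
    x_d = y_d * g + s_gen * rng.standard_normal(dim)
    r += eta * (y_d - r @ x_d) * x_d

print("true loadings:", g_true)
print("learned g    :", np.round(g, 2))
```

With these settings the generative weights should end up close to the true factor loadings, up to an overall sign. As with Oja's rule, each weight change depends only on the activities at the two ends of a connection and the current weight, which is the sense in which the demands on synaptic plasticity are as local as those of the Hebb rule.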
