Algorithms-for-Non-negative-Matrix-Factorization

整理文档很辛苦,赏杯茶钱您下走!

免费阅读已结束,点击下载阅读编辑剩下 ...

阅读已结束,您可以下载文档离线阅读编辑

资源描述

AlgorithmsforNon-negativeMatrixFactorizationDanielD.LeeBellLaboratoriesLucentTechnologiesMurrayHill,NJ07974H.SebastianSeungDept.ofBrainandCog.Sci.MassachusettsInstituteofTechnologyCambridge,MA02138AbstractNon-negativematrixfactorization(NMF)haspreviouslybeenshowntobeausefuldecompositionformultivariatedata.Twodifferentmulti-plicativealgorithmsforNMFareanalyzed.Theydifferonlyslightlyinthemultiplicativefactorusedintheupdaterules.OnealgorithmcanbeshowntominimizetheconventionalleastsquareserrorwhiletheotherminimizesthegeneralizedKullback-Leiblerdivergence.Themonotonicconvergenceofbothalgorithmscanbeprovenusinganauxiliaryfunc-tionanalogoustothatusedforprovingconvergenceoftheExpectation-Maximizationalgorithm.Thealgorithmscanalsobeinterpretedasdiag-onallyrescaledgradientdescent,wheretherescalingfactorisoptimallychosentoensureconvergence.1IntroductionUnsupervisedlearningalgorithmssuchasprincipalcomponentsanalysisandvectorquan-tizationcanbeunderstoodasfactorizingadatamatrixsubjecttodifferentconstraints.De-pendingupontheconstraintsutilized,theresultingfactorscanbeshowntohaveverydif-ferentrepresentationalproperties.Principalcomponentsanalysisenforcesonlyaweakor-thogonalityconstraint,resultinginaverydistributedrepresentationthatusescancellationstogeneratevariability[1,2].Ontheotherhand,vectorquantizationusesahardwinner-take-allconstraintthatresultsinclusteringthedataintomutuallyexclusiveprototypes[3].Wehavepreviouslyshownthatnonnegativityisausefulconstraintformatrixfactorizationthatcanlearnapartsrepresentationofthedata[4,5].Thenonnegativebasisvectorsthatarelearnedareusedindistributed,yetstillsparsecombinationstogenerateexpressivenessinthereconstructions[6,7].Inthissubmission,weanalyzeindetailtwonumericalalgorithmsforlearningtheoptimalnonnegativefactorsfromdata.2Non-negativematrixfactorizationWeformallyconsideralgorithmsforsolvingthefollowingproblem:Non-negativematrixfactorization(NMF)Givenanon-negativematrix,findnon-negativematrixfactorsandsuchthat:(1)NMFcanbeappliedtothestatisticalanalysisofmultivariatedatainthefollowingmanner.Givenasetofofmultivariate-dimensionaldatavectors,thevectorsareplacedinthecolumnsofanmatrixwhereisthenumberofexamplesinthedataset.Thismatrixisthenapproximatelyfactorizedintoanmatrixandanmatrix.Usuallyischosentobesmallerthanor,sothatandaresmallerthantheoriginalmatrix.Thisresultsinacompressedversionoftheoriginaldatamatrix.WhatisthesignificanceoftheapproximationinEq.(1)?Itcanberewrittencolumnbycolumnas,whereandarethecorrespondingcolumnsofand.Inotherwords,eachdatavectorisapproximatedbyalinearcombinationofthecolumnsof,weightedbythecomponentsof.Thereforecanberegardedascontainingabasisthatisoptimizedforthelinearapproximationofthedatain.Sincerelativelyfewbasisvectorsareusedtorepresentmanydatavectors,goodapproximationcanonlybeachievedifthebasisvectorsdiscoverstructurethatislatentinthedata.ThepresentsubmissionisnotaboutapplicationsofNMF,butfocusesinsteadonthetech-nicalaspectsoffindingnon-negativematrixfactorizations.Ofcourse,othertypesofma-trixfactorizationshavebeenextensivelystudiedinnumericallinearalgebra,butthenon-negativityconstraintmakesmuchofthispreviousworkinapplicabletothepresentcase[8].HerewediscusstwoalgorithmsforNMFbasedoniterativeupdatesofand.Becausethesealgorithmsareeasytoimplementandtheirconvergencepropertiesareguaranteed,wehavefoundthemveryusefulinpracticalapplications.Otheralgorithmsmaypossiblybemoreefficientinoverallcomputationtime,butaremoredifficulttoimplementandmaynotgeneralizetodifferentcostfunctions.Algorithmssimilartoourswhereonlyoneofthefactorsisadaptedhavepreviouslybeenusedforthedeconvolutionofemissiontomographyandastronomicalimages[9,10,11,12].Ateachiterationofouralgorithms,thenewvalueoforisfoundbymultiplyingthecurrentvaluebysomefactorthatdependsonthequalityoftheapproximationinEq.(1).Weprovethatthequalityoftheapproximationimprovesmonotonicallywiththeapplicationofthesemultiplicativeupdaterules.Inpractice,thismeansthatrepeatediterationoftheupdaterulesisguaranteedtoconvergetoalocallyoptimalmatrixfactorization.3CostfunctionsTofindanapproximatefactorization,wefirstneedtodefinecostfunctionsthatquantifythequalityoftheapproximation.Suchacostfunctioncanbeconstructedusingsomemeasureofdistancebetweentwonon-negativematricesand.OneusefulmeasureissimplythesquareoftheEuclideandistancebetweenand[13],(2)Thisislowerboundedbyzero,andclearlyvanishesifandonlyif.Anotherusefulmeasureis!$#%’&(*),+(3)LiketheEuclideandistancethisisalsolowerboundedbyzero,andvanishesifandonlyif.Butitcannotbecalleda“distance”,becauseitisnotsymmetricinand,sowewillrefertoitasthe“divergence”offrom.ItreducestotheKullback-Leiblerdivergence,orrelativeentropy,when--/.,sothatandcanberegardedasnormalizedprobabilitydistributions.WenowconsidertwoalternativeformulationsofNMFasoptimizationproblems:Problem1Minimizewithrespecttoand,subjecttotheconstraints.Problem2Mi

1 / 7
下载文档,编辑使用

©2015-2020 m.777doc.com 三七文档.

备案号:鲁ICP备2024069028号-1 客服联系 QQ:2149211541

×
保存成功