The BigChaos Solution to the Netflix Grand Prize

Andreas Töscher and Michael Jahrer
commendo research & consulting
Neuer Weg 23, A-8580 Köflach, Austria
{andreas.toescher, michael.jahrer}@commendo.at

Robert M. Bell†
AT&T Labs - Research
Florham Park, NJ

September 5, 2009

† The author contributed Section 7.

1 Introduction

The team BellKor's Pragmatic Chaos is a combined team of BellKor, Pragmatic Theory and BigChaos. BellKor consists of Robert Bell, Yehuda Koren and Chris Volinsky. The members of Pragmatic Theory are Martin Piotte and Martin Chabbert. Andreas Töscher and Michael Jahrer form the team BigChaos. BellKor won the Progress Prize 2007 [4]. The Progress Prize 2008 was won by the combined efforts of BellKor and BigChaos [5][17].

The documentation of the Netflix Grand Prize consists of three parts. In this document we focus on the contribution of BigChaos to the combined Grand Prize solution. The document is organized as follows: In Section 2 we describe the Netflix dataset and its important statistical properties, followed by a detailed explanation of the training procedure of our predictors in Section 3. Section 4 defines the notation which we use throughout this document. The algorithmic details can be found in Section 5. In order to combine the predictors of BigChaos and of the whole team into a final prediction, we used a combination of nonlinear probe blending and linear quiz blending. The nonlinear probe blending techniques are described in Section 6; the linear quiz blend is described in Section 7. In Appendix A a detailed list of all used predictors is attached.

2 The Netflix Prize Dataset

The dataset consists of 5-star ratings on 17770 movies by 480189 anonymous users. It was collected by Netflix over a period of approximately 7 years. In total, the number of ratings is 100480507; the probe set of size 1408395 is a subset of them. The goal of the contest is to predict the qualifying set (size: 2817131 samples) and achieve an RMSE score of at most 0.8563 on the quiz subset in order to qualify for the Grand Prize. The quiz set is an unknown 50% random subset of the qualifying set. The judging criterion for winning the Netflix Grand Prize is the four-digit rounded RMSE score on the test set (the remaining 50%). In the case of a tie, the earliest submission wins. The probe set has the same statistical properties as the qualifying set. Furthermore, it is used as a hold-out set during the competition. A full description of the rules can be found under [1].

[Figure 1 diagram: training set (cnt = 100,480,507) containing the probe set (cnt = 1,408,395); qualifying set (cnt = 2,817,131) split 50%/50% into quiz (leaderboard feedback) and test.]

Figure 1: The Netflix Prize dataset in detail. Ratings are available for the training set. Netflix accepts predictions for the qualifying set; the feedback (4-digit precision) is calculated on a 50% random subset of the qualifying set, the quiz set.

[Figure 2 plots: user count vs. log(support); rating count vs. days (from 1998 to 2005); number of user-days vs. log(frequency).]

Figure 2: Effects in the rating dataset. First row: User support is the number of votes given by a user. The mode of the user support is at 19 votes, whereas the average number of votes is 200. Second row: More ratings at the end of the timeline. Third row: Frequency is the number of votes per day per user. Most users gave one or two votes per day. The idea to explore the frequency effect was introduced by our colleagues from Pragmatic Theory.

3 Frameworks

3.1 Optimize the Predictors Individually on the Probe Set

The solutions of the Netflix Progress Prizes of 2007 and 2008 had a focus on the accuracy of the individual collaborative filtering algorithms. Blending techniques were used to combine the independently trained predictors. The predictors were trained to minimize the RMSE on the probe set. First, the probe set is excluded from the training data. The model gets trained to minimize the RMSE on the probe set. For gradient descent methods this means that the training has to stop when the RMSE on the probe set is minimal. Then predictions are stored for the probe set. Afterwards, the probe set gets included into the training data and the training starts again, with exactly the same parameters and initial conditions. After the second training stage, we generate predictions for the qualifying set. These predictions achieve a 0.0030 to 0.0090 better quiz RMSE, compared to their probe RMSE, thanks to the expanded training set.

For every algorithm run, the outcome is a predictor for the probe and qualifying set, which can be used in probe blending, see Section 6. The individual predictor is optimized to achieve the lowest possible probe RMSE. Some algorithms are based on the residuals of others. To calculate the residual error we use the training set predictions of the trained predictor, as shown in Figure 3.

3.2 Optimize the Blend

A key observation with ensemble methods is that it is not optimal to minimize the RMSE of the individual predictors. Only the RMSE of the ensemble counts. Thus the predictors which achieve the best blending results are the ones which have the right balance between being uncorrelated to the rest of the ensemble and achieving a low RMSE individually. An ideal solution would be to train all models in parallel and treat the ensemble as one big model. The big problem is that training 100+ models in parallel and tuning all parameters simultaneously is computationally not feasible. We approximate this ideal solution by training the models one after another, where each model tries to achieve the best results when blended with all preceding models. So the focus shifts from looking at the RMSE of an individual predictor to the RMSE of a blended ensemble. In the following, we refer to the probe RMSE of a linear blend with all preceding predictors as "blend RMSE". Each predictor tries to achieve the best results when it is blended with all preceding ones. Therefore we neither reformulate the error function, nor change any learning rules.
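The contest metric described in Section 2 is root mean squared error, with leaderboard feedback reported at 4-digit precision. A minimal sketch of the metric; the example ratings here are invented for illustration:

```python
import numpy as np

def rmse(pred, truth):
    """Root mean squared error, the Netflix Prize metric."""
    return np.sqrt(np.mean((np.asarray(pred) - np.asarray(truth)) ** 2))

# Invented toy ratings (true 1-5 star values vs. real-valued predictions).
truth = [4, 3, 5, 1]
pred = [3.8, 3.1, 4.5, 1.9]

score = rmse(pred, truth)
feedback = round(score, 4)  # quiz feedback was rounded to 4 digits
```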
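The two-stage procedure of Section 3.1 (early-stop on the probe set, then retrain on the full data with identical settings) can be sketched as follows. This is a toy illustration, not the authors' implementation: the "model" is just a single global-mean parameter fit by gradient descent, standing in for a real collaborative filtering model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy rating data: training ratings with the probe set excluded, plus probe.
train = rng.normal(3.6, 1.0, 1000)
probe = rng.normal(3.6, 1.0, 200)

def rmse(mu, ratings):
    return np.sqrt(np.mean((ratings - mu) ** 2))

def fit(ratings, epochs, lr=0.05, mu0=0.0):
    """Gradient descent on squared error for exactly `epochs` epochs."""
    mu = mu0
    for _ in range(epochs):
        mu += lr * np.mean(ratings - mu)  # gradient step toward the mean
    return mu

# Stage 1: train with the probe set excluded; stop when probe RMSE is minimal.
best = (np.inf, 0)
mu, lr = 0.0, 0.05
for epoch in range(1, 201):
    mu += lr * np.mean(train - mu)
    e = rmse(mu, probe)
    if e < best[0]:
        best = (e, epoch)
    else:
        break  # probe RMSE no longer improves -> early stop
probe_rmse, stop_epoch = best  # probe predictions would be stored here

# Stage 2: include the probe set and retrain from the same initial conditions
# for the same number of epochs; this model predicts the qualifying set.
full = np.concatenate([train, probe])
mu_final = fit(full, epochs=stop_epoch)
```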
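The "blend RMSE" of Section 3.2, the probe RMSE of a linear blend with all preceding predictors, can be computed by fitting blend weights with least squares on the probe set. A minimal sketch under invented data; the three toy predictors are simulated noisy views of the probe ratings, not real model outputs:

```python
import numpy as np

rng = np.random.default_rng(1)

n_probe = 500
truth = rng.normal(3.6, 1.0, n_probe)  # probe ratings
# Three toy predictors with different noise levels.
preds = np.stack([truth + rng.normal(0, s, n_probe) for s in (0.9, 0.95, 1.1)])

def blend_rmse(preds, truth):
    """Probe RMSE of the least-squares linear blend of the given predictors."""
    X = preds.T  # shape (n_probe, n_predictors)
    w, *_ = np.linalg.lstsq(X, truth, rcond=None)
    blended = X @ w
    return np.sqrt(np.mean((blended - truth) ** 2)), w

ensemble_rmse, weights = blend_rmse(preds, truth)
best_single_rmse = min(np.sqrt(np.mean((p - truth) ** 2)) for p in preds)
# On the probe set, the least-squares blend can never be worse than the best
# single predictor, since using one predictor alone is a feasible weighting.
```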
