An empirical study of learning speed in back-propa


An Empirical Study of Learning Speed in Back-Propagation Networks

Scott E. Fahlman
September 1988
CMU-CS-88-162

Abstract

Most connectionist or neural network learning systems use some form of the back-propagation algorithm. However, back-propagation learning is too slow for many applications, and it scales up poorly as tasks become larger and more complex. The factors governing learning speed are poorly understood. I have begun a systematic, empirical study of learning speed in backprop-like algorithms, measured against a variety of benchmark problems. The goal is twofold: to develop faster learning algorithms and to contribute to the development of a methodology that will be of value in future studies of this kind.

This paper is a progress report describing the results obtained during the first six months of this study. To date I have looked only at a limited set of benchmark problems, but the results on these are encouraging: I have developed a new learning algorithm that is faster than standard backprop by an order of magnitude or more and that appears to scale up very well as the problem size increases.

This research was sponsored in part by the National Science Foundation under Contract Number EET-8716324 and by the Defense Advanced Research Projects Agency (DOD), ARPA Order No. 4976 under Contract F33615-87-C-1499 and monitored by the Avionics Laboratory, Air Force Wright Aeronautical Laboratories, Aeronautical Systems Division (AFSC), Wright-Patterson AFB, OH 45433-6543.

The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of these agencies or of the U.S. Government.

1. Introduction

Note: In this paper I will not attempt to review the basic ideas of connectionism or back-propagation learning. See [3] for a brief overview of this area and [10], chapters 1-8, for a detailed treatment. When I refer to standard back-propagation in this paper, I mean the back-propagation algorithm with momentum, as described in [9].

The greatest single obstacle to the widespread use of connectionist learning networks in real-world applications is the slow speed at which the current algorithms learn. At present, the fastest learning algorithm for most purposes is the algorithm that is generally known as back-propagation or backprop [6, 7, 9, 18]. The back-propagation learning algorithm runs faster than earlier learning methods, but it is still much slower than we would like. Even on relatively simple problems, standard back-propagation often requires the complete set of training examples to be presented hundreds or thousands of times. This means that we are limited to investigating rather small networks with only a few thousand trainable weights. Some problems of real-world importance can be tackled using networks of this size, but most of the tasks for which connectionist technology might be appropriate are much too large and complex to be handled by our current learning-network technology.

One solution is to run our network simulations on faster computers or to implement the network elements directly in VLSI chips. A number of groups are working on faster implementations, including a group at CMU that is using the 10-processor Warp machine [13]. This work is important, but even if we had a network implemented directly in hardware our slow learning algorithms would still limit the range of problems we could attack. Advances in learning algorithms and in implementation technology are complementary. If we can combine hardware that runs several orders of magnitude faster and learning algorithms that scale up well to very large networks, we will be in a position to tackle a much larger universe of possible applications.

Since January of 1988 I have been conducting an empirical study of learning speed in simulated networks. I have studied the standard backprop algorithm and a number of variations on standard back-propagation, applying these to a set of moderate-sized benchmark problems. Many of the variations that I have investigated were first proposed by other researchers, but until now there have been no systematic studies to compare these methods, individually and in various combinations, against a standard set of learning problems. Only through such systematic studies can we hope to understand which methods work best in which situations.

This paper is a report on the results obtained in the first six months of this study. Perhaps the most important result is the identification of a new learning method -- actually a combination of several ideas -- that on a range of encoder/decoder problems is faster than standard back-propagation by an order of magnitude or more. This new method also appears to scale up much better than standard backprop as the size and complexity of the learning task grows.

I must emphasize that this is a progress report. The learning-speed study is far from complete. Until now I have concentrated most of my effort on a single class of benchmarks, namely the encoder/decoder problems. Like any family of benchmarks taken in isolation, encoder/decoder problems have certain peculiarities that may bias the results of the study. Until a more comprehensive set of benchmarks has been run, it would be premature to draw any sweeping conclusions or make any strong claims about the widespread applicability of these techniques.

2. Methodology

2.1. What Makes a Good Benchmark?

At present there is no widely accepted methodology for measuring and comparing the speed of various connectionist learning algorithms. Some researchers have proposed new algorithms based only on a theoretical analysis of the problem. It is sometimes hard to determine how well these theoretical models fit actual practice. Other researchers implement their ideas and run one or two benchmarks to demonstrate the speed of the resulting s
