An empirical study of learning speed in back-propa



An Empirical Study of Learning Speed in Back-Propagation Networks

Scott E. Fahlman

September 1988

CMU-CS-88-162

Abstract

Most connectionist or neural network learning systems use some form of the back-propagation algorithm. However, back-propagation learning is too slow for many applications, and it scales up poorly as tasks become larger and more complex. The factors governing learning speed are poorly understood. I have begun a systematic, empirical study of learning speed in backprop-like algorithms, measured against a variety of benchmark problems. The goal is twofold: to develop faster learning algorithms and to contribute to the development of a methodology that will be of value in future studies of this kind.

This paper is a progress report describing the results obtained during the first six months of this study. To date I have looked only at a limited set of benchmark problems, but the results on these are encouraging: I have developed a new learning algorithm that is faster than standard backprop by an order of magnitude or more and that appears to scale up very well as the problem size increases.

This research was sponsored in part by the National Science Foundation under Contract Number EET-8716324 and by the Defense Advanced Research Projects Agency (DOD), ARPA Order No. 4976 under Contract F33615-87-C-1499 and monitored by the Avionics Laboratory, Air Force Wright Aeronautical Laboratories, Aeronautical Systems Division (AFSC), Wright-Patterson AFB, OH 45433-6543.

The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of these agencies or of the U.S. Government.

1. Introduction

Note: In this paper I will not attempt to review the basic ideas of connectionism or back-propagation learning. See [3] for a brief overview of this area and [10], chapters 1-8, for a detailed treatment. When I refer to standard back-propagation in this paper, I mean the back-propagation algorithm with momentum, as described in [9].

The greatest single obstacle to the widespread use of connectionist learning networks in real-world applications is the slow speed at which the current algorithms learn. At present, the fastest learning algorithm for most purposes is the algorithm that is generally known as back-propagation or backprop [6, 7, 9, 18]. The back-propagation learning algorithm runs faster than earlier learning methods, but it is still much slower than we would like. Even on relatively simple problems, standard back-propagation often requires the complete set of training examples to be presented hundreds or thousands of times. This means that we are limited to investigating rather small networks with only a few thousand trainable weights. Some problems of real-world importance can be tackled using networks of this size, but most of the tasks for which connectionist technology might be appropriate are much too large and complex to be handled by our current learning-network technology.

One solution is to run our network simulations on faster computers or to implement the network elements directly in VLSI chips. A number of groups are working on faster implementations, including a group at CMU that is using the 10-processor Warp machine [13]. This work is important, but even if we had a network implemented directly in hardware our slow learning algorithms would still limit the range of problems we could attack. Advances in learning algorithms and in implementation technology are complementary. If we can combine hardware that runs several orders of magnitude faster and learning algorithms that scale up well to very large networks, we will be in a position to tackle a much larger universe of possible applications.

Since January of 1988 I have been conducting an empirical study of learning speed in simulated networks. I have studied the standard backprop algorithm and a number of variations on standard back-propagation, applying these to a set of moderate-sized benchmark problems. Many of the variations that I have investigated were first proposed by other researchers, but until now there have been no systematic studies to compare these methods, individually and in various combinations, against a standard set of learning problems. Only through such systematic studies can we hope to understand which methods work best in which situations.

This paper is a report on the results obtained in the first six months of this study. Perhaps the most important result is the identification of a new learning method -- actually a combination of several ideas -- that on a range of encoder/decoder problems is faster than standard back-propagation by an order of magnitude or more. This new method also appears to scale up much better than standard backprop as the size and complexity of the learning task grows.

I must emphasize that this is a progress report. The learning-speed study is far from complete. Until now I have concentrated most of my effort on a single class of benchmarks, namely the encoder/decoder problems. Like any family of benchmarks taken in isolation, encoder/decoder problems have certain peculiarities that may bias the results of the study. Until a more comprehensive set of benchmarks has been run, it would be premature to draw any sweeping conclusions or make any strong claims about the widespread applicability of these techniques.

2. Methodology

2.1. What Makes a Good Benchmark?

At present there is no widely accepted methodology for measuring and comparing the speed of various connectionist learning algorithms. Some researchers have proposed new algorithms based only on a theoretical analysis of the problem. It is sometimes hard to determine how well these theoretical models fit actual practice. Other researchers implement their ideas and run one or two benchmarks to demonstrate the speed of the resulting s