ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, University of Toronto, kriz@cs.utoronto.ca
Ilya Sutskever, University of Toronto, ilya@cs.utoronto.ca
Geoffrey E. Hinton, University of Toronto, hinton@cs.utoronto.ca

Abstract

We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.

1 Introduction

Current approaches to object recognition make essential use of machine learning methods. To improve their performance, we can collect larger datasets, learn more powerful models, and use better techniques for preventing overfitting. Until recently, datasets of labeled images were relatively small, on the order of tens of thousands of images (e.g., NORB [16], Caltech-101/256 [8, 9], and CIFAR-10/100 [12]). Simple recognition tasks can be solved quite well with datasets of this size, especially if they are augmented with label-preserving transformations. For example, the current-best error rate on the MNIST digit-recognition task (<0.3%) approaches human performance [4]. But objects in realistic settings exhibit considerable variability, so to learn to recognize them it is necessary to use much larger training sets. And indeed, the shortcomings of small image datasets have been widely recognized (e.g., Pinto et al. [21]), but it has only recently become possible to collect labeled datasets with millions of images. The new larger datasets include
LabelMe [23], which consists of hundreds of thousands of fully-segmented images, and ImageNet [6], which consists of over 15 million labeled high-resolution images in over 22,000 categories.

To learn about thousands of objects from millions of images, we need a model with a large learning capacity. However, the immense complexity of the object recognition task means that this problem cannot be specified even by a dataset as large as ImageNet, so our model should also have lots of prior knowledge to compensate for all the data we don't have. Convolutional neural networks (CNNs) constitute one such class of models [16, 11, 13, 18, 15, 22, 26]. Their capacity can be controlled by varying their depth and breadth, and they also make strong and mostly correct assumptions about the nature of images (namely, stationarity of statistics and locality of pixel dependencies). Thus, compared to standard feedforward neural networks with similarly-sized layers, CNNs have much fewer connections and parameters and so they are easier to train, while their theoretically-best performance is likely to be only slightly worse.

Despite the attractive qualities of CNNs, and despite the relative efficiency of their local architecture, they have still been prohibitively expensive to apply in large scale to high-resolution images. Luckily, current GPUs, paired with a highly-optimized implementation of 2D convolution, are powerful enough to facilitate the training of interestingly-large CNNs, and recent datasets such as ImageNet contain enough labeled examples to train such models without severe overfitting.

The specific contributions of this paper are as follows: we trained one of the largest convolutional neural networks to date on the subsets of ImageNet used in the ILSVRC-2010 and ILSVRC-2012 competitions [2] and achieved by far the best results ever reported on these datasets. We wrote a highly-optimized GPU implementation of 2D convolution and all the other operations inherent in training convolutional neural networks, which we make available publicly¹. Our network contains a number of new and unusual features which improve its performance and reduce its training time, which are detailed in Section 3. The size of our network made overfitting a significant problem, even with 1.2 million labeled training examples, so we used several effective techniques for preventing overfitting, which are described in Section 4. Our final network contains five convolutional and three fully-connected layers, and this depth seems to be important: we found that removing any convolutional layer (each of which contains no more than 1% of the model's parameters) resulted in inferior performance.

In the end, the network's size is limited mainly by the amount of memory available on current GPUs and by the amount of training time that we are willing to tolerate. Our network takes between five and six days to train on two GTX 580 3GB GPUs. All of our experiments suggest that our results can be improved simply by waiting for faster GPUs and bigger datasets to become available.

2 The Dataset

ImageNet is a dataset of over 15 million labeled high-resolution images belonging to roughly 22,000 categories. The images were collected from the web and labeled by human labelers using Amazon's Mechanical Turk crowd-sourcing tool. Starting in 2010, as part of the Pascal Visual Object Challenge, an annual competition called the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) has been held. ILSVRC uses a subset of ImageNet with roughly 1000 images in each of 1000 categories.
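The abstract quotes top-1 and top-5 error rates, the standard ILSVRC metrics: an example counts as correct under top-k if the true label is among the k classes the model scores highest. The following is a minimal sketch of that computation in plain Python, not the authors' evaluation code; the function name, toy scores, and labels are invented for illustration, and the toy uses 3 classes with top-2 in place of top-5.

```python
def top_k_error(scores, labels, k):
    """Fraction of examples whose true label is NOT among the k
    highest-scoring classes. `scores` is a list of per-class score
    lists (e.g. softmax outputs); `labels` holds the true class indices."""
    errors = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this example.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            errors += 1
    return errors / len(labels)

# Toy run: 4 images, 3 classes (hypothetical numbers).
scores = [
    [0.1, 0.7, 0.2],   # highest score: class 1
    [0.5, 0.2, 0.3],   # highest score: class 0
    [0.2, 0.2, 0.6],   # highest score: class 2
    [0.6, 0.3, 0.1],   # highest score: class 0
]
labels = [1, 2, 2, 0]
print(top_k_error(scores, labels, 1))  # 0.25: image 2 is misclassified
print(top_k_error(scores, labels, 2))  # 0.0: its true class is ranked 2nd
```

Note the gap between the two numbers: a model can place the correct class just below its top guess, which is why the paper reports 37.5% top-1 but only 17.0% top-5 on LSVRC-2010.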
