Training a Deep Neural Network for Digit Classification

This example shows how to use Neural Network Toolbox™ to train a deep neural network to classify images of digits. Neural networks with multiple hidden layers can be useful for solving classification problems with complex data, such as images. Each layer can learn features at a different level of abstraction. However, training neural networks with multiple hidden layers can be difficult in practice.

One way to effectively train a neural network with multiple layers is by training one layer at a time. You can achieve this by training a special type of network known as an autoencoder for each desired hidden layer.

This example shows you how to train a neural network with two hidden layers to classify digits in images. First you train the hidden layers individually in an unsupervised fashion using autoencoders. Then you train a final softmax layer, and join the layers together to form a deep network, which you train one final time in a supervised fashion.

Contents

- Data set
- Training the first autoencoder
- Visualizing the weights of the first autoencoder
- Training the second autoencoder
- Training the final softmax layer
- Forming a stacked neural network
- Fine-tuning the deep neural network
- Summary

Data set

This example uses synthetic data throughout, for training and testing. The synthetic images have been generated by applying random affine transformations to digit images created using different fonts. Each digit image is 28-by-28 pixels, and there are 5,000 training examples. You can load the training data and view some of the images.

% Load the training data into memory
[xTrainImages,tTrain] = digittrain_dataset;

% Display some of the training images
clf
for i = 1:20
    subplot(4,5,i);
    imshow(xTrainImages{i});
end

The labels for the images are stored in a 10-by-5000 matrix, where in every column a single element will be 1 to indicate the class that the digit belongs to, and all other elements in the column will be 0. It should be noted that if the tenth element is 1, then the digit image is a zero.

Training the first autoencoder

Begin by training a sparse autoencoder on the training data without using the labels.

An autoencoder is a neural network which attempts to replicate its input at its output. Thus, the size of its input will be the same as the size of its output. When the number of neurons in the hidden layer is less than the size of the input, the autoencoder learns a compressed representation of the input.

Neural networks have weights randomly initialized before training. Therefore, the results from training are different each time. To avoid this behavior, explicitly set the random number generator seed.

rng('default')

Set the size of the hidden layer for the autoencoder. For the autoencoder that you are going to train, it is a good idea to make this smaller than the input size.

hiddenSize1 = 100;

The type of autoencoder that you will train is a sparse autoencoder. This autoencoder uses regularizers to learn a sparse representation in the first layer. You can control the influence of these regularizers by setting various parameters:

- L2WeightRegularization controls the impact of an L2 regularizer for the weights of the network (and not the biases). This should typically be quite small.
- SparsityRegularization controls the impact of a sparsity regularizer, which attempts to enforce a constraint on the sparsity of the output from the hidden layer. Note that this is different from applying a sparsity regularizer to the weights.
- SparsityProportion is a parameter of the sparsity regularizer. It controls the sparsity of the output from the hidden layer. A low value for SparsityProportion usually leads to each neuron in the hidden layer specializing by only giving a high output for a small number of training examples. For example, if SparsityProportion is set to 0.1, this is equivalent to saying that each neuron in the hidden layer should have an average output of 0.1 over the training examples. This value must be between 0 and 1. The ideal value varies depending on the nature of the problem. (A quick check of this property is sketched after this list.)
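To make the SparsityProportion claim concrete, you can measure the average activation of each hidden neuron over the training set once the autoencoder below has been trained. This is a minimal sketch, not part of the original example; it assumes the autoenc1 and xTrainImages variables used in this example, and uses the mean of the encoded features as the activation measure.

% Run after training autoenc1 below
feat = encode(autoenc1,xTrainImages);    % 100-by-5000 matrix of hidden activations
avgActivation = mean(feat,2);            % average output of each hidden neuron
disp(mean(avgActivation))                % should be roughly near SparsityProportion (0.15)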
Now train the autoencoder, specifying the values for the regularizers that are described above.

autoenc1 = trainAutoencoder(xTrainImages,hiddenSize1, ...
    'MaxEpochs',400, ...
    'L2WeightRegularization',0.004, ...
    'SparsityRegularization',4, ...
    'SparsityProportion',0.15, ...
    'ScaleData', false);

You can view a diagram of the autoencoder. The autoencoder is composed of an encoder followed by a decoder. The encoder maps an input to a hidden representation, and the decoder attempts to reverse this mapping to reconstruct the original input.

view(autoenc1)

Visualizing the weights of the first autoencoder

The mapping learned by the encoder part of an autoencoder can be useful for extracting features from data. Each neuron in the encoder has a vector of weights associated with it which will be tuned to respond to a particular visual feature. You can view a representation of these features.

plotWeights(autoenc1);

You can see that the features learned by the autoencoder represent curls and stroke patterns from the digit images.

The 100-dimensional output from the hidden layer of the autoencoder is a compressed version of the input, which summarizes its response to the features visualized above. Train the next autoencoder on a set of these vectors extracted from the training data. First, you must use the encoder from the trained autoencoder to generate the features.

feat1 = encode(autoenc1,xTrainImages);

Training the second autoencoder

After training the first autoencoder, you train the second autoencoder in a similar way. The main difference is that you use the features that were generated from the first autoencoder as the training data for the second autoencoder. Also, you decrease the size of the hidden representation to 50, so that the encoder in the second autoencoder learns an even smaller representation of the input data.
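The section is cut off here, so the following is a sketch of what the second training call might look like, mirroring the structure of the first one. The MaxEpochs and regularization values shown are illustrative assumptions, not values taken from this section.

hiddenSize2 = 50;

% Train the second autoencoder on the features from the first
% (parameter values below are illustrative assumptions)
autoenc2 = trainAutoencoder(feat1,hiddenSize2, ...
    'MaxEpochs',100, ...
    'L2WeightRegularization',0.002, ...
    'SparsityRegularization',4, ...
    'SparsityProportion',0.1, ...
    'ScaleData', false);

As with the first autoencoder, you would then extract the second set of features with a call such as feat2 = encode(autoenc2,feat1) before training the final softmax layer.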