Baidu Wenku (百度文库) - let everyone improve themselves on equal terms

Implementation and Analysis of Handwritten Digit Recognition Algorithms Based on MATLAB

Abstract

Handwritten digit recognition is a technique in which a computer automatically recognizes handwritten Arabic numerals; it is a branch of optical character recognition (OCR). It has important applications in reading postal codes, financial statements, bank notes, vouchers of many kinds, and survey forms. Because digit recognition so often involves accounting and finance, the need for strictness is self-evident. The main difficulty facing handwritten digit recognition is therefore the very high reliability and recognition rate demanded of the system, and large-volume data processing places considerable demands on its speed as well.

Working with the MNIST dataset on the Matlab platform, this paper implements a decision tree algorithm, an SVM algorithm, and an artificial neural network (ANN) algorithm, and evaluates the accuracy of each classifier against the true class labels. The experiments show that the ANN achieves the highest accuracy, 99.69%, followed by the SVM at 94.53% and the decision tree at 83.53%. Of the three classifiers, the decision tree is the fastest and the SVM is the slowest.

From the results of each classifier on the MNIST dataset, this paper also draws the following conclusions.

First, whether or not the MNIST data are normalized has almost no effect on the decision tree. It has a large effect on the SVM, whose accuracy is 11.35% without normalization and 94.53% with it. It has a smaller effect on the ANN, whose accuracy is 82.11% without normalization and 99.69% with it. This indicates that the three classifiers differ in their sensitivity to unevenly distributed (unnormalized) data.

Second, for the SVM, as long as the training set contains fewer than 60,000 samples (the full size of the MNIST training set), accuracy on the test set increases with the number of training samples.

Third, for the ANN, the way class labels are represented strongly affects accuracy. Representing each label as a 10-element vector gives an accuracy of 99.69%, far higher than the 60.24% obtained when each label is represented as a single value.

Keywords: handwritten digit recognition; decision tree algorithm; SVM algorithm; artificial neural network algorithm

Contents

Abstract
1. Introduction
    1.1 Handwritten digit recognition
2. Classification algorithms
    2.1 Decision tree algorithms
        2.1.1 ID3 algorithm
        2.1.2 C4.5 algorithm
        2.1.3 CART algorithm
        2.1.4 SLIQ algorithm
        2.1.5 SPRINT algorithm
        2.1.6 Comparison of classic decision tree algorithms
    2.2 Support vector machines
    2.3 Artificial neural networks
        2.3.1 Principles of artificial neural networks
        2.3.2 Back-propagation (BP) networks
        2.3.3 Hopfield networks
3. Experimental procedure and analysis of results
    3.1 Experimental environment