HUNAN UNIVERSITY
Graduation Thesis

Thesis title: Speech Interaction and Software Design for a Humanoid Robot
Student name: Chen Ming
Student ID: 201208070103
Major and class: Intelligent Science and Technology, Class 1
College: College of Information Science and Engineering
Supervisor: Li Renfa
Dean of the college: Li Renfa
June 1, 2016

Hunan University Undergraduate Graduation Project (Thesis)

Speech Interaction and Software Design for a Humanoid Robot

Abstract (Chinese)

This thesis describes research on speech recognition using the NAO robot and also covers the robot's common behavior interactions. Speech recognition is an interdisciplinary technology that draws on phonetics, acoustics, linguistics, signal processing, and artificial intelligence, and it is being applied ever more widely. The NAO robot serves as a standard robot platform in competitions, education, and scientific research, so research based on it is in line with current technological and research trends.

The first part of the thesis introduces the basic methods and background knowledge of speech recognition and briefly describes the structure and functions of the NAO robot. In the theoretical part, Chapter 3 presents the theory of GMM-HMM, that is, the Gaussian mixture model and the hidden Markov model. Both models are applied to speech recognition in the experiments.

In the experimental part on speech recognition, the audio stream captured by the NAO robot goes through a series of processing steps: separating the audio tracks, filtering, framing, applying a Hamming window function, extracting speech features, and training on the sample audio streams with machine learning methods. Once the necessary processing is complete, the result is sent from the MATLAB client on the local computer to the server side in Choregraphe, the NAO robot's control software. The robot then performs the behavior that corresponds to the returned recognition result.

Besides the research on speech recognition, the thesis also designs several behavior interaction functions for the NAO robot: orientation-moving, multi-task parallel dancing, and conversation on a given topic. Orientation-moving lets the robot move with a specified angle and direction of rotation. The multi-task parallel dance gathers several tasks at the behavior layer. These tasks include frame-by-frame motion design for the head, feet, and arms, color changes of the LED groups, and a dialogue on a given topic designed according to the QiChat Syntax section of the official Aldebaran Robotics documentation.

In summary, the project design and this thesis consist of two major parts: speech recognition and behavior interaction design. For speech recognition, the NAO robot captures the target audio stream, which is transferred to the local computer over FTP for further processing. For behavior design, a variety of behaviors are designed in Choregraphe, the NAO robot's top-level control software. Specific behaviors among them are triggered according to the speech recognition result, which completes the specified design task.

Key words: Gaussian mixture model; hidden Markov model; orientation-moving; multi-task parallel dancing; audio stream framing; window function; speech feature extraction; TCP/IP communication

An Approach of Speech Interaction and Software Design for Humanoid Robots

Abstract (English)

This thesis presents research on speech recognition and on common behavior interactions based on a NAO robot. Speech recognition is a comprehensive technique that draws on acoustics, phonetics, linguistics, signal processing, and artificial intelligence, among other fields. Speech recognition techniques are currently applied in an increasing number of fields. As a standard platform in a variety of areas, such as competitions, elementary and tertiary education, and scientific research, NAO robots are of significant importance for research, and studies based on them are in line with mainstream research.

The beginning of the thesis briefly discusses the basic approaches and background knowledge of speech recognition, followed by an introduction to the structure and functions of NAO robots. As for the applied theory, the GMM (Gaussian Mixture Model) and the HMM (Hidden Markov Model) are the main topics of Chapter 3. Both models are applied in the research experiment.

In the practical experiments on speech recognition, the NAO robot first captures the audio stream, after which the local computer downloads the target stream through ftp commands in the MATLAB command window. The processing of the target audio stream is then divided into a series of operations: separating the audio tracks (4 tracks are captured by NAO's microphones), filtering the audio waveform, framing the target audio, applying a Hamming window function, extracting audio features, and training on the audio data set with machine learning methods. After these indispensable steps, the processed result of the target audio stream is transferred to the NAO robot's socket server in Choregraphe through the TCP/IP communication protocols. As soon as the result reaches NAO, the robot begins the pre-designed behavior that corresponds to the result.

In addition to the research on speech recognition, multiple behaviors of the NAO robot are studied and designed as well. The designed behaviors are orientation-moving, multi-task dancing, and talking on a given topic. Orientation-moving enables the robot to move in a specific direction with an accurate distance and angle of rotation. Multi-task dancing relies on parallel processing and requires combining the interactive levels of the head, arms, legs, LED sets, sound, and music. For talking on a given topic, the designed conversation has to follow the QiChat Syntax, which is divided into nine main categories described in detail in the Aldebaran documentation.

To conclude, the whole project deals with the problem of speech recognition by analyzing the audio stream captured by the NAO robot's microphones, and it enables the robot to perform pre-designed behaviors in accordance with the recognition result transferred from the local computer through the TCP/IP communication protocols.

Key Words: Gaussian Mixture Model; Hidden Markov Model; multi-task dancing; framing audio stream; window function; speech feature extraction; TCP/IP communication

Contents

Declaration of Originality and Copyright Authorization for the Graduation Project (Thesis)
Abstract (Chinese)
Abstract (English)
List of Figures
Chapter 1 Introduction
1.1 Foreword
1.2 Development of Speech Recognition Technology
1.2.1 Template-Based Methods
1.2.2 Knowledge-Based Methods
1.2.3 Connectionist Methods
1.2.4 Statistical Methods
1.3 Applications of Speech Recognition in Robots
1.3.1 Intelligent Wheelchair Robots
1.3.2 Voice Chat Robots