Speech Processing: Fourth Group Meeting


2013.11.30

1. Trying to build a speaker recognition platform with ALIZE + SPro + Python
(1) ALIZE version 3.x: LIA_RAL, LIA_SpkDet
(2) SPro 4.0.1: filter-bank cepstral features
(3) Python 3.3.2
2. Reading notes on "A Philosophical Exploration of Machine Learning" (《机器学习的哲学探索》)

Reference materials
1. Paper: "ALIZE, a free toolkit for speaker recognition"
2. Paper: "ALIZE/SpkDet: a state-of-the-art open source software for speaker recognition"
3. HTK
5. Bill's blog post "Building a speaker recognition platform with ALIZE and other tools" (使用Alize等工具构建说话人识别平台)
6. "ALIZE 3.0 - Open-source platform for speaker recognition"

Introduction to ALIZE
• The ALIZE project consists of a low-level API (ALIZE) and a set of high-level executables that form the LIA_RAL toolkit. Together they make it easy to set up a speaker recognition system for research purposes as well as to develop industry-based applications.
• LIA_RAL is a high-level toolkit based on the low-level ALIZE API. It consists of three sets of executables: LIA_SpkSeg, LIA_Utils and LIA_SpkDet. LIA_SpkSeg and LIA_Utils respectively include executables dedicated to speaker segmentation and utility programs to handle ALIZE objects, while LIA_SpkDet is developed to fulfil the main functions of a state-of-the-art speaker recognition system as described in the following figure.
• ALIZE does not include acoustic feature extraction but is compatible with SPro, HTK and RAW formats.
• Score matrices can be exported in a binary format easily handled by the BOSARIS toolkit.

Introduction to SPro
SPro is aimed at feature extraction for speaker recognition; it can extract features such as MFCC and LPC.
SPro is a free speech signal processing toolkit which provides runtime commands implementing standard feature extraction algorithms for speech-related applications and a C library to implement new algorithms and to use SPro files within your own programs.
SPro was originally designed for variable resolution spectral analysis but also provides for feature extraction techniques classically used in speech applications. There are commands for the following representations:
• filter-bank energies
• cepstral coefficients
• linear prediction derived representations
Though the toolkit has been designed as a front-end for applications such as speech or speaker recognition, we believe the library provides enough possibilities to implement various feature extraction algorithms easily (e.g. zero crossing rate). However, no command for such features is provided.
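Zero crossing rate, cited above as a feature that the library can support but that no command provides, is simple enough to sketch. The following is an illustrative pure-Python version, not SPro code: the names zero_crossing_rate and split_frames and the framing parameters are assumptions made for this example, and a real front-end would more likely be written in C on top of the SPro library.

```python
def zero_crossing_rate(frame):
    """Return the fraction of adjacent sample pairs whose signs differ.

    Samples equal to zero are treated as non-negative.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:]) if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)


def split_frames(samples, frame_len, hop):
    """Split a sample sequence into overlapping frames of frame_len samples."""
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


if __name__ == "__main__":
    # Toy signal (hypothetical values), framed with 4 samples and a hop of 2.
    signal = [0.2, -0.1, 0.3, 0.4, -0.5, -0.2, 0.1, 0.6]
    for frame in split_frames(signal, frame_len=4, hop=2):
        print(frame, zero_crossing_rate(frame))
```

Computing one value per frame keeps the result aligned with frame-based features such as cepstral coefficients, so it could be appended to a feature vector if desired.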
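The library, written in ANSI C, provides functions for the following:
• waveform signal input
• low-level signal processing (FFT, LPC analysis, etc.)
• low-level feature processing (lifter, CMS, variance normalization, deltas, etc.)
• feature I/O
The library does not provide high-level feature extraction functions that directly convert a waveform into features, mainly because such functions would require a tremendous number of arguments in order to be versatile. However, it is rather trivial to write such a function for your particular needs using the SPro library.

Filter-bank cepstral features
• The second filter-bank analysis tool, sfbcep, takes a waveform as input and outputs filter-bank derived cepstral features. The filter-bank processing is similar to what is done in sfbank. The cepstral coefficients are computed by applying a DCT to the filter-bank log-magnitudes and are optionally liftered.
• Optionally, the log-energy can be added to the feature vector. In sfbcep, the frame energy is calculated as the sum of the squared waveform samples after windowing. As with the magnitudes in the filter-bank, the log-energies are thresholded to keep them positive or null. The log-energies may be scaled to avoid differences between recordings.
• Mean and variance normalization of the static cepstral coefficients can be specified with the global '--cms' and '--normalize' options but do not apply to log-energies. The normalizations can be global (default) or based on a sliding window whose length is specified with '--segment-length'.
• Finally, first and second order derivatives of the cepstral coefficients and of the log-energies can be appended to the feature vectors. When using delta features, the absolute log-energy can be suppressed using the '--no-static-energy' option.

Step 1: Feature extraction (MFCC)
sfbcep.exe: extracts the MFCC features
Step 2: Silence removal (silence detection and removal)
NormFeat.exe: first normalizes the energy
EnergyDetector.exe: removes silence based on energy detection
Step 3: Features normalization
NormFeat.exe: the same tool is run again, this time to normalize the features
Step 4: World model training
TrainWorld.exe: trains the UBM
Step 5: Target model training
TrainTarget.exe: starting from the trained UBM, trains the GMMs for the training set and the testing set
Step 6: Testing
ComputeTest.exe: tests and scores the testing-set GMMs against the training-set GMMs
Step 7: Score normalization
ComputeNorm.exe: normalizes the scores
Step 8: Compute EER
Computes the equal error rate. MATLAB code for computing the EER is available for download from the NIST SRE website.

Others
• For questions about the parameters used in each step, run "tool -help" on the command line (with the tool's name in place of "tool") to see what each of its parameters means. The examples provided in each tool's test directory in the ALIZE source code are also a useful reference. To understand what each tool does and the theory behind it, consult the relevant papers.
• Frequently asked questions and answers:
• For further questions, please post on the Google forum (=&hl=zh-CN#!forum/alize—voice-print-recognition) so that everyone can discuss them.

Others: ALIZE functions used (the roles of the other functions are still to be studied)

Others: A brief look at integrating Python and C programs
Integrating Python and C programs with the ctypes module
ctypes is a standard Python module, included in the standard library since Python 2.5. It is a foreign function interface for Python that lets Python programs call functions in dynamic link libraries (shared libraries) compiled from C; static libraries cannot be loaded directly and must first be built into a shared library. With ctypes, Python source code can create, access and manipulate simple or complex C data types. Most importantly, ctypes works on multiple platforms, including Windows, Windows CE, Mac OS X, Linux, Solaris, ...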
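As a minimal sketch of the ctypes usage described above, the example below loads the platform's C runtime and calls its strlen function, then builds two C data types from Python. It assumes a C runtime is available as a shared library (msvcrt on Windows, libc elsewhere). The helper names load_c_library, c_strlen and make_energy_array and the FrameInfo structure are hypothetical names chosen for illustration; they are not part of ALIZE or SPro.

```python
import ctypes
import ctypes.util
import sys


def load_c_library():
    """Load the platform's C runtime as a shared library."""
    if sys.platform.startswith("win"):
        return ctypes.CDLL("msvcrt")  # Microsoft C runtime DLL
    # find_library may return None; on POSIX, CDLL(None) then exposes the
    # symbols already loaded into the current process, which include libc.
    return ctypes.CDLL(ctypes.util.find_library("c"))


libc = load_c_library()

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Call the C strlen function on a Python bytes object."""
    return libc.strlen(text)


class FrameInfo(ctypes.Structure):
    """Hypothetical C struct: struct { int index; double energy; }"""
    _fields_ = [("index", ctypes.c_int), ("energy", ctypes.c_double)]


def make_energy_array(values):
    """Build a C array of doubles (double[n]) from a Python sequence."""
    return (ctypes.c_double * len(values))(*values)


if __name__ == "__main__":
    print(c_strlen(b"ALIZE"))
    info = FrameInfo(index=3, energy=1.5)
    print(info.index, info.energy)
    print(list(make_energy_array([0.5, 2.0])))
```

Declaring argtypes and restype before the call lets ctypes check and convert the arguments instead of falling back to its default of treating every return value as a C int.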
