Human Features for Speech Production

Topics
  The production of speech
  The classification of speech

Production of Speech
  Physiological structure
  Formation of speech

Production of Speech
  Articulators: trachea; larynx; vocal tract
  [Figure: anatomy of the vocal apparatus, labelled: trachea, esophagus, epiglottis, palate, larynx, nasal cavity]

Structure of the throat
  The larynx (throat) sits at the top of the trachea and is built on a ring of cartilage at the end of the trachea.
  The two pieces of muscle within it are called the vocal cords, and the opening between the vocal cords is the glottis.
  When the vocal cords open, the glottis is open and air can be expired freely. Normal expiration is exactly this case.

Mechanism
  [Figure: the larynx seen from the front, labelled: thyroid cartilage, glottis, vocal cords, cricoid cartilage]

Formation of speech
  Airflow: air in the lungs is squeezed out
  Excitation: the airflow passes through the glottis and vocal cords
  Speech: the excitation drives the vocal tract

Larynx, Vocal Source and Vocal Fold

Voice Physics
  The voice is one of the most fundamental sound-making devices
  Much of the brain is dedicated to producing and analysing speech (visual/motor/auditory cortex; Broca’s/Wernicke’s areas; frontal/temporal lobes etc.)
  Very versatile: can mimic other animals, produce loud and soft sounds, create musical sounds etc.

‘Source–Filter’ model of vowel production
  A model for analysing how vowels are produced
  Many physical features are involved in speech production

Source Production
  Vocal folds (not cords) in the centre of the larynx
  2 folds of skin which create a narrow gap for air to pass through
  – Abduction (open)
  – Adduction (closed)
  When air passes through the adducted folds they oscillate
  This creates a series of pulses which are harmonically rich
  Audio track 28: vocal fold click
  Likened to a clarinet reed, with some differences (later)
  [Figure: folds open; folds closed]

Vocal Tract Acoustics (the filter)
  The vocal tract can be modelled as a tube closed at one end
  Similar to standing waves in rooms, tubes exhibit modes or ‘favoured frequencies’
  In the vocal tract these modal frequencies are referred to as formants
  A typical speaking voice will contain 5 or more formants
  The formants gradually decay in intensity as the frequency rises

Changing the Modes
  The position of the formants can be altered by manipulating the shape of the tube or vocal tract
  Narrowing the tube at a point
  – Raises the frequency of any mode that exhibits an antinode at that point
  – Lowers the frequency of any mode that exhibits a node at that point
  Widening the tube at a point
  – Lowers the frequency of any mode that exhibits an antinode at that point
  – Raises the frequency of any mode that exhibits a node at that point
  This gives the speaker the ability to form vowels

Lossless model of the vocal tract
  [Figure: vocal tract model, labelled: nose tip, lips, 17 cm, 8.5 cm, 13 cm]

Calculation of the resonance frequencies
  Resonances occur at: $F_n = \frac{(2n-1)\,c}{4L}$
  (The cross-section of the vocal tract is assumed uniform; when producing the vowel "e" the vocal tract is approximately uniform.)
  $L = 17$ cm is the length of the vocal tract; $c = 340$ m/s
  $n = 1, 2, 3, \ldots$ gives the first formant $F_1 = 500$ Hz, the second formant $F_2 = 1500$ Hz, the third formant $F_3 = 2500$ Hz, ...

Calculations
  The modes are determined by the length of the tube:
  $F_n = \frac{n\,c}{4L}$ ($n = 1, 3, 5, 7$ etc.; $c$ = speed of sound = 340 m s$^{-1}$; $L$ = tube length in m)
  Here the subscript is the odd mode number $n$, so $F_1$, $F_3$ and $F_5$ are the first three modes.
  E.g. assume vocal tract length = 22 cm, calculate the first 3 modal frequencies of the tract
  $F_1 = \frac{340 \times 1}{4 \times 0.22} = 386.36$ Hz
  $F_3 = \frac{340 \times 3}{4 \times 0.22} = 1159.1$ Hz
  $F_5 = \frac{340 \times 5}{4 \times 0.22} = 1931.8$ Hz
  Ex. Calculate the first 3 modes for a vocal tract length of 15 cm (a child)

Phonation Qualities
  Spectral quality is determined by how tightly/closely the vocal folds are held
  Louder phonation: greater airflow = brighter sound, similar to many musical instruments

Physiological Factors
  Other articulators involved in speech production
  – The Tongue, The Lips, The Jaw (familiar with these)
  – The Velum
  – Position (height) of the Larynx

Velum and Larynx Height
  Velum – controls the amount of air which enters the nasal passage
  – Say ‘ng’ (as in sing) with the tongue as far back as possible: velum and tongue touch and all sound passes through the nasal passage
  – This adds a new (different) set of resonances which colour the sound (fig 11.7)
  Larynx Height – changes the effective length of the vocal tract (see later)
  – Gives the perception of a larger/smaller head (Darth Vader vs. The Chipmunks); audio track 38, fig 11.9
  – Good singers/professional actors (mimics) can produce a large range of sounds simply by varying the larynx height

A Model of Speech Production: relation between excitation and the vocal tract
  Simple linear system model: linear system with input $x(t)$, system function $H[\cdot]$ and output $s(t)$
  Time invariant? No, but maybe “piecewise” time invariant, e.g. every 10 ms
  $h(t)$: impulse response
  $s(t) = x(t) * h(t)$

General Model for Speech Production
  [Block diagram: an impulse train generator (controlled by the pitch period) feeds the glottal pulse model $G(z)$ and is scaled by $A_v$, the gain for the voice source; a random noise generator is scaled by $A_u$, the gain for the noise source; a voiced/unvoiced switch selects the excitation, which passes through the vocal tract model $H(z)$ and the radiation model $R(z)$ to give speech]

Model of Production
  [Block diagram: an impulse train generator (controlled by the pitch period) or white noise, selected by a voiced/unvoiced switch and multiplied by gain $G$, drives the vocal tract model (controlled by the vocal tract parameters) to give speech]

Classification
  Voiced sounds: the vocal cords vibrate; produced by quasi-periodic excitation
  Unvoiced sounds: the vocal cords do not vibrate; produced by turbulence
  Plosive sounds: produced by compressed air

Speech Sounds
  Coarse classification with phonemes.
  A phone is the acoustic realization of a phoneme.
  Allophones are context dependent phonemes.

Phoneme Hierarchy
  Speech sounds
  – Vowels: iy, ih, ae, aa, ah, ao, ax, eh, er, ow, uh, uw
  – Diphthongs: ay, ey, oy, aw
  – Consonants
      Glide: w, y
      Plosive: p, b, t, d, k, g
      Nasal: m, n, ng
      Fricative: f, v, th, dh, s, z, sh, zh, h
      Retroflex liquid: r
      Lateral liquid: l
  Language dependent. About 50 in English.

Speech Waveform Characteristics
  Loudness
  Voiced/Unvoiced.
  Pitch.
  – Fundamental frequency.
  Spectral envelope.
  – Formants.

Speech Waveform Characteristics Cont.
  [Figure: waveforms of voiced speech (/ih/) and unvoiced speech (/s/)]

Short-Time Speech Analysis
  Segments (or frames, or vectors) are typically of length 20 ms.
  – Speech characteristics are approximately constant within a segment.
  – Allows for relatively simple modeling.
  Often overlapping segments are extracted.
  [Figure: overlapping segments, each labelled B; $B = 1/NT$]

The Spectrogram
  A classic analysis tool.
  – Consists of DFTs of overlapping, and wi