

STOCHASTIC MODELS OF LANGUAGE EVOLUTION AND AN APPLICATION TO THE INDO-EUROPEAN FAMILY OF LANGUAGES

TANDY WARNOW, STEVEN N. EVANS, DON RINGE, AND LUAY NAKHLEH

ABSTRACT. We propose several models of how languages evolve, and discuss statistical estimation of evolution under these models. We also discuss issues of identifiability and statistical consistency under these models.

Date: April 16, 2004. TW supported by NSF grant BCS-0312830. SNE supported in part by NSF grant DMS-0071468. DR supported in part by NSF grant BCS-0312911.

1. INTRODUCTION

In recent months several methods for estimating evolutionary histories of languages have been described and used on Indo-European (IE) datasets in order to estimate dates at which languages diversified. Implicit in these methods are stochastic models of how languages evolve (Forster & Toth, 2003; Gray & Atkinson, 2003). We agree that a carefully considered stochastic model can be of tremendous use to historical linguistics: if sufficiently realistic, inference under the model can reveal much about the history of the language family, and examinations of how reconstruction methods perform under these models (via simulation, in particular) can help us quantify the reliability of a reconstruction method. Since our own interest in this is primarily motivated by the IE family, we will formulate this model so as to reflect what we believe is likely to be true about IE's evolution. Much, however, should be appropriate for other families, and we will discuss extensions to other families at the end of the paper.

2. MODELS

In this section we explain what is meant by a stochastic model of language evolution, and we present some specific models that are worth examining in the context of IE evolution.

We begin by explaining what linguistic “characters” are, since the evolutionary model describes how each character evolves.

2.1. Linguistic characters. A (linguistic) character is any feature of languages that can take one or more forms; these different forms are called the “states” of the character. Thus, our characters include lexical characters, where the different states are the cognate classes, so that two languages exhibit the same state for the lexical character if and only if they have cognates for the meaning associated with the lexical character. Other characters include phonological characters (the appearance of a sound change within the language or its ancestry) and morphological characters (e.g., inflectional markers).

Thus, a character defines an equivalence relation on the language family, where two languages are equivalent if they exhibit the same state for the character. Given a partition of a set into disjoint subsets, we can define an equivalence relation by making two languages equivalent if and only if they are in the same subset; thus, a partition of a set into disjoint subsets defines an equivalence relation (and the converse holds as well).

Our first simplifying assumption is that all the characters are “monomorphic”, which means that every language exhibits only one state of each character. The contrasting phenomenon is a character which has two or more states for some languages; examples of such characters include the semantic slot “rock” for which English contains at least two equivalents: “rock” and “stone”. Because we do not understand in enough detail how polymorphism arises, we will exclude polymorphic characters from our model.

Simplifying assumption #1: there is no polymorphism (i.e., the appearance of two or more states for a given character in a given language).

For each character, we can assign numbers to the states of the character so that the character is defined to be a function that assigns every language in a set of languages a real number; the number assigned to the language is called the “state” of the character for that language. Thus, the states of all our characters are real numbers, and when we write c(L) for a language L and a character c, we mean the state of the character c exhibited by the language L. However, the particular real number used to label a state is irrelevant, and all that matters is whether two states are equal or different.

2.2. Tree models. Languages can evolve in a purely treelike fashion (the Stammbaum model), or with enough contact between languages that undetected (or undetectable) borrowing occurs between lineages, so that it becomes difficult (or inappropriate) to define a “genetic tree” for the family. Many conditions can make evolution non-treelike; creoles (hybrid languages) are one, dialect continua are another, but more generally contact itself between divergent lineages can also lead to trees being inappropriate (or just difficult to infer). All of these conditions can be loosely grouped under the category of “reticulate evolution”.

We will initially describe the model for the case where there is no reticulate evolution, since most of the concepts are more familiar in that context; later we will show how the model extends to the case where we permit reticulate evolution.

In the case where there is no reticulate evolution, the evolutionary history of the languages is described by a rooted tree T, in which the leaves represent the languages in the family, and the internal nodes represent ancestral languages at particular points in time; this is the “genetic tree” for the family. Every node in T has a time associated to it, with times at nodes increasing as one moves away from the root of the tree. All of the internal nodes in the tree will have at least two edges issuing from them (that is, they will have out-degree at least two) so that nodes can also be thought of as representing diversification events. Therefore, an edge within the tree represents the development of the language over a period of time between diversification events.

2.3. The evo
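
The definitions in Sections 2.1 and 2.2 can be made concrete with a minimal Python sketch. It is an illustration only, not code from the paper: the language labels, state values, node names, and the helper names partition and is_valid_genetic_tree are all hypothetical, and "times increasing away from the root" is read here as strictly increasing, which is an assumption. The first part shows that two labelings of the same character induce the same partition of the languages; the second checks the two structural conditions stated for the genetic tree.

```python
from collections import defaultdict


def partition(character):
    """Return the equivalence classes a character induces on the languages.

    `character` maps each language to one real-valued state (monomorphic).
    """
    classes = defaultdict(set)
    for language, state in character.items():
        classes[state].add(language)
    return {frozenset(members) for members in classes.values()}


# Hypothetical toy character under two different numeric labelings.
# Only equality of states matters, so both give the same partition.
char_a = {"L1": 1.0, "L2": 1.0, "L3": 2.0, "L4": 3.0}
char_b = {"L1": 7.5, "L2": 7.5, "L3": -1.0, "L4": 0.0}
same_partition = partition(char_a) == partition(char_b)


def is_valid_genetic_tree(children, times, root):
    """Check the conditions of Section 2.2 on a rooted tree.

    children: dict mapping each internal node to a list of its child nodes.
    times:    dict mapping every node to its associated time.
    Assumption: times must strictly increase from parent to child.
    """
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:  # a node reached twice means this is not a tree
            return False
        seen.add(node)
        kids = children.get(node, [])
        if len(kids) == 1:  # internal nodes need out-degree at least two
            return False
        for kid in kids:
            if times[kid] <= times[node]:
                return False
            stack.append(kid)
    return seen == set(times)  # every node must be reachable from the root


# Hypothetical four-leaf tree: the root splits into L1 and an ancestor A.
toy_children = {"root": ["L1", "A"], "A": ["L2", "L3", "L4"]}
toy_times = {"root": 0.0, "A": 1.0, "L1": 2.0, "L2": 2.0, "L3": 2.0, "L4": 2.0}
tree_ok = is_valid_genetic_tree(toy_children, toy_times, "root")
```

Representing a character as a dictionary from language to state mirrors its definition as a function on the set of languages, and comparing partitions rather than raw state values reflects the remark that the particular numeric label of a state is irrelevant.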
