



IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 18, NO. 1, FEBRUARY 2014

A Novel Graph-Based Estimation of the Distribution Algorithm and Its Extension Using Reinforcement Learning

Xianneng Li, Student Member, IEEE, Shingo Mabu, Member, IEEE, and Kotaro Hirasawa, Member, IEEE

Abstract—In recent years, numerous studies have reported the success of estimation of distribution algorithms (EDAs) in avoiding the frequent breakage of building blocks that occurs in conventional evolutionary algorithms (EAs) based on stochastic genetic operators. In this paper, a novel graph-based EDA called probabilistic model building genetic network programming (PMBGNP) is proposed. Using the distinctive graph (network) structure of a graph-based EA called genetic network programming (GNP), PMBGNP ensures higher expression ability than the conventional EDAs to solve some specific problems. Furthermore, an extended algorithm called reinforced PMBGNP is proposed to combine PMBGNP and reinforcement learning to enhance the performance in terms of fitness values, search speed, and reliability. The proposed algorithms are applied to solve the problems of controlling the agents' behavior. Two problems are selected to demonstrate the effectiveness of the proposed algorithms, including the benchmark one, i.e., the Tileworld system, and a real mobile robot control.

Index Terms—Agent control, estimation of distribution algorithm (EDA), genetic network programming (GNP), graph structure, reinforcement learning (RL).

Manuscript received July 5, 2012; revised October 27, 2012; accepted December 24, 2012. Date of publication January 9, 2013; date of current version January 27, 2014.
The authors are with the Graduate School of Information, Production and Systems, Waseda University, Fukuoka 808-0135, Japan (e-mail: sen-nou@asagi.waseda.jp; mabu@aoni.waseda.jp; hirasawa@waseda.jp).
This paper has supplementary downloadable material available at [...].
Digital Object Identifier 10.1109/TEVC.2013.2238240
1089-778X © 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.

I. Introduction

In the last few years, there has been a significant development of the estimation of distribution algorithm (EDA) in both theory and practice [1]–[4]. Unlike the conventional evolutionary algorithms (EAs) that use stochastic ways to simulate the biological genetic operators for new population generation, EDA constructs a probabilistic model using the techniques of statistics or machine learning to estimate the probability distribution of the current population, and samples the model to generate a new population. Many studies have investigated whether EDA can outperform conventional EA by avoiding premature convergence and speeding up the evolution process in some problems [5]–[8].

A large number of studies have been conducted on EDA to propose numerous algorithms. Particularly, from the perspective of individual representation, EDA can be simply classified into two categories, which are probabilistic model building genetic algorithm (PMBGA, or genetic algorithm-based EDA) [9] and probabilistic model building genetic programming (PMBGP, or genetic programming-based EDA) [10]. PMBGA employs GA's string structure to represent its individuals and is mainly applied to solve optimization problems, while PMBGP uses GP's tree structure to represent its individuals for program evolution.

In this paper, a novel graph-based EDA called probabilistic model building genetic network programming (PMBGNP) [11] is described. The aim of developing PMBGNP is to extend EDA from the string and tree structures to the graph structure, where the directed graph (network) structure of genetic network programming (GNP) [12], [13] is employed. Some previous research has shown the superiority of graph-based EAs in terms of higher expression ability than that of conventional GP [12]–[16]. GNP is one such graph-based EA, which extends GA and GP by using a directed graph (network) structure to represent its individuals. Different from the other graph-based EAs, GNP was first designed for solving the problems of controlling the agents' behavior, while in recent years it has been extended to many other problems, such as multiagent systems [17], data mining [18], elevator system control [19], intrusion detection systems [20], etc. Therefore, from the perspective of individual representation, PMBGNP has higher expression ability than the conventional EDAs to efficiently solve some problems due to the directed graph structure of GNP.

On the other hand, another challenge in EDA is using it to explore many other problems. This paper applies PMBGNP to solve the problems of controlling the agents' behavior, whereas most of the current EDAs are designed to solve other problems. Two problems are selected to demonstrate the effectiveness of the proposed algorithm, including the benchmark one, i.e., the Tileworld system [21], and a real mobile robot control, Khepera robot control [22], [23].

Therefore, PMBGNP has two primary features.
1) EDA is extended to graph-based EA.
2) EDA is applied to solve the problems of controlling the agents' behavior.

In addition, we propose an extended algorithm called reinforced PMBGNP (RPMBGNP) that combines reinforcement learning (RL) [24] and PMBGNP in order to enhance its performanc