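Control Engineering of China, Vol. 15, No. 4, Jul. 2008

Article ID: 1671-7848(2008)04-0423-03

Action-dependent Methods of Adaptive Critic Design

LIN Xiaofeng, YU Liang, SONG Shaojian, SONG Chunning
(College of Electrical Engineering, Guangxi University, Nanning 530004, China)

Abstract: Adaptive critic design (ACD) is a method for approximately optimal control of nonlinear systems. This paper introduces two action-dependent forms of ACD: action-dependent heuristic dynamic programming (ADHDP) and action-dependent dual heuristic programming (ADDHP). These methods address the uncertainty caused by plant nonlinearity or poor system modeling, and are suited to time-varying complex systems and dynamically changing complex tasks. The differences between the two methods in structure, computation, and critic network output are described, and the learning capability and control performance of each are analyzed through simulation.

Key words: adaptive critic designs; action-dependent heuristic dynamic programming (ADHDP); action-dependent dual heuristic programming (ADDHP)

CLC number: TP273    Document code: A

Received: 2008-03-27; Revised: 2008-04-02
Foundation items: [...] (04C26214501352); [...] (0575016); [...] (05920016); [...] (2004ZD04)
About the first author: [...] (born 1955) [...]

1 Introduction

[...] Werbos [...] 1997 [...]

2 [...]

Consider the system

$$ x(t+1) = F[x(t), u(t), t], \qquad x(0) = x_0 \tag{1} $$

where $x \in \mathbb{R}^n$ is the state, $u \in \mathbb{R}^m$ is the control, and $t = 0, 1, \ldots, N-1$. The performance index is

$$ J[x(i), i] = \sum_{k=i}^{\infty} \gamma^{k-i}\, U[x(k), u(k), k] \tag{2} $$

where $U$ is the utility function; $\gamma$ is the discount factor, $0 < \gamma \le 1$; and $J$ is the cost-to-go of state $x(i)$, which depends on the controls $u(k)$, $k = i, i+1, \ldots$. The goal is to choose the control sequence that minimizes $J$ in (2). ACD is based on Bellman's principle of optimality:

$$ J^*[x(t), t] = \min_{u(t)} \left\{ U[x(t), u(t), t] + \gamma J^*[x(t+1), t+1] \right\} \tag{3} $$

An ACD consists of three parts: a critic network, a model network, and an action network, as shown in Fig. 1.

Fig. 1 Adaptive critic design with three modules

[...] According to how the critic network output relates to $J$ in (2), ACDs fall into three basic types: (1) heuristic dynamic programming (HDP); (2) dual heuristic programming (DHP); (3) global dual heuristic programming (GDHP). [...] The action-dependent adaptive critic designs considered here are ADHDP and ADDHP. In ADHDP the critic network outputs $J$, whereas in ADDHP it outputs the derivative of $J$. [...] A typical structure is shown in Fig. 2.

Fig. 2 Typical scheme of action-dependent adaptive critic design

1) ADHDP

In ADHDP the critic network estimates $J$ by minimizing the following error over time:

$$ \| E_c \| = \sum_t \frac{1}{2} \left[ J(t) - U(t) - \gamma J(t+1) \right]^2 \tag{4} $$

where $J(t) = J[x(t), u(t), t, W_C]$ and $W_C$ denotes the critic network weights. $U$ is the same utility function as in (2); it is a function of $x(t)$, $u(t)$, and $t$, that is, $U(t) = U[x(t), u(t), t]$. If $E_c = 0$ for all $t$, then (4) gives

$$ J(t) = U(t) + \gamma J(t+1) = U(t) + \gamma \left[ U(t+1) + \gamma J(t+2) \right] = \cdots = \sum_{k=t}^{\infty} \gamma^{k-t} U(k) \tag{5} $$

which is the same as (2). [...] (3) [...] Minimizing (4) therefore trains the critic network output to approximate $J$ in (2). The critic weights are updated by gradient descent on (4):

$$ \Delta W_c = \eta \left( -\frac{\partial E_c}{\partial W_c} \right) = -\eta\, \frac{\partial E_c}{\partial J(t)} \cdot \frac{\partial J(t)}{\partial W_c} = -\eta \left[ J(t) - \gamma J(t+1) - U(t) \right] \frac{\partial J(t)}{\partial W_c} \tag{6} $$

where $\eta$ ($\eta > 0$) is the learning rate. [...] The action network is trained to minimize $J(t)$:

$$ \Delta W_a = l \left( -\frac{\partial J(t)}{\partial W_a} \right) = -l\, \frac{\partial J(t)}{\partial u(t)} \cdot \frac{\partial u(t)}{\partial W_a} \tag{7} $$

where $u(t)$ is the action network output at time $t$ and $l$ ($l > 0$) is the learning rate. [...] $J$ [...] [3]; [...] ADHDP [...]

2) ADDHP

[...] ADDHP [...] ADHDP [...] In ADDHP the critic network output $\lambda(t)$ is the derivative of the cost-to-go with respect to the control [1-4]:

$$ \lambda(t) = \frac{\partial J}{\partial u} \tag{8} $$

The derivative of $J$ with respect to the state $x$ is

$$ \lambda^x(t) = \frac{\partial J}{\partial x} \tag{9} $$

The critic network is trained by minimizing the error

$$ E_j = \sum_t \frac{e_j^2(t)}{2} = \sum_t \frac{1}{2} \left[ \lambda_j(t) - \frac{\partial U(t)}{\partial u_j(t)} - \gamma\, \frac{\partial J(t+1)}{\partial u_j(t)} \right]^2, \qquad j = 1, 2, \ldots, m \tag{10} $$

As in ADHDP, if $e(t) = 0$ for all $t$, so that $E_j(t) = 0$, then (10) gives

$$ \frac{\partial J(t)}{\partial u_j(t)} = \frac{\partial U(t)}{\partial u_j(t)} + \gamma\, \frac{\partial J(t+1)}{\partial u_j(t)} \tag{11} $$

$$ \frac{\partial J(t+1)}{\partial u_j(t)} = \sum_{i=1}^{n} \lambda^x_i(t+1)\, \frac{\partial x_i(t+1)}{\partial u_j(t)} \tag{12} $$

Here $\lambda^x(t+1)$ is obtained from the critic output $\lambda(t+1)$:

$$ \lambda^x_i(t+1) = \frac{\partial J(t+1)}{\partial x_i(t+1)} = \sum_{k=1}^{m} \lambda_k(t+1)\, \frac{\partial u_k(t+1)}{\partial x_i(t+1)} \tag{13} $$

[...] In ADDHP the cost-to-go $J(t)$ satisfies the Bellman equation, [...] $\lambda$ [...] The target value $\lambda^0$ of the $j$-th ($j \le m$) critic output is

$$ \lambda^0_j(t) = \frac{\partial J(t)}{\partial u_j(t)} = \frac{\partial \left[ U(t) + \gamma J(t+1) \right]}{\partial u_j(t)} = \frac{\partial U(t)}{\partial u_j(t)} + \gamma \sum_{i=1}^{n} \lambda^x_i(t+1)\, \frac{\partial x_i(t+1)}{\partial u_j(t)} \tag{14} $$

where $n$ is the number of states and $\partial U(t) / \partial u(t)$ is the derivative of $U(t)$ with respect to $u(t)$. [...] $E_j(t)$ [...] The action network weights are updated by

$$ \Delta W_a = -l\, \frac{\partial J(t)}{\partial W_a} = -l \sum_{k=1}^{m} \frac{\partial J(t)}{\partial u_k(t)} \cdot \frac{\partial u_k(t)}{\partial W_a} \tag{15} $$

Substituting (11) into (15) gives

$$ \Delta W_a = -l \sum_{k=1}^{m} \left[ \frac{\partial U(t)}{\partial u_k(t)} + \gamma\, \frac{\partial J(t+1)}{\partial u_k(t)} \right] \frac{\partial u_k(t)}{\partial W_a} \tag{16} $$

where

$$ \frac{\partial J(t+1)}{\partial u_k(t)} = \sum_{i=1}^{n} \frac{\partial J(t+1)}{\partial x_i(t+1)} \cdot \frac{\partial x_i(t+1)}{\partial u_k(t)} \tag{17} $$

$\partial x_i(t+1) / \partial u_k(t)$ [...] ADDHP [...] The training of the critic network in ADDHP is shown in Fig. 3.

Fig. 3 Training schematic for critic network in ADDHP

3 Simulation

[...] [2] [...] 4 [...] 200 [...] 200 [...] 6000 [...] ACD [...] 200 [...] (200 [...] 6000) [...] The bounds are 12° on the pole angle $|\theta|$ and 2.4 m on the cart position $|x|$ [...]; [...] 6000 [...] 15200 [...] ADHDP: 100%, 178; ADDHP: 909%, 445 (decimal points were not preserved in the source) [...] ADHDP [...] ADDHP [...]; ADDHP [...] ADHDP [...]

Fig. 4 Comparison of pole angles and cart positions

4 Conclusion

[...] ADHDP and ADDHP [...] In ADHDP the critic network outputs $J$, whereas in ADDHP it outputs the derivative of $J$. ADHDP [...]; ADDHP [...]; ADDHP [...] ADHDP [...]

References:
[1] Bertsekas D P. Neuro-dynamic programming [M]. Belmont: MIT Athena Scientific, 1996.
[2] Liu D. Action-dependent adaptive critic designs [C]. Chicago: IEEE International Joint Conference on Neural Networks, 2001.
[3] Prokhorov D V. Adaptive critic designs [J]. IEEE Trans of Neural Netw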