William Stallings
Computer Organization and Architecture, 5th Edition
Chapter 13: Instruction Level Parallelism and Superscalar Processors

What is Superscalar?
• The term superscalar refers to a machine that is designed to improve the performance of the execution of scalar instructions.
• There are multiple independent instruction pipelines in a superscalar processor.
• Each pipeline consists of multiple stages and can handle multiple instructions at a time.
• Multiple pipelines introduce a new level of parallelism, enabling multiple streams of instructions to be processed at a time.
• A superscalar processor fetches multiple instructions at a time.
• It attempts to find nearby instructions that are independent of one another and can be executed in parallel.
• The essence of the superscalar approach is the ability to execute instructions independently in different pipelines.
• Common instructions (arithmetic, load/store, conditional branch) can be initiated and executed independently in a superscalar processor.
• Equally applicable to RISC & CISC
• In practice usually RISC

Why Superscalar?
• Most operations are on scalar quantities (see RISC notes)
• Improve these operations to get an overall improvement

General Superscalar Organization
• Two integer, two floating-point, and one memory (either load or store) operations can be executing at the same time.

Superpipelined
• Many pipeline stages need less than half a clock cycle
• Doubling the internal clock speed gets two tasks done per external clock cycle

Superscalar v Superpipeline (Diagram)
• Base machine
• Superpipeline
• Superscalar

Limitations
• Instruction level parallelism
  - Compiler based optimisation
  - Hardware techniques
• Limited by
  - True data dependency
  - Procedural dependency
  - Resource conflicts
  - Output dependency
  - Antidependency

True Data Dependency
• ADD r1, r2 (r1 := r1 + r2;)
• MOVE r3, r1 (r3 := r1;)
• Can fetch and decode second instruction in parallel with first
• Can NOT execute second instruction until first is finished
• Also called flow dependency or write-read dependency

Procedural Dependency
• Cannot execute instructions after a branch in parallel with instructions before a branch
• Also, if instruction length is not fixed, instructions have to be decoded to find out how many fetches are needed
• This prevents simultaneous fetches

Resource Conflict
• Resources
  - Memories, caches, buses, register-file ports, functional units
• Two or more instructions requiring access to the same resource at the same time
  - e.g. two arithmetic instructions
• Can duplicate resources
  - e.g. have two arithmetic units

Effect of Dependencies

Design Issues
• Instruction level parallelism
  - Instructions in a sequence are independent
  - Execution can be overlapped
  - Governed by data and procedural dependency
• Machine Parallelism
  - Ability to take advantage of instruction level parallelism (a measure of how well the processor supports it)
  - Governed by number of parallel pipelines
• E.g.
    Independent (can overlap)     Dependent (cannot overlap)
    Load  R1, R2(23)              Add   R3, R3, "1"
    Add   R3, R3, "1"             Add   R4, R3, R2
    Add   R4, R4, R2              Store [R4], R0

Instruction Issue Policy
• Order in which instructions are fetched
• Order in which instructions are executed
• Order in which instructions change registers and memory

In-Order Issue In-Order Completion
• Issue instructions in the order they occur
• Not very efficient
• May fetch >1 instruction
• Instructions must stall if necessary

In-Order Issue In-Order Completion (Diagram)

In-Order Issue Out-of-Order Completion
• Output dependency
  - R3 := R3 + R5; (I1)
  - R4 := R3 + 1; (I2)
  - R3 := R5 + 1; (I3)
  - I2 depends on the result of I1: true data dependency
  - If I3 completes before I1, the later write by I1 leaves the wrong value in R3: output (write-write) dependency

In-Order Issue Out-of-Order Completion (Diagram)

Out-of-Order Issue Out-of-Order Completion
• Decouple decode pipeline from execution pipeline
• Can continue to fetch and decode until this pipeline is full
• When a functional unit becomes available an instruction can be executed
• Since instructions have been decoded, processor can look ahead

Out-of-Order Issue Out-of-Order Completion (Diagram)

Antidependency
• Read-write dependency
  - R3 := R3 + R5; (I1)
  - R4 := R3 + 1; (I2)
  - R3 := R5 + 1; (I3)
  - R7 := R3 + R4; (I4)
  - I3 cannot complete before I2 starts, as I2 needs a value in R3 and I3 changes R3

Register Renaming
• Output and antidependencies occur because register contents may not reflect the correct ordering from the program
• May result in a pipeline stall
• Registers allocated dynamically
  - i.e. registers are not specifically named

Register Renaming example
• Register renaming
  - R3b := R3a + R5a (I1)
  - R4b := R3b + 1 (I2)
  - R3c := R5a + 1 (I3)
  - R7b := R3c + R4b (I4)
• A name without a subscript refers to the logical register in the instruction
• A name with a subscript is the hardware register allocated
• Note R3a, R3b, R3c

Machine Parallelism
• Three hardware techniques
  - Duplication of resources
  - Out of order issue
  - Renaming
• Figure 13.5 shows simulation results
• Not worth duplicating functions without register renaming
• Register renaming eliminates antidependencies and output dependencies
• Need instruction window large enough (more than 8)

Branch Prediction
• 80486 fetches both next sequential instruction after branch and branch target instruction
• Gives two cycle delay if branch taken

RISC - Delayed Branch
• Calculate result of branch before unusable instructions pre-fetched
• Always execute single instruction immediately following branch
• Keeps pipeline full while fetching new instruction stream
• Not as good for superscalar
  - Multiple instructions need to execute in delay slot
  - Instruction dependence problems
• Revert to branch prediction

Superscalar Execution

Superscalar Implementation
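
The three data dependencies named in the slides (true, anti, output) can be checked mechanically from the registers each instruction reads and writes. The following is a minimal sketch in Python, not taken from the slides: the `Instr` record and the `find_dependencies` helper are hypothetical names introduced for illustration, and the program is the I1..I4 sequence from the Antidependency slide. A true dependency is only reported if no instruction in between overwrites the register; anti and output dependencies are reported pairwise.

```python
from dataclasses import dataclass

TRUE = "true (write-read)"
ANTI = "anti (read-write)"
OUTPUT = "output (write-write)"


@dataclass(frozen=True)
class Instr:
    """Hypothetical record: one destination register and its source registers."""
    name: str
    dest: str
    srcs: tuple


def find_dependencies(prog):
    """Map (earlier, later) instruction names to the dependencies between them."""
    found = {}
    for i, earlier in enumerate(prog):
        for j in range(i + 1, len(prog)):
            later = prog[j]
            kinds = []
            # The value written by `earlier` is dead if something in between rewrites it.
            overwritten = any(m.dest == earlier.dest for m in prog[i + 1:j])
            if earlier.dest in later.srcs and not overwritten:
                kinds.append(TRUE)      # later reads what earlier wrote
            if later.dest in earlier.srcs:
                kinds.append(ANTI)      # later overwrites what earlier reads
            if later.dest == earlier.dest:
                kinds.append(OUTPUT)    # both write the same register
            if kinds:
                found[(earlier.name, later.name)] = kinds
    return found


# The sequence from the slides; the constant operand 1 is omitted from srcs.
program = [
    Instr("I1", "R3", ("R3", "R5")),  # R3 := R3 + R5
    Instr("I2", "R4", ("R3",)),       # R4 := R3 + 1
    Instr("I3", "R3", ("R5",)),       # R3 := R5 + 1
    Instr("I4", "R7", ("R3", "R4")),  # R7 := R3 + R4
]

deps = find_dependencies(program)
for pair, kinds in deps.items():
    print(pair, kinds)
```

Only the true dependencies reflect real data flow; the anti and output entries arise purely from reusing the name R3, which is why register renaming can remove them.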
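
The register renaming example can be reproduced with a few lines of bookkeeping. This is a sketch under simplifying assumptions, not a description of any real processor: the `rename` function is a hypothetical helper, every logical register starts at version "a", and each write allocates the next letter (so the sketch supports at most 25 writes per register and never frees a hardware register, which real hardware must do).

```python
import string


def rename(program):
    """Rename logical registers so that every write gets a fresh hardware name.

    `program` is a list of (name, dest, operands); operands are register
    names (str) or integer constants.
    """
    version = {}  # logical register -> index of its current version letter

    def current(reg):
        return reg + string.ascii_lowercase[version.setdefault(reg, 0)]

    renamed = []
    for name, dest, operands in program:
        # Sources are read first, so they see the mapping before this write.
        new_ops = [current(op) if isinstance(op, str) else op for op in operands]
        # The write allocates the next version of the destination register.
        version[dest] = version.get(dest, 0) + 1
        renamed.append((name, current(dest), new_ops))
    return renamed


# The I1..I4 sequence from the slides, before renaming.
program = [
    ("I1", "R3", ["R3", "R5"]),  # R3 := R3 + R5
    ("I2", "R4", ["R3", 1]),     # R4 := R3 + 1
    ("I3", "R3", ["R5", 1]),     # R3 := R5 + 1
    ("I4", "R7", ["R3", "R4"]),  # R7 := R3 + R4
]

renamed = rename(program)
lines = [f"{dest} := {' + '.join(map(str, ops))}" for _, dest, ops in renamed]
for (name, _, _), line in zip(renamed, lines):
    print(f"{line}  ({name})")
```

After renaming, no two instructions write the same hardware register, so the output dependency between I1 and I3 and the antidependency between I2 and I3 no longer constrain completion order; only the true dependencies remain.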
