4 Solutions

Solution 4.1

4.1.1 The values of the signals are as follows:

    RegWrite  MemRead  ALUMux   MemWrite  ALUOp  RegMux   Branch
a.  1         0        0 (Reg)  0         Add    0 (ALU)  0
b.  1         1        1 (Imm)  0         Add    1 (Mem)  0

ALUMux is the control signal that controls the Mux at the ALU input: 0 (Reg) selects the output of the register file and 1 (Imm) selects the immediate from the instruction word as the second input to the ALU. RegMux is the control signal that controls the Mux at the Data input to the register file: 0 (ALU) selects the output of the ALU and 1 (Mem) selects the output of memory. A value of X is a “don’t care” (it does not matter whether the signal is 0 or 1).

4.1.2 Resources performing a useful function for this instruction are:

a. All except Data Memory and the branch Add unit
b. All except the branch Add unit and the second read port of the Registers

4.1.3

    Outputs that are not used                     No outputs
a.  Branch Add                                    Data Memory
b.  Branch Add, second read port of Registers     None (all units produce outputs)

4.1.4 One long path for the and instruction is to read the instruction, read the registers, go through the ALU Mux, perform the ALU operation, and go through the Mux that controls the write data for Registers (I-Mem, Regs, Mux, ALU, and Mux). The other long path is similar, but goes through Control while registers are read (I-Mem, Control, Mux, ALU, Mux). There are other paths, but they are shorter, such as the PC increment path (only Add and then Mux), the path to prevent branching (I-Mem, Control, Mux uses the Branch signal to select the PC + 4 input as the new value for PC), the path that prevents a memory write (only I-Mem and then Control), etc.

a. Control is faster than registers, so the critical path is I-Mem, Regs, Mux, ALU, Mux.
b. Control is faster than registers, so the critical path is I-Mem, Regs, Mux, ALU, Mux.

4.1.5 One long path is to read the instruction, read the registers, use the Mux to select the immediate as the second ALU input, use the ALU (compute address), access D-Mem, and use the Mux to select that as the register data input, so we have I-Mem, Regs, Mux, ALU, D-Mem, Mux. The other long path is similar, but goes through Control instead of Regs (to generate the control signal for the ALU Mux). Other paths are shorter, and are similar to the shorter paths described for 4.1.4.

a. Control is faster than registers, so the critical path is I-Mem, Regs, Mux, ALU, D-Mem, Mux.
b. Control is faster than registers, so the critical path is I-Mem, Regs, Mux, ALU, Mux.

4.1.6 This instruction has two kinds of long paths: those that determine the branch condition and those that compute the new PC. To determine the branch condition, we read the instruction, read the registers or use the Control unit, then use the ALU Mux and then the ALU to compare the two values, then use the Zero output of the ALU to control the Mux that selects the new PC. As in 4.1.4 and 4.1.5:

a. The first path (through Regs) is longer.
b. The first path (through Regs) is longer.

To compute the PC, one path is to increment it by 4 (Add), add the offset (Add), and select that value as the new PC (Mux). The other path for computing the PC is to read the instruction (to get the offset), then use the branch Add unit and the Mux. Both of the compute-PC paths are shorter than the critical path that determines the branch condition, because I-Mem is slower than the PC + 4 Add unit, and because the ALU is slower than the branch Add.

Solution 4.2

4.2.1 Existing blocks that can be used for this instruction are:

a. This instruction uses the instruction memory, both existing read ports of Registers, the ALU, and the write port of Registers.
b. This instruction uses the instruction memory, one of the existing register read ports, the path that passes the immediate to the ALU, and the register write port.

4.2.2 New functional blocks needed for this instruction are:

a. Another read port in Registers (to read Rx) and either a second ALU (to add Rx to Rs + Rt) or a third input to the existing ALU.
b. We need to extend the existing ALU to also do shifts (this adds an SLL ALU operation).

4.2.3 The new control signals are:

a. We need a control signal that tells the new ALU what to do, or, if we extended the existing ALU, we need to add a new ADD3 operation.
b. We need to change the ALU Operation control signals to support the added SLL operation in the ALU.

4.2.4 Clock cycle time is determined by the critical path, which for the given latencies happens to be the path that gets the data value for the load instruction: I-Mem (read instruction), Regs (takes longer than Control), Mux (select ALU input), ALU, Data Memory, and Mux (select value from memory to be written into Registers). The latency of this path is 400 ps + 200 ps + 30 ps + 120 ps + 350 ps + 30 ps = 1130 ps.

New clock cycle time:

a. 1130 ps (no change; the Add units are not on the critical path).
b. 1230 ps (1130 ps + 100 ps; Regs are on the critical path).

4.2.5 The speed-up comes from changes in clock cycle time and changes to the number of clock cycles we need for the program:

Benefit:

a. Speed-up is 1 (no change in number of cycles, no change in clock cycle time).
b. We need 5% fewer cycles for a program, but cycle time is 1230 ps instead of 1130 ps, so we have a speed-up of (1/0.95) × (1130/1230) = 0.97, which means we actually have a small slowdown.

4.2.6 The cost is always the total cost of all components (not just those on the critical path), so the original processor has a cost of I-Mem, Regs, Control, ALU, D-Mem, 2 Add units, and 3 Mux units, for a total cost of 1000 + 200 + 500 + 100 + 2000 + 2 × 30 + 3 × 10 = 3890. We will compute cost relative to this baseline. The performance relative to this baseline is the speed-up we computed in 4.2.5, and our cost/performance relative to the baseline is as follows:

    New cost                Relative cost       Cost/Performance
a.  3890 + 2 × 20 = 3930    3930/3890 = 1.01    1.01/1 = 1.01. We are paying a bit more for the same performance.
b.  3890 + 200 = 4090       4090/3890 = 1.05    1.05/0.97 = 1.08. We are paying somewhat more and getting a small slowdown, so our cost/performance gets worse.

Solution 4.3

4.3.1

a. Both. It is mostly flip-flops, but it has logic that controls wh