1 An Efficient Procedure for Building the Function

整理文档很辛苦,赏杯茶钱您下走!

免费阅读已结束,点击下载阅读编辑剩下 ...

阅读已结束,您可以下载文档离线阅读编辑

资源描述

1AnEfficientProcedureforBuildingtheFunctionalPerformanceModelofaProcessorAlexeyLastovetsky,RaviReddy,RobertHigginsDepartmentofComputerScience,UniversityCollegeDublin,BelfieldDublin4,IrelandE-mail:Alexey.Lastovetsky@ucd.ie,Manumachu.Reddy@ucd.ie,Robert.Higgins@ucd.ieAbstract---Inthispaper,wepresentanefficientprocedureforbuildingapiecewiselinearfunctionapproximationofthespeedfunctionofaprocessorwithhierarchicalmemorystructure.Theproceduretriestominimizetheexperimentaltimeusedforbuildingthespeedfunctionapproximation.WedemonstratetheefficiencyofourprocedurebyperformingexperimentswithamatrixmultiplicationapplicationandaCholeskyFactorizationapplicationthatusememoryhierarchyefficientlyandamatrixmultiplicationapplicationthatusesmemoryhierarchyinefficientlyonalocalnetworkofheterogeneouscomputers.1.IntroductionInourpreviousresearch[1],weaddressedtheproblemofoptimaldistributionorschedulingofcomputationaltasksonnetworksofheterogeneouscomputerswhenoneormoretasksdonotfitintothemainmemoryoftheprocessorsandwhenrelativespeedsofprocessorscannotbeaccuratelyapproximatedbyconstantfunctionsoftheproblemsize.Wedesignedefficientalgorithmstosolvethisschedulingproblemusingaperformancemodelthatintegratessomeoftheessentialfeaturesofaheterogeneousnetworkofcomputers(HNOC)havingamajorimpactontheperformance,suchastheprocessorheterogeneity,theheterogeneityofmemorystructure,andtheeffectsofpaging.Underthismodel,thespeedofeachprocessorisrepresentedbyacontinuousandrelativelysmoothfunctionoftheproblemsize.Thismodelisapplication-centricinthesensethatgenerallyspeakingdifferentapplicationswillcharacterizethespeedoftheprocessorbydifferentfunctions.Actuallyongeneral-purposecommonheterogeneousnetworks,2sizeoftheproblemAbsolutespeed)(1xs)(2xsFigure1.Usingpiecewiselinearapproximationtobuildspeedbandsfor2processors.Thecircularpointsareexperimentallyobtainedwhereasthesquarepointsarecalculatedusingheuristics.Thespeedbandforprocessors1(x)isbuiltfrom3experimentallyobtainedpoints(applicationrunonthisprocessorusesmemoryhierarchyinefficiently)whereasthespeedbands2(x)(applicationrunonthisprocessorusesmemoryhierarchyefficiently)isbuiltfrom4experimentallyobtainedpoints.anintegratedcomputerwillexperienceconstantandstochasticfluctuationsintheworkload.Thischangingtransientloadwillcauseafluctuationinthespeedofthecomputerinthesensethattheexecutiontimeofthesametaskofthesamesizewillvaryfordifferentrunsatdifferenttimes.Thenaturalwaytorepresenttheinherentfluctuationsinthespeedistouseaspeedbandratherthanaspeedfunction.Thewidthofthebandcharacterizestheleveloffluctuationintheperformanceduetochangesinloadovertime.Inourpreviousresearch,wedidnotproposeanymethodstobuildandmaintainthespeedbandofaprocessor.Inthispaper,wepresentanefficientandapracticalprocedureforbuildingapiecewiselinearfunctionapproximationofthespeedbandofaprocessorwithhierarchicalmemorystructure.Thisbandshouldbeabletorepresentanyspeedfunctionoftheprocessor,thatis,anyspeedfunctionrepresentingtheperformanceoftheprocessorshouldfitintothespeedband.Theproceduretriestominimizetheexperimentaltimeusedforbuildingthepiecewiselinearfunction3approximationofthespeedband.Wedonotproposemethodstomaintainthespeedfunctionapproximation.Thisisasubjectofourfutureresearch.Samplepiece-wiselinearfunctionapproximationsofthespeedbandofaprocessorareshowninFigure1.Eachoftheapproximationsisbuiltusingasetoffewexperimentallyobtainedpoints.Themorepointsusedtobuildtheapproximation,themoreaccuratetheapproximationis.Howeveritisprohibitivelyexpensivetouselargenumberofpoints.Henceanoptimalsetoffewpointsneedstobechosentobuildanefficientpiecewiselinearfunctionapproximationofthespeedband.Suchanapproximationgivesthespeedoftheprocessorforanyproblemsizewithcertainaccuracywithintheinherentdeviationoftheperformanceofcomputerstypicallyobservedinthenetwork.Therestofthepaperisorganizedasfollows.Insection2,weformulatetheproblemofbuildingapiecewiselinearfunctionapproximationofaprocessorandpresentanefficientandapracticalproceduretosolvetheproblem.Todemonstratetheefficiencyofourprocedure,weperformexperimentsusingamatrixmultiplicationapplicationandaCholeskyFactorizationapplicationthatusememoryhierarchyefficientlyandamatrixmultiplicationapplicationthatusesmemoryhierarchyinefficientlyonalocalnetworkofheterogeneouscomputers.2.ProcedureforBuildingSpeedFunctionApproximationThissectionisorganizedasfollows.Westartwiththeformulationofthespeedbandapproximationbuildingproblem.Thisisfollowedbyasectiononobtainingtheloadfunctionscharacterizingtheleveloffluctuationinloadovertime.Thethirdsectionpresentstheassumptionsadoptedbyourprocedure.Wethenpresentsomeoperationsandrelationsrelatedtothepiecewiselinearfunctionapproximationofthespeedband.Andfinallyweexplainourproceduretobuildthepiecewiselinearfunctionapproximation.4SizeoftheproblemAbsolutespeedSizeoftheproblemAbsolutespeedreal-lifepiecewise(a)(b)SizeoftheproblemAbsolutespeedx))(,(maxxsx))(,(minxsxSizeoftheproblemAbsolutespeed1x2x3x))(,(1min1xsx))(,(2min2xsx))(,(3min3xsx))(,(1max1xsx))(,(2max2xsx))(,(3max3xsx(c)(d)Figure2.(a)Real-l

1 / 25
下载文档,编辑使用

©2015-2020 m.777doc.com 三七文档.

备案号:鲁ICP备2024069028号-1 客服联系 QQ:2149211541

×
保存成功