Software Tuning Basics

Chen Jian, 2003/3

Why Tune?

Same code, different performance:

    Version    Time
    SELF       16.676s
    RELEASE    5.445s
    OPT:4      5.457s
    IMSL       10.996s
    CXML       3.328s
    ATLAS      0.762s
    MKL50      0.848s
    MKL51      0.738s

Loop order i, j, k:

    for (i = 0; i < NUM; i++) {
        for (j = 0; j < NUM; j++) {
            for (k = 0; k < NUM; k++) {
                c[i][j] = c[i][j] + a[i][k] * b[k][j];
            }
        }
    }

Loop order i, k, j:

    for (i = 0; i < NUM; i++) {
        for (k = 0; k < NUM; k++) {
            for (j = 0; j < NUM; j++) {
                c[i][j] = c[i][j] + a[i][k] * b[k][j];
            }
        }
    }

Objectives
- Identify the main tasks of performance tuning
- Define some important performance tuning terms
- Use Intel tools to help

Agenda
- Performance Cycle Overview
  - The Performance Cycle
  - When to Start
  - Performance Gains
  - When to Stop
  - Putting it into Perspective
- Performance Cycle Details
- Summary

The Tuning Cycle
[Figure: the tuning cycle, a loop through the following steps]
- Gather performance data (start here)
- Analyze the data and draw conclusions
- Determine changes that resolve the issues
- Modify the code to implement the optimizations
- Test the results

When (Why) to Start
- User requirement?
- Software vendor requirement?
- Put the performance requirement into the requirements document
- Performance should be considered at every stage of the product life cycle (requirements gathering, design, and testing)
- Exception: do "code tuning" after the simple, readable, non-optimized version of the application exists

Effort vs. Results
[Figure: performance plotted against effort. Labeled levels and curves: Theoretical Performance, Required Performance, Performance Attained w/ Tools, Performance Attained w/o Tools]

When to Stop
- Architecture is at maximum efficiency?
  - Be sure you know what this is: calculate the theoretical maximum performance
- Requirement is satisfied
- Incrementally do wide-mesh optimizations until done

Tuning Principles
- "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." (Donald Knuth)
- Quality code is:
  - Portable
  - Readable
  - Maintainable
  - Reliable
- Intelligently sacrifice quality for performance

Agenda
- Performance Cycle Overview
- Performance Cycle Details
  - Gather Performance Data
  - Analyze Data and Identify Issues
  - Generate Alternatives to Resolve Issues
  - Implement Enhancements
- Summary

Gather Performance Data
- Timer
  - Use to get wall-clock time
  - Accuracy, low overhead
- Use Intel® VTune™ Performance Analyzer
  - Profiler: gather information about code usage
  - Performance Monitor: gather information about system resource usage

Workload
A good workload should have these characteristics:
- measurable
- reproducible
- static
- representative

Analyze the Data and Draw Conclusions
- Baseline current performance
- Examine hot spots
- Identify bottlenecks
- Calculate potential maximum performance

Examine Hot Spots
- The Pareto Principle, a.k.a. the 80/20 Rule
  - Concentrate on the vital few vs. the trivial many
- Hot spot: the part of the application or system that accounts for most of the computation
- Generally consists of a loop
- For applications that don't have hot spots, examine:
  - Memory layout
  - Exceptions
  - Effective compiler usage

Additional Topics
- Big O
- Utilization, efficiency, throughput, latency
- Bottlenecks: I/O, memory, CPU
- MIPS / FLOPS / CPI
- Concurrency, parallelism
- Scalability
- Loads/stores per calculation

Agenda
- Performance Cycle Overview
- Performance Cycle Details
  - Gather Performance Data
  - Analyze Data and Identify Issues
  - Generate Alternatives to Resolve Issues
  - Implement Enhancements
- Summary

Levels of Optimization Design
- Problem definition
- System architecture
- Algorithms and data structures
- Code tuning
- System software
- System hardware

Code Tuning
Techniques, from hardest to develop and maintain down to easiest to develop, port and maintain:
- Assembly instructions
- Intrinsics
- C++ vector class library
- Multithreading
- Loop transformations
- Compiler and compiler switches
- Performance libraries

Code Tuning If Parallel Processing
- Break the algorithm up across clusters (distributed memory)
- Single-node optimization
- Break the algorithm up across processors (SMP)

Modify the Code to Implement Optimizations
- Use Intel® Libraries
- Use various compiler switches
- Find out whether the compiler or hardware already does the enhancement automatically before implementing it yourself
- Modify the source (e.g. loop transformations, SWP, SIMD, OpenMP, intrinsics, assembly)

Test!
- Make sure the application still runs correctly (regression testing)
- Make sure the enhancement actually increases performance
- Calculate the speed-up
- Decide if you're done optimizing

Speed-Up: The Two Basic Formulas

    Speed-Up = Baseline Time / Optimized Time
    Speed-Up = Optimized Throughput / Baseline Throughput

Summary
- Optimization tasks
  - Gather Performance Data
  - Analyze Data and Identify Issues
  - Generate Alternatives to Resolve Issues
  - Implement Enhancements
  - Test Results
- Use Intel® Software Development Tools for every step in the process
