Parallel direct methods for sparse linear systems,

整理文档很辛苦,赏杯茶钱您下走!

免费阅读已结束,点击下载阅读编辑剩下 ...

阅读已结束,您可以下载文档离线阅读编辑

资源描述

PARALLELDIRECTMETHODSFORSPARSELINEARSYSTEMSMichaelT.HeathDepartmentofComputerScienceandNCSAUniversityofIllinoisUrbana,Illinois61801ABSTRACTWepresentanoverviewofparalleldirectmethodsforsolvingsparsesystemsoflinearequations,focusingonsymmetricpositivedenitesystems.Weexaminetheperformanceimplicationsoftheimportantdierencesbetweendenseandsparsesystems.Ourmainemphasisisonparallelimplementationofthenumericallyintensivefactorizationprocess,butwealsobrieyconsidertheothermajorcomponentsofdirectmethods,suchasparallelordering.IntroductionInthispaperwepresentabriefoverviewofparalleldirectmeth-odsforsolvingsparselinearsystems.Paradoxically,sparsematrixfactorizationoersadditionalopportunitiesforexploitingparallelismbeyondthoseavailablewithdensematrices,yetitisoftenmoredif-culttoattaingoodeciencyinthesparsecase.Weexaminebothsidesofthisparadox:theadditionalparallelisminducedbysparsity,andthedicultyinachievinghigheciencyinspiteofit.WefocusonCholeskyfactorization,primarilybecausethisallowsustodiscussparallelisminrelativeisolation,withouttheadditionalcomplicationsofpivotingfornumericalstability.Mostofthelessonslearnedarealsoapplicabletoothermatrixfactorizations,suchasLUandQR.Ourmainpointinthecurrentdiscussionistoexplainhowthesparsecasediersfromthedensecase,andexaminetheperformanceimplicationsofthosedierences.ConsiderasystemoflinearequationsAx=b;whereAisannnsymmetricpositivedenite(SPD)matrix,bisaknownvector,andxistheunknownsolutionvectortobecomputed.OnewaytosolvethelinearsystemisrsttocomputetheCholeskyfactorizationA=LLT;wheretheCholeskyfactorLisalowertriangularmatrixwithpositivediagonalelements.ThenthesolutionvectorxcanbecomputedbysuccessiveforwardandbacksubstitutionstosolvethetriangularsystemsLy=b;LTx=y:CholeskyFactorizationAlgorithmThealgorithmforCholeskyfactorizationisavariantofGaussianeliminationthattakesadvantageofsymmetrytoreducebothworkandstoragebyabouthalf.LikeGaussianelimination,thealgorithmconsistsofatriplenestedloop.Oneofthe3!waysofarrangingthatloopisshowninFigure1.forj=1;nfork=1;j1fori=j;naij=aijaikajkfcmod(j;k)gajj=pajjfork=j+1;nakj=akj=ajjfcdiv(j)gFigure1:SerialCholeskyfactorizationalgorithmWemakethefollowingimportantobservationsaboutthisalgorithm:SinceAisSPD,thesquarerootsareallofpositivenumbers,sothealgorithmiswelldened.Pivotingisnotrequiredfornumericalstability.OnlythelowertriangularportionofAisaccessed.ThefactorLiscomputedinplace,overwritingthelowertrian-gleofA.Eachcolumnjismodiedbyamultipleofeachpriorcolumnk.Wedenotethisoperationbycmod(j;k).Ifthatmultipleiszero(i.e.,ajk=0),thentheinnermostloophasnoeectandmayaswellbeskipped.ElementsofAthatwereinitiallyzeromaybecomenonzeroduetocmodoperationsbynonzeroelementsfrompreviouscolumns.Suchnewnonzerosarecalledll.Whenallmodicationstocolumnjarecomplete,itisscaledbythesquarerootofitsdiagonalelementtoproducecolumnjofthefactor.Wedenotethisoperationbycdiv(j).ForfurtherdetailsonCholeskyfactorization,see(GolubandVanLoan,1989).ThreeFormsofCholeskyFactorizationThethreechoicesofindexfortheouterloopyieldmarkedlydier-entmemoryaccesspatterns,asillustratedinFigure2,andthesehaveimportantperformanceimplicationsinvariousarchitecturalsettings,suchaseectivecacheutilization,vectorization,parallelization,orout-of-coresolutions.Row-Cholesky:Withiintheouterloop,theinnerloopssolveatriangularsystemforeachnewrowintermsofthepreviouslycomputedrows.Column-Cholesky:Withjintheouterloop,theinnerloopscomputethematrix-vectorproductthatgivestheeectofpre-viouslycomputedcolumnsonthecolumncurrentlybeingcom-puted.Submatrix-Cholesky:Withkintheouterloop,theinnerloopsapplythecurrentcolumnasarank-1updatetotheremainingunreducedsubmatrix.Althoughrow-orientedalgorithmscanbeeectiveinsomecon-texts,column-orientedalgorithmstendtobemuchmoreeectiveinpracticeforsparseproblems,sowewillrestrictourattentiontothelatter.row-Choleskycolumn-Choleskysubmatrix-CholeskymodiedusedformodicationFigure2:ThreeformsofCholeskyfactorization.Column-Choleskyissometimessaidtobea\left-lookingalgo-rithm,sinceateachstageitaccessesneededcolumnstotheleftofthecurrentcolumninthematrix.Itcanalsobeviewedasa\demand-drivenalgorithm,sincetheinnerproductsthataectagivencolumnarenotaccumulateduntilactuallyneededtomod-ifyandcompletethatcolumn.Forthisreason,column-Choleskyiscalleda\delayed-updatealgorithm.Itisalsoreferredtoasa\fan-inalgorithm,sincethebasicoperationistocombinetheeectsofmultiplepreviouscolumnsonasingletargetcolumn.Insubmatrix-Cholesky,assoonacolumnhasbeencomputed,itseectsonallsubsequentcolumnsarecomputedimmediately.Thus,submatrix-Choleskyissaidtobea\right-lookingalgorithm,sinceateachstagecolumnstotherightofthecurrentcolumnaremodi-ed.Itcanalsobeviewedasa\data-drivenalgorithm,sinceeachnewcolumnisusedassoonasitiscompletedtomakeallmodi-cationstoallthesubsequentcolumnsitaects.Forthisreason,submatrix-Choleskyiscalledan\immediate-updatealgorithm.Itisalsoreferredtoasa\fan-outalgorithm,sincethebasicopera-tionisforasinglecolumntoaectmultiplesubsequentcolumns.Wewillseethatthesecharacterizationsofthecolumn-Cho

1 / 36
下载文档,编辑使用

©2015-2020 m.777doc.com 三七文档.

备案号:鲁ICP备2024069028号-1 客服联系 QQ:2149211541

×
保存成功