ARACNE An Algorithm for the Reconstruction of Gene

ARACNE: An Algorithm for the Reconstruction of Gene Regulatory Networks in a Mammalian Cellular Context

Adam A. Margolin (1,2), Ilya Nemenman (2), Katia Basso (3), Chris Wiggins (2,4), Gustavo Stolovitzky (5), Riccardo Dalla Favera (3), Andrea Califano (1,2,*)

(1) Department of Biomedical Informatics, (2) Joint Centers for Systems Biology, (3) Institute for Cancer Genetics, (4) Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY 10032
(5) IBM T.J. Watson Research Center, Yorktown Heights, N.Y. 10598

* Corresponding author: 1130 St. Nicholas Avenue Room 910, New York, NY 10032.

Email addresses:
AAM: adam@dbmi.columbia.edu
IN: ilya.nemenman@columbia.edu
KB: kb451@columbia.edu
CW: chw2@columbia.edu
GS: gustavo@us.ibm.com
RDF: rd10@columbia.edu
AC: califano@c2b2.columbia.edu

Abstract

Background

Elucidating gene regulatory networks is crucial for understanding normal cell physiology and complex pathologic phenotypes. Existing computational methods for the genome-wide “reverse engineering” of such networks have been successful only for lower eukaryotes with simple genomes. Here we present ARACNE, a novel algorithm, using microarray expression profiles, specifically designed to scale up to the complexity of regulatory networks in mammalian cells, yet general enough to address a wider range of network deconvolution problems. This method uses an information theoretic approach to eliminate the majority of indirect interactions inferred by co-expression methods.

Results

We prove that ARACNE reconstructs the network exactly (asymptotically) if the effect of loops in the network topology is negligible, and we show that the algorithm works well in practice, even in the presence of numerous loops and complex topologies. We assess ARACNE’s ability to reconstruct transcriptional regulatory networks using both a realistic synthetic dataset and a microarray dataset from human B cells. On synthetic datasets ARACNE achieves very low error rates and outperforms established methods, such as Relevance Networks and Bayesian Networks. Application to the deconvolution of genetic networks in human B cells demonstrates ARACNE’s ability to infer validated transcriptional targets of the c-MYC proto-oncogene. We also study the effects of mis-estimation of mutual information on network reconstruction, and show that algorithms based on mutual information ranking are more resilient to estimation errors.

Conclusions

ARACNE shows promise in identifying direct transcriptional interactions in mammalian cellular networks, a problem that has challenged existing reverse engineering algorithms. This approach should enhance our ability to use microarray data to elucidate functional mechanisms that underlie cellular processes and to identify molecular targets of pharmacological compounds in mammalian cellular networks.

Background

Cellular phenotypes are determined by the dynamical activity of large networks of co-regulated genes. Thus dissecting the mechanisms of phenotypic selection requires elucidating the functions of the individual genes in the context of the networks in which they operate. Because gene expression is regulated by proteins, which are themselves gene products, statistical associations between gene mRNA abundance levels, while not directly proportional to activated protein concentrations, should provide clues towards uncovering gene regulatory mechanisms. Consequently, the advent of high throughput microarray technologies to simultaneously measure mRNA abundance levels across an entire genome has spawned much research aimed at using these data to construct conceptual “gene network” models to concisely describe the regulatory influences that genes exert on each other.

Genome-wide clustering of gene expression profiles [1] provides an important first step towards this goal by grouping together genes that exhibit similar transcriptional responses to various cellular conditions, and are therefore likely to be involved in similar cellular processes. However, the organization of genes into co-regulated clusters provides a very coarse representation of the cellular network. In particular, it cannot separate statistical interactions that are irreducible (i.e., direct) from those arising from cascades of transcriptional interactions that correlate the expression of many non-interacting genes. More generally, as appreciated in statistical physics, long range order (i.e., high correlation among non-directly interacting variables) can easily result from short range interactions [2]. Thus correlations, or any other local dependency measure, cannot be used as the only tool for the reconstruction of interaction networks without additional assumptions.

Within the last few years a number of sophisticated approaches for the reverse engineering of cellular networks (also called deconvolution) from gene expression data have emerged (reviewed in [3]). Their goal is to produce a high-fidelity representation of the cellular network topology as a graph, where genes are represented as vertices and are connected by edges representing direct regulatory interactions. The criteria for defining an edge, as well as its biological interpretation, remain imprecise and vary between applications. For example, graphical modeling [4] defines edges as parent-child relationships between mRNA abundance levels that are most likely to explain the data, integrative methods [5] use independent experimental clues to define edges as those showing evidence of physical interactions, and other statistical/information theoretical methods [6] identify edges with the strongest statistical associations between mRNA abundance levels. All available approaches suffer to various degrees from problems such as overfitting, high computational
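
The point made above, that a cascade of direct interactions can produce strong statistical association between genes that do not interact directly, can be illustrated with a small simulation. The following is a minimal sketch and not the authors' implementation: the three-gene cascade x -> y -> z, the Gaussian noise levels, the unrelated gene w, and the simple histogram estimator of mutual information are all assumptions chosen for illustration.

```python
import numpy as np


def mutual_information(a, b, bins=10):
    """Plug-in (histogram) estimate of I(A;B) in nats.

    This is a simple illustrative estimator, assumed here for the sketch;
    it is not the estimator described in the paper.
    """
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()                 # joint distribution
    px = pxy.sum(axis=1, keepdims=True)       # marginal of A (column vector)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of B (row vector)
    nz = pxy > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


# Hypothetical expression profiles: x regulates y, y regulates z,
# and x never acts on z directly. w is unrelated to all of them.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
y = x + 0.5 * rng.normal(size=n)
z = y + 0.5 * rng.normal(size=n)
w = rng.normal(size=n)

mi_xy = mutual_information(x, y)   # direct interaction
mi_yz = mutual_information(y, z)   # direct interaction
mi_xz = mutual_information(x, z)   # indirect, through y only
mi_xw = mutual_information(x, w)   # no interaction (estimator bias only)

print(f"I(x;y)={mi_xy:.3f}  I(y;z)={mi_yz:.3f}  "
      f"I(x;z)={mi_xz:.3f}  I(x;w)={mi_xw:.3f}")
```

In this toy setting the indirect pair (x, z) shows an association far above the background level of the unrelated pair (x, w), so a co-expression threshold alone would report an x-z edge. The indirect pair is nevertheless weaker than either direct pair, which is the kind of ordering that a method ranking mutual information values can exploit.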
