Predicting gene regulatory elements in silico on a genomic scale


Alvis Brazma, Inge Jonassen, Jaak Vilo and Esko Ukkonen. Predicting Gene Regulatory Elements in Silico on a Genomic Scale. Genome Res. 1998 8: 1202-1215. Access the most recent version at doi:10.1101/gr.8.11.1202

© 1998 Cold Spring Harbor Laboratory Press

(EMBL) Outstation–Hinxton, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK; 2 Department of Informatics, University of Bergen, Høyteknologisenteret, N5020 Bergen, Norway; 3 Department of Computer Science, FIN-00014 University of Helsinki, Helsinki, Finland

We performed a systematic analysis of gene upstream regions in the yeast genome for occurrences of regular expression-type patterns with the goal of identifying potential regulatory elements. To achieve this goal, we have developed a new sequence pattern discovery algorithm that searches exhaustively for a priori unknown regular expression-type patterns that are over-represented in a given set of sequences. We applied the algorithm in two cases, (1) discovery of patterns in the complete set of 6000 sequences taken upstream of the putative yeast genes and (2) discovery of patterns in the regions upstream of the genes with similar expression profiles. In the first case, we looked for patterns that occur more frequently in the gene upstream regions than in the genome overall. In the second case, first we clustered the upstream regions of all the genes by similarity of their expression profiles on the basis of publicly available gene expression data and then looked for sequence patterns that are over-represented in each cluster. In both cases we considered each pattern that occurred at least in some minimum number of sequences, and rated them on the basis of their over-representation. Among the highest rating patterns, most have matches to substrings in known yeast transcription factor-binding sites. Moreover, several of them are known to be relevant to the expression of the genes from the respective clusters. Experiments on simulated data show that the majority of the discovered patterns are not expected to occur by chance.

Completely sequenced genomes, together with the emerging DNA microarray technologies enabling the measurement of the gene expression levels in cell cultures (Schena et al. 1995; for a survey, see Ramsay 1998), are opening new possibilities for studying gene regulation. The sequencing of the first eukaryotic genome (the yeast Saccharomyces cerevisiae) was completed in 1996 (Goffeau et al. 1996; Mewes et al. 1997). Data about the expression levels of almost all of the ~6000 yeast genes have been obtained (DeRisi et al. 1997; Velculescu et al. 1997; Wodicka et al. 1997) during 1997. In particular, DeRisi et al. (1997) measured the relative expression levels of the yeast genes at seven consecutive time points (in 2-hr intervals) during a shift from anaerobic to aerobic metabolism (diauxic shift). They showed that some of the genes that are known to be involved in metabolic pathways related to the diauxic shift underwent a very significant change in their expression level during the shift. By treating the expression measurements as a time series, it is possible to cluster genes according to similarities in their expression profiles. It may be hypothesized that at least some of the genes in a cluster are regulated by similar mechanisms.

The transcription regulation mechanisms in eukaryotic genomes are not well understood. Evidently, however, an essential role is played by transcription factors, which can bind to particular DNA sequences, called transcription factor-binding sites, believed to be about 5–25 bp long. In yeast, these sites are usually within several hundred base pairs upstream of the respective ORFs (Mellor 1993). Regular expression type patterns, as well as nucleotide distribution matrices, have both been used for describing transcription factor-binding sites (e.g., see Bucher 1990; Ghosh 1990; Chen et al. 1995; Wingender et al. 1996). Inference of such descriptions from the sequences that are assumed to contain a site for a particular transcription factor is a difficult problem as the consensus of the different binding sites of the same transcription factor is often rather weak. Algorithms have been proposed for inferring such descriptions from sets of relatively small number of sequences (about 20) in which all [...] (e.g., see Stormo and Hartzell 1989; Wolfertstetter et al. 1996; van Helden et al. 1998). More recently, van Helden et al. (1998) and Yada et al. (1998) have proposed methods for the discovery of putative transcription factor-binding sites from larger data sets. Yada et al. (1998) applied their method to analyze about 400 human promotor sequences. Apparently,

4 Corresponding author. E-MAIL vilo@cs.helsinki.fi; FAX 358 9 7084 4441.

GENOME RESEARCH 8:1202–1215 © 1998 by Cold Spring Harbor Laboratory Press ISSN 1054-9803/98 $5.00
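
The abstract describes keeping only patterns that occur in at least some minimum number of sequences and then rating them by over-representation. The Python sketch below illustrates that idea on toy data. It is not the authors' algorithm: the candidate patterns are restricted to fixed-length substrings, the score is a simple ratio of matching fractions with a pseudocount, and all function names and parameters (`candidate_kmers`, `overrepresentation`, `min_support`, `k`) are hypothetical choices made for this illustration.

```python
import re
from collections import Counter


def count_matching_sequences(pattern, sequences):
    """Return how many sequences contain at least one match of the regex."""
    regex = re.compile(pattern)
    return sum(1 for seq in sequences if regex.search(seq))


def candidate_kmers(sequences, k, min_support):
    """Return all length-k substrings found in at least min_support sequences."""
    support = Counter()
    for seq in sequences:
        # A set ensures each sequence contributes at most once per k-mer.
        support.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return [kmer for kmer, n in support.items() if n >= min_support]


def overrepresentation(pattern, target, background):
    """Ratio of matching fractions, target vs. background.

    Assumption: the +1 pseudocount on the background avoids division by
    zero; the paper's actual rating function is not reproduced here.
    """
    frac_target = count_matching_sequences(pattern, target) / len(target)
    frac_background = (count_matching_sequences(pattern, background) + 1) / (
        len(background) + 1
    )
    return frac_target / frac_background


def rank_patterns(target, background, k=4, min_support=2):
    """Score every candidate k-mer and return (score, pattern), best first."""
    scored = [
        (overrepresentation(p, target, background), p)
        for p in candidate_kmers(target, k, min_support)
    ]
    return sorted(scored, reverse=True)


# Toy data (invented for illustration, not real yeast sequences).
upstream = ["TTACGTGA", "GGACGTCC", "ACGTTTTT"]
background = ["TTTTGGGG", "CCCCAAAA", "GGGGTTTT"]
ranking = rank_patterns(upstream, background)
```

Because `count_matching_sequences` takes a regular expression, the same scoring function also accepts patterns with character groups such as `AC[GC]T`, which is closer in spirit to the regular expression-type patterns the paper refers to.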
