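Data Analysis of 1,888 Full-Length cDNA Sequences from Wild Rice O. rufipogon W1943
NCGR
2008-09-03

Background
• Wild rice O. rufipogon (AA genome) is the ancestral species most closely related to cultivated rice [1,2].
• It has many agronomic traits superior to those of cultivated rice, such as drought tolerance and salt tolerance [3,4].
• Public databases hold a large amount of genomic sequence for cultivated rice [5,6], as well as abundant cDNA resources [7,8].
• Sequence and clone resources for wild rice are very scarce; the only sizeable set is 5,211 leaf ESTs from Oryza minuta (BBCC genome) [9].

Current Status and Aims
• NCGR wild rice resource: 1,888 unique O. rufipogon W1943 cDNA clones have been cloned and accurately sequenced.
• By comparing the W1943 cDNA sequences with indica and japonica cDNA sequences, we aim to compile novel rice genes, genes potentially specific to wild rice, genes with W1943-specific splicing patterns, genes with high tissue-specific expression, and microRNA-related genes, and so provide leads for anyone interested in further study.

Figure: BLAST of the 1,888 W1943 cDNAs against cultivated rice genomic sequences and cDNAs
Figure: SSR comparison of the 1,888 W1943 cDNAs with indica and japonica cDNAs

I. Genes Not Matching the japonica Genome
• Definition: cDNAs that could not be mapped to the O. sativa japonica Nipponbare genome sequence but are homologous to the indica 93-11 genome sequence, to rice ESTs, or to ESTs from other grasses (Poaceae). Genes with homology to bacterial sequences were removed.
• Interpretation: these genes may fall within gaps in the japonica genome sequence, be specific to indica, or be specific to wild rice.

Name      93-11 contig  EST or mRNA hit                     Protein
CT842002  Contig005912  AK241925                            -
CT842007  Contig008507  CT856206                            -
CU405940  Contig001402  AK103326                            -
CU406172  Contig014596  AK242967                            -
CT842006  Contig000383  AK111647                            GTP-binding protein
CU861753  Contig000750  AK099287                            ring-box protein
CU406308  Contig000444  AK070131                            -
CT841996  Contig002576  CT834800                            -
CU406568  Contig003848  AK064050                            Bowman Birk trypsin inhibitor
CU406582  Contig000444  AK107776                            -
CU406596  Contig001277  AK242711                            -
CT842008  Contig008507  CT856206                            -
CU406895  Contig003011  CT859459                            -
CU861744  Contig000750  AK099287                            ring-box protein
CU405657  -             CT856885                            -
CT841712  -             CA766528                            -
CU405768  -             CT836656                            60S ribosomal protein L7A
CU405675  -             CA756235                            60S ribosomal protein L17
CU406202  -             NM_001063334                        -
CU406924  -             AC145809                            -
CU405898  -             CN130755.1 (Sorghum bicolor)        ribulose-bisphosphate carboxylase
CU406778  -             BE429292.1 (Triticum turgidum)      -
CU861677  -             FF534517.1 (Manihot esculenta)      -
CT841912  -             EH277383.1 (Spartina alterniflora)  -

II. Novel Rice Genes
• Definition: cDNAs that map to a homologous region of the cultivated rice genome sequence but have no homolog among known rice expressed sequences. Searches against rice MPSS data found almost no matching signatures.
• Interpretation: novel rice genes. Either their expression in cultivated rice is too low for them to be cloned easily, or they are specific to wild rice.

Name      Len (bp)  Chr location  Identity (%)
CU406910  656       10            99
CU406138  568       02            99
CU406022  543       12            99
CU405757  477       04            100
CU406921  414       02            100
CU406535  389       02            100
CU406832  530       10            92
CU406871  458       01            84
CU861804  383       06            99
CU861721  554       01            100

Name      Len (bp)  Chr location  Identity (%)  Antisense  Protein
CU405785  727       05            99            CA764081   DNA-directed RNA polymerase 3
CU861795  475       09            79            CT858901   -
CU406355  837       12            97            AK107125   AP2 domain, putative
CU406396  520       02            99            AK103485   -
CT841800  941       11            99            AK121962   patatin, putative
CU861688  693       08            99            AK109182   -
CT841937  1552      08            98            AK106713   -

Note: no protein homolog was found for any of these 17 genes. The 7 genes in the second table form antisense RNA pairs with known rice expressed sequences.

III. Genes with W1943-Specific Splicing Patterns
• Definition: cDNAs that are fully identical (100% identity) to the cultivated japonica genome sequence and homologous to cultivated rice expressed sequences, but with a unique alternative splicing (AS) pattern.
• Interpretation: either this AS isoform has simply not yet been cloned, or it is unique to wild rice.

Name      Len (bp)  Chr location  No. of exons           Protein
CT841942  978       07            6 (1st intron: GC-AG)  -
CU406810  958       06            6 (1st intron: GT-TG)  dual-specificity phosphatase protein
CT841893  1011      01            6                      drought-induced protein
CT841874  1369      01            4                      vesicle transport protein
CU405853  1377      05            1                      dehydration-responsive protein
CU405923  639       07            1                      IAA amidohydrolase
CU406279  648       05            1                      -
CU406025  839       02            1                      -
CT841561  740       06            2                      -
CU406579  468       09            2                      -
CU406935  1345      01            2                      -
CU406600  1107      01            2                      -
CU405570  952       01            2                      -
CU406091  893       01            3                      -
CU406134  665       10            3                      -

Figure: unique splicing patterns of some W1943 cDNAs. Panel labels:
• splicing creates an intron; an exon arises within an intron
• an exon is spliced to create an intron
• an exon arises in an intergenic region
• two exons merge into a single exon

IV. Genes with High Tissue-Specific Expression
Selection criteria:
ⅰ: The expression level of each gene should exceed 100 tpm (transcripts per million) in at least one tissue.
ⅱ: If the gene is expressed in several different tissues, the highest expression level should account for more than 75% of the total across all tissues.
ⅲ: The ratio of the highest to the second-highest expression level should be over 10.
41 putative tissue-specific genes [10]

Name            Len (bp)  ORF (bp)  Tissue                Protein
orw1943s101k15  619       51-434    leaf                  light-regulated protein
orw1943c102c24  833       62-463    leaf                  subunit of ribulose-1,5-bisphosphate carboxylase
orw1943s101p08  1544      111-650   leaf                  glycolate oxidase
orw1943s101h18  912       120-752   leaf                  H+-transporting ATP synthase chain 9-like protein
orw1943c113b17  837       108-623   leaf                  retrotransposon protein, putative, unclassified
orw1943c002g13  843       58-576    leaf                  alanine aminotransferase
orw1943c003d24  404       18-227    leaf                  ferredoxin-NADP(H) oxidoreductase
orw1943c104a05  985       70-777    leaf                  photosystem-1 F subunit precursor
orw1943c006o21  625       89-310    root                  metallothionein-like protein type 1
orw1943c103g16  916       110-553   root                  MAPEG family
orw1943c003o09  619       111-311   root                  Potato inhibitor I family
orw1943c112g14  1008      58-825    root                  receptor-like protein kinase
orw1943c104g04  805       115-618   root                  pathogenesis-related protein PR-1a
orw1943c006h22  664       1-453     root                  pathogenesis-related protein 4b
orw1943s101p06  512       36-353    root                  N-carbamyl-L-amino acid amidohydrolase
orw1943c111d19  1399      59-1312   germinating seedling  putative alpha-galactosidase
orw1943c002o05  769       82-432    germinating seedling  nonspecific lipid-transfer protein 2 precursor
orw1943c002p22  518       92-331    meristematic tissue   metallothionein
orw1943c109d02  682       45-230    mature pollen         putative lipase

V. Potential miRNAs and miRNA Target Genes
• microRNAs: small RNAs of 21-23 nt, processed by the Dicer enzyme from 70-90 nt single-stranded RNA precursors that form a hairpin structure. They are conserved across species.
• Mode of action: a miRNA binds its target mRNA (mostly in the 3' UTR) through imperfect complementarity and represses protein translation without affecting transcript stability; a few miRNAs may induce degradation of the target mRNA in a siRNA-like manner. The effect depends on the degree of complementarity.
• Screening workflow and criteria [11]:

Flowchart labels:
Predicted ORF 100 aa; no ORF predicted
Seqs with non-coding protein
BLASTx
All previously known plant miRNAs
BLAST
Seqs with ≤3 mismatches against known miRNAs
Predicted secondary RNA structure by mFold 3.2

1. The pre-miRNA sequence can fold into an appropriate hairpin secondary structure that contains the ~22 nt mature miRNA sequence within one arm of the hairpin.
2. miRNA precursors with secondary structures had higher negative minimal free energies than other types of RNAs.
3. miRN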