Transcriptome Data Analysis: Interpretation and Worked Examples


罗奇斌 (Luo Qibin), QY NODE (奇云诺德), Technical University of Munich, Germany

[Figure slides: Second-generation sequencers; Routine analysis; Experimental workflow]

Tools required for the analysis
• Bowtie software - http://bowtie-bio.sourceforge.net/index.shtml/
• SAM tools - http://samtools.sourceforge.net/
• TopHat software - http://tophat.cbcb.umd.edu/
• Cufflinks software - http://cufflinks.cbcb.umd.edu/
• CummeRbund software - http://compbio.mit.edu/cummeRbund/
* Linux, 64-bit CPU, 16 GB memory

• RNA-seq is a powerful tool for detecting the whole transcriptome in cells and tissues.
• Earlier RNA-seq research focused on mRNA, but recent studies show that some functional noncoding transcripts and some protein-coding RNAs lack a poly(A) tail.

Content of the transcriptome
1. Genes: expression, alternative splicing
2. Noncoding RNA: snoRNA, mRNA-like ncRNA, snRNA, some antisense transcripts, pseudogenes, retrotransposons, and other functional RNAs
3. Some repeat elements

Biological replicates and standards for RNA-seq
1. Use at least two biological replicates, unless the design is a dense time course (overlapping time points with high temporal resolution). Technical replicates are not required.
2. For species with well-annotated genes, a study that only compares expression quantitatively needs more than 20M usable reads; a transcriptome used to annotate a genome needs more than 100M.
3. Ideally include absolute-quantification controls (spike-ins) of differing concentration and length, to assess mapping quality, sequencing uniformity, and RNA-seq quantification performance.
4. The 3'-end/5'-end ratio is a key indicator of RNA integrity (the ideal value is 1) and should also be computed and assessed.
5. Report everything in full: sample processing workflow, library construction workflow, sequencing instrument, sequencing type, analysis software, key sample-assessment metrics, and key results such as RPKM values.

[Figure slides: Background; mRNA-seq]

Mapping and Assembly tools
• BWA - a fast, lightweight tool that aligns relatively short sequences (queries) to a sequence database (target), such as the human reference genome.
• SeqMap - a tool for mapping millions of short sequences to the genome.
• MAQ - stands for Mapping and Assembly with Quality. It builds an assembly by mapping short reads to reference sequences.
• ERANGE - mapping and quantifying mammalian transcriptomes by RNA-Seq.
• Cufflinks - assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples.
• iAssembler - a standalone package to assemble ESTs generated using Sanger and/or Roche-454 pyrosequencing technologies into contigs.
• MapPER - an RNA-seq paired-end read (PER) protocol.

Support splice mapping and quantification
• TopHat - a fast splice junction mapper for RNA-Seq reads.
• SpliceMap - a de novo splice junction discovery tool. It offers high sensitivity and supports arbitrarily long RNA-seq read lengths.
• MapSplice - a splice junction mapping tool.
• Trinity RNA-Seq Assembly - software solutions targeted at the reconstruction of full-length transcripts and alternatively spliced isoforms from Illumina RNA-Seq data.
• PALMapper - a combination of the spliced alignment method QPALMA with the short read alignment tool GenomeMapper.

RNA-Seq Data Analysis Tools

Web-based tools
• rQuant.web - a web service that provides convenient access to tools for the quantitative analysis of RNA-Seq data.
• Galaxy - mapping pipeline for Illumina, 454, and SOLiD sequencing data.
• UCSC Genome Browser - this site contains the reference sequence and working draft assemblies for a large collection of genomes. It also provides portals to the ENCODE and Neandertal projects.
• Bioconductor - an open source and open development software project for the analysis and comprehension of genomic data.
• ExpEdit - a web application for assessing RNA editing in human at known or user-specified sites, supported by transcript data obtained from RNA-Seq experiments.
• Myrna - a cloud computing tool for RNA sequencing data.
• GenePattern - a powerful genomic analysis platform that provides access to more than 100 tools for gene expression analysis, proteomics, SNP analysis, and common data processing tasks.

Others
• Scripture - a method for transcriptome reconstruction that relies solely on RNA-Seq reads and an assembled genome to build a transcriptome ab initio.
• CisGenome - an integrated tool for tiling array, ChIP-seq, genome, and cis-regulatory element analysis.
• ArrayExpressHTS - an R-based pipeline for pre-processing, expression estimation, and data quality assessment of high-throughput sequencing transcriptional profiling (RNA-seq) datasets.
• RSEQtools - a modular framework to analyze RNA-Seq data using compact, anonymized data summaries.
• RNA-MATE - a recursive mapping strategy for high-throughput RNA-sequencing data.
• SAMMate - an RNA-seq analysis pipeline that processes SAM/BAM files and is compatible with both single-end and paired-end sequencing technologies.
• Oqtans - Online Quantitative Transcriptome Analysis.
• DESeq - digital gene expression analysis based on the negative binomial distribution.
• edgeR

Gene expression normalization
Fragment reads:
RPKM: quantified transcript levels in reads per kilobase of exon model per million mapped r...
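As a rough illustration of the RPKM normalization named above, here is a minimal Python sketch. It assumes the commonly used form RPKM = 10^9 x C / (N x L), where C is the number of reads mapped to a gene's exons, N is the total number of mapped reads in the sample, and L is the summed exon length in base pairs. The function name rpkm, the gene names, and all numbers are hypothetical and do not come from the slides.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads.

    Assumed formula: 1e9 * C / (N * L), with C = reads on the gene's exons,
    N = total mapped reads in the sample, L = summed exon length in bp.
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical example: two genes in one sample with 20M mapped reads.
total_mapped = 20_000_000
genes = {
    "geneA": {"reads": 500, "exon_bp": 2000},
    "geneB": {"reads": 1200, "exon_bp": 6000},
}

rpkm_values = {
    name: rpkm(g["reads"], g["exon_bp"], total_mapped)
    for name, g in genes.items()
}

for name, value in rpkm_values.items():
    print(f"{name}\t{value:.2f}")
```

In this made-up example geneB has more reads than geneA but a lower RPKM, because its exon model is three times longer. Correcting for gene length and sequencing depth in this way is the point of the normalization.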
