Lecture 19 (Penn State Next-Generation Sequencing Data Analysis Course)


2013 - BMMB 597D: Analyzing Next Generation Sequencing Data
Week 10, Lecture 19
Nick Stoler
The Huck Institutes of the Life Sciences
Penn State

Sequence data to genotypes
● A common sequencing workflow
  - Sequencing reads (FASTQ): a list of short sequences
  - Alignments (SAM/BAM): a list of short sequences and where they are in the genome
  - Variant calls (VCF): a list of locations in the genome and what the base is at each

What are variant calls?
● Naive variant calling
  - Check all the reads that cover base chr1:291
  - Add up the bases at chr1:291
  - e.g. 10 A's, 2 G's
    ∙ Is this an A/G heterozygous site or two sequencing errors?
● Actual variant callers
  - Estimate likelihood of a variant site vs a sequencing error
    ∙ Sequencing error rate
    ∙ Quality scores

VCF: Variant Call Format
● Represent a list of locations and the variant call at each
  - Simple, right?
● Yes and no.
  - Simple foundation
    ∙ Location and base
  - Complex “bonus features”
    ∙ Indels, structural variants, etc.
    ∙ Multiple samples
    ∙ Haplotype phasing

VCF: The simple part
● location, reference base, your base
  - CHROM/POS, REF, ALT
  - a lot like wgsim's mutations.txt

VCF: The rest

VCF: The full column list
● Variant call confidence
  - like Phred score and MAPQ

VCF: Multiple variants
● What if your reads have more than 1 base at one location?
  - wgsim's mutations.txt
    ∙ IUPAC notation
● VCF just gives comma-separated lists
  - REF  ALT
  - A    A,C

VCF: Complex variants
● Can show short indels
  - C    CT  (insert T)
  - ACG  A   (delete CG)

VCF: Multiple samples
● VCF can have a variable number of columns!
● Column headings are the sample names

● VCF can represent SNV calls
● and much, much more
  - Indels (G  GC)
  - Multiple variants per site (in ALT column)
  - Multiple samples (SAMPLE columns)
● Check poster for quick overview
  -
● Check full specification for details
  -

● Samtools mpileup → BCF
  - BCF is to VCF as BAM is to SAM
    ∙ (roughly)
  - The BCF doesn't hold actual calls
    ∙ encodes likelihoods for all variants
● Bcftools view → VCF
  - Performs the actual variant calling
  -u: uncompressed output
  -D: include read depth in output
  -f: use ../refs/sc.fa as reference
  -v: only output non-reference sites
  -c: do SNP calling
  -g: call genotypes at variant sites

Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics (2011) 27(21):2987-2993.

More mpileup tricks
● Combine multiple BAM files into one BCF
● Only include one region

Homework 19
● Take your mutations.txt file from wgsim (or create another one) and create a partial VCF file from the first 10 lines (but skip ones with indels)
  - Only the last header line (#CHROM)
  - Only the first 5 columns
  - Refer to IUPAC nucleic acid codes for non-ACGT bases
    ∙ means it generated reads with both A and T at this location
● Use samtools/bcftools to create a full VCF file from the alignments you created in the previous homework
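
The naive variant calling idea from the "What are variant calls?" slide (add up the bases covering one position, then decide what to report) can be sketched in a few lines of Python. This is only an illustration, not what samtools/bcftools do: real callers estimate likelihoods from error rates and quality scores. The function names `naive_call` and `vcf_fields`, the `min_fraction` threshold, and the choice of A as the reference base at chr1:291 are all assumptions made for this example.

```python
from collections import Counter


def naive_call(bases, ref, min_fraction=0.2):
    """Count the bases seen at one site and list non-reference bases
    whose share of the depth is at least min_fraction.

    min_fraction is a hypothetical cutoff chosen for illustration only.
    """
    counts = Counter(b.upper() for b in bases if b.upper() in "ACGT")
    depth = sum(counts.values())
    if depth == 0:
        return counts, []
    alts = sorted(
        base
        for base, n in counts.items()
        if base != ref.upper() and n / depth >= min_fraction
    )
    return counts, alts


def vcf_fields(chrom, pos, ref, alts):
    """Build the first five VCF columns: CHROM, POS, ID, REF, ALT.

    '.' marks a missing value; multiple ALT bases are comma-separated.
    """
    alt = ",".join(alts) if alts else "."
    return "\t".join([chrom, str(pos), ".", ref, alt])


# Example from the slide: 10 A's and 2 G's covering chr1:291.
# Assumption: the reference base at this position is A.
pileup = "A" * 10 + "G" * 2
counts, alts = naive_call(pileup, ref="A", min_fraction=0.1)
print(dict(counts))
print(vcf_fields("chr1", 291, "A", alts))
```

With the cutoff at 0.1 the two G's (2 of 12 reads) are reported as an ALT base; with the default of 0.2 they are treated as noise. That sensitivity to an arbitrary threshold is exactly the weakness the slide points out, and why actual callers model the sequencing error rate instead.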
