ChIP-seq

ChIP-seq data analysis
Chromatin ImmunoPrecipitation assay to understand transcriptional regulation
Functional genomics
Bioinformatics

Problems
• What is ChIP-seq technology? What biological information can be obtained from ChIP-seq data?
• What is peak calling? How is it performed?
• What is a TFBS? Is it possible to obtain all binding sites across a genome for one transcription factor? Please think about it and give your proposal.
• Keywords: TFBS, ChIP-seq, peak calling, affinity, motif, PWM

ChIP-seq protocol

Brainstorming
• How to analyze the data from a ChIP-seq experiment?
– Tasks?
– Pipelines?
– Methods?

ChIP-seq big picture
• Combine "Next-Generation" sequencing with Chromatin Immunoprecipitation to identify genome-wide chromatin binding sites. [Since 2007]
• Select (and identify) fragments of DNA that interact with specific proteins such as:
– Transcription factors
– Nucleosomes
– Histone modifications
– Chromatin remodeling enzymes
– Chaperones
– RNA polymerase (survey actively transcribed portions of the genome)
– DNA polymerase (investigate DNA replication)
– DNA repair enzymes
– Or fragments of DNA that are modified: e.g. CpG methylation

Analysis of ChIP-seq data
• Experimental design
– Controls and replicates
• QC / Read processing
– Library QC
– Alignment and filtering
– QC measures and assessment
• Peak calling
– Peak callers
• Differential binding analysis
– Occupancy-based analysis
– Affinity-based analysis
• Validation and downstream analysis
– Motif analysis
– Annotation
– Integrating binding and expression data

ChIP-chip (ChIP2): "the pre-sequencing technology"
[Figure labels: WGA; ChIP; Input; Label w/ Cy Dyes; Apply to microarray; Compare Red/Green signal intensities to identify binding sites]
- Limited to organisms with available genomic microarrays (or you'll need to make your own)
- Microarrays with oligos covering whole mammalian genomes are very expensive (many arrays per sample)
- Can be economical for model organisms with small genomes & commercially available arrays (or for limited analysis: e.g. promoter regions). Whole genome amplification (WGA) allows good probe signal from small starting samples.
- Subject to hybridization curve limitations & hyb. artifacts
Adapted from ChIP Workshop by Charlie Nicolet, Heather N. Witt & Peggy Farnham

Prof Xiaole Shirley Liu (刘小乐)
• Attended Tianjin Nankai High School; entered the Department of Biology at Peking University in 1992; transferred in 1994 to Smith College (USA), where she double-majored in biochemistry and computer science and graduated in the top 1% of her class by grade point average.
• After receiving a PhD in biomedical informatics with a PhD minor in computer science from Stanford University in 2002, she was hired directly as a tenure-track assistant professor at Harvard University.
• Tenured professor, Department of Biostatistics and Computational Biology, Harvard School of Public Health; Director, Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute; Professor, Department of Bioinformatics, Tongji University, and Changjiang Scholar Chair Professor.
• Bioinformatics and computational biology research on gene regulatory mechanisms. Motif discovery: BioProspector, MDscan, MotifRegressor, CompareProspector, AdaBoost

ChIP-seq
[Figure labels: Release DNA; Immunoprecipitate; Prepare sequencing library; High-throughput sequencing of DNA ends; Map sequence tags to genome & identify peaks. POI = protein of interest]
Adapted from slide set by: Stuart M. Brown, Ph.D., Center for Health Informatics & Bioinformatics, NYU School of Medicine

ChIP-Seq advantages
• Doesn't require a specially-constructed microarray
• Works with any sequenced genome (better if it's also well annotated)
• Can be economical:
– At ~160 million reads, one lane can give you all the binding sites in the genome
– Multiplexing can allow multiple samples per lane
Limitations:
• Like arrays, can't make sense of repeat regions.
• Always genome wide:
– … which is great for transcription factor binding sites & some histone modifications (where only a few places in the genome have many reads, over a low reads/kb background)
– … can be problematic for very common events like nucleosome positions & CpG methylation (where most places in the genome have roughly equal reads/kb, so even 160 M reads give a read count at any one locus that is too low to quantitate).

Summary on ChIP-seq
• High resolution, low noise, high genomic coverage compared with ChIP-chip
• The most widely used procedure for genome-wide assays of protein-DNA interaction
• Used to study histone modifications in epigenetics research

Data processing steps
[Figure: schematic of ChIP-seq experiments: ChIP, sequencing, alignment, peak-finding] [Park et al, 2009]
Bailey T, et al. PLoS Computational Biology. 2013, 9(11): e1003326

Workflow in ENCODE

Sequencing Depth
• Depends mainly on the size of the genome and the number and size of the binding sites of the protein
• Human TF or histone marks: thousands of binding sites, 20 million reads may be adequate
• RNA Pol II: more binding sites, so more reads, up to 60 million
• Take into account library complexity

Read Mapping and Quality Metrics
• Reads are filtered by applying a quality cutoff, trimming the low-quality ends of reads
• Remaining reads should be mapped using one of the available mappers such as Bowtie, BWA, SOAP, MAQ
• Allow a (user-settable) number of mismatches in the reads [parameter]
• Check the percentage of uniquely mapped reads: 70% is normal, 50% may be cause for concern
• Multi-mapping reads will be ignored by most peak-calling algorithms
• Assess the signal-to-noise ratio (SNR) via quality metrics: CHANCE (assesses IP strength)

Peak Calling
• Finding regions with significant numbers of mapped reads (peaks) with a window-based method
• Balance between sensitivity and specificity, which depends on the peak-calling algorithm and the normalization method
• Punctate (point-source) transcription factors
• Peak callers SPP and MACS use cross-correlation to find the lag between reads mapped to the minus and plus strands
– Signal smoothing
– Background modeling: remove noise
– Statistical assessment: Poisson, local Poisson (MACS), negative binomial (CisGenome), zero-inflated negative binomial (ZINBA), Hidden Markov Model (HPeak), BayesPeak
Basic idea: Count the number of reads in windows and determine whether this number is above background; if so, define that region as bound
• Broad peaks: USeq, QuEST
• Does the peak finding method use control data? [Chen et al, 2012
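
The basic idea stated under Peak Calling (count reads in windows and test whether the count is above background) can be illustrated with a small Python sketch. This is not how MACS, SPP, or any other named peak caller is implemented. It assumes a single global Poisson background estimated from the average reads per window, with no control sample, no strand shift, and no multiple-testing correction. The function names `poisson_sf` and `call_windows`, the 200 bp window, and the 1e-5 cutoff are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k-1).

    Adequate for a sketch; very small tail probabilities lose precision.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):      # accumulate P(X = 1) .. P(X = k-1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def call_windows(read_starts, genome_length, window=200, p_cutoff=1e-5):
    """Flag fixed-size windows whose read count exceeds a global Poisson background.

    Assumption (illustrative only): the background rate is the mean number
    of reads per window across the whole sequence.
    """
    counts = Counter(pos // window for pos in read_starts)
    n_windows = math.ceil(genome_length / window)
    lam = len(read_starts) / n_windows
    hits = []
    for idx in sorted(counts):
        p = poisson_sf(counts[idx], lam)
        if p < p_cutoff:
            start = idx * window
            end = min(start + window, genome_length)
            hits.append((start, end, counts[idx], p))
    return hits


# Toy data: one background read every 100 bp on a 10 kb sequence,
# plus 30 extra reads piled up at positions 5000..5029.
reads = list(range(0, 10000, 100)) + list(range(5000, 5030))
hits = call_windows(reads, genome_length=10000)
for start, end, n, p in hits:
    print(f"window {start}-{end}: {n} reads, p = {p:.3g}")
```

On the toy data only the window covering the pile-up is flagged as bound. Real peak callers refine this scheme with the steps listed above: signal smoothing, a local or control-based background model, and a more careful statistical assessment.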
