Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms

Peter A. C. 't Hoen1,*, Yavuz Ariyurek1, Helene H. Thygesen1, Erno Vreugdenhil2, Rolf H. A. M. Vossen1, Renée X. de Menezes1, Judith M. Boer1, Gert-Jan B. van Ommen1 and Johan T. den Dunnen1

1The Center for Human and Clinical Genetics and the Leiden Genome Technology Center, Leiden University Medical Center and 2The Department of Medical Pharmacology from the Leiden/Amsterdam Center for Drug Research, Leiden, The Netherlands

*To whom correspondence should be addressed. Tel: +31 71 526 9421; Fax: +31 71 526 8285; Email: p.a.c.hoen@lumc.nl

Received August 12, 2008. Revised September 16, 2008. Accepted September 29, 2008.

ABSTRACT

The hippocampal expression profiles of wild-type mice and mice transgenic for C-doublecortin-like kinase were compared with Solexa/Illumina deep sequencing technology and five different microarray platforms. With Illumina's digital gene expression assay, we obtained 2.4 million sequence tags per sample, their abundance spanning four orders of magnitude. Results were highly reproducible, even across laboratories. With a dedicated Bayesian model, we found differential expression of 3179 transcripts with an estimated false-discovery rate of 8.5%. This is a much higher figure than found for microarrays. The overlap in differentially expressed transcripts found with deep sequencing and microarrays was most significant for Affymetrix. The changes in expression observed by deep sequencing were larger than observed by microarrays or quantitative PCR. Relevant processes such as calmodulin-dependent protein kinase activity and vesicle transport along microtubules were found affected by deep sequencing but not by microarrays. While undetectable by microarrays, antisense transcription was found for 51% of all genes and alternative polyadenylation for 47%. We conclude that deep sequencing provides a major advance in robustness, comparability and richness of expression profiling data and is expected to boost collaborative, comparative and integrative genomics studies.

INTRODUCTION

Gene expression microarrays are at present the default technology for transcriptome analysis. Since they rely on sequence-specific probe hybridization, they suffer from background and cross-hybridization problems and measure only the relative abundances of transcripts (1). Moreover, only predefined sequences are detected. In contrast, tag-based sequencing methods like SAGE (Serial Analysis of Gene Expression) measure absolute abundance and are not limited by array content (2). However, laborious and costly cloning and sequencing steps have thus far greatly limited the use of SAGE. This has radically changed with the introduction of deep sequencing technology, enabling the simultaneous sequencing of up to millions of different DNA molecules. The shared idea behind the different deep sequencing approaches is the clonal detection of single DNA molecules at physically isolated locations (3–5). We used the Solexa/Illumina 1G Genome Analyzer, in which adapter sequences, ligated to both ends of the DNA molecule, are bound to a glass surface coated with complementary oligonucleotides. This is followed by solid-phase DNA amplification and sequencing-by-synthesis (6). The system yields millions of short reads (currently up to 36 bp), and is therefore very suitable for tag-based transcriptome sequencing. The technology is also referred to as Digital Gene Expression tag profiling (DGE), and is essentially an improved version of the earlier Massively Parallel Signature Sequencing (MPSS) technology (3,7). The first steps of the procedure are similar to classical LONG-SAGE. Two restriction enzymes are used to generate tags, cutting at the most 3' CATG and 17 bp downstream of the first enzyme site. Unlike in classical SAGE, tags are neither concatenated nor cloned, but sequenced immediately. The unprecedented sequencing depth now enables the analysis of individual biological samples, while pooling of samples was previously the only affordable option in SAGE. Our results include a striking example of the intrinsic hazards of pooling in expression profiling.

The biological question addressed in the current study was the identification of transcripts differentially expressed in the hippocampus between wild-type and transgenic mice overexpressing a
splice variant of the doublecortin-like kinase-1 (Dclk1) gene. This splice variant, C-doublecortin-like kinase (DCLK)-short, makes the kinase constitutively active (8), and causes subtle behavioral phenotypes (Schenk et al., in preparation). The exact same RNA samples have been analyzed before on five different genome-wide microarray expression profiling platforms (9), which detected few differences in expression between the two groups. We report here that DGE detects a lot more small, yet significant differences between the two groups of mice, including those in antisense transcripts and transcripts with different 3'-untranslated regions (UTRs). Furthermore, we discuss the advantages of deep sequencing over microarray expression profiling.

MATERIALS AND METHODS

Samples

Wild-type male C57/BL6j and transgenic male mice overexpressing DCLK-short with a C57/BL6j background were individually housed 7 days prior to the start of the experiment. Animals were housed under standard conditions, 12 h/12 h light/dark cycle and had access to food and water ad libitum. Wild-type (N = 4) and transgenic (N = 4) tissue samples were collected by taking the brain from