Machine Learning manuscript No.
(will be inserted by the editor)

Machine learning for targeted display advertising: Transfer learning in action

C. Perlich, B. Dalessandro, O. Stitelman, T. Raeder, F. Provost

Received: date / Accepted: date

C. Perlich, B. Dalessandro, O. Stitelman, T. Raeder
M6D Research
37 E. 18th St.
New York, NY, USA
E-mail: {claudia, briand, ori, troy}@m6d.com

F. Provost
Leonard N. Stern School of Business, New York University
44 W. 4th St.
New York, NY, USA
E-mail: fprovost@stern.nyu.edu

Abstract

This paper presents a detailed discussion of problem formulation and data representation issues in the design, deployment, and operation of a massive-scale machine learning system for targeted display advertising. Notably, the machine learning system itself is deployed and has been in continual use for years, for thousands of advertising campaigns (in contrast to simply having the models from the system be deployed). In this application, acquiring sufficient data for training from the ideal sampling distribution is prohibitively expensive. Instead, data are drawn from surrogate domains and learning tasks, and then transferred to the target task. We present the design of this multistage transfer learning system, highlighting the problem formulation aspects. We then present a detailed experimental evaluation, showing that the different transfer stages indeed each add value. We next present production results across a variety of advertising clients from a variety of industries, illustrating the performance of the system in use. We close the paper with a collection of lessons learned from the work over half a decade on this complex, deployed, and broadly used machine learning system.

1 Introduction

Advertising is a huge industry (around 2% of U.S. GDP), and advertisers are keenly interested in well-targeted ads. Online display advertising is a key subfield of the industry where ad targeting holds both promise and challenges. It is promising because of the wealth of data that can be brought to bear to target ads. It is challenging because the display advertising ecosystem is an extremely complicated system where accessing the data and delivering the ads can involve dozens of different corporate players, in contrast to, for example, search advertising. This paper deals with a particularly challenging segment of the online display advertising market: customer prospecting. Customer prospecting involves delivering advertisements to consumers who have no previously observed interactions with the brand, but are good prospects, i.e., are likely to become customers after having been shown an advertisement.

Display advertising has matured rapidly over the past several years, with the proliferation of real-time bidding exchanges (RTBs) that auction off website real estate for placing online display ads in real time. This has created an efficient method for advertisers to target advertisements to particular consumers (see e.g. [22]). As is standard in the industry, let's call the showing of a display ad to a particular consumer an "impression." In each RTB the good being auctioned is an impression opportunity: a particular space or "slot" on a particular webpage at a particular instant with a particular consumer viewing it. The auctions are run in real time, being triggered the instant a consumer navigates to the page and taking place in the time it takes the page to render fully in the consumer's browser. At the time of auction, information about the location of the potential advertisement and an identifier of the particular internet user are passed to all potential bidders in the form of a bid request. Advertisers often supplement these data with information previously collected or purchased about the specific consumer and website. When an auction is initiated, a potential advertiser must determine if it wants to bid on this impression, how much it would like to bid, and what advertisement it would like to display if it wins the auction. There are billions of such real-time auctions daily, and advertisers require large-scale and efficient systems to make these decisions in milliseconds.

This complicated ecosystem invites machine learning to play a key role in the ad optimization process, particularly because of the simultaneous availability of (i) massive, very fine-grained data on consumer behavior, (ii) data on the brand-oriented actions of consumers, via instrumentation of purchase systems, and (iii) the ability to make advertising decisions and deliver advertisements in real time. The work we describe in this paper is one such massive-scale machine learning system that is deployed and in regular use by M6D, a company that finds prospective customers for targeted display advertising campaigns and executes those campaigns on the many advertising exchanges. Notably, the learning system itself is deployed, in contrast to the much more common case of deploying the models resulting from machine learning (as well as human model curation). Each week, this learning system builds thousands of models automatically, driving the advertising campaigns for major marketers across many industries.

This paper's main contribution to the machine learning literature is to use this application domain to demonstrate how data characteristics and availability constraints are translated and integrated into a complex problem formulation and finally implemented successfully as a robust learning system. We dig into some seldom discussed aspects of problem formulation for machine learning applications, focusing much of the discussion on the fact that, for pragmatic reasons, the system draws data from multiple, different sampling distributions to compose the machine learning solution.

As mentioned at the outset, our task is to identify prospective customers: online consumers who are most likely to pur