Man vs. Machine: Practical Adversarial Detection of Malicious Crowdsourcing Workers

This paper is included in the Proceedings of the 23rd USENIX Security Symposium.
August 20–22, 2014 • San Diego, CA
ISBN 978-1-931971-15-7
Open access to the Proceedings of the 23rd USENIX Security Symposium is sponsored by USENIX

Man vs. Machine: Practical Adversarial Detection of Malicious Crowdsourcing Workers

Gang Wang†, Tianyi Wang†‡, Haitao Zheng† and Ben Y. Zhao†
†Computer Science, UC Santa Barbara (University of California, Santa Barbara)
‡Electronic Engineering, Tsinghua University
{gangw, tianyi, htzheng, ravenben}@cs.ucsb.edu

Abstract

Recent work in security and systems has embraced the use of machine learning (ML) techniques for identifying misbehavior, e.g. email spam and fake (Sybil) users in social networks. However, ML models are typically derived from fixed datasets, and must be periodically retrained. In adversarial environments, attackers can adapt by modifying their behavior or even sabotaging ML models by polluting training data.

In this paper¹, we perform an empirical study of adversarial attacks against machine learning models in the context of detecting malicious crowdsourcing systems, where sites connect paying users with workers willing to carry out malicious campaigns. By using human workers, these systems can easily circumvent deployed security mechanisms, e.g. CAPTCHAs. We collect a dataset of malicious workers actively performing tasks on Weibo, China’s Twitter, and use it to develop ML-based detectors. We show that traditional ML techniques are accurate (95%–99%) in detection but can be highly vulnerable to adversarial attacks, including simple evasion attacks (workers modify their behavior) and powerful poisoning attacks (where administrators tamper with the training set). We quantify the robustness of ML classifiers by evaluating them in a range of practical adversarial models using ground truth data. Our analysis provides a detailed look at practical adversarial attacks on ML models, and helps defenders make informed decisions in the design and configuration of ML detectors.

¹ Our work received approval from our local IRB review board.

1 Introduction

Today’s computing networks and services are extremely complex systems with unpredictable interactions between numerous moving parts. In the absence of accurate deterministic models, applying Machine Learning (ML) techniques such as decision trees and support vector machines (SVMs) produces practical solutions to a variety of problems. In the security context, ML techniques can extract statistical models from large noisy datasets, which have proven accurate in detecting misbehavior and attacks, e.g. email spam [35, 36], network intrusion attacks [22, 54], and Internet worms [29]. More recently, researchers have used them to model and detect malicious users in online services, e.g. Sybils in social networks [42, 52], scammers in e-commerce sites [53] and fraudulent reviewers on online review sites [31].

Despite a wide range of successful applications, machine learning systems have a weakness: they are vulnerable to adversarial countermeasures by attackers aware of their use. First, through either reading publications or self-experimentation, attackers may become aware of details of the ML detector, e.g. choice of classifier and parameters used, and modify their behavior to evade detection. Second, more powerful attackers can actively tamper with the ML models by polluting the training set, reducing or eliminating its efficacy. Adversarial machine learning has been studied by prior work from a theoretical perspective [6, 12, 27], using simplistic all-or-nothing assumptions about adversaries’ knowledge about the ML system in use. In reality, however, attackers are likely to gain incomplete information or have partial control over the system. An accurate assessment of the robustness of ML techniques requires evaluation under realistic threat models.

In this work, we study the robustness of machine learning models against practical adversarial attacks, in the context of detecting malicious crowdsourcing activity. Malicious crowdsourcing, also called crowdturfing, occurs when an attacker pays a group of Internet users to carry out malicious campaigns. Recent crowdturfing attacks ranged from “artificial grassroots” political campaigns [32, 38], product promotions that spread false rumors [10], to spam dissemination [13, 39]. Today, these campaigns are growing in popularity in dedicated crowdturfing sites, e.g. ZhuBaJie (ZBJ)² and SanDaHa (SDH)³, and generic crowdsourcing sites [26, 48].

The detection of crowdturfing activity is an ideal context to study the impact of adversarial attacks on machine learning tools. First, crowdturfing is a growing threat to today’s online services. Because tasks are performed by intelligent individuals, these attacks are undetectable by normal measures such as CAPTCHAs or rate limits. The results of these tasks, fake blogs, slanderous reviews, fake social network accounts, are often indistinguishable from the real thing. Second, centralized crowdturfing sites like ZBJ and SDH profit directly from malicious crowdsourcing campaigns, and therefore have strong monetary incentive and the capability to launch adversarial attacks. These sites have the capability to modify aggregate behavior of their users through interface changes or explicit policies, thereby eith
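
The two attack classes distinguished above, evasion (workers change their own behavior) and poisoning (the training set is tampered with), can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the synthetic 2-D Gaussian "features", the nearest-centroid detector, the scaling factor used for evasion, and the number of injected samples. It does not reproduce the paper's Weibo dataset, features, or classifiers.

```python
import random

def sample(rng, center, n, std=1.0):
    """Draw n 2-D points from an isotropic Gaussian around `center`."""
    return [(rng.gauss(center[0], std), rng.gauss(center[1], std))
            for _ in range(n)]

def centroid(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def train(benign, malicious):
    """Toy detector: remember one centroid per class."""
    return centroid(benign), centroid(malicious)

def is_flagged(model, p):
    """True if p lies closer to the malicious centroid than the benign one."""
    c_ben, c_mal = model
    d_ben = (p[0] - c_ben[0]) ** 2 + (p[1] - c_ben[1]) ** 2
    d_mal = (p[0] - c_mal[0]) ** 2 + (p[1] - c_mal[1]) ** 2
    return d_mal < d_ben

def detection_rate(model, malicious_points):
    """Fraction of truly malicious points that the model flags."""
    flagged = sum(is_flagged(model, p) for p in malicious_points)
    return flagged / len(malicious_points)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical class centers in an abstract behavioral feature space.
BENIGN_CENTER, MALICIOUS_CENTER = (0.0, 0.0), (4.0, 4.0)

train_benign = sample(rng, BENIGN_CENTER, 200)
train_malicious = sample(rng, MALICIOUS_CENTER, 200)
test_malicious = sample(rng, MALICIOUS_CENTER, 200)

# Baseline: detector trained on clean data, attackers behave as usual.
clean_model = train(train_benign, train_malicious)
clean_rate = detection_rate(clean_model, test_malicious)

# Evasion: the model is untouched, but each malicious worker scales its
# features toward the benign center (the origin in this toy setup).
evading = [(0.4 * x, 0.4 * y) for x, y in test_malicious]
evasion_rate = detection_rate(clean_model, evading)

# Poisoning: malicious-looking samples are injected into the training set
# with a benign label, dragging the benign centroid toward the attackers.
poison = sample(rng, MALICIOUS_CENTER, 400)
poisoned_model = train(train_benign + poison, train_malicious)
poisoned_rate = detection_rate(poisoned_model, test_malicious)

print(f"clean detection rate:    {clean_rate:.2f}")
print(f"under evasion:           {evasion_rate:.2f}")
print(f"after poisoning:         {poisoned_rate:.2f}")
```

In this sketch evasion leaves the decision boundary where it was and moves the attackers across it, whereas poisoning leaves the attackers where they were and moves the boundary. The attacker needs different capabilities in each case: control over worker behavior for evasion, and influence over the training data for poisoning.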
