Man vs. Machine: Practical Adversarial Detection of Malicious Crowdsourcing Workers

This paper is included in the Proceedings of the 23rd USENIX Security Symposium.
August 20–22, 2014 • San Diego, CA
ISBN 978-1-931971-15-7
Open access to the Proceedings of the 23rd USENIX Security Symposium is sponsored by USENIX

Man vs. Machine: Practical Adversarial Detection of Malicious Crowdsourcing Workers

Gang Wang, University of California, Santa Barbara; Tianyi Wang, University of California, Santa Barbara and Tsinghua University; Haitao Zheng and Ben Y. Zhao, University of California, Santa Barbara

Gang Wang†, Tianyi Wang†‡, Haitao Zheng† and Ben Y. Zhao†
†Computer Science, UC Santa Barbara
‡Electronic Engineering, Tsinghua University
{gangw, tianyi, htzheng, ravenben}@cs.ucsb.edu

Abstract

Recent work in security and systems has embraced the use of machine learning (ML) techniques for identifying misbehavior, e.g. email spam and fake (Sybil) users in social networks. However, ML models are typically derived from fixed datasets, and must be periodically retrained. In adversarial environments, attackers can adapt by modifying their behavior or even sabotaging ML models by polluting training data.

In this paper¹, we perform an empirical study of adversarial attacks against machine learning models in the context of detecting malicious crowdsourcing systems, where sites connect paying users with workers willing to carry out malicious campaigns. By using human workers, these systems can easily circumvent deployed security mechanisms, e.g. CAPTCHAs. We collect a dataset of malicious workers actively performing tasks on Weibo, China’s Twitter, and use it to develop ML-based detectors. We show that traditional ML techniques are accurate (95%–99%) in detection but can be highly vulnerable to adversarial attacks, including simple evasion attacks (workers modify their behavior) and powerful poisoning attacks (where administrators tamper with the training set). We quantify the robustness of ML classifiers by evaluating them in a range of practical adversarial models using ground truth data. Our analysis provides a detailed look at practical adversarial attacks on ML models, and helps defenders make informed decisions in the design and configuration of ML detectors.

¹ Our work received approval from our local IRB review board.

1 Introduction

Today’s computing networks and services are extremely complex systems with unpredictable interactions between numerous moving parts. In the absence of accurate deterministic models, applying Machine Learning (ML) techniques such as decision trees and support vector machines (SVMs) produces practical solutions to a variety of problems. In the security context, ML techniques can extract statistical models from large noisy datasets, which have proven accurate in detecting misbehavior and attacks, e.g. email spam [35, 36], network intrusion attacks [22, 54], and Internet worms [29]. More recently, researchers have used them to model and detect malicious users in online services, e.g. Sybils in social networks [42, 52], scammers in e-commerce sites [53] and fraudulent reviewers on online review sites [31].

Despite a wide range of successful applications, machine learning systems have a weakness: they are vulnerable to adversarial countermeasures by attackers aware of their use. First, through either reading publications or self-experimentation, attackers may become aware of details of the ML detector, e.g. choice of classifier and parameters used, and modify their behavior to evade detection. Second, more powerful attackers can actively tamper with the ML models by polluting the training set, reducing or eliminating its efficacy. Adversarial machine learning has been studied by prior work from a theoretical perspective [6, 12, 27], using simplistic all-or-nothing assumptions about adversaries’ knowledge about the ML system in use. In reality, however, attackers are likely to gain incomplete information or have partial control over the system. An accurate assessment of the robustness of ML techniques requires evaluation under realistic threat models.

In this work, we study the robustness of machine learning models against practical adversarial attacks, in the context of detecting malicious crowdsourcing activity. Malicious crowdsourcing, also called crowdturfing, occurs when an attacker pays a group of Internet users to carry out malicious campaigns. Recent crowdturfing attacks ranged from “artificial grassroots” political campaigns [32, 38], product promotions that spread false rumors [10], to spam dissemination [13, 39]. Today, these campaigns are growing in popularity in dedicated crowdturfing sites, e.g. ZhuBaJie (ZBJ)² and SanDaHa (SDH)³, and generic crowdsourcing sites [26, 48].

The detection of crowdturfing activity is an ideal context to study the impact of adversarial attacks on machine learning tools. First, crowdturfing is a growing threat to today’s online services. Because tasks are performed by intelligent individuals, these attacks are undetectable by normal measures such as CAPTCHAs or rate limits. The results of these tasks, fake blogs, slanderous reviews, fake social network accounts, are often indistinguishable from the real thing. Second, centralized crowdturfing sites like ZBJ and SDH profit directly from malicious crowdsourcing campaigns, and therefore have strong monetary incentive and the capability to launch adversarial attacks. These sites have the capability to modify aggregate behavior of their users through interface changes or explicit policies, thereby eith