OPINAX: An Effective Product Attribute Mining System
HAO Bo-yi (1), XIA Yun-qing (2), ZHENG Fang (2)
(1) State Key Lab of Intelligent Technology and Systems, Tsinghua University, Beijing 100084. E-mail: haoby@cslt.riit.tsinghua.edu.cn
(2) Center for Speech and Language Technologies, RIIT, Tsinghua University, Beijing 100084, China. E-mail: {yqxia,fzheng}@tsinghua.edu.cn

Abstract: Product attribute extraction is a major task of a product opinion mining system and significantly influences the system's performance. To find out-of-vocabulary (OOV) product attributes, an effective attribute mining algorithm is proposed based on language dependency parsing and corpus statistical analysis. Starting from a small set of standard product attributes, the algorithm applies a dependency parsing tool to review text to find potential OOV product attributes. Statistical features extracted from both the dependency parsing results and the text content are then used to filter out invalid OOV product attributes and to rank the remaining attributes by confidence. Experiments are conducted to evaluate the precision of OOV attribute extraction and the effectiveness of the ranking method. The contribution of the individual features is also evaluated. Experimental results show that the precision of OOV attribute extraction reaches 87.5% in the top 200 candidates.

Keywords: Opinion mining; out-of-vocabulary word; dependency parsing

* 60703051

[...] SQL [...] OM [1~4] [...] out-of-vocabulary (OOV) [...]

[...] Kobayashi [6] [...] Hu and Liu [2] [...] Popescu and Etzioni [3] [...] OPINE [...] PMI [...] Hu and Liu [2] [...] OPINE [...] 0.03 [...] 0.22 [...] OPINAX [7] [...] [8][9] [...] OOV [...] Yi and Niblack [1] [...] [4] [...] Xia [5] [...] OOV [...] [10,11] [...] FASTR [12] [...]

[...] OPINAX [...] OPINMINE [...]

Fig. 1 Framework of the OPINAX system

[...] OPINAX [...] CE [...] ICTCLAS [14] [...] HIT DParser [15] [...] CE [...] CR [...]

Fig. 2 Example 1 of dependency parsing

Fig. 3 Example 2 of dependency parsing

[...] ATT, SBV, ADV [...] n, d, a [...] ATT [...] CE [...] ATT [...] OPINAX [...] ATT, COO, VOB [...] ATT [...] CE [...]

Fig. 4 Example 3 of dependency parsing

5.1 [...]

$$FV = \langle \mathrm{DR},\ \mathrm{INDO},\ \mathrm{TF},\ \mathrm{TL},\ \mathrm{LTO},\ \mathrm{SENO} \rangle \tag{1}$$

[...] DR [...] ATT, COO, VOB [...]

$$FV = \langle \mathrm{DRPri{+}POS},\ \mathrm{DRSec{+}POS},\ \mathrm{INDO},\ \mathrm{TF},\ \mathrm{TL},\ \mathrm{LTO},\ \mathrm{SENO} \rangle \tag{2}$$

[...] INDO [...] OPINAX [...] OPINMINE [...] INDO [...] 1 [...] 0 [...] TF [...] TL [...] LTO [...] 1 [...] 0 [...] SENO [...] SBV [...] SENO = 1 [...] 0 [...]

5.2 [...]

$$fv = \langle \mathrm{ATT}_{n\text{-}n},\ \mathrm{ATT}_{n\text{-}ws},\ 0,\ 12,\ 8,\ 1,\ 1 \rangle \tag{3}$$

[...] DRPri+POS = ATTn-n, DRSec+POS = ATTn-ws, INDO = 0, TF = 12, TL = 8, LTO = 1, SENO = 1 [...]

6.1 [...]

[...] F1, F2, …, Fn [...] P1, P2, …, Pn [...] F1 … Fn ∈ {0, 1} [...] DRPri+POS = ATTn-n [...] = 1 [...]

$$R_d(fv) = P_1 \cdot F_1 + P_2 \cdot F_2 + \cdots + P_n \cdot F_n \tag{4}$$

[...] P1 … Pn [...]

6.2 [...]

[...] 1/9 [...] F_i [...] nf_i [...] F_i = 1 [...] np_i [...] F_i = 1 [...] tag = [...] P_{F_i} = np_i / nf_i [...]

$$R_w(fv) = P_{F_1} \cdot F_1 + P_{F_2} \cdot F_2 + \cdots + P_{F_n} \cdot F_n \tag{5}$$

7.1 [...]

[...] OPINMINE [7] [...] 13.2M [...] 2.0M [...] 11.2M [...] P@N [...] OPINAX [...]

$$P@N = n / N$$

7.2 [...]

[...] (W1_W2); W0_(W1_W2); (W1_W2)_W3; W0_(W1_W2)_W3; W-1_W0_(W1_W2) [...] W1, W2 [...] DRPri+POS, DRSec+POS, LTO, TF, TL [...] P@N [...]

Fig. 5 Comparison of results with different layers of dependency parsing

7.3 [...]

[...] DRPri+POS, DRSec+POS, LTO, INDO, TF, TL [...]
[...] DRPri+POS, DRSec+POS, LTO, SENO, TF, TL [...]
[...] DRPri+POS, DRSec+POS, LTO, INDO, SENO, TF, TL [...]

Fig. 6 Comparison of results with INDO and without INDO

Fig. 7 Comparison of results with SENO and without SENO

Fig. 8 Comparison of results with features INDO and SENO and without them

[...] 400 [...]

7.4 [...]

[...] 1500 [...] P_{F_i} [...]

Fig. 9 Comparison of results between the two ranking methods

[...] 200 [...] 87.5% [...]

[1] J. Yi and W. Niblack. Sentiment Mining in WebFountain. In Proc. of ICDE-2005, pp. 1073-1083.
[2] M. Hu and B. Liu. Mining and summarizing customer reviews. In Proc. of KDD'04, pp. 168-177.
[3] A. Popescu and O. Etzioni. Extracting product features and opinions from reviews. In Proc. of HLT-EMNLP'05, pp. 339-346.
[4] [...]. 2006, Vol. 26 (No. 11), pp. 2622-2625.
[5] Y. Xia, R. Xu, K.-F. Wong and F. Zheng. The Unified Collocation Framework for Opinion Mining. In Proc. of ICMLC-2007, Vol. 2, pp. 844-850.
[6] N. Kobayashi, K. Inui and Y. Matsumoto. Collecting evaluative expressions for opinion extraction. In Proc. of IJCNLP-2004, pp. 596-605.
[7] R. Xu, Y. Xia and K.-F. Wong. Opinion Annotation in On-line Chinese Product Reviews. In Proc. of LREC-2008.
[8] B. Pang, L. Lee and S. Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. In Proc. of EMNLP-02, pp. 79-86.
[9] P. D. Turney and M. L. Littman. Unsupervised learning of semantic orientation from a hundred-billion-word corpus. Technical Report EGB-1094, National Research Council Canada.
[10] B. Daille. Study and Implementation of Combined Techniques for Automatic Extraction of Terminology. The Balancing Act: Combining Symbolic and Statistical Approaches to Language. MIT Press, Cambridge.
[11] J. Justeson and S. Katz. Technical Terminology: some linguistic properties and an algorithm for identification in text. Natural Language Engineering 1(1): 9-27.
[12] [...] FASTR.
[13] R. Bunescu and R. Mooney. Collective Information Extraction with Relational Markov Networks. In Proc. of ACL'04, pp. 439-446.
[14] Z. Zhang, H. Yu and D. Xiong. HMM-based Chinese Lexical Analyzer ICTCLAS. In the 2nd SIGHAN workshop affiliated with ACL'03, pp. 184-187.
[15] [...]
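
The weighted ranking of Section 6.2 (each binary feature F_i weighted by P_Fi = np_i / nf_i) and the P@N = n / N measure of Section 7.1 can be illustrated with a short Python sketch. This is not the authors' implementation. It assumes that nf_i counts hand-tagged candidates with F_i = 1, that np_i counts those among them tagged as valid attributes, and that every feature is binary as in Section 6.1. All function names and the sample data in the usage note are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple

# Assumption: a candidate is a mapping from binary feature name to 0 or 1.
Features = Dict[str, int]


def estimate_weights(tagged: Sequence[Tuple[Features, bool]]) -> Dict[str, float]:
    """Estimate P_Fi = np_i / nf_i from hand-tagged candidates.

    nf_i: number of candidates whose feature F_i equals 1 (assumed reading).
    np_i: number of those candidates that were tagged valid (assumed reading).
    Features that never fire get no weight.
    """
    fired = Counter()
    fired_and_valid = Counter()
    for features, is_valid in tagged:
        for name, value in features.items():
            if value == 1:
                fired[name] += 1
                if is_valid:
                    fired_and_valid[name] += 1
    return {name: fired_and_valid[name] / fired[name] for name in fired}


def weighted_score(features: Features, weights: Dict[str, float]) -> float:
    """R_w(fv) = sum over i of P_Fi * F_i; unknown features contribute 0."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def precision_at_n(ranked_is_valid: Sequence[bool], n: int) -> float:
    """P@N = n_valid / N over the top N entries of an already ranked list."""
    if n <= 0 or n > len(ranked_is_valid):
        raise ValueError("n must be between 1 and the length of the ranked list")
    return sum(1 for valid in ranked_is_valid[:n] if valid) / n
```

A typical use would estimate the weights once on a small tagged sample, sort all candidates by `weighted_score` in descending order, and then report `precision_at_n` for several cut-offs such as the top 200.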