Dalian Nationalities University (大连民族学院)
Undergraduate Graduation Project (Thesis)

Research on Sentiment Classification of Internet Product Reviews

School (Department): Computer Science and Engineering
Major: Computer Science and Technology
Student: Zhao Di (赵迪)
Student ID: 2010210730
Supervisor: Meng Jiana (孟佳娜)
Reviewer: Liu Shuang (刘爽)
Completion date: June 16, 2015

Dalian Nationalities University


Abstract

Sentiment classification is a field that has emerged over the past decade or so. It is likely to influence research in many other disciplines, particularly in cultural studies, where it makes it possible to track and forecast trends in public sentiment. Current work focuses mainly on positive/negative polarity classification and on mining opinions from online text. Most of it draws on information retrieval, machine learning, natural language processing, and mathematical statistics. Specialized language-processing methods can now collect online reviews from back-end databases, analyze the documents, identify the complex sentiments they contain, and track how users' moods change over time. This kind of analysis is a foundation of data analysis in the big-data era.

This thesis focuses on Chinese-language product reviews from the Internet. It analyzes the orientation of the reviews and, using labeled samples, predicts the orientation of product reviews. The system first preprocesses the corpus: a word segmentation system segments the text, stop words are removed, and a dictionary is built. It then computes term weights for the corpus with the TFIDF weighting algorithm. Finally, it classifies the corpus with the support vector machine library LIBSVM, yielding a positive or negative orientation prediction for each product review.

Keywords: sentiment classification; weight; corpus; support vector machine


Research on Sentiment Classification of Internet Product Reviews

Abstract

Sentiment classification has emerged as a new discipline in the past ten years. It will affect research in many other subjects, especially in the area of culture, where it helps in understanding and forecasting the emotional trends of the public. At present, the field mainly studies positive/negative sentiment classification and online opinion mining. Most research requires expertise in information retrieval, machine learning, natural language processing, and mathematical statistics. There are now specific language-processing methods that collect online comments from back-end databases and analyze the documents to identify complex emotions and judge trends in users' moods, which is a basis for data analysis in the era of big data.

This thesis focuses on Internet product reviews written in Chinese. It analyzes the orientation of the reviews and, based on labeled samples, predicts the orientation of product reviews. The system first processes the corpus, using a word segmentation system to segment the text, remove stop words, and build a lexicon. It then calculates corpus weights with the TFIDF weighting algorithm. Finally, it uses the support vector machine library LIBSVM to classify the corpus and obtain positive or negative orientation predictions for the product reviews.

Keywords: sentiment classification; weight; corpus; support vector machine


Contents

Abstract
Research on Sentiment Classification of Internet Product Reviews
1 Introduction
  1.1 Background
  1.2 Current State of Research
  1.3 Work in This Thesis
2 Orientation Analysis of Product Reviews
  2.1 What Is Chinese Word Segmentation
  2.2 Chinese Word Segmentation Techniques
    2.2.1 String-Matching Segmentation
    2.2.2 Understanding-Based Segmentation
    2.2.3 Statistics-Based Segmentation
  2.3 Difficulties in Word Segmentation
    2.3.1 Ambiguity Recognition
    2.3.2 New Word Recognition
  2.4 ICTCLAS
  2.5 What Is TFIDF
    2.5.1 The Probabilistic Model of TFIDF
    2.5.2 TFIDF Workflow Diagram
  2.6 What Is libsvm
3 System Design and Implementation
  3.1 System Workflow
  3.2 Detailed Steps
  3.3 Front-End Design of the Software
4 Experimental Results and Analysis
  4.1 Experimental Corpus and Results
  4.2 Analysis of the Experiments
5 Conclusion
Acknowledgments
References