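Abstract: Methods commonly used for sentiment classification of subjective text fall into two main categories, those based on sentiment knowledge and those based on rules, and machine learning methods outperform those based on sentiment knowledge. However, machine learning algorithms require evenly distributed data, and each individual algorithm has its own strengths and weaknesses. This paper therefore combines several machine learning algorithms, exploiting their strengths while avoiding their weaknesses, and proposes a multi-algorithm ensemble method for fine-grained emotion classification of micro-blogs. We first use a Naive Bayes (NB) classifier for coarse-grained subjective/objective classification of micro-blogs, then build a vector space model of the emotional micro-blogs based on an emotion ontology, and finally use a support vector machine (SVM) to determine the fine-grained emotion of each micro-blog. Experimental results show that the classification method that fuses machine learning algorithms outperforms both single classification algorithms and the method based on a sentiment dictionary.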
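Keywords: fine-grained micro-blog emotion; multi-strategy ensemble method; machine learning algorithms; Naive Bayes; support vector machine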
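The two-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. It is an assumption-laden toy and not the authors' implementation: the lexicon `LEXICON` stands in for the emotion ontology, the three emotion labels and the training sentences are invented for illustration, and the SVM is a simple one-vs-rest linear model trained by subgradient descent on the hinge loss rather than any particular library's solver.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon standing in for the emotion ontology.
LEXICON = {"happy": "joy", "love": "joy", "great": "joy",
           "angry": "anger", "hate": "anger", "furious": "anger",
           "sad": "sadness", "cry": "sadness", "lonely": "sadness"}
EMOTIONS = sorted(set(LEXICON.values()))  # fixed feature order


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (stage 1)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, y in zip(docs, labels):
            self.word_counts[y].update(doc.split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total = sum(self.class_counts.values())
        best, best_lp = None, -math.inf
        for y, n in self.class_counts.items():
            lp = math.log(n / total)  # log prior
            denom = sum(self.word_counts[y].values()) + len(self.vocab)
            for w in doc.split():
                if w in self.vocab:  # ignore words never seen in training
                    lp += math.log((self.word_counts[y][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = y, lp
        return best


def featurize(doc):
    """Vector of lexicon hits per emotion category (stage 2 input)."""
    counts = Counter(LEXICON[w] for w in doc.split() if w in LEXICON)
    return [float(counts[e]) for e in EMOTIONS]


class LinearSVM:
    """One-vs-rest linear SVM trained by subgradient descent on hinge loss."""

    def fit(self, X, labels, epochs=200, lr=0.1, lam=0.01):
        self.classes = sorted(set(labels))
        self.params = {}
        for c in self.classes:
            w, b = [0.0] * len(X[0]), 0.0
            for _ in range(epochs):
                for x, y in zip(X, labels):
                    t = 1.0 if y == c else -1.0
                    margin = t * (sum(wi * xi for wi, xi in zip(w, x)) + b)
                    w = [wi * (1 - lr * lam) for wi in w]  # L2 shrinkage
                    if margin < 1:  # hinge loss is active for this sample
                        w = [wi + lr * t * xi for wi, xi in zip(w, x)]
                        b += lr * t
            self.params[c] = (w, b)
        return self

    def predict(self, x):
        def score(c):
            w, b = self.params[c]
            return sum(wi * xi for wi, xi in zip(w, x)) + b
        return max(self.classes, key=score)


def classify(text, nb, svm):
    """Stage 1 filters objective posts; stage 2 assigns an emotion."""
    if nb.predict(text) == "objective":
        return "none"
    return svm.predict(featurize(text))


# Invented training data, for illustration only.
emotional = [("i am so happy today", "joy"), ("i love this great song", "joy"),
             ("i hate this angry mob", "anger"), ("i am furious and angry", "anger"),
             ("i feel sad and lonely", "sadness"), ("i cry when lonely", "sadness")]
objective = ["the train leaves at nine", "the meeting is on monday",
             "the store opens at ten"]

nb = NaiveBayes().fit([d for d, _ in emotional] + objective,
                      ["subjective"] * len(emotional) + ["objective"] * len(objective))
svm = LinearSVM().fit([featurize(d) for d, _ in emotional],
                      [y for _, y in emotional])

print(classify("the train leaves on monday", nb, svm))
print(classify("i am so angry i hate this", nb, svm))
```

Only posts that pass the subjective/objective filter reach the SVM, so the fine-grained classifier is trained and applied on emotional text alone. A real system would need word segmentation for Chinese micro-blogs and a much richer feature space than three lexicon counts.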
Fusing Machine Learning Algorithms for Fine-Grained Emotion Classification of Micro-blogs
Abstract: At present, dictionary-based and rule-based methods are the ones most commonly used for text sentiment classification. Machine learning algorithms perform better than dictionary-based methods. However, machine learning algorithms require evenly distributed data, and each single algorithm has its own advantages and disadvantages. We therefore propose a multi-strategy integrated algorithm for fine-grained emotion recognition. First, a Naive Bayes (NB) classifier identifies whether a micro-blog expresses sentiment or not. Then, each micro-blog that expresses emotion is represented as a vector in a vector space model based on an emotion ontology. Finally, a support vector machine (SVM) determines the micro-blog’s fine-grained emotion. The experimental results show that integration