Research on Sentiment Classification of Commodity Reviews Based on Deep Learning

LI Wenjiang, CHEN Shiqin

Knowledge Management Forum, 2018, Vol. 3, Issue 6: 353-363. DOI: 10.13266/j.issn.2095-5472.2018.034

Abstract

[Purpose/significance] This study combines existing text representations and classification algorithms and selects a combination with low complexity and short training time, in order to build an optimized model for sentiment classification of commodity reviews. [Method/process] Working with the Keras API, the authors feed Word2vec word vectors into an Embedding layer and, given each sentence's sequence of word indices, obtain three text representations of commodity reviews by controlling the trainable parameter. Each text representation is then paired with different classification algorithms, the differences in classification performance are analyzed, and the better-performing combination is identified. [Result/conclusion] The text representation in which Word2vec word vectors are fed into the Embedding layer and trained further, combined with the TextCNN algorithm, yields a classification model that performs well on the commodity review test set, with an accuracy of 94.02% and an area under the ROC curve (AUC) of 0.9827. Application shows that the model classifies the sentiment of commodity reviews well and generalizes well.
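
The two figures reported above, accuracy and AUC, can be illustrated with a minimal sketch. It assumes a binary positive/negative labeling (1 = positive, 0 = negative) and a model that outputs one score per review, neither of which the abstract states explicitly. The function names `accuracy` and `roc_auc`, the 0.5 threshold, and the toy data are hypothetical and are not taken from the paper. AUC is computed here through the rank (Mann-Whitney U) formulation, which is one standard way to obtain it and not necessarily how the authors did.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of reviews whose thresholded score matches the 0/1 label."""
    if len(labels) != len(scores) or not labels:
        raise ValueError("labels and scores must be non-empty and equal in length")
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation.

    The result equals the probability that a randomly chosen positive
    review is scored above a randomly chosen negative one (ties count 1/2).
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must be equal in length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    # Rank all scores in ascending order (1-based), averaging ranks within ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic of the positive class, normalized by the number of pairs.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    toy_labels = [1, 0, 1, 0, 1, 0]
    toy_scores = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]
    print("accuracy:", accuracy(toy_labels, toy_scores))
    print("AUC:", roc_auc(toy_labels, toy_scores))
```

Accuracy depends on the chosen threshold, while AUC depends only on how the scores rank positive reviews against negative ones, which is why the two are typically reported side by side.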

Keywords

deep learning / sentiment classification / Word2vec word vector / Embedding layer / TextCNN

Cite this article

LI Wenjiang, CHEN Shiqin. Research on Sentiment Classification of Commodity Reviews Based on Deep Learning[J]. Knowledge Management Forum, 2018, 3(6): 353-363. https://doi.org/10.13266/j.issn.2095-5472.2018.034
CLC number: TP391

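Author contributions

LI Wenjiang: proposed the research idea, designed the research plan, carried out the experimental analysis, and drafted the manuscript.

CHEN Shiqin: collected, cleaned, and labeled the data, and revised the final version of the manuscript.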


Editor: LIU Yuanying