
Research on Fine-grained Opinion Mining and Distribution Law of Opinion for E-Commerce Customer Reviews
Lu Chenchen, Wang Hao, Shi Bin, Qiu Jingwen
Knowledge Management Forum, 2024, Vol. 9, Issue 3: 253-268.
[Purpose/Significance] E-commerce customer reviews contain a wealth of valuable information. This study identifies user opinions in those reviews and analyzes how they are distributed and how they differ, in order to provide insights for consumers, businesses, and platforms. [Method/Process] First, user opinions were extracted with the UIE model from customer reviews in three industries: furniture, snacks, and mobile phones. Second, semantic similarity between words was calculated using a product feature thesaurus and the BERT model to generalize and filter the extracted opinions. Finally, user opinions were statistically analyzed and visualized using the IPA model to derive optimization suggestions for businesses and platforms. [Result/Conclusion] For opinion mining, the model performs well in all three industries, with F1 values of 79.85%, 83.28%, and 85.71% respectively, which supports the effectiveness of the method. In the mobile phone industry, the distribution analysis indicates that user attention focuses mainly on performance, appearance, and battery, while opinion distributions differ significantly across platforms and brands. Moreover, user satisfaction generally shifts from positive to negative between initial reviews and follow-up reviews.
customer reviews / fine-grained sentiment analysis / opinion mining / pre-trained language model / IPA analysis
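The IPA (importance-performance analysis) step named in the abstract can be illustrated with a minimal sketch. The abstract does not say how importance and performance are measured, so this sketch assumes that importance is an aspect's share of mentions and performance is its share of positive opinions, and that the quadrant boundaries are the means of the two axes. The function name `ipa_quadrants` and all numbers are hypothetical, chosen only for illustration; they are not the paper's data or implementation.

```python
from statistics import mean


def ipa_quadrants(aspects):
    """Assign each aspect to one of the four classic IPA quadrants.

    aspects: dict mapping aspect name -> (importance, performance).
    Assumption: quadrant boundaries are the mean of each axis.
    """
    imp_mid = mean(imp for imp, _ in aspects.values())
    perf_mid = mean(perf for _, perf in aspects.values())

    labels = {}
    for name, (imp, perf) in aspects.items():
        if imp >= imp_mid and perf >= perf_mid:
            labels[name] = "keep up the good work"   # important, satisfying
        elif imp >= imp_mid:
            labels[name] = "concentrate here"        # important, unsatisfying
        elif perf >= perf_mid:
            labels[name] = "possible overkill"       # less important, satisfying
        else:
            labels[name] = "low priority"            # less important, unsatisfying
    return labels


# Hypothetical values: (share of mentions, share of positive opinions).
toy_aspects = {
    "performance": (0.40, 0.90),
    "appearance": (0.30, 0.60),
    "battery": (0.20, 0.80),
    "packaging": (0.10, 0.50),
}

quadrants = ipa_quadrants(toy_aspects)
for aspect, quadrant in quadrants.items():
    print(f"{aspect}: {quadrant}")
```

Under these assumptions, aspects that land in "concentrate here" are the ones a business would prioritize, since users mention them often but rate them below average.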
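Lu Chenchen: responsible for model construction and experimental design; drafted, wrote, and revised the paper;
Wang Hao: determined the research approach, supervised the experiments, proposed the paper framework, and guided the revision of the paper;
Shi Bin: guided the revision of the paper;
Qiu Jingwen: guided the revision of the paper.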