Hot Topics and Evolution Analysis of Data Element Valorization Based on the BERTopic Model

Chen Wanming, Liu Yuanhua

Knowledge Management Forum ›› 2025, Vol. 10 ›› Issue (6) : 553-564.

DOI: 10.13266/j.issn.2095-5472.2025.037  CSTR: 32306.14.CN11-6036.2025.037



Abstract

[Purpose/Significance] This study employs the BERTopic model to analyze the core themes, evolutionary pathways, and frontier dynamics of data element valorization in China, providing a macro perspective and an objective foundation for advancing theoretical exploration and guiding practical applications. [Method/Process] Using 1735 articles published in CNKI from 2019 to the present as the dataset, this research applies the BERTopic topic modeling approach. First, a semantic embedding space is constructed using a pre-trained language model. UMAP is then applied for dimensionality reduction and combined with HDBSCAN for density-based clustering to generate the initial topics. Finally, a temporal analysis method is incorporated to trace the dynamic evolution of research topics. [Result/Conclusion] The findings reveal that: ① Current research themes focus on five key areas: "Enterprise Innovation and Data Element Market Cultivation", "Data Infrastructure Systems and Industrial Digital Transformation", "Element Marketization Allocation and Institutional Reform", "Data Transaction Systems and Market Competition Mechanisms", and "Data Empowerment for Productivity Improvement and Regulatory Balance". ② Data element valorization exhibits multidimensional collaborative characteristics and forms a dynamic evolutionary mechanism.
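
The temporal step described in the abstract can be illustrated with a minimal sketch. It assumes each article already carries a publication year and a topic id from the clustering stage; the function name `topic_shares_by_year` and the toy data are hypothetical and do not come from the paper. The sketch computes each topic's share of the clustered articles per year, which is only one simple way to trace topic evolution and not necessarily the authors' procedure.

```python
from collections import Counter, defaultdict


def topic_shares_by_year(years, topics, noise_label=-1):
    """Return {year: {topic: share}}; shares within a year sum to 1.

    Documents labelled `noise_label` are skipped before shares are
    computed (HDBSCAN marks unclustered points with -1).
    """
    counts = defaultdict(Counter)
    for year, topic in zip(years, topics):
        if topic != noise_label:
            counts[year][topic] += 1

    shares = {}
    for year in sorted(counts):
        total = sum(counts[year].values())
        shares[year] = {t: n / total for t, n in sorted(counts[year].items())}
    return shares


# Hypothetical toy input: one (year, topic id) pair per article.
years = [2019, 2019, 2020, 2020, 2020, 2021, 2021, 2021, 2021]
topics = [0, 1, 0, 0, -1, 1, 1, 2, 0]

shares = topic_shares_by_year(years, topics)
for year, dist in shares.items():
    print(year, {t: round(s, 2) for t, s in dist.items()})
```

Normalizing by the yearly total rather than reporting raw counts keeps years with different publication volumes comparable, at the cost of hiding absolute growth in the literature.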

Key words

data element valorization / BERTopic model / hot topics / evolutionary analysis

Cite this article

Chen Wanming, Liu Yuanhua. Hot Topics and Evolution Analysis of Data Element Valorization Based on the BERTopic Model[J]. Knowledge Management Forum, 2025, 10(6): 553-564. https://doi.org/10.13266/j.issn.2095-5472.2025.037


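Chen Wanming: data collection and analysis, manuscript writing;

Liu Yuanhua: paper structure design, manuscript revision and finalization.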
