AI-Enabled Scientific Data and Knowledge Management for Large-Scale Research Infrastructures

Zhang Lingling, Zhang Yueling, Xu Shangchong, Han Jiayi, Yang Zhen

Knowledge Management Forum ›› 2026, Vol. 11 ›› Issue (1) : 40-49.

DOI: 10.13266/j.issn.2095-5472.2026.004  CSTR: 32306.14.CN11-6036.2026.004
Pioneering Exploration of AI-Empowered Knowledge Management and Services

Abstract

[Purpose/Significance] Large-scale research infrastructures are a marker of national scientific competitiveness, and the scientific data they generate are crucial to scientific research, technological advancement, and economic development. Traditional data management methods, however, suffer from data redundancy, barriers to sharing, and repeated experiments, which calls for support from next-generation artificial intelligence technologies. This paper studies how to construct an AI-enabled scientific data management and recommendation framework for large-scale research infrastructures, focusing on a personalized recommendation system architecture that integrates knowledge graphs and large language models (LLMs) to improve the integration and intelligent recommendation of scientific data.

[Method/Process] The paper proposes an AI-based framework for scientific data management and recommendation for large-scale research infrastructures. It draws on knowledge graphs, link prediction, and LLMs to design a personalized data recommendation system architecture that covers the entire research process.

[Result/Conclusion] The system architecture comprises modules for data collection, knowledge extraction, semantic reasoning, and personalized recommendation. The data layer organizes and integrates multi-source heterogeneous data, the analysis layer performs in-depth analysis using graph neural networks and LLMs, and the application layer delivers accurate recommendation services to researchers. The proposed framework helps improve data utilization, strengthen scientific and technological innovation capacity, and enhance the integration and intelligent recommendation of multi-source heterogeneous data.
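
To make the idea of link prediction over a knowledge graph more concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not specify a scoring method, so the classical Adamic-Adar heuristic is swapped in here in place of the graph neural networks mentioned for the analysis layer. All entity names, relation names, and function names are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical (head, relation, tail) triples; every name is illustrative only.
TRIPLES = [
    ("alice", "studies", "neutron_scattering"),
    ("alice", "studies", "superconductivity"),
    ("alice", "used", "ds_001"),
    ("bob", "studies", "superconductivity"),
    ("ds_001", "about", "neutron_scattering"),
    ("ds_002", "about", "superconductivity"),
    ("ds_002", "about", "neutron_scattering"),
    ("ds_003", "about", "superconductivity"),
    ("ds_004", "about", "protein_structure"),
]


def build_adjacency(triples):
    """Treat the knowledge graph as undirected and ignore relation types."""
    adj = defaultdict(set)
    for head, _rel, tail in triples:
        adj[head].add(tail)
        adj[tail].add(head)
    return adj


def adamic_adar(adj, u, v):
    """Adamic-Adar score: shared neighbors, down-weighted by their degree.

    A shared neighbor is linked to both u and v, so its degree is at
    least 2 and log(degree) is strictly positive.
    """
    return sum(1.0 / math.log(len(adj[z])) for z in adj[u] & adj[v])


def recommend(triples, researcher, top_k=3):
    """Rank datasets the researcher is not yet linked to by predicted link score."""
    adj = build_adjacency(triples)
    datasets = {t for _h, rel, t in triples if rel == "used"}
    datasets |= {h for h, rel, _t in triples if rel == "about"}
    candidates = datasets - adj[researcher]
    scored = [(d, adamic_adar(adj, researcher, d)) for d in candidates]
    scored = [(d, s) for d, s in scored if s > 0]  # drop unrelated datasets
    # Sort by descending score, then by name for a deterministic order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:top_k]


if __name__ == "__main__":
    for dataset, score in recommend(TRIPLES, "alice"):
        print(f"{dataset}: {score:.3f}")
```

In this toy graph, a dataset that shares two research topics with a researcher ranks above one that shares a single topic, and a dataset with no shared topic is not recommended at all. A production system of the kind the abstract describes would presumably replace the heuristic with learned embeddings and use an LLM for query understanding, but those components are beyond what this sketch assumes.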

Key words

large-scale research infrastructure / scientific data management / knowledge graph / large language model / data recommendation system / multi-source heterogeneous data

Cite this article

Zhang Lingling, Zhang Yueling, Xu Shangchong, et al. AI-Enabled Scientific Data and Knowledge Management for Large-Scale Research Infrastructures[J]. Knowledge Management Forum, 2026, 11(1): 40-49. https://doi.org/10.13266/j.issn.2095-5472.2026.004

Funding

National Natural Science Foundation of China, "Research on Recommendation Systems Based on Knowledge Graphs and Link Prediction and Their Application in Equipment Health Management" (72071194)