
A Comparative Study of Chinese Paper Abstracts Generated by Domestic and Foreign Large Language Models: Taking the Field of Library and Information Science as an Example
[Purpose/Significance] This study compares Chinese paper abstracts generated by typical large language models from China and abroad, summarizes their similarities and differences, and provides a reference for the further in-depth development of, and research on, large language models. [Method/Process] The titles of 121 projects in the discipline of "Library, Information and Documentation Science" from the 2023 annual program of the National Social Science Fund of China were used as paper topics. Chinese abstracts were generated for each topic with ChatGPT4.0 and ERNIE 4.0. After data preprocessing and text analysis, the generated content was compared in terms of high-frequency word features, part-of-speech distribution, number of sentences, and abstract length. The generated abstracts were then compared with abstracts published in the Chinese journal "Library and Information Service" to judge whether they conform to the norms of Chinese academic paper writing. [Result/Conclusion] The abstracts generated by ERNIE Bot are shorter, contain fewer characters, and conform more closely to Chinese paper writing standards, whereas the abstracts generated by GPT have more characters and more sentences on average. The differences and characteristics identified in the content generated by these two typical large language models offer a reference for their improvement and further in-depth development.
ChatGPT4.0 / ERNIE 4.0 / Chinese paper abstracts
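The surface statistics named in the abstract (number of sentences, abstract length, high-frequency words) could be computed roughly as in the sketch below. It is a minimal illustration, not the authors' pipeline: the function name `abstract_stats`, the sentence-splitting rule, and the assumption that word segmentation has already been done by a separate tool are all hypothetical, and part-of-speech tagging is left out because it needs an external tagger.

```python
import re
from collections import Counter

# Sentence-ending punctuation, full-width and ASCII. This splitting rule is
# an assumption; the source does not state how sentences were counted.
SENTENCE_END = re.compile(r"[。!?!?]+")


def abstract_stats(text, tokens, top_n=3):
    """Return sentence count, character length, and the top-n frequent tokens.

    `tokens` is assumed to come from a separate Chinese word segmenter
    (hypothetical here; no segmenter is named in the source).
    """
    # Split on terminal punctuation and drop empty fragments.
    sentences = [s for s in SENTENCE_END.split(text) if s.strip()]
    # Length in characters, ignoring whitespace but keeping punctuation.
    length = len(re.sub(r"\s+", "", text))
    # Most common tokens with their counts.
    top = Counter(tokens).most_common(top_n)
    return {"sentences": len(sentences), "chars": length, "top_tokens": top}


# Toy input, not data from the study.
example_text = "大模型生成摘要。摘要较短!"
example_tokens = ["大模型", "生成", "摘要", "摘要", "较短"]
stats = abstract_stats(example_text, example_tokens, top_n=1)
```

Averaging `sentences` and `chars` over all abstracts produced by each model would give per-model figures of the kind the abstract compares.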
Xing Miao: text analysis and drafting of the manuscript;
Tian Li: research design and finalization of the manuscript.