
A Review of Research on the Influencing Factors and Prediction of the Citation Frequency of Individual Papers
[Purpose/Significance] This paper reviews the factors that influence the citation frequency of individual papers and the current state of research on citation frequency prediction, aiming to give researchers and research institutions a comprehensive, systematic framework for studying both topics. [Method/Process] Through a literature survey, it systematically examines existing studies, summarizes the influencing factors, research objects, and research methods used in citation frequency prediction along with their characteristics, compares the different methods in tabular form, and identifies problems common to existing research as well as some innovative solutions. [Result/Conclusion] The review finds that the causal relationship between influencing factors and prediction results is unclear, that sample data lack diversity, that the relationship between the applicability of results and the prediction period is left unspecified, and that model evaluation is weakly interpretable. Follow-up research should therefore improve its quality by addressing the preconditions for solving the problem, selecting targeted samples, improving the methods for extracting influencing factors, and applying mathematical thinking to modeling.
citation frequency prediction / influencing factors / regression analysis / machine learning / deep learning
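Zhang Sufang (张素芳): framework guidance, revision suggestions, proofreading and finalization of the paper
Liu Huimin (刘慧敏): paper writing, data collation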