
Research on CORE Paper Association Discovery and Semantic Services Based on Semantic Similarity
Bai Linlin, Wan Ni
Knowledge Management Forum ›› 2021, Vol. 6 ›› Issue (5) : 271-281.
[Purpose/significance] This paper examines the process and services of article association discovery in Connecting Repositories (CORE), with the aim of providing a reference for content recommendation and semantic linking of articles in Chinese open access repositories. [Method/process] This paper analyzes the discovery of article associations based on semantic similarity and the semantic services built on those associations. The discovery process comprises the harvesting of metadata and full-text content and the calculation of semantic similarity between articles. The semantic services built on it comprise the CORE recommendation service and the linked open data service. The paper also summarizes suggestions for applying CORE to Chinese institutional repositories. [Result/conclusion] The CORE system automatically harvests metadata from open access repositories through the existing OAI-PMH protocol, then extracts the URI fields from the metadata to download the full text over HTTP. Building on the discovered semantic associations between articles, CORE provides article recommendation services and linked data services that enable third-party systems to use CORE datasets. This offers a useful reference for recommendation and semantic linking of articles in open access repositories in China, such as institutional repositories and open access journals.
Connecting Repositories / semantic similarity / article association / recommendation system / linked data
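The harvesting and similarity steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not CORE's implementation. The sample OAI-PMH response, the `repo.example.org` URLs, and the function names are hypothetical, and the use of cosine similarity over raw term counts is an assumption made for illustration, since the abstract does not name the similarity measure. Only the XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical ListRecords response, invented for illustration only.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Open access repositories and text mining</dc:title>
          <dc:identifier>https://repo.example.org/files/1.pdf</dc:identifier>
          <dc:identifier>doi:10.0000/example.1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def extract_fulltext_uris(oai_xml):
    """Map each record identifier to the http(s) URIs in its dc:identifier fields."""
    root = ET.fromstring(oai_xml)
    uris = {}
    for record in root.iterfind(".//oai:record", NS):
        record_id = record.findtext("oai:header/oai:identifier", namespaces=NS)
        uris[record_id] = [
            el.text.strip()
            for el in record.iterfind(".//dc:identifier", NS)
            if el.text and el.text.strip().lower().startswith(("http://", "https://"))
        ]
    return uris


def term_vector(text):
    """Count lowercase alphabetic tokens (a deliberately crude tokenizer)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    print(extract_fulltext_uris(SAMPLE_RESPONSE))
    doc_a = term_vector("open access repositories and text mining")
    doc_b = term_vector("text mining for open access journals")
    print(round(cosine_similarity(doc_a, doc_b), 3))
```

In a real harvester the URIs returned by `extract_fulltext_uris` would be fetched over HTTP and the downloaded files converted to text before any similarity is computed. Both steps are omitted here to keep the sketch self-contained.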
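Bai Linlin: responsible for data acquisition, determining the research outline, and writing the paper;
Wan Ni: responsible for revising the paper.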