RESEARCH
INTERNAL DEVELOPMENT PROJECTS
The KB continuously engages in projects to improve its internal infrastructure. Reports on finalised projects are listed in our archive.
Comparative Analysis and Curation of German Metadata in Open Bibliometric Data (OPENBIB)
Term: May 2023 — December 2025
The goal of the project is to establish an open bibliometrics database within the German Kompetenznetzwerk Bibliometrie. This will open up the possibility for the fields of higher education research and science studies to use innovative and open data sources as an alternative to proprietary bibliometrics databases. At the same time, the database promises an enhanced analysis potential with regard to publication venues and modes that are not covered in the proprietary data.
Specifically, an open bibliometrics database based on OpenAlex is to be developed by the KB partners SUB Göttingen, Universität Bielefeld, FZ Jülich, GESIS and DZHW in collaboration with the KB hosting partner FIZ Karlsruhe and with the participation of further KB partners. The joint endeavour pursues four successive sub-goals:
- Database provision: Provision of a free and machine-readable developer instance of the bibliometric database OpenAlex as a basis for curating German publication data using an open licence.
- Database comparison: Comparative analysis of the coverage and quality of the open bibliometric database OpenAlex against the proprietary databases.
- Data curation: Development and application of technical procedures for curating the metadata of publications produced with the participation of authors from German research institutions.
- Networking and usage: Identification of national and international re-use opportunities.
Contact person: Najko Jahn (SUB Göttingen)
You can find more information on the project blog.
Data infrastructure
The KB operates a quality-assured data infrastructure hosted by FIZ Karlsruhe and derived from the contents of the Scopus (Elsevier) and Web of Science (Clarivate Analytics) databases. The OpenAlex databases will be integrated into the infrastructure in the same manner as the other two databases in the course of 2025.
The databases are checked using a series of automatic and semi-automatic procedures during the loading processes. Any errors during loading and mapping are corrected, and data irregularities are reported to Elsevier and Clarivate. Unifications and standardizations, e.g. of journal names and country information, are carried out. Each database version is accompanied by an internally published quality assurance report, and once a year aggregated data and indicators are compared with the previous year’s status in publicly available annual reports.
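To illustrate what such a unification step can look like, here is a minimal Python sketch. It is not the KB's actual procedure: the alias table, the function names and the normalisation rules are assumptions made purely for illustration.

```python
import re
import unicodedata
from typing import Optional

# Hypothetical alias table: raw country strings as they might appear in
# source data, mapped to one standard label. Not taken from the KB.
COUNTRY_ALIASES = {
    "germany": "Germany",
    "fed rep ger": "Germany",
    "deutschland": "Germany",
    "usa": "United States",
    "united states": "United States",
}


def normalize_key(text: str) -> str:
    """Build a comparison key: strip accents and punctuation, lower-case, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    alnum_only = re.sub(r"[^0-9A-Za-z]+", " ", without_accents)
    return " ".join(alnum_only.lower().split())


def standardize_country(raw: str) -> Optional[str]:
    """Return the standard label, or None so the value can be reviewed manually."""
    return COUNTRY_ALIASES.get(normalize_key(raw))


def same_journal(name_a: str, name_b: str) -> bool:
    """Treat two journal names as the same if their comparison keys are equal."""
    return normalize_key(name_a) == normalize_key(name_b)
```

Returning `None` for unknown values, rather than guessing, mirrors the semi-automatic character of the checks described above: anything the lookup cannot resolve is left for a person to inspect.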
The schemas of the databases are designed and optimized for bibliometric applications. In addition to the raw data, the databases contain enhanced data and pre-computed indicators.
One particular improvement is the institutional address disambiguation of German institutions, that is, the cleaning and unification of address data. This sub-project is run by I²SOS at Bielefeld University.
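The idea of address unification can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the method used by I²SOS: the variant lists, the canonical institution labels and the substring-matching rule are all hypothetical.

```python
import re
import unicodedata
from typing import Optional

# Hypothetical mapping from a canonical institution label to name variants
# that might occur in raw address strings (already in normalised form).
INSTITUTION_VARIANTS = {
    "Bielefeld University": ("bielefeld univ", "univ bielefeld", "universitat bielefeld"),
    "University of Göttingen": ("univ gottingen", "univ goettingen", "gottingen univ"),
}


def normalize_address(raw: str) -> str:
    """Strip accents and punctuation, lower-case, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", raw)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(re.sub(r"[^0-9a-z]+", " ", stripped.lower()).split())


def disambiguate(raw_address: str) -> Optional[str]:
    """Return the canonical institution for an address, or None if no variant matches."""
    normalized = normalize_address(raw_address)
    for institution, variants in INSTITUTION_VARIANTS.items():
        if any(variant in normalized for variant in variants):
            return institution
    return None
```

With this sketch, `disambiguate("Univ Bielefeld, Fac Sociol, D-33615 Bielefeld, Germany")` and `disambiguate("Universität Bielefeld, Postfach 10 01 31")` both resolve to the same label. A production procedure would also have to handle institutional mergers, renamings and sub-units, which plain substring matching cannot capture.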
To ensure reproducibility of bibliometric analyses, a database incorporating the most recent data is generated four times a year and old versions are archived.
An article written in 2024 and published as a preprint on Zenodo presents conceptual considerations on the technical infrastructure, describes the infrastructure, and documents the database schema as well as the loading processes and the procedures for data curation and quality assurance.
The DDL script for creating the tables is also available on Zenodo.
Further details are provided in the respective reports.
PUBLICATIONS AND TALKS
These publications and talks were made possible by using the infrastructure of the KB:
Akbaritabar, A., Theile, T., & Zagheni, E. (2024)
SCIENTIFIC DATA, 11(1). https://doi.org/10.1038/s41597-024-03655-9
Akbaritabar, A., Torres, A. F. C., & Lariviere, V. (2024)
A global perspective on social stratification in science.
Information Systems, 109, 102056.
https://doi.org/10.1016/j.is.2022.102056
Aman, V., & Besselaar, P. V. D. (2024)
JOURNAL OF INFORMETRICS, 18(2), 101500.
https://doi.org/10.1016/j.joi.2024.101500
Asanov, A.-M., Asanov, I., Buenstorf, G., Kadriu, V., & Schoch, P. (2024)
SCIENTOMETRICS, 129(4), 2389–2405.
https://doi.org/10.1007/s11192-024-04952-1
Backes, T., & Dietze, S. (2024)
Connected components for scaling partial-order blocking to billion entities
JOURNAL OF DATA AND INFORMATION QUALITY, 16(1), 9. https://doi.org/10.1145/3646553
Boulanger, C., Creutzfeldt, N., & Hendry, J. (2024)
The Journal of Law and Society in Context: Network Analysis of Citations.
JOURNAL OF LAW AND SOCIETY. Journal of Law and Society Blog.
Bornmann, L., & Haunschild, R. (2024)
The Prize Winner Index (PWI): A proposal for an indicator based on scientific prizes.
JOURNAL OF INFORMETRICS, 18(4). https://doi.org/10.1016/j.joi.2024.101560
Donner, P. (2024)
Remarks on modified fractional counting.
JOURNAL OF INFORMETRICS, 18(4). https://doi.org/10.1016/j.joi.2024.101585
Haunschild, R., & Bornmann, L. (2024)
PLOS ONE, 19(12), e0308041.
https://doi.org/10.1371/journal.pone.0308041
Leibel, C., & Bornmann, L. (2024)
SCIENTOMETRICS, 129(12), 7971–7979.
https://doi.org/10.1007/s11192-024-05201-1
Melnychuk, T., & Schultz, C. (2024)
JOURNAL OF PRODUCT INNOVATION MANAGEMENT. https://doi.org/10.1111/jpim.12750
Schmidt, M. (2024)
Why do some retracted articles continue to get cited?
SCIENTOMETRICS, 129(12), 7535–7563. https://doi.org/10.1007/s11192-024-05147-4
Stephen, D., & Stahlschmidt, S. (2024)
SCIENTOMETRICS. https://doi.org/10.1007/s11192-024-05006-2
Taubert, N., Hobert, A., Jahn, N., Bruns, A., & Iravani, E. (2024)
SCIENTOMETRICS, 129(5), 2801–2825. https://doi.org/10.1007/s11192-024-05003-5
Taubert, N., Sterzik, L., & Bruns, A. (2024)
Mapping the German Diamond Open Access Journal Landscape.
MINERVA, 62(2), 193–227. https://doi.org/10.1007/s11024-023-09519-7
Torres, A. F. C., & Akbaritabar, A. (2024)
The use of linear models in quantitative research.
QUANTITATIVE SCIENCE STUDIES, 5(2), 426–446. https://doi.org/10.1162/qss_a_00294
Wang, J., Frietsch, R., Neuhaeusler, P., & Hooi, R. (2024)
International collaboration leading to high citations: Global impact or home country effect?
JOURNAL OF INFORMETRICS, 18(4). https://doi.org/10.1016/j.joi.2024.101565
Wieczorek, O., Schmitz, A., Volle, J., Bayarkhuu, K., & Münch, R. (2024)
SOZIALE WELT, 26, 239–279. https://doi.org/10.5771/9783748925590-239
Wray, K. B., Paludan, S. R., Bornmann, L., & Haunschild, R. (2024)
SCIENTOMETRICS. https://doi.org/10.1007/s11192-024-05001-7
Zhang, X. (2024)
JOURNAL OF INFORMETRICS, 18(4). https://doi.org/10.1016/j.joi.2024.101574
NETWORK PARTNERS
The KB is a cross-institutional network in which the partners cooperate to contribute to the further development of bibliometrics and its applicability on the basis of a shared data infrastructure.