BIBLIOMETRICS

Intro to Bibliometrics

Bibliometrics is an interdisciplinary research and application field concerned with the measurability and quantitative analysis of scientific publication and citation data. How does it work? It is based on data that describes scientific articles, such as authors’ names, titles, publication years, and authors’ affiliations (their organizational addresses). This data is generated as metadata in the publication system and is primarily processed for information retrieval purposes in databases such as Web of Science, Scopus, and, more recently, OpenAlex.

On this world map, you can see that new scientific papers are constantly being published around the globe: the map displays the latest 100 publications recorded in the databases Crossref and OpenAlex within the past 48 hours. Each point marks the location of the first author. Click on the points to learn more about the publications. Do different geographical clusters emerge over time, or often the same ones? You would have to look quite often to find out!

Time is thus an important factor in bibliometrics, while authorship and the affiliations related to authors are another: to derive more robust evidence from isolated patterns regarding, for example, the productivity of an organization or a country’s research system, the publication data is aggregated over time. Additionally, affiliation data allows us to assign authors to their institutions (such as universities) and, through them, to countries, so that publications can be aggregated at these levels. The second, interactive map above provides an overview of the 200 most productive institutions in Germany. Each circle represents the number of publications in 2025; circles with a light border denote a single institution, while circles without a light border can be subdivided further with a click of the mouse. By clicking on an institution circle, you can view the name of the respective institution and visit its website. The assignment of affiliation data to institutions, here based on OpenAlex, is curated quarterly by the KB.
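The aggregation described above can be illustrated with a minimal Python sketch. The records, institution names, and the `count_by_level` helper are hypothetical and are not taken from OpenAlex or the KB data. The sketch also assumes whole counting, where each publication counts once for every institution or country involved; fractional counting would be an equally plausible choice.

```python
from collections import Counter

# Hypothetical publication records: each lists (institution, country)
# pairs derived from the authors' affiliations.
publications = [
    {"id": "p1", "year": 2024, "affiliations": [("Univ A", "DE"), ("Univ B", "FR")]},
    {"id": "p2", "year": 2025, "affiliations": [("Univ A", "DE"), ("Univ A", "DE")]},
    {"id": "p3", "year": 2025, "affiliations": [("Univ C", "DE")]},
]


def count_by_level(records, level):
    """Whole-count publications per institution (level=0) or country (level=1)."""
    counts = Counter()
    for rec in records:
        # A set ensures a unit is counted once per publication,
        # even if several co-authors share it.
        units = {aff[level] for aff in rec["affiliations"]}
        counts.update(units)
    return counts


institution_counts = count_by_level(publications, 0)
country_counts = count_by_level(publications, 1)
print(institution_counts.most_common())
print(country_counts.most_common())
```

Restricting `publications` to a range of years before counting gives the aggregation over time mentioned in the text.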

Other bibliometrically usable data includes subject categories for disciplinary classification, information on research funding, and citation counts. Citation counts are also generated continuously within the publication system: the aforementioned databases analyze the reference lists of newly published works and link them to the cited publications.
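How citation counts follow from linked reference lists can be sketched in a few lines of Python. The identifiers and the `citation_counts` helper are hypothetical, and the sketch assumes the references have already been matched to database records, which is the hard part in practice.

```python
from collections import Counter

# Hypothetical data: citing work -> identifiers found in its reference list.
reference_lists = {
    "w1": ["w2", "w3"],
    "w4": ["w2"],
    "w5": ["w2", "w3", "w9"],  # w9 could not be linked to an indexed record
}
indexed_ids = {"w1", "w2", "w3", "w4", "w5"}


def citation_counts(reference_lists, indexed_ids):
    """Count, for every indexed work, how many works cite it."""
    counts = Counter({work_id: 0 for work_id in indexed_ids})
    for cited_list in reference_lists.values():
        # A set guards against the same reference appearing twice in one list.
        for cited in set(cited_list):
            if cited in indexed_ids:
                counts[cited] += 1
    return counts


counts = citation_counts(reference_lists, indexed_ids)
print(counts.most_common(2))
```

References that cannot be linked to an indexed record, such as `w9` here, do not contribute to any count, which is one reason citation counts differ between databases.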

A central premise of bibliometrics is that information on scientific publications, as key outputs of research, and their citations can, beyond their individual content, yield insights into the structures and dynamics of the publication and science system when analyzed in aggregated form. One of the historical roots of bibliometrics lies in the search for empirical laws, such as Bradford’s Law, originally developed in a library science context. In a related field, patents are often used to analyze knowledge transfer processes towards commercial application.

A core area of classical, evaluative bibliometrics is the development, assessment, and use of indicators to measure characteristics of publication corpora. These indicators are used for the evaluation of institutions, sectors, countries, or individual researchers and research groups. Due to differences in disciplinary cultures, indicators should be field-normalised; otherwise they should not be used for interdisciplinary comparisons. Commonly used examples include field-normalised citation rates (FNCR), the field-normalised proportion of highly cited publications, and collaboration indicators. In contrast, the Journal Impact Factor (JIF) and the h-index are often discussed critically in the field.
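To make the idea of field normalisation concrete, here is a minimal Python sketch. It assumes one common definition, in which a paper's citation count is divided by the mean citation count of all papers from the same field and publication year; the text above does not specify the exact formula, and the sample records and the `field_normalised_rates` helper are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; fields, years and citation counts are invented.
papers = [
    {"id": "a", "field": "math", "year": 2020, "citations": 2},
    {"id": "b", "field": "math", "year": 2020, "citations": 6},
    {"id": "c", "field": "bio", "year": 2020, "citations": 10},
    {"id": "d", "field": "bio", "year": 2020, "citations": 30},
]


def field_normalised_rates(records):
    """Divide each paper's citations by the mean of its field-year group."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["field"], rec["year"])].append(rec["citations"])
    expected = {key: mean(values) for key, values in groups.items()}

    rates = {}
    for rec in records:
        baseline = expected[(rec["field"], rec["year"])]
        # A group without any citations has no meaningful baseline.
        rates[rec["id"]] = rec["citations"] / baseline if baseline > 0 else None
    return rates


rates = field_normalised_rates(papers)
print(rates)
```

In this toy data, paper `a` (2 citations) and paper `c` (10 citations) receive the same normalised rate, because each lies equally far below the average of its own field. This is the sense in which normalisation makes fields with different citation habits comparable.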

Because the data, as explained above, was not primarily created for this purpose, bibliometrics also requires techniques for data cleansing, disambiguation of entities (especially authors, organisations and research funders) and matching of different data sources, as well as methods for science mapping, such as clustering based on co-citations or bibliographic coupling.
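The two similarity measures named above can be sketched as follows. This is a simplified Python illustration with hypothetical identifiers and helper names. It computes only the pairwise strengths that a clustering step would take as input, not the clustering itself.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: citing paper -> set of works in its reference list.
references = {
    "p1": {"r1", "r2", "r3"},
    "p2": {"r2", "r3", "r4"},
    "p3": {"r5"},
}


def bibliographic_coupling(ref_sets):
    """Coupling strength: number of references two citing papers share."""
    strength = {}
    for a, b in combinations(sorted(ref_sets), 2):
        shared = len(ref_sets[a] & ref_sets[b])
        if shared:
            strength[(a, b)] = shared
    return strength


def co_citation(ref_sets):
    """Co-citation count: number of papers that cite both works of a pair."""
    counts = Counter()
    for refs in ref_sets.values():
        # Sorting gives every pair one canonical (smaller, larger) order.
        counts.update(combinations(sorted(refs), 2))
    return counts


coupling = bibliographic_coupling(references)
cocited = co_citation(references)
print(coupling)
print(cocited.most_common(1))
```

Bibliographic coupling links the citing papers (`p1` and `p2` share two references), while co-citation links the cited works (`r2` and `r3` are cited together twice). Either set of weighted pairs can be treated as a network and passed to a clustering method.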

These techniques, methods, and indicators are also applied in exploratory research settings and contribute to quantitative science and innovation studies, for example in analyses of the impact of funding programs, gender disparities, innovations and emerging research fields, or scientific misconduct.

While such questions are typically addressed using statistical analysis, bibliometrics has increasingly developed methodological interfaces with other fields, such as network analysis and information retrieval. It can also be combined with qualitative approaches. Moreover, growing access to full texts and advances in natural language processing (NLP) and large language models (LLMs) increasingly make it possible to consider the semantic level when analyzing publication corpora.

Further information can be found in the Bibliometrics Quick Notes by Dr. Stephan Gauch, which were developed in the context of the KB and funded by the Federal Ministry of Research, Technology and Space.

The Open Access Monitor Germany

The Open Access Monitor Germany is a tool that monitors the publication output of German scientific institutions in scientific journals. Data from existing source systems, such as the database of the KB, are first collected and aggregated. These data are then made accessible and usable in a freely available application and, in a further step, used in research whose results appear in scientific publications. In this way, the findings are made available again to the scientific community and the interested public. The Open Access Monitor thus offers libraries, funders and researchers a freely available tool to analyse publications, the citations they contain, and the associated publishing costs.

Furthermore, the Open Access Monitor tracks and supports the transformation of the publication system towards Open Access through continuous analysis of the funds spent on journal subscriptions and publishing fees. Data is delivered frequently from the existing sources, up to weekly, so users always have up-to-date data. Filters for search queries in the user interface support different usage scenarios. The Federal Ministry of Education and Research (BMBF) funds the ongoing development and operation of the Open Access Monitor Germany through the central library of the Research Centre Jülich in the project “OAM – Open Access Monitoring” (FKZ 16OAMO001).

The Open Access Monitor records the publication output of German academic institutions in scientific journals. The transition to an open access system can be observed through analyses of subscription fees and publication fees.

Distribution of journal business models

The graph shows the current distribution of journals (33,150) across journal business models, based on the Crossref title list and the journal lists used in the OAM (DOAJ, DOAG, transformative agreements).

Distribution of journal articles in Germany

The graph shows the open/closed access ratio of journal articles (764,825) in Germany for the five-year period 2018–2022, based on Dimensions, Unpaywall, and the journal lists (DOAJ, DOAG) used in the OAM.