OpenCitations at LIBER Annual Conference 2021: ‘How Can Open Infrastructures Support the Role of Research Libraries?’

For the second year, OpenCitations has taken part in the LIBER annual conference.  LIBER (Ligue des Bibliothèques Européennes de Recherche – Association of European Research Libraries) is a network that gathers 440 research libraries, based in more than 40 countries all over the world, with the mission of supporting Europe’s research libraries by highlighting their value to policymakers, providing resources and training, and forming valuable partnerships. 

Since 1971, the LIBER Annual Conference has been a key event for the entire network, a keenly anticipated meeting for research library professionals whose mission is “to identify the most pressing needs for research libraries, and to share information and ideas for addressing those needs”. Due to the ongoing pandemic restrictions, the 50th LIBER meeting (23-25 June 2021) was held online, as was the 2020 meeting, with digital co-hosting by the University of Belgrade Library in Serbia. The online-showcase format, however, didn’t constrain the creation of a vital virtual square, fostered by the voices of 70 speakers. The main theme of the conference, “Libraries and Open Knowledge: from vision to implementation”, was explored in depth in 12 parallel sessions.

Professor Silvio Peroni, Director of OpenCitations, participated in Session #5 ‘How Can Open Infrastructures Support the Role of Research Libraries?’ with a presentation dedicated to the benefits of Open Infrastructures for libraries, dialoguing with James MacGregor (interim Managing Director of the Public Knowledge Project), Joanna Ball (Head of Roskilde University Library), and Niels Stern (director of OAPEN and co-Director of DOAB).  

The session, chaired by Maaike Napolitano (National Library of the Netherlands), opened with a presentation by Fidan Limani (Research Assistant at ZBW – Leibniz Information Centre for Economics) about the integration of scholarly artifacts from the domain of economics using Knowledge Graphs (KG), and the creation of a network of entities describing objects of interest and their connections, while keeping a library perspective. The use of citation links connecting datasets and citations, and the adoption of ontologies and data export in RDF, would facilitate a possible beneficial collaboration between ZBW and Open Infrastructures such as OpenCitations (whose data is itself in the form of a Knowledge Graph).

OpenCitations also shares some common features with the other Open Infrastructures described in the second presentation: the financial support from the SCOSS project; the community-based approach; and their promising value for libraries and the entire scholarly community.

OpenCitations is an independent not-for-profit infrastructure organization dedicated to open scholarship and the publication of open bibliographic and citation data by the use of Semantic Web (Linked Open Data) technologies, engaged in advocacy for open citations and open bibliographic metadata, as a founding member of both the Initiative for Open Citations (I4OC) and the Initiative for Open Abstracts (I4OA). It provides data containing more than 700 million citations that the community can use for any purpose. Such data can be crucial in national and international research evaluation exercises, making such activities more transparent and reproducible than when they rely on proprietary services. Librarians can use OC citation data (e.g., via our REST API) to enhance or develop tools to support their authors, researchers, students and institutional administrators in different kinds of contexts, for instance by providing metrics to monitor research at their institution and by improving the discoverability of research products such as publications and data.
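As an illustration of the kind of tool a library might build on the REST API, the following minimal Python sketch retrieves the number of incoming citations for a DOI. It rests on assumptions that should be checked against the current API documentation: the base URL and the `citation-count` operation reflect the COCI API (v1) as we understand it, the response is assumed to be a JSON list shaped like `[{"count": "12"}]`, and the function names are purely illustrative.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the COCI REST API (v1); verify against the API documentation.
COCI_API = "https://opencitations.net/index/coci/api/v1"


def citation_count_url(doi: str) -> str:
    """Build the URL of the (assumed) citation-count operation for a DOI."""
    return f"{COCI_API}/citation-count/{quote(doi, safe='/')}"


def parse_citation_count(payload: str) -> int:
    """Extract the count from a JSON response assumed to look like [{"count": "12"}]."""
    records = json.loads(payload)
    return int(records[0]["count"]) if records else 0


def fetch_citation_count(doi: str, timeout: float = 30.0) -> int:
    """Query the API over the network and return the number of incoming citations."""
    with urlopen(citation_count_url(doi), timeout=timeout) as response:
        return parse_citation_count(response.read().decode("utf-8"))
```

Calling `fetch_citation_count("10.1007/s11192-019-03217-6")` would then return the current citation count for that article. Keeping URL construction and response parsing separate from the network call makes the sketch easy to adapt if the endpoint or response format differs from what is assumed here.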

OAPEN is a not-for-profit foundation dedicated to increasing the discoverability of open access books and the trust around them. It runs three open-source platforms enabling open access to books: the Directory of Open Access Books (DOAB) – a freely available basic indexing service that is easily integrated into library catalogues; the OAPEN Library – a publication platform dedicated to hosting, preserving and distributing books; and the OAPEN OA Books Toolkit – a public information resource for authors to build trust around open-access books.

PKP (Public Knowledge Project) is a software and library project, consisting of three applications (Open Journal Systems, Open Preprint Systems and Open Monograph Press).

The dialogue during this LIBER session wasn’t a mere presentation of these projects and their technical properties: the speakers emphasized the importance of ensuring the participation and the engagement of the stakeholder community, pointed out the crucial value of the support received – not only financial – from Research Libraries, and discussed how such Open Infrastructures can be beneficial for libraries. 

How can libraries support Open Infrastructures? And what role do they play in a long-term solution? According to Joanna Ball, from a librarian perspective, it’s not only a who-benefits-whom problem, but it’s more about finding a “third way, about developing mutually beneficial partnerships, and going beyond the traditional way of approaching things so that we can really play to each other’s strengths.” 

This approach is fully aligned with OpenCitations’ intentions. As Silvio Peroni underlined, in most cases the active collaboration between Open Infrastructures and libraries is not only about financial support, but about cooperatively reaching a common goal. In particular, “if infrastructures like OpenCitations provide appropriate and easy-to-use interfaces and tools that allow librarians to contribute appropriate bibliographic metadata, and if librarians are willing to enter such metadata from their own records, libraries may become a significant reliable source of this kind of information”. The result of such a ‘crowd-sourced’ entry of bibliographic metadata by libraries would be an enrichment of the overall global open knowledge graph made available through citational links.

In the last presentation, dedicated to two services provided by OPERAS, Emilie Blotière (CNRS) and Tiziana Lombardo (Net7) reiterated the value of scholarly communication. COESO and GO TRIPLE, both funded by the European Commission, aim to create a persistent dialogue in the Social Sciences and Humanities community, by tackling its fragmentation and becoming a meeting point among different communities.

What emerged from the session is the importance of communication, cooperation and networking between Open Infrastructures and libraries, a message that perfectly matches the core values of LIBER: collaboration and inclusivity. The next LIBER annual conference is scheduled for June 2022 in Odense, hopefully recreating the physical and enthusiastic gathering of the previous meetings.

You can find the recording of the full session here: LIBER 2021 Session #5: How Can Open Infrastructures Support the Role of Research Libraries? 

You can find the slides of the session on Zenodo.

Posted in Data publication, Open abstracts, open access, Open Citations, Open scholarship, Open Science, Uncategorized

New research fellowship position to work on the EOSC

OpenAIRE-Nexus is an H2020 project funded by the European Commission which aims at bringing together, within the European Open Science Cloud (EOSC), fourteen new services focused on the development and promotion of Open Science. OpenCitations is directly involved in this project through the Department of Classical Philology and Italian Studies at the University of Bologna.

In the context of this OpenAIRE-Nexus project, our goal is to make all services offered by OpenCitations compatible with OpenAIRE, so as to guarantee semantic and technical interoperability with all the other Open Science services available in the EOSC. For this purpose, we now seek applicants for a new one-year research fellowship to be held from May 2021 (renewable for an additional year), for which the application closing deadline is 31 March 2021.

The goal of the Research Fellowship is to study the current limitations of the OpenCitations infrastructure, and possible improvements to introduce into it, in order to integrate it with OpenAIRE and the EOSC. The Research Fellow, who will work in collaboration with Silvio Peroni, Director of OpenCitations, is expected to address issues relating to the provision of Web services, the management of distributed and heterogeneous databases, and data ingestion and conversion processes.

The Call for Applications (in Italian and in English) is available online on the website of the University of Bologna. It also includes an attachment with a description of OpenCitations and of the activities related to the position. The position has a net salary (exempt from income tax, after deduction of social security contributions) in excess of 20K euros per year. As indicated in the Call for Applications, candidates need to apply exclusively through the University of Bologna web portal.

For further information, please contact Silvio Peroni (email: silvio dot peroni at unibo dot it).

Posted in Job, Open Citations, Open Science

Seeking applicants for three-year research fellowship position

A year ago, at the end of 2019, OpenCitations was selected by the Global Sustainability Coalition for Open Science Services (SCOSS, https://scoss.org) for its second round of crowdfunding support, since SCOSS believes that OpenCitations aligns well with Open Science goals and is an innovative service. The goal of such support is to enable OpenCitations’ operations over the next three years as it transitions into a global scholarly infrastructure organization with a secure financial footing. As part of this work, we now plan to strengthen the current technical and computational infrastructure (server, parallel processing, backup, etc.) used by OpenCitations, which is currently hosted at the University of Bologna.

For this purpose, we now seek applicants for a new three-year research fellowship to be held from March 2021, for which the application closing deadline is 7 February 2021. The principal goals of this research fellowship are:

  1. to study the current limitations of the OpenCitations infrastructure and introduce improvements, and
  2. to design and implement new software control tools that will enable us to manage the infrastructure more efficiently.

Additionally, the selected research fellow will be expected to address issues relating to the provision of Web services, the management of distributed and heterogeneous databases, OpenCitations’ data conversion and ingestion processes involving parallel computing, and the overall security of the infrastructure. Particular attention will need to be given to data preservation and to the long-term maintenance and updating of the infrastructure.

The Call for Applications (in Italian and in English) is available online on the website of the University of Bologna. It also includes an attachment with a description of OpenCitations and of the activities related to the position. The position has a net salary (exempt from income tax, after deduction of social security contributions) in excess of 23K euros per year. As indicated in the Call for Applications, candidates need to apply exclusively through the University of Bologna web portal.

Posted in Job, Open Citations, Open Science

Elsevier endorses DORA and opens its journal article reference lists

We congratulate and thank Elsevier, the world’s largest academic publisher, for endorsing the San Francisco Declaration on Research Assessment (DORA, https://sfdora.org/), thereby joining the hundreds of other publishers and scientific organizations which have endorsed DORA over the previous eight years, and also for making a commitment to open the references from all its journal articles submitted to Crossref. The text of Elsevier’s endorsement, dated 16th December 2020, is to be found at https://www.elsevier.com/connect/advancing-responsible-research-assessment, and includes the statement:

“We will make reference lists for all articles published in Elsevier journals openly available via Crossref so they can be available for reuse. This means other important initiatives like I4OC can draw on this metadata.”

Particular thanks are due to Kumsal Bayazit, Elsevier’s CEO, and to Andrew Plume, head of Elsevier’s International Center for the Study of Research (ICSR), for spearheading this change in stance on the part of Elsevier, which until this week has been alone among the major scholarly publishers in keeping its reference lists at Crossref closed, for which it has attracted much criticism from the academic community.

This change of heart on the part of Elsevier now means that by next spring, after Crossref has had a chance to implement this change in status over the large corpus of Elsevier journal metadata, the reference lists of articles in the vast majority of the world’s academic journals will be open, enabling such metadata to be used to enhance publication discovery and enable transparent research assessment. The I4OC web site and COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations, will reflect this change once it has happened.

The Institute of Electrical and Electronics Engineers (IEEE), the American Chemical Society (ACS), and the University of Chicago Press now stand alone as the only significant scholarly publishers who choose not to make their publication reference lists open.

Posted in Bibliographic references, Citations as First-Class Data Entities, Open academic analytics, Open Citations, Open Science

The Social Dilemma and open academic analytics

Last night I watched the Netflix documentary The Social Dilemma (https://www.netflix.com/title/81254224), in which former employees of the big Silicon Valley social media companies expose the serious and sometimes tragic or even fatal consequences that social media may have on individual lives. These social media services are run by commercial companies under pressure from shareholders to make ever increasing profits. In this situation, the ultimate consumers of these services become not the individuals using them, but the advertisers, and the users of these services (ourselves) become the commodities whose user profiles and personal preferences are sold by the social media companies to the advertisers for use in targeting adverts.

The Social Dilemma is a compelling documentary, since it is told by those who know (since they helped build and run the systems). It is particularly relevant to those who have pre-teen and teenage children, whose lives and personal interactions are increasingly being shaped and to a large extent controlled by social media, particularly during the current Covid-19 lock-downs. As recent events in the United States have highlighted, social media also pose fundamental issues around the definition of “facts” and “beliefs”, moving the debate from epistemology to politics and affecting the future of our societies.

From social media to academic analytics

Jason Priem’s self-portrait as a phrenology illustration.

From https://www.flickr.com/photos/26158205@N04/4307548673. (CC BY-SA 2.0)

Jason Priem is co-founder of ImpactStory, Depsy, UnPayWall and other open analytic and open science infrastructures and services (https://our-research.org/projects) that deserve ongoing support from the academic community.

Academic analytics is the application of statistical, predictive modelling, data mining and artificial intelligence (AI) techniques to analyse, evaluate and summarize various types of organizational, educational and bibliographic data derived from higher educational and research institutions, in order to provide numerical results that can be used to guide strategic planning and decision-making practices in these contexts. It is increasingly used for student and faculty assessment, for deciding the allocation of funding, and for evaluating the standing and productivity both of individual academic departments and of entire universities.

Examples of such analyses include the degree of cross-institutional and international authorship of scholarly publications, and their citation counts excluding self-citations, used as indicators of the importance of research project outputs; the correlation of student grades with their interactions with university services such as libraries and virtual learning environments, used to improve the learning performance of individual students; and the drop-out rates and degree distributions of different universities, employed to evaluate the quality of teaching. Those using such analyses include not only university administrators and individual academics, but also, in the case of learning analytics, increasingly the students themselves and their parents.

The relevance of The Social Dilemma to academic analytics is that these, like social media, are increasingly controlled by commercial companies under similar pressures to turn a profit. Here it is the universities and their academic data that become the consumed commodities, while the commercial suppliers of academic analytical services are the financial beneficiaries of these data.

There are, of course, differences between these two situations. While social media companies and academic analytics companies both have shareholders that expect profits and users to whom they provide services, the social media companies have advertisers that bring in revenue, while academic analytics companies get most of their revenue directly from the academic community itself. There is thus a relatively close connection between those who provide the raw data and those who pay for the analytical services built over these data. Since the academic community is both data provider and the one who pays the piper, this means that the social dilemma around research analytics should be easier to resolve than the social dilemma surrounding social media.

A further important difference is the following: while participation in social media is strictly voluntary, most of the academic community are evaluated through data analytics and AI without their express consent. Information on faculty members is being collected and used with little or no recourse for the individuals affected, since there are few, if any, rights to disclosure, rights to opt out of data analytics and AI-powered reviews and decisions, rights to review the data for errors, rights to correct errors, or rights to appeal decisions based on such analytics. Academic positions carry with them the expectation of academic freedom, the principles of which are hard to reconcile with the intense individual scrutiny built into the deployment of academic analytics and AI.

The dangers of commercial analytic platforms in academia

In May 2020, Amy Brand and Claudio Aspesi published their seminal article In pursuit of open science, open access is not enough (Science 368: 574-577. https://doi.org/10.1126/science.aba3763), in which they argued cogently about the dangers of commercial dominance of academic data analytics and knowledge infrastructures, and the need for open alternatives. Details of this growing commercial dominance of academic analytics, among other platforms and services, are given in the excellent analysis by Penny C. S. Andrews in her chapter The Platformization of Open (https://doi.org/10.7551/mitpress/11885.003.0027), in the book Reassembling Scholarly Communications: Histories, Infrastructures, and Global Politics of Open Access edited by Martin Paul Eve and Jonathan Gray (The MIT Press, 2020: https://doi.org/10.7551/mitpress/11885.001.0001).

Take, for example, the major university rankings, such as the Times Higher Education ranking. These rankings are extremely powerful. They rely on proprietary data, ironically to a significant extent made freely available to the producers of the rankings by universities, which are then used to define how the performance of universities should be assessed. Times Higher Education, for instance, presents its World University Rankings as “the definitive list of the top universities globally” (https://www.timeshighereducation.com/world-university-rankings). The performance criteria used by the Times Higher Education ranking, and a few other major university rankings, now play an important role in the decision-making processes of universities all over the world. However, because the underlying data are proprietary, it is hard to challenge the rankings or to use the data to provide alternative perspectives on university performance. It has become increasingly difficult for universities to develop strategic priorities that do not align with the performance criteria used by the major university rankings. For example, at one major European university, discussions about the development of an open science strategy explicitly take into account the possible negative effects of open science practices on its position in the major university rankings.

Application of the message of The Social Dilemma to the realm of scholarly information shows how the rise of commercially controlled academic analytics might fundamentally threaten academic freedom and access to truth itself. As Penny Andrews points out, several of the big players in academic publishing and scholarly communication are now building suites of products based around scholarly data and analytics, whose platforms rarely have open and transparent governance, and are encouraging universities to subscribe to such suites, sometimes in deals that, in the name of open science, bundle access to the institution’s scholarly data and provision of analytics based upon them with open access publication of that institution’s scholarly outputs, as, for example, in the Dutch Universities’ recent deal with Elsevier (https://tinyurl.com/y5v7ua7u). By gaining proprietorial control of such data, and by providing the default means of information transfer and workflows between a university’s administrative CRIS systems, academic libraries and individual researchers, such commercial companies lock universities and national consortia into non-interoperable situations in which their academic data, whether relating to their own standing, to the sources and distribution of their external research funding, or to the publication records and relative academic merits of their faculty members, are no longer fully under their own control.

The issues posed by the commercial deployment of data analytics are clearly compounded when these services are performed by companies which conduct other business with the academic community. A researcher who is faced with the question of where to publish her next article can be forgiven for deciding that, at the margin, it cannot hurt to submit it to a journal owned by the company tasked with assessing her research performance. There is a massive conflict of interest when companies that derive significant parts of their profits from publishing research also assess it and offer guidance on what projects should be funded next.

The urgent need for open community-governed infrastructures

For the reasons discussed above, the present situation in academia is dire. The academic community should take control of the data analytics infrastructures it uses, which need to be kept open, with transparent governance, to ensure the healthy functioning of the academic community. While the existing scholarly publishing infrastructure is well-established and hard to change quickly, the use of data analytics and AI in academia is still nascent and in flux. Hence it should be relatively easy to prevent ceding complete control of these activities to commercial vendors, who, of course, are merely doing what they exist to do, namely to maximize profits for their owners and shareholders.

Resolving this situation is within the grasp of the academic community, and its clear responsibility, although this will not be without difficulties. It may be much easier for a university administrator to authorize payment for a subscription to academic analytical services from a commercial supplier “that knows what it is doing” than it is to collaborate with colleagues from other academic institutions – often seen as competitors – to develop or fund alternative services that are independent, open and transparently managed, with all the implications that has in terms of the creation of salaried posts, recruitment or retraining of staff, premises, administration, etc. However, now is the time to act, even during the current pandemic-induced economic recession, before commercial lock-in becomes a reality. Given the huge sums that universities already spend on subscription services of various types, it is clear that the primary problem is not the redeployment of existing financial resources, but is more fundamentally philosophical: whether academia wishes to be in control of its own data, or beholden to commercial interests. The development of community-controlled platforms providing open academic analytical services should now be made a priority, and appropriate sustained financial support for these platforms should be provided by the academic community, including governmental and charitable funders of research.

This week’s online OPERA conference “The Future of Open Research Analytics” (18-19 November 2020; https://deffopera.dk/opera-conference-november-2020/), hosted by the Danish OPERA Project (https://deffopera.dk/), provides a timely forum in which to discuss these issues. 

I would like to acknowledge and thank Ludo Waltman and Claudio Aspesi for reviewing drafts of this blog post, and for their important and insightful suggestions for its improvement and expansion, which I have incorporated with their permission.

Posted in Open academic analytics, Open scholarship, Open Science

The Initiative for Open Abstracts is launched

OpenCitations is proud to be part of the launch of the Initiative for Open Abstracts, a new cross-publisher initiative calling for the unrestricted availability of abstracts to boost the discovery of research.

The Initiative for Open Abstracts (I4OA), launched on September 24th, calls on all scholarly publishers to open their abstracts, and specifically to deposit them with Crossref, in order to facilitate large-scale access and promote discovery of critical research.

Making abstracts openly available helps scholarly publishers to maximize the visibility and reach of their journals and books. Open abstracts make it easier for scholars to discover, read and then cite these publications; promote their inclusion in systematic reviews; expand and simplify the use of text mining, natural language processing and artificial intelligence techniques in bibliometric analyses; and facilitate scholarship across all disciplines by those without subscription access to commercial bibliographic services.

Many abstracts are already available in various bibliographic databases, but these sources have limitations, for example because they require a subscription, are not machine-accessible, or are restricted to a specific discipline. I4OA thus calls on all scholarly publishers using Crossref DOIs to make their abstracts openly available by depositing them with Crossref. This can be done as part of established workflows that publishers already have in place for submitting publication metadata to Crossref.
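To give a sense of what machine access to a deposited abstract could look like, here is a minimal Python sketch that reads the abstract from a Crossref work record. It is a sketch under stated assumptions: it assumes the public Crossref REST API at `https://api.crossref.org/works/{doi}`, assumes that a deposited abstract appears as a JATS-tagged string in the `abstract` field of the record's `message` object, and uses illustrative function names.

```python
import json
import re
from typing import Optional
from urllib.parse import quote
from urllib.request import urlopen

# Assumed endpoint of the public Crossref REST API.
CROSSREF_WORKS = "https://api.crossref.org/works"


def extract_abstract(work_json: str) -> Optional[str]:
    """Return the plain-text abstract from a Crossref work record, or None if absent."""
    message = json.loads(work_json).get("message", {})
    abstract = message.get("abstract")
    if abstract is None:
        return None
    # Abstracts are assumed to carry JATS markup (e.g. <jats:p>); strip the tags.
    text = re.sub(r"<[^>]+>", " ", abstract)
    # Collapse runs of whitespace left behind by the removed tags.
    return " ".join(text.split())


def fetch_abstract(doi: str, timeout: float = 30.0) -> Optional[str]:
    """Fetch a work record over the network and return its abstract, if one was deposited."""
    url = f"{CROSSREF_WORKS}/{quote(doi, safe='/')}"
    with urlopen(url, timeout=timeout) as response:
        return extract_abstract(response.read().decode("utf-8"))
```

A return value of `None` indicates that no abstract is present in the record for that DOI, which is exactly the gap that I4OA asks publishers to close.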

As detailed on the I4OA web site at https://i4oa.org, 40 publishers have already agreed to support I4OA and to make their abstracts openly available. I4OA is also supported by 56 other stakeholders including research funders, libraries and library associations, infrastructure providers, and open science organizations, demonstrating the importance and relevance of this Initiative to the scholarly community. The launch press release is available at https://i4oa.org/press.html#pressrelease.

I4OA was inspired by the success of the Initiative for Open Citations (I4OC, https://i4oc.org/), which encourages the submission of references to Crossref. Since the launch of I4OC in 2017, over two thousand scholarly publishers have chosen to make the reference lists of their journal articles and book chapters openly available through Crossref. I4OA aims to replicate the success of I4OC by achieving a rapid jump in the open availability of scholarly abstracts via Crossref.

Further information may be obtained from the I4OA web site at https://i4oa.org, from the I4OA poster at https://doi.org/10.5281/zenodo.4047454, by attending the free I4OA launch webinar on October 5th 2020 at 4 pm CEST (register at https://tinyurl.com/i4oa-webinar), by emailing Professor Ludo Waltman (CWTS, Leiden University; coordinator of I4OA) at openabstracts@gmail.com, or by following @open_abstracts on Twitter.

Posted in Open abstracts, open access, Open scholarship, Open Science, Uncategorized

More than 733M citations now available in COCI

Today, we have published the bi-monthly release of COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations. In this latest release (dated 6 September 2020), we extended COCI with more than 11 million additional citations. Now, COCI contains more than 733 million DOI-to-DOI citation links between more than 59.4 million bibliographic entities.

These new citations were harvested from the most recent Crossref data dump, downloaded on 19 August 2020, which includes the references of articles deposited in Crossref between 4 June 2020 and 3 August 2020.
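For readers who want to work with COCI data offline, the following minimal Python sketch counts incoming citations per cited DOI from CSV-formatted citation records. It assumes that each record has the columns `oci`, `citing`, `cited`, `creation`, `timespan`, `journal_sc` and `author_sc`, with `yes`/`no` values in the two self-citation columns; this layout is our assumption here and should be checked against the actual files. The sample rows and DOIs are invented for illustration only.

```python
import csv
import io
from collections import Counter

# Invented sample records in the assumed COCI CSV layout (not real citations).
SAMPLE = """oci,citing,cited,creation,timespan,journal_sc,author_sc
1,10.1/a,10.1/c,2020,P1Y,no,no
2,10.1/b,10.1/c,2020,P1Y,no,yes
3,10.1/a,10.1/b,2020,P1Y,no,no
"""


def count_incoming(csv_text: str, skip_author_self_citations: bool = False) -> Counter:
    """Count citations received by each cited DOI, optionally ignoring author self-citations."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if skip_author_self_citations and row.get("author_sc") == "yes":
            continue
        counts[row["cited"]] += 1
    return counts


all_counts = count_incoming(SAMPLE)
without_self = count_incoming(SAMPLE, skip_author_self_citations=True)
```

With the invented sample above, `10.1/c` receives two citations overall and one once the author self-citation is excluded. For a full release, the same loop could read from a file handle rather than an in-memory string, so that the data is streamed row by row.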

We remind you that COCI has been fully described in our open-access article

Ivan Heibi, Silvio Peroni & David Shotton (2019). Software review: COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations. Scientometrics, 121 (2): 1213-1228. DOI: https://doi.org/10.1007/s11192-019-03217-6

and that all the bibliographic and citation data in COCI:

A final remark: COCI and other OpenCitations services will be the topic of a presentation that we will give at the Workshop on Open Citations and Open Scholarly Metadata 2020. The workshop is a 3-hour event, with invited speakers, for researchers, scholarly publishers, funders, policymakers, and open citations advocates interested in the creation, reuse, and improvement of open citation data and open scholarly metadata. No registration is needed to follow it – we hope to see you there on 9 September at 15:00 CET!

Where you can follow the Workshop on Open Citations and Open Scholarly Metadata 2020.
Posted in Citations as First-Class Data Entities, Data publication, Open Citations, Open Science