Save the dates: OpenCitations’ September events 

We are happy to announce OpenCitations’ participation in a number of online conferences and events during the next few weeks. Our directors Silvio Peroni and David Shotton will be speaking at the Open Science Fair 2021, the OASPA Conference 2021 and Open Access Tage.  

Open Science Fair 2021 (20-23 September) is an event organized by OpenAIRE, in collaboration with some key international initiatives in the area of Open Science: COAR, EIFL, Force11, LA Referencia, LIBER, OPERAS, SPARC and SPARC Europe. As at a real fair, visitors can explore virtual pavilions and take part in various Keynote Talks, Parallel Sessions and Workshops dedicated to Open Science. Silvio Peroni will give two talks on Tuesday 21 September:

  • In the Lightning Talk “ScholeXplorer and OpenCitations as the new frontier of open citation indexing” (11:30 CEST), coauthored with Paolo Manghi (OpenAIRE), Alessia Bardi (CNR-ISTI) and Sandro La Bruzzo (CNR-ISTI), Silvio will present ScholeXplorer and OpenCitations, two of the services included in the MONITOR portfolio of the OpenAIRE-Nexus project. More information and registration at: https://www.opensciencefair.eu/2021/lightning-talks/scholexplorer-and-opencitations-as-the-new-frontier-of-open-citation-indexing
  • The Workshop “The perils of being invisible. Collective funding models for Open Science infrastructure” (16:30-18:00 CEST) “will help identify the main challenges of collective funding models for Open Science Infrastructure, as well as explore the path forward to make them more efficient”. Silvio Peroni, Niels Stern (DOAB/OAPEN), James MacGregor (PKP), Agata Morka (SPARC Europe/SCOSS), Jon Treadway (the Great North Wood Consulting), Jean-Francois Lutz (University of Lorraine) and Vanessa Proudman (SPARC Europe) will reflect on the evanescence of Open Science Infrastructure (OSI) in library budget considerations. The speakers will also promote interaction with other workshop participants in order to create a collective dialogue. You can register for the event here: https://www.opensciencefair.eu/2021/workshops/the-perils-of-being-invisible

The OASPA Conference 2021 (21-23 September), entitled “Designing 21st Century Knowledge Sharing Systems”, will be dedicated to “many timely and fundamental topics relating to open scholarly communication”, including “the ongoing impact of the pandemic”. David Shotton will take part in the Poster Lightning Talks Session 3 (Thursday 23 September, 1-2 pm BST) with a talk entitled “OpenCitations – what does the future hold?”, a reflection on OpenCitations’ values, data, services, achievements so far, and plans for the future. For further information and registration: https://oaspa.org/conference/

Silvio Peroni, together with James MacGregor (Public Knowledge Project) and Niels Stern (OAPEN), will hold the Workshop “How Open Infrastructure Benefits Libraries?” (27 September, 11:30-13:00 CEST) as part of the Open Access Tage 2021 (27-29 September), an annual event dedicated to Open Access initiatives and community. During the workshop, the speakers will investigate the social and economic value of open infrastructures for libraries. For more information and to register for the event: https://oat21.sched.com/event/kdFg/workshop-2-how-open-infrastructure-benefits-libraries

We thank the organizers of these prestigious international events for having invited OpenCitations to participate. Open Science resonates and grows through such community-centered initiatives.

If you wish to learn more about Open Science, ongoing Open Access initiatives, and OpenCitations’ commitment to and activities within these areas, don’t miss the opportunity to participate in these online conferences … see you there!

Posted in open access, Open Citations, Open scholarship, Open Science

California Digital Library invests in OpenCitations

OpenCitations is excited to announce that the California Digital Library (CDL) has joined our growing list of contributors.

CDL’s commitment to sustainable open scholarship has great value for the global scholarly community.  Through its investments and partnerships, CDL aims to create an international academic and librarian dialogue, trusting in the idea that “the university, its scholars and its libraries thrive when we transcend organizational boundaries and commit ourselves to shared investments”.

CDL’s contribution will generously support OpenCitations throughout 2021-2023. CDL funding in the fiscal year 2020-2021 also includes two other SCOSS-endorsed infrastructures, OAPEN and DOAB, the non-profit organization Open Access Switchboard, and the services PsyArXiv and SCOAP3 Books. As can be read in the recent post by Ellen Finnie, this investment reflects CDL’s “commitment to ’invest in open’ by allocating a portion of our collections funding to the development of open content and infrastructure in support of UC scholarship and teaching”.

The OpenCitations team is grateful to be included in CDL’s ongoing investment in open infrastructure. Thank you!

Posted in Open Citations, Open scholarship, Open Science

92 million new citations added to COCI

It’s been a month since the announcement of 1.09 Billion Citations available in the July 2021 release of COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations.  

We’re now proud to announce the September 2021 release of COCI, which is based on open references to works with DOIs within the Crossref dump dated August 2021. This new release extends COCI with more than 92 Million additional citations, giving a total number of more than 1.18 Billion DOI-to-DOI citation links.

This latest release includes citations from the most recent articles published by the American Chemical Society, whose bibliographic references were opened in February 2021. Citations from the ACS back numbers will be available in the next COCI release, once a full reprocessing of all the Crossref data has been completed.

You can find more information about COCI in our open-access article:

Ivan Heibi, Silvio Peroni & David Shotton (2019). Software review: COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations. Scientometrics, 121 (2): 1213-1228. DOI: https://doi.org/10.1007/s11192-019-03217-6  

Finally, just a reminder that the bibliographic and citation data in COCI: 

  • can be queried using the OpenCitations Indexes SPARQL endpoint; 
  • can be retrieved by using the COCI REST API;
  • can be searched by using the OpenCitations Indexes Search Interface; 
  • are also available as dumps on Figshare in CSV, N-Triples, and Scholix; and 
  • can be freely re-used for any purpose. 
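
As a rough illustration of the REST API route listed above, the following Python sketch builds a request URL for the citations received by a given DOI and extracts the citing DOIs from a response. The base URL, the `citations` operation name and the field names `citing`, `cited` and `creation` are assumptions not stated in this post, and the sample payload (including its placeholder citing DOIs) is invented for illustration; please check the current API documentation before relying on any of them.

```python
import json
from urllib.parse import quote

# Assumed base URL of the COCI REST API (not given in this post).
COCI_API = "https://opencitations.net/index/coci/api/v1"


def citations_url(doi: str) -> str:
    """Build the URL listing citations received by `doi` (assumed operation name)."""
    # Leave "/" unescaped, since a DOI contains a slash after its prefix.
    return f"{COCI_API}/citations/{quote(doi, safe='/')}"


def citing_dois(payload: str) -> list:
    """Extract the citing DOIs from a JSON array of citation records."""
    return [record["citing"] for record in json.loads(payload)]


# Invented sample response: the citing DOIs are placeholders, not real records.
SAMPLE = json.dumps([
    {"citing": "10.1234/example.1", "cited": "10.1007/s11192-019-03217-6",
     "creation": "2020-05"},
    {"citing": "10.1234/example.2", "cited": "10.1007/s11192-019-03217-6",
     "creation": "2021-01"},
])

print(citations_url("10.1007/s11192-019-03217-6"))
print(citing_dois(SAMPLE))
```

The parsing step is kept separate from any network call so that it can be tried offline; fetching the URL with an HTTP client of your choice would supply the real payload.
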
Posted in Bibliographic references, Citations as First-Class Data Entities, Data publication, open access, Open Citation Identifiers, Open Citations, Open Science

From little acorns . . . A retrospective on OpenCitations

The initial vision

Now that OpenCitations is hosting over one billion freely available scholarly bibliographic citations, this is perhaps an opportune moment to look back to the start of this initiative. A little over eleven years ago, on 24 April 2010, I spoke at the Open Knowledge Foundation Conference, OKCon2010, in London, on the topic

OpenCitations: Publishing Bibliographic Citations as Linked Open Data

I reported that, earlier that same week, I had applied to Jisc for a one-year grant to fund the OpenCitations Project (opencitations.net). Jisc (at that time ‘The JISC’, the Joint Information Systems Committee) was tasked by the UK government, among other things, to support research and development in information technology for the benefit of the academic community.

The purpose of that original OpenCitations R&D project was to develop a prototype in which we:

  • harvested citations from the open access biomedical literature in PubMed Central;
  • described and linked them using CiTO, the Citation Typing Ontology [1];
  • encoded and organized them in an RDF triplestore; and
  • published them as Linked Open Data in the OpenCitations Corpus (OCC).
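
To make the "describe and encode" steps above a little more concrete, here is a minimal Python sketch that serializes citing/cited pairs as N-Triples using the `cito:cites` property. It is only an illustration: the DOIs are placeholders, and identifying each work by a doi.org IRI is an assumption made for brevity, not a description of how the original prototype modelled or identified its records.

```python
# IRI of the CiTO "cites" property.
CITO_CITES = "http://purl.org/spar/cito/cites"


def doi_iri(doi: str) -> str:
    """Turn a bare DOI into an IRI via the doi.org resolver (illustrative choice)."""
    return f"https://doi.org/{doi}"


def to_ntriples(citations):
    """Serialize (citing DOI, cited DOI) pairs as N-Triples lines."""
    return [
        f"<{doi_iri(citing)}> <{CITO_CITES}> <{doi_iri(cited)}> ."
        for citing, cited in citations
    ]


# Placeholder DOIs, used purely for illustration.
for line in to_ntriples([("10.1234/citing", "10.1234/cited")]):
    print(line)
```

Lines of this form can be loaded into any RDF triplestore, which is the step at which the data become queryable as Linked Open Data.
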

I told those at the conference that in this demonstration project, with limited JISC funding, we could not hope to “boil the whole ocean”, but that nevertheless there would be substantial benefits from even partial coverage of citation data from the scholarly literature:

  • We could show the way and establish best practice.
  • Despite partial coverage, all key papers would most likely be cited several times.
  • The overall topological structure of the citation network would be revealed.
  • We would create a ‘benchmark’ corpus of high-quality RDF citation data that could be used to develop analytical and visualization tools.
  • We could show the value of open citation data in helping scholars to discover full text articles of all types, and thus encourage subscription-access publishers to release their reference metadata.

The important thing, I said, was to make a start!

The Jisc OpenCitations Project

That JISC grant application was funded, and the project, to last for a year with modest funding of £100K, started in my lab in the Department of Zoology at Oxford University on 1st June 2010, and was subsequently extended for a further six months.

Using data from the Open Access subset of PubMed Central, we created the first prototype release of the OpenCitations Corpus of linked bibliographic citation data, containing 6,529,815 independent bibliographic records of both citing and cited entities, comprising references to ~20% of all post-1980 articles recorded in PubMed, including those to all the most important highly cited papers in every field of biomedical endeavour.

This achievement was almost entirely the result of the excellent work by our chief data wrangler Alex Dutton, whose skill and natural feel for linked data did wonders for this project. Ben O’Steen, Graham Klyne and Alistair Miles made important contributions.

The project also resulted in many other developments, described here, most of which were developed or at least initiated during a short but wonderfully productive collaboration with Silvio Peroni, who spent six months with me in 2010 as a doctoral student intern from the University of Bologna, to which he subsequently returned to complete his thesis and develop his academic career.

These included:

  • the deconstruction and re-development of the original version of CiTO into a suite of orthogonal and complementary ontologies covering the whole domain of scholarly publishing – the SPAR (Semantic Publishing and Referencing) Ontologies [2, 3];
  • the mapping of various existing metadata schemas into RDF using SPAR, including the DataCite Metadata Schema, and subsequently JATS (now the default NISO standard for XML markup of scholarly documents) [4]; and
  • the initiation of the Semantic Publishing Blog and this OpenCitations Blog.

Life after Jisc – the flowering of OpenCitations

After the Jisc funding ended and I, after a long career in biological teaching and research, formally retired from the Department of Zoology at Oxford University, members of the initial OpenCitations team moved on to other things. Like so many grant-funded academic projects whose initial financial support had dried up, OpenCitations could have foundered at that stage, as an interesting prototype but with too little content to be useful. However, the concept of providing an open alternative to proprietary citation indexes was too important to abandon. But how could it be transitioned into something enduring and useful, particularly when as a matter of principle one had decided that the citation data should be made freely available, thus precluding income generation by charging for ‘premium’ services or the formation of a commercial spin-off?

Finally, I realized that something radical needed to be done to move OpenCitations forward. I had maintained a lively collaboration with Silvio Peroni at the University of Bologna, resulting between 2011 and 2014 in the publication of 18 articles and conference papers concerning the SPAR ontologies, ontology development, documentation and visualization, and related topics, and in 2015 I invited him to start working with me directly on OpenCitations. It was the best decision I could have made. We decided to take the initial concept and re-implement it from the bottom up. OpenCitations gave Silvio a major computer science project to which he could apply his considerable talent, and soon resulted in the development of a revised RDF data model for describing citation data, the OpenCitations Data Model (OCDM) [5] and a suite of new software tools to harvest, organise and publish citations as linked open data [6]. The credit for almost all the subsequent conceptual and technical developments within OpenCitations, which have incrementally led to our present situation, is due to Silvio Peroni, and the scholarly community is indebted to him for the intelligence, skill and diligent application he has given to OpenCitations over the past six years. I am truly honoured to have Silvio as co-Director of OpenCitations, and wish to take this opportunity to acknowledge his contributions and to thank him publicly.

Our work on OpenCitations at that stage, summarized in [7], would not have been possible without the enthusiastic support of Silvio’s senior colleague Fabio Vitali and of the Department of Computer Science and Engineering at the University of Bologna, which not only provided a stimulating environment for Silvio’s post-doctoral work, but also supplied computing services and infrastructure at no charge to OpenCitations. It was also greatly helped by Professor David De Roure of Oxford University, who gave me an academic home and a formal affiliation within the Oxford e-Research Centre after my retirement from the Department of Zoology, which enabled me to continue to hold research grants.

As has been documented in earlier posts in this blog, we greatly benefitted in 2017 from a grant from the Alfred P. Sloan Foundation, which enabled us to purchase a new and more powerful computing infrastructure for the sole use of OpenCitations and to extend and improve our software, and subsequently in 2019 from a project grant from the Wellcome Trust to develop the Open Biomedical Citations in Context Corpus, which permitted the extension of OCDM and SPAR for the characterization of in-text references and their textual contexts.

A significant breakthrough came in January 2018 with our decision to treat citations as first-class data entities, each with its own persistent identifier (PID), the Open Citations Identifier (OCI) [8]. This gave Silvio the freedom to envision a new kind of database, a citation index in which each citation had its own metadata, including citation timespan, citation categorization (e.g. self-citation), and of course the DOIs of the citing and cited publications. The creation of this new index was possible only with the incredible effort by Ivan Heibi, who served as a Research Fellow in the project funded by the Alfred P. Sloan Foundation at that time, and who was entirely responsible for developing the first version of the code necessary for creating such a database. Having harvested all the open references from Crossref metadata dumps, Silvio and Ivan created COCI, the OpenCitations Index of Crossref DOI-to-DOI Citations, which immediately became our principal source of open citations, the original OpenCitations Corpus being retained as a ‘sandbox’ in which to experiment with new data representations, for example those required for the Open Biomedical Citations in Context Corpus. Access to COCI was facilitated by Silvio’s development of a REST API, using his software tool RAMOSE (Restful API Manager Over SPARQL Endpoints), which enables the easily configurable deployment of a REST API over any SPARQL endpoint to an RDF triplestore [9]. We were able to organize all our data, both ‘traditional’ and new, and to encode it in RDF, thanks to the comprehensive OpenCitations Data Model [5], itself based on our SPAR Ontologies [3], which we evolved as necessary to accommodate new data representation requirements.
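
The idea of a citation carrying its own metadata can be sketched as a small Python record type. This is an illustration under stated assumptions, not the COCI schema: the class name, its fields, the use of whole days for the timespan and the author-overlap test for self-citation are all hypothetical simplifications, and the DOIs and author names are placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Citation:
    """One citation treated as a data entity in its own right (illustrative fields)."""
    citing_doi: str
    cited_doi: str
    citing_date: date
    cited_date: date
    citing_authors: frozenset
    cited_authors: frozenset

    @property
    def timespan_days(self) -> int:
        """Days between publication of the cited work and of the citing work."""
        return (self.citing_date - self.cited_date).days

    @property
    def is_author_self_citation(self) -> bool:
        """True when at least one author appears on both works."""
        return bool(self.citing_authors & self.cited_authors)


# Placeholder values, used purely for illustration.
example = Citation(
    citing_doi="10.1234/citing",
    cited_doi="10.1234/cited",
    citing_date=date(2021, 1, 31),
    cited_date=date(2021, 1, 1),
    citing_authors=frozenset({"A. Author", "B. Writer"}),
    cited_authors=frozenset({"B. Writer"}),
)
print(example.timespan_days, example.is_author_self_citation)
```

Deriving the timespan and the self-citation flag from the two publications' metadata, rather than storing them by hand, keeps each citation record consistent with the works it links.
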

During this period we published a number of definitions, conference papers and journal articles documenting these advances, details of which can be found here. Of these, the most recent canonical publication describing OpenCitations as an infrastructure for open scholarship, and its datasets, tools, services and activities, is Peroni and Shotton (2020) [10]. We also established the Research Centre for Open Scholarly Metadata at the University of Bologna, primarily to handle administrative, financial and academic aspects of OpenCitations activities.

OpenCitations’ future

The problem remained: how to sustain the OpenCitations infrastructure financially. We were greatly helped by Bilder, Lin and Neylon’s formulation of the Principles of Open Scholarly Infrastructure (POSI) [11], in which they clearly pointed out that reliance solely on grant funding for specific projects was not the answer. OpenCitations’ compliance with POSI is described here. We were thus immensely grateful that SPARC Europe and other institutions had the wisdom to establish SCOSS (The Global Sustainability Coalition for Open Science Services) to facilitate the crowd-sourced financial support of useful open infrastructures by the scholarly community, including academic libraries, government agencies and other stakeholders. OpenCitations applied for SCOSS support in 2019, which led to the selection of OpenCitations for support in the SCOSS second round.

The donations we are now starting to receive from such stakeholders, and the new staff that this funding has recently allowed us to hire, signal the start of our transition from a financially vulnerable academic project to a sustainable open scholarly infrastructure of real value to the community.

The work of opening more of the global citation graph now requires two things:

  • that each publisher takes responsibility for ensuring that the references from all of its journal articles and books are submitted, together with all other bibliographic metadata, to open scholarly bibliographic metadata aggregators such as Crossref and DataCite, from which they can be indexed into open citation indexes of sufficient quality, depth of detail and breadth of coverage that these offer genuine alternatives to the expensive proprietary citation indexing services upon which the academic community presently relies; and
  • that the entire scholarly stakeholder community re-directs a fraction of the enormous sums currently spent on its subscriptions to proprietary bibliographic services in order to support Open Science infrastructures such as OpenCitations that make citations and other forms of scholarly metadata and objects freely available.

References

[1] David Shotton (2010). CiTO, the Citation Typing Ontology. J. Biomedical Semantics 1 (Suppl. 1): S6. http://dx.doi.org/10.1186/2041-1480-1-S1-S6

[2] Silvio Peroni, David Shotton (2012). FaBiO and CiTO: ontologies for describing bibliographic resources and citations. Web Semantics, 17: 33-34. https://doi.org/10.1016/j.websem.2012.08.001, OA at http://speroni.web.cs.unibo.it/publications/peroni-2012-fabio-cito-ontologies.pdf

[3] Silvio Peroni, David Shotton (2018). The SPAR Ontologies. In Proceedings of the 17th International Semantic Web Conference (ISWC 2018): 119-136. https://doi.org/10.1007/978-3-030-00668-6_8

[4] Peroni S, Lapeyre DA and Shotton D (2012) From Markup to Linked Data: Mapping NISO JATS v1.0 to RDF using the SPAR (Semantic Publishing and Referencing) Ontologies. Proc. 2012 JATS Conference, National Library of Medicine, Bethesda, Maryland, USA (October 2012): 16-17. http://www.ncbi.nlm.nih.gov/books/NBK100491/

[5] Marilena Daquino, Silvio Peroni, David Shotton (2020). The OpenCitations Data Model. Figshare. https://doi.org/10.6084/m9.figshare.3443876.v7

[6] Silvio Peroni, David Shotton, Fabio Vitali (2017). One Year of the OpenCitations Corpus: Releasing RDF-based scholarly citation data into the Public Domain. In The Semantic Web – ISWC 2017 (Lecture Notes in Computer Science Vol. 10588, pp. 184–192). Springer, Cham. https://doi.org/10.1007/978-3-319-68204-4_19

[7] Silvio Peroni, Alexander Dutton, Tanya Gray, David Shotton (2015). Setting our bibliographic references free: towards open citation data. Journal of Documentation, 71 (2): 253-277. http://dx.doi.org/10.1108/JD-12-2013-0166, OA at http://speroni.web.cs.unibo.it/publications/peroni-2015-setting-bibliographic-references.pdf

[8] Silvio Peroni, David Shotton (2019). Open Citation Identifier: Definition. Figshare. https://doi.org/10.6084/m9.figshare.7127816

[9] Marilena Daquino, Ivan Heibi, Silvio Peroni, David Shotton (2021). Creating Restful APIs over SPARQL endpoints with RAMOSE. Semantic Web. http://arxiv.org/abs/2007.16079

[10] Silvio Peroni, David Shotton (2020). OpenCitations, an infrastructure organization for open scholarship. Quantitative Science Studies, 1(1): 428-444. https://doi.org/10.1162/qss_a_00023

[11] Geoffrey Bilder, Jennifer Lin, Cameron Neylon (2015). Principles for Open Scholarly Infrastructure. http://dx.doi.org/10.6084/m9.figshare.1314859

Posted in Citations as First-Class Data Entities, JISC, Ontologies, Open Citation Identifiers, Open Citations, Open Science

Reflections on the global citation graph

In his call for open citations, Dario Taraborelli hailed the scholarly citation graph (in which the nodes (vertices) are individual academic publications and the links (edges) represent bibliographic citations from one publication to another) as one of humankind’s most important intellectual achievements.

We all understand that the inclusion within our own academic publications of bibliographic references to the works of others is one of the most explicit ways of acknowledging the thoughts, discoveries, achievements and influences of other scholars, and their contributions to our own work. Not only does what we gain from their publications enable us to make intellectual progress, by “standing on the shoulders of giants” as Newton once famously observed [1], but the influence of these publications extends forward in time across the entire intellectual landscape, like gigantic shadows cast at sunset, whether or not those influenced by these publications have occasion to reference them in their own works.

A bibliographic citation is not only “a conceptual directional link from a citing entity to a cited entity, created by a human performative act of making a citation”, but it is additionally both enduring and retrospective. Enduring, because once made it persists for ever within the global corpus of scholarly literature, and retrospective because (with the exception of occasional contemporaneous citations) the cited publication predates the citing publication.

At the anterior margin of a crawling cell, cellular protrusive extension (for example of a pseudopodium) is achieved by the catalysed polymerization of new filaments of the cytoskeletal protein actin from attachment sites on an existing stationary actin filament network, pushing the cell margin forward [2]. The scholarly citation network (or citation graph, the two terms here being used interchangeably) is similarly dynamic and temporally directional, being extended forward as new works of scholarship are published. Extension of knowledge is achieved by the catalytic inspiration provided by existing academic publications, themselves temporally stationary within the expanding citation network, leading to the publication of new works of scholarship that cite these previous publications and thus extend the citation network further into the future. The citation graph is thus not just an acyclic directed graph, but an acyclic temporally directed graph. Indeed, it is this temporal aspect of the citation network that is one of its most important features.
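
The temporal directionality described here can be expressed as a simple check: every citation should point from a later publication to an earlier (or contemporaneous) one. The following Python sketch, using hypothetical paper labels and dates, lists any citations that break that rule.

```python
from datetime import date


def temporal_violations(pub_dates, citations):
    """Return the citations whose cited work was published after the citing work."""
    return [
        (citing, cited)
        for citing, cited in citations
        if pub_dates[cited] > pub_dates[citing]
    ]


# Hypothetical papers and publication dates, for illustration only.
pub_dates = {
    "A": date(2010, 1, 1),
    "B": date(2015, 1, 1),
    "C": date(2020, 1, 1),
}
# Each pair reads (citing, cited); the last one points forward in time.
citations = [("C", "B"), ("C", "A"), ("B", "A"), ("A", "C")]

print(temporal_violations(pub_dates, citations))
```

If every citation points strictly backward in time, no cycle is possible, since following citations always leads to earlier dates. Contemporaneous citations are tolerated by this check, which is why it compares with `>` rather than `>=`.
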

To use another analogy, the human genealogical tree is inherently multidimensional and difficult to represent pictorially in its entirety, because each new birth brings together the family trees of the child’s two parents. However, unless the parents are seriously promiscuous, the resulting genealogical tree is not impossibly complex. In contrast, the scholarly citation network is much more highly interlinked, since each new publication cites not just two but many preceding (‘parent’) publications, which themselves may beget many other citations.

Visualization of the global scholarly citation graph, or portions of it, is thus inherently difficult, and the important temporal aspect of the graph is the one ignored by almost every method used for visualizing aspects of that graph. Existing methods may take the broad view, showing the links, and the strength of those links, between one scholarly domain and another, thus visualizing the ‘structure of science’. Alternatively, they may take a more detailed view of a small section of the graph, visualizing the proximity of individual publications to one another. Often a radial display is chosen for this, showing in closest proximity those papers directly referenced by the selected publication in the centre, then at a greater radius those papers referenced by the cited papers shown in the inner circle, and so on. Because of the graph’s complexity, such displays quickly lose intelligibility after two citation links.
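
The radial layout described above amounts to a breadth-first traversal of the reference lists, with each ring holding the papers a given number of citation links away from the centre. A minimal Python sketch, using hypothetical paper labels and a default cut-off of two links, might look like this:

```python
from collections import deque


def citation_rings(references, centre, max_depth=2):
    """Group papers by the number of citation links separating them from `centre`."""
    rings = {0: [centre]}
    seen = {centre}
    queue = deque([(centre, 0)])
    while queue:
        paper, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand papers already on the outermost ring
        for cited in references.get(paper, []):
            if cited not in seen:
                seen.add(cited)
                rings.setdefault(depth + 1, []).append(cited)
                queue.append((cited, depth + 1))
    return rings


# Hypothetical reference lists: each key cites the papers in its list.
references = {"P": ["A", "B"], "A": ["C"], "B": ["C", "D"], "C": ["E"]}
print(citation_rings(references, "P"))
```

Each paper is placed on the ring of its shortest path from the centre, so a paper cited at several depths appears only once. Note that nothing in this layout uses publication dates, which is exactly the limitation discussed here.
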

Among a small number of visualization applications that do not ignore the temporal aspect of the graph is Citeology, a temporally based citation network visualization tool developed some years ago by Justin Matejka and colleagues at the design software company Autodesk [3]. Unfortunately, this innovative software prototype was not central to that company’s mission, development ceased, and the Citeology Java app is no longer available. However, in his last email to me, Justin Matejka kindly offered to help others re-create this application.

There is thus an urgent need for innovative new open-source visualization tools that will clearly and dynamically display portions of the global citation graph, for example the direct and indirect citation connections between any two publications or any two individuals, along the temporal axis of publication date. Developers within the open science community please step forward!

References

[1] Isaac Newton, in a 1675 letter to Robert Hooke, wrote “If I have seen further it is by standing on the shoulders of Giants.” https://discover.hsp.org/Record/dc-9792/

[2] Bruce Alberts et al. (2014). Molecular Biology of the Cell. 6th Edition. Garland Science. Chapter 16, The Cytoskeleton.

[3] Justin Matejka, Tovi Grossman, George Fitzmaurice (2012). Citeology: Visualizing Paper Genealogy. ACM Extended Abstracts on Human Factors in Computing Systems. https://www.autodesk.com/research/publications/citeology https://d2f99xq7vri1nk.cloudfront.net/CiteologyVideo.mp4

Posted in Bibliographic references, Information visualization, Open Citations, Open scholarship, Open Science, Semantic Publishing

OpenCitations’ compliance with the Principles of Open Scholarly Infrastructure

What should an open scholarly infrastructure look like? 

An answer to this tough question can be found in the original February 2015 blog post by Geoffrey Bilder, Jennifer Lin and Cameron Neylon:

Bilder G., Lin J., Neylon C. (2015), Principles for Open Scholarly Infrastructure, http://dx.doi.org/10.6084/m9.figshare.1314859

and in the summary of the principles published as:

Bilder G., Lin J., Neylon C. (2020), The Principles of Open Scholarly Infrastructure, https://doi.org/10.24343/C34W2H:

“Infrastructure at its best is invisible. We tend to only notice it when it fails. If successful, it is stable and sustainable. Above all, it is trusted and relied on by the broad community it serves. Trust must run strongly across each of the following areas: running the infrastructure (governance), funding it (sustainability), and preserving community ownership of it (insurance)”.

These three areas define the Principles of Open Scholarly Infrastructure (POSI), which provide a set of guidelines by which open scholarly infrastructure organizations and initiatives that support the research community can be run and sustained.

As far as we are aware, Crossref was the first infrastructure to publish its compliance with POSI, detailed in Geoffrey Bilder’s December 2020 blog post:

Crossref’s Board votes to adopt the Principles of Open Scholarly Infrastructure.

OpenCitations too espouses POSI and, in January 2021, we assessed the extent of our own compliance with POSI, the results of which are summarized below.

Governance

  • Coverage across the research enterprise: We gather citations from global scholarship
  • Stakeholder governed: Advisory board currently lacks executive power and is not elected
  • Non-discriminatory membership: Membership open to all those espousing open science
  • Transparent operations: Everything is open
  • Cannot lobby: OpenCitations lobbies to achieve open scholarly citations and bibliographic metadata; it does not engage in political or financial lobbying
  • Living will: Since all our data are open, others can recreate our service
  • Formal incentives to fulfill mission & wind-down: No formal plan for wind-down has yet been drawn up

Sustainability

  • Time-limited funds used only for time-limited activities: Grant income should be used solely for grant projects
  • Goal to generate surplus: Goal not yet realized – income so far too limited
  • Goal to create contingency fund to support operations for 12 months: Goal not yet realized – income so far too limited
  • Mission-consistent revenue generation: Membership fees and solicited donations
  • Revenue based on services, not data: All data and services freely given to community, and thus do not generate income

Insurance

  • Open source: All software under open source licenses
  • Open data: All data available under CC0 waiver
  • Available data: All data available via REST APIs, SPARQL endpoints, query interfaces and data dumps
  • Patent non-assertion: We will not patent anything: OpenCitations’ infrastructure is free to replicate
 
We at OpenCitations are proud of the results reached in the Insurance area, but realise that we still have some way to go in the other areas. Although the general situation is already satisfactory, we are working to strengthen our weak points.

Posted in Data publication, Open Citations, Open scholarship, Open Science

Swiss Institutions pledge 89,250 Euros to OpenCitations

We want to express our gratitude to the 18 institutional members and customers of the Consortium of Swiss Academic Libraries, which have now pledged 89,250 euros to support OpenCitations over the next three years. This generous donation is part of total funding of 320,250 euros destined for the three services currently being promoted by SCOSS: DOAB and OAPEN, PKP, and OpenCitations.

The Consortium of Swiss Academic Libraries involves all cantonal universities, the ETH Domain, the Swiss National Library and other institutions from the fields of education and research as well as from the public sector. Its core task is licensing e-resources (electronic journals, databases, eBooks) for its members and customers.

As can be read in this post, Susanne Aerni, Head of Consortial Services, commented on the pledge: “This pledge exemplifies the broad Swiss commitment to vital infrastructure for Open Access and Open Science. All Swiss Universities, all institutions of the ETH-domain, some Universities of Applied Science, CERN, and the Swiss National Science Foundation support these three vital services through the Consortium of Swiss Academic Libraries.”

Thank you, Switzerland, for your support of OpenCitations!

Posted in Open Citations, Open scholarship, Open Science

Crossing a significant threshold: more than one billion citations now available in COCI!

“The competitive benefits of closing access to citation data diminish with each new citation released to the public domain, but the benefits of open data remain. Going forward, citation data is almost completely public domain”.

With these words, from the article “A tipping point for open citations data” (July 15, 2021), Ian Hutchins celebrated the crossing, in February 2021, of the threshold of one billion citations in public-domain databases.

Now, a new significant milestone has been reached. We are delighted to announce that COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations, has just been extended with 334 million additional citations. Its most recent release, the COCI July 2021 release, now contains a total of 1.09 billion DOI-to-DOI citation links derived from open references within Crossref, which includes the references of articles deposited or opened in Crossref between November 2020 and January 2021.

These numbers make us proud, and confirm the essential value of the Initiative for Open Citations (I4OC). Since 2018, the mission of I4OC has been to persuade publishers to provide open citation data by means of the Crossref platform. I4OC’s untiring commitment has led the major academic publishers to a progressive change of heart regarding open citations, and the scholarly community to a deeper interest in this openness.

These factors contributed to the creation of COCI in 2018, the first open citation index created by OpenCitations, in which we applied the concept of citations as first-class data entities (Heibi I., Peroni S., Shotton D., 2019). Over the last three years, COCI has been extended in a series of releases, by harvesting citations mostly from Crossref data dumps, starting from an initial coverage of 300 million citations (First release).

A crucial event that preceded (and delayed!) this latest COCI release was Elsevier’s endorsement of the Declaration on Research Assessment (DORA) in December 2020, under which it committed to making “reference lists for all articles published in Elsevier journals openly available via Crossref so they can be available for reuse. This means other important initiatives like I4OC can draw on this metadata”. As described in our previous post, Elsevier’s welcome commitment led to the opening of many previously closed references from its numerous academic journals submitted to Crossref. Now, after an extended period of data ingestion and processing, all these newly opened Elsevier references are available at OpenCitations within COCI.

Elsevier’s involvement has both practical and symbolic value. Even if publishing more than one billion citations is a thrilling achievement and, as Hutchins wrote, we are now at a tipping point with regard to open citations data, this milestone is not the last stop. Together with the other organizations and projects that participate in the Initiative for Open Citations, we will keep stressing the urgent need for the remaining academic publishers to join our cause, and sharing our values with the whole academic community, to make all existing citation data open and freely accessible. Recalling what Dario Taraborelli wrote in the conclusion of his article “The citation graph is one of humankind’s most important intellectual achievements”, “the world is waiting for the citation graph to become a public good”.

Posted in Bibliographic references, Citations as First-Class Data Entities, Data publication, open access, Open Citation Identifiers, Open Citations, Semantic Publishing