More than 624 million citations now available on COCI

COCI is the OpenCitations Index of Crossref open DOI-to-DOI citations, all released as CC0 material. It is described in the following article:

Heibi I, Peroni S, Shotton D (2019). Software review: COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations. Scientometrics 121(2): 1213-1228. https://doi.org/10.1007/s11192-019-03217-6

COCI is our first OpenCitations Index of open citations. In it, we have applied the concept of citations as first-class data entities, each identified by a unique persistent Open Citation Identifier (OCI), to index the contents of one of the major databases of open scholarly citation information, namely Crossref, and to make this information available in machine-readable RDF.

We are now proud to announce the third release of COCI, which contains more than 624 million DOI-to-DOI citation links coming from both the ‘Open’ and the ‘Limited’ sets of Crossref reference data. This represents an increase of 40% in the number of indexed citations, compared with the second release of COCI on 12 November 2018, which indexed more than 445 million citations. The data model used for this third release of COCI is the updated revision of the OpenCitations Data Model, published on 8 November 2019 and available at https://doi.org/10.6084/m9.figshare.3443876.
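As a quick check of the 40% figure, the growth can be recomputed from the two totals quoted above. Both totals are lower bounds ("more than"), so the result is approximate; the variable names are ours, chosen for illustration.

```python
# Totals quoted in the announcement, in millions of citations.
# Both are lower bounds ("more than"), so the result is approximate.
second_release = 445  # release of 12 November 2018
third_release = 624   # this release

# Relative growth from the second release to the third.
increase = (third_release - second_release) / second_release
print(f"Increase: {increase:.1%}")
```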

This new release of COCI has been created using new software developed specifically for this purpose, which is available on our GitHub repository under an open ISC license. This software automates the process of creating an OpenCitations Index compliant with the OpenCitations Data Model, and creates the citation data and related provenance information in three different formats: CSV, N-Triples (RDF), and Scholix. Support for Scholix (a high-level interoperability framework supported by Crossref, DataCite, Europe PubMed Central, OpenAIRE and others) has recently been added to provide an additional format for the exchange of information about the links between scholarly literature and datasets.
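To give a feel for working with the CSV format, here is a minimal sketch in Python that counts incoming citations per cited DOI. It assumes the CSV header uses the columns oci, citing, cited, creation, timespan, journal_sc and author_sc; please check the header of the file you download. The sample rows are invented placeholders, not real COCI data, and the function name is ours.

```python
import csv
import io
from collections import Counter

# Assumed column layout of the citation CSV; verify against the real header.
HEADER = "oci,citing,cited,creation,timespan,journal_sc,author_sc"

# Invented placeholder rows for illustration only (not real COCI records).
SAMPLE = "\n".join([
    HEADER,
    "placeholder-oci-1,10.9999/a,10.9999/c,2019-01-15,P2Y,no,no",
    "placeholder-oci-2,10.9999/b,10.9999/c,2019-03-02,P1Y,no,no",
    "placeholder-oci-3,10.9999/a,10.9999/b,2019-01-15,P3Y,no,no",
])


def count_incoming(csv_text):
    """Return a Counter mapping each cited DOI to its number of citations."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["cited"] for row in reader)


counts = count_incoming(SAMPLE)
for doi, n in counts.most_common():
    print(doi, n)
```

For a full dump you would stream the files from disk rather than hold the text in memory, but the per-row logic stays the same.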

A great advantage of the new software is that it enables us to extend COCI (and any other OpenCitations Index) by means of incremental additions, rather than having to re-create the entire index at each update. This should enable us to release index updates more frequently than before, keeping the index more closely synchronised with the latest reference data released by Crossref. Note that we are currently running the software on previous dumps of Crossref data, so as to retrieve all the citations involving references in citing articles that were in the ‘Limited’ set when we downloaded it, but that currently appear in the Crossref ‘Closed’ data set due to more recent restrictive policy decisions taken by their publishers.
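The idea of incremental addition can be illustrated with a small sketch. This is not the OpenCitations software, only a toy model under simple assumptions: the index is held as an in-memory set of (citing DOI, cited DOI) pairs, and the function name and sample DOIs are hypothetical.

```python
def incremental_update(index, new_links):
    """Add only unseen (citing, cited) pairs to `index`.

    `index` is modified in place; the pairs actually added are returned,
    so that only these need to be written out as the new release delta.
    """
    added = []
    for pair in new_links:
        if pair not in index:
            index.add(pair)
            added.append(pair)
    return added


# Hypothetical existing index and a newer batch of citation links.
index = {("10.9999/a", "10.9999/c"), ("10.9999/b", "10.9999/c")}
batch = [
    ("10.9999/a", "10.9999/c"),  # already indexed, skipped
    ("10.9999/d", "10.9999/a"),  # new
    ("10.9999/d", "10.9999/b"),  # new
]

delta = incremental_update(index, batch)
print(len(delta), "new citations;", len(index), "in total")
```

Only the delta has to be processed and published at each update, which is what makes more frequent releases practical.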

Finally, we wish to remind you that all the bibliographic and citation data in COCI:
