New release of COCI: 450M DOI-to-DOI citation links now available

As introduced in a previous blog post, COCI is the OpenCitations Index of Crossref open DOI-to-DOI references, all released as CC0 material. It is our first OpenCitations Index of open citations, in which we have applied the concept of citations as first-class data entities to index the contents of one of the major databases of open scholarly citation information, namely Crossref, and to make this information available in machine-readable RDF.

We are now proud to announce the second release of COCI, which contains almost 450 million DOI-to-DOI citation links coming from both the ‘Open’ and the ‘Limited’ sets of Crossref reference data. This represents an increase of 42% in the number of indexed citations compared with the initial release of COCI on 4th June 2018, which indexed 316,243,802 citations involving 45,145,889 bibliographic resources. In addition, the data model for COCI has been extended so as to state directly the presence of journal self-citations and author self-citations.

Extended data model

The previous data model used for storing the citation data in COCI – which is itself a subset of the OpenCitations Data Model – has been extended so as to keep track of two particular types of self-citation, as shown in the following figure.

Figure: The new data model used in COCI for describing its citation data, which includes classes for describing two kinds of self-citation, i.e. journal self-citations and author self-citations.

Generally speaking, a self-citation is a citation in which the citing and the cited entities have something significant in common with one another, over and beyond their subject matter. The two kinds of self-citation we are now tracking are:

  • journal self-citation (class cito:JournalSelfCitation), i.e. a citation in which the citing and the cited entities are published in the same journal. This information has been obtained by comparing the ISSNs, as provided by Crossref, of the journals in which the two journal articles related by a citation have been published. If they share the same ISSN, then the citation is described as a journal self-citation;
  • author self-citation (class cito:AuthorSelfCitation), i.e. a citation in which the citing and the cited entities have at least one author in common. This information has been obtained by comparing the ORCIDs associated with the authors of the citing bibliographic entity with the ORCIDs of the authors of the cited entity. If any ORCID is shared, then the citation is described as an author self-citation. This categorization excludes authors bearing the same name where the ORCIDs are not known since, while these instances may be author self-citations, they may alternatively merely represent name coincidences of distinct individuals.
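The two comparisons described above can be sketched in a few lines of Python. This is only an illustration, not the code used to build COCI: the function name, the argument names, and the normalisation step (trimming whitespace and upper-casing identifiers) are assumptions, as is the choice to treat each journal as having a set of ISSNs and to test for any overlap.

```python
from typing import Iterable, List

JOURNAL_SELF_CITATION = "cito:JournalSelfCitation"
AUTHOR_SELF_CITATION = "cito:AuthorSelfCitation"


def _normalise(identifiers: Iterable[str]) -> set:
    """Trim and upper-case identifiers, dropping empty values (an assumption)."""
    return {i.strip().upper() for i in identifiers if i and i.strip()}


def classify_self_citation(
    citing_issns: Iterable[str],
    cited_issns: Iterable[str],
    citing_orcids: Iterable[str],
    cited_orcids: Iterable[str],
) -> List[str]:
    """Return the self-citation classes that apply to one citation.

    A shared ISSN marks a journal self-citation; a shared ORCID marks an
    author self-citation. Author names are deliberately not compared.
    """
    classes = []
    if _normalise(citing_issns) & _normalise(cited_issns):
        classes.append(JOURNAL_SELF_CITATION)
    if _normalise(citing_orcids) & _normalise(cited_orcids):
        classes.append(AUTHOR_SELF_CITATION)
    return classes
```

With made-up identifiers, calling classify_self_citation(["1234-5678"], ["1234-5678"], [], []) yields only the journal class. When either side lacks ORCIDs, no author self-citation can be recorded, which is exactly the limitation discussed in this post.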

It is worth mentioning that, while ISSN information is usually present in the data returned by Crossref, ORCID data associated with the authors of the papers represented in Crossref are presently very limited, so that the number of recorded author self-citations in COCI is likely to be a considerable underestimate.

In this new release, COCI contains 449,842,374 citations, of which 30,114,696 are recorded as journal self-citations and 251,699 are recorded as author self-citations.
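As a quick sanity check, the figures quoted in this post can be put in proportion with a little arithmetic. The snippet below uses only the counts stated here; the variable and function names are ours.

```python
# Counts quoted in this post.
PREVIOUS_CITATIONS = 316_243_802   # initial release, 4th June 2018
CURRENT_CITATIONS = 449_842_374    # this release
JOURNAL_SELF_CITATIONS = 30_114_696
AUTHOR_SELF_CITATIONS = 251_699


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


increase = percent(CURRENT_CITATIONS - PREVIOUS_CITATIONS, PREVIOUS_CITATIONS)
journal_share = percent(JOURNAL_SELF_CITATIONS, CURRENT_CITATIONS)
author_share = percent(AUTHOR_SELF_CITATIONS, CURRENT_CITATIONS)

print(f"Increase over first release: {increase:.1f}%")
print(f"Journal self-citations:      {journal_share:.2f}% of all citations")
print(f"Author self-citations:       {author_share:.3f}% of all citations")
```

The increase comes out at roughly 42%, journal self-citations at a little under 7% of the index, and recorded author self-citations at well under 0.1%, which is consistent with the sparse ORCID coverage noted above.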

Extended REST API

The REST API for querying COCI has been extended so as to return information about the aforementioned self-citations. In particular, the response to the operations “references” and “citations” now has two more fields, i.e. “journal_sc” and “author_sc”, that are set to “yes” if the citation returned is a journal self-citation or an author self-citation respectively, or “no” otherwise.
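To illustrate how the two new fields might be consumed, here is a minimal sketch that counts self-citations in a response. It assumes the response body is a JSON array of objects; the sample records, the DOIs in them, and the "citing" and "cited" field names are hypothetical and serve only to make the example self-contained. Only "journal_sc" and "author_sc", with their "yes"/"no" values, are taken from the description above.

```python
import json
from typing import Dict, List

# Hypothetical response body with made-up DOIs, for illustration only.
SAMPLE_BODY = """
[
  {"citing": "10.1234/example.a", "cited": "10.1234/example.b",
   "journal_sc": "yes", "author_sc": "no"},
  {"citing": "10.1234/example.c", "cited": "10.1234/example.b",
   "journal_sc": "no", "author_sc": "yes"},
  {"citing": "10.1234/example.d", "cited": "10.1234/example.b",
   "journal_sc": "no", "author_sc": "no"}
]
"""


def count_self_citations(records: List[Dict[str, str]]) -> Dict[str, int]:
    """Count how many records are flagged "yes" in each self-citation field."""
    return {
        "journal_sc": sum(1 for r in records if r.get("journal_sc") == "yes"),
        "author_sc": sum(1 for r in records if r.get("author_sc") == "yes"),
        "total": len(records),
    }


records = json.loads(SAMPLE_BODY)
counts = count_self_citations(records)
print(counts)
```

Using r.get(...) rather than r[...] means that a record lacking either field is simply counted as not being a self-citation of that kind.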

Using the capabilities of the REST API, it is also possible to keep in or exclude from the result set those citations that are (or are not) one of the aforementioned types of self-citation. For instance, the following call

https://w3id.org/oc/index/coci/api/v1/citations/10.1002/pol.1987.140251103?filter=journal_sc:yes

returns all the citations to the article with DOI “10.1002/pol.1987.140251103” that are journal self-citations.
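Calls of this form can be assembled programmatically. The helper below is a minimal sketch that only builds the URL and performs no network request; the function name and its parameters are our own, and the URL pattern is inferred from the single example call above. Passing "no" as the value to exclude self-citations is an assumption based on the description of the filter, not something the example call demonstrates.

```python
from typing import Optional

API_BASE = "https://w3id.org/oc/index/coci/api/v1"


def citations_url(
    doi: str,
    filter_field: Optional[str] = None,
    filter_value: str = "yes",
) -> str:
    """Build the URL of a "citations" call, optionally with a filter.

    filter_field is expected to be "journal_sc" or "author_sc".
    """
    url = f"{API_BASE}/citations/{doi}"
    if filter_field is not None:
        url += f"?filter={filter_field}:{filter_value}"
    return url


print(citations_url("10.1002/pol.1987.140251103", "journal_sc"))
```

The printed URL is the same as the example call above. The resulting string can then be passed to any HTTP client to retrieve the data.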

Conclusions

In this blog post we have introduced the second release of COCI, the OpenCitations Index of Crossref open DOI-to-DOI references, a citation index which now contains almost 450 million open citations created from the ‘Open’ and ‘Limited’ references included within Crossref.

As a reminder, all the data in COCI are released as CC0 material.

We plan soon to extend the OpenCitations Indexes by adding indexes of citations coming from other source datasets, including Wikidata and DataCite.
