Citations as First-Class Data Entities: The Open Citation Identifier Resolution Service

Requirements for citations to be treated as first-class data entities

In my introductory blog post, I listed five requirements for the treatment of citations as first-class data entities.  The fifth and final of these requirements is that there must be a Web-based identifier resolution service that takes the citation identifier as input and returns a description of the citation.

At the recent PIDapalooza Conference on persistent identifiers, held in Girona, Spain, I described the Open Citation Identifier Resolution Service, the new resolution service for Open Citation Identifiers created and operated by OpenCitations [1].

In this post, I describe this Open Citation Identifier Resolution Service, which supports the resolution of Open Citation Identifiers not only of the citations documented in the OpenCitations Corpus (OCC), but also of open citations recorded in other bibliographic databases.

What is the Open Citation Identifier Resolution Service?

The Open Citation Identifier Resolution Service runs on the OpenCitations server, presenting itself to the user as a web page with the URI http://opencitations.net/oci.

When a user enters a valid OCI and clicks the “Look up citation” button, this activates the resolution service, which, after a brief delay, returns information about the citation itself and about the citing and cited bibliographic resources, as shown in the following screen image (which for clarity omits the provenance data associated with this citation).

This information can optionally be returned to the user in a variety of other formats: RDF/XML, Turtle or JSON-LD.

Clicking on the links provided will return additional metadata held by the OpenCitations Corpus for the citing and the cited documents.  In the near future, this service will be integrated with LUCINDA, the forthcoming OCC browse interface, to present this information in a more user-friendly fashion.

Using the Resolution Service with citations in an external resource via a SPARQL endpoint

The Open Citation Identifier Resolution Service currently works for citations between bibliographic resources both within the OpenCitations Corpus and within external bibliographic databases. For an external database, two conditions must be met: its bibliographic resource identifiers must have a unique numerical part, and it must provide a SPARQL endpoint that makes available information about bibliographic resources and the references they contain.

It can therefore resolve OCIs identifying citations within Wikidata, such as oci:01027931310-01022252312, where, as explained in the previous blog post, “010” is the assigned OCC supplier prefix for Wikidata.

Entering this OCI in the Open Citation Identifier Resolution Service pulls live data from the Wikidata SPARQL endpoint and returns information about that citation, as shown in the following screen image (which again, for clarity, omits the provenance data associated with that citation):

Clicking on the links provided here returns information about the relevant Wikidata entities.

Citing paper:

Cited paper:

How the Resolution Service works

The bibliographic database supplying the metadata for a particular citation identified by an OCI is specified by the assigned OCC supplier prefix that forms part of the OCI, as described in the previous blog post. Each OCI is thus specific for and unique within a particular bibliographic database.

The resolution service takes the OCI entered into the search box, recognises the supplier prefix specifying the bibliographic database holding the citation information, parses the OCI into the database identifiers for the citing and cited entities, and then sends an appropriate SPARQL query to the SPARQL endpoint of the relevant database. When that database has returned information about the citation itself and about the citing and cited bibliographic resources, this is displayed to the user as shown in the screen images above, or is returned in one of the RDF formats (Turtle, JSON-LD, RDF/XML) if the request asks for it.
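The parsing steps just described can be illustrated with a minimal Python sketch. This is not the code that runs on the OpenCitations server, and several things in it are my own assumptions for illustration: the function names parse_oci and build_wikidata_query, the supplier table (of which only the "010" prefix for Wikidata comes from this post), the idea that the numerical part of the identifier maps directly onto a Wikidata Q identifier, and the use of a simple ASK query with Wikidata's "cites work" property (P2860). The query actually sent by the service will be richer, since it also retrieves metadata about the citing and cited resources.

```python
import re

# Supplier table for this sketch. Only the "010" prefix for Wikidata is taken
# from the post; the structure of the table itself is an assumption.
SUPPLIERS = {
    "010": {"name": "Wikidata", "endpoint": "https://query.wikidata.org/sparql"},
}

# An OCI is written here as "oci:<citing digits>-<cited digits>".
OCI_PATTERN = re.compile(r"^oci:(\d+)-(\d+)$")


def parse_oci(oci):
    """Return (supplier prefix, citing identifier, cited identifier)."""
    match = OCI_PATTERN.match(oci.strip())
    if match is None:
        raise ValueError(f"Not a well-formed OCI: {oci!r}")
    citing, cited = match.groups()
    # Both halves of the OCI must carry the same known supplier prefix.
    for prefix in SUPPLIERS:
        if citing.startswith(prefix) and cited.startswith(prefix):
            return prefix, citing[len(prefix):], cited[len(prefix):]
    raise ValueError(f"No known supplier prefix in: {oci!r}")


def build_wikidata_query(citing_id, cited_id):
    """Build a SPARQL ASK query testing whether the citation exists in Wikidata.

    Assumption: the numerical identifiers map to Q identifiers by prepending "Q".
    The wd: and wdt: prefixes are predefined by the Wikidata query service.
    """
    return f"ASK {{ wd:Q{citing_id} wdt:P2860 wd:Q{cited_id} }}"


prefix, citing_id, cited_id = parse_oci("oci:01027931310-01022252312")
query = build_wikidata_query(citing_id, cited_id)
print(SUPPLIERS[prefix]["name"], citing_id, cited_id)
print(query)
```

The sketch stops at building the query string and does not contact any endpoint. Sending the query to the endpoint listed in the supplier table, and formatting the response, would be the remaining steps.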

It is important to realize that no other databases are contacted during this resolution process, and that the quality and accuracy of the metadata retrieved by the Open Citation Identifier Resolution Service is the responsibility of the database hosting that citation.  The OCI Resolution Service does no more than retrieve this information, and does nothing to address possible errors or omissions in the metadata coming from the hosting database.

Using the Resolution Service with external citations via a REST API

While the resolution service presently works only to retrieve information from bibliographic databases having a SPARQL endpoint, we plan soon to extend this resolution service to work with information supplied by a bibliographic database via a REST API.

Coupled with the ability to create OCIs by numerical conversions of Digital Object Identifiers (DOIs), as explained in the previous blog post, the Open Citation Identifier Resolution Service could then be used to pull metadata live from the Crossref REST API for any of the ~350 million Crossref open references in which the cited paper as well as the citing paper has a DOI, and for which an OCI can thus be created.

Watch this space!

References

[1] David Shotton (2018). Citations as first-class data entities: Open Citation Identifiers. Conference presentation. PIDapalooza 2018, Girona, 23-24 January 2018. https://doi.org/10.6084/m9.figshare.5844972
