Bibliographic references are the links that knit together independent scholarly endeavours. I am thus delighted to announce that Nature Publishing Group, publisher of Nature, Nature Genetics and many other leading journals, has agreed to open its articles’ reference lists, initially for a selected number of NPG journals, and to contribute the bibliographic citations contained in these lists as open linked data to an expanded Open Citations Corpus, where they will be freely available for everyone to use in whatever manner they choose. Preparations to expand the corpus in this manner, by integrating it with the reference processing pipeline of the CrossRef Cited-By Linking service, will be undertaken over the next six months, and incorporation of the references from the selected NPG journals into the expanded Open Citations Corpus is planned to commence in the first half of 2013.
As the first subscription-access publisher to open its reference lists in this way, Nature Publishing Group is further demonstrating its commitment to ‘lead from the front’ in its embrace of new semantic publishing technologies. Only two months ago, this publisher announced its decision to open up the bibliographic records of its journal articles as open linked data. On 4th April, NPG’s Linked Data Platform press release read:
“Nature Publishing Group (NPG) today is pleased to join the linked data community by opening up access to its publication data via a linked data platform. NPG’s Linked Data Platform is available at http://data.nature.com. The platform includes more than 20 million Resource Description Framework (RDF) statements, including primary metadata for more than 450,000 articles published by NPG since 1869. In this first release, the datasets include basic bibliographic information (title, author, publication date, etc) as well as NPG-specific ontology terms. These datasets are being released under an open metadata license, Creative Commons Zero (CC0), which permits maximal use/re-use of this data. NPG’s platform allows for easy querying, exploration and extraction of data and relationships about articles, contributors, publications, and subjects. Users can run web-standard SPARQL Protocol and RDF Query Language (SPARQL) queries to obtain and manipulate data stored as RDF. The platform uses standard vocabularies such as Dublin Core, FOAF, PRISM, BIBO and OWL, and the data is integrated with existing public datasets including CrossRef and PubMed. More information about NPG’s Linked Data Platform is available at http://developers.nature.com/docs. Sample queries can be found at http://data.nature.com/query.”
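To give a flavour of what running a SPARQL query against such a platform might involve, here is a minimal Python sketch that builds a query and encodes it as a SPARQL Protocol GET request. It is an illustration only. I am assuming that article titles are exposed through the Dublin Core `dc:title` property and that the sample-query address named in the press release also accepts protocol requests; neither is confirmed by the release, and the helper names `build_title_query` and `build_request_url` are my own. The sketch only builds the request URL and does not send it.

```python
from urllib.parse import urlencode

# Assumption: the press release gives this address for sample queries; whether
# it also serves as a SPARQL Protocol endpoint is not confirmed.
ENDPOINT = "http://data.nature.com/query"


def build_title_query(limit: int = 10) -> str:
    """Return a SPARQL query listing resources that carry a Dublin Core title.

    Assumption: titles are published as dc:title; the platform's actual
    data model may use a different property.
    """
    return (
        "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n"
        "SELECT ?article ?title\n"
        "WHERE { ?article dc:title ?title }\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query in the 'query' parameter of a SPARQL Protocol GET request."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # Print the request URL rather than sending it, since the endpoint is assumed.
    print(build_request_url(ENDPOINT, build_title_query(5)))
```

The resulting URL could be passed to any HTTP client, and the same query text could be pasted into a query form if the platform offers one.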
We very much hope that NPG’s example will encourage other subscription-access publishers to open their own journal article reference lists and become early adopters of Open Citations. Reasons why subscription-access publishers should willingly join NPG and open their citation data are given in my previous blog post. While coverage among subscription-access publishers will at first be incomplete, this expanded Open Citations Corpus will, I am sure, draw in increasing numbers of publishers, and in true Web 2.0 style will become more useful the more publishers participate, leading to the creation of value-added bibliographic and bibliometric services over the open data. Other subscription-access publishers who would like to contribute their journal article references to the Open Citations Corpus should contact me at <email@example.com>.