The OpenCitations Enhancement Project funded by Sloan
The Alfred P. Sloan Foundation, which funds research and education in science, technology, engineering, mathematics and economics, including a number of key technology projects relating to scholarly communication, has agreed to fund The OpenCitations Enhancement Project, a new project to develop and enhance the OpenCitations Corpus.
As readers of this blog will know, the OpenCitations Corpus is an open scholarly citation database that freely and legally makes available accurate citation data (academic references) to assist scholars with their academic studies, and to serve knowledge to the wider public.
The OpenCitations Enhancement Project, funded by the Sloan Foundation for 18 months from May 2017, will make the OpenCitations Corpus (OCC) more useful to the academic community both by significantly expanding the volume of citation data held within the Corpus, and by developing novel data visualizations and query services over the stored data.
At OpenCitations, we will achieve these objectives in the following ways:
(a) By establishing a powerful new physical server to handle the Corpus data and offer adequate performance for query services.
(b) By integrating 30 small data-ingest computers (Raspberry Pi 3Bs) with this server, working in parallel to harvest references, thus increasing the current rate of data ingest into the Corpus some thirty-fold, to about half a million citation links per day.
(c) By employing a post-doctoral computer science research engineer specifically to develop information visualisation interfaces and sense-making tools that will both provide smart ways of envisaging and comprehending the citation data stored within the OpenCitations Corpus, and will also ease the task of manual curation of the OCC.
This post-doctoral appointment will start in the autumn of 2017, once the new hardware has been commissioned and programmed. We seek a highly intelligent, skilled and motivated individual who is an expert in Web Interface Design and Information Visualization, and who can demonstrate a commitment to increasing the openness of scholarly information. A formal advertisement for this post, which will be held at the University of Bologna in Italy under the supervision of Dr Silvio Peroni, will be published in the near future. In the meantime, individuals with the relevant skills and background who would like to express early interest in joining the OpenCitations team in this role should contact Dr Peroni by e-mail at <email@example.com>.
By the end of the OpenCitations Enhancement Project, we will have harvested approximately 190 million citation links obtained from the reference lists of about 4.4 million scholarly articles (~15% of Web of Science’s coverage). In this way, in a significant initial step towards the comprehensive literature coverage we seek for the OCC, we will establish the OpenCitations Corpus as a valuable and persistent free-to-use global scholarly on-line Linked Open Data service.
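For readers who like to check the numbers, the figures quoted in this post can be put through a rough back-of-envelope calculation. The sketch below uses only the round numbers given above; the function names are illustrative, and the assumption of a constant ingest rate is ours rather than a project commitment, so the results are indicative only.

```python
# Back-of-envelope check of the round figures quoted in this post.
# Assumption (ours, not a project commitment): ingest runs at a constant rate.

TARGET_RATE_PER_DAY = 500_000   # "about half a million citation links per day"
SPEEDUP = 30                    # "some thirty-fold" increase
TOTAL_LINKS = 190_000_000       # citation links expected by the end of the project
TOTAL_ARTICLES = 4_400_000      # citing articles whose reference lists are harvested


def implied_current_rate(target_rate: float, speedup: float) -> float:
    """Current ingest rate implied by the target rate and the speed-up factor."""
    return target_rate / speedup


def links_per_article(total_links: float, total_articles: float) -> float:
    """Average number of citation links per citing article."""
    return total_links / total_articles


def days_to_harvest(total_links: float, rate_per_day: float) -> float:
    """Days needed to ingest total_links at a constant daily rate."""
    return total_links / rate_per_day


if __name__ == "__main__":
    current = implied_current_rate(TARGET_RATE_PER_DAY, SPEEDUP)
    average = links_per_article(TOTAL_LINKS, TOTAL_ARTICLES)
    days = days_to_harvest(TOTAL_LINKS, TARGET_RATE_PER_DAY)
    print(f"Implied current rate: about {current:,.0f} links per day")
    print(f"Average references per article: about {average:.1f}")
    print(f"Days at the target rate: {days:.0f}")
```

On these round numbers, the implied current rate is roughly 17,000 citation links per day, each article contributes about 43 references on average, and harvesting 190 million links at the target rate would take about 380 days, which fits within the 18-month funding period.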
In so doing, we aim to empower the global community by liberating scholarly citation data from their current commercial shackles, publishing such data with a Creative Commons CC0 Public Domain Dedication that will enable novel third-party services to be built over them.