Openness of non-Elsevier references

For completeness, this post complements the two preceding posts by detailing the openness of references from scholarly publishers other than Elsevier. Like them, it is based on analyses performed by Daniel Ecer of eLife (d.ecer@elifesciences.org) on data he downloaded from Crossref in September 2017 (Ecer, 2017).

 The main conclusion is that, of the 650,093,489 references stored in Crossref from journal articles published by publishers other than Elsevier, 486,041,671 (74.76%) are open.

The detailed statistics derived from the Crossref data at the time of sampling relating to all publishers except Elsevier are as follows:

Number of works recorded at Crossref from publishers other than Elsevier

Crossref has records of 93,184,372 works with DOIs, of which 69,699,633 (74.80%) are journal articles and 23,484,739 (25.20%) are works that are not journal articles (i.e. book chapters, proceedings articles, datasets, etc.).

Of the 93,184,372 works, 76,795,932 (82.41%) were published by publishers other than Elsevier.

Of the 69,699,633 journal articles, 54,440,761 (78.11%) were in journals with publishers other than Elsevier.

Of the 23,484,739 works that are not journal articles, 22,355,171 (95.19%) were published by publishers other than Elsevier.

Numbers of non-Elsevier works with references

Of all 76,795,932 works with DOIs recorded in Crossref from publishers other than Elsevier, 27,609,963 (35.95%) have accompanying references and 49,185,969 (64.05%) lack references.

Of the 54,440,761 journal articles recorded in Crossref from publishers other than Elsevier, 23,459,805 (43.09%) have accompanying references, and 30,980,956 (56.91%) lack references.

Of the 22,355,171 works that are not journal articles recorded in Crossref from publishers other than Elsevier, 4,150,158 (18.56%) have accompanying references, and 18,205,013 (81.44%) lack references.

Number of non-Elsevier references at Crossref

Of the 1,075,133,743 references stored in Crossref from all works, 732,513,350 (68.13%) are from works published by publishers other than Elsevier.

Of the 956,050,193 references stored in Crossref from journal articles, 650,093,489 (68.00%) are from journals published by publishers other than Elsevier.

Of the 119,083,550 references stored in Crossref from works that are not journal articles, 82,419,861 (69.21%) are from works published by publishers other than Elsevier.

Average numbers of references per non-Elsevier work

The 732,513,350 non-Elsevier references stored in Crossref come from 27,609,963 works of all types with accompanying references, giving an average of 26.53 references per work.

650,093,489 non-Elsevier references come from 23,459,805 non-Elsevier journal articles with accompanying references, giving an average of 27.71 references per journal article.

82,419,861 non-Elsevier references come from 4,150,158 non-Elsevier works with accompanying references that are not journal articles, averaging 19.86 references per work.
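
These averages can be reproduced directly from the counts quoted above. The following is a minimal Python sketch of that arithmetic; the variable and function names are illustrative and are not taken from the Crossref Data Notebook.

```python
# Counts quoted in this post: (references, works with accompanying references).
non_elsevier = {
    "all works": (732_513_350, 27_609_963),
    "journal articles": (650_093_489, 23_459_805),
    "non-journal works": (82_419_861, 4_150_158),
}


def average_references(references: int, works: int) -> float:
    """Mean number of references per work that has accompanying references."""
    return round(references / works, 2)


averages = {
    name: average_references(refs, works)
    for name, (refs, works) in non_elsevier.items()
}

for name, value in averages.items():
    print(f"{name}: {value} references per work")
```

Note that the denominator is the number of works that have references, not the total number of works, so these are averages per reference-bearing work.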

Proportion of non-Elsevier works that have open references

Of the 27,598,963 non-Elsevier works of all types documented in Crossref that have accompanying references, 18,228,221 (66.05%) have open references.

Of the 23,459,805 non-Elsevier journal articles documented in Crossref that have accompanying references, 17,072,801 (72.77%) have open references.

Of the 4,139,158 non-Elsevier works documented in Crossref that are not journal articles and that have accompanying references, 1,155,420 (27.91%) have open references.

Proportion of non-Elsevier references that are open

Of the 732,513,350 references stored in Crossref from all works published by publishers other than Elsevier, 523,186,205 (71.42%) are open, and 209,327,145 (28.58%) are not open.

Of the 650,093,489 references stored in Crossref from journal articles published by publishers other than Elsevier, 486,041,671 (74.76%) are open, and 164,051,818 (25.24%) are not open.

Of the 82,419,861 references stored in Crossref from works published by publishers other than Elsevier that are not journal articles, 37,144,534 (45.07%) are open, and 45,275,327 (54.93%) are not open.
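
As a cross-check, the open and not-open counts in each category should sum to the category total, and the two percentages should sum to 100. A short Python sketch of that check, using only the figures quoted above (the names are illustrative):

```python
# (open, not open) reference counts for non-Elsevier works, as quoted above.
counts = {
    "all works": (523_186_205, 209_327_145),
    "journal articles": (486_041_671, 164_051_818),
    "non-journal works": (37_144_534, 45_275_327),
}


def open_share(open_refs: int, closed_refs: int) -> tuple[int, float, float]:
    """Return the category total and the open / not-open percentages."""
    total = open_refs + closed_refs
    pct_open = 100 * open_refs / total
    return total, round(pct_open, 2), round(100 - pct_open, 2)


shares = {name: open_share(*pair) for name, pair in counts.items()}

for name, (total, pct_open, pct_closed) in shares.items():
    print(f"{name}: {total:,} references, {pct_open}% open, {pct_closed}% not open")
```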

Proportion of references which are not open that are published by publishers other than Elsevier

Of the 551,932,682 references from all works stored at Crossref that are not open, 209,327,145 (37.93%) are from works published by publishers other than Elsevier.

Of the 470,008,522 references from journal articles stored at Crossref that are not open, 164,051,818 (34.90%) are from journal articles published by publishers other than Elsevier.

Of the 81,924,160 references from works that are not journal articles stored at Crossref that are not open, 45,275,327 (55.26%) are from works published by publishers other than Elsevier.

 

 Details for all publishers combined, and for Elsevier separately, are given in the two previous posts.

 

Reference

Ecer, D. (2017). Crossref Data Notebook. Available at https://elifesci.org/crossref-data-notebook

 

Posted in open access, Open Citations, Open scholarship

Elsevier references dominate those that are not open at Crossref

Yesterday (November 23rd 2017) I was working with Daniel Ecer of eLife (d.ecer@elifesciences.org) to dig some hard facts out of the analyses he undertook on data he downloaded from Crossref in September 2017 (Ecer, 2017). In this, the second of two related posts, I report the results for references from works published by Elsevier, which I have singled out because of its dominant position in the scholarly publishing world.

These show that, of all 956,050,193 references from journal articles stored at Crossref, 305,956,704 (32.00%) are from journal articles published by Elsevier, none of which are in the Crossref “Open” category, freely available for others to use.

Put another way, of the 470,008,522 references from journal articles stored at Crossref that are not open, 305,956,704 (65.10%) are from journals published by Elsevier.

On behalf of I4OC, I appeal to Elsevier to join the other major academic publishers and to submit and open all its references without delay.

The detailed statistics derived from the Crossref data at the time of sampling relating to Elsevier publications are as follows:

Number of Elsevier works recorded at Crossref

Crossref has records of 93,184,372 works with DOIs, of which 69,699,633 (74.80%) are journal articles and 23,484,739 (25.20%) are works that are not journal articles (i.e. book chapters, proceedings articles, datasets, etc.).

Of the 93,184,372 works of all types, 16,388,440 (17.59%) were published by Elsevier.

Of the 69,699,633 journal articles, 15,258,872 (21.89%) were published in Elsevier journals.

Of the 23,484,739 works that are not journal articles, 1,129,568 (4.81%) were published by Elsevier.

Numbers of Elsevier works with references

Of all 16,388,440 Elsevier works with DOIs recorded in Crossref, 10,835,273 (66.12%) have accompanying references, and 5,553,167 (33.88%) lack references.

Of the 15,258,872 Elsevier journal articles recorded in Crossref, 10,212,958 (66.93%) have accompanying references, and 5,045,914 (33.07%) lack references.

Of the 1,129,568 Elsevier works that are not journal articles recorded in Crossref, 622,315 (55.09%) have accompanying references, and 507,253 (44.91%) lack references.

Number of Elsevier references at Crossref

Of the 1,075,133,743 references stored in Crossref from all works, 342,620,393 (31.87%) are from works published by Elsevier.

Of the 956,050,193 references stored in Crossref from journal articles, 305,956,704 (32.00%) are from journals published by Elsevier.

Of the 119,083,550 references stored in Crossref from works that are not journal articles, 36,663,689 (30.79%) are from works published by Elsevier.

Average numbers of references per Elsevier work

The 342,620,393 Elsevier references stored in Crossref come from 10,835,273 works of all types with accompanying references, giving an average of 31.62 references per work.

305,956,704 Elsevier references come from 10,212,958 Elsevier journal articles with accompanying references, giving an average of 29.96 references per journal article.

36,663,689 Elsevier references come from 622,315 Elsevier works with accompanying references that are not journal articles, averaging 58.92 references per work.

Proportion of Elsevier works that have open references

Of the 10,846,273 Elsevier works of all types documented in Crossref that have accompanying references, 417 (0.0038%) have open references.

Of the 10,212,958 Elsevier journal articles documented in Crossref that have accompanying references, none (0.0000%) have open references.

Of the 633,315 Elsevier works documented in Crossref that are not journal articles and that have accompanying references, 417 (0.0658%) have open references.

Proportion of Elsevier references that are open

Of the 342,620,393 references stored in Crossref from works of all types published by Elsevier, 14,856 (0.0043%) are open, and 342,605,537 (99.9957%) are closed.

Of the 305,956,704 references stored in Crossref from journal articles published by Elsevier, none (0.0000%) are open, 100% being closed.

Of the 36,663,689 references stored in Crossref from works published by Elsevier that are not journal articles, 14,856 (0.0405%) are open, and 36,648,833 (99.9595%) are closed.

Proportion of references which are not open that are published by Elsevier

Of the 551,932,682 references from all works stored at Crossref that are not open, 342,605,537 (62.07%) are from works published by Elsevier.

Of the 470,008,522 references from journal articles stored at Crossref that are not open, 305,956,704 (65.10%) are from journal articles published by Elsevier.

Of the 81,924,160 references from works that are not journal articles stored at Crossref that are not open, 36,648,833 (44.74%) are from works published by Elsevier.
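
These shares can be checked against the corresponding figures for all other publishers, since the two subtotals should add up to the overall number of references that are not open. A minimal Python sketch of that check follows; the counts for other publishers are those given in the following post, and the names are illustrative.

```python
# Not-open reference counts: (Elsevier, other publishers, all publishers).
not_open = {
    "all works": (342_605_537, 209_327_145, 551_932_682),
    "journal articles": (305_956_704, 164_051_818, 470_008_522),
    "non-journal works": (36_648_833, 45_275_327, 81_924_160),
}


def elsevier_share(elsevier: int, others: int, total: int) -> float:
    """Percentage of not-open references that come from Elsevier works."""
    if elsevier + others != total:
        raise ValueError("publisher subtotals do not add up to the overall total")
    return round(100 * elsevier / total, 2)


elsevier_shares = {name: elsevier_share(*row) for name, row in not_open.items()}

for name, pct in elsevier_shares.items():
    print(f"{name}: {pct}% of not-open references are from Elsevier")
```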

 

Details for all publishers combined are given in the previous post, and those for all publishers other than Elsevier in the following post.

 

[Note: As a result of further calculations undertaken by Daniel Ecer on 27th November 2017, which are recorded in his updated Crossref Data Notebook (Ecer, 2017), the figures in this blog have been expanded to show the average number of references for Elsevier works submitting references. At the same time, very minor corrections have been made to the total numbers of works in each category, which have not altered the percentages and main conclusions presented in this post.]

 

Reference

Ecer, D. (2017). Crossref Data Notebook. Available at https://elifesci.org/crossref-data-notebook

 

Posted in open access, Open Citations, Open scholarship

Milestone for I4OC – open references at Crossref exceed 50%

Yesterday (November 23rd 2017) I was working with Daniel Ecer of eLife (d.ecer@elifesciences.org) to dig some hard facts out of the analyses he undertook on data he downloaded from Crossref in September 2017 (Ecer, 2017).  In this, the first of two related posts, I report the results for all publishers.

The analyses show that, of the 33,672,763 journal articles documented in Crossref that have accompanying references, 17,072,801 (50.70%) have open references, and of the 956,050,193 references from journal articles stored at Crossref, 486,041,671 (50.84%) are now classified as “Open”, and are freely available for third parties to download and use for any purpose.

This is a significant milestone for the Initiative for Open Citations (I4OC, https://i4oc.org/), which since early 2017 has been campaigning for scholarly publishers to open their reference lists, and a major gain for the world of open scholarship.

The academic community is deeply indebted to all those publishers whose references are now open (https://i4oc.org/#publishers), and to Crossref itself (https://www.crossref.org/), for making these references freely available, providing a tremendous resource for bibliometric analysis.

However, 51.7% of the journal articles recorded in Crossref lack accompanying references, and of the references that are submitted together with the metadata for the remaining journal articles, 49.16% are not yet open.

On behalf of I4OC, I strongly encourage those publishers who are not yet submitting references to Crossref with their article metadata to start to do so, and those other publishers who are submitting references but have not yet made them open to open them without delay, by sending a message requesting this to support@crossref.org.

The detailed statistics derived from the Crossref data at the time of sampling relating to all publishers are as follows:

Works with DOIs documented at Crossref

Crossref has records of 93,184,372 works with DOIs, of which 69,699,633 (74.80%) are journal articles and 23,484,739 (25.20%) are works that are not journal articles (i.e. book chapters, proceedings articles, datasets, etc.).

Numbers of works with references

Of all 93,184,372 works with DOIs recorded in Crossref, 38,445,236 (41.3%) have accompanying references and 54,739,136 (58.7%) lack references.

Of the 69,699,633 journal articles recorded in Crossref, 33,672,763 (48.3%) have accompanying references, and 36,026,870 (51.7%) lack references.

Of the 23,484,739 works that are not journal articles recorded in Crossref, 4,772,473 (20.32%) have accompanying references, and 18,712,266 (79.68%) lack references.

Numbers of references at Crossref

Crossref stores 1,075,133,743 references from all 93,184,372 works.

Of these references, 956,050,193 references (88.92%) are from journal articles, and 119,083,550 references (11.08%) are from works that are not journal articles.

Average numbers of references per work

The 1,075,133,743 references stored in Crossref come from 38,445,236 works of all types with accompanying references, giving an average of 27.97 references per work.

956,050,193 references come from 33,672,763 journal articles with accompanying references, giving an average of 28.39 references per journal article.

119,083,550 references come from 4,772,473 works with accompanying references that are not journal articles, averaging 24.95 references per work.

Proportion of works that have open references

Of the 38,445,236 works of all types documented in Crossref that have accompanying references, 18,228,638 (47.41%) have open references.

Of the 33,672,763 journal articles documented in Crossref that have accompanying references, 17,072,801 (50.70%) have open references.

Of the 4,772,473 works documented in Crossref that are not journal articles and that have accompanying references, 1,155,837 (24.22%) have open references.

Proportion of references that are open

Of the 1,075,133,743 references stored in Crossref from works of all types, 523,201,061 (48.66%) are open, and 551,932,682 (51.34%) are not open.

Of the 956,050,193 references stored in Crossref from journal articles, 486,041,671 (50.84%) are open, and 470,008,522 (49.16%) are not open.

Of the 119,083,550 references stored in Crossref from works that are not journal articles, 37,159,390 (31.20%) are open, and 81,924,160 (68.80%) are not open.

The majority of the references that are not yet open are from works published by Elsevier, as detailed in the next post.

 

[Note: As a result of further calculations undertaken by Daniel Ecer on 27th November 2017, which are recorded in his updated Crossref Data Notebook (Ecer, 2017), the figures in this blog have been expanded to show the average number of references for works submitting references. At the same time, very minor corrections have been made to the total numbers of works in each category, which have not altered the percentages and main conclusions presented in this post.]

 

Reference

Ecer, D. (2017). Crossref Data Notebook. Available at https://elifesci.org/crossref-data-notebook

Posted in open access, Open Citations, Open scholarship

The Sloan Foundation funds OpenCitations

The OpenCitations Enhancement Project funded by Sloan

The Alfred P. Sloan Foundation, which funds research and education in science, technology, engineering, mathematics and economics, including a number of key technology projects relating to scholarly communication, has agreed to fund The OpenCitations Enhancement Project, a new project to develop and enhance the OpenCitations Corpus.

As readers of this blog will know, the OpenCitations Corpus is an open scholarly citation database that freely and legally makes available accurate citation data (academic references) to assist scholars with their academic studies, and to serve knowledge to the wider public.

Objectives

The OpenCitations Enhancement Project, funded by the Sloan Foundation for 18 months from May 2017, will make the OpenCitations Corpus (OCC) more useful to the academic community both by significantly expanding the volume of citation data held within the Corpus, and by developing novel data visualizations and query services over the stored data.

At OpenCitations, we will achieve these objectives in the following ways:

(a) By establishing a new powerful physical server to handle the Corpus data and offer adequate performance for query services.

(b) By increasing the rate of data ingest into the Corpus, by integrating with the server 30 small data-ingest computers, Raspberry Pi 3Bs, working in parallel to harvest references, thus increasing the current rate of corpus data ingest some thirty-fold to about half a million citation links per day.

(c) By employing a post-doctoral computer science research engineer specifically to develop information visualisation interfaces and sense-making tools that will both provide smart ways of envisaging and comprehending the citation data stored within the OpenCitations Corpus, and will also ease the task of manual curation of the OCC.

Personnel

This post-doctoral appointment will start in the autumn of 2017, once the new hardware has been commissioned and programmed. We seek a highly intelligent, skilled and motivated individual who is an expert in Web Interface Design and Information Visualization, and who can demonstrate a commitment to increasing the openness of scholarly information. A formal advertisement for this post, which will be held at the University of Bologna in Italy under the supervision of Dr Silvio Peroni, will be published in the near future. In the meantime, individuals with the relevant skills and background who would like to express early interest in joining the OpenCitations team in this role should contact him by e-mail at <silvio.peroni@opencitations.net>.

Expected Outcomes

By the end of the OpenCitations Enhancement Project, we will have harvested approximately 190 million citation links obtained from the reference lists of about 4.4 million scholarly articles (~15% of Web of Science’s coverage). In this way, in a significant initial step towards the comprehensive literature coverage we seek for the OCC, we will establish the OpenCitations Corpus as a valuable and persistent free-to-use global scholarly on-line Linked Open Data service.
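
As a rough sanity check on these targets, the figures quoted in this post can be combined in a few lines of Python. This is only a back-of-the-envelope sketch: it assumes a constant ingest rate from day one and ignores commissioning time, downtime and the citation links already in the Corpus. The constant names are illustrative.

```python
TARGET_LINKS = 190_000_000    # citation links expected by the end of the project
TARGET_ARTICLES = 4_400_000   # citing articles expected by the end of the project
LINKS_PER_DAY = 500_000       # planned ingest rate ("about half a million")
SPEED_UP = 30                 # the "some thirty-fold" increase in ingest rate

# Average number of citation links harvested per citing article.
links_per_article = TARGET_LINKS / TARGET_ARTICLES

# Days of ingest needed at the planned rate, if that rate were sustained.
days_at_planned_rate = TARGET_LINKS / LINKS_PER_DAY

# Ingest rate implied for the existing set-up, before the thirty-fold increase.
implied_current_rate = LINKS_PER_DAY / SPEED_UP

print(f"{links_per_article:.1f} links per article")
print(f"{days_at_planned_rate:.0f} days at the planned rate")
print(f"{implied_current_rate:,.0f} links per day at present")
```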

In so doing, we aim to empower the global community by liberating scholarly citation data from their current commercial shackles, publishing such data with a Creative Commons CC0 Public Domain Dedication that will enable novel third-party services to be built over them.

Posted in Data publication, Information visualization, Open Citations, Semantic Publishing, Web interface design

Querying the OpenCitations Corpus

OpenCitations makes available a SPARQL endpoint for querying the data included in the OpenCitations Corpus. While many queries are possible according to the model described on the website (and, in more detail, in the official metadata document of the Corpus), we have received requests from users of the service for exemplar queries. We have chosen two, which are particularly relevant to the work done in recent months by the Initiative for Open Citations, which we have already introduced in another blog post.

Query: return all the papers (including their titles) citing the article with DOI “10.1038/227680a0”.

PREFIX cito: <http://purl.org/spar/cito/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX datacite: <http://purl.org/spar/datacite/>
PREFIX literal: <http://www.essepuntato.it/2010/06/literalreification/>
SELECT ?citing ?title WHERE {
  ?id a datacite:Identifier ;
    datacite:usesIdentifierScheme datacite:doi ;
    literal:hasLiteralValue "10.1038/227680a0" .
  ?br 
    datacite:hasIdentifier ?id ;
    ^cito:cites ?citing .
  ?citing dcterms:title ?title
}
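
For readers who prefer to run the query above from a script rather than from a web form, the following Python sketch shows one way to do so with the standard library only. It follows the SPARQL 1.1 Protocol (a GET request with a `query` parameter, asking for JSON results). The endpoint address and the helper names are assumptions made for illustration, so please check the OpenCitations website for the address currently in service.

```python
import json
import urllib.parse
import urllib.request
from string import Template

# Assumption: illustrative endpoint address, not confirmed by this post.
ENDPOINT = "http://opencitations.net/sparql"

# The first exemplar query, with the DOI turned into a placeholder.
CITING_QUERY = Template("""\
PREFIX cito: <http://purl.org/spar/cito/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX datacite: <http://purl.org/spar/datacite/>
PREFIX literal: <http://www.essepuntato.it/2010/06/literalreification/>
SELECT ?citing ?title WHERE {
  ?id a datacite:Identifier ;
    datacite:usesIdentifierScheme datacite:doi ;
    literal:hasLiteralValue "$doi" .
  ?br datacite:hasIdentifier ?id ;
    ^cito:cites ?citing .
  ?citing dcterms:title ?title
}
""")


def citing_query(doi: str) -> str:
    """Fill the DOI into the query, refusing characters that would break the literal."""
    if any(ch in doi for ch in '"\\\n'):
        raise ValueError("DOI contains characters that are not allowed here")
    return CITING_QUERY.substitute(doi=doi)


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Prepare (but do not send) a SPARQL Protocol GET request for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query: str) -> list:
    """Send the query and return the list of result bindings (needs network access)."""
    with urllib.request.urlopen(build_request(query)) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_query(citing_query("10.1038/227680a0"))` would then return one binding per citing paper, each with a `citing` and a `title` entry.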

Query: return all the papers cited by the bibliographic resource “br/4186” included in the OCC, including the text of bibliographic references used in “br/4186” for making the citations and the titles of the cited papers.

PREFIX cito: <http://purl.org/spar/cito/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX biro: <http://purl.org/spar/biro/>
PREFIX frbr: <http://purl.org/vocab/frbr/core#>
PREFIX c4o: <http://purl.org/spar/c4o/>
SELECT ?cited ?cited_ref ?title WHERE {
  <https://w3id.org/oc/corpus/br/4186> cito:cites ?cited .
  OPTIONAL { 
    <https://w3id.org/oc/corpus/br/4186> frbr:part ?ref .
    ?ref biro:references ?cited ;
      c4o:hasContent ?cited_ref 
  }
  OPTIONAL { ?cited dcterms:title ?title }
}
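
Because this query uses OPTIONAL clauses, some result rows will lack a `cited_ref` or a `title`. In the standard SPARQL JSON results format, such unbound variables are simply absent from the row. The Python sketch below shows one way of flattening the results while coping with the missing values. The sample response is hypothetical (the IRIs and texts are placeholders, not real Corpus data), and the function name is illustrative.

```python
def flatten_bindings(results: dict, variables=("cited", "cited_ref", "title")) -> list:
    """Turn SPARQL JSON results into plain dicts, using None for unbound variables."""
    rows = []
    for binding in results["results"]["bindings"]:
        rows.append({var: binding.get(var, {}).get("value") for var in variables})
    return rows


# Hypothetical response fragment, shaped like the SPARQL JSON results format.
sample = {
    "head": {"vars": ["cited", "cited_ref", "title"]},
    "results": {
        "bindings": [
            {
                "cited": {"type": "uri", "value": "https://example.org/br/1"},
                "cited_ref": {"type": "literal", "value": "Example, A. (2000). A paper."},
                "title": {"type": "literal", "value": "A paper"},
            },
            # A cited resource for which neither OPTIONAL pattern matched.
            {"cited": {"type": "uri", "value": "https://example.org/br/2"}},
        ]
    },
}

rows = flatten_bindings(sample)
for row in rows:
    print(row["cited"], "|", row["title"] or "(no title recorded)")
```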

Posted in Open Citations, Semantic Publishing

The Initiative for Open Citations

OpenCitations are pleased to announce the launch of the Initiative for Open Citations (I4OC), a fresh momentum in the scholarly publishing world to open up data on the citations that link research publications. OpenCitations are proud to be a founder of I4OC, and we encourage those remaining publishers whose journal article reference lists are still closed to embrace this sea change in attitude towards open citation data. The other I4OC founding organizations are the Wikimedia Foundation, PLOS, eLife, DataCite, and the Centre for Culture and Technology at Curtin University.

Until recently, the vast majority of citation data were not openly available, even though all major publishers freely share their article metadata through Crossref. Before I4OC started, only about 1% of the reference data deposited in Crossref were freely available. Today that figure has jumped to 40% [1].

Publishers

In recent months, following earlier indications of willingness reported in this blog, several publishers have made the decision to release these metadata publicly, including the American Geophysical Union, Association for Computing Machinery, BMJ, Cambridge University Press, Cold Spring Harbor Laboratory Press, EMBO Press, Royal Society of Chemistry, SAGE Publishing, Springer Nature, Taylor & Francis, and Wiley. These publishers join other publishers who have been opening their references through Crossref for some time. The full list of scholarly publishers now opening their reference data via Crossref is given in [2].

These decisions stem from discussions that have been taking place since a call-to-action to open up citations was made by Dario Taraborelli of the Wikimedia Foundation at the 2016 OASPA Conference on Open-Access Publishing. The creation of I4OC was spearheaded by Jonathan Dugan, Martin Fenner, Jan Gerlach, Catriona MacCallum, Daniel Mietchen, Cameron Neylon, Mark Patterson, Michelle Paulson, Silvio Peroni, myself and Dario Taraborelli. The purpose of I4OC is to coordinate these efforts and to promote the creation of a comprehensive, freely-available corpus of scholarly citation data.

Benefits

Such a corpus will be valuable for new as well as existing services, and will allow many more interested parties to explore, mine, and reuse the data for new knowledge. The key benefits that arise from a fully open citation dataset include:

  1. The establishment of a global public web of linked scholarly citation data to enhance the discoverability of published content, both subscription access and open access. This will particularly benefit individuals who are not members of academic institutions with subscriptions to commercial citation databases.
  2. The ability to build new services over the open citation data, for the benefit of publishers, researchers, funding agencies, academic institutions and the general public, as well as to enhance existing services.
  3. The creation of a public citation graph to explore connections between knowledge fields, and to follow the evolution of ideas and scholarly disciplines.

Stakeholders

The Internet Archive, Mozilla, the Wellcome Trust, and twenty-eight other projects and organizations have formally put their names behind I4OC as stakeholders in support of openly accessible citations. The full list of stakeholders is given in [3].

Endorsements

Dario Taraborelli, Head of Research at the Wikimedia Foundation, said:

“Citations are the foundation for how we know what we know. Today, tens of millions of scholarly citations become available to the public with no copyright restriction. We look forward to more organizations joining this initiative to release, and build on these data.”

Liz Ferguson, VP Publishing Development, Wiley, said:

“Wiley is delighted to support I4OC by opening our citation metadata via Crossref. Collaborating with other publishers further contributes to sustainable and standardized infrastructure that will benefit the research community. We are particularly excited by the potential to expose networks of research that would otherwise lie hidden or take years to discover.”

Robert Kiley, Head of Open Research at the Wellcome Trust, said:

“The open availability of citation data will help all funders better evaluate the research they fund. The progress that I4OC has made is an essential first step and we encourage all publishers to publicly share this data.”

Mark Patterson, Executive Director of eLife, said:

“It’s fantastic to see the interest that’s being shown by so many publishers in making their reference list metadata publicly available. We hope that this new momentum will encourage all publishers to follow suit, and that new services and tools can be built around these open data.”

Catriona MacCallum, Advocacy Director, PLOS, said:

“Creating an open database of citations will allow researchers to perform independent analyses of how scientific ideas are communicated through article citations, and a transparent way of tracking the influence of particular articles. By opening up these metadata via Crossref, publishers are providing a vital contribution to Open Science.”

Future growth

Many other publishers have expressed interest in opening up their reference data. They can do this very easily via Crossref, with a simple email to support@crossref.org requesting they turn on reference distribution for all their DOI prefixes. This is required even for publishers of open access articles, since by default references submitted to the Crossref Cited-By Linking service are closed, as previously explained here.  I4OC will provide regular updates on the growth of the public citation corpus, how the data are being used, additional stakeholders and participating publishers as they join, and as new services are developed.

I4OC and OpenCitations

Through the efforts of I4OC, scholarly citation data will be increasingly available to any interested party through all of Crossref’s Metadata Delivery Services, including the REST API and bulk metadata dumps. From this open source, OpenCitations will progressively import the citation data into the OpenCitations Corpus, describe them using the SPAR Ontologies according to the OCC metadata model, and make them available in RDF under a Creative Commons public domain dedication as Linked Open Data. Potential users should be aware that it will take some considerable time before all the new citation data now available via the Crossref API are ingested into the OpenCitations Corpus.
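
To give a flavour of what retrieving open references through the Crossref REST API involves, here is a minimal Python sketch. It relies on the API's `/works/{DOI}` route, whose JSON response carries a `reference` list inside `message` when the publisher's references are exposed. The sample record below is hypothetical and heavily abridged, the function names are illustrative, and this is not a description of the OpenCitations ingestion software.

```python
import urllib.parse


def crossref_work_url(doi: str) -> str:
    """Address of the Crossref REST API record for a single DOI."""
    return "https://api.crossref.org/works/" + urllib.parse.quote(doi, safe="/")


def cited_dois(work_record: dict) -> list:
    """DOIs of the references in a Crossref work record, where any are exposed."""
    references = work_record.get("message", {}).get("reference", [])
    return [ref["DOI"] for ref in references if "DOI" in ref]


# Hypothetical, abridged record in the shape returned by the API.
sample_record = {
    "message": {
        "DOI": "10.5555/example",
        "reference": [
            {"key": "ref1", "DOI": "10.5555/cited-one"},
            {"key": "ref2", "unstructured": "A reference deposited without a DOI."},
        ],
    }
}

print(crossref_work_url("10.1038/227680a0"))
print(cited_dois(sample_record))
```

A record with no `reference` list yields an empty result, which is what one would expect to see for works whose references are absent or not open.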

I4OC links

Footnotes

[1] 40% is the percentage of publications with open references out of the total number of publications with reference metadata deposited with Crossref. As of March 2017, nearly 35 million articles with references are deposited with Crossref.

[2] Full list of publishers now making their citation data open via Crossref.

[3] Full list of I4OC supporting stakeholder organizations.

Posted in Open Citations, Semantic Publishing

Open Citations is dead. Long live OpenCitations.


In October 2015, I asked Silvio Peroni, my long-term colleague in the development of the SPAR Ontologies, to become Co-Director of the Open Citations Project, and to work with me in taking forward the prototype Open Citations Corpus (OCC), originally developed at the University of Oxford with the support of Jisc, with the aim of developing it into a production service of real use to scholars.

The result is OpenCitations, a new instantiation of the OCC hosted by the Department of Computer Science and Engineering of the University of Bologna, based on a new metadata schema and employing several new technologies to automate the ingestion of fresh citation metadata from authoritative sources.

Since the beginning of July 2016, OpenCitations has been ingesting and processing accurate bibliographic references harvested from the reference lists of scholarly papers available in Europe PubMed Central, enriched by metadata from Crossref. These scholarly citation data are described using the SPAR Ontologies according to the new OpenCitations metadata document [1], and are published under a Creative Commons public domain dedication (CC0), so that others may freely build upon, enhance and reuse them for any purpose, without restriction under copyright or database law. We have described the new OpenCitations Corpus, and the new software developed by Silvio to create it, in [2].

OpenCitations is being continuously populated from the scholarly literature, and, as of 30th March 2017, has ingested the references from 123,989 citing bibliographic resources, and contains information about 5,307,857 citation links to 3,469,648 cited resources.

The whole OCC is now available for querying (via SPARQL), and for browsing by means of a very simple Web interface that shows only the data about bibliographic entities (e.g. https://w3id.org/oc/corpus/br/1). Additional more user-friendly interfaces will be available in the coming months. The entire contents of the OpenCitations Corpus (OCC) are also archived every month as data dumps that are made available online through Figshare. Each dump comprises several zip archives, each containing either data or provenance information of a particular sub-dataset of the OCC.

Despite the fact that OpenCitations presently contains only a small proportion of global citation data, it is important to realize that, because of the very nature of scholarly citation, even this partial coverage includes citations of the most important papers in every biomedical field, these critical papers being characterized by the high number of their inward citation links.

[1] Silvio Peroni, David Shotton (2016). Metadata for the OpenCitations Corpus. figshare. https://dx.doi.org/10.6084/m9.figshare.3443876

[2] Silvio Peroni, David Shotton, Fabio Vitali (2016). Freedom for bibliographic references: OpenCitations arise. Proceedings of 2016 International Workshop on Linked Data for Information Extraction (LD4IE 2016): 32-43.
https://w3id.org/oc/paper/occ-lisc2016.html

Posted in Open Citations