Introducing the Semantic Publishing and Referencing (SPAR) Ontologies

This blog post introduces the first four ontologies of SPAR, the Semantic Publishing and Referencing Ontologies, an integrated ecosystem of generic ontologies shown in the ‘flower’ diagram below (Figure 1). The ontologies can be used individually or in conjunction, as need dictates. Each is encoded in the Web Ontology Language OWL 2. Together, they describe far more than bibliographic entities such as books and journal articles: they enable the creation of RDF metadata that relates these entities to reference citations, to bibliographic records, to the component parts of documents, and to various aspects of the scholarly publication process.

Flower diagram of the SPAR ontologies

Figure 1: The flower diagram, created by Benjamin O’Steen, showing the component ontologies of SPAR.

The first four ontologies, FaBiO, CiTO, BiRO and C4O, which are now available for inspection, comment and use, are useful for describing bibliographic objects, bibliographic records and references, citations, citation counts, citation contexts and their relationships to relevant sections of cited papers, and the organization of bibliographic records and references into bibliographies, ordered reference lists and library catalogues.

Four additional ontologies, DoCO, PRO, PSO and PWO, are in preparation to provide structured controlled vocabularies for document components, publishing roles, publishing status and publishing workflows. A simple architectural diagram of the eight SPAR ontologies is shown in Figure 2.

SPAR architecture diagram

Figure 2: A simple architectural diagram, created by Silvio Peroni, showing the interactions and dependencies between the component ontologies of SPAR. Four ontologies (DoCO, PRO, PSO and PWO), some of which will import FOAF (http://xmlns.com/foaf/spec/20100809.rdf), are shown with faint outline because they are still under development.

The original motivation for creating the first of these ontologies, the Citation Typing Ontology CiTO, was provided by the semantic publishing work undertaken in 2008, described in [3]. Version 1.6 of the original CiTO ontology developed from that work is described in [4].

Since that publication, as part of a harmonization activity with the SWAN ontologies (http://swan.mindinformatics.org/ontology.html) described in [5], we have separated out from CiTO those aspects describing bibliographic entities into FaBiO, the FRBR-aligned Bibliographic Ontology, and those aspects describing the quantification of citations into C4O, the Citation Counting and Context Characterization Ontology, leaving the current version of CiTO (v2.0) with the sole role of describing the nature and character of the citations themselves.

Where appropriate, the SPAR ontologies, specifically FaBiO and BiRO, the Bibliographic Reference Ontology, employ the FRBR (Functional Requirements for Bibliographic Records) classification model, a conceptual entity-relationship model developed by the International Federation of Library Associations and Institutions (IFLA) as a “generalized view of the bibliographic universe, intended to be independent of any cataloging code or implementation” [1, 2]. FRBR distinguishes Works, Expressions, Manifestations and Items.

In FRBR, a Work is a distinct intellectual or artistic creation, an abstract concept recognized through its various expressions (for example, your latest research paper); an Expression is the specific form that a Work takes each time it is ‘realized’ in physical or electronic form (for example, as a journal article); a Manifestation of an Expression of a Work defines its particular physical or electronic embodiment (for example, online, print or PDF); and an Item is a particular copy of a Manifestation that you might own (for example, the print copy of a journal issue on your desk). FRBR is widely recognized as a sound fundamental model for bibliographic records, and permits a clarity of description that is lacking when using ‘flat’ ontologies and vocabularies that do not employ the FRBR data model.
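
The four FRBR levels can be pictured as a simple containment chain. The following Python sketch is illustrative only: the class and field names are assumptions made for this example and are not terms taken from FRBR or from the SPAR ontologies.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single copy, e.g. the print issue on your desk."""
    label: str


@dataclass
class Manifestation:
    """A physical or electronic embodiment, e.g. print or PDF."""
    label: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A realization of a Work, e.g. a journal article."""
    label: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract intellectual creation, e.g. a research paper."""
    label: str
    expressions: List[Expression] = field(default_factory=list)


def describe(work: Work) -> List[str]:
    """Return one indented line per entity, from Work down to Item."""
    lines = [f"Work: {work.label}"]
    for expr in work.expressions:
        lines.append(f"  Expression: {expr.label}")
        for man in expr.manifestations:
            lines.append(f"    Manifestation: {man.label}")
            for item in man.items:
                lines.append(f"      Item: {item.label}")
    return lines


# The examples used in the paragraph above, arranged as one hierarchy.
paper = Work(
    "your latest research paper",
    [Expression(
        "journal article",
        [Manifestation("print", [Item("the copy on your desk")]),
         Manifestation("PDF")],
    )],
)
```

Calling `describe(paper)` walks the hierarchy top-down, which makes the point that an Item is only reachable from its Work by passing through an Expression and a Manifestation.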

While the individual ontologies will be described in greater detail in subsequent blog posts and papers, their characteristics and benefits can be summarized as follows:

SPAR, the Semantic Publishing and Referencing Ontologies

  • An integrated ecosystem of independent and reusable ontology modules that can be used to create comprehensive machine-readable RDF metadata for semantic publishing and referencing, comprising FaBiO, CiTO, BiRO, C4O, DoCO, PRO, PSO and PWO.

FaBiO, the FRBR-aligned Bibliographic Ontology (version 1.0; http://purl.org/spar/fabio/)

  • An ontology, structured according to the FRBR data model, to permit the description of bibliographic entities.
  • Comprehensive coverage of publication entity types, including born-digital entities.
  • Imports the FRBR Core ontology.
  • Uses PRISM terminology.
  • Extends the FRBR data model by the provision of new properties linking Works and Manifestations (fabio:hasManifestation and fabio:isManifestationOf), Works and Items (fabio:hasPortrayal and fabio:isPortrayedBy), and Expressions and Items (fabio:hasRepresentation and fabio:isRepresentedBy).
  • Harmonized with the SWAN ontologies (http://swan.mindinformatics.org/ontology.html), with the SWAN Citations Module deprecated in favour of using FaBiO to describe bibliographic entities.
  • RDF mappings of BIBO classes and properties to FaBiO in preparation.
  • RDF mappings of BibTeX entities to FaBiO to follow.
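
The new FaBiO properties act as shortcuts across the FRBR chain from Work to Item. The sketch below, in plain Python over string triples, shows how such direct links could be derived from the step-by-step FRBR relations. It rests on several assumptions: the property names `frbr:realization`, `frbr:embodiment` and `frbr:exemplar` are assumed names for the FRBR Core relations, the direction of each FaBiO property is inferred from its name, and the `ex:` identifiers are hypothetical. It is not a statement of how FaBiO itself axiomatizes these properties.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Assumed names for the FRBR Core relations (illustration only).
REALIZATION = "frbr:realization"   # Work -> Expression
EMBODIMENT = "frbr:embodiment"     # Expression -> Manifestation
EXEMPLAR = "frbr:exemplar"         # Manifestation -> Item


def chain(triples: Set[Triple], first: str, second: str) -> Set[Tuple[str, str]]:
    """Return (a, c) pairs where a -first-> b and b -second-> c."""
    return {
        (a, c)
        for a, p1, b in triples if p1 == first
        for b2, p2, c in triples if p2 == second and b2 == b
    }


def fabio_shortcuts(triples: Set[Triple]) -> Set[Triple]:
    """Derive the direct FaBiO links implied by the FRBR chain."""
    work_man = chain(triples, REALIZATION, EMBODIMENT)
    expr_item = chain(triples, EMBODIMENT, EXEMPLAR)
    # Work -> Item needs all three steps: extend Expression -> Item by one hop.
    work_item = {
        (w, i)
        for w, p, e in triples if p == REALIZATION
        for e2, i in expr_item if e2 == e
    }
    derived = {(w, "fabio:hasManifestation", m) for w, m in work_man}
    derived |= {(w, "fabio:hasPortrayal", i) for w, i in work_item}
    derived |= {(e, "fabio:hasRepresentation", i) for e, i in expr_item}
    return derived


# Hypothetical identifiers for one paper and its FRBR chain.
triples = {
    ("ex:paper", REALIZATION, "ex:article"),
    ("ex:article", EMBODIMENT, "ex:pdf"),
    ("ex:pdf", EXEMPLAR, "ex:my-copy"),
}
shortcuts = fabio_shortcuts(triples)
```

With the shortcut triples in place, a query such as "which PDF embodies this Work?" no longer has to traverse the intermediate Expression.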

CiTO, the Citation Typing Ontology (version 2.0; http://purl.org/spar/cito/)

  • An ontology to permit the characterization of the type or nature of citations, both factual (e.g. cito:citesAsMetadataDocument; cito:sharesAuthorsWith) and rhetorical (e.g. cito:confirms, cito:qualifies), able to deal with both direct and explicit, and indirect and implicit citations.
  • Integrated with the SWAN Scientific Discourse Relationships Module (http://swan.mindinformatics.org/spec/1.2/scientificdiscourse.html).
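
A typed citation is just an RDF statement whose predicate is a CiTO property. As a minimal sketch, the Python helper below formats one such statement as an N-Triples line. It assumes that each CiTO term IRI is the namespace http://purl.org/spar/cito/ followed by the property's local name; the helper name and both document IRIs are hypothetical.

```python
# Assumed term namespace: the ontology URL given above plus the local name.
CITO = "http://purl.org/spar/cito/"


def typed_citation(citing: str, relation: str, cited: str) -> str:
    """Format one typed citation as an N-Triples statement."""
    return f"<{citing}> <{CITO}{relation}> <{cited}> ."


# Both IRIs below are hypothetical placeholders.
line = typed_citation(
    "http://example.org/paper-a", "confirms", "http://example.org/paper-b"
)
```

Swapping `"confirms"` for another property name, such as `"qualifies"`, changes the stated nature of the citation without touching the citing or cited entity.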

BiRO, the Bibliographic Reference Ontology (version 1.0; http://purl.org/spar/biro/)

  • An ontology, structured according to the FRBR data model, to define bibliographic records (as subclasses of frbr:Work) and bibliographic references (as subclasses of frbr:Expression), and their compilation into bibliographic collections and bibliographic lists, respectively.
  • Imports the FRBR Core Ontology (http://purl.org/vocab/frbr/core).
  • Imports the SWAN Collections Ontology (http://swan.mindinformatics.org/ontologies/1.2/collections.owl) to permit the description of ordered lists.
  • Provides a logical system for relating an individual bibliographic reference, such as appears in the reference list of a published article (which may lack the title of the cited article, the full names of the listed authors, or indeed the full list of authors):
  1. to the full bibliographic record for that cited article, which in addition to the fields missing from the reference may also include the name of the publisher and the ISSN or ISBN of the publication;
  2. to collections of bibliographic records; and
  3. to bibliographic lists, such as reference lists and library catalogues.
  • Has the ability, used in conjunction with the SWAN Collections Ontology, to specify ordered lists:
  1. of authors,
  2. of references,
  3. of all the in-text reference pointers within an article, and
  4. of those in-text reference pointers specific for a single reference.
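
The distinction between a terse reference and its full record can be sketched with two small Python classes and a numbered list. This is only an illustration of the idea: the class names, field names and abbreviated reference text are assumptions, not BiRO terms, and the sample data is taken from reference [3] of this post.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class BibliographicRecord:
    """Full description of a cited article (hypothetical field set)."""
    authors: List[str]
    title: str
    journal: str
    year: int
    publisher: Optional[str] = None
    issn: Optional[str] = None


@dataclass
class BibliographicReference:
    """A terse reference-list entry that points to its full record."""
    text: str
    record: BibliographicRecord


def ordered_reference_list(
    refs: List[BibliographicReference],
) -> Dict[int, BibliographicReference]:
    """Number the references from 1, preserving their given order."""
    return {n: ref for n, ref in enumerate(refs, start=1)}


record = BibliographicRecord(
    authors=["Shotton D", "Portwin K", "Klyne G", "Miles A"],
    title=("Adventures in semantic publishing: exemplar semantic "
           "enhancements of a research article"),
    journal="PLoS Comput Biol",
    year=2009,
)
# The reference omits the title and most authors; the record supplies them.
reference = BibliographicReference(
    "Shotton D et al. PLoS Comput Biol 2009, 5:e1000361.", record
)
reference_list = ordered_reference_list([reference])
```

Here `reference_list[1].record.title` recovers the title that the abbreviated reference text leaves out.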

C4O, the Citation Counting and Context Characterization Ontology (version 1.0; http://purl.org/spar/c4o/)

  • An ontology that permits the characterization of bibliographic citations in terms of their number and their context.
  • Imports BiRO, and thus indirectly imports the FRBR Core Ontology and the SWAN Collections Ontology.
  • Provides the ontological structures to permit the number of citations a cited entity has received globally to be recorded, as determined by a bibliographic information resource such as Google Scholar, Scopus or Web of Knowledge on a particular date.
  • Provides the ontological structures to permit recording of the number of in-text citations of a cited source (i.e. the number of in-text reference pointers to a single reference in the citing article’s reference list).
  • Enables ontological descriptions of the context within the citing document in which an in-text reference pointer appears.
  • Permits that context to be related to relevant textual passages in the cited document.
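
The two kinds of count that C4O distinguishes can be illustrated in a few lines of Python. This is a sketch under stated assumptions: the class, function and field names are hypothetical rather than C4O terms, and the reference labels, count and date are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class GlobalCitationCount:
    """Citations of one entity, as reported by one source on one date."""
    cited_entity: str
    count: int
    source: str
    recorded_on: date


def in_text_citation_counts(pointers: List[str]) -> Counter:
    """Count in-text reference pointers per reference-list entry."""
    return Counter(pointers)


# Pointers in reading order; each value is a hypothetical reference label.
pointers = ["ref1", "ref2", "ref1", "ref3", "ref1"]
counts = in_text_citation_counts(pointers)

# The figure, source and date here are invented purely for illustration.
snapshot = GlobalCitationCount("ref1", 42, "Google Scholar", date(2010, 10, 1))
```

Keeping the source and date alongside the global count matters because different bibliographic information resources report different numbers, and each number changes over time.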

N.B. The following four ontologies are under development, and will be published shortly.

DoCO, the Document Components Ontology

  • An ontology for the characterization of the component parts of a bibliographic document.
  • Provides a structured vocabulary of document components (e.g. Introduction, Discussion, Acknowledgements, Reference List, Figures, Appendix) in OWL, enabling these to be described in RDF.

PRO, the Publication Roles Ontology

  • An ontology for the characterization of the roles of agents (people, corporate bodies and computational agents; e.g. author, editor, reviewer, publisher, librarian) in the publication process, as they relate to bibliographic entities.
  • Permits the recording of time/date information about when roles are held.

PSO, the Publications Status Ontology

  • An ontology for the characterization of the status of a document and other bibliographic entities at various stages in the publication process (e.g. submitted manuscript, rejected manuscript, accepted manuscript, proof, Version of Record, catalogued book).

PWO, the Publications Workflow Ontology

  • An ontology for the characterization of the main stages in the workflow associated with the publication of a document (e.g. under review, XML capture, page design, publication to Web).

Further blog posts will describe each of the SPAR ontologies in greater detail, will give examples of their use in encoding bibliographic and referencing information, and will describe the mapping to FaBiO of other bibliographic metadata systems that do not employ the FRBR data model, specifically BIBO, the non-FRBR bibliographic ontology, and BibTeX terminologies.

We invite community feedback and engagement on the four published SPAR ontologies, their improvement and their application.

This work forms part of the JISC Open Citations Project described in this blog.

The relevant hash tags when referring to this post are #jiscopencite and #spar.

David Shotton and Silvio Peroni
University of Oxford, October 2010

References

[1] Saur KG: FRBR (Functional Requirements for Bibliographic Records) Final Report. International Federation of Library Associations and Institutions; 1998. http://www.ifla.org/files/cataloguing/frbr/frbr_2008.pdf.
[2] Tillett B: What is FRBR? A Conceptual Model for the Bibliographic Universe. Washington DC, USA: Library of Congress, Cataloguing Distribution Service; 2003. http://www.loc.gov/cds/downloads/FRBR.PDF.
[3] Shotton D, Portwin K, Klyne G, Miles A: Adventures in semantic publishing: exemplar semantic enhancements of a research article. PLoS Comput Biol 2009, 5:e1000361. http://dx.doi.org/10.1371/journal.pcbi.1000361.
[4] Shotton D: CiTO, the Citation Typing Ontology. Journal of Biomedical Semantics 2010, 1 (Suppl. 1): S6. http://dx.doi.org/10.1186/2041-1480-1-S1-S6.
[5] Ciccarese P, Shotton D, Peroni S and Clark T: CiTO + SWAN: The web semantics of bibliographic records, citations, evidence and discourse relationships. (Submitted for publication).


31 Responses to Introducing the Semantic Publishing and Referencing (SPAR) Ontologies

  2. Antoine says:

    Very interesting! I’ve been struggling at work recently with how to present highly structured and highly cited documents for electronic consumption.

    I’m pretty new to RDF, so a lot of the vocabulary used is a bit bewildering.

  3. Jan Polowinski says:

    Does some of the vocabulary cover the following use case? We would like to annotate ontology classes with quotations pointing to exact bibliographic references (already existing in BibTeX). FaBiO (similar to BIBO) offers the concepts “Quotation” and “Work” that could be used to define the quotations and documents themselves. However, owl:AnnotationProperties to link domain-ontology classes to these Quotations still need to be defined. Is this something that could become an extension of SPAR, or am I just missing it?
    It is not feasible to do the linking without tooling support, so it is an important requirement that the vocabulary can easily be used within ontology tools, such as Protégé.
    Some more detailed thoughts can be found on: http://www.semanticoverflow.com/questions/2006/what-is-the-best-way-to-annotate-a-class-with-a-citation-pointing-to-an-exact-bib.

  5. putra says:

    good article !!!

  6. Graeme S says:

    This work looks promising. I’ve been interested in ontologies for bibliographic data and citation purposes for a while. Does the scope of this work extend beyond developing ontologies? Are there plans for tool support?

  7. Jakob says:

    Interesting work! I only had a look at BiRO and miss a reference to the Dublin Core Collection Description Type (CDType) Vocabulary (http://purl.org/cld/cdtype/). Moreover bibliographic lists and library catalogues are not ordered. You could define a bibliographic collection (unordered) and a bibliographic list (ordered). A library catalogue is subclass of the former. I use cdtype:CatalogueOrIndex, subclass of dcmitype:Collection to model library catalogues – please don’t reinvent Classes that already exist.

    • davidshotton says:

      You are quite right. We have now corrected BiRO to show biro:LibraryCatalogue as a sub-class of biro:BibliographicCollection.

      The reason we chose to create it as a BiRO class, rather than use an existing DC term, is to permit us to add the restrictions that a biro:LibraryCatalogue is a FRBR Work and that it can contain only bibliographic records.

  9. zazi says:

    Hi, generally this is interesting work. Regarding C4O, I recommend you have a look at the Counter Ontology (http://purl.org/ontology/co/core#), which defines a general, multi-purpose counter concept. Maybe you can (re-)utilize parts of this ontology in C4O.

  10. zazi says:

    Here are some additional notes regarding ontology reutilization that might be worth a look:

    – for PSO: Event Ontology (http://motools.sourceforge.net/event/event.html), OWL Time Ontology (http://www.w3.org/TR/owl-time/)
    – for PWO: Event Ontology, OWL Time Ontology, Ordered List Ontology (http://purl.org/ontology/olo/core#)
    – for PRO: OWL Time Ontology
