1 Introduction
2 Related Work
2.1 Archiving and Versioning
2.2 Ontology Repositories and Platforms
2.3 Ontology Evaluation and Validation
3 Archivo Platform Model
3.1 Versioning and Persistence on the Databus
dcat:downloadURL links in the metadata. Crawled ontologies and metadata are persisted on the DBpedia download server. Creating a mirrored archive of ontology versions such as Archivo is, of course, not infallible. We consider it, however, a sufficiently reliable fall-back to improve persistence of ontologies on the Semantic Web.
3.2 Evaluation Plugins and SHACL Library
3.3 Feature Plugins
4 Archivo Implementation
4.1 Ontology Discovery and Indexing
owl:Ontology or skos:ConceptScheme (which should carry additional metadata and makes the ontology spottable in a reliable way) in the triples output of the failure-tolerant parser. If multiple valid serialization candidates exist, we prefer the serialization with the highest triple count (this will archive the correct FOAF version without license). Finally, the NIR is appended to the index and the chosen serialization is passed on for a release on the Databus. If the spotted NIR does not match the candidate IRI it started with, the retrieved NIR becomes a new NIR and the process starts again (see Fig. 2). Crawling candidate IRIs representing properties and classes with a slash URI scheme require special treatment in case the resolution does not return the ontology itself. We use skos:inScheme and rdfs:isDefinedBy as pointers to a new candidate IRI.
4.2 Analysis, Plugins and Release
ntriples, turtle and owl version to simplify the access (UC5 and UC6). In addition to the plugins and validation methods described in Sect. 3, the reasoner Pellet is used for checking the consistency (UC9) of the ontology and determining the OWL profile. Furthermore, an OOPS report (UC8) is generated to detect common pitfalls of the ontology. All reports are stored alongside the original snapshot with appropriate DataID metadata to augment the snapshot.
groupId and the path serves as the name for the artifactId. Archivo’s lookup component with Linked Data interface resolves the mapping from a non-information URI to the stable and persistent Databus identifier.
4.3 Versioning and Persistence
E-Tag, Last-Modified and content-length to detect via a HEAD request whether the respective ontology resource could have changed. If any of the headers changed (or if none of the headers is available), the vocabulary is downloaded and checked locally for changes.
patch. If only new axioms were added, we consider this a new minor version. If new classes/properties are added, this usually leads to no backward-compatibility problems for existing applications, but there are cases (e.g. adding a deprecated or disjoint relation to a class) which might have consequences in combination with A-boxes. Any deletion of already existing axioms (thus including renaming) is considered a major change, potentially seriously affecting backward-compatibility. This semantic versioning “overlay” allows a more fine-grained update decision than the binary “take it or leave it” (UC4a-c). Users can refine the trade-off with custom solutions based on the semantic versioning and axiom diffs. We plan for more sophisticated versioning overlays to augment the Archivo snapshots with open contributions via Databus mods (see Sect. 7).
5 A Consumer-Oriented Ontology Star Rating
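The header check and the semantic versioning overlay described in Sect. 4.3 can be sketched as follows. This is a minimal illustration and not Archivo's actual implementation: the function names, the lower-cased header keys and the representation of axioms as plain strings are assumptions made for the example. The treatment of the patch case is also an assumption (a changed file whose axiom set is unchanged), since the text above does not fully specify it.

```python
from typing import Mapping, Set

# Headers named in Sect. 4.3. HTTP header names are case-insensitive, so this
# sketch assumes the caller passes dictionaries with lower-cased keys.
WATCHED_HEADERS = ("etag", "last-modified", "content-length")


def may_have_changed(old: Mapping[str, str], new: Mapping[str, str]) -> bool:
    """Decide from HEAD response headers whether a full download is needed."""
    if not any(h in new for h in WATCHED_HEADERS):
        # None of the headers is available: no cheap check is possible.
        return True
    # Any difference (including a header that appeared or vanished) counts.
    return any(old.get(h) != new.get(h) for h in WATCHED_HEADERS)


def classify_change(old_axioms: Set[str], new_axioms: Set[str]) -> str:
    """Map an axiom diff to a semantic versioning level (hypothetical helper)."""
    removed = old_axioms - new_axioms
    added = new_axioms - old_axioms
    if removed:
        # Deletions, and therefore renamings, may break backward-compatibility.
        return "major"
    if added:
        # Only new axioms were added.
        return "minor"
    # Assumption: the file changed but its axiom set did not.
    return "patch"
```

A renamed class shows up as one removed and one added axiom, so it is classified as major, which matches the rule that any deletion of existing axioms is a major change.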
5.1 Two Star Baseline
a owl:Ontology and some form of license could be detected. A high degree of heterogeneity is permissible for this star regarding the used property/subproperty as well as object: license URI (resolvable linked data or web link), xsd:string or xsd:anyURI (UC7). [OBO fp1, OOPS! P38 P41, VocUse 4]
5.2 Quality Stars
dct:license as object property with a URI (not string or anyURI). If a resolvable Linked Data URI is used, we expect the URI to match the URI used in the machine readable license (UC7). We discovered many irregularities such as a trailing ‘/’, which violate the RDF requirement that URIs be exactly the same, as opposed to Linked Data resolution. In the future, we plan to tighten this criterion and expect a machine readable license, which we will collect on the DBpedia Databus in a similar manner as Archivo. [OBO fp1, OOPS! P41, VocUse 4]
owl:disjointWith axioms are nice, unless they render the ontology unusable for reasoning.
#Ont. | Stars1 | License-I2 | License-II2 | Consistency2 | LODE3 | Expressivity4 |
---|---|---|---|---|---|---|
735 | 11/453/10/134/127 | 275/460/0 | 137/598/0 | 687/23/25 | 1/30/702/2 | 103/91/9/29/15/488 |
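The exact-match requirement for license URIs discussed in this section can be illustrated with a small check. It is a sketch under assumptions: the function name and the three result labels are hypothetical and not part of Archivo, and the only near miss it diagnoses is the trailing ‘/’ irregularity mentioned above.

```python
def compare_license_iri(declared: str, expected: str) -> str:
    """Compare a declared license IRI with the expected one.

    RDF compares IRIs character by character, so two IRIs that resolve to
    the same document on the Web can still denote different RDF resources.
    """
    if declared == expected:
        return "match"
    if declared.rstrip("/") == expected.rstrip("/"):
        # The IRIs differ only in trailing slashes: a likely authoring slip.
        return "trailing-slash-mismatch"
    return "mismatch"


# Illustrative values only; the expected IRI is an assumption for the example.
expected = "http://creativecommons.org/licenses/by/4.0/"
print(compare_license_iri("http://creativecommons.org/licenses/by/4.0", expected))
```

Reporting the trailing-slash case separately, instead of silently normalizing it, keeps the strict RDF comparison intact while still telling the ontology author what to fix.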
5.3 Further Stars and Ratings
6 Evaluation
6.1 Archivo and Rating Statistics
6.2 System Comparison
Dimension | Coverage | r | Access | q | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
System name | TY | DO | IM | DI | UP | UV | SV | ID | PE | OF | MA | TE |
Archivo | A | I | ||||||||||
Bioportal | all | S | 1 | - | 2 | - | 1 | - | - | |||
LOV | C,A,I | I | - | - | ||||||||
OBO foundry | C | S | -/- | - | - | - | - | -/- | ||||
Ontobee | I | S | -/- | - | - | - | - | - | - | |||
Ontohub.org | D | I | 1 | - | 3 | - | - | 4/- | ||||
OntoMaven repo | A | - | 1 | - | - | - | - | 5/- | ||||
Ont. Lookup Svc | I | S | -/- | - | - | - | - | - | 6 | - |