RDF Specifics
SST data is RDF data and thus follows the definitions as provided in the W3C Recommendation RDF 1.1 Concepts and Abstract Syntax:
The Resource Description Framework (RDF) is a framework for representing information in the Web. This document defines an abstract syntax (a data model) which serves to link all RDF-based languages and specifications. The abstract syntax has two key data structures:
RDF graphs are sets of subject-predicate-object triples, where the elements may be IRIs, blank nodes, or datatyped literals. They are used to express descriptions of resources.
RDF datasets are used to organize collections of RDF graphs, and comprise a default graph and zero or more named graphs. RDF 1.1 Concepts and Abstract Syntax also introduces key concepts and terminology, and discusses datatyping and the handling of fragment identifiers in IRIs within RDF graphs.
Based on the RDF definitions the following keywords are used throughout SST:
- Dataset: consists of the default NamedGraph, all NamedGraphs that it imports directly or indirectly and, to some degree, other NamedGraphs that are only referenced
- NamedGraph: consists of Triples that are interlinked by sharing the same nodes
- Triple: consists of three Resources in the roles Subject, Predicate and Object. Depending on the role a particular node plays in a triple, the triple is called a Subject Triple, Predicate Triple or Object Triple of that node.
- IBNode: either an IRI Node or a BlankNode. A BlankNode might be a Collection (see Turtle)
- Literal: a data value that is not a node, but that has a Datatype
- Resource: either an IBNode or a Literal
- Fragment: an identifier that designates a secondary resource of an IRI Node
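The relationship between these keywords can be sketched as a small data model. This is only an illustration and not the SST API: the class names mirror the keywords above, while the field names (`graph_iri`, `fragment`, `blank_label`) and the `roles_of` helper are assumptions made for this example. The typing of subject and predicate as nodes follows the RDF rule that only the object of a triple may be a literal.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class IBNode:
    """Either an IRI node or a blank node (hypothetical modelling)."""
    graph_iri: str                     # IRI of the NamedGraph the node lives in
    fragment: Optional[str] = None     # set for IRI nodes
    blank_label: Optional[str] = None  # set for blank nodes

    @property
    def is_blank(self) -> bool:
        return self.blank_label is not None


@dataclass(frozen=True)
class Literal:
    """A data value that is not a node but has a datatype."""
    value: str
    datatype: str


# A Resource is either an IBNode or a Literal.
Resource = Union[IBNode, Literal]


@dataclass(frozen=True)
class Triple:
    subject: IBNode
    predicate: IBNode
    object: Resource

    def roles_of(self, node: IBNode) -> Tuple[str, ...]:
        """Return the roles the given node plays in this triple."""
        roles = []
        if node == self.subject:
            roles.append("subject")
        if node == self.predicate:
            roles.append("predicate")
        if node == self.object:
            roles.append("object")
        return tuple(roles)
```

From the point of view of a node `n`, a triple `t` is a Subject Triple of `n` when `"subject"` is among `t.roles_of(n)`, and likewise for the other two roles.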
Beyond what is defined in RDF, SST adds further restrictions:
- For the purpose of SST, the RDF concepts of NamedGraph and Namespace are unified: each NamedGraph has its own base URI without a fragment, and this URI also establishes a Namespace with an associated prefix, either a default one or a dynamically assigned one. The consequences are:
  - Every NamedGraph is identified by its namespace URI (Uniform Resource Identifier), which is either a URN (Uniform Resource Name) or a URL (Uniform Resource Locator). The namespace URI of a NamedGraph must not have a Fragment.
  - Each NamedGraph can be represented in a single Turtle file (or any other RDF format).
  - A collection of NamedGraphs, e.g. the ones that make up a Dataset, can be represented as a single TriG file.
  - Every IRI node that is used as a subject in a NamedGraph must have an IRI that consists of the IRI of the NamedGraph plus a fragment. As a consequence, every IRI node is owned by exactly one NamedGraph.
  - In addition, every NamedGraph contains one implicit IRI node that has no fragment and that represents the whole NamedGraph as an owl:Ontology.
  - Similarly, blank nodes defined in a NamedGraph can only be referenced from within this NamedGraph.
- The default graph of a Dataset is also treated as a NamedGraph, so the Dataset is identified by the same URI as its default NamedGraph.
- For every IBNode, at least one subject triple with the predicate rdf:type must be defined, from which the base nature of this IBNode can easily be deduced, i.e. whether it is a class, an individual or a property (see further details under xxx-TBD).
- NamedGraphs other than the default NamedGraph can be used in a Dataset in two ways: either by explicitly importing them using owl:imports, or by just referencing the whole NamedGraph or single IRI nodes in it. Whether another NamedGraph is imported or only referenced has the following consequences:
  - SST operates on a closed world assumption for a Dataset. The closed world is primarily defined by the default NamedGraph together with all directly or indirectly imported NamedGraphs. A “window” from the closed world to the open world is provided by other NamedGraphs that are only referenced.
  - By default, a Dataset is validated only for the default NamedGraph together with all directly and indirectly imported NamedGraphs. NamedGraphs that are only referenced (other than the default SST ontologies) are not validated.
  - The SST Core provides Revision Control of a Dataset only for the default NamedGraph together with all directly and indirectly imported NamedGraphs. NamedGraphs that are only referenced are not under Revision Control.
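The split between imported and merely referenced NamedGraphs can be illustrated with a small sketch. It is not taken from SST: the function names and the two input dictionaries (`imports`, mapping a graph IRI to the graph IRIs it imports via owl:imports, and `references`, mapping a graph IRI to every graph IRI it mentions) are assumptions chosen for this example.

```python
from collections import deque


def imported_closure(default_graph, imports):
    """Return the default graph plus all directly or indirectly imported graphs.

    `imports` maps a graph IRI to the graph IRIs it names in owl:imports.
    """
    seen = {default_graph}
    queue = deque([default_graph])
    while queue:
        graph = queue.popleft()
        for imported in imports.get(graph, ()):
            if imported not in seen:  # also guards against import cycles
                seen.add(imported)
                queue.append(imported)
    return seen


def referenced_only(closure, references):
    """Return graphs that are mentioned from inside the closure but not imported.

    `references` maps a graph IRI to all graph IRIs it mentions.
    """
    mentioned = set()
    for graph in closure:
        mentioned.update(references.get(graph, ()))
    return mentioned - closure
```

Under these assumptions, the result of `imported_closure` corresponds to the closed world that is validated and kept under Revision Control, while `referenced_only` yields the graphs that form the "window" to the open world.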
These restrictions are widely observed outside of SST anyway, but for performance and validation reasons SST insists that they are not violated.
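As a rough illustration of what checking some of these restrictions could look like, the sketch below tests three of them for a single NamedGraph: the graph IRI has no fragment, every subject IRI is the graph IRI plus a fragment, and every such subject has an rdf:type triple. It is not SST's validator. The function name and the representation of triples as tuples of IRI strings are assumptions, blank nodes and literals are left out, and the implicit fragment-less node is assumed to need no explicit rdf:type.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def check_named_graph(graph_iri, triples):
    """Return a list of human-readable problems found in one NamedGraph.

    `triples` is an iterable of (subject, predicate, object) IRI strings.
    Only nodes that appear as subjects are checked in this sketch.
    """
    triples = list(triples)
    problems = []
    if "#" in graph_iri:
        problems.append(f"graph IRI has a fragment: {graph_iri}")

    subjects = {s for s, _, _ in triples}
    typed = {s for s, p, _ in triples if p == RDF_TYPE}

    for subject in sorted(subjects):
        if subject == graph_iri:
            # Implicit node that represents the whole graph (assumed exempt).
            continue
        base, _, fragment = subject.partition("#")
        if base != graph_iri or not fragment:
            problems.append(f"subject not owned by graph: {subject}")
        elif subject not in typed:
            problems.append(f"subject without rdf:type: {subject}")
    return problems
```

An empty result means that none of the three checked restrictions is violated; it says nothing about the restrictions this sketch does not cover.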
SST makes wide use of random Version 4 UUIDs (Universally Unique Identifiers) for application data, both as UUID URNs and as Fragments. A typical NamedGraph may thus be identified as
urn:uuid:fa446d9a-429e-4c73-a860-d87ff7ac7d79
and an IRI node of this NamedGraph as
urn:uuid:fa446d9a-429e-4c73-a860-d87ff7ac7d79#8287edba-f6c1-407e-bc93-e16efa016129
The main reason for this is that application data can be exchanged freely between different organisations without enforcing any ownership in the data itself. In addition, the random nature of these UUIDs makes it extremely unlikely that the same IRI is used twice by chance for different purposes.
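Identifiers of this shape can be produced with any UUID library. The following minimal sketch uses Python's standard `uuid` module; the helper names are hypothetical and only illustrate the pattern, they are not SST functions.

```python
import uuid


def new_graph_iri() -> str:
    """Mint a NamedGraph IRI as a random (version 4) UUID URN."""
    return uuid.uuid4().urn  # has the form 'urn:uuid:<uuid>'


def new_node_iri(graph_iri: str) -> str:
    """Mint an IRI node owned by the given graph, using a UUID as fragment."""
    return f"{graph_iri}#{uuid.uuid4()}"


def split_node_iri(node_iri: str):
    """Split an IRI node into (owning graph IRI, fragment)."""
    graph_iri, _, fragment = node_iri.partition("#")
    return graph_iri, fragment
```

Because the fragment is separated from the graph IRI by the first `#`, the owning NamedGraph of any IRI node can be recovered from the node IRI alone, as `split_node_iri` shows.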