Core concepts and terms
You should become familiar with the following core concepts and terms before attempting to develop your own application using the SST Core API:
-
programming languages : The SST Core is fully implemented in the Go programming language, so Go is what you should use to take full advantage of the provided functionality. Programming for the web browser with JavaScript/TypeScript is possible using the standard or the SST customised version of JSON-LD, but this is restricted to receiving (reading) and sending (committing) NamedGraphs.
-
Semantic Web: a set of standard recommendations of the World Wide Web Consortium (W3C) to make data on the Internet machine-readable. Major parts of the Semantic Web supported by SST are:
- RDF: Resource Description Framework
- RDFS: Resource Description Framework Schema
- OWL: Web Ontology Language
- Turtle: Terse RDF Triple Language
-
IRI (URI, URN, URL) and RDF : Internationalized Resource Identifiers (IRIs) are the basis of the Resource Description Framework (RDF) and are needed to represent Triples. IRIs are an extension of Uniform Resource Identifiers (URIs) that supports international characters (Unicode). The majority of URIs used by SST are Uniform Resource Locators (URLs) and Uniform Resource Names (URNs).
-
UUID : Universally unique identifiers are widely used throughout SST. SST primarily uses random UUIDs (version 4) so that nobody else unintentionally creates the same UUID. In a few special cases SST also uses “namespace name-based” UUIDs (version 5), when a hash value of some data is needed.
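The two UUID flavours mentioned above can be sketched with the Go standard library alone. This is only an illustration of the RFC 4122 layout, not the SST Core API: the `UUID` type and the functions `NewRandomUUID` and `NewNameBasedUUID` are hypothetical names, and which namespace SST uses for version 5 UUIDs is not stated in this document (the RFC 4122 DNS namespace is used here purely as a placeholder).

```go
package main

import (
	"crypto/rand"
	"crypto/sha1"
	"fmt"
)

// UUID is a 128-bit identifier in RFC 4122 layout (hypothetical type, not the SST API).
type UUID [16]byte

// NewRandomUUID returns a version 4 UUID built from cryptographically random bytes.
func NewRandomUUID() (UUID, error) {
	var u UUID
	if _, err := rand.Read(u[:]); err != nil {
		return UUID{}, err
	}
	u[6] = (u[6] & 0x0f) | 0x40 // set version 4
	u[8] = (u[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return u, nil
}

// NewNameBasedUUID returns a version 5 UUID: SHA-1 over namespace followed by name.
// The same inputs always produce the same UUID.
func NewNameBasedUUID(namespace UUID, name []byte) UUID {
	h := sha1.New()
	h.Write(namespace[:])
	h.Write(name)
	var u UUID
	copy(u[:], h.Sum(nil))      // keeps the first 16 bytes of the 20-byte digest
	u[6] = (u[6] & 0x0f) | 0x50 // set version 5
	u[8] = (u[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return u
}

// Version extracts the version number from the UUID.
func (u UUID) Version() int { return int(u[6] >> 4) }

// String formats the UUID in the canonical 8-4-4-4-12 hexadecimal form.
func (u UUID) String() string {
	return fmt.Sprintf("%x-%x-%x-%x-%x", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16])
}

func main() {
	r, err := NewRandomUUID()
	if err != nil {
		panic(err)
	}
	fmt.Println("random:", r, "version:", r.Version())

	// RFC 4122 DNS namespace, used here only as a placeholder namespace.
	ns := UUID{0x6b, 0xa7, 0xb8, 0x10, 0x9d, 0xad, 0x11, 0xd1,
		0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8}
	n := NewNameBasedUUID(ns, []byte("example.org"))
	fmt.Println("name-based:", n, "version:", n.Version())
}
```

The random variant differs on every call, while the name-based variant is reproducible from its inputs, which is why the latter is suited to cases where a hash of some data is needed.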
-
UUID URN : By default SST generates a random UUID URN to identify a newly created pair of a `Dataset` and its default `NamedGraph`. Example: `urn:uuid:8D8AC610-566D-4EF0-9C22-186B2A5ED793`. SST also fully supports other IRIs, such as URLs, to identify such pairs. URLs are used, for example, to identify the higher-level ontologies used by SST.
-
UUID Fragment: By default SST generates a random UUID for the IRI fragments of `IBNode`s within a `NamedGraph`; however, any other valid fragment is supported as well.
-
Hash : A binary SHA-256 value that is used to uniquely identify a `DatasetRevision`, a `NamedGraphRevision` or a `Commit`.
-
Git : “Git is a distributed version control system that tracks versions of files. It is often used to control source code by programmers who are developing software collaboratively”. SST inherits major concepts from Git, but adapts them to RDF data instead of program code. In a similar way we can say that “SST is a distributed version control system that tracks versions of RDF data. Its purpose is to control RDF data in a collaborative way.”
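One concept Git is known for is content addressing: a revision is identified by a hash of its content. The sketch below illustrates that idea for RDF data using SHA-256. It rests on assumptions: this document does not describe which bytes SST actually hashes, so the sorted-lines serialisation and the function name `RevisionHash` are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// RevisionHash returns a SHA-256 value over a set of triples given as text lines.
// The lines are sorted first, so the same set of triples gives the same hash
// regardless of the order in which they are supplied (assumed canonical form).
func RevisionHash(triples []string) [32]byte {
	sorted := append([]string(nil), triples...) // copy, leave the input untouched
	sort.Strings(sorted)
	return sha256.Sum256([]byte(strings.Join(sorted, "\n")))
}

func main() {
	v1 := []string{"<#a> <#knows> <#b> .", "<#b> <#knows> <#c> ."}
	v1Reordered := []string{"<#b> <#knows> <#c> .", "<#a> <#knows> <#b> ."}
	v2 := append(append([]string(nil), v1...), "<#c> <#knows> <#a> .")

	h1 := RevisionHash(v1)
	h1r := RevisionHash(v1Reordered)
	h2 := RevisionHash(v2)

	fmt.Println("v1:", hex.EncodeToString(h1[:]))
	fmt.Println("same content, same hash:", h1 == h1r)
	fmt.Println("changed content, new hash:", h1 != h2)
}
```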
-
Repository : The place where RDF Datasets are stored persistently. A Remote Repository is an application in its own right whose access is controlled by the OAUTH2 protocol. Otherwise repositories are stored in the local file system, and it is up to the application that uses the SST Core to decide whether there is any access control and how to realise it. A full-featured Repository in SST comes with Git-like revision control of the contained data, query capabilities, and transaction control; simple repository types do not have these.
-
Dataset : The main content of a Repository is Datasets as defined in RDF, each consisting of a default graph and other NamedGraphs. In SST the default graph is also a `NamedGraph`, and its IRI is the same as that of the `Dataset` for which it is the default graph. The “other” `NamedGraph`s of a Dataset are explicitly imported in SST by `owl:imports`. In the SST API this is handled by a special Import functionality of NamedGraph.
-
DatasetRevision : A particular revision of a Dataset, consisting of a default NamedGraphRevision and all other NamedGraphRevisions that are directly or indirectly imported into the default NamedGraphRevision. A DatasetRevision is identified by a `Hash` value.
-
NamedGraph : An RDF graph that is defined as a set of RDF triples and identified by an IRI (without a fragment). In a `Repository` with revision history support, several revisions of the same `NamedGraph` can exist. In a `Stage` only one particular revision of a NamedGraph can exist at a time. The `NamedGraph`s in a Stage are either local, meaning that all their triples are in memory, or in referenced state, meaning that their triples are not in memory. The expectation is that a referenced NamedGraph can be turned into a local NamedGraph, e.g. by loading the triples from a repository or by reading an RDF file. By default the triples of a local NamedGraph in a Stage can be modified and saved, e.g. by a Commit into a Repository or by writing them out to an RDF or SST file.
-
NamedGraphRevision: A particular version of a NamedGraph in a Repository, identified by a `Hash` value.
-
Stage: The place in memory where NamedGraphs are opened, created, modified or deleted. When a Stage is closed, all the NamedGraph objects in it become invalid. An SST application typically opens at least one Stage, but may use several in parallel. NamedGraphs can be either copied or moved from one Stage into another. A Stage may be linked to a Repository to make the `NamedGraph`s in it persistent.
-
Commit : The operation to write modified `NamedGraph`s in a Stage into the linked `Repository` of this `Stage`. During a Commit, the following are created:
- a new NamedGraph and a new Dataset entry for each new NamedGraph in the Stage
- a new NamedGraphRevision and a new DatasetRevision for each new or modified NamedGraph in the Stage
- and in addition a new DatasetRevision for each NamedGraph in the Stage that is not new or modified, but that imports directly or indirectly other modified NamedGraphs
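The last two rules amount to a reachability computation over the import relation: a NamedGraph needs a new DatasetRevision if it is new or modified itself, or if it directly or indirectly imports a modified NamedGraph. A minimal sketch of that computation follows. It is not the SST implementation; the function name `DatasetRevisionsToCreate`, the map-based import table and the `urn:example:` IRIs are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// DatasetRevisionsToCreate returns the IRIs of all NamedGraphs that need a new
// DatasetRevision: the new or modified graphs themselves plus every graph that
// imports one of them directly or indirectly.
// imports maps a NamedGraph IRI to the IRIs it directly imports.
func DatasetRevisionsToCreate(imports map[string][]string, modified []string) []string {
	// Invert the import relation: for each graph, who imports it directly?
	importers := map[string][]string{}
	for g, deps := range imports {
		for _, d := range deps {
			importers[d] = append(importers[d], g)
		}
	}

	// Breadth-first walk from the modified graphs towards their importers.
	affected := map[string]bool{}
	queue := append([]string(nil), modified...)
	for _, m := range modified {
		affected[m] = true
	}
	for len(queue) > 0 {
		g := queue[0]
		queue = queue[1:]
		for _, imp := range importers[g] {
			if !affected[imp] { // each graph is visited once, so import cycles terminate
				affected[imp] = true
				queue = append(queue, imp)
			}
		}
	}

	out := make([]string, 0, len(affected))
	for g := range affected {
		out = append(out, g)
	}
	sort.Strings(out) // stable output order for display
	return out
}

func main() {
	imports := map[string][]string{
		"urn:example:app":   {"urn:example:lib"},
		"urn:example:lib":   {"urn:example:base"},
		"urn:example:other": {},
	}
	// Only "base" was modified; "lib" and "app" are affected through imports.
	fmt.Println(DatasetRevisionsToCreate(imports, []string{"urn:example:base"}))
}
```

In this example only `urn:example:base` would receive a new NamedGraphRevision, while three graphs receive a new DatasetRevision and `urn:example:other` is left untouched.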
-
Triple : A statement consisting of a subject node, a predicate node and either an object node or a literal. For SST the subject node is owned by the NamedGraph in which the triple is defined.
-
IBNode : Using RDF terms, an IBNode is either an IRI node, a blank node or a collection (here `ObjectCollection`). Note that the optimised `LiteralCollection`s are not treated as `IBNode`s.
- IRI node : For SST, an IRI node is composed of the base IRI used for its NamedGraph, followed by the “#” symbol and a fragment. By default the fragment of a newly created IRI node is a random `UUID`.
- Blank node : An RDF blank node that is “owned” by the NamedGraph in which it is defined. Blank nodes in one NamedGraph cannot be referenced from outside that NamedGraph. Internally in SST a blank node is identified by a “namespace name-based” (version 5) `UUID`.
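The composition rule for IRI nodes (base IRI of the NamedGraph, then “#”, then a fragment) can be sketched as two small helpers. The names `NodeIRI` and `SplitNodeIRI` and the fragment `pump-1` are hypothetical and do not come from the SST Core API; the validation shown is a deliberately simple assumption rather than full IRI parsing.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// NodeIRI builds the IRI of an IRI node from the IRI of its NamedGraph and a fragment.
// A NamedGraph IRI carries no fragment itself, so a "#" in it is rejected.
func NodeIRI(graphIRI, fragment string) (string, error) {
	if strings.Contains(graphIRI, "#") {
		return "", errors.New("NamedGraph IRI must not contain a fragment")
	}
	if fragment == "" || strings.Contains(fragment, "#") {
		return "", errors.New("fragment must be non-empty and must not contain '#'")
	}
	return graphIRI + "#" + fragment, nil
}

// SplitNodeIRI splits an IRI node into the NamedGraph IRI that owns it and the fragment.
// ok is false when the IRI has no fragment.
func SplitNodeIRI(iri string) (graphIRI, fragment string, ok bool) {
	graphIRI, fragment, ok = strings.Cut(iri, "#")
	return graphIRI, fragment, ok && fragment != ""
}

func main() {
	graph := "urn:uuid:8D8AC610-566D-4EF0-9C22-186B2A5ED793"

	node, err := NodeIRI(graph, "pump-1") // any valid fragment, not only a UUID
	if err != nil {
		panic(err)
	}
	fmt.Println("node IRI:", node)

	owner, fragment, ok := SplitNodeIRI(node)
	fmt.Println("owner:", owner, "fragment:", fragment, "ok:", ok)
}
```

Because the owning NamedGraph can be recovered from the node IRI alone, a reference to an IRI node in another NamedGraph is unambiguous, which is not the case for blank nodes.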
-
Literal :
-
ObjectCollection : An RDF collection that is an ordered list of RDF nodes or literals. An ObjectCollection can also contain other ObjectCollections as elements.
-
LiteralCollection : A special optimised RDF collection type in which all items must be of the same literal type and which can only be used as the object of a triple. A LiteralCollection cannot be shared as the object of several triples.
-
Triplex structure : A special in-memory data structure that represents all the triples of the NamedGraphs in a Stage. The structure is optimised for fast traversal of an RDF graph in any direction:
- Uni-Directional: here only the forward direction of triples is available; starting from a subject to predicates and objects; this is common for most tools
- Bi-Directional: here in addition the reverse direction of triples is available; starting from an object node to predicates and subjects; a number of tools can support this only by a query operation
- Tri-Directional: here in addition a middle direction of triples is available; starting from a predicate to subjects and objects. This has special importance as SST is widely using “punning” in the sense that a node can be both a property and an individual TBD
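The three traversal directions can be illustrated with a toy index that stores each triple under its subject, its object and its predicate. This is only a sketch of the access pattern: the actual Triplex layout is not described here, and the type `Index`, its methods and the `ex:` names are hypothetical.

```go
package main

import "fmt"

// Triple is a simplified statement with all three positions given as strings.
type Triple struct{ S, P, O string }

// Index keeps every triple reachable from each of its three positions.
type Index struct {
	bySubject   map[string][]Triple
	byPredicate map[string][]Triple
	byObject    map[string][]Triple
}

// NewIndex returns an empty index.
func NewIndex() *Index {
	return &Index{
		bySubject:   map[string][]Triple{},
		byPredicate: map[string][]Triple{},
		byObject:    map[string][]Triple{},
	}
}

// Add registers the triple under its subject, predicate and object.
func (ix *Index) Add(t Triple) {
	ix.bySubject[t.S] = append(ix.bySubject[t.S], t)
	ix.byPredicate[t.P] = append(ix.byPredicate[t.P], t)
	ix.byObject[t.O] = append(ix.byObject[t.O], t)
}

// Forward returns the triples in which node is the subject.
func (ix *Index) Forward(node string) []Triple { return ix.bySubject[node] }

// Reverse returns the triples in which node is the object.
func (ix *Index) Reverse(node string) []Triple { return ix.byObject[node] }

// Middle returns the triples in which node is the predicate.
func (ix *Index) Middle(node string) []Triple { return ix.byPredicate[node] }

func main() {
	ix := NewIndex()
	ix.Add(Triple{"ex:pump", "ex:hasPart", "ex:rotor"})
	ix.Add(Triple{"ex:pump", "ex:hasPart", "ex:housing"})
	// Punning: ex:hasPart is used as a predicate above and as a subject here.
	ix.Add(Triple{"ex:hasPart", "rdf:type", "owl:ObjectProperty"})

	fmt.Println("forward from ex:pump:", ix.Forward("ex:pump"))
	fmt.Println("reverse from ex:rotor:", ix.Reverse("ex:rotor"))
	fmt.Println("middle from ex:hasPart:", ix.Middle("ex:hasPart"))
	fmt.Println("ex:hasPart as individual:", ix.Forward("ex:hasPart"))
}
```

With such a structure each direction is a direct lookup rather than a scan or query over all triples, and a punned node such as `ex:hasPart` can be reached both as a property and as an individual.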
-
RDF / Turtle file: W3C defines several formats of RDF files. For the time being the only RDF format supported by SST is Turtle, typically with the file extension `*.ttl`. SST restricts a Turtle file to contain exactly one NamedGraph, with all triples having subjects that are either IRI nodes with the same base IRI as this NamedGraph, or blank nodes or ObjectCollections of this NamedGraph.
-
SST-File: An SST-specific binary RDF file format that is highly optimised for speed. This file format is also used as the storage format inside an SST Repository and for the communication between an SST client and a remote SST Repository.
-
OAUTH2 : remote SST Repositories use the OAUTH2 protocol to verify the access rights of an application/client to access the content of the repository.
-
OIDC : OpenID Connect
-
gRPC: SST is using the Google Remote Procedure Call framework for the communication between an SST Client and a remote SST Repository