TOGoS Binary Blocks (TBB) is a simple, extensible, and compact way to associate arbitrarily encoded objects with encoding information.
A TBB packet is encoded as follows:
The TBB format does not provide any way to indicate the length or checksum of the packet; that is left to the container (such as a UDP packet, a file, or a database record).
A schema should indicate a format for the payload and a prototype.
How to interpret these properties is up to the application, and they can be either rdf:resource links or inline RDF+XML. It's expected that format will often be an opaque resource URI identifying a predefined format, but it may also point to a machine-readable format description (such as ASN.1 encoding rules), or simply be a block of human-readable text describing the format.
As an example, say there are QQ objects to be described (never mind what a QQ represents) that have, among other properties, foofiness and color, and that 23 and purple are such common values for those properties that we don't want to repeat them in each serialization of a QQ object. In that case, the schema might be:
<Schema xmlns="http://ns.nuke24.net/TBB/"
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <format rdf:resource="http://example.com/QQ/CompactQFormat1"/>
  <prototype>
    <qq:QQObject xmlns:qq="http://example.com/QQ/">
      <qq:foofiness>23</qq:foofiness>
      <qq:color>purple</qq:color>
    </qq:QQObject>
  </prototype>
</Schema>
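As a rough illustration, the example schema could be read with Python's standard XML parser along the following lines. The function names `read_schema` and `apply_prototype` are hypothetical, and treating the prototype as a set of default property values that a decoded payload overrides is only one possible interpretation, since the meaning of these properties is left to the application. The sketch handles only an rdf:resource format and an inline prototype.

```python
import xml.etree.ElementTree as ET

TBB_NS = "http://ns.nuke24.net/TBB/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
QQ = "{http://example.com/QQ/}"  # ElementTree-style prefix for QQ properties

SCHEMA_XML = """<Schema xmlns="http://ns.nuke24.net/TBB/"
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <format rdf:resource="http://example.com/QQ/CompactQFormat1"/>
  <prototype>
    <qq:QQObject xmlns:qq="http://example.com/QQ/">
      <qq:foofiness>23</qq:foofiness>
      <qq:color>purple</qq:color>
    </qq:QQObject>
  </prototype>
</Schema>"""


def read_schema(xml_text):
    """Return (format_uri, defaults) for a schema shaped like the example."""
    root = ET.fromstring(xml_text)

    # The format is given here as an rdf:resource link.
    fmt = root.find(f"{{{TBB_NS}}}format")
    format_uri = None if fmt is None else fmt.get(f"{{{RDF_NS}}}resource")

    # Assumption: the first element inside <prototype> carries default values.
    defaults = {}
    proto = root.find(f"{{{TBB_NS}}}prototype")
    if proto is not None and len(proto) > 0:
        for prop in proto[0]:
            defaults[prop.tag] = prop.text
    return format_uri, defaults


def apply_prototype(defaults, decoded):
    """Fill in prototype values for properties the payload did not carry."""
    merged = dict(defaults)
    merged.update(decoded)
    return merged


format_uri, defaults = read_schema(SCHEMA_XML)
qq_object = apply_prototype(defaults, {QQ + "color": "green"})
```

Here `qq_object` keeps the prototype's foofiness while the payload's color takes precedence.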
In addition to format and prototype, a schema may specify other, format-specific information. When represented as RDF, format-specific properties should be namespaced appropriately (not in the http://ns.nuke24.net/TBB/ namespace).
As an alternative to RDF+XML, the schema itself may be encoded in the TBB format, or any other encoding, so long as it's distinguishable and understood by the target application.
Applications whose objects have only a few predetermined schemas may have interpreters pre-loaded, so that they do not need to load schemas at all, but simply look them up by their hash. For future compatibility, those hashes should still be based on RDF+XML documents that actually describe the schemas.
How to go about loading schemas is left completely up to the application. Since the SHA-1 sum of the schema is known, you are free to fetch it from untrusted sources: whatever you receive can be checked against the hash before it is used. Since many objects will share a single schema, and because schemas are uniquely identified by their hash, applications can easily cache an efficient representation.
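One way this could look in practice is sketched below. It assumes the schema ID is the raw 20-byte SHA-1 digest of the schema document's bytes; `SchemaCache`, the fetcher callables, and the `parse` hook are all hypothetical names invented for this example, not part of TBB.

```python
import hashlib


class SchemaCache:
    """Loads schemas by SHA-1 ID from untrusted sources and caches them."""

    def __init__(self, fetchers, parse=lambda data: data, preloaded=None):
        # fetchers: callables taking a schema ID and returning bytes or None.
        self._fetchers = list(fetchers)
        # parse: turns verified schema bytes into whatever the app wants to keep.
        self._parse = parse
        # preloaded: optional {schema_id: interpreter} for schemas known up front.
        self._cache = dict(preloaded or {})

    def get(self, schema_id):
        if schema_id in self._cache:
            return self._cache[schema_id]
        for fetch in self._fetchers:
            data = fetch(schema_id)
            # Accept the data only if it hashes to the ID that was asked for.
            if data is not None and hashlib.sha1(data).digest() == schema_id:
                parsed = self._parse(data)
                self._cache[schema_id] = parsed
                return parsed
        raise KeyError(schema_id.hex())
```

Because every response is checked against the requested hash, a source that returns tampered data is simply skipped, and a schema is parsed at most once no matter how many objects refer to it.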
TOGoS Text Blocks (TTB) is a format with semantics identical to those of TOGoS Binary Blocks, but with a text header of the form "#TTB " + datatype URI + newline, or a shebang line (i.e. starting with "#!"), which is ignored, followed by a "#TTB " line.
Rather than opaque 20-byte format identifiers, XML datatype URIs can be used.
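The header rule above can be sketched as a small parser. This is only an illustration: `split_ttb` is a hypothetical name, and the sketch assumes the datatype URI is ASCII and runs to the end of the header line, with "\n" as the newline.

```python
TTB_PREFIX = b"#TTB "


def split_ttb(data: bytes):
    """Split a TTB document into (datatype_uri, content)."""
    line, newline, rest = data.partition(b"\n")

    # An optional leading shebang line is ignored.
    if line.startswith(b"#!"):
        line, newline, rest = rest.partition(b"\n")

    # The header must be a complete "#TTB <datatype URI>" line.
    if not newline or not line.startswith(TTB_PREFIX):
        raise ValueError("missing #TTB header line")

    uri = line[len(TTB_PREFIX):].strip().decode("ascii")
    if not uri:
        raise ValueError("empty datatype URI in #TTB header")
    return uri, rest
```

The content is returned as bytes because the payload does not have to be text.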
To refer to a format that is in turn described by another document, use a URI of the form document URI + "#" + fragment ID, where fragment ID may be blank for cases where the entire document unambiguously describes a single concept, such as RDF+XML documents where the root node is a description.
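A minimal sketch of building and taking apart such URIs follows. The helper names `make_ref` and `split_ref` are made up for this example, and the URIs in the usage lines are placeholders.

```python
def make_ref(document_uri, fragment_id=""):
    """Build document URI + "#" + fragment ID; the fragment may be blank."""
    if "#" in document_uri:
        raise ValueError("document URI must not already contain a fragment")
    return document_uri + "#" + fragment_id


def split_ref(uri):
    """Return (document_uri, fragment_id); fragment_id is None without a "#"."""
    document_uri, hash_mark, fragment_id = uri.partition("#")
    return document_uri, (fragment_id if hash_mark else None)


whole_document = make_ref("http://example.com/QQ/schema.rdf")
one_concept = make_ref("http://example.com/QQ/formats.rdf", "CompactQFormat1")
```

Returning None when there is no "#" keeps a plain document URI distinct from a reference with a blank fragment ID.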
TOGoS Text Blocks is most useful as an alternative to TOGoS Binary Blocks when the content is also a text-based format, but that is not required.
So long as a format has both a 20-byte ID and a datatype URI defined, it could be encapsulated in either a TBB or a TTB document. They can even embed each other! Or at least will be able to once I define a 20-byte ID corresponding to http://ns.nuke24.net/Datatypes/Subject.
TODO: Define a standard, repeatable method for converting XML datatype URIs to TBB schema IDs, maybe using v5 UUIDs. Embed a JS form on this page to do the calculation.
A URI of the form document URI + "#", where document URI is the URI of a TBB or TTB document, refers to the object described by that document.
You can also use the http://ns.nuke24.net/Datatypes/Subject XML datatype to indicate the lexical-to-value mapping of an RDF literal, or in other cases where such datatype URIs are used, such as TTB itself. As a silly example, you could have a TTB document prefixed with any number of "#TTB http://ns.nuke24.net/Datatypes/Subject" lines, which would act as no-ops, as they would essentially say "interpret this document the same way you were already about to, but start at the next line".