CoolCereal

CoolCereal is a compact binary serialization format based on TBB. The SHA-1 sum in the TBB header (i.e. bytes 4 through 23) identifies either the root CoolCereal format description document (urn:sha1:HQ25ZRMI7O3UTJWTUYLUH4WXNKMMPSHL) or another CoolCereal-formatted blob, which then acts as a preamble for the blob that references it.
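
As an illustration, the header check might look like the following Python sketch. It assumes that 'bytes 4 through 23' are zero-based offsets (so the slice is blob[4:24]) and that the urn:sha1 identifier is the base32 encoding of the 20-byte digest. The function names are invented for this example and are not part of the format.

```python
import base64

ROOT_URN = "urn:sha1:HQ25ZRMI7O3UTJWTUYLUH4WXNKMMPSHL"

def urn_to_sha1(urn: str) -> bytes:
    """Decode a urn:sha1 identifier (assumed to be base32) into 20 raw bytes."""
    prefix = "urn:sha1:"
    if not urn.startswith(prefix):
        raise ValueError("not a urn:sha1 identifier")
    digest = base64.b32decode(urn[len(prefix):])
    if len(digest) != 20:
        raise ValueError("a SHA-1 digest must be 20 bytes long")
    return digest

def referenced_sha1(blob: bytes) -> bytes:
    """Return the SHA-1 stored in header bytes 4..23 (assumed zero-based)."""
    if len(blob) < 24:
        raise ValueError("blob is too short to hold the header")
    return blob[4:24]

def references_root(blob: bytes) -> bool:
    """True if the blob points at the root format description document."""
    return referenced_sha1(blob) == urn_to_sha1(ROOT_URN)
```

If references_root returns False, the 20 bytes would instead be treated as the identifier of a preamble blob.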

Referencing a CoolCereal blob in the header is equivalent to including the blob inline. CoolCereal blobs may reference preambles to arbitrary depth, though applications may impose a reasonable limit.
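
One way to picture preamble inclusion with a depth limit is the Python sketch below. It rests on several assumptions that the text above does not state: that the payload begins directly after byte 23 (HEADER_LEN = 24), that 'inline' means the preamble's instructions run before those of the referencing blob, and that the application supplies a hypothetical fetch callback that maps a 20-byte SHA-1 to the blob it names. The default limit of 16 is arbitrary.

```python
import base64
from typing import Callable, List

# SHA-1 of the root format description, decoded from its urn:sha1 (base32) form.
ROOT_SHA1 = base64.b32decode("HQ25ZRMI7O3UTJWTUYLUH4WXNKMMPSHL")
HEADER_LEN = 24  # assumption: payload starts right after header byte 23

class PreambleTooDeep(Exception):
    """Raised when the preamble chain exceeds the application's limit."""

def inline_payload(blob: bytes,
                   fetch: Callable[[bytes], bytes],
                   max_depth: int = 16) -> bytes:
    """Return one instruction stream with every preamble included inline."""
    payloads: List[bytes] = []
    depth = 0
    while True:
        payloads.append(blob[HEADER_LEN:])
        ref = blob[4:HEADER_LEN]
        if ref == ROOT_SHA1:
            break  # reached a blob that references the root document
        if depth >= max_depth:
            raise PreambleTooDeep("more than %d nested preambles" % max_depth)
        blob = fetch(ref)  # hypothetical lookup supplied by the application
        depth += 1
    # The outermost preamble was collected last, so reverse before joining.
    return b"".join(reversed(payloads))
```

An application that keeps blobs in a dictionary keyed by SHA-1 could pass that dictionary's lookup method as fetch.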

The payload is a stream of variable-length bytecode instructions. By default, every opcode except 0x41 pushes its own number (interpreted as an 8-bit signed integer, so opcode 0xFF represents -1) onto the stack. 0x41 is the 'load opcode' opcode. It is followed by one byte giving the index of the opcode table entry to replace, and 20 bytes giving the 'name' of the opcode to load. These 20 bytes are assumed to be the SHA-1 sum of some human- and/or machine-readable representation of the opcode, though an application is not required to actually load the referenced resource. The total length of this instruction is therefore 22 bytes (1 opcode byte, 1 index byte and 20 name bytes).
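
The default instruction set can be sketched as a small Python interpreter. This is only one possible reading: the format does not say how a loaded opcode behaves, so the sketch assumes the application keeps a hypothetical known_ops registry that maps 20-byte opcode names to Python callables. Each callable receives the stack, the payload and the offset just past its opcode byte, and returns the offset of the next instruction, which lets it consume operand bytes of its own.

```python
from typing import Callable, Dict, List, Optional

LOAD_OPCODE = 0x41
NAME_LEN = 20

# Hypothetical signature for the behavior of a loaded opcode.
OpImpl = Callable[[List[object], bytes, int], int]

def run(payload: bytes, known_ops: Dict[bytes, OpImpl]) -> List[object]:
    """Execute a payload and return the resulting stack."""
    # None means the slot still has its default behavior.
    table: List[Optional[bytes]] = [None] * 256
    stack: List[object] = []
    pos = 0
    while pos < len(payload):
        opcode = payload[pos]
        pos += 1
        name = table[opcode]
        if name is not None:
            # Slot was replaced by an earlier 'load opcode' instruction.
            impl = known_ops.get(name)
            if impl is None:
                raise LookupError("no implementation for opcode " + name.hex())
            pos = impl(stack, payload, pos)
        elif opcode == LOAD_OPCODE:
            end = pos + 1 + NAME_LEN  # 1 index byte + 20 name bytes
            if end > len(payload):
                raise ValueError("truncated 'load opcode' instruction")
            table[payload[pos]] = payload[pos + 1:end]
            pos = end
        else:
            # Default: push the opcode number as a signed 8-bit integer.
            stack.append(opcode - 256 if opcode >= 0x80 else opcode)
    return stack
```

For example, run(bytes([0x05, 0xFF, 0x40]), {}) leaves 5, -1 and 64 on the stack, in that order.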

A reasonable convention for encoders is to keep a standard set of reserved opcodes: the default 0xC0..0xFF and 0x00..0x40 opcodes, which push the literal values -64 to 64, and 0x41 as the 'load opcode' opcode. (The text of urn:sha1:HQ25ZRMI7O3UTJWTUYLUH4WXNKMMPSHL says 0xA0 instead of 0xC0 because I was feeling bad at hex when I wrote it, but since it's only a suggestion for encoders it shouldn't really matter.)
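
Under that convention, encoding a small integer is a single byte whose value is the integer's two's-complement representation. A minimal Python sketch follows; the function name is made up for illustration.

```python
def encode_small_int(n: int) -> bytes:
    """Encode an integer in -64..64 as one reserved push-literal opcode."""
    if not -64 <= n <= 64:
        raise ValueError("only -64..64 map to reserved literal opcodes")
    # Masking with 0xFF gives the two's-complement byte:
    # -64 becomes 0xC0, -1 becomes 0xFF, 64 becomes 0x40.
    return bytes([n & 0xFF])
```

The range stops at 64 because the next value, 65, is 0x41, the 'load opcode' opcode.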

The encoded value is the value on top of the stack after all instructions have been applied. It is considered an encoding error for the stack to be empty at that point.
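
In code, this rule amounts to a check on the final stack. The Python sketch below assumes the stack is a list whose last element is the top; EncodingError and encoded_value are hypothetical names.

```python
from typing import List

class EncodingError(Exception):
    """The payload did not leave a value on the stack."""

def encoded_value(stack: List[object]) -> object:
    """Return the top of the stack once all instructions have been applied."""
    if not stack:
        raise EncodingError("stack is empty after the last instruction")
    return stack[-1]
```

Any values left below the top of the stack are simply ignored by this sketch.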

Older versions

The version described by urn:sha1:HQ25ZRMI7O3UTJWTUYLUH4WXNKMMPSHL differs in the default instruction set: only opcode 0x01 is defined, and it loads a library of opcodes identified by the 20 bytes that follow it. The library, rather than the including opcode stream, defines the mapping from opcode number to behavior.
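
For comparison, reading the older 'load library' instruction could be sketched in Python as below. The 21-byte layout (one opcode byte plus 20 identifier bytes) follows from the description above; the function name and the choice to return the next offset are assumptions made for this example.

```python
from typing import Tuple

LEGACY_LOAD_LIBRARY = 0x01
LIBRARY_ID_LEN = 20

def read_legacy_library_load(payload: bytes, pos: int = 0) -> Tuple[bytes, int]:
    """Parse a 0x01 instruction; return (library id, offset after it)."""
    if pos >= len(payload) or payload[pos] != LEGACY_LOAD_LIBRARY:
        raise ValueError("expected opcode 0x01 at offset %d" % pos)
    end = pos + 1 + LIBRARY_ID_LEN
    if end > len(payload):
        raise ValueError("truncated library identifier")
    return payload[pos + 1:end], end
```

Everything after the returned offset would have to be interpreted according to the loaded library, which this sketch does not attempt.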