0.0.0
Specification for dr4 version 0.0.0
dr4
A dr4 document shall be defined as a sequence of bytes that includes a document header, a document body, and a termination sequence.
A dr4 document header shall be defined as a sequence of eight unsigned 8-bit integers at the beginning of the document. These are designated as follows:
- Bytes 1-3 are the magic sequence: 83, 94 and 121.
- Bytes 4-6 are the version bytes.
- Byte 7 is the sizer byte for varieties.
- Byte 8 is reserved for future use.
Note: This version of the dr4 spec is non-standard. It does not enforce the presence of a version at the beginning of a document, nor a size variety.
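The header layout above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `parse_header` and the returned dictionary keys are hypothetical, and the sketch only checks the magic sequence because this version of the spec does not enforce the version or sizer bytes.

```python
MAGIC = bytes([83, 94, 121])  # magic sequence listed in the spec
HEADER_SIZE = 8               # the header is eight unsigned 8-bit integers


def parse_header(data: bytes) -> dict:
    """Split the first eight bytes of a document into named parts.

    Hypothetical helper: the names used here are not defined by the spec.
    """
    if len(data) < HEADER_SIZE:
        raise ValueError("document is shorter than the 8-byte header")
    header = data[:HEADER_SIZE]
    if header[0:3] != MAGIC:
        raise ValueError("magic sequence mismatch")
    return {
        "magic": header[0:3],           # bytes 1-3
        "version": tuple(header[3:6]),  # bytes 4-6
        "sizer": header[6],             # byte 7
        "reserved": header[7],          # byte 8
    }


example = bytes([83, 94, 121, 0, 0, 0, 0, 0])
print(parse_header(example))
```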
A dr4 document body shall be defined as a series of data rows, terminated by a sequence of four bytes, each of which has the value 0.
- A document body is always terminated by 0 0 0 0, also read as \x00\x00\x00\x00
A dr4 row shall be defined as a sequence of bytes that includes a header, a body, and a stop byte.
A dr4 row header shall be defined as a sequence of unsigned 8-bit, 16-bit, or 32-bit integers, which represent the size, length, and offsets of the row. The header values are defined as follows:
- Bytes 1-4 of the header are an unsigned (8, 16, or 32-bit) integer holding the row size.
- Bytes 5-8 of the header are an unsigned (8, 16, or 32-bit) integer holding the row length.
- The row length shall never be greater than the row size.
- The row length shall never be zero.
- The row size shall never be zero.
- Bytes 9-f of the header are a sequence of unsigned (8, 16, or 32-bit) integers forming the offset array.
The end byte of the row header's offset array, f, is derived by the following formulas:
- 8-bit: f = 2 + length - 1
- 16-bit: f = 4 + (length * 2) - 1
- 32-bit: f = 8 + (length * 4) - 1
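The three formulas share one shape: the bytes taken by the size and length values, plus one integer per offset, minus one. A small Python sketch of that arithmetic follows; the function name `offset_array_end` and the `HEADER_BASE` table are hypothetical names chosen for illustration.

```python
# Bytes occupied by the size and length values for each integer width,
# taken directly from the constants in the formulas (2, 4, and 8).
HEADER_BASE = {8: 2, 16: 4, 32: 8}


def offset_array_end(width_bits: int, length: int) -> int:
    """Compute f, the end byte of the offset array, for a given width."""
    if width_bits not in HEADER_BASE:
        raise ValueError("width must be 8, 16, or 32 bits")
    if length < 1:
        raise ValueError("the row length shall never be zero")
    bytes_per_offset = width_bits // 8
    return HEADER_BASE[width_bits] + length * bytes_per_offset - 1


for width in (8, 16, 32):
    print(width, offset_array_end(width, 3))
```

For a row length of 3, this evaluates to 4, 9, and 19 for the 8-bit, 16-bit, and 32-bit cases respectively.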
The offset of a row's field is defined as the position of the first byte of the field, n bytes from byte 0 of the row's body. The rules for the offsets held within a row header's offset array are defined as:
- An offset value is never greater than the size of the row.
- No two offset values can be the same within the same row.
- The length of the row and the size of the offset array must always be equal.
- The first offset in the offset array is always 0.
- Each row must always have at least one offset.
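The offset rules lend themselves to a simple validation pass. The sketch below is one possible way to check them; `validate_offsets` is a hypothetical helper, and it assumes the offsets, row size, and row length have already been decoded into Python integers.

```python
def validate_offsets(offsets, row_size, row_length):
    """Return a list of rule violations (empty when the offsets are valid)."""
    problems = []
    if len(offsets) == 0:
        problems.append("row has no offsets")
    elif offsets[0] != 0:
        problems.append("first offset is not 0")
    if len(offsets) != row_length:
        problems.append("offset count differs from row length")
    if len(set(offsets)) != len(offsets):
        problems.append("duplicate offset")
    if any(offset > row_size for offset in offsets):
        problems.append("offset greater than row size")
    return problems


print(validate_offsets([0, 1, 3], row_size=8, row_length=3))
print(validate_offsets([1, 1], row_size=8, row_length=3))
```

Collecting every violation, rather than stopping at the first, is a design choice made here so that a malformed row can be reported in one pass.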
A dr4 row body shall be defined as a sequence of bytes that is terminated by a stopping byte. The stopping byte is an unsigned 8-bit integer with a value of 0. The contents within the row body correspond to the typed fields which hold the data of the dr4 format.
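To make the header, body, and stop byte concrete, here is a hedged Python sketch that assembles a row using 8-bit header integers. It rests on assumptions the text does not settle: that the row size counts the bytes of the body (excluding the stop byte), and that the row length counts the fields. The name `build_row_8bit` is hypothetical.

```python
from typing import List


def build_row_8bit(fields: List[bytes]) -> bytes:
    """Assemble header + body + stop byte using 8-bit header integers.

    ASSUMPTION: row size = number of body bytes, row length = field count.
    The spec does not state what the size measures.
    """
    if not fields:
        raise ValueError("a row needs at least one field")
    body = b"".join(fields)
    offsets = []
    position = 0
    for field in fields:
        offsets.append(position)  # first byte of the field within the body
        position += len(field)
    size = len(body)
    length = len(fields)
    if size > 255:
        raise ValueError("body too large for 8-bit header integers")
    header = bytes([size, length] + offsets)
    return header + body + b"\x00"  # stop byte


# One none field followed by one bool field set to true.
row = build_row_8bit([b"\x01", b"\x02\x01"])
print(row.hex(" "))
```

Under these assumptions the example yields the bytes 03 02 00 01 01 02 01 00: size 3, length 2, offsets 0 and 1, the three body bytes, and the stop byte.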
A type shall be defined as a unique, 8-bit unsigned integer which indicates the type of data stored within a field. A type mark is defined as the first byte of each typed field, which indicates the type.
The binary notation used adheres to the format:
\x01 == 1 == 00000001
 |      |       |
fmt    dec    binary
The maximum value for a byte is \xff (255) and the minimum is \x00 (0).
The types supported by this specification are as follows:
STOP                        \x00
NONE (none type)            \x01
BOOL (boolean type)         \x02
WILD (wild card)            \x03
SI32 (signed 32-bit int)    \x04
The none type is a data type with a size of 1 byte. It is intended to represent the absence of a value. In common programming languages, this data type is equivalent to NULL from C++, or None from Python.
The bool type is a data type with a size of 2 bytes. The first byte is the bool type byte, \x02, while the second byte holds the state: \x01 corresponds to the true state, and \x00 to the false state.
The wild type is a data type with a size of 1 byte. It is meant to represent "any" type and value, such that in equality comparison to other fields or rows, it shall always return true. It is included directly in the type system to facilitate queries and searching.
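The wild card comparison rule might be sketched as below. This is an illustration only: `fields_equal` is a hypothetical helper that compares two already-extracted fields as raw bytes, and it does not cover row-level comparison.

```python
WILD = b"\x03"  # the wild type is a single type-mark byte


def fields_equal(a: bytes, b: bytes) -> bool:
    """Compare two encoded fields, treating a wild field as matching anything."""
    if a == WILD or b == WILD:
        return True
    return a == b


print(fields_equal(b"\x02\x01", WILD))         # bool(true) against wild
print(fields_equal(b"\x02\x01", b"\x02\x00"))  # bool(true) against bool(false)
```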
The si32 type is a data type with a size of 5 bytes. It is meant to hold the value of a signed 32-bit integer. The first byte is \x04, while the next four bytes hold the integer.
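The field sizes described for the four data types can be exercised with a short encode/decode sketch in Python. All names here are hypothetical. The byte order of the si32 payload is not specified in the text, so little-endian is assumed purely for illustration, and the `WILDCARD` sentinel is an invented stand-in for a decoded wild field.

```python
import struct

NONE, BOOL, WILD, SI32 = 0x01, 0x02, 0x03, 0x04
FIELD_SIZES = {NONE: 1, BOOL: 2, WILD: 1, SI32: 5}  # sizes in bytes, type mark included

WILDCARD = object()  # hypothetical sentinel returned for a wild field


def encode_bool(value: bool) -> bytes:
    """Type mark \\x02 followed by the state byte."""
    return bytes([BOOL, 1 if value else 0])


def encode_si32(value: int) -> bytes:
    """Type mark \\x04 followed by four bytes.

    ASSUMPTION: little-endian byte order; the spec does not state one.
    """
    return bytes([SI32]) + struct.pack("<i", value)


def decode_field(data: bytes, offset: int = 0):
    """Decode the field whose type mark sits at data[offset]."""
    mark = data[offset]
    if mark not in FIELD_SIZES:
        raise ValueError(f"unknown type mark {mark}")
    end = offset + FIELD_SIZES[mark]
    if end > len(data):
        raise ValueError("field is truncated")
    payload = data[offset + 1:end]
    if mark == NONE:
        return None
    if mark == BOOL:
        return payload[0] == 1
    if mark == WILD:
        return WILDCARD
    return struct.unpack("<i", payload)[0]


print(encode_si32(-5).hex(" "))
print(decode_field(encode_si32(-5)))
print(decode_field(encode_bool(True)))
```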