0.0.0

Specification for dr4 version 0.0.0

Notes

  • This is the first ever drafted specification for dr4
  • This is the only non-standardized spec, other specifications do not support backwards compatability.
  • This specification used a very minimal span of data types.

Document

A dr4 document shall be defined as a sequence of bytes, that includes a document header, document body, and a termination sequence.

Document Header

A dr4 document header shall be defined as a sequence of eight unsigned 8-bit integers at the beginning of the document. These are designated as follows:

  • Bytes 1-3 are the magic sequence: 83, 94 and 121.
  • Bytes 4-6 are the version bytes.
  • Byte 7 is the sizer byte for varieties.
  • Byte 8 is reserved for future use.

Note: This version of the dr4 spec is non-standard. It does not enforce the presence of a version in the beginning of a document, nor a size variety.

Document Body

A dr4 document body shall be defined as a series of data rows, terminated by a 4-byte sequence, all of which have the value 0.

  • A document body is always terminated by 0 0 0 0, also read as \x00\x00\x00\x00

Rows

A dr4 row shall be defined as a sequence of bytes, which includes a header, body, and stop byte.

Row Header

A dr4 row header shall be defined as sequence of either unsigned 8-bit, 16-bit, or 32-bit integers, which represent the size, length, and offsets of the row. The header values are defined as follows:

  • Bytes 1-4 of the header are an unsigned (8, 16, or 32-bit) integer holding the row size.
  • Bytes 5-8 of the header are an unsigned (8, 16, or 32-bit) integer holding the row length.
  • The row length shall never be greater than the row size.
  • The row length shall never be zero.
  • The row size shall never be zero.
  • Bytes 9-f of the header are a sequence of unsigned (8, 16, or 32-bit) integers.

The offset array of the row header's end byte, f, is derived by the following formulas:

8-bit f = 2 + length - 1 16-bit f = 4 + (length * 2) - 1 32-bit f = 8 + (length * 4) - 1

The offset of a row's field is defined as the first byte of the field at an offset of n bytes from byte 0 of the row's body. The rules of the offsets held within a row header's offset array are defined as:

  • An offset value is never greater than the size of the row.
  • No two offset value's can be the same within the same row.
  • The length of the row and the size of the offset array must always be equal.
  • The first offset in the offset array is always 0.
  • Each row must always have at least one offset.

Row Body

A dr4 row body shall be defined as a sequence of bytes, that is terminated by a stopping byte. The stopping byte is an unsigned 8-bit integer with a value of 0. The contents within the row body correspond to the typed fields which hold the data of the dr4 format.

Types

A type shall be defined as a unique, 8-bit unsigned integer with indicates the type of data stored within a field. A type mark is defined as the first byte of each type, which indicates the type.


Notation

The binary notation used adheres to the format:

\x01 == 1 == 00000001
 |      |       |
 fmt    dec    binary 

The max value for a byte is \x255 and the minimum is \x00.

Type Listings

The types supported by this specification are as follows:


 STOP                     \x00
 NONE (none type)         \x01
 BOOL (boolean type)      \x02
 WILD (wild card)         \x03
 SI32 (signed 32-bit int) \x04

None

The none type is a data type with a size of 1 byte. It is intended to represent the absence of a value. In common programming languages, this data type is equivalent to NULL from c++, or None from Python.

Bool

The bool type is a data type with a size of 2 bytes. The first byte is the bool type byte, \x02, while the second byte holds the state. \x01 corresponds to the true state, and \x00 for the false state.

Wild

The wild type is a data type with a size of 1 byte. It is meant to represent "any" type and value, such that in equality comparison to other fields or rows, it shall always return true. It is included directly in the type system to facilitate queries and searching.

Si32

The si32 type is a data type with a size of 5 bytes. It is meant to hold the value of a signed 32-bit integer. The first byte is \x04, while the next four bytes hold the integer.