Modula-2 Reloaded

A Modern Typesafe & Literate Programming Notation

Serialised Scalar Format

Synopsis

The following describes an ASCII-based serialisation format for numeric scalar values, together with conversion primitives that convert between the numeric and the serialised representation.

Objectives

The serialised format needs to meet the following objectives:

  • avoid non-printable characters
  • store all meta-data fields at fixed offsets
  • preserve the radix system of the value's internal representation
  • store explicit length of the string holding the serialised value
  • store explicit number of digits for both significand and exponent
  • store the sign and digit stream of the significand at a fixed offset
  • allow efficient calculation of the offset of the exponent's sign and digit stream
  • encode all numeric meta-data in a power-of-two radix system
  • encode digit streams using contiguous symbol sets for maximum efficiency
  • avoid apostrophe, quotation marks, backquote and backslash in symbol sets
  • support significands with up to 999 digits and exponents with up to 15 digits

Data Fields

A serialised scalar contains the following data fields:

  • version: 1 octet, protocol version, at present always 1
  • length: 2 octets (8..1023), total number of octets used to encode the scalar
  • encoding: 1 octet ("D" or "H"), designator of radix system used to encode the scalar
  • significand digit count: 2 octets (1..999), number of digits encoded in significand field
  • exponent digit count: 1 octet (0..15), number of digits encoded in exponent field
  • significand sign: 1 octet ("+" or "-"), sign of the significand
  • significand digits: variable length, digit stream of significand
  • exponent sign [1]: 1 octet ("+" or "-"), sign of the exponent
  • exponent digits [1]: variable length, digit stream of exponent
  • terminator: 1 octet (ASCII NUL), string terminator

[1] If the value of the exponent's digit count field is zero, then no exponent is encoded.
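Because every meta-data field sits at a fixed offset, the position of the exponent follows from a single addition. The following Python sketch illustrates this. The offsets are derived from the order of the field list above; the names `split_fields` and `b32_value` are hypothetical, and the sketch neither validates its input nor interprets the length field.

```python
# Illustrative sketch: offsets follow the order of the data field list above.
VERSION_AT, LENGTH_AT, ENCODING_AT = 0, 1, 3
SIG_COUNT_AT, EXP_COUNT_AT, SIG_SIGN_AT, SIG_DIGITS_AT = 4, 6, 7, 8


def b32_value(text: str) -> int:
    """Decode Base32 meta-data digits ("0".."O" map to 0..31), most significant first."""
    value = 0
    for ch in text:
        digit = ord(ch) - ord("0")
        if not 0 <= digit <= 31:
            raise ValueError(f"not a Base32 digit: {ch!r}")
        value = value * 32 + digit
    return value


def split_fields(s: str) -> dict:
    """Slice a serialised scalar into its fields using the fixed offsets."""
    sig_count = b32_value(s[SIG_COUNT_AT:SIG_COUNT_AT + 2])
    exp_count = b32_value(s[EXP_COUNT_AT])
    exp_sign_at = SIG_DIGITS_AT + sig_count  # one addition locates the exponent
    fields = {
        "version": b32_value(s[VERSION_AT]),
        "length": b32_value(s[LENGTH_AT:LENGTH_AT + 2]),
        "encoding": s[ENCODING_AT],
        "sig_sign": s[SIG_SIGN_AT],
        "sig_digits": s[SIG_DIGITS_AT:exp_sign_at],
        "exp_sign": None,
        "exp_digits": "",
    }
    if exp_count > 0:  # an exponent is present only if its digit count is non-zero
        fields["exp_sign"] = s[exp_sign_at]
        fields["exp_digits"] = s[exp_sign_at + 1:exp_sign_at + 1 + exp_count]
    return fields
```

For the sample string `"10?D032+123-45\0"`, the significand digit count `"03"` places the exponent sign at offset 8 + 3 = 11.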

Encoding of Numeric Values

The digits in an encoded digit stream appear in order of highest to lowest significance and represent the coefficients of a polynomial of the form:

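$$\mathit{value} = \mathit{digit}_0 \cdot \mathit{radix}^{n} + \mathit{digit}_1 \cdot \mathit{radix}^{n-1} + \mathit{digit}_2 \cdot \mathit{radix}^{n-2} + \dots + \mathit{digit}_{n-1} \cdot \mathit{radix}^{1} + \mathit{digit}_n \cdot \mathit{radix}^{0}$$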

  • Numeric meta-data is always encoded with radix 32, using Base32 serialisation.
  • Significand and exponent digit streams are encoded as follows:
    • radix 10, using Base10 serialisation if the value of the encoding field is "D"
    • radix 16, using Base16 serialisation if the value of the encoding field is "H"
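Since the digits arrive in order of highest to lowest significance, the polynomial can be evaluated in a single left-to-right pass using Horner's rule. The Python sketch below shows this on digit values that have already been decoded from their characters; the function name `evaluate` is hypothetical and not part of the specification.

```python
def evaluate(digits: list[int], radix: int) -> int:
    """Evaluate digit_0 * radix**n + ... + digit_n * radix**0 using Horner's rule."""
    value = 0
    for digit in digits:  # highest significance first
        if not 0 <= digit < radix:
            raise ValueError(f"digit {digit} is out of range for radix {radix}")
        value = value * radix + digit
    return value
```

For example, `evaluate([1, 2, 3], 10)` yields 123, and `evaluate([31, 7], 32)` yields 31 * 32 + 7 = 999, the largest permitted significand digit count.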

Base10 Serialisation

Base10 serialisation is a radix 10 based ASCII encoding of numeric values. Digits are represented by the ASCII characters in the code point range from decimal 48 ("0") indicating 0, to decimal 57 ("9") indicating 9.

Base16 Serialisation

Base16 serialisation is a radix 16 based ASCII encoding of numeric values. Digits are represented by the ASCII characters in the code point range from decimal 48 ("0") indicating 0, to decimal 63 ("?") indicating 15.

Base32 Serialisation

Base32 serialisation is a radix 32 based ASCII encoding of numeric values. Digits are represented by the ASCII characters in the code point range from decimal 48 ("0") indicating 0, to decimal 79 ("O") indicating 31.
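All three symbol sets start at code point 48 and are contiguous, so converting between a digit value and its character is a single addition or subtraction. The following Python sketch illustrates this for all three radices; the function names `encode_digits` and `decode_digits` and the `width` parameter are hypothetical.

```python
def encode_digits(value: int, radix: int, width: int = 1) -> str:
    """Encode a non-negative integer with the contiguous symbol set starting at "0"."""
    if value < 0:
        raise ValueError("only non-negative values can be encoded")
    if radix not in (10, 16, 32):
        raise ValueError("radix must be 10, 16 or 32")
    chars = []
    while value > 0:
        value, digit = divmod(value, radix)
        chars.append(chr(ord("0") + digit))  # digit value to character
    # Digits were produced lowest first; pad with "0" up to the field width.
    return "".join(reversed(chars)).rjust(width, "0")


def decode_digits(text: str, radix: int) -> int:
    """Decode a digit stream, most significant digit first."""
    value = 0
    for ch in text:
        digit = ord(ch) - ord("0")  # character to digit value
        if not 0 <= digit < radix:
            raise ValueError(f"{ch!r} is not a valid digit for radix {radix}")
        value = value * radix + digit
    return value
```

With this mapping, `encode_digits(255, 16)` gives `"??"` rather than the conventional `"FF"`, and `encode_digits(999, 32, width=2)` gives `"O7"`.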

EBNF

serialisedScalarFormat :
    version length encoding sigDigitCount expDigitCount
    sigSign sigDigits ( expSign expDigits )? terminator ;

version :
    digitB32 ;    // protocol version, at present the value is always 1

length :
    digitB32 digitB32 ;    // allocated length, value between 8 and 1023

encoding :
    "D" | "H" ;    // radix system, D for base 10, H for base 16

sigDigitCount :
    digitB32 digitB32 ;    // digit count of significand, value between 1 and 999

expDigitCount :
    digitB32 ;    // digit count of exponent, value between 0 and 15

sigSign :
    "+" | "-" ;    // sign of significand

sigDigits :
    digitB10+ | digitB16+ ;    // digits of significand

expSign :
    "+" | "-" ;    // sign of exponent

expDigits :
    digitB10+ | digitB16+ ;    // digits of exponent

digitB10 :
    "0" .. "9" ;    // representing values between 0 and 9

digitB16 :
    "0" .. "?" ;    // representing values between 0 and 15

digitB32 :
    "0" .. "O" ;    // representing values between 0 and 31

terminator :
    ASCII(0) ;
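A well-formedness check against this grammar could be sketched in Python as follows. The function name `is_well_formed` is hypothetical. The sketch additionally applies the rule from the prose that the encoding field selects the digit set, and it checks the digit counts against the digit streams; it does not check the version value or the length field.

```python
import re

# version, length, encoding, sigDigitCount, expDigitCount
HEADER = re.compile(r"([0-O])([0-O]{2})([DH])([0-O]{2})([0-O])")


def b32(text: str) -> int:
    """Decode Base32 meta-data digits ("0".."O" map to 0..31)."""
    value = 0
    for ch in text:
        value = value * 32 + (ord(ch) - ord("0"))
    return value


def is_well_formed(s: str) -> bool:
    """Return True if s matches the grammar and its digit counts are consistent."""
    header = HEADER.match(s)
    if header is None:
        return False
    encoding = header.group(3)
    sig_count = b32(header.group(4))
    exp_count = b32(header.group(5))
    if not 1 <= sig_count <= 999 or exp_count > 15:
        return False
    digit = "[0-9]" if encoding == "D" else "[0-?]"
    body = rf"[+-]{digit}{{{sig_count}}}"       # sigSign sigDigits
    if exp_count > 0:
        body += rf"[+-]{digit}{{{exp_count}}}"  # expSign expDigits
    body += r"\x00"                             # terminator
    return re.fullmatch(body, s[header.end():]) is not None
```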

Conversion Primitives

All library-defined scalar types should implement the following conversion primitives, so that any two numeric types can be converted into one another via the serialised format even where no direct conversion path exists:

PROCEDURE [TO] toSerialized ( value : <ScalarType>; VAR serialized : ARRAY OF CHAR );
(* Converts <value> to its serialised scalar representation and passes the result back
   in <serialized>. The output is shortened if the size of the passed in character array
   is insufficient to represent all available digits. *)

PROCEDURE [FROM] fromSerialized ( VAR value : <ScalarType>; serialized : ARRAY OF CHAR );
(* Converts the passed in serialised scalar to a value of type <ScalarType>
   and passes the result back in <value>. *)
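To illustrate the round trip, the following Python sketch mimics the two primitives for plain integers only. It is not the library implementation. The names `to_serialised` and `from_serialised` are hypothetical, the length field is assumed to count every octet including the terminator, no exponent is ever written, and the shortening of output for undersized buffers is not modelled.

```python
def _encode(value: int, radix: int, width: int = 1) -> str:
    """Encode a non-negative integer with the symbol set starting at "0"."""
    chars = ""
    while value > 0:
        value, digit = divmod(value, radix)
        chars = chr(ord("0") + digit) + chars
    return chars.rjust(width, "0")


def _decode(text: str, radix: int) -> int:
    """Decode a digit stream, most significant digit first."""
    value = 0
    for ch in text:
        digit = ord(ch) - ord("0")
        if not 0 <= digit < radix:
            raise ValueError(f"{ch!r} is not a valid digit for radix {radix}")
        value = value * radix + digit
    return value


def to_serialised(value: int, encoding: str = "D") -> str:
    """Serialise an integer; encoding is "D" (radix 10) or "H" (radix 16)."""
    radix = {"D": 10, "H": 16}[encoding]
    digits = _encode(abs(value), radix)
    if len(digits) > 999:
        raise ValueError("significand exceeds 999 digits")
    sign = "-" if value < 0 else "+"
    # ASSUMPTION: length counts all octets, terminator included (9 fixed + digits).
    length = 9 + len(digits)
    return ("1" + _encode(length, 32, 2) + encoding
            + _encode(len(digits), 32, 2) + "0" + sign + digits + "\0")


def from_serialised(s: str) -> int:
    """Recover an integer from a serialised scalar that carries no exponent."""
    radix = {"D": 10, "H": 16}[s[3]]
    sig_count = _decode(s[4:6], 32)
    if _decode(s[6], 32) != 0:
        raise ValueError("this integer-only sketch does not handle exponents")
    magnitude = _decode(s[8:8 + sig_count], radix)
    return -magnitude if s[7] == "-" else magnitude
```

Under these assumptions, `to_serialised(42)` produces `"10;D020+42\0"`: version `1`, length `0;` (11), encoding `D`, significand digit count `02`, exponent digit count `0`, sign `+`, digits `42` and the terminator.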