Modula-2 Reloaded

A Modern Typesafe & Literate Programming Notation

Transliteration

Encoding Modula-2 in Legacy Character Sets

Modula-2 syntax is based on the ASCII character set. To accommodate computer systems that support neither ASCII nor Unicode, care has been taken to allow source text to be transliterated reversibly to and from legacy character sets by source code transliteration utilities. The recommendations described here require that at least the following printable characters are available:

 .  ,  :  ;  +  -  *  /  =  <  >  (  )  '  !  ?  0 1 2 3 4 5 6 7 8 9
 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
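
A transliteration utility could check up front whether a target character set provides this minimum repertoire. The following Python sketch is illustrative only: the function name `missing_required` and the choice to pass the target repertoire as a set of characters are assumptions, not part of the recommendation.

```python
# Minimum repertoire required by the recommendations, as listed above.
REQUIRED = set(".,:;+-*/=<>()'!?0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")

def missing_required(available):
    """Return the required characters that are absent from `available`, sorted."""
    return sorted(REQUIRED - set(available))

# Hypothetical legacy set that lacks '!' and '?'.
legacy = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.,:;+-*/=<>()'")
print(missing_required(legacy))
```

A non-empty result means the recommended transliterations cannot be used as they stand, since most of them start with a question mark.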

Recommended Transliterations

Implementors of transliteration utilities may choose whichever digraph or trigraph transliterations they see fit, provided that the transliterations are reversible. However, the following transliterations are recommended because their digraphs have been reserved:

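
The reversibility requirement can be checked mechanically. The Python sketch below is an illustration under stated assumptions: the `RECOMMENDED` dictionary is one reading of the table above and leaves out the entries for `##` and `@@`, and the helper names `invert` and `uses_only_minimum` are hypothetical.

```python
# Minimum repertoire from the section above.
REQUIRED = set(".,:;+-*/=<>()'!?0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")

# Assumed reading of the recommended table (## and @@ omitted).
RECOMMENDED = {
    "#": "/=", "&": "+>", "{": "(.", "}": ".)", "[": "(:", "]": ":)",
    "<#": "(=", "#>": "=)", "\\": "?/", "^": "?>", "|": "?!", "~": "?-",
    "_": "?.", "$": "?=", "@": "?*", "%": "?:", "'": "?,", '"': "?,,",
    "`": "?<",
}

def invert(table):
    """Build the reverse table; raise if two symbols share a transliteration."""
    reverse = {}
    for symbol, translit in table.items():
        if translit in reverse:
            raise ValueError(
                f"{symbol!r} and {reverse[translit]!r} both map to {translit!r}")
        reverse[translit] = symbol
    return reverse

def uses_only_minimum(table):
    """True if every transliteration is built from the minimum repertoire."""
    return all(set(translit) <= REQUIRED for translit in table.values())

reverse = invert(RECOMMENDED)
print(reverse["?-"], uses_only_minimum(RECOMMENDED))
```

A distinct transliteration per symbol is necessary but not sufficient: `?,` is a prefix of `?,,`, so a reverse pass would also have to try the longest match first.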

Recording the Encoding in the Source

Any Modula-2 source file that is not ASCII encoded must record its character encoding in an ENCODING pragma. Transliteration utilities must insert the pragma where it is absent and update the encoding in the pragma where it is present.

Example:

 <*ENCODING="EBCDIC500"*> (* Transliterated Modula-2 Source Text *)
 DEFINITION MODULE Foobar;
 ...
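
The insert-or-update rule might be implemented along the following lines. This is a minimal Python sketch, not a reference implementation: the function name `record_encoding`, the tolerance for whitespace inside the pragma, and the choice to place a new pragma on the first line are assumptions.

```python
import re

# Matches an ENCODING pragma such as <*ENCODING="EBCDIC500"*>.
# Allowing optional whitespace inside the pragma is an assumption.
ENCODING_PRAGMA = re.compile(r'<\*\s*ENCODING\s*=\s*"[^"]*"\s*\*>')

def record_encoding(source, encoding):
    """Update the first ENCODING pragma in `source`, or insert one at the top."""
    pragma = f'<*ENCODING="{encoding}"*>'
    if ENCODING_PRAGMA.search(source):
        # A function is used as the replacement so that `pragma` is taken literally.
        return ENCODING_PRAGMA.sub(lambda match: pragma, source, count=1)
    return pragma + "\n" + source

print(record_encoding("DEFINITION MODULE Foobar;\n", "EBCDIC500"))
```

Running the function a second time with a different encoding name rewrites the existing pragma instead of adding another one.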

Transliteration Within Quoted Literals

Where transliteration is applied within quoted string literals, it is recommended to prefix the string literal with a period. It is further recommended to separate transliterated symbols from the remaining string literal by factoring and concatenation, so that only the factored-out part carries the prefix. This makes visible in the transliterated source text exactly which part of a string literal has been transliterated and avoids accidental transliterations.
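
The factoring step can be sketched in Python as follows. The function name `transliterate_string` is hypothetical, the `TRANSLIT` table is an assumed subset of the recommended transliterations, and grouping adjacent transliterated symbols into one prefixed literal is a design choice of this sketch.

```python
# Assumed subset of the recommended transliterations.
TRANSLIT = {"#": "/=", "&": "+>", "{": "(.", "}": ".)", "[": "(:", "]": ":)",
            "\\": "?/", "^": "?>", "|": "?!", "~": "?-", "_": "?.",
            "$": "?=", "@": "?*", "%": "?:", "`": "?<"}

def transliterate_string(body):
    """Factor the body of a double-quoted literal into plain and transliterated parts."""
    parts = []  # list of (is_transliterated, text) runs
    for ch in body:
        flagged = ch in TRANSLIT
        text = TRANSLIT.get(ch, ch)
        if parts and parts[-1][0] == flagged:
            parts[-1] = (flagged, parts[-1][1] + text)  # extend the current run
        else:
            parts.append((flagged, text))               # start a new run
    if not parts:
        return '""'
    # Transliterated runs get the period prefix, plain runs stay ordinary literals.
    return " & ".join(('."%s"' if flagged else '"%s"') % text
                      for flagged, text in parts)

print(transliterate_string("foo~bar"))
```

A literal without any symbol from the table comes back as a single unprefixed literal, so untouched strings stay recognisable as such.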

Example:

 "foo~bar" <--> "foo" & ."?-" & "bar"

Transliteration Within Comments

Where transliteration is applied to block comments, it is recommended to prefix the comment with a question mark. As with string literals, it is recommended to separate and isolate transliterated symbols from the remaining comment by splitting the comment into several comments to make it visible exactly which part of the comment has been transliterated.

Example:

 (* foo\bar *) <--> ?(* foo?/bar *)

Where transliteration is applied to line comments, it is recommended to use ?; in place of the opening ! line comment prefix.

Example:

 ! foo\bar <--> ?; foo?/bar
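
Both comment forms can be handled by one small routine. The Python sketch below is an assumption-laden illustration: the name `transliterate_comment` is hypothetical, the `TRANSLIT` table is an assumed subset of the recommended transliterations, comments without any transliterated symbol are returned unmarked, and the comment is not split into several comments.

```python
# Assumed subset of the recommended transliterations.
TRANSLIT = {"#": "/=", "&": "+>", "{": "(.", "}": ".)", "[": "(:", "]": ":)",
            "\\": "?/", "^": "?>", "|": "?!", "~": "?-", "_": "?.",
            "$": "?=", "@": "?*", "%": "?:", "`": "?<"}

def transliterate_comment(comment):
    """Transliterate one block comment '(* ... *)' or one line comment '! ...'."""
    if comment.startswith("(*"):
        prefix, rest = "?", comment        # block comment keeps its delimiters
    elif comment.startswith("!"):
        prefix, rest = "?;", comment[1:]   # ?; takes the place of the opening !
    else:
        raise ValueError("expected a block or line comment")
    body = "".join(TRANSLIT.get(ch, ch) for ch in rest)
    if body == rest:
        return comment                     # nothing was transliterated
    return prefix + body

print(transliterate_comment("(* foo\\bar *)"))
print(transliterate_comment("! foo\\bar"))
```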

Transliteration of Character Codes

Where specific character code points are hardcoded in transliterated source text, it is recommended to write the character code with a ?+ prefix, in place of character code literals and CHR function invocations.

Examples:

 CONST euro = 0u20AC; <--> CONST euro = ?+0x9F; (*EBCDIC1047*)
 CONST copyright = CHR(169); <--> CONST copyright = ?+180; (*EBCDIC500*)
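
A utility needs the code of the character in the target set to produce such a value. The Python sketch below uses the standard library codec `cp500` as a stand-in for the encoding the examples call EBCDIC500. That mapping is an assumption, as are the function name `charcode_literal` and its `hexadecimal` parameter.

```python
def charcode_literal(ch, codec, hexadecimal=False):
    """Return the ?+ prefixed code of `ch` in the single-byte legacy `codec`."""
    encoded = ch.encode(codec)  # raises UnicodeEncodeError if `ch` is not in the set
    if len(encoded) != 1:
        raise ValueError("expected a single-byte character set")
    code = encoded[0]
    return "?+0x%02X" % code if hexadecimal else "?+%d" % code

# The copyright sign in cp500, in decimal and in hexadecimal notation.
print(charcode_literal("\u00A9", "cp500"))
print(charcode_literal("\u00A9", "cp500", hexadecimal=True))
```

A character that has no code in the target set raises an error instead of being replaced silently, which seems the safer behaviour for a reversible tool.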

Double Quotation Mark

In the now unlikely event that the legacy character set does not provide an encoding for ", it is recommended to eliminate double quoted strings by factoring and concatenation.

Example:

 "foo's bar" <--> 'foo' & .'?,' & 's bar'

However, if the legacy character set provides an encoding for the vertical bar, one may use || as a transliteration for the double quotation mark instead, in which case only the less likely occurrences of || within the string need to be factored out and concatenated.
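
The || variant can be sketched in Python as shown here. The function name `requote_with_bars` is hypothetical, and two behaviours are assumptions of this sketch: an empty string is written as `||||`, and pieces that would begin or end with a single vertical bar are rejected because their delimiters would become ambiguous.

```python
def requote_with_bars(body):
    """Rewrite the body of a double-quoted literal using || as the delimiter."""
    out = []
    for index, piece in enumerate(body.split("||")):
        if index > 0:
            out.append("'||'")  # a literal || is factored out into its own string
        if piece.startswith("|") or piece.endswith("|"):
            raise ValueError("a bar next to the delimiter is not handled here")
        if piece:
            out.append("||" + piece + "||")
    return " & ".join(out) if out else "||||"

print(requote_with_bars("foo's||bar's"))
```

Strings that contain no || pass through as a single literal, so apostrophes no longer force any factoring.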

Example:

 "foo's||bar's" <--> ||foo's|| & '||' & ||bar's||

Case Sensitivity

In the now unlikely event that the legacy character set does not support case sensitivity, it is recommended to use (+) as a prefix to switch to uppercase and (-) as a prefix to switch to lowercase.

Example:

 setFoo <--> (-)SET(+)F(-)OO
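
The case prefixes lend themselves to a simple two-state encoder and decoder. The Python sketch below makes several assumptions: the names `encode_case` and `decode_case` are hypothetical, the initial state is uppercase (as the leading (-) in the example suggests), only ASCII letters are folded, and text that already contains a literal (+) or (-) is not handled.

```python
import string

def encode_case(text):
    """Fold ASCII letters to uppercase, marking case switches with (+) and (-)."""
    out, upper = [], True  # assumed initial state: uppercase
    for ch in text:
        if ch in string.ascii_lowercase and upper:
            out.append("(-)")
            upper = False
        elif ch in string.ascii_uppercase and not upper:
            out.append("(+)")
            upper = True
        out.append(ch.upper() if ch in string.ascii_lowercase else ch)
    return "".join(out)

def decode_case(text):
    """Reverse encode_case by replaying the (+) and (-) switches."""
    out, upper, i = [], True, 0
    while i < len(text):
        marker = text[i:i + 3]
        if marker in ("(+)", "(-)"):
            upper = marker == "(+)"
            i += 3
            continue
        ch = text[i]
        out.append(ch.lower() if ch in string.ascii_uppercase and not upper else ch)
        i += 1
    return "".join(out)

print(encode_case("setFoo"))
print(decode_case("(-)SET(+)F(-)OO"))
```

Digits and punctuation leave the state unchanged, so an identifier such as `FOO_BAR` needs no markers at all.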