Lexical Entities

1.1 Character Sets
By default, only the printable characters of the 7-bit ASCII character set, whitespace, tabulator and newline are legal within Modula-2 source text. Unicode characters may be permitted within quoted literals and comments, subject to recognition and verification of the encoding scheme used.

1.2 Reserved Words
Reserved words are symbols that consist of a sequence of all-uppercase letters, are visible in any scope, have special meaning in the language and may not be redefined. There are 49 reserved words:

ALIAS      DEFINITION  IF              OF           RETURN
AND        DIV         IMPLEMENTATION  OPAQUE       SET
ARGLIST    DO          IMPORT          OR           THEN
ARRAY      ELSE        IN              POINTER      TO
BEGIN      ELSIF       LOOP            PROCEDURE    TYPE
BLUEPRINT  END         MOD             RECORD       UNTIL
BY         EXIT        MODULE          REFERENTIAL  VAR
CASE       FOR         NEW             RELEASE      WHILE
CONST      FROM        NONE            REPEAT       YIELD
COPY       GENLIB      NOT             RETAIN

1.3 Schrödinger's Tokens
Schrödinger's tokens are symbols that may either be used as reserved words or as identifiers, depending on context. There are 32 Schrödinger's tokens:

ABS        INSERT   STORE     TMAX      VAL
ADDRESS    LENGTH   SUBSET    TMIN      VALUE
APPEND     OCTET    SXF       TORDERED  WRITE
CAST       READ     TDYN      TREFC     WRITEF
COUNT      READNEW  TFLAGS    TSCALAR
COROUTINE  REMOVE   TLIMIT    TSORTED
EXISTS     SEEK     TLITERAL  UNSAFE

1.4 Special Symbols
Special symbols are symbols that consist of one, two or three non-alphanumeric quotable characters, are visible in any scope, have special meaning in the language and may not be redefined. They fall into six categories:
1.4.1 Operators
1.4.2 Punctuation
1.4.3 Grouping Delimiters
1.4.4 Quoted Text Delimiters
1.4.5 Comment Delimiters
1.4.6 Pragma Punctuation and Delimiters

1.5 Identifiers
Identifiers are names for syntactic entities in a program. They start with a letter, low-line or dollar sign, followed by any number and combination of letters, low-lines, dollar signs and digits.

The use of the low-line and dollar sign within identifiers is permitted in support of environments and platforms where they are an integral part of the naming convention, for instance when writing components for or mapping to operating system APIs that use them. However, such an identifier must also contain at least one letter or digit. A non-conformant identifier shall cause a compile time error.

The definition of an identifier in a foreign API style shall cause a soft compile time warning. However, the warning may be automatically silenced when

Examples:
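The rules for reserved words, Schrödinger's tokens and identifiers can be illustrated with a short Python sketch. It is not part of the specification: the function names `is_identifier` and `classify` and the result strings are hypothetical, letters are assumed to be the ASCII letters (following 1.1), and the sketch only reports that a lexeme belongs to the set of Schrödinger's tokens without resolving the context that decides how it is used. Reserved identifiers and the foreign API style warning are not modelled.

```python
import string

# Word lists copied from sections 1.2 and 1.3.
RESERVED_WORDS = frozenset("""
    ALIAS DEFINITION IF OF RETURN
    AND DIV IMPLEMENTATION OPAQUE SET
    ARGLIST DO IMPORT OR THEN
    ARRAY ELSE IN POINTER TO
    BEGIN ELSIF LOOP PROCEDURE TYPE
    BLUEPRINT END MOD RECORD UNTIL
    BY EXIT MODULE REFERENTIAL VAR
    CASE FOR NEW RELEASE WHILE
    CONST FROM NONE REPEAT YIELD
    COPY GENLIB NOT RETAIN
""".split())

SCHROEDINGER_TOKENS = frozenset("""
    ABS INSERT STORE TMAX VAL
    ADDRESS LENGTH SUBSET TMIN VALUE
    APPEND OCTET SXF TORDERED WRITE
    CAST READ TDYN TREFC WRITEF
    COUNT READNEW TFLAGS TSCALAR
    COROUTINE REMOVE TLIMIT TSORTED
    EXISTS SEEK TLITERAL UNSAFE
""".split())

LETTERS = frozenset(string.ascii_letters)  # assumption: ASCII letters only
DIGITS = frozenset(string.digits)
FOREIGN = frozenset("_$")                  # low-line and dollar sign


def is_identifier(lexeme: str) -> bool:
    """Check the identifier shape described in section 1.5."""
    if not lexeme or lexeme[0] not in LETTERS | FOREIGN:
        return False
    if any(ch not in LETTERS | DIGITS | FOREIGN for ch in lexeme):
        return False
    # An identifier must contain at least one letter or digit.
    return any(ch in LETTERS | DIGITS for ch in lexeme)


def classify(lexeme: str) -> str:
    """Return a coarse lexical class for a single lexeme."""
    if lexeme in RESERVED_WORDS:
        return "reserved word"
    if lexeme in SCHROEDINGER_TOKENS:
        return "schroedinger token"  # final role depends on context
    if is_identifier(lexeme):
        return "identifier"
    return "error"                   # would be a compile time error
```

With these definitions, `classify("WHILE")` yields the reserved word class, `classify("$x")` is accepted as an identifier, and `classify("_$")` is rejected because it contains neither a letter nor a digit.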
1.5.1 Reserved Identifiers
Reserved identifiers are language defined identifiers that may not be redefined. Reserved are:
1.5.2 User-Definable Identifiers
Identifiers that do not coincide with reserved identifiers may be defined or redefined in any scope of a program or library module.

1.6 Literals
There are three types of literals: numeric literals, string literals and structured literals.

1.6.1 Numeric Literals
Numeric literals represent a numeric compile time value. There are four types:

1.6.1.1 Decimal Number Literals
Decimal number literals represent decimal whole and real numbers. They consist of a mandatory integral part followed by an optional fractional part followed by an optional exponent. Integral and fractional part are separated by a decimal point. Fractional part and exponent are separated by the exponent prefix

Examples:
1.6.1.2 Base-2 Number Literals
Base-2 number literals represent whole numbers in base-2 notation. They are comprised of base-2 number prefix

Examples:

1.6.1.3 Base-16 Number Literals
Base-16 number literals represent whole numbers in base-16 notation. They are comprised of base-16 number prefix

Examples:

1.6.1.4 Character Code Literals
Character code literals represent Unicode code points in base-16 notation. They are comprised of Unicode prefix

Examples:
1.6.2 String Literals
String literals are sequences of quotable characters and optional escape sequences, enclosed in single quotes or double quotes. String literals may not contain any control code characters.

Examples:
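As a rough illustration of the string literal rule, the following Python sketch checks the two properties stated above: matching single or double quote delimiters, and the absence of control code characters. It rests on several assumptions that the text does not confirm: the helper names are hypothetical, control codes are taken to be the ASCII range below 0x20 plus 0x7F, the delimiter is assumed not to occur inside the literal, and escape sequences are left uninterpreted.

```python
QUOTES = ("'", '"')


def is_control_code(ch: str) -> bool:
    """Assumption: control codes are ASCII 0x00..0x1F and 0x7F."""
    return ord(ch) < 0x20 or ord(ch) == 0x7F


def is_string_literal(lexeme: str) -> bool:
    """Check delimiters and the no-control-codes rule of section 1.6.2."""
    if len(lexeme) < 2:
        return False
    delimiter = lexeme[0]
    if delimiter not in QUOTES or lexeme[-1] != delimiter:
        return False
    body = lexeme[1:-1]
    # Assumption: the enclosing delimiter may not appear inside the body.
    if delimiter in body:
        return False
    return not any(is_control_code(ch) for ch in body)
```

A literal in double quotes may therefore contain single quotes and vice versa, while a literal containing a raw tabulator or newline is rejected.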
1.6.3 Structured Literals
Structured literals are compound values consisting of zero or more terminal symbols, enclosed in braces. Structured literals may be nested.

Examples:
1.7 Non-Semantic Symbols
Non-semantic symbols are symbols that do not impact the meaning of a program. They may occur anywhere in a program before or after semantic symbols but not within them. There are three types:

1.7.1 Comments
Comments are ignored by the compiler and serve annotation and documentation purposes. There are two kinds:

1.7.1.1 Line Comments
Line comments start with a

Examples:
1.7.1.2 Block Comments
Block comments are delimited by opening

Examples:

1.7.2 Pragmas
Pragmas are in-source compiler directives that control or influence the compilation process but do not change the meaning of the program. They consist of a pragma body enclosed in opening

A pragma body consists of a non-empty token sequence whose syntax is defined by the pragma grammar. Whitespace, tabulator and line breaks may occur between tokens within a pragma, but comments are not permitted. A comment delimiter within a pragma shall cause a compile time error.

There are language defined and optional implementation defined pragmas.

Examples:
1.7.3 Lexical Separators
Lexical separators terminate a numeric literal, identifier, reserved word or a pragma symbol. There are two kinds.

1.7.3.1 Control Codes
The following control codes may appear within Modula-2 source text but not within string literals:
Any other control codes within a source file shall cause a compile time error. An unrecognised BOM shall cause a fatal compile time error. Encoding support other than ASCII and UTF-8 is implementation defined.

1.8 Reserved Symbols
Certain symbols are reserved for use by optional language facilities, language extensions and external source code processing utilities. Some are specifically reserved for future use.

1.8.1 Symbols Reserved for Optional and Future Use
1.8.2 Symbols Reserved for Coordinated Superset Use
A coordinated language superset is a compliant language superset for whose exclusive use certain symbols are reserved. The reserved symbols of coordinated language supersets are listed below:
1.8.3 Symbols Reserved for Uncoordinated Superset Use
An uncoordinated language superset is a compliant language superset for which no reserved words, identifiers or pragmas are reserved. Such a superset may define additional reserved words and predefined identifiers as long as they start with a single

Examples:
@TRY @CATCH (* possible reserved words of a language superset *)
%DESCR %IMMED (* possible reserved words of an OpenVMS specific superset *)

1.8.4 Symbols Reserved for External Source Code Processors
To assist source code processing prior to compilation, certain symbols are reserved for exclusive use by external source code processing utilities.
1.8.5 Other Symbols
Any special symbols not specifically reserved shall be considered reserved for possible future use or taboo.

1.9 Lexical Parameters

1.9.1 Length of Literals
The minimum lengths of literals a conforming implementation shall support are:
The fractional part of a real number literal may be truncated. If it is truncated, a soft compile time warning shall be emitted. If a string literal, a character code literal, a whole number literal or the significand or exponent of a real number literal is longer than an implementation is able to process, a compile time error shall occur.

1.9.2 Length of Identifiers and Pragma Symbols
The minimum lengths of identifiers and pragma symbols a conforming implementation shall support are:
If an identifier or a pragma symbol exceeds the maximum length supported by the implementation, it may be truncated to the maximum supported length. If it is truncated, a soft compile time warning shall occur.

1.9.3 Length of Comments
An implementation that generates source code of another language may choose to preserve comments by copying them into the output. In this case, the implementation may limit the length of comments copied into the output. The minimum lengths of comments to be fully preserved that such an implementation shall support are:
If a comment to be preserved exceeds the maximum length supported by the implementation, it may be truncated to the maximum supported length. If it is truncated, a soft compile time warning shall occur. If a nested block comment is truncated, an implementation shall insert all closing comment delimiters that would have been lost as a result of truncation.

1.9.4 Line and Column Counters
An implementation may limit the capacity of its internal line and column counters. The minimum values a conforming implementation shall support are:
In the event that a source file being processed exceeds the supported counter limits, an implementation may either continue or abort compilation. A soft compile time warning shall occur if the implementation continues. A fatal compile time error shall be emitted if the implementation aborts.

1.9.5 Lexical Parameter Constants
Actual lexical parameters shall be provided as constants in standard library module
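The rule in the section on Length of Comments, that closing delimiters lost by truncating a nested block comment must be re-inserted, can be sketched in Python as follows. This is an illustration under stated assumptions, not the specified behaviour of any implementation: the function names and the `max_len` parameter are hypothetical, the delimiters `(*` and `*)` are taken from the examples in 1.8.3, and the re-inserted delimiters are assumed to be appended beyond the truncation limit, separated by one space so that they cannot merge with the last preserved character.

```python
OPEN, CLOSE = "(*", "*)"  # block comment delimiters as used in 1.8.3


def nesting_depth(text: str) -> int:
    """Count block comments that are still open at the end of text."""
    depth = 0
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == OPEN:
            depth += 1
            i += 2
        elif pair == CLOSE and depth > 0:
            depth -= 1
            i += 2
        else:
            i += 1
    return depth


def truncate_comment(comment: str, max_len: int) -> tuple[str, bool]:
    """Truncate a preserved comment and restore lost closing delimiters.

    Returns the resulting text and a flag telling the caller that a soft
    compile time warning should be emitted.
    """
    if len(comment) <= max_len:
        return comment, False
    kept = comment[:max_len]
    # Assumption: closers are appended after the limit, one per open level.
    return kept + " " + CLOSE * nesting_depth(kept), True
```

For example, truncating `(* outer (* inner comment *) tail *)` to 20 characters cuts inside the inner comment, so two closing delimiters are appended and the result is again a balanced comment.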