Modula-2 Reloaded

A Modern Typesafe & Literate Programming Notation

EBNF Notation

A Notation to Describe the Syntax of Modula-2

The syntax of PIM Modula-2 was formally defined in an extended version of Backus-Naur Formalism, known as Wirth EBNF, in which brackets denote optional syntactic entities and braces denote repeating ones. For the formal definition of the revised Modula-2 language we use a slightly different version of EBNF which employs parentheses and modifier suffixes instead.
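The relationship between the two bracketing styles can be illustrated with a small conversion routine. This is only a sketch: the function name `wirth_to_suffix` is hypothetical, it rewrites just the right-hand side of a rule, and it assumes that brackets map to a parenthesised group with a `?` suffix and braces to a group with a `*` suffix (the suffixes are explained in the sections that follow). It does not touch the definition symbol or the rule terminator.

```python
def wirth_to_suffix(rhs: str) -> str:
    """Rewrite Wirth-style [ ] and { } as ( )? and ( )*.

    Characters inside quoted literals are copied unchanged, so a literal
    such as "[" is not mistaken for an option bracket.
    """
    out = []
    quote = None  # quote character of the literal currently being copied
    for ch in rhs:
        if quote:
            out.append(ch)
            if ch == quote:
                quote = None  # closing quote reached
        elif ch in "\"'":
            quote = ch  # opening quote of a literal
            out.append(ch)
        elif ch in "[{":
            out.append("(")
        elif ch == "]":
            out.append(")?")
        elif ch == "}":
            out.append(")*")
        else:
            out.append(ch)
    return "".join(out)
```

For example, `wirth_to_suffix('bar [ baz ] { "," boo }')` yields `bar ( baz )? ( "," boo )*`.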

Each EBNF rule defines exactly one symbol and is terminated by a semicolon. Names of symbols start with a letter which may be followed by letters, digits, hyphens and low lines. Names may not contain whitespace. Terminal symbols are denoted by names which start with an uppercase letter. Non-terminal symbols are denoted by names which start with a lowercase letter. Literals are enclosed in double or single quotes. By convention, reserved words of the target language are denoted in all-uppercase letters.
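The naming rules above can be checked mechanically. The following is a minimal sketch, assuming that "letter" means an ASCII letter; the names `NAME` and `classify` are hypothetical and not part of the notation.

```python
import re

# A name starts with a letter and continues with letters, digits,
# hyphens or low lines; whitespace is not allowed.
NAME = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")

def classify(name: str) -> str:
    """Return 'terminal', 'non-terminal' or 'invalid' for a symbol name."""
    if NAME.fullmatch(name) is None:
        return "invalid"
    # Terminals start with an uppercase letter, non-terminals with lowercase.
    return "terminal" if name[0].isupper() else "non-terminal"
```

With this sketch, `classify("Digit")` gives `terminal`, `classify("symbol-id")` gives `non-terminal`, and `classify("two words")` gives `invalid`.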

EBNF production rules take the following general forms:

Synonym

foo := bar ;

Symbol foo is defined as a synonym for symbol bar.

Sequence

foo := bar baz ;

Symbol foo is defined as a sequence of symbol bar followed by symbol baz.

Alternative

foo := bar | baz ;

Symbol foo is defined as an alternative, either symbol bar or symbol baz, but not both.

Option

foo := bar? ;

Symbol foo is defined by zero or one occurrence of symbol bar.

Repetition

foo := bar+ ;

Symbol foo is defined by one or more occurrences of symbol bar.

Optional Repetition

foo := bar* ;

Symbol foo is defined by zero or more occurrences of symbol bar.
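The three modifier suffixes differ only in how many occurrences they permit. As a rough illustration, they can be written down as minimum and maximum counts; the table `BOUNDS` and the function `allows` are hypothetical names, and the empty string stands for a symbol without any suffix.

```python
# (minimum, maximum) number of occurrences; None means "no upper limit".
BOUNDS = {
    "":  (1, 1),     # plain symbol: exactly one occurrence
    "?": (0, 1),     # option: zero or one
    "+": (1, None),  # repetition: one or more
    "*": (0, None),  # optional repetition: zero or more
}

def allows(modifier: str, count: int) -> bool:
    """Tell whether `count` occurrences satisfy the given modifier."""
    low, high = BOUNDS[modifier]
    return count >= low and (high is None or count <= high)
```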

Grouping

Parentheses may be used to group syntactic entities on the right-hand side of an EBNF rule. A group may be followed by a ?, + or * modifier, which then applies to the group as a whole.

foo := bar ( baz | bam ) ( "," boo )* ;

Symbol foo is defined by an occurrence of symbol bar, followed by either symbol baz or symbol bam, followed by zero or more occurrences of the sequence of literal "," and symbol boo.
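To see which symbol sequences the grouping example accepts, the rule can be transcribed into a regular expression over space-separated symbol names. This is a sketch for illustration only: it assumes bar, baz, bam and boo are single tokens, and the names `FOO` and `is_foo` are hypothetical.

```python
import re

# bar, then one of (baz | bam), then zero or more groups of "," and boo.
FOO = re.compile(r"bar (baz|bam)( , boo)*")

def is_foo(tokens: list[str]) -> bool:
    """Check a token sequence against the grouping example rule."""
    return FOO.fullmatch(" ".join(tokens)) is not None
```

The sequences `bar baz` and `bar bam , boo , boo` are accepted, while `bar`, `bar baz bam` and `bar baz ,` are rejected.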

EBNF defined in EBNF

syntax :=
    statement* ;

statement :=
    symbol-id ":=" expression ";" ;

expression :=
    term ( "|" term )* ;

term :=
    factor+ ;

factor :=
    ( symbol-id | literal | literal-range | group )
    ( "?" | "+" | "*" )* ;

group :=
    "(" expression ")" ;

symbol-id :=
    terminal-id | non-terminal-id ;

terminal-id :=
    Uppercase-Letter ( Letter | Digit | "-" | "_" )* ;

non-terminal-id :=
    Lowercase-Letter ( Letter | Digit | "-" | "_" )* ;

Reserved-Word :=
    Uppercase-Letter+ ;

literal :=
    ( '"' ( Character | "'" )+ '"' ) |
    ( "'" ( Character | '"' )+ "'" ) ;

literal-range :=
    literal ".." literal ;

Letter :=
    Uppercase-Letter | Lowercase-Letter ;

Uppercase-Letter :=
    "A" .. "Z" ;

Lowercase-Letter :=
    "a" .. "z" ;

Digit :=
    "0" .. "9" ;

Character :=
    Letter | Digit |
    " " | "!" | "#" | "$" | "%" | "&" | "(" | ")" | "*" | "+" |
    "," | "-" | "." | "/" | ":" | ";" | "<" | "=" | ">" | "?" |
    "@" | "[" | "]" | "^" | "_" | "`" | "{" | "|" | "}" | "~" ;
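
The grammar above implies a small set of lexical tokens: symbol names, quoted literals, and a handful of punctuation marks. The following is a minimal sketch of a tokenizer for rules written in this notation, assuming literals are written with straight ASCII quotes. The names `TOKEN` and `tokenize` are hypothetical, and the literal patterns are a simplification: they accept any character other than the closing quote instead of checking each one against Character.

```python
import re

# Alternatives are tried left to right at the current position.
TOKEN = re.compile("|".join([
    r"[A-Za-z][A-Za-z0-9_-]*",  # symbol-id (terminal or non-terminal)
    r'"[^"]+"',                 # double-quoted literal
    r"'[^']+'",                 # single-quoted literal
    r":=",                      # definition symbol
    r"\.\.",                    # range symbol
    r"[;|()?+*]",               # terminator, alternative, group, modifiers
]))

def tokenize(text: str) -> list[str]:
    """Split EBNF source text into a flat list of token strings."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace only separates tokens
            continue
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        tokens.append(match.group())
        pos = match.end()
    return tokens
```

For example, `tokenize('Digit := "0" .. "9" ;')` returns `['Digit', ':=', '"0"', '..', '"9"', ';']`.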