C Syntax Tree Format

This appendix documents the syntax tree data structure used by the compiler. It is needed only by implementors of custom front-ends. Most nodes are self-explanatory; if in doubt, refer to the Gump sample that implements an Oz parser (installed at examples/gump/OzParser.ozg).


<input> ::= parseError
 | [<compilation unit>]

Compilation Units

<compilation unit> ::= <phrase>
 | <directive>
 | fDeclare(<phrase> <phrase> <coord>)

<directive> ::= dirSwitch([<switch>])
 | dirPushSwitches
 | dirPopSwitches
 | dirLocalSwitches

<switch> ::= on(<switch name> <coord>)
 | off(<switch name> <coord>)

<switch name> ::= <atom>
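
As an illustration, the two-line input shown in the comment below might be handed to the compiler as the following list of compilation units. This is a sketch, not parser output: it assumes that the first argument of fDeclare holds the declaration part and the second the body, and the file name 'demo.oz' and the shared coordinate C are placeholders.

```oz
declare
C = pos('demo.oz' 1 0)   % placeholder coordinate, shared by all nodes for brevity

% Source being modelled:
%    \switch +debuginfo
%    declare X in X = 1
Input = [dirSwitch([on(debuginfo C)])
         fDeclare(fVar('X' C)
                  fEq(fVar('X' C) fInt(1 C) C)
                  C)]
```

A real front-end would give every node its own coordinate instead of reusing C.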

C.1 The Base Language


At the syntactic level, statements are not distinguished from expressions; both are subsumed by <phrase>. A top-down analysis of the tree determines which phrases must be statements and which must be expressions. The fStepPoint form is required only if you want to support source-level debugging: it wraps the contained phrase in a step point (see ``The Mozart Debugger''); the atom can be chosen freely to indicate the kind of step point (call, conditional, etc.).

<phrase> ::= fStepPoint(<phrase> <atom> <coord>)
 | fAnd(<phrase> <phrase>)
 | fEq(<phrase> <phrase> <coord>)
 | fAssign(<phrase> <phrase> <coord>)
 | fOrElse(<phrase> <phrase> <coord>)
 | fAndThen(<phrase> <phrase> <coord>)
 | fOpApply(<atom> [<phrase>] <coord>)
 | fOpApplyStatement(<atom> [<phrase>] <coord>)
 | fDotAssign(<phrase> <phrase> <coord>)
 | fObjApply(<phrase> <phrase> <coord>)
 | fAt(<phrase> <coord>)
 | <atom literal>
 | <escapable variable>
 | <wildcard>
 | fSelf(<coord>)
 | fDollar(<coord>)
 | <int literal>
 | fFloat(<float> <coord>)
 | fRecord(<label> [<record argument>])
 | fOpenRecord(<label> [<record argument>])
 | fApply(<phrase> [<phrase>] <coord>)
 | fProc(<phrase> [<phrase>] <phrase>
      [<proc flag>] <coord>)
 | fFun(<phrase> [<phrase>] <phrase>
      [<proc flag>] <coord>)
 | fFunctor(<phrase> [<functor descriptor>] <coord>)
 | fClass(<phrase> [<class descriptor>] [<meth>] <coord>)
 | fLocal(<phrase> <phrase> <coord>)
 | fBoolCase(<phrase> <phrase> <opt else> <coord>)
 | fCase(<phrase> [<case clause>]
      <opt else> <coord>)
 | fFOR([<for decl>] <phrase> <coord>)
 | fLockThen(<phrase> <phrase> <coord>)
 | fLock(<phrase> <coord>)
 | fThread(<phrase> <coord>)
 | fTry(<phrase> <catch> <finally> <coord>)
 | fRaise(<phrase> <coord>)
 | fSkip(<coord>)

<label> ::= <atom literal>
 | <variable>

<atom literal> ::= fAtom(<literal> <coord>)

<variable> ::= fVar(<atom> <coord>)

<escapable variable> ::= <variable>
 | fEscape(<variable> <coord>)

<wildcard> ::= fWildcard(<coord>)

<int literal> ::= fInt(<int> <coord>)

<record argument> ::= <phrase>
 | fColon(<feature> <phrase>)
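
To make the productions above concrete, here is a hand-written sketch of the tree for a small piece of Oz code. It is not actual parser output: the coordinate C is a placeholder reused for every node, and wrapping the application in fStepPoint with the atom call is just one possible choice of step-point kind.

```oz
declare
C = pos('demo.oz' 1 0)   % placeholder coordinate, shared by all nodes for brevity

% Source being modelled:
%    local P in P = point(x: 1 2) {Show P} end
Tree = fLocal(fVar('P' C)
              fAnd(fEq(fVar('P' C)
                       fRecord(fAtom(point C)
                               [fColon(fAtom(x C) fInt(1 C)) fInt(2 C)])
                       C)
                   fStepPoint(fApply(fVar('Show' C) [fVar('P' C)] C)
                              call C))
              C)
```

Note that fAnd, fRecord and fColon carry no coordinate of their own, as the grammar specifies.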

Procedures can carry flags (atoms following the proc or fun keyword). For the moment, the only recognized flags are instantiate (the body's code is copied upon application), lazy (the body has by-need semantics), dynamic (disable static-call optimization of this procedure), and sited (cannot be pickled). Other atoms are silently ignored.

<proc flag> ::= <atom literal>
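
A lazy function could then be represented as sketched below. The argument order assumed here (name, formal parameters, body, flags, coordinate) is inferred from the shape of the fFun production, and the unknown coordinate unit is used only to keep the term short.

```oz
declare
C = unit   % unknown coordinate; a real front-end would supply positions

% Source being modelled:
%    fun lazy {Id X} X end
Tree = fFun(fVar('Id' C)          % function name
            [fVar('X' C)]         % formal parameters
            fVar('X' C)           % body
            [fAtom(lazy C)]       % flags: atoms following the fun keyword
            C)
```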


<functor descriptor> ::= fRequire([<import decl>] <coord>)
 | fPrepare(<phrase> <phrase> <coord>)
 | fImport([<import decl>] <coord>)
 | fExport([<export decl>] <coord>)
 | fDefine(<phrase> <phrase> <coord>)
 | fDefine(<phrase> <phrase> <coord>)

<import decl> ::= fImportItem(<variable> [<aliased feature>]
            <opt import at>)

<aliased feature> ::= <feature no var>
 | <variable>#<feature no var>

<opt import at> ::= fNoImportAt
 | fImportAt(<atom literal>)

<export decl> ::= fExportItem(<export item>)

<export item> ::= <variable>
 | fColon(<feature no var> <variable>)
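
The following sketch shows how a small functor might be encoded with these descriptors. It assumes that the two phrases of fDefine are the declaration part and the body (the parts before and after in), and that the functor's name is its first argument; the functor itself and the unknown coordinate are illustrative only.

```oz
declare
C = unit   % unknown coordinate; a real front-end would supply positions

% Source being modelled:
%    functor Demo
%    import System(show: Show)
%    export x: X
%    define X = 1 in {Show X}
%    end
Tree = fFunctor(fVar('Demo' C)
                [fImport([fImportItem(fVar('System' C)
                                      [fVar('Show' C)#fAtom(show C)]
                                      fNoImportAt)]
                         C)
                 fExport([fExportItem(fColon(fAtom(x C) fVar('X' C)))] C)
                 fDefine(fEq(fVar('X' C) fInt(1 C) C)
                         fApply(fVar('Show' C) [fVar('X' C)] C)
                         C)]
                C)
```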


<class descriptor> ::= fFrom([<phrase>] <coord>)
 | fProp([<phrase>] <coord>)
 | fAttr([<attr or feat>] <coord>)
 | fFeat([<attr or feat>] <coord>)

<attr or feat> ::= <escaped feature>
 | <escaped feature>#<phrase>

<meth> ::= fMeth(<meth head> <phrase> <coord>)

<meth head> ::= <meth head 1>
 | fEq(<meth head 1> <variable> <coord>)

<meth head 1> ::= 
    <atom literal>
 | <escapable variable>
 | fRecord(<meth head label> [<meth argument>])
 | fOpenRecord(<meth head label> [<meth argument>])

<meth head label> ::= <atom literal>
 | <escapable variable>

<meth argument> ::= 
    fMethArg(<meth arg term> <default>)
 | fMethColonArg(<feature> <meth arg term> <default>)

<meth arg term> ::= <variable>
 | <wildcard>
 | fDollar(<coord>)

<default> ::= fNoDefault
 | fDefault(<phrase> <coord>)
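
As a sketch of the method productions, a method with one positional argument and one named argument with a default value might look as follows. The method and its argument names are made up for illustration, and the coordinate is left unknown.

```oz
declare
C = unit   % unknown coordinate; a real front-end would supply positions

% Source being modelled:
%    meth add(X amount: N <= 1) skip end
Tree = fMeth(fRecord(fAtom(add C)
                     [fMethArg(fVar('X' C) fNoDefault)
                      fMethColonArg(fAtom(amount C) fVar('N' C)
                                    fDefault(fInt(1 C) C))])
             fSkip(C)
             C)
```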


<feature no var> ::= <atom literal>
 | <int literal>

<feature> ::= <feature no var>
 | <variable>

<escaped feature> ::= <feature no var>
 | <escapable variable>


<case clause> ::= fCaseClause(<pattern> <phrase>)

<pattern> ::= <phrase>
 | fSideCondition(<phrase> <phrase> <phrase> <coord>)

<catch> ::= fNoCatch
 | fCatch([<case clause>] <coord>)

<finally> ::= fNoFinally
 | <phrase>

<opt else> ::= fNoElse(<coord>)
 | <phrase>
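
The two terms below sketch how pattern matching and exception handling might be encoded with these productions. They are hand-written rather than parser output, and all coordinates are left unknown.

```oz
declare
C = unit   % unknown coordinate; a real front-end would supply positions

% Source being modelled:
%    case V of some(X) then X else 0 end
CaseTree = fCase(fVar('V' C)
                 [fCaseClause(fRecord(fAtom(some C) [fVar('X' C)])
                              fVar('X' C))]
                 fInt(0 C)        % else branch; fNoElse(C) if there is none
                 C)

% Source being modelled:
%    try {P} catch E then raise E end end
TryTree = fTry(fApply(fVar('P' C) nil C)
               fCatch([fCaseClause(fVar('E' C) fRaise(fVar('E' C) C))] C)
               fNoFinally
               C)
```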

<for decl> ::= forFeature(<atom literal> <phrase>)
 | forPattern(<phrase> <for gen>)

<for gen> ::= forGeneratorList(<phrase>)
 | forGeneratorInt(<phrase> <phrase> <opt phrase>)
 | forGeneratorC(<phrase> <phrase> <opt phrase>)

<opt phrase> ::= <phrase>
 | unit
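
A simple integer loop could be encoded as sketched below. The reading of forGeneratorInt's arguments as lower bound, upper bound and optional step is an assumption based on the shape of the production, not something stated in this appendix.

```oz
declare
C = unit   % unknown coordinate; a real front-end would supply positions

% Source being modelled:
%    for I in 1..10 do {Show I} end
Tree = fFOR([forPattern(fVar('I' C)
                        forGeneratorInt(fInt(1 C) fInt(10 C) unit))]
            fApply(fVar('Show' C) [fVar('I' C)] C)
            C)
```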


Each triple consisting of an <atom> and two <int>s denotes a file name ('' if none known), a line number (starting at 1; required) and a column number (starting at 0; ~1 if none known). If two triples are given, then they denote the starting and ending coordinates of a construct. A pos may be turned into a fineStep or a coarseStep, denoting a step point for debugging. unit is an unknown coordinate.

<coord> ::= pos(<atom> <int> <int>)
 | pos(<atom> <int> <int> <atom> <int> <int>)
 | fineStep(<atom> <int> <int>)
 | fineStep(<atom> <int> <int> <atom> <int> <int>)
 | coarseStep(<atom> <int> <int>)
 | coarseStep(<atom> <int> <int> <atom> <int> <int>)
 | unit
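
The coordinate forms can be illustrated with a few literal values and a small helper that turns a pos into a fineStep. The helper ToFineStep is hypothetical and not part of the compiler; it only demonstrates that the step variants keep the same fields under a different label.

```oz
declare
% Hypothetical helper: mark a plain position as a fine-grained step point.
fun {ToFineStep Coord}
   case Coord
   of pos(F L Col) then fineStep(F L Col)
   [] pos(F1 L1 Col1 F2 L2 Col2) then fineStep(F1 L1 Col1 F2 L2 Col2)
   else Coord   % unit and existing step coordinates pass through unchanged
   end
end

Start = pos('demo.oz' 3 0)                 % line 3, column 0
Range = pos('demo.oz' 3 0 'demo.oz' 5 3)   % from line 3, column 0 to line 5, column 3
NoCol = pos('' 7 ~1)                       % no file name, unknown column
Step  = {ToFineStep Range}
```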

C.2 Finite Domain Extensions and Combinators

<phrase> += <fd expression>
 | fFail(<coord>)
 | fNot(<phrase> <coord>)
 | fCond([<clause>] <opt else> <coord>)
 | fOr([<clause opt then>] <coord>)
 | fDis([<clause opt then>] <coord>)
 | fChoice([<phrase>] <coord>)

<fd expression> ::= 
    fFdCompare(<atom> <phrase> <phrase> <coord>)
 | fFdIn(<atom> <phrase> <phrase> <coord>)

<clause> ::= fClause(<phrase> <phrase> <phrase>)

<clause opt then> ::= fClause(<phrase> <phrase> <opt then>)

<opt then> ::= fNoThen(<coord>)
 | <phrase>
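
As a sketch, a finite domain comparison and a choice statement might be encoded as follows. It is assumed here that the atom in fFdCompare names the operator as it is written in the source; the coordinates are left unknown.

```oz
declare
C = unit   % unknown coordinate; a real front-end would supply positions

% Source being modelled:
%    X <: 5
Less = fFdCompare('<:' fVar('X' C) fInt(5 C) C)

% Source being modelled:
%    choice X = 1 [] X = 2 end
Alt = fChoice([fEq(fVar('X' C) fInt(1 C) C)
               fEq(fVar('X' C) fInt(2 C) C)]
              C)
```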

C.3 Gump Extensions

<compilation unit> += 
    fSynTopLevelProductionTemplates([<prod clause>])

<phrase> += fScanner(<variable>
         [<class descriptor>] [<meth>]
         [<scanner rule>] <atom> <coord>)
 | fParser(<variable>
        [<class descriptor>] [<meth>]
        <token clause> [<parser descriptor>] <int> <coord>)

<grammar symbol> ::= <atom literal>
 | <variable>


<scanner rule> ::= fMode(<variable> [<mode descriptor>])
 | <lex clause>

<mode descriptor> ::= fInheritedModes([<variable>])
 | <lex clause>

<lex clause> ::= fLexicalAbbreviation(<grammar symbol> <regex>)
 | fLexicalRule(<regex> <phrase>)

<regex> ::= <string>


<token clause> ::= fToken([<token decl>])

<token decl> ::= <atom literal>
 | <atom literal>#<phrase>

<parser descriptor> ::= <prod clause>
 | <syntax rule>

<prod clause> ::= 
    fProductionTemplate(<prod key> [<prod param>]
                    [<syntax rule>] [<syn expression>]
                    [<prod ret>])

<prod param> ::= <variable>
 | <wildcard>

<prod key> ::= none#<string>
 | <atom>#<string>

<prod ret> ::= none
 | <variable>
 | fDollar(<coord>)

<syntax rule> ::= fSyntaxRule(<grammar symbol> [<syn formal>]
            <syn expression>)

<syn formal> ::= <variable>
 | <wildcard>
 | fDollar(<coord>)

<syn expression> ::= 
    fSynApplication(<grammar symbol> [<phrase>])
 | fSynAction(<phrase>)
 | fSynSequence([<variable>] [<syn expression>])
 | fSynAlternative([<syn expression>])
 | fSynAssignment(<escapable variable> <syn expression>)
 | fSynTemplateInstantiation(<prod key> [<syn expression>] <coord>)

Leif Kornstaedt
Version 1.4.0 (20080702)