2 The Gump Scanner Generator

This chapter describes the Gump Scanner Generator. Its input is Oz source code with embedded scanner specifications; its output implements each scanner as an Oz class.


A scanner is a program that performs lexical analysis: it transforms a stream of characters into a stream of tokens. The text is read from left to right. During this process, sequences of characters are grouped into lexemes according to user-defined rules, each of which pairs a regular expression with an associated semantic action. An action computes tokens from a lexeme. Each token consists of a token class and an optional token value, and is appended to the token stream. This process repeats until the end of the character stream is reached.
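The scanning process described above can be illustrated with a small, language-neutral sketch. It is written in Python, not in Gump's specification syntax, and it is only a model of the idea: the rule table RULES, the function scan, and the token classes "int" and "id" are hypothetical names chosen for this illustration. The sketch also assumes that the longest match wins and that earlier rules win ties, as lex-style generators commonly do; the text above does not state this.

```python
import re

# Hypothetical rules: each pairs a regular expression with a semantic action.
# An action receives the lexeme and returns a list of (token class, token value)
# pairs; returning an empty list means the lexeme produces no token.
RULES = [
    (re.compile(r"[0-9]+"), lambda lexeme: [("int", int(lexeme))]),
    (re.compile(r"[A-Za-z_][A-Za-z0-9_]*"), lambda lexeme: [("id", lexeme)]),
    # Operators: the lexeme itself is the token class, and there is no value.
    (re.compile(r"[+\-*/=()]"), lambda lexeme: [(lexeme, None)]),
    # Whitespace is recognized but produces no token.
    (re.compile(r"\s+"), lambda lexeme: []),
]


def scan(text):
    """Transform a character string into a list of tokens, left to right."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_match, best_action = None, None
        for pattern, action in RULES:
            match = pattern.match(text, pos)
            # Assumption: prefer the longest lexeme; the earlier rule wins ties.
            if match and (best_match is None or match.end() > best_match.end()):
                best_match, best_action = match, action
        if best_match is None:
            raise ValueError(f"no rule matches at position {pos}: {text[pos]!r}")
        # Run the semantic action and append its tokens to the token stream.
        tokens.extend(best_action(best_match.group()))
        pos = best_match.end()
    return tokens


if __name__ == "__main__":
    print(scan("x = 42 + y"))
```

Here the token for 42 carries a value, while the tokens for = and + consist of a token class only, which mirrors the optional token value mentioned above.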

This chapter first describes the basic principles of the Gump Scanner Generator by means of an example in Section 2.1. A more detailed reference is then given in Section 2.2.

Leif Kornstaedt
Version 1.4.0 (20080702)