27 March 2000 Release 2.22 Notes for New Users of PCCTS Version 1.33MR22
#60.
C++ mode makes multiple parsers easy
pccts/testcpp/5/test.g
Uses multiple instances of a single parse class (thus a single grammar)
pccts/testcpp/6/main.cpp
Program uses parsers for two different grammars (test.g and test2.g)
If two parsers share the same DLG automaton it may be necessary to save DLG state. See Item #61.
#61.
Use DLGLexerBase routines to save/restore DLG state when multiple parsers share a token buffer
When the second parser "takes control" the DLGLexer doesn't know about it and doesn't reset the state variables
such as #lexclass, line number, column tracking, etc.
Use DLGLexerBase::saveState(DLGState *) and restoreState(DLGState *) to save and restore DLG state.
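The save/restore pattern can be sketched without the PCCTS headers. This is a minimal illustration only: MiniLexer and MiniState are hypothetical stand-ins for a lexer derived from DLGLexerBase and for DLGState, and the three fields are an assumption based on the state variables named above (lexclass, line, column). The real DLGState may hold more.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins, NOT the PCCTS classes: MiniState plays the role
// of DLGState and MiniLexer the role of a lexer derived from DLGLexerBase.
struct MiniState {
    int lexclass;
    int line;
    int column;
};

class MiniLexer {
public:
    MiniLexer() : lexclass_(0), line_(1), column_(1) {}

    // Copy the current scanning state into *s (mirrors saveState(DLGState *)).
    void saveState(MiniState *s) const {
        s->lexclass = lexclass_;
        s->line = line_;
        s->column = column_;
    }

    // Reinstall a previously saved state (mirrors restoreState(DLGState *)).
    void restoreState(const MiniState *s) {
        lexclass_ = s->lexclass;
        line_ = s->line;
        column_ = s->column;
    }

    // Switch lexical class.
    void mode(int lexclass) { lexclass_ = lexclass; }

    // Track line and column over consumed text.
    void advance(const std::string &text) {
        for (std::string::size_type i = 0; i < text.size(); ++i) {
            if (text[i] == '\n') { ++line_; column_ = 1; }
            else { ++column_; }
        }
    }

    int lexclass() const { return lexclass_; }
    int line() const { return line_; }
    int column() const { return column_; }

private:
    int lexclass_;
    int line_;
    int column_;
};

// Parser A is interrupted by parser B, which drives the same lexer.
// Returns true if parser A sees its own state again once B is finished.
bool handOffAndReturn() {
    MiniLexer lexer;
    lexer.mode(2);
    lexer.advance("first line\nab");  // parser A: lexclass 2, line 2, column 3

    MiniState saved;
    lexer.saveState(&saved);          // before parser B takes control

    lexer.mode(0);                    // parser B disturbs every state variable
    lexer.advance("x\ny\nz");

    lexer.restoreState(&saved);       // parser A resumes
    return lexer.lexclass() == 2 && lexer.line() == 2 && lexer.column() == 3;
}
```

With the real classes the calls would be made on the shared lexer object at the point where one parser hands control to the other.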
#62.
In C++ mode ASTs and ANTLRTokens do not use stack discipline as they do in C mode
In C mode ASTs and attributes are allocated on a stack. This is an efficient way to allocate space for structs and is
not a serious limitation because in C it is customary for a structure to be of fixed size. In C++ mode it would be a
serious limitation to assume that all objects of a given type were of the same size because derived classes may have
additional fields. For instance one may have a "basic" AST with derived classes for unary operators, binary
operators, variables, and so on. As a result the C++ mode implementation of symbolic tags for elements of the rule
uses simple pointer variables. The pointers are initialized to 0 at the start of the rule and remain well defined for the
entire rule. The things they point to will normally remain well defined, even objects defined in sub-rules:
rule ! : a:rule2 {b:B} <<#0=#(#a,#[$b]);>> ; // OK only in C++ mode
This fragment is not well defined in C mode because "B" would become undefined on exit from "{...}".
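The point about object size and pointer tags can be illustrated in plain C++. This is a sketch under stated assumptions: BasicAST and BinaryOpAST are hypothetical classes that do not derive from the PCCTS AST base class, and ruleWithOptionalB only imitates the shape of code for a rule with an optional sub-rule. It is not ANTLR output.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node classes for illustration only.
struct BasicAST {
    int tokenType;
    explicit BasicAST(int t) : tokenType(t) {}
    virtual ~BasicAST() {}
};

// A derived node carries extra fields, so it is larger than the base class.
// A fixed-size stack slot sized for BasicAST could not hold it.
struct BinaryOpAST : BasicAST {
    BasicAST *left;
    BasicAST *right;
    BinaryOpAST(int t, BasicAST *l, BasicAST *r)
        : BasicAST(t), left(l), right(r) {}
    ~BinaryOpAST() { delete left; delete right; }  // deleting 0 is harmless
};

// Imitates a rule of the form  a:rule2 {b:B}  with pointer tags.
// Returns the token type seen through b at the end of the rule,
// or -1 if the optional element was not matched.
int ruleWithOptionalB(bool sawB) {
    BasicAST *a = 0;               // tags start at 0 ...
    BasicAST *b = 0;
    a = new BasicAST(1);
    if (sawB) {                    // the "{...}" sub-rule
        b = new BasicAST(2);
    }
    // ... and remain usable here, after the sub-rule has been left:
    // b is either 0 or still points at a live heap object.
    int result = (b != 0) ? b->tokenType : -1;
    BasicAST *root = new BinaryOpAST(3, a, b);
    delete root;                   // releases a and b as well
    return result;
}
```

Because the tags are plain pointers to heap objects, leaving the sub-rule does not discard what b refers to, which is the property the C mode stack cannot offer.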
#63.
Summary of Token class inheritance in file AToken.h
ANTLRAbstractToken - (empty class) virtual table
  |
  V
  +-- ANTLRRefCountToken - (reference counter) virtual table
  |     |
  |     V
  |     +-- ANTLRCommonRefCountToken - (token type, text, line) virtual table
  |                                    using variable length text fields
  |
  +-- ANTLRCommonNoRefCountToken - (token type, text, line) virtual table
  |                                using variable length text fields
  |
  +-- MyToken - (token type, text, line, ...) virtual table
Examples:
NoLeakToken.h
SimpleToken.h
notes/calcAST/numToken.h - numeric field
notes/col/myToken.h - variable length text with column info
#64.
Diagram showing relationship of major classes
                                   ANTLRTokenStream
                                   (ATokenStream.h)
                                          |
                                          V
ANTLRParser --> ANTLRTokenBuffer --> DLGLexerBase ---> DLGInputStream
(AParser.h)     (ATokenBuffer.h)     (DLexerBase.h)    |(DLexerBase.h)
    |                 |                   |            |
    |                 V                   V            +- DLGFileInput
    |           MyTokenBuffer         DLGLexer         |
    |                             (ANTLR generated)    |
    V                                                  +- DLGStringInput
MyParser (generated by ANTLR from myFile.g)
    MyParser.h    (class header)
    MyParser.cpp  (static variable initialization)
    myFile.cpp    (implementation code for rules)
#65.
Required AST constructors: AST(), AST(ANTLRTokenPtr), and AST(X x,Y y) for #[X x,Y y]
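A sketch of an AST class carrying the three kinds of constructor listed above. The names are assumptions for illustration: Token and TokenPtr stand in for the user's token class and ANTLRTokenPtr, X and Y are taken to be int and const char *, and the class does not derive from the PCCTS AST base class as a real one would.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins: Token for the token class, TokenPtr for ANTLRTokenPtr.
struct Token {
    int type;
    std::string text;
};
typedef Token *TokenPtr;

class AST {
public:
    // AST(): node with no token information.
    AST() : type_(0) {}

    // AST(ANTLRTokenPtr): node built from a matched token.
    AST(TokenPtr t) : type_(t->type), text_(t->text) {}

    // AST(X x, Y y): matches a node constructor written as #[X x, Y y],
    // here with X = int and Y = const char * (an assumption).
    AST(int type, const char *text) : type_(type), text_(text) {}

    int type() const { return type_; }
    const std::string &text() const { return text_; }

private:
    int type_;
    std::string text_;
};

// Exercises each constructor once and returns the sum of the node types.
int sumOfNodeTypes() {
    Token plus = { 5, "+" };
    AST empty;                 // type 0
    AST fromToken(&plus);      // type 5
    AST fromArgs(7, "ident");  // type 7
    return empty.type() + fromToken.type() + fromArgs.type();
}
```

Presumably each distinct argument list used in a #[...] expression needs its own matching constructor in the same way.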