27 March 2000 Release 2.22 Notes for New Users of PCCTS Version 1.33MR22
#100. ANTLR will guess where to match "@" if the user omits it from the start rule
ANTLR attempts to deduce "start" rules by looking for rules which are not referenced by any other rules. When it
finds such a rule it assumes that an end-of-file token ("@") should be there and adds one if the user did not code
one. This is the only case, according to TJP, when ANTLR adds something to the user's grammar.
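The deduction described above can be sketched in a few lines of C++. This is only an illustration of the idea, not ANTLR's actual implementation: the Grammar type and the function findStartRules are hypothetical names, and the sketch assumes that a rule referencing itself does not count as being referenced by another rule.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation (not ANTLR's internal data structure):
// each rule name maps to the rule names referenced on its right-hand side.
typedef std::map<std::string, std::vector<std::string> > Grammar;

// Return the rules that are not referenced by any other rule.
// These are the candidate "start" rules.
std::set<std::string> findStartRules(const Grammar& g) {
    std::set<std::string> referenced;
    for (Grammar::const_iterator rule = g.begin(); rule != g.end(); ++rule) {
        const std::vector<std::string>& refs = rule->second;
        for (std::size_t i = 0; i < refs.size(); ++i) {
            // Assumption: a self-reference (recursion) is ignored.
            if (refs[i] != rule->first) referenced.insert(refs[i]);
        }
    }
    std::set<std::string> starts;
    for (Grammar::const_iterator rule = g.begin(); rule != g.end(); ++rule) {
        if (referenced.count(rule->first) == 0) starts.insert(rule->first);
    }
    return starts;
}
```

For a grammar in which "program" references "stmt" and "stmt" references "expr" and itself, the only rule returned is "program", which is where an implicit "@" would be expected.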
#101. To match any token use the token wild-card expression "." (dot)
This can be useful for providing a context dependent error message rather than the all purpose message "syntax
error".
if_stmt : IF "\(" expr "\)" stmt
        | IF . <<printf("If statement requires expression "
                        "enclosed in parentheses");
                 PARSE_FAIL; // user defined
               >>
        ;
This particular case is better handled by the parser exception facility.
A simpler example:
quoted : "quote" . ; // quoted terminal
#102. The "~" (tilde) operator applied to a #token or #tokclass is satisfied when the input token does not match
anything : (~ t:Newline)* Newline ;
The "~" operator cannot be applied to rules. Use syntactic predicates to express the idea "if this rule doesn't match
try to match this other rule".
The element label "t" in the example allows one to examine the token actually matched. Contributed by Tom
Nurkkala (tom.nurkkala@powercerv.com).
#103. To list the rules of the grammar grep parserClassName.h for "_root" or edit the output from ANTLR -cr
#104. The ANTLR -gd trace option can be useful in sometimes unexpected ways
For example, by suitably defining the functions ANTLRParser::tracein and ANTLRParser::traceout one can accumulate
information on how often each rule is invoked. They could be used to provide a traceback of active rules following
an error provided that the havoc caused by syntactic predicates' use of setjmp/longjmp is properly dealt with.
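The bookkeeping behind such trace functions might look like the following sketch. It is deliberately independent of the PCCTS headers: RuleTracer is a hypothetical stand-in class, and the const char* parameter type is an assumption made for the example rather than the documented signature of the ANTLRParser members. In a real parser the bodies would go into the user's definitions of tracein and traceout.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the parser's trace hooks (not a PCCTS class).
class RuleTracer {
public:
    // Called on entry to a rule: count the call and record the rule as active.
    void tracein(const char* rule) {
        ++counts_[rule];
        active_.push_back(rule);
    }

    // Called on normal exit from a rule: drop the innermost active rule.
    void traceout(const char* /*rule*/) {
        if (!active_.empty()) active_.pop_back();
    }

    // Number of times the named rule has been entered so far.
    std::size_t count(const std::string& rule) const {
        std::map<std::string, std::size_t>::const_iterator it = counts_.find(rule);
        return it == counts_.end() ? 0 : it->second;
    }

    // Rules currently active, outermost first (a simple traceback).
    const std::vector<std::string>& active() const { return active_; }

private:
    std::map<std::string, std::size_t> counts_;
    std::vector<std::string> active_;
};
```

Note that this sketch relies on every tracein being paired with a traceout. When a syntactic predicate fails and control returns via longjmp, the matching traceout calls are skipped, so the active-rule list would have to be cut back to its earlier length by some other means before it can be trusted as a traceback.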
#105. Associativity and precedence of operations are determined by nesting of rules
In the example below "=" associates to the right and has the lowest precedence. Operators "+" and "*" associate to
the left with "*" having the highest precedence.
expr0 : expr1 {"="^ expr0} ; /* a1 */
expr1 : expr2 ("\+"^ expr2)* ; /* a2 */
expr2 : expr3 ("\*"^ expr3)* ; /* a3 */
expr3 : ID ; /* a4 */
The more deeply nested the rule the higher the precedence. Thus precedence is "*" > "+" > "=". Consider the
expression "x=y=z". Will it be parsed as "x=(y=z)" or as "(x=y)=z" ? The first part of expr0 is expr1. Because
expr1 and its descendants cannot match an "=" it follows that all derivations involving a second "=" in an expression
must arise from the "{...}" term of expr0. This implies right association.
In the following samples the ASTs are shown in the root-and-sibling format used in PCCTS documentation. The
numbers in brackets are the serial numbers of the ASTs. The output was created by code from Example #5.
a=b=c=d
( = <#2> a <#1> ( = <#4> b <#3> ( = <#6> c <#5> d <#7> ) ) ) NL <#8>
a+b*c
( + <#2> a <#1> ( * <#4> b <#3> c <#5> ) ) NL <#6>
a*b+c
( + <#4> ( * <#2> a <#1> b <#3> ) c <#5> ) NL <#6>
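The same nesting can be imitated by a small hand-written recursive-descent parser, which may make the source of the associativity easier to see. This is a sketch and not code generated by ANTLR: MiniParser and parseExpr are hypothetical names, identifiers are assumed to be single characters, the input is assumed to contain no white space, and the output uses the root-and-sibling notation without serial numbers.

```cpp
#include <cstddef>
#include <string>

// Hand-written analogue of rules expr0..expr3 (illustration only).
class MiniParser {
public:
    explicit MiniParser(const std::string& text) : in_(text), pos_(0) {}

    // expr0 : expr1 {"="^ expr0} ;   recursion on the right gives right association
    std::string expr0() {
        std::string left = expr1();
        if (peek() == '=') {
            ++pos_;
            std::string right = expr0();
            return "( = " + left + " " + right + " )";
        }
        return left;
    }

private:
    // expr1 : expr2 ("\+"^ expr2)* ;   the loop gives left association
    std::string expr1() {
        std::string left = expr2();
        while (peek() == '+') {
            ++pos_;
            std::string right = expr2();
            left = "( + " + left + " " + right + " )";
        }
        return left;
    }

    // expr2 : expr3 ("\*"^ expr3)* ;   deepest operator rule, highest precedence
    std::string expr2() {
        std::string left = expr3();
        while (peek() == '*') {
            ++pos_;
            std::string right = expr3();
            left = "( * " + left + " " + right + " )";
        }
        return left;
    }

    // expr3 : ID ;   assumption: an ID is exactly one character
    std::string expr3() {
        if (pos_ >= in_.size()) return "?";  // missing operand
        return std::string(1, in_[pos_++]);
    }

    char peek() const { return pos_ < in_.size() ? in_[pos_] : '\0'; }

    std::string in_;
    std::size_t pos_;
};

// Convenience wrapper: parse a whole expression and return its tree as text.
std::string parseExpr(const std::string& text) {
    MiniParser p(text);
    return p.expr0();
}
```

With these definitions parseExpr("a=b=c=d") yields "( = a ( = b ( = c d ) ) )" and parseExpr("a*b+c") yields "( + ( * a b ) c )", the same shapes as the samples above. The optional "{...}" term becomes a single recursive call, while each "(...)*" term becomes a loop that keeps folding the tree built so far into the left operand.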