27 March 2000 Release 2.22 Notes for New Users of PCCTS Version 1.33MR22
_begcol=_endcol+1;
If you include multiple tabs and other forms of whitespace in a single regular expression, the computation of
_endcol by DLG must be backed out by subtracting the length of the string. Then you can compute the column
position by inspecting the string character by character.
#53. Computing column numbers when using more() with strings that include tab characters and newlines
/* what is the column and line position when the comment includes
   or is followed by tabs <tab> <tab>
*/ <tab> <tab> i++;
Note: This code excerpt requires a change to PCCTS 1.33 file pccts/dlg/output.c in order to inject code
into the DLG Lexer class header. The modified source code is distributed as part of the notes in file
notes/changes/dlg/output.c and output_diff.c. An example of its use is given in Example #7.
My feeling is that the line and column information should be updated at the same time more() is called because it
will lead to more accurate position information in messages. At the same time one may want to identify the first
line on which a construct begins rather than the line on which the problem is detected: it's more useful to know that
an unterminated string started at line 123 than that it was still unterminated at the end-of-file.
void DLGLexer::tabAdjust () {                  // requires change to output.c
  char * p;                                    // to add user code to DLGLexer
  if (_lextext == _begexpr) startingLineForToken=_line;
  _endcol=_endcol-(_endexpr-_begexpr)+1;       // back out DLG computation
  for (p=_begexpr;*p != 0; p++) {
    if (*p == '\n') {                          // newline() by itself
      newline();_endcol=0;                     // doesn't reset column
    } else if (*p == '\t') {
      _endcol=((_endcol-1) & ~7) + 8;          // traditional tab stops
    };
    _endcol++;
  };
  _endcol--;                                   // DLG will compute begcol=endcol+1
}
See Example #7 for a more complete description.
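The character-by-character scan can also be tried outside of a lexer. The following is a minimal standalone sketch, not part of PCCTS: the names Position and advancePosition are hypothetical, columns are assumed to be 1-based, and tab stops are assumed to fall every 8 columns (1, 9, 17, ...). Unlike tabAdjust(), it tracks the column of the next character to be read rather than the column of the last character consumed.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type (not a PCCTS type): a 1-based line/column pair.
// `column` is the column of the NEXT character to be read.
struct Position {
    int line;
    int column;
};

// Walk over `text` one character at a time and return the updated position.
// A newline moves to column 1 of the next line; a tab jumps to the next
// tab stop; every other character advances the column by one.
Position advancePosition(Position pos, const std::string& text, int tabWidth = 8) {
    assert(tabWidth > 0);
    for (char c : text) {
        if (c == '\n') {
            pos.line++;
            pos.column = 1;
        } else if (c == '\t') {
            // Tab stops sit at columns 1, 1+tabWidth, 1+2*tabWidth, ...
            pos.column = ((pos.column - 1) / tabWidth + 1) * tabWidth + 1;
        } else {
            pos.column++;
        }
    }
    return pos;
}
```

With these assumptions, advancePosition({1, 1}, "\t") yields column 9, and a tab read at column 9 moves to column 17.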
Ambiguity Aid (options -aa, -aam, -aad)
#54. Example with nested if statement
Consider the timeless and eternal beauty of the nested if statement:
stmt        : if_stmt                    /* 1 */
            | assign_stmt                /* 2 */
            ;                            /* 3 */
if_stmt     : IF expr                    /* 4 */
              THEN stmt                  /* 5 */
              { ELSE stmt }              /* 6 */
            ;                            /* 7 */
assign_stmt : expr EQUAL expr SC ;       /* 8 */
expr        : E ;                        /* 9 */
This will be ambiguous regardless of the value of k and ck chosen. When analyzed with -k 1 and -ck 1,
ANTLR will report:
ifstmt.g(6) : warning: alts 1 and 2 of {...} ambiguous upon { ELSE }
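The source of the warning is easiest to see on an input with two nested IFs and a single ELSE: on seeing ELSE, the parser could either enter the optional {...} of the inner if_stmt or skip it and let the outer if_stmt claim the ELSE. The sketch below is a hand-written recognizer for the grammar above, not ANTLR-generated code. It assumes the common resolution of matching ELSE as soon as it is seen, so the ELSE binds to the nearest IF. The names Tok, Parser, and parseShape are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token kinds mirroring the grammar's terminals; END marks end of input.
enum Tok { IF, THEN, ELSE, E, EQUAL, SC, END };

// Illustrative recursive-descent recognizer that records the shape of the parse.
struct Parser {
    std::vector<Tok> toks;
    std::size_t pos;
    bool ok;

    explicit Parser(std::vector<Tok> t) : toks(std::move(t)), pos(0), ok(true) {}

    Tok peek() const { return pos < toks.size() ? toks[pos] : END; }
    void match(Tok t) { if (peek() == t) ++pos; else ok = false; }

    // stmt : if_stmt | assign_stmt ;
    std::string stmt() { return peek() == IF ? ifStmt() : assignStmt(); }

    // if_stmt : IF expr THEN stmt { ELSE stmt } ;
    std::string ifStmt() {
        match(IF); match(E); match(THEN);
        std::string out = "if(" + stmt();
        if (peek() == ELSE) {            // assumed resolution: take the ELSE greedily
            match(ELSE);
            out += " else " + stmt();
        }
        return out + ")";
    }

    // assign_stmt : expr EQUAL expr SC ;   with   expr : E ;
    std::string assignStmt() {
        match(E); match(EQUAL); match(E); match(SC);
        return "assign";
    }
};

// Returns a bracketed description of the parse, or "error" on a syntax error.
std::string parseShape(const std::vector<Tok>& toks) {
    Parser p(toks);
    std::string shape = p.stmt();
    return (p.ok && p.peek() == END) ? shape : "error";
}
```

For the token sequence IF E THEN IF E THEN E EQUAL E SC ELSE E EQUAL E SC, parseShape returns "if(if(assign else assign))": the ELSE is attached to the inner IF. The other reading, in which the outer IF owns the ELSE, is equally valid under the grammar, and that is the ambiguity being reported.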
We can specify the ambiguity of interest using a line number or rule name:
antlr ifstmt.g -aa if_stmt     # invoked using a rule name
antlr ifstmt.g -aa 6           # invoked using a line number
The output is: