In March, Lemon received some
performance enhancements in the form of left-hand-side direct optimizations.
As the maintainer (and sole user) of a
c++-compatible lemon fork,
I became intimately familiar with these changes (and even noted a couple bugs).
Previously, you might write code like this:
%type expr {int}
expr(A) ::= expr(B) PLUS expr(C). { A = B + C; }
expr(A) ::= LPAREN expr(B) RPAREN. { A = B; }
which roughly translated into this c code:
int temporaryA;
...
temporaryA = stack[0].int_value + stack[2].intValue;
stack[0].int_value = temporaryA;
....
temporaryA = stack[1].int_value;
stack[0].int_value = temporaryA;
(Actual code is uglier due to unions and negative stack offsets.) With left-hand-side direct, the temporary variable can be eliminated in some circumstances:
%type expr {int}
expr(A) ::= expr(A) PLUS expr(C). { A += C; }
expr(A) ::= LPAREN expr(B) RPAREN. { A = B; }
stack[0].int_value += stack[2].intValue;
....
stack[0].int_value = stack[1].int_value;
Left-hand-side direct optimization happens in 3 circumstances:
- If the left-most right-hand side term has the same name and type.
- If the left-most right-hand side term has no name
- There is a special `/X-overwrites-Y/ comment.
For lemon++, I had to disable the comment check. However, in the sqlite
grammar, most usage looks like:
transtype(A) ::= DEFERRED(X). {A = @X; /*A-overwrites-X*/}
lemon++ tracks if the major type (@X) or minor type (X) are used and can
still use left-hand-side direct optimizations in this case.
While left-hand-side direct optimizations are obviously beneficial to lemon,
they are probably more beneficial to lemon++ when used with larger and more
complex datatypes instead of integers and pointers. Previously code like this:
%type list {std::vector<int>}
list(A) ::= list(B) expr(C).
{ A = std::move(B); B.push_back(C.int_value); }
translated to code like this:
std::vector<int> tmpA;
tmpA = std::move(stack[0].vector_int_value);
tmpA.push_back(stack[1].int_value);
yy_destructor(stack[0].vector_int_value);
yy_constructor(stack[0].vector_int_value, std::move(tmpA));
Now it can be rewritten to this:
list(A) ::= list(A) expr(C).
{ A.push_back(C.int_value); }
stack[0].vector_int_value.push_back(stack[1].int_value);