Lemon 3/16

In March, Lemon received some per­for­mance enhance­ments in the form of left-hand-side direct opti­miza­tions. As the main­tainer (and sole user) of a c++-com­pat­i­ble lemon fork, I became inti­mately famil­iar with these changes (and even noted a cou­ple bugs).

Pre­vi­ously, you might write code like this:

%type expr {int}
expr(A) ::= expr(B) PLUS expr(C). { A = B + C; }
expr(A) ::= LPAREN expr(B) RPAREN. { A = B; }

which roughly trans­lated into this c code:

int temporaryA;
...
temporaryA = stack[0].int_value + stack[2].intValue;
stack[0].int_value = temporaryA;
....
temporaryA = stack[1].int_value;
stack[0].int_value = temporaryA;

(Ac­tual code is uglier due to unions and neg­a­tive stack off­sets.) With left-hand-side direct, the tem­po­rary vari­able can be elim­i­nated in some cir­cum­stances:

%type expr {int}
expr(A) ::= expr(A) PLUS expr(C). { A += C; }
expr(A) ::= LPAREN expr(B) RPAREN. { A = B; }

stack[0].int_value += stack[2].intValue;
....
stack[0].int_value = stack[1].int_value;

Left-hand-side direct opti­miza­tion hap­pens in 3 cir­cum­stances:

  1. If the left-most right-hand side term has the same name and type.
  2. If the left-most right-hand side term has no name
  3. There is a spe­cial `/X-over­writes-Y/ com­ment.

For lemon++, I had to dis­able the com­ment check. How­ever, in the sqlite gram­mar, most usage looks like:

transtype(A) ::= DEFERRED(X).  {A = @X; /*A-overwrites-X*/}

lemon++ tracks if the major type (@X) or minor type (X) are used and can still use left-hand-side direct opti­miza­tions in this case.

While left-hand-side direct opti­miza­tions are obvi­ously ben­e­fi­cial to lemon, they are prob­a­bly more ben­e­fi­cial to lemon++ when used with larger and more com­plex datatypes instead of inte­gers and point­ers. Pre­vi­ously code like this:

%type list {std::vector<int>}
list(A) ::= list(B) expr(C).
{ A = std::move(B); B.push_back(C.int_value); }

trans­lated to code like this:

std::vector<int> tmpA;
tmpA = std::move(stack[0].vector_int_value);
tmpA.push_back(stack[1].int_value);
yy_destructor(stack[0].vector_int_value);
yy_constructor(stack[0].vector_int_value, std::move(tmpA));

Now it can be rewrit­ten to this:

list(A) ::= list(A) expr(C).
{ A.push_back(C.int_value); }

stack[0].vector_int_value.push_back(stack[1].int_value);