Building an Equation to RPN Parser

As part of the continuing examination of lex and yacc, here are the rules for building a parser that translates equations into RPN format.

The process is actually very simple. Because of the way the parser works, all you have to do is print out whatever component we see at each stage. For example, when you see a number, print it out, and when you see a operand, also print it out. The basic ruleset is shown below:

%%
list:   /* nothing */
        | list EOLN
        | list expr EOLN        { printf( "n" ); }
        ;
expr:   shift_expr
        ;
shift_expr: pow_expr
        | shift_expr LEFTSHIFT pow_expr { printf("< < "); }
        | shift_expr RIGHTSHIFT pow_expr { printf(">> “); }
        ;
pow_expr: add_expr
        | pow_expr POW add_expr { printf(”^ “); }
        ;
add_expr: mul_expr
        | add_expr PLUS mul_expr  { printf(”+ “); }
        | add_expr MINUS mul_expr { printf(”- “); }
        ;
mul_expr: unary_expr
        | mul_expr MUL unary_expr { printf(”* “); }
        | mul_expr DIV unary_expr { printf(”/ “); }
        | mul_expr MOD unary_expr { printf(”% “); }
        ;
unary_expr: postfix_expr
        | MINUS primary %prec UNARYMINUS { printf(”-”); }
        | INC unary_expr { printf(”++ “); }
        | DEC unary_expr { printf(”– “); }
        ;
postfix_expr: primary
        | postfix_expr INC { printf(”++ “); }
        | postfix_expr DEC { printf(”– “); }
        | postfix_expr FACT { printf(”! “); }
        ;
 primary: NUMBER { printf(”%g “,$1); }
        | PI { printf(”%g “, M_PI); }
        | OPENBRACKET expr CLOSEBRACKET { }
        | function_call
        ;
function_call: SIN OPENBRACKET expr CLOSEBRACKET { printf(”sin “); }
        | COS OPENBRACKET expr CLOSEBRACKET { printf(”cos “); }
        | TAN OPENBRACKET expr CLOSEBRACKET { printf(”tan “); }
        | ASIN OPENBRACKET expr CLOSEBRACKET { printf(”asin “); }
        | ACOS OPENBRACKET expr CLOSEBRACKET { printf(”acos “); }
        | ATAN OPENBRACKET expr CLOSEBRACKET { printf(”atan “); }
        ;
%%

Why does it work?

It has to do with the parser evaluates the different components. When, for example, the parser identifies an addition with this rule:

add_expr: mul_expr
        | add_expr PLUS mul_expr  { printf("+ "); }

The code that the parser generates evaluates the sub-rules first, and in both cases the rules will ultimately lead to the numerical value. Each time the number is seen, the value is printed. Once both rules have been resolved, it then matches the full expression and outputs the plus sign.

In use, the parser generates all of the necessary RPN:

4+5*6
4 5 6 * +
(4+5)*6
4 5 + 6 * 

You can download the source for the equation to RPN parser: equtorpn.tar.gz (Unix)