Tag Archives: MCslp Coalface

Multiple search

I think it was a post by Robert Scoble that put me onto this, but gada.be.

The site goes away and searches a number of other search engines to give you some top responses. Quite interesting…try running these and check the results:

It’s unfortunate that there’s the disparity between the Martin C Brown and Martin MC Brown; that change was one I made last year when I started blogging more regularly because most people know me as simply MC. At least though there’s a link.

Major rewrite of C/ODBC completed

One of my first major tasks at MySQL has just been completed - a major rewrite of the Connector/ODBC (C/ODBC) documentation.

There were three major focuses for the rewrite:

  1. Bring the documentation up to date. We had a mix of information on the latest release (currently 3.51, but 5.0 is currently in development), but many of the sections didn’t reflect that new version. There is also new information on how to install the driver on Mac OS X.
  2. Restructure the information. This is something I’m doing across the board on the Connectors docs, as I try to re-organize them all into a more coherent, and compatible, structure. For example, I’ve collated all of the tips about using C/ODBC with different applications into their own section, organized by application. I’ve also extended the information; for example we now have a step by step guide to importing data from MySQL into Microsoft Word and Excel through Microsoft Query.
  3. Setting up the document so that I can more easily add and extend the information in there with tips from the community, bug fixes, and of course new releases.

I’ll now be continuing the work with the other Connectors, like Connector/J and Connector/NET.

One worker in one country

A recent article at CNN talks about how MySQL operates.

As one of the MySQL team, I can attest that works, but it requires a significant amount of coordination, and lots of online communication through email, IRC, Skype and other methods to keep everbody talking and all the projects working together.

The flip side to that process is that we all get involved in different areas, and you tend to be much more aware of what is going on company wide. There is also better cooperation - because we can all get involved we can all provide our experience and expertise to a wide range of problems and projects. Also, because we come from such a wide range of backgrounds and environments, we have a much wider perspective.

So not only does remote, and earth-wide staffing work, but it provides us with a level of cooperation that might be more difficult if we all worked in group offices in the same building.

GNU/Linux – Fad or Favoritism picked up by Linux Today

I’ve picked up bloggin at FSM again, and my first post has been picked up by LinuxToday already.

I urge you to read the article, but the basis of my thoughts are whether my current Linux distribution choices are because they are better than my previous ones (and therefore my favourites), or whether I’ve merely chosen them because they happen to popular at the moment (and are therefore just a passing fad).

Memory stick returned!

Years ago, I got into the habit of travelling around with a USB memory stick, probably long before they became the fashion accessory they are now. It became invaluable for three reasons:

  • I keep a copy of my current resume on the stick. If I meet somebody at a show or conference, I can give them my memory stic to copy off the file.
  • I use it for backups - encrypted, of course - of the files on my laptop while I’m away. That way, I have an ‘offsite’ backup of anything that has changed since the machine was last backed up on the network, simply by carrying around the key.
  • It’s become one the most used elements of my tookit for migrating files. I’ve used it to swap files between machines, copy important documents between the office and a client when I didn’t or couldn’t take a laptop, and I’ve used it on a number of occasions for migrating Windows XP users between machines.

When I went to Sorrento for a conference in March, however, I lost my USB stick. I searched everywhere I thought I might find it, and just assumed I’d lost it somewhere amongst the conference rooms or my hotel room.

In fact, as evidenced by an email last week, I’d lost it at the airport. And my faith in human nature has been slightly restored, because the gentleman that found it check the info on the stick, found my resume and through that, found me. The stick itself arrived back just this week.

I’d just like to say thanks to Mike for finding, discovering, and returning the stick :)

Peter?s MySQL Events Article

Peter Gulutzan, who I had the pleasure of meeting (and eating) with a number of times at the developer’s conference in Sorrento, has written a very good overview of the new MySQL Events features plugged into MySQL 5.1.

Events differ from triggers because you can set an event to happen at a specified time or a specific interval. Although sometimes referred to as CRON for MySQL, events are slightly more flexible, and database driven making them much more practical than running a script or other tool through the cron, at or similar tool.

Building an RPN to Equation Parser

In the final part of the examination of lex and yacc, here are the rules for building a parser that translates RPN into equation input (the reverse of the Equation to RPN parser.

Translating RPN into standard equation format is a lot more difficult. Although the fundamentals are similar to the RPN parser (we still use a stack for values that are popped off when we see an operand), it is the recording of that process is much more difficult.

In the RPN calculator, we can place the result of the calculation back onto the stack so that the value can be used. To resolve something into the equation format we need to record the equivalent expression, not the value. For that, we use a temporary string, and then check if the temporary string has a value and append further expressions to that string.

Also, to help precedence in the final calculation (a process handled automatically by the sequence of numbers an operands in RPN) we also enclose each stage of the calculation in parentheses.

The resulting rules are shown below. Note that for the example, only the basic operands (+ - * /) are supported, but the principles are valid for any combination.

%%
list:   /* nothing */
        | list EOLN
        | list expr EOLN        { printf( "%sn",exprstring); }
        ;
expr:   primary
        | expr primary MUL
          {
            if (strlen(exprstring) > 0)
              {
                sprintf(tmpstring,"(%s * %g)",exprstring, pop());
              }
            else
              {
                sprintf(tmpstring,"( %g * %g )",pop(),pop());
              }
            strcpy(exprstring,tmpstring);
          }
        | expr primary DIV
          {
            temp=pop();
            if (strlen(exprstring) > 0)
              {
                sprintf(tmpstring,"(%s / %g)",exprstring, temp);
              }
            else
              {
                sprintf(tmpstring,"( %g / %g )",pop(),temp);
              }
            strcpy(exprstring,tmpstring);
          }
        | expr primary PLUS
          {
            if (strlen(exprstring) > 0)
              {
                sprintf(tmpstring,"(%s + %g)",exprstring, pop());
              }
            else
              {
                sprintf(tmpstring,"( %g + %g )",pop(),pop());
              }
            strcpy(exprstring,tmpstring);
          }
        | expr primary MINUS
          {
            temp=pop();
            if (strlen(exprstring) > 0)
              {
                sprintf(tmpstring,"(%s - %g)",exprstring, temp);
              }
            else
              {
                sprintf(tmpstring,"( %g - %g )",pop(),temp);
              }
            strcpy(exprstring,tmpstring);
          }
        ;
primary: NUMBER { push($1); }
        ;
%%

You can see the resulting output below:

4 5 + 6 *
(( 4 + 5 ) * 6)

As mentioned in the original IBM article, we can pipe sequences together to show the parsing and calculation of an expression from different formats. For example:

$ rpntoequ|calc
4 5 + 6 *
54

And even rpntoequ and equtorpn:

$ rpntoequ|equtorpn
4 5 + 6 *
4 5 + 6 *

The current RPN translator as shown here is not as advanced as the main RPN system, and so it doesn’t support all the options, or expression formats, but you can get the general idea.

You can download the code for this example: rpntoequ.tar.gz (Unix).

Building an Equation to RPN Parser

As part of the continuing examination of lex and yacc, here are the rules for building a parser that translates equations into RPN format.

The process is actually very simple. Because of the way the parser works, all you have to do is print out whatever component we see at each stage. For example, when you see a number, print it out, and when you see a operand, also print it out. The basic ruleset is shown below:

%%
list:   /* nothing */
        | list EOLN
        | list expr EOLN        { printf( "n" ); }
        ;
expr:   shift_expr
        ;
shift_expr: pow_expr
        | shift_expr LEFTSHIFT pow_expr { printf("< < "); }
        | shift_expr RIGHTSHIFT pow_expr { printf(">> “); }
        ;
pow_expr: add_expr
        | pow_expr POW add_expr { printf(”^ “); }
        ;
add_expr: mul_expr
        | add_expr PLUS mul_expr  { printf(”+ “); }
        | add_expr MINUS mul_expr { printf(”- “); }
        ;
mul_expr: unary_expr
        | mul_expr MUL unary_expr { printf(”* “); }
        | mul_expr DIV unary_expr { printf(”/ “); }
        | mul_expr MOD unary_expr { printf(”% “); }
        ;
unary_expr: postfix_expr
        | MINUS primary %prec UNARYMINUS { printf(”-”); }
        | INC unary_expr { printf(”++ “); }
        | DEC unary_expr { printf(”– “); }
        ;
postfix_expr: primary
        | postfix_expr INC { printf(”++ “); }
        | postfix_expr DEC { printf(”– “); }
        | postfix_expr FACT { printf(”! “); }
        ;
 primary: NUMBER { printf(”%g “,$1); }
        | PI { printf(”%g “, M_PI); }
        | OPENBRACKET expr CLOSEBRACKET { }
        | function_call
        ;
function_call: SIN OPENBRACKET expr CLOSEBRACKET { printf(”sin “); }
        | COS OPENBRACKET expr CLOSEBRACKET { printf(”cos “); }
        | TAN OPENBRACKET expr CLOSEBRACKET { printf(”tan “); }
        | ASIN OPENBRACKET expr CLOSEBRACKET { printf(”asin “); }
        | ACOS OPENBRACKET expr CLOSEBRACKET { printf(”acos “); }
        | ATAN OPENBRACKET expr CLOSEBRACKET { printf(”atan “); }
        ;
%%

Why does it work?

It has to do with the parser evaluates the different components. When, for example, the parser identifies an addition with this rule:

add_expr: mul_expr
        | add_expr PLUS mul_expr  { printf("+ "); }

The code that the parser generates evaluates the sub-rules first, and in both cases the rules will ultimately lead to the numerical value. Each time the number is seen, the value is printed. Once both rules have been resolved, it then matches the full expression and outputs the plus sign.

In use, the parser generates all of the necessary RPN:

4+5*6
4 5 6 * +
(4+5)*6
4 5 + 6 * 

You can download the source for the equation to RPN parser: equtorpn.tar.gz (Unix)

O?Reilly RSS Feeds getting annoying

For some strange reason at the moment, O’Reilly feeds are creating duplicates. It’s obviously deliberate, but I can’t see the logic behind the process.

The main culprit (but not the only one) is a series on Refactoring (called Refactoring Everything). Refactoring is an important part of any programming process, but the application they are looking at specifically is Perl based.

However, of the feeds I subscribe the latest issue (7) and all previous issues, as well a a number of other articles, has appeared on:

Every single one of those links is unique…

The ONLamp link is obviously the key one, and I can, at a stretch, understand why it appears on the PHP and Python areas. The Apache one also a little relevance (it’s key to the LAMP stack), but while the application is web based, the refactoring has nothing to do with Apache.

But why does it also appear on the BSD feed? I guess you could be refactoring on a BSD platform, but why pollute a BSD focused feed with a programming story?

Building an RPN Calculator

There’s an tutorial shortly due to appear at IBM developerWorks that covers the process behind building a calculator using the lex and yacc (or flex and bison) tools to build a parser. The tutorial covers a natural expression parser, i.e. one capable of processing:

(4+5)*6

I suggest a couple of extensions in the tutorial, namely a Reverse Polish Notation (RPN) calculator, and translators that convert to/from RPN and standard equation format. Here, we’re going to start with looking at the RPN calculator.

The RPN system is more straightforward for people to learn when you think about typical equations, for example you might write:

45
63 +

In RPN, you would enter this as:

45 63 +

From a parsing point of view, the process is also easier, because you can perform the calculation by pushing the numbers on to the stack and then performing a calculation with those two numbers. This hugely simplifies the parser, but it only has to push numbers and pop them off when it sees the operand, rather than having to extract both numbers and parser from the input text. Even better, compound calculations can be made easier because the result of one calculation can be pushed back on to the stack for the next part.

For example, the following equation:

45 63 + 23 *

Can be read as:

  1. Push 45 on to stack
  2. Push 63 on to stack
  3. Pop value off stack, add it to another value popped off the stack
  4. Push result to stack
  5. Pop value off stack, multiply by value popped off stack

The lexical analysis component (i.e. the lex definitions) remain the same, it’s only the parser that changes. Before we examine the yacc rules, you need to see the simple stack system. It provides two functions, one pushes values on, and the other pops values off. All the values are stored in a simple array and a global stack pointer holds the current storage location so that values can be popped off or pushed back:

#include "globdefs.h"

int sp=0;

double val[MAXVAL];

void push(f)
double f;
{
  if (sp < MAXVAL)
    val[sp++]=f;
  else
    printf("Error: stack full, cant push%gn",f);
}

double pop()
{
  double value;

  if (sp > 0)
    return(val[–sp]);
  else
  {
    printf(”error: stack emptyn”);
    return 0.0;
  }
}

The yacc rules for a simple RPN parser are shown below (the rest of the surrounding code is identical).

%%
list:   /* nothing */
        | list EOLN
          { printf( "%gn" , pop()); }
        | list exprlist EOLN
          { printf( "%gn" , pop()); }
        ;
exprlist: shift_expr
        | exprlist shift_expr
        ;
shift_expr: add_expr
        | shift_expr LEFTSHIFT
          {
            temp=pop();
            push(((int)pop()) < < ((int)temp));
          }
        | shift_expr RIGHTSHIFT
          {
            temp=pop();
            push(((int)pop()) >> ((int)temp));
          }
        ;
add_expr: mul_expr
        | add_expr PLUS
          { push(pop()+pop()); }
        | add_expr MINUS
          {
            temp=pop();
            push(pop()-temp);
          }
        ;
mul_expr: unary_expr
        | add_expr MUL
          { push(pop()*pop()); }
        | add_expr DIV
          {
            temp=pop();
            push(pop()/temp);
          }
        | add_expr MOD
          {
            temp=pop();
            push(fmod(pop(),temp));
          }
        ;
unary_expr: primary
        | MINUS primary %prec UNARYMINUS { push(-pop()); }
        | unary_expr INC { push(pop()+1); }
        | unary_expr DEC { push(pop()-1); }
        ;
 primary: NUMBER { push($1); }
        | PI { push(M_PI); }
        ;
%%

You can see here that numbers are simply pushed onto the stack:

primary: NUMBER { push($1); }
        | PI { push(M_PI); }
        ;

While any calculation is a case of popping off the values and putting them back on the stack:

add_expr: mul_expr
        | add_expr PLUS { push(pop()+pop()); }
        | add_expr MINUS { temp=pop(); push(pop()-temp); }
        ;

The ruleset is shorter, partially because this RPN calculator is not as advanced, but also because the process is much simpler because the rules don’t need to take into account the complex structure of a typical equation line.

In a future post we’ll cover the RPN to equation and equation to RPN parsers.

You can download the complete code for the RPN calculator as rpn.tar.gz (Unix).