Tuesday, November 28, 2006

PBXT: Experience with testing Storage Engines

Many users may be unaware of the fact that there is a great tool for automatically testing MySQL called mysql-test-run. The most recent version of this program is written in perl and resides in the mysql-test directory of the MySQL source code tree.

Tests are written as scripts (.test files) which contains SQL statements and a number of meta-commands. Each .test file has a corresponding .result file which is used to verify the test output. A text-based "diff" is used to confirm that the output generated by the test is identical to the expected result.

To run a test mysql-test-run automatically sets up a clean environment and starts the MySQL server. It then executes the test scripts specified (by default all tests are executed) and verifies the output.

Most of the tests shipped with MySQL are intended to run with the MyISAM storage engine. There are also tests specifically created for other MySQL engines like NDB and InnoDB.

But what about testing 3rd party engines like PBXT? Firstly, its easy enough to start the test, just execute the following:

$ ./mysql-test-run --mysqld=--default-storage-engine=pbxt

This tells mysql-test-run to start mysqld with the option that sets PBXT to the default engine. So all CREATE TABLE statements that do not explicitly specify an engine now use PBXT. After this the real work starts:

There are over 630 tests in MySQL 5.1.12 and currently PBXT requires 223 individual changes to the test scripts for them to run. Of the 630 test results, 187 need to be changed. This is a lot of work given that verifying a difference is not a bug can be quite time consuming.

Meanwhile many differences are trivial, so ideally they would not appear at all. Lets have a look at some of the most common test differences:

SHOW CREATE TABLE:

The SHOW CREATE TABLE statement output depends on the default engine. So CREATE TABLE (...) ENGINE=MyISAM must be changed to CREATE TABLE (...) ENGINE=PBXT in over 360 instances.

The correct way to solve this problem would be to use the mysql-test-run command --replace-regex to transform the output to something like ENGINE=?. But, of course, this change would need to be checked into the MySQL source code tree to be of long term help.

MERGE, FULLTEXT or GEOMETRY:

All these features are only supported by MyISAM tables. When using these features the test should explicitly specify ENGINE=MyISAM. An even better solution would be if MySQL could automatically pick an engine that supported the features used in the CREATE TABLE. In this case the default engine will take preference.

ENGINE=InnoDB:

In order to use the InnoDB tests to test some other transactional engine, all the CREATE TABLE (...) ENGINE=InnoDB statements must be changed. Ideally MySQL would support syntax of the form: ENGINE=TRANSACTIONAL, which would automatically pick a transactional engine from those available. If the default engine is transactional then that would obviously be used.

ORDER BY Required:

By far the biggest problem when testing PBXT is that the "natural" SELECT order often differs from that of MyISAM. This means that a SELECT without an ORDER BY returns the correct rows, but in a different order to MyISAM. This is due to the way PBXT manages free space.

Making this problem even worse is the fact that the order can differ from one run to the next because PBXT manages the free space asynchronously. The only thing that helps is to add an ORDER BY clause to the SELECT, but I am wondering if the problem cannot be solved in general by mysql-test-run. For example this could be done by a more intelligent diff function.

EXPLAIN Differences:

Each engine has its own way of calculating the heuristics used by the optimizer. This causes many differences in the result files due to slightly different EXPLAIN output. On the other hand there is a need to verify that queries are being optimized correctly. I'm clueless on this one at the moment, so suggestions are welcome.

I think other engine developers will agree, its a major task so any help in testing will be greatly appreciated!

No comments: