Xaprb

Stay curious!

Archive for November, 2011

Special mysqldump fingerprinting rule in pt-query-digest

The pt-query-digest tool has a number of special cases that affect how it “fingerprints” queries when it groups similar queries together to produce an aggregated report over the group. One of these is a special rule for queries that appear to come from mysqldump, of the following form:

SELECT /*!40001 SQL_NO_CACHE */ * FROM `users`

All such queries will be fingerprinted together and presented in a single class of queries. I remember many instances where mysqldump queries crowded the report of the “most important” queries and just caused other queries to be excluded. Grouping them together made it obvious that mysqldump’s load on the server was a problem, but didn’t obliterate other interesting things we wanted to see in the report.
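To make the idea concrete, here is a minimal Python sketch of a fingerprint function with a mysqldump special case. It is not the tool's code (pt-query-digest is written in Perl). The function name, the "mysqldump" class label, and the simplified normalization steps are assumptions for illustration only.

```python
import re

# Assumed pattern for the dump query shown above (illustrative, not the tool's own code).
MYSQLDUMP_RE = re.compile(r"\ASELECT /\*!40001 SQL_NO_CACHE \*/ \* FROM `")

def fingerprint(query: str) -> str:
    """Return a simplified fingerprint used to group similar queries."""
    query = query.strip()
    # Special case: every mysqldump table scan collapses into one class,
    # regardless of which table is being dumped.
    if MYSQLDUMP_RE.match(query):
        return "mysqldump"
    # Generic, heavily simplified normalization (hypothetical rules).
    fp = query.lower()
    fp = re.sub(r"'(?:[^'\\]|\\.)*'", "?", fp)  # quoted strings -> ?
    fp = re.sub(r"\b\d+\b", "?", fp)            # standalone numbers -> ?
    fp = re.sub(r"\s+", " ", fp)                # collapse whitespace
    return fp
```

With this sketch, dumps of `users` and `orders` land in the same class, while ordinary queries are still grouped by their normalized text.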

Written by Xaprb

November 29th, 2011 at 9:56 am

Posted in Percona Toolkit,SQL

Status update on High Performance MySQL

The third edition is nearly done. I’ve committed first drafts of all chapters, and all but one appendix. I need to do the last appendix and then rewrite the preface, which is a few days of work at my current pace. After that, it’s the usual tech review, copyediting, updates to figures, etc — and then it’s off to production.

I’m really pleased with this edition. I was planning on it just being a refresh of the second edition to reflect what’s new in MySQL-land, but it’s almost a complete rewrite. There’s a lot more focus on a logical approach throughout: what happens in the server, what are the limitations, why this matters, what are the practical consequences and applications, and therefore…. The “and therefore” is the real reason you buy a book such as this one.

Written by Xaprb

November 21st, 2011 at 11:13 am

When documentation is code

One of the things I think we did right with Maatkit (and now with Percona Toolkit) is making the documentation part of the code itself. We eliminated a great deal of redundant and incorrect documentation by making each tool actually read its own documentation when it starts up. As an example, the default value of the --shorten option is defined in the documentation (it’s Perldoc) like this:

=item --shorten

type: int; default: 1024

Not only is the documentation part of the code, but the tool’s --help output is generated from it too. The existence, types, defaults, and even the behavior of the command-line options are defined in the documentation. If I run the tool with the --help option, you can see that default value:


[baron@ginger bin]$ ./pt-query-digest --help | grep  -- --shorten
  --shorten=i                    Shorten long statements in reports (default
  --shorten                      1024

If I change the tool’s documentation to say the default is 2048, you’ll see it in the output:


[baron@ginger bin]$ ./pt-query-digest --help | grep  -- --shorten
  --shorten=i                    Shorten long statements in reports (default
  --shorten                      2048
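The mechanism can be sketched in a few lines of Python. This is a hypothetical illustration, not how Percona Toolkit implements it (the tools are Perl and parse their own POD). The `parse_option_specs` function and the sample `POD` string are assumptions, modeled only on the "=item" snippet shown above.

```python
import re

# Sample documentation text, modeled on the snippet above (assumed layout).
POD = """\
=item --shorten

type: int; default: 1024

Shorten long statements in reports.
"""

def parse_option_specs(pod: str) -> dict:
    """Extract option attributes from '=item --name' entries in POD-like text."""
    specs = {}
    # Split on '=item --name' headers; the capture group keeps the option names.
    parts = re.split(r"^=item --([\w-]+)[ \t]*$", pod, flags=re.MULTILINE)
    # parts looks like [preamble, name1, body1, name2, body2, ...]
    for name, body in zip(parts[1::2], parts[2::2]):
        paragraphs = [p.strip() for p in body.strip().split("\n\n") if p.strip()]
        attrs = {}
        # Treat the first paragraph as 'key: value; key: value' if it looks like one.
        if paragraphs and re.match(r"^\w+: ", paragraphs[0]):
            for pair in paragraphs[0].split(";"):
                key, _, value = pair.partition(":")
                attrs[key.strip()] = value.strip()
        specs[name] = attrs
    return specs
```

Calling `parse_option_specs(POD)["shorten"]["default"]` yields the string "1024". Editing the documentation text to say 2048 changes the parsed default with no other code change, which is the point of keeping a single source of truth.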

We even have tests for the documentation. If the documentation is code, and code should be tested, then the documentation should be tested too. I updated the documentation for the new version of pt-table-checksum the other day without testing it, and pushed the code back to Daniel, who merged it and ran the tests — and found that I’d changed a bit of the documentation that said one option disables another option. A statement like that needs to be tested formally.

Percona Toolkit had many thousands of unit tests the last time I checked. One of them guarantees that this little bit of documentation is correct. What a great thing. I continue to look for ways to make the tools’ documentation formally verifiable as much as possible. It’s not possible to do 100% of it, but a surprising amount can be tested.

Written by Xaprb

November 7th, 2011 at 10:41 pm

Posted in SQL