Archive for August, 2007

innotop is available from openSUSE buildservice

RPM packages for innotop, a flexible and powerful MySQL and InnoDB monitor I wrote, are now available through the openSUSE buildservice, which builds RPMs for several platforms.

Thanks to Lenz Grimmer, SUSE Linux, and Dr. Peter Poeml for making this happen.

Coming soon: High Performance MySQL, Second Edition

We’ve begun writing the second edition of the now-classic High Performance MySQL. “We” means co-authors Arjen Lentz (formerly of MySQL), Baron Schwartz (that’s me), and Vadim Tkachenko and Peter Zaitsev, both formerly of MySQL’s high-performance team and now partners at Percona, a high-performance MySQL consultancy and host of the popular MySQL Performance Blog. Neither of the first edition’s authors (Jeremy Zawodny and Derek Balling) is working on this project, but they’re with us in spirit, I think. O’Reilly is still the publisher, and Andy Oram is still the editor.

Though we’re theoretically revising and updating the first edition, we’re actually starting from scratch and re-writing the book. We’re expanding it from the first edition’s 265 pages to 384, according to the contract, but my unofficial guess is it’ll go well over 400 pages. A lot has changed since Jeremy and Derek wrote the first edition — high performance MySQL is a bigger subject today, with different techniques, tools and technologies, and of course a much more complicated MySQL server. The second edition will remain the definitive reference for building high-performance, scalable systems with MySQL.

We’re early in the process, so it’s hard to know how far into the future we can safely look. Still, just to whet your appetite, here’s the table of contents:

  1. Preface
  2. Back to basics
  3. MySQL Architecture
  4. Finding Bottlenecks: Profiling and Benchmarks
  5. Schema Optimization and Indexing
  6. Query Performance Optimization
  7. Advanced SQL Functionality
  8. Optimizing Server Settings
  9. Operating System and Hardware Optimization
  10. Scaling and High Availability
  11. Application Level Optimization
  12. Backup and Recovery
  13. Security
  14. Analyzing Server Status
  15. Tools for High Performance

Stay tuned for more news as the book progresses. The four of us plan to blog as we go.

How to notify event listeners in MySQL

A high-performance application that has producers and consumers of some resource, such as a queue of messages, needs an efficient way to notify the consumers when the producer has inserted into the queue. Polling the queue for changes is not a good option. MySQL’s GET_LOCK() and RELEASE_LOCK() functions can provide both mutual exclusion and notifications.

This post was prompted by a message to the MySQL general mailing list some time ago, but I’m finally getting around to actually testing the theoretical solution I mentioned then (I can never just think my way through anything that involves locking and waiting… I have to test it).

Here’s the set-up:

create table test.messages (
   id int not null auto_increment primary key,
   message varchar(50) not null
);

The producer

The producer’s job is to insert rows into the table. In pseudo-code,

while (true ) {
   get_lock();
   // time passes...
   query("insert into messages(message) values ('hi')");
   release_lock();
}

Releasing the lock immediately after inserting will “wake up” the consumer, which must be blocked, waiting for the lock. Locking again as soon as possible will make the producer wait until the consumer is done processing, then the consumer will wait again.

The consumer

Since the consumer is waiting for the lock, that means it has tried to exclusively lock the same resource the producer has locked. Once the producer releases it, the consumer can go ahead and process the rows just inserted. In pseudo-code:

$last_row = 0;
while ( true ) {
   get_lock();
   // Order by id so $last_row ends up at the highest id processed
   $rows = query("SELECT * FROM messages WHERE id > $last_row ORDER BY id");
   for each $row ( $rows ) {
      // Process
      $last_row = $row[id];
   }
   release_lock();
}

Locking

The actual locking implementation always makes the details more complicated.

Both the producer and the consumer will have to get an exclusive lock on the queue table, or something that represents the queue table. The immediately obvious solution is LOCK TABLES. This doesn’t work well for most situations.

Why not? Since the producer and/or the consumer might need to access data in more than one table, they’ll have to lock all the tables they need. This will block other parts of the system from functioning, assuming there’s more than just a queue in the database. Other queries might then need to use LOCK TABLES too, and this just has a way of spreading out of control until the entire database is reduced to serialized, mutually exclusive access. This is terrible for any serious application.

Fortunately, MySQL has application locks, implemented with GET_LOCK() and RELEASE_LOCK(). They’re advisory, so you can ignore them if you want, but they are handy for things like this, where the producer and consumer just need to lock the same thing. They’re also relatively cheap. You’re really just locking a string, which you can pick. I’ll use the name of the table.

Here’s the code:

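// Producer:
$timeout = 1000000;
while (true) {
   // Blocks until the consumer releases the lock (or the timeout expires)
   query("SELECT GET_LOCK('messages', $timeout)");
   // time passes...
   query("insert into messages(message) values ('hi')");
   // Releasing the lock wakes up the waiting consumer
   query("SELECT RELEASE_LOCK('messages')");
}

// Consumer (a separate process, so it needs its own $timeout):
$timeout  = 1000000;
$last_row = 0;
while ( true ) {
   query("SELECT GET_LOCK('messages', $timeout)");
   // Order by id so $last_row ends up at the highest id processed
   $rows = query("SELECT * FROM messages WHERE id > $last_row ORDER BY id");
   for each $row ( $rows ) {
      // Process
      $last_row = $row[id];
   }
   query("SELECT RELEASE_LOCK('messages')");
}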

This works because the producer and consumer are really notifying each other — it’s not one-way, it’s symmetric. Inside MySQL, there’s a queue of threads waiting for locks. As soon as one releases the lock, the other gets it, and immediately goes back onto the queue waiting for it again.

Complications

There’s more to it than this. GET_LOCK() has a timeout, which can’t be infinite. If the timeout expires, the function returns, but doesn’t grant the lock. Some other errors could also cause this to happen. The producer and consumer have to be prepared to recognize when the lock isn’t granted, and retry. The return value of GET_LOCK() signifies whether the lock was really granted. Also, either the producer or consumer could die, and then there’d be no wait for the lock at all. The consumer can tell that this happened by noticing there’s no work to do. The producer can’t really tell unless it queries the database. But the producer is likely waiting for something (another lock, user input,…) where the code says “time passes.” So this shouldn’t really be a problem.
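To make the retry logic concrete, here is a minimal Python sketch of checking the return value of GET_LOCK() and trying again when the lock isn't granted. The `query` helper, the `acquire_lock` name, and the attempt limit are all assumptions for illustration, not part of the original script; `fake_query` stands in for a real connection so the sketch runs without a server.

```python
def acquire_lock(query, name, timeout_seconds=10, max_attempts=5):
    """Try GET_LOCK() until it is granted or the attempts run out.

    `query` is a hypothetical helper: it takes a SQL string and a tuple of
    parameters, runs the statement, and returns the single scalar result.
    GET_LOCK() returns 1 when the lock is granted, 0 on timeout, and NULL
    (None here) on error.
    """
    for attempt in range(1, max_attempts + 1):
        result = query("SELECT GET_LOCK(%s, %s)", (name, timeout_seconds))
        if result == 1:
            return attempt  # lock granted on this attempt
        # 0 (timeout) or None (error): fall through and retry
    return None  # never granted; the caller decides what to do next


# Stand-in for a real connection: it reports one timeout, one error,
# and then grants the lock.
fake_results = iter([0, None, 1])


def fake_query(sql, params):
    return next(fake_results)


attempts_needed = acquire_lock(fake_query, "messages")
print(attempts_needed)
```

A real producer or consumer would probably also want to log or back off between attempts; that part is application-dependent.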

Another limitation is the possibility of the consumer starting first and locking out the producer. If it doesn’t release the lock and try to re-lock periodically, the producer will never be able to get a lock. If it does, there’s still another problem. The consumer should sleep so as not to spin-wait for the presence of a producer. If the producer produces a row while the consumer is sleeping, and then doesn’t produce and release again for a very long time, the consumer will not find out about the row the producer inserted. It will have to wait for the next message the producer inserts. The solution is to make sure the consumer keeps the lock while it sleeps.

All of these issues are solvable with special-case startup code, and I’m sure you can work out something that meets your needs. I don’t want to make this article more complicated, because the details will all be application-dependent.

Sample application

Here is a Perl script that implements a producer and consumer on a MySQL table called test.messages. To run it, give a --mode argument of ‘p’ or ‘c’. Be sure you create the table (see above) first:

Start two instances, one in producer mode, one in consumer mode, and watch the consumer print out messages as you enter them into the producer. Fun!

More options

If you do need to poll, there are still some steps you can take to make it more efficient. I wrote about efficient polling with exponential or Fibonacci wait a while ago. This technique has worked well for me in many applications.
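As a rough illustration of polling with a Fibonacci wait, here is a short Python sketch. The function names, the `max_wait` cap, and the injectable `sleep` parameter are assumptions made for this example; it is one way to express the idea, not the code from the earlier article.

```python
import time


def fibonacci_waits(max_wait):
    """Yield 1, 1, 2, 3, 5, ... seconds, capped at max_wait, forever."""
    a, b = 1, 1
    while True:
        yield min(a, max_wait)
        a, b = b, a + b


def poll(check, max_wait=60, sleep=time.sleep):
    """Call check() until it returns something truthy.

    After each empty check, sleep for the next Fibonacci interval, so an
    idle queue is polled less and less often, up to max_wait seconds.
    """
    waits = fibonacci_waits(max_wait)
    while True:
        result = check()
        if result:
            return result
        sleep(next(waits))


# Demonstration with a fake check and a fake sleep that only records
# how long it was asked to wait.
slept = []
answers = iter([None, None, None, None, "row 5"])
found = poll(lambda: next(answers), max_wait=60, sleep=slept.append)
print(found, slept)
```

In a long-running consumer you would typically start the sequence over after finding work, so a busy queue is polled quickly and a quiet one slowly.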

You can also poll on something small and efficient, instead of polling a potentially big messages table. Make another table in which the producer inserts a single row, or flips a single row from zero to one, and the consumer resets it. Polling on a small resource is much more efficient than a big resource. You can use this technique together with transactions to coordinate the work of many producers and consumers, even when you don’t have explicit methods of locking (for example, if your database server doesn’t support it).
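Here is a minimal sketch of the single-row flag idea. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in so the example runs anywhere; the `messages_flag` table and the `has_new` column are hypothetical names, and the same statements would need adapting for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " message TEXT NOT NULL)"
)
# The small resource to poll: one row, flipped between 0 and 1.
conn.execute("CREATE TABLE messages_flag (has_new INTEGER NOT NULL)")
conn.execute("INSERT INTO messages_flag (has_new) VALUES (0)")
conn.commit()


def produce(conn, text):
    """Insert a message and raise the flag in one transaction."""
    with conn:
        conn.execute("INSERT INTO messages (message) VALUES (?)", (text,))
        conn.execute("UPDATE messages_flag SET has_new = 1")


def consume(conn, last_id):
    """Check the flag; only read the big table if the flag was set."""
    with conn:
        cur = conn.execute(
            "UPDATE messages_flag SET has_new = 0 WHERE has_new = 1"
        )
        if cur.rowcount == 0:
            return last_id, []  # nothing new, messages table untouched
        rows = conn.execute(
            "SELECT id, message FROM messages WHERE id > ? ORDER BY id",
            (last_id,),
        ).fetchall()
    if rows:
        last_id = rows[-1][0]
    return last_id, [message for _, message in rows]


first = consume(conn, 0)
produce(conn, "hi")
produce(conn, "there")
second = consume(conn, first[0])
third = consume(conn, second[0])
print(first, second, third)
```

Resetting the flag and reading the new rows in the same transaction is what keeps a message from slipping in between the two steps unnoticed.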

Finally, if you need a fixed-size FIFO queue or “round-robin table,” try the suggestions in my article on how to create a queue in SQL.

MySQL Camp 2007

I caught most of the second day of MySQL Camp 2007. It was fun and educational as before. The format was a little different from the last Camp; everything was in one room. Google and Proven Scaling provided food.

Sessions were loosely organized, to say the least, but that’s what an un-conference is all about. When I got there, Ronald Bradford was presenting on MySQL Proxy. Bob Stein, creator of Visibone charts and cheat-sheets, followed with a session seeking feedback to improve the charts. By the way, the way he produces those charts is totally off the wall. Jay Pipes gave an extended tutorial on ways to make MySQL perform really badly. After lunch, I gave sort of a stand-up talk on MySQL Toolkit, which I typed up while listening to the other talks. I tried to give an overview of what the tools in the toolkit are, how they work, and what to use them for. A couple other people showed some of their own tools after that too.

Then I went to supper in Manhattan with a bunch of old and new friends from the MySQL community, played some card games, and that’s it. Worth every minute!

I’m going to try to help Jay organize the next camp in central Virginia at the University of Virginia in Charlottesville. I think the tentative plan is early next May or something like that. I’m sure that is all subject to change.

MySQL Toolkit version 815 released

Download MySQL Toolkit

I’ve just released changes to all tools in MySQL Toolkit. The biggest changes are in MySQL Table Sync, which I’m beginning to give sane defaults and options to. Some of the changes are incompatible (but that’s what you get with MySQL Table Sync, which is still very rough). I also found and fixed some bugs with MySQL Visual Explain. Thanks to everyone who submitted bug reports.

Note, the formatting overflow in MySQL Query Profiler was not a security vulnerability. It was simply an issue with a Perl formatting code that displayed numbers as hash marks when they got big enough.

Here’s the whole changelog:

Changelog for mysql-archiver:

2007-08-23: version 1.0.1

   * MySQL socket connection option didn't work.
   * Added --askpass option.

Changelog for mysql-deadlock-logger:

2007-08-23: version 1.0.3

   * MySQL socket connection option didn't work.
   * Added --askpass option.
   * Truncated output could crash on an undefined regex result.
   * Made --source and --dest accept bareword hostnames.
   * Made DBI errors only print once.

Changelog for mysql-duplicate-key-checker:

2007-08-23: version 1.0.5

   * MySQL socket connection option didn't work.
   * Added --askpass option.

Changelog for mysql-find:

2007-08-23: version 0.9.4

   * MySQL socket connection option didn't work.
   * Added --askpass option.

Changelog for mysql-query-profiler:

2007-08-23: version 1.1.3

   * MySQL socket connection option didn't work.
   * Large queries overflowed the formatting room available.

Changelog for mysql-show-grants:

2007-08-23: version 1.0.3

   * MySQL socket connection option didn't work.
   * Added --askpass option.

Changelog for mysql-slave-delay:

2007-08-23: version 1.0.0

   * MySQL socket connection option didn't work.
   * Added a check that the server is a slave.

Changelog for mysql-slave-restart:

2007-08-23: version 1.0.0

   * MySQL socket connection option didn't work.
   * Added --askpass option.

Changelog for mysql-table-checksum:

2007-08-23: version 1.1.13

   * MySQL socket connection option didn't work.
   * Added --askpass option.

Changelog for mysql-table-sync:

2007-08-23: version 0.9.6

   * Added --askpass option.
   * Changed --replicate option to --synctomaster.
   * Fixed the MySQL socket option.
   * Made --synctomaster able to connect to the master from SHOW SLAVE STATUS.
   * MySQL socket connection option didn't work.
   * Suppress duplicated error messages from MySQL.
   * Changed DSN from URL-ish format to key=value format.
   * Generated WHERE clauses weren't properly isolated in parentheses.
   * Changed exit status to 0 when --help is given.
   * Made --replicate imply --wait 60.

Changelog for mysql-visual-explain:

2007-08-23: version 1.0.1

   * MySQL socket connection option didn't work.
   * Added --askpass option.
   * UNIONs inside a SUBQUERY weren't correctly nested.
   * Some types of impossible queries weren't handled.

Google Test Automation Conference, Day 1

I’m attending the Google Test Automation Conference (GTAC 2007) in Manhattan, New York right now. It’s a two-day event hosted by Google, with mostly non-Google speakers.

The conference is by invitation only and very limited; we all had to either have something Google’s team of judges thought was good enough to present, or our essays had to impress them (I’m not bragging about getting in, I’m telling you why I thought the conference would be great). Unfortunately, I have to say I’m a little underwhelmed. Several of the talks have been very good, especially Allen Hutchison’s GTAC keynote on first principles and Simon Stewart’s informative and very fun talk on Web Driver, but some of the others didn’t zing me very much.

You can view the GTAC YouTube Playlist to see the talks yourself. The first day is up already — amazing! The Google multimedia folks really have it down to a science.

Perhaps I’m a little un-impressed because I think the discipline of test automation, or at least some of those speaking, is too heavily influenced by Extreme/Agile methodologies, which often take a very narrow view of testing and have invented some seriously damaging techniques like Mock Objects. Perhaps because there’s a lot of Java in the room, and Java’s insistence on Everything Shalt Be An Object has twisted natural concepts into very difficult implementations, which other programming languages blindly follow even when they have first-class support for the thing (such as a test) Java represents so awkwardly. And perhaps because there’s so much focus on auto-generated tests, which I think are about as useful as auto-generated documentation. They’re often un-tests, just as documentation generated by inspecting and mentioning class names, method names, and parameter types is un-documentation. Not that auto-generated tests don’t have a place in the world — they do — but it’s limited.

The most interesting talk to me was Adam Porter and Atif Memon’s Skoll project (here’s the Skoll homepage), which is developing a distributed means of building and running test suites in different configurations, very smartly. There’s real computer science going into this. And guess what one of their big test projects is? (Perhaps the only big test project, I’m not sure). Building MySQL source code. Yep, they’re finding real bugs by smartly building different configurations and finding test failures, then iterating to find related configurations that fail. Watch the video for the details of how intelligently they’re doing this.

I decided to skip the last talk and the evening’s socializing, and instead headed over to the MySQL Camp, which is happening just a few miles away in Brooklyn. I spent the evening mooching Japanese food and catching up with friends I met at MySQL Camp 2006. I went to bed late, but it was worth it.

Today I’m also going to skip the Google Test Automation Conference and focus on MySQL Camp. I tried to find out more about today’s GTAC talks, but it’s tough. Google has kind of made it a black box — I didn’t even find out in advance who was going to speak, or get any chance to offer a talk myself. A few days ago they sent an email with the schedule, which listed speaker names and talk titles, but no other information. That was the first I knew of the schedule. There are certain things that are great about how they’re running this, such as having just one track (no tough choices of which talk to attend), but not knowing who was speaking on what made it hard to judge whether and why I wanted to attend. Abstracts would have helped a lot. As far as I can see, today’s talks are going to include more mildly promotional material. I’d be interested in the Lightning Talks and maybe a couple of the others, but time is precious, and given that I know MySQL camp is going to be good, I’m not willing to take the risk that today’s GTAC talks will be uninspiring.

How to select the first or last row per group in SQL

There is no “first” or “last” aggregate function in SQL. Sometimes you can use MIN() or MAX(), but often that won’t work either. There are a couple of ways to solve this vexing non-relational problem. Read on to find out how.

First, let’s be clear: I am posing a very non-relational problem. This is not about the minimum, maximum, top, most, least or any other relationally valid extreme in the group. It’s the first or last, in whatever order the rows happen to come. And we all know rows aren’t ordered — in theory. But in practice they are, and sometimes you need the first or last row in a group.

If you have a question this article doesn’t answer, you might like to read how to select the first/least/max row per group in SQL and how to find the maximum row per group in SQL without subqueries.

A MySQL user-variable solution

I’ll show a MySQL-specific solution with one of the queries I developed for MySQL Table Checksum.

Here’s the idea: crush an entire table down to a single checksum value by checksumming each row, mushing it together with the previous row’s checksum, and then checksumming the result again. It’s fairly easy to do this, but it’s hard to get the final result in one statement. A single statement is necessary because I needed to use it in an INSERT ... SELECT.

An example might clarify:

select * from fruit;
+---------+
| variety |
+---------+
| apple   | 
| orange  | 
| lemon   | 
| pear    | 
+---------+

set @crc := '';

select variety, @crc := md5(concat(@crc, md5(variety))) from fruit;
+---------+-----------------------------------------+
| variety | @crc := md5(concat(@crc, md5(variety))) |
+---------+-----------------------------------------+
| apple   | ae6d32585ecc4d33cb8cd68a047d8434        | 
| orange  | 7ec613c796f44ef5ccb0e24e94323e38        | 
| lemon   | a2475f37be12cebf733ebfc7ee2ee473        | 
| pear    | ec98fe57833bbd91790ebc7ccf84c7e9        | 
+---------+-----------------------------------------+

I want the “last” value of @crc after the statement is done processing. How can I do this? The solution I found is to use a counter variable. I’ll demonstrate:

set @crc := '', @cnt := 0;

select variety,
   @cnt := @cnt + 1 as cnt,
   @crc := md5(concat(@crc, md5(variety))) as crc
from fruit;
+---------+------+----------------------------------+
| variety | cnt  | crc                              |
+---------+------+----------------------------------+
| apple   |    1 | ae6d32585ecc4d33cb8cd68a047d8434 | 
| orange  |    2 | 7ec613c796f44ef5ccb0e24e94323e38 | 
| lemon   |    3 | a2475f37be12cebf733ebfc7ee2ee473 | 
| pear    |    4 | ec98fe57833bbd91790ebc7ccf84c7e9 | 
+---------+------+----------------------------------+

The counter variable might make you want to write something like HAVING cnt = MAX(cnt), but that won’t work (try it!). Instead, I prefixed the checksum with the count so the last row is the stringwise maximum:

set @crc := '', @cnt := 0;

select variety,
   @crc := concat(lpad(@cnt := @cnt + 1, 10, '0'),
      md5(concat(right(@crc, 32), md5(variety)))) as crc
from fruit;
+---------+--------------------------------------------+
| variety | crc                                        |
+---------+--------------------------------------------+
| apple   | 0000000001ae6d32585ecc4d33cb8cd68a047d8434 | 
| orange  | 00000000027ec613c796f44ef5ccb0e24e94323e38 | 
| lemon   | 0000000003a2475f37be12cebf733ebfc7ee2ee473 | 
| pear    | 0000000004ec98fe57833bbd91790ebc7ccf84c7e9 | 
+---------+--------------------------------------------+

You can see I also left-padded the count so a lexical sort will agree with a numeric sort, and so I can predict how many extra characters I’ll need to remove to get back the original value. Now I can use the MAX() function to select the last row, and simply lop off the leftmost ten digits (I use the RIGHT() function for convenience, but generally you want to use SUBSTRING()):

set @crc := '', @cnt := 0;

select right(max(
   @crc := concat(lpad(@cnt := @cnt + 1, 10, '0'),
      md5(concat(right(@crc, 32), md5(variety))))
   ), 32) as crc
from fruit;
+----------------------------------+
| crc                              |
+----------------------------------+
| ec98fe57833bbd91790ebc7ccf84c7e9 | 
+----------------------------------+

Et voilà, I got the last value in the group. By the way, this will work with ONLY_FULL_GROUP_BY in the server’s SQL mode.
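If it helps to see the trick outside of SQL, here is a small Python sketch of the same idea: chain the checksums, prefix each with a zero-padded counter, and take the string maximum to recover the last value. The list of varieties mirrors the example table, and the assumption is that rows arrive in that order; the variable names are mine.

```python
import hashlib


def md5_hex(text):
    """Lowercase hex MD5 of a string, like MySQL's MD5()."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


varieties = ["apple", "orange", "lemon", "pear"]

crc = ""
cnt = 0
prefixed = []
for variety in varieties:
    cnt += 1
    # crc[-32:] plays the role of RIGHT(@crc, 32): it strips the
    # ten-digit counter from the previous value before chaining.
    crc = str(cnt).zfill(10) + md5_hex(crc[-32:] + md5_hex(variety))
    prefixed.append(crc)

# MAX() over the prefixed strings picks the row with the highest counter,
# and the rightmost 32 characters are the final chained checksum.
last_by_max = max(prefixed)[-32:]
last_by_position = prefixed[-1][-32:]
print(last_by_max == last_by_position)

# Why the padding matters: without it, a string sort puts "9" after "10".
print(max(["9", "10"]), max(["0000000009", "0000000010"]))
```

The last line shows the reason for LPAD(): an unpadded counter sorts correctly only up to nine rows.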

Other methods

My solution relies on a MySQL user variable to do the counting, but there are many ways to number rows in SQL: you could simulate the ROW_NUMBER() function, for instance, or use techniques mentioned in the comments on how to number rows in MySQL (one of the comments shows a particularly clever solution with subqueries, but I didn’t want to use it because MySQL doesn’t support subqueries in older versions). Any of these should work one way or another. Of course, if you are using a product such as Microsoft SQL Server 2005, which actually has the ROW_NUMBER() function, you can use that!

Conclusion

Finding the first or last row is a bit unintuitive, and it’s definitely non-relational, but sometimes it’s what you need. The technique I demonstrated in this article is easily adaptable to many kinds of queries. I hope it helped you!

If this article didn’t solve your problem, please read these before posting questions to the comments: how to select the first/least/max row per group in SQL and how to find the maximum row per group in SQL without subqueries.

What would make me buy MySQL Enterprise?

MySQL AB’s recent changes to the Community/Enterprise split have made people go as far as calling the split a failure. I don’t think it’s working well either, but it could be fixed. Here’s what I think would make Enterprise a compelling offer.

I’d recommend Enterprise if I could

If the MySQL Enterprise Server were a good thing, I’d recommend it to my consulting clients. I’d suggest we start using it at my employer, too. I believe in supporting people and companies whose work benefits me. Here’s the thing, though: I think it would be detrimental, even dangerous.

Why on earth would I think that?

Because nobody’s testing the Enterprise source code before it’s released.

It’s getting bug fixes that haven’t been stress-tested in the real world. Some of them are even being rolled back, many months later, because they were broken.

Reasons I’d buy MySQL Enterprise

The reasons I’d buy a MySQL Enterprise subscription would be as follows, in order of importance:

  1. A stable, tested version of the server with well-known, documented limitations and bugs.
  2. Technical support.
  3. The knowledge base, etc, etc.

But… that’s what Enterprise is, right?

The official list of benefits in an Enterprise subscription looks like it matches my list, doesn’t it?

MySQL Enterprise subscriptions include the following benefits:

  1. MySQL Enterprise Server: The MySQL Enterprise Server is the most reliable, secure and up-to-date version of MySQL in source and binary format.
  2. Extensive Reliability Testing…

… etc …

The thing is, those first two bullets are blatantly untrue. Want proof? Look at the change list for MySQL 5.0.48, which will be the next Monthly Rapid Update. Here are just a few of the changes near the top of the list, with my comments:

  1. Coercion of ASCII values to character sets that are a superset of ASCII sometimes was not done, resulting in illegal mix of collations errors. These cases now are resolved using repertoire, a new string expression attribute…
    • My comment: A new, complex string expression attribute, designed to fix an edge case, is going straight into the “reliable” Enterprise branch? No way I want that untested change on my production servers.
  2. FEDERATED tables had an artificially low maximum of key length.
    • A fix to FEDERATED? FEDERATED is riddled with basic bugs and should not even be distributed with Enterprise, and even so, who cares if I can’t make as long an index as I should be able to? I can work around it while the community tests it.
  3. In some cases, INSERT INTO … SELECT … GROUP BY could insert rows even if the SELECT by itself produced an empty result.
    • Another edge case, probably easy to avoid, that probably affects core parts of the server.
  4. In a stored function or trigger, when InnoDB detected deadlock, it attempted rollback and displayed an incorrect error message (Explicit or implicit commit is not allowed in stored function or trigger). Now InnoDB returns an error under these conditions and does not attempt rollback.
    • Changes to InnoDB’s deadlock and rollback behavior should not be included in a hot-fix, especially since it only affects stored functions and triggers, which are also not ready for Enterprise.

These bug fixes address minor problems, but seem to have the potential to cause major damage if there’s a problem with the fix itself. None of these should be included in a hot-fix release. In fact, after looking through the whole list, I don’t see anything I would want to go to my production servers before six months of community testing. There’s simply too much at stake. The upside of including these changes is so small, and the potential downside so large, that it doesn’t make sense to include them.

What would I not want in Enterprise?

Here are some things that would not attract me to Enterprise:

  • Patches and hot fixes.
  • New features.

Take a look at bullet points number three and four in the list of Enterprise benefits:

  1. Updates and Upgrades with New Features: You receive the newest versions of MySQL Enterprise Server released during your active subscription term.
  2. Predictable Releases with Bug Fixes and Updates: Predictable and scheduled service packs ensure that a new, fully up-to-date version of the MySQL Enterprise Server is always available with the latest updates and bug fixes. Customers of MySQL Enterprise receive Monthly Rapid Updates & Quarterly Service Packs.

These are exactly the things I don’t want in my Enterprise source code. These two “benefits” directly conflict with the first two benefits. They cannot coexist, period.

MySQL’s marketing information says new experimental features are unstable, but hot bug fixes are stable and reliable. In reality, there’s no difference between new features and new bug fixes; they are both unstable and untested and don’t belong in a conservative, reliable product.

Until this changes, the Enterprise source code will continue to be less trustworthy than the Community source code, in my opinion. Even if the Community source doesn’t get the bug fixes, at least you know what you’re dealing with.

How would I change the current release policy?

I think this is easiest to explain with diagrams. Here’s the current release policy, as I understand it (I know this is over-simplified, but I’m trying to simplify this enough to show how I’d change it):

[Diagram: MySQL Community and Enterprise Source Policy (current)]

As I understand it now, the Community source gets (or is intended to get — it’s not really working, but that’s off-topic) frequent contributions from the community, and occasional bug fixes that are applied to the Enterprise source. The Community source is built and released infrequently.

On the other hand, the Enterprise source gets frequent hot fixes and releases, and infrequently gets features merged from the Community source after they’re deemed stable.

I’m not sure who designed this scheme, but I think a lot of people tried to say it was a bad idea and they went ahead anyway. Perhaps the symmetry in the diagram appealed to someone.

Here’s how that would have to change before I’d buy Enterprise:

[Diagram: MySQL Community and Enterprise Source Policy (proposed)]

In this model, the Community gets all source code changes first, and after they are stable, they’re merged into the Enterprise source. The Community code is built and released frequently, and Enterprise is extremely conservative.

This I’d pay for. This is a compelling offer that gives Enterprise customers substantial return for their money.

In this model, I’d be paying MySQL to do the painstaking work of looking at all the changes that happened in the Community source tree during the last release cycle, carefully selecting the good stuff, merging that into the Enterprise source tree, and testing the result. This is a proven model for creating high-quality software from a rapidly changing codebase. I don’t know why MySQL invented their own method instead, but it was a mistake.

Notice something else about this: unless the MySQL developers know something about revision control and merging I don’t (entirely possible, since I’ve never used the product they use), this is a lot simpler to manage. There are no cross-currents between the two source trees. It’s not just the aesthetics of having all the arrows going the same direction; I’d be a lot more confident that the merges went smoothly in this model. I think there’s a much lower chance of a mistake.

I also think the engineers would have a lot less work to do, and could concentrate more on making software and less on maintaining two complicated source trees. In fact, I believe the Community branch has actually been getting bug fixes too, contrary to my first diagram. This isn’t what MySQL initially announced they’d do, but if I had to guess, I’d say the engineering team said it would be too much work to keep the bug fixes out of the Community branch.

Notice what I’m not saying about Community

I am explicitly avoiding saying something in particular about the Community source. I want quick release cycles and all patches applied there first, for one and only one reason: so the Enterprise source is trustworthy and stable. I’m not saying I want it so I can get the most bleeding-edge new fun stuff for free in Community. That is not a factor for me in the mindset I’m using to write this article — I am imagining myself as a customer who is very risk-averse (which is true).

This model would probably make some Community users happy too, though.

What if I needed an immediate fix?

What if I found a serious bug in the software and needed it fixed right away for my business? Shouldn’t MySQL release a hot-fix into the Enterprise tree for that?

No. I found a bug, who cares? If I found it, it means the community didn’t find it first. If the community didn’t find it, it probably only affects me. Therefore, the bug fix should go into the Community server.

If I couldn’t work around the problem (unlikely), I should be able to pay MySQL’s support engineers to make me a custom patch and build just to fix this problem. I’d assume all the risk of that, of course. This unstable, experimental patch should not go into the Enterprise source, but other customers should hear about it.

Right now you might be considering the similarity to Red Hat Enterprise Linux, and thinking “but RHEL does get hot fixes, so why shouldn’t MySQL Enterprise?” The reason is MySQL Enterprise isn’t an entire operating system distribution of software, with third parties fixing defects in upstream source. The Community process I’m advocating should take care of the vast majority of such bugs. Someone might find a critical security flaw that would warrant a hot-fix to the Enterprise product without waiting six months. But seriously, look at the bugs people find in MySQL — look at that changelog I linked to. There are no critical security flaws or kernel buffer overflows — and those are the kinds of things RHEL gets hot fixes for.

Some people might be drawn to MySQL’s current monthly hot-fix policy because they come from a Microsoft background, where Microsoft releasing service packs and hot-fixes is seen as a good thing. All I can say to those people is, you’ve become like a frog in a pot of slowly heating water. Microsoft’s fixes and service packs are a broken way of fixing their broken software, and are not a good way to manage quality software, so you shouldn’t measure the value of a release policy by whether it looks like Microsoft’s.

What would my ideal Enterprise version look like?

I’d really like to see MySQL AB stop adding new features and make the existing ones work better. The bugs I keep finding are usually quite simple, and I think that’s a sign of a low-quality codebase. For example, try creating a view that already exists. It breaks replication. How did this bug go unnoticed for so long? In my opinion, it’s because the server hasn’t been stable since 5.0 was released, and nobody’s using the bleeding-edge features as much as the core of the server, which is where I’d like MySQL AB to concentrate for the Enterprise version.

The Enterprise version I’d like to see doesn’t have views. That’s right, it doesn’t have views, because nobody’s used and tested them thoroughly yet (if they had, there wouldn’t be so many bugs in them). It doesn’t have triggers, stored procedures, stored functions, or the FEDERATED storage engine. In terms of features, it’s somewhere in version 4.1. That’s what I’d call MySQL Enterprise. I don’t want these features in it: I don’t use them right now anyway, precisely because they have the potential to cause such massive pain. I want them to go back to the community incubator so the bugs can get worked out. I’m managing just fine without using them, but I’m not managing fine with the pain they’re causing just by being there even though I don’t use them.

But at the same time the existing features, especially those needed for scaling and high availability, would be given a lot more attention. Replication would have much stronger assurances of accuracy and reliability. InnoDB would scale to more processors. The query optimizer would get a lot of love. In terms of improvements to existing features, my ideal Enterprise version is somewhere around 5.0.32. I chose that version because it was released about six or eight months ago, which means the big changes in that version would have been out in the Community for six or eight months and I’d be satisfied having them in the Enterprise version.

Right now, if you want to upgrade because of a bug that’s fixed in a newer version, you upgrade into some other bugs. I’m seriously tired of upgrading into the newest, latest, greatest bugs, like infinite loops in relay logs that fill hard drives with gigabytes of duplicate logs in a matter of minutes. These bugs have cost a significant amount of money, time, and frustration. I would definitely recommend people buy and use Enterprise if it fixed bugs without introducing new ones, but I see no signs of that happening.

MySQL’s sales pitch doesn’t convince me

There’s one more thing I think MySQL would have to do to get me to buy Enterprise, and that’s develop a better sales pitch. I’ll explain that — keep reading.

I think the way the Community/Enterprise split is designed smacks of marketing people making decisions. I don’t think this is ultimately going to be as successful a strategy for MySQL as it could be, because they won’t be able to sell it as well. Why not? Because, unlike with many other products, the people who make decisions about their company’s MySQL installations are, by and large, engineers. The current marketing message sounds pretty condescending to an engineer.

I’ve even joined a MySQL webinar just to see. It was supposedly about scaling with MySQL, but in fact there was very little content. They spent a lot of time trying to say you should buy Enterprise. This was very strange, since the webinar was only open to current Enterprise customers. But the reasons they gave for choosing one or the other had me shaking my head in disbelief. It went something like this:

You should choose MySQL Enterprise if you’re making money with MySQL, because Enterprise is the version for making money. If you plan to use MySQL to make money, you should use Enterprise. On the other hand, you should use Community if you’re just experimenting with MySQL. It’s free and has lots of hot new features, like SHOW PROFILE and um, uh, that’s it. Anyway, you should use it if you’re just experimenting, because it’s the version for experimenting. Oh, and you should use it for your testing if you’re an Enterprise customer, because it’s for experimenting with, and tests are experiments.

These aren’t direct quotes, but they probably aren’t far off — they certainly capture the spirit, if not the letter, of the webinar. Their strongest reason for using Enterprise was “because you should use Enterprise,” and they said it several times. And when they said Enterprise users should run Community on their test systems, I thought “you’re kidding. I’m going to test with a different version of the product than I run in production? Enough already.” I signed off with about five minutes left in the webinar.

The bottom line is, I don’t trust a company that assumes I won’t have a problem with such nonsense. I know there are smart engineers working on the MySQL server, but the marketing message is the face the world sees. In my experience, that ends up giving the marketing people the right to make decisions, even when the engineers disapprove. Therefore, I have no confidence the people making the decisions about how MySQL is developed and released are competent to do so.

If MySQL’s marketing materials were written and presented by people with serious tech savvy, I’d be a lot more comfortable about the invisible parts of the company. I assume most other engineers are going to extrapolate backwards from the façade, just like me, and conclude the decision-making process is untrustworthy.

Incidentally, this is exactly why my current employer (an advertising agency) rocks: because the sales folks and execs have decades of experience running companies in the industries we serve, and the people who answer when you call to discuss your account are analysts, not customer service reps. Whoever picks up the phone is an Excel wizard and has a SQL window (not a reporting system, a SQL prompt) open directly to an analysis server — our analysts and sales people are smart and capable and generally have business or engineering degrees from top universities; they’re not just friendly voices.

Contrary to popular wisdom, you can tell a lot about the book by looking at the cover. That’s why MySQL needs a sales pitch that’s convincing and respectable to an engineer.

Conclusion

MySQL AB says it needs to offer its paying customers something of value, and rightly so. Unfortunately, someone who doesn’t seem to understand software engineering at all has decided on a truly backwards way to do that. The result is a release policy that seriously degrades the quality of both product versions. MySQL AB’s marketing folks keep trying to say the Emperor’s new clothes are beautiful, but proof by repeated assertion just doesn’t work on people who know software engineering.

Put another way, MySQL AB is trying to sell Enterprise on the so-called benefit of including bug fixes so the product is “more stable.” This is an oxymoron. They should be selling the service of excluding untrusted code instead.

The current Enterprise offering not only isn’t compelling, but is designed to actually be lower quality than the Community source because there are fewer people testing it. Not using the Enterprise source is a no-brainer for me. However, if they’ll correct this mistake and start producing a source tree that’s conservative, high-quality, and stable, I’ll recommend people buy it. I wish MySQL well in their efforts to commercialize the product, but I don’t want what they’re trying to sell right now.

How to set up dual monitors in Ubuntu on Dell Inspiron 1501

It took me about five minutes to get dual monitors working on my Dell Inspiron 1501 under Ubuntu 7.04. Here’s how I did it.

  1. I tried the xorg driver, which had been working fine on just the laptop display, but it wouldn’t drive both displays at once; I could get either the external display or the internal display to show, but not both. If you get it working, post a comment and let me know.
  2. I enabled the proprietary driver via Ubuntu’s Restricted Drivers Manager.
  3. I read the Gentoo Wiki page on dual monitors.
  4. As root, I ran
    aticonfig --initial=dual-head --screen-layout=above -v
  5. I rebooted.
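
Step 4 can be sketched as a small script. This is only a sketch under stated assumptions: it assumes the X configuration lives at /etc/X11/xorg.conf, and the backup step, the XORG_CONF and DRY_RUN variables, and the run helper are illustrative additions rather than part of the steps above. Only the aticonfig line comes from step 4.

```shell
#!/bin/sh
# Sketch: back up the X config, then run the aticonfig command from step 4.
# Assumptions (not from the original steps): the config path below and the
# DRY_RUN switch, which defaults to printing commands instead of running them.
XORG_CONF="${XORG_CONF:-/etc/X11/xorg.conf}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it as given.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

setup_dual_head() {
    # Keep a copy of the working config so it can be restored if X fails to start.
    run cp "$XORG_CONF" "$XORG_CONF.bak"
    # The command from step 4; it needs to run as root.
    run aticonfig --initial=dual-head --screen-layout=above -v
}

setup_dual_head
```

With DRY_RUN left at 1 the script only prints what it would do. Setting DRY_RUN=0 and running it as root applies the change, after which a reboot (step 5) is still needed.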

At the moment, I’m typing into Firefox on the laptop monitor. My Dell 1800FP monitor is perched right above it; it’s the same width in pixels, and almost the same physical width. I have no windows open on that monitor, but I do have a nice background image! XFCE configures the backgrounds for each display separately.

I’ve been using this setup for about a week now. It’s not flawless, but the flaws don’t get in my way much. Here are the problems I’ve noticed:

  1. The driver is proprietary. Okay, this is a major flaw.
  2. The driver doesn’t work perfectly. Occasionally a weird defect in the screen image appears and won’t go away until X is restarted. It looks like a bar code and usually shows up near the bottom right of one of the monitors, often right over the clock in my system tray.
  3. If I log out of XFCE, my system hangs and I have to use Alt-PrntScrn (SysRq) key combinations to shut it down and restart it.
  4. When I start my laptop, only the display on my laptop shows anything (perhaps the login screen isn’t dual-monitor capable). As soon as I log in, both monitors become active. While this is happening, my mouse randomly jumps between the top-left corners of the two monitors.
  5. The two monitors are running two different X displays. I’m not crystal-clear on how this should normally work, but I get the idea Xinerama isn’t the same as this and should work better (I don’t know if I can set up Xinerama with the proprietary driver, but I don’t think so). This has a variety of side effects:
    1. Windows I open on one display can’t be moved to the other (oddly, I can drag and drop between displays, which I didn’t expect).
    2. I can’t alt-tab to the other display.
    3. When I click on a link in Thunderbird, if Firefox is running on the other display, it says Firefox is already running but not responding, and refuses to open the link. I can’t get Firefox running on both displays at once.
    4. My XFCE panel on the laptop display doesn’t show windows on the other display. I tried creating a panel for that display too (XFCE recognizes that there are two displays and lets me place the panel on either one), but I couldn’t place it at the bottom of the display. When I chose to place it at the bottom, it seemed to place itself 768 pixels from the top (the external monitor is 1280×1024, but the laptop display is 1280×768). So I placed it at the top of the display, added a taskbar applet to it, and voilà, I had what I wanted. But when I rebooted, the panel showed up on the laptop display again.

Otherwise I haven’t noticed any troubles. Anyone who has suggestions on these issues, feel free to post a comment!

How to create stepping slides in OpenOffice.org Impress

If you’ve used Microsoft PowerPoint to create “stepping” slides (slides that appear one bullet point at a time) and can’t figure out how to do it with OpenOffice.org Impress, this article is for you.

What are stepping slides?

“Stepping” slides are the only animation or transition effect I allow myself in slideshows. I don’t generally like animations or other distractions; when I give talks, I am always painfully conscious of how much the audience tends to focus on the slides. I’ve seen some research that suggests people’s brains turn off when they look at slides, so I try to minimize that by making as few slides as possible and engaging the audience.

But enough about me, what do you think about my shirt?

Seriously: stepping slides one bullet point at a time is helpful. It lets me pack a lot more onto a slide, so I can build the story around a concept gradually, increasing the mental density of whatever’s on screen. If the whole slide pops up at once, it’s a distraction. If I split it into a bunch of slides, it’s a distraction.

How do you do it with OpenOffice.org Impress?

Don’t search the Help files for “stepping” — I went that route and crashed OpenOffice.org. Strangely, at the moment the search function crashed, I was writing a presentation about search and indexing algorithms!

You have to do it as an animation. One step at a time (no pun intended):

  1. First, create a slideshow and add a bulleted list:

    [Screenshot: OpenOffice.org Impress Stepping Slides, Part 1]

  2. Next, select the list text and choose the “Custom Animation” sub-pane on the right-hand side:

    [Screenshot: OpenOffice.org Impress Stepping Slides, Part 2]

  3. Click the Add… button in the Custom Animation pane and select Appear, then OK to dismiss the dialog box:

    [Screenshot: OpenOffice.org Impress Stepping Slides, Part 3]

  4. If your Custom Animation pane is large enough, you’ll see a small preview of the bullet points at the bottom. Notice there’s a mouse-click icon next to the first one, and the “Start” pull-down menu is blank (no selection). At this point, all the bullet points are going to be animated as a unit:

    [Screenshot: OpenOffice.org Impress Stepping Slides, Part 4]

  5. The last step is to make the animation start upon clicking, and make that apply to each bullet point. Pull down the “Start” drop-down and select “On click”. You should now see a little mouse-click icon next to each bullet point:

    [Screenshot: OpenOffice.org Impress Stepping Slides, Part 5]

You’re done! Test your slideshow just to be sure.
