Xaprb

Stay curious!

How to notify event listeners in MySQL

with 7 comments

A high-performance application that has producers and consumers of some resource, such as a queue of messages, needs an efficient way to notify the consumers when the producer has inserted into the queue. Polling the queue for changes is not a good option. MySQL’s GET_LOCK() and RELEASE_LOCK() functions can provide both mutual exclusivity and notifications.

This post was prompted by a message to the MySQL general emailing list some time ago, but I’m finally getting around to actually testing the theoretical solution I mentioned then (I can never just think my way through anything that involves locking and waiting… I have to test it).

Here’s the set-up:

create table test.messages (
   id int not null auto_increment primary key,
   message varchar(50) not null
);

The producer

The producer’s job is to insert rows into the table. In pseudo-code,

while (true ) {
   get_lock();
   // time passes...
   query("insert into messages(message) values ('hi')");
   release_lock();
}

Releasing the lock immediately after inserting will “wake up” the consumer, which must be blocked, waiting for the lock. Locking again as soon as possible will make the producer wait until the consumer is done processing, then the consumer will wait again.

The consumer

Since the consumer is waiting for the lock, that means it has tried to exclusively lock the same resource the producer has locked. Once the producer releases it, the consumer can go ahead and process the rows just inserted. In pseudo-code:

$last_row = 0;
while ( true ) {
   get_lock();
   $rows = query("SELECT * FROM messages WHERE id > $last_row");
   for each $row ( $rows ) {
      // Process
      $last_row = $row[id];
   }
   release_lock();
}

Locking

The actual locking implementation always makes the details more complicated.

Both the producer and the consumer will have to get an exclusive lock on the queue table, or something that represents the queue table. The immediately obvious solution is LOCK TABLES. This doesn’t work well for most situations.

Why not? Since the producer and/or the consumer might need to access data in more than one table, they’ll have to lock all the tables they need. This will block other parts of the system from functioning, assuming there’s more than just a queue in the database. Other queries might then need to use LOCK TABLES too, and this just has a way of spreading out of control until the entire database becomes serial, mutual-exclusive access. This is terrible for any serious application.

Fortunately, MySQL has application locks, implemented with GET_LOCK() and RELEASE_LOCK(). They’re advisory, so you can ignore them if you want, but they are handy for things like this, where the producer and consumer just need to lock the same thing. They’re also relatively cheap. You’re really just locking a string, which you can pick. I’ll use the name of the table.

Here’s the code:

// Producer:
$timeout = 1000000;
while (true) {
   query("SELECT GET_LOCK('messages', $timeout)");
   // time passes...
   query("insert into messages(message) values ('hi')");
   query("SELECT RELEASE_LOCK('messages')");
}

// Consumer:
$last_row = 0;
while ( true ) {
   query("SELECT GET_LOCK('messages', $timeout)");
   $rows = query("SELECT * FROM messages WHERE id > $last_row");
   for each $row ( $rows ) {
      // Process
      $last_row = $row[id];
   }
   query("SELECT RELEASE_LOCK('messages')");
}

This works because the producer and consumer are really notifying each other — it’s not one-way, it’s symmetric. Inside MySQL, there’s a queue of threads waiting for locks. As soon as one releases the lock, the other gets it, and immediately goes back onto the queue waiting for it again.

Complications

There’s more to it than this. GET_LOCK() has a timeout, which can’t be infinite. If the timeout expires, the function returns, but doesn’t grant the lock. Some other errors could also cause this to happen. The producer and consumer have to be prepared to recognize when the lock isn’t granted, and retry. The return value of GET_LOCK() signifies whether the lock was really granted. Also, either the producer or consumer could die, and then there’d be no wait for the lock at all. The consumer can tell that this happened by noticing there’s no work to do. The producer can’t really tell unless it queries the database. But the producer is likely waiting for something (another lock, user input,…) where the code says “time passes.” So this shouldn’t really be a problem.

Another limitation is the possibility of the consumer starting first and locking out the producer. If it doesn’t release the lock and try to re-lock periodically, the producer will never be able to get a lock. If it does, there’s still another problem. The consumer should sleep so as not to spin-wait for the presence of a producer. If the producer produces a row while the consumer is sleeping, and then doesn’t produce and release again for a very long time, the consumer will not find out about the row the producer inserted. It will have to wait for the next message the producer inserts. The solution is to make sure the consumer keeps the lock while it sleeps.

All of these issues are solvable with special-case startup code, but I’m sure you can work out something that meets your needs. I don’t want to make this article more complicated, because this will all be application-dependent.

Sample application

Here is a Perl script that implements a producer and consumer on a MySQL table called test.messages. To run it, give a --mode argument of ‘p’ or ‘c’. Be sure you create the table (see above) first:

Start two instances, one in producer mode, one in consumer mode, and watch the consumer print out messages as you enter them into the producer. Fun!

More options

If you do need to poll, there are still some steps you can take to make it more efficient. I wrote about efficient polling with exponential or Fibonacci wait a while ago. This technique has worked well for me in many applications.

You can also poll on something small and efficient, instead of polling a potentially big messages table. Make another table in which the producer inserts a single row, or flips a single row from zero to one, and the consumer resets it. Polling on a small resource is much more efficient than a big resource. You can use this technique together with transactions to coordinate the work of many producers and consumers, even when you don’t have explicit methods of locking (for example, if your database server doesn’t support it).

Finally, if you need a fixed-size FIFO queue or “round-robin table,” try the suggestions in my article on how to create a queue in SQL.

Written by Xaprb

August 29th, 2007 at 8:46 am

7 Responses to 'How to notify event listeners in MySQL'

Subscribe to comments with RSS

  1. I’ve tended to advise against using MySQL’s advisory locking functions, there are enough pitfalls to make it undesirable in terms of relying on it for an application.

    There are other cheap ways to make it work.
    Various people have written UDFs that write to UDP sockets. That’s useful.

    But a really simple method would be to put a trigger on the table, and have a secondary table (MyISAM) with just one row. The trigger would just insert ignore on a row with a fixed ID and a primary/unique key. A reader deletes the row. The queue can then be checked with a SELECT COUNT(*) FROM onerowtbl, which with MyISAM is a very fast operation. It you don’t like triggers, the secondary table can be manually updated by the writer application. Same result.

    I just made up the above as I was writing the reply… there might be other and better solutions. The point is, the above would be reliable.

    Arjen Lentz

    30 Aug 07 at 10:01 am

  2. What are the pitfalls with advisory locking functions?

    Xaprb

    30 Aug 07 at 10:04 am

  3. [...] How to notify event listeners in MySQL at Xaprb A high-performance application that has producers and consumers of some resource, such as a queue of messages, needs an efficient way to notify the consumers when the producer has inserted into the queue. Polling the queue for changes is not a good option. MySQL’s GET_LOCK() and RELEASE_LOCK() functions can provide both mutual exclusivity and notifications. [...]

  4. Hi Baron! Cool article. Should be noted, though, that this is only relevant for stateless, request-response oriented applications. Using an Observer/Observable pattern in stateful applications would be a better option for those platforms that support state. The Observers register with an Observable object and once a condition/event occurs, all observers are notified…

    Cheers!

    Jay

    Jay Pipes

    30 Aug 07 at 4:40 pm

  5. Yeah, I forgot to point out that this is totally a DB-centric approach. There are much better ways for a stateful app.

    Xaprb

    30 Aug 07 at 4:46 pm

  6. Is there a way to implement a connectionless lock in MySQL? The locking mechanism would be perfect for me, except that I can’t maintain a connection. I want to issue a query to get the lock, then go away for a long period, and then come back and unlock it later. As far as I can tell, a lock is the only method to get a MySQL query to block until something else tells it not to, so I’m not seeing any other way to implement it…

    Ichae

    22 Nov 07 at 12:42 am

  7. just wanted to add my two cents to the story, even though it’s 3 years old;)

    Honestly, i did not know about the get_lock-functionality of mysql and neither did I expect it to be available. I was in need of a cross-connection lock as well and I reliably used a mechanism similar to arjen lentz’s.

    My concept is based on mysql’s guarantee that there can’t be two records with the same primary key.

    create table `MyLock`
    `name` varchar(255) not null,
    primary key `name`;

    locking: insert into `MyLock` values (‘aLock’);
    unlocking: delete from `MyLock` where `name` = ‘aLock’;

    the insert-locking-method works because this an implicit read and an explicit write operation made atomic by the db-engine (primary key restriction)

    if the insert fails, the lock failed and has to be retried, where one could use the increasing wait cycle method explained above.

    if the insert succeeds the lock succeeded and can be unlocked by the application after that.

    There is no notification but it is cross-connection.

    Lars

    22 Mar 10 at 5:10 am

Leave a Reply