Xaprb

Stay curious!

Archive for April, 2010

Ignoring, laughing, fighting, winning

with 15 comments

A now-famous quote that I probably don’t need to attribute: “First they ignore you, then they laugh at you, then they fight you, then you win.”

Where is Drizzle in this lifecycle? I’ve been hearing and reading some comments to the tune of “those Drizzle guys think it’s easy to rip MySQL stuff out and start over, wait till they see how hard it’s going to get when the real world sinks in.” Maybe, maybe. But maybe not, too. Maybe not.

I’ve seen more than one software project that was belittled as “never gonna amount to anything, save your time” and went on to do quite well. Never underestimate the power of a handful of passionate and talented people. I personally feel that Drizzle has a bright future.

Written by Xaprb

April 29th, 2010 at 10:04 pm

Posted in Commentary,Drizzle,SQL

Why high-availability is hard with databases

with 7 comments

A lot of systems are relatively easy to make HA (highly available). You just slap them into a well-known HA framework such as Linux-HA and you’re done. But databases are different, especially replicated databases, especially replicated MySQL.

Matchbox CarThe reason has to do with some properties that hold for many systems, but not for most databases. Most systems that you want to make HA are relatively lightweight and interchangeable, with little to zero statefulness, easy to start, easy to stop, don’t care a lot about storage (or at least don’t write a lot of data; that’s usually delegated to the database), and there’s little or no harm done if you ruthlessly behead them. The classic example is a web server or even most application servers. Most of the time these things are all about CPU power and network bandwidth. If I were to compare them to a car, I’d say they are like matchbox cars: there are many of them, and they are cheap and easy to replace.

Mining TruckDatabases are different. With or without replication, you’re looking at a system that is complex, stateful, heavyweight, and cares a lot about storage. It runs on bigger hardware with fast disks and a lot of memory. It’s usually disk-bound, and it does a lot of writes. It’s hard to start — it takes a long time to warm up and really get ready to serve production workloads (many minutes, hours, or even days). It tends to run with a lot of data in memory in a dirty state, so shutdown is slow, because a clean shutdown requires flushing a bunch of data to disk. If you yank its power plug or kill-dash-nine it, it’ll have to perform recovery on startup, which slows the startup process even more. If I were to compare a database server to a car, I wouldn’t even use a car as the analogy: I’d use one of those big-ass mining trucks. If your mining truck breaks down, you don’t just toss it in the trash and pull another off the shelf.

The problem with a lot of HA solutions is that they want to deal with inconsistencies or irregularities by killing the resource and replacing it in another location. This works fine with web servers, but not with database servers. Doing that will cause serious pain and downtime, defeating the point of HA. And when you add replication into the mix, it gets even worse. A system that wants to manage replication needs to deal with very complex conditions. A lot of replication failures are delicate matters that require skilled human intervention to solve. The HA solution must insulate the application from the misbehaving resource, but leave it running so the human can handle things.

This is not the way most applications are made HA. It’s different with databases, and it’s much harder.

Written by Xaprb

April 26th, 2010 at 7:53 am

Posted in High Availability,SQL

Tagged with

Videos and slides for MySQL Conference 2010

with one comment

Sheeri has posted an awesome list of talks with links to videos and slides from the MySQL conference. She did a lot of the video recording herself. I don’t know how she could have posted them so fast — it’s a lot of work.

Written by Xaprb

April 24th, 2010 at 12:55 pm

Posted in Conferences,SQL

Tagged with