Archive for May, 2011
What kind of High Availability do you need?
Henrik just wrote a good article on different ways of achieving high availability with MySQL. I was going to respond in the comments, but decided it is better not to post such a long comment there.
One of the questions I think is useful to ask is what kind of high availability is desired. It is quite possible for a group of several people to stand in a hallway and talk about high availability, all of them apparently discussing the same thing but really talking about very different things.
Henrik says “At MySQL/Sun we recommended against asynchronous replication as a HA solution so that was the end of it as far as MMM was concerned. Instead we recommended DRBD, shared disk or MySQL Cluster based solutions.” Notice that all of those are synchronous technologies (at least, the way MySQL recommended them to be configured), generally employed to ensure a specific desirable property — no loss of data. But “I must not lose any committed transaction” and “my database must be available” are actually orthogonal requirements. One is about availability, the other is durability.
A lot of people who say they want High Availability actually want High Durability.
There are a great many MySQL users for whom writes are much less valuable than reads. I would point to an advertising-supported website as a canonical example. If the system isn’t available — that is, available to serve read queries — then a lot of money is lost. If someone’s latest comment on a blog post is lost — who cares? Money continues to flow.
This is why a lot of people want a system that keeps the database online, even if some writes are lost. Note that loss of writes is not the same thing as consistency — consistency and durability are also orthogonal for most users’ purposes. So we aren’t talking about eventual consistency or any of the other buzzwords, but simply “the system must respond to read queries.”
Asynchronous replication is well suited to many such users’ availability requirements, as long as replication does not fail (halt) through a write conflict or some other failure mode. (It is often perfectly acceptable for it to fail in other ways, as long as it does not halt.) That’s why a lot of users are interested in the specific type of “high availability” that a system such as MMM is intended to provide (but, as I mentioned, actually doesn’t provide). In other words, MMM would be great for a lot of people, if it worked correctly.
I have also been exposed to applications for which this kind of availability-trumps-durability paradigm is absolutely unacceptable. The advertising system upon which the advertising-supported website relies for its income is a good example. Users know they can build sites that only need to be available for reads, precisely because they are trusting that Google AdSense is highly available for writes! Delegating writes to someone else is the easiest way to build systems.
There is a place for DRBD and MySQL Cluster, and there are also many situations that are served by neither the DRBD nor the MMM type of solution.
Josh Berkus wrote a while back about three types of cluster users, as opposed to three types of clusters. I think it’s helpful to approach the conversation from that angle sometimes too. As a consultant, I almost always do that when I enter a discussion with a customer who wants a “cluster” or “high availability.” Those are basically code phrases that tell me I need to start at the beginning and ensure we are all talking about the same requirements!
I also agree with Henrik about the need to turn off automatic failover. In many, many situations this is by far the best approach. Sometimes people state requirements that, if one steps back and looks at them afresh, quite obviously indicate that an automatic failover is the last thing that’s desirable. For example, if someone tells me that he expects failover to be required less than once a year, this is almost guaranteed not to be a good case for automatic failover. A system that’s tested so infrequently is almost certainly not going to work right when it’s needed. In such cases, it’s far better to leave everything alone until an expert human can resolve the problem, rather than have a stupid machine destroy what would otherwise be a fixable system.
Disk latency versus filesystem latency
This isn’t really a problem with iostat(1M) – it’s a great tool for system administrators to understand the usage of their resources. But the applications are far, far away from the disks – and have a complex file system in-between. For application analysis, iostat(1M) may provide clues that disks could be causing issues, but you really want to measure at the file system level to directly associate latency with the application, and to be inclusive of other file system latency issues.
Someone should add Brendan’s feed to Planet MySQL. Here are the articles: part 1, part 2. Brendan will be talking about this topic at Percona Live on the 26th.
New Maatkit tool: mk-table-usage
This month’s Maatkit release includes a new tool that’s kind of an old tool at the same time. We wrote it a couple years ago for a client who has a very large set of tables and many queries and developers, and wants the database’s schema and queries to self-document for data-flow analysis purposes. At the time, it was called mk-table-access and was rather limited — just a few lines of code wrapped around some existing modules, with an output format that wasn’t generic enough to be broadly useful. Thus we didn’t release it with Maatkit. We recently changed the name to mk-table-usage (to match mk-index-usage), included it in the Maatkit suite of tools, and enhanced the functionality a lot.
What’s this tool good for? Well, imagine that you’re a big MySQL user and you hire a new developer. Now you need to bring the new person up to speed with your environment. Or, you want to understand where the data in some table actually comes from. Or, you want to drop a column, but you’re not sure where that data is used and what other code will be affected. Or you want to find all SQL statements that modify a table. Wouldn’t it be nice to have a graph of all your tables and the data flows between them? With this tool you can parse the flow of data in SQL statements, in terms of Table-From → Table-To, and print the results, annotated by the statement’s fingerprint.
The client who sponsored the development of this tool is using it as an auditing mechanism, for some of the purposes I just mentioned, and also to help enforce their SQL coding standards. It can be used for a lot more than that, though. I haven’t done this yet, but it should be easy to write some quick 5-line script to transform it into graphviz format and produce graphs from it, or import into a table that represents edges and run queries against it, and so on. (The client is doing some of those things, but they aren’t asking me to help, so I’m taking their word for it that the output format they chose is easily amenable to these tasks.)





