Xaprb

Stay curious!

Archive for the ‘replication’ tag

Observations on key-value databases

with 5 comments

Key-value databases are catching fire these days. Memcached, Redis, Cassandra, Keyspace, Tokyo Tyrant, and a handful of others are surging in popularity, judging by the contents of my feed reader.

I find a number of things interesting about these tools.

  • There are many more of them than open-source traditional relational databases. (edit: I mean that there are many options that all seem similar to each other, instead of 3 or 4 standing out as the giants.)
  • It seems that a lot of people are simultaneously inventing solutions to their problems in private without being aware of each other, then open-sourcing the results. That points to a sudden sea change in architectures. Tipping points tend to be abrupt, which would explain isolated redundant development.
  • Many of the products are feature-rich with things programmers need: diverse language bindings, APIs, embeddability, and the ability to speak familiar protocols such as memcached protocol.
  • I think there are more solutions here than the ecosystem will support, and in five years a few will stand out as the most popular.
  • This process of paring down the gene pool is win-win because they’re open-source, and nothing will be lost.
  • Choosing which one to use is no easy task, even for a highly skilled, technical, up-to-date person. Perhaps the decision-makers will choose based on the availability of commercial support and consulting.
  • Many of them offer built-in, dead-simple, distributed, synchronous replication. This is very difficult to achieve with traditional relational databases. What makes key-value databases different? They don’t have MVCC, for one thing; but I’m not sure of the complete answer to that question, to tell the truth.

We live in interesting times.

Written by Xaprb

September 20th, 2009 at 2:57 pm

Failure scenarios and solutions in master-master replication

with 27 comments

I’ve been thinking recently about the failure scenarios of MySQL replication clusters, such as master-master pairs or master-master-with-slaves. There are a few tools that are designed to help manage failover and load balancing in such clusters, by moving virtual IP addresses around. The ones I’m familiar with don’t always do the right thing when an irregularity is detected. I’ve been debating what the best way to do replication clustering with automatic failover really is.

I’d like to hear your thoughts on the following question: what types of scenarios require what kind of response from such a tool?

I can think of a number of failures. Let me give just a few simple examples in a master-master pair:

Problem: Query overload on the writable master makes mysqld unresponsive
Do nothing. Moving the queries to another server will cause cascading failures.
Problem: The writable master is completely unreachable
Fence the writable master and promote the standby master.
Problem: The writable master is reachable but unresponsive due to overload-induced swapping
Do nothing. Moving the load to another server will cause cascading failures.
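
The three examples above can be written down as a small decision function, which makes it easier to see which combinations of symptoms still have no agreed response. This is only a sketch: the names `Observation`, `Action`, and `choose_action` are hypothetical and do not belong to any existing failover tool, and the "healthy" branch is an assumption, since the examples cover only failures.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DO_NOTHING = "do nothing"
    FENCE_AND_PROMOTE = "fence the writable master, promote the standby"
    UNDECIDED = "no rule given in the examples"


@dataclass(frozen=True)
class Observation:
    """Hypothetical view a monitor might have of the writable master."""
    host_reachable: bool     # can the monitor reach the machine at all?
    mysqld_responsive: bool  # does mysqld answer a trivial query in time?
    overloaded: bool         # signs of query overload or swapping


def choose_action(obs: Observation) -> Action:
    # Completely unreachable: fence it and promote the standby.
    if not obs.host_reachable:
        return Action.FENCE_AND_PROMOTE
    # Assumption: a reachable, responsive master needs no intervention.
    if obs.mysqld_responsive:
        return Action.DO_NOTHING
    # Unresponsive because of overload or swapping: moving the load
    # would only push the same problem onto the other server.
    if obs.overloaded:
        return Action.DO_NOTHING
    # Reachable, unresponsive, and not overloaded: the examples
    # above do not say what to do here.
    return Action.UNDECIDED


if __name__ == "__main__":
    print(choose_action(Observation(False, False, False)).value)
    print(choose_action(Observation(True, False, True)).value)
```

The `UNDECIDED` branch is where further failure scenarios would have to be added.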

I don’t want to bias the jury, so I’ll stop there and ask you to contribute your failure scenarios and what you think the correct action should be.

Written by Xaprb

August 30th, 2009 at 3:08 pm

Posted in SQL

Don’t forget about SHOW PROFILES

with 4 comments

It seems that a lot of people want to try to improve MySQL performance by focusing on server status counters and configuration variables. Looking at counters, and “tuning the server,” is better than nothing, but only barely. You care first and foremost about how long it takes to execute a query, not about how many of this-and-that the server performs or about how big or small this-and-that buffer is. What you really need is timing information.

You can use the slow query log to find out which queries take the most time, and then examine those queries with SHOW PROFILES to see where the time goes during each query’s execution.
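
The first half of that workflow, ranking statements in the slow query log by execution time, can be sketched in a few lines of Python. The sample log text below is invented and simplified for illustration, the function name `slowest_queries` is hypothetical, and the parser assumes only that each entry carries a `# Query_time:` comment line followed by the statement; real logs vary by server version and may need more careful handling.

```python
import re

# Invented, simplified sample in the general shape of a slow query log.
SAMPLE_LOG = """\
# Time: 090531 15:24:01
# User@Host: app[app] @ localhost []
# Query_time: 0.120000  Lock_time: 0.000100 Rows_sent: 1  Rows_examined: 10
SET timestamp=1243797841;
SELECT * FROM t1 WHERE id = 1;
# Time: 090531 15:24:05
# User@Host: app[app] @ localhost []
# Query_time: 4.500000  Lock_time: 0.000200 Rows_sent: 50  Rows_examined: 900000
SET timestamp=1243797845;
SELECT * FROM t2 WHERE note LIKE '%x%';
"""

QUERY_TIME = re.compile(r"^# Query_time: ([0-9.]+)")


def slowest_queries(log_text, limit=10):
    """Return (seconds, statement) pairs, slowest first."""
    entries = []
    query_time = None
    sql_lines = []
    for line in log_text.splitlines():
        match = QUERY_TIME.match(line)
        if match:
            # A new entry starts: store the previous one, if any.
            if query_time is not None and sql_lines:
                entries.append((query_time, " ".join(sql_lines)))
            query_time = float(match.group(1))
            sql_lines = []
        elif line.startswith("#") or line.startswith("SET timestamp="):
            continue  # other header lines carry no statement text
        elif query_time is not None and line.strip():
            sql_lines.append(line.strip())
    # Store the final entry.
    if query_time is not None and sql_lines:
        entries.append((query_time, " ".join(sql_lines)))
    return sorted(entries, key=lambda entry: entry[0], reverse=True)[:limit]


if __name__ == "__main__":
    for seconds, statement in slowest_queries(SAMPLE_LOG):
        print(f"{seconds:8.3f}s  {statement}")
```

A statement found this way can then be replayed in a session with profiling enabled (`SET profiling = 1;`), after which `SHOW PROFILES;` lists the recent statements and `SHOW PROFILE FOR QUERY n;` breaks one of them down by execution stage.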

This concept is very simple and absolutely fundamental: if you care about time (and you do!), then measure and optimize time. But it’s so often overlooked or misunderstood.

The addition of SHOW PROFILES was a major step forward in the ability to optimize server and application performance. (Thanks Jeremy Cole!) As time passes and people upgrade their servers, it’s becoming more common to see it in production, which is an enormous help. Now that the differences between the Community and Enterprise versions of the server have been erased, it will be available in all future server versions, which is great news.

Written by Xaprb

May 31st, 2009 at 3:24 pm

Posted in SQL
