Using BASE instead of ACID for scalability
My editor Andy Oram recently sent me an ACM article on BASE, a technique for improving scalability by being willing to give up some other properties of traditional transactional systems.
It’s a really good read. In many ways it is the same religion everyone who’s successfully scaled a system Really Really Big has advocated. But this is different: it’s a very clear article, with a great writing style that really cuts out the fat and teaches the principles without being specific to any environment or sounding egotistical.
The author mentions a lot of current thinking in the field, including the CAP principle, which Robert Hodges of Continuent first turned me on to a couple of months ago. It has been proven formally, though I have not read the proof myself.
One of the most important concepts he advances is giving up the illusion of control. As programmers and DBAs, I think we may tend to like control too much. Foreign keys are a perfect example. I think the point here is that these things make you feel safe, but they don’t really make you safe. Just as with so many things in life, recognizing our inability to really control the systems we build is key to working with their strengths — instead of trying to bind them with iron bands.
Another great point is idempotency. This is a great way to help avoid problems with MySQL replication, by the way. I’ll leave the “why” as an exercise for the reader, but let me just point out that the file MySQL uses to remember its current position in replication is not synced to disk, so it will almost certainly get out of whack if MySQL dies ungracefully. (Google has solved this problem.)
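To make the idempotency point concrete, here is a minimal sketch in Python. It is a toy model, not MySQL's actual replication code: the in-memory row, the event dictionaries, and the function names are all hypothetical. It assumes a replica that crashes and restarts from a stale saved position, so the last event is applied a second time.

```python
# Toy model of a replica replaying events after a crash.
# All names here are illustrative assumptions, not MySQL internals.

def apply_increment(row, event):
    """Non-idempotent: applying the same event twice changes the result twice."""
    row["balance"] += event["delta"]

def apply_absolute(row, event):
    """Idempotent: applying the same event twice leaves the row as it was."""
    row["balance"] = event["new_balance"]

def run_with_crash(apply, events):
    """Apply every event once, then replay the last one, as a replica would
    if its saved position had not reached disk before a crash."""
    row = {"balance": 100}
    for event in events:
        apply(row, event)
    stale_position = len(events) - 1  # position on disk lags by one event
    for event in events[stale_position:]:
        apply(row, event)
    return row["balance"]

events_inc = [{"delta": 10}, {"delta": 5}]
events_abs = [{"new_balance": 110}, {"new_balance": 115}]

print(run_with_crash(apply_increment, events_inc))  # last increment counted twice
print(run_with_crash(apply_absolute, events_abs))   # replay is harmless
```

With the increment form the replayed event is counted twice and the replica drifts away from the master. With the absolute form the replay is a no-op, so a stale position does no damage.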
A highly recommended read — worth more than most case studies about how specific companies have scaled their specific systems.



Indeed, as you said, this is a proven approach used by many “Web 2.0” sites (Facebook, Ning, LinkedIn, etc.). Another dimension of this method of scaling is diagonal scaling.
Most sites that choose the BASE approach ignore the “E” (Eventually Consistent) aspect, in that the application does not completely “roll back” a multi-partition transaction if any part of it fails. This is often by design, especially for non-financial applications. As the DBA for a Web 2.0 site, I have found that a generally acceptable solution is what we call a “Sanity Crawler”: essentially a script that continually checks transactions for completeness against a predefined rule set. For example, if a “Friendship Transaction” has not completed after N units of time, the crawler attempts to complete it or roll it back, based on the rules. If the transaction cannot be “fixed”, a diagnostic message is logged so that developers can debug it.
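A single pass of such a crawler might look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the `Txn` record, the `RULES` table, the step names, and the in-memory list stand in for whatever tables and rule definitions a real site would use.

```python
from dataclasses import dataclass, field

# Hypothetical rule set: what to do with each kind of stalled transaction.
RULES = {"friendship": "complete", "group_invite": "rollback"}

@dataclass
class Txn:
    txn_id: int
    kind: str
    started_at: float           # seconds, same clock as `now` below
    required: frozenset         # steps that must all succeed
    done: set = field(default_factory=set)
    state: str = "pending"

def crawl(txns, now, timeout, log=print):
    """One crawler pass. Returns {txn_id: action} for transactions it touched."""
    actions = {}
    for txn in txns:
        if txn.state != "pending":
            continue                      # already settled on an earlier pass
        if txn.required <= txn.done:
            txn.state = "complete"        # finished on its own
            continue
        if now - txn.started_at < timeout:
            continue                      # still within its N units of time
        rule = RULES.get(txn.kind)
        if rule == "complete":
            txn.done |= txn.required      # stand-in for redoing the missing steps
            txn.state = "complete"
            actions[txn.txn_id] = "completed"
        elif rule == "rollback":
            txn.done.clear()              # stand-in for undoing the finished steps
            txn.state = "rolled_back"
            actions[txn.txn_id] = "rolled_back"
        else:
            txn.state = "needs_debugging"
            log(f"txn {txn.txn_id} ({txn.kind}) could not be fixed")
            actions[txn.txn_id] = "logged"
    return actions
```

In practice the pass would run on a schedule, and the "complete" and "rollback" branches would issue real writes against each partition. Those writes should themselves be idempotent, since the crawler may be interrupted and rerun.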
Ryan Lowe
23 Jul 08 at 9:54 pm
Talking about replication: I read your book High Performance MySQL, and I have one question.
What if a master server has several databases (for example DB1, DB2, DB3, ...)? How do we replicate all of those databases to the slave server?
I have two database servers, one acting as the master and the other as the slave. I have multiple databases running on the master and am able to replicate only one. How do I replicate all of the databases to the slave server?
Kunal Jain
5 Aug 08 at 8:02 am