When systems scale better than linearly
I’ve been seeing a few occasions where Neil J. Gunther’s Universal Scalability Law doesn’t seem to model all of the important factors in a system as it scales. Models are only models, and they’re not the whole truth, so they never match reality perfectly. But there appear to be a small number of cases where systems can actually scale a bit better than linearly over a portion of the domain, due to what I’ve been calling an “economy of scale.” I believe that the Universal Scalability Law might need a third factor (seriality, coherency, and the new factor, economy of scale). I don’t think that the results I’m seeing can be modeled adequately with only two parameters.
Here are two publicly available cases that appear to demonstrate this phenomenon: Robert Haas’s recent blog post on PostgreSQL, titled Scalability, in Graphical Form, Analyzed and Mikael Ronstrom’s post from May on MySQL (NDB) Cluster, titled Better than Linear Scaling is Possible.
Dr. Ronstrom’s post discusses the mechanics of the phenomenon, and speculates (I’m not sure it’s conclusive) that it is from a combination of partitioning and better use of CPU caches. Now someone needs to do the math to figure out how to include this factor into the equation.
The good thing about the Universal Scalability Law is how simple and applicable it is for many systems. It’s nice that this economy-of-scale factor seems to be unusual and the simpler model remains easy to apply for a large variety of tasks.



Baron,
Better than linear scalability can be achieved relatively easy, especially I expect we can see it with “NewSQL” systems with auto-sharding capabilities.
The math is following:
let’s assume 1 server with X GB of data and Y GB of memory,
now we go to to 2 servers sharding data.
For simplicity, now each server will have X/2 GB of data and Y GB of memory.
Now, if we look on graph
http://www.mysqlperformanceblog.com/2010/04/08/fast-ssd-or-more-memory/
in some points you can see that doubling memory can actually increase
performance 10x.
So moving from 1 server to 2 servers, and re-balancing data, we can get 10x improvement.
I am lazy to figure out math, but it seems there should be coefficient on data decreasing.
VadimTk
6 Oct 11 at 10:50 pm
The USL model works two ways:
1. Software scaling. You hold everything the same, except you increase concurrency (threads). More workers should get more work done. If the throughput is proportional to the increased concurrency, it’s linear scalability.
2. Hardware scaling. You hold the workload per node as a constant, but you increase the number of nodes doing work. This means if you double the number of servers in the cluster, you should still have the same number of threads per server and the same amount of data per server. If the increase in throughput is proportional to the increase in nodes, it’s linear.
But if we double the number of nodes and keep the dataset the same, that doesn’t fit into the USL model. I think the USL could be extended to model this, but I believe that we’re getting into a lot of complexity involving the working set and the memory boundary. You know this isn’t trivial to understand because it depends on the distribution of requests that cause cache hits versus cache misses.
I think that the super-linear scaling mentioned at the links I showed above are related to this, although I think that Ronstrom’s blog post has too many variables in flux to model simply.
Xaprb
6 Oct 11 at 11:36 pm
Baron,
I am disagree on your definition of hardware scaling.
Let us have system “NeuvoSQL”, which should be black box,
we do not care what is inside.
Initially it works on 2-nodes, we can scale it to 4-nodes or to 6-nodes, all nodes are equal.
You do not know how it handles data inside.
You only put 512 threads workload on this.
Is this hardware scaling ? I believe it is.
The results ( which are received on real system, available on market) are following:
2-nodes: 5000 tps
4-nodes: 12000 tps
6-nodes: 19000 tps
It is better than linear scalability for me, it was received only by hardware scaling.
If USL model can’t handle it, that’s unfortunate, as I said such systems are real, not theoretical, and they are available on market.
VadimTk
6 Oct 11 at 11:55 pm
You can feel free to disagree on the definition of hardware scaling — it’s not my definition, it’s Neil Gunther’s :) I agree with you, we need a way to model the kinds of things you’re talking about.
Xaprb
7 Oct 11 at 12:14 am
Mikael discusses better use of CPU caches, but that’s because he is talking about MySQL Cluster which is already an in memory database. Probably you can see better-than-linear scaling much more easily on a disk based database where dataset is larger than memory on a single node.
We know well (for instance see the Yoshinori tutorial from this year’s MySQL Conference, or coming up in Percona Live UK) that it makes a huge difference if your dataset fits in RAM or not. So like Vadim explains here, once you add more nodes, and assuming the workload can be partitioned so that you don’t just have an identical copy of the InnoDB buffer pool on each node, then by adding nodes you effectively increase the buffer pool. Once it is bigger than the dataset, you get huge increase in performance.
If you want to mode this, the interesting question is what happens as you approach that threshold. You get better performance but I don’t know if there is a simple answer.
Henrik Ingo
7 Oct 11 at 8:09 am