Comments on: Products that scale linearly to hundreds of servers http://www.xaprb.com/blog/2010/11/02/products-that-scale-linearly-to-hundreds-of-servers/ Stay curious! Mon, 13 May 2013 05:55:40 +0000 hourly 1 http://wordpress.org/?v=3.5.1

By: abroad http://www.xaprb.com/blog/2010/11/02/products-that-scale-linearly-to-hundreds-of-servers/#comment-18909 abroad Mon, 15 Nov 2010 16:08:35 +0000 http://www.xaprb.com/blog/?p=2087#comment-18909 Hi, stumbled onto this blog, saw this question, and can’t resist pointing you to http://www.xlmpp.com
Enjoy,
abroad

By: John Hugg http://www.xaprb.com/blog/2010/11/02/products-that-scale-linearly-to-hundreds-of-servers/#comment-18893 John Hugg Fri, 05 Nov 2010 18:45:24 +0000 http://www.xaprb.com/blog/?p=2087#comment-18893 VoltDB engineer here.

On linear scalability:
Referring to 97% scalability as linear seems to me a pretty inoffensive exaggeration. As for VoltDB, we’ve tested performance on up to 30 nodes, so I can’t speak to hundreds. For many kinds of workloads in VoltDB, if you double the number of nodes you will get double the performance. This is not true for all workloads and is usually not true for clusters with fewer than three nodes. Yes, there is a graph on our website that shows a particular benchmark with some noise in it. This simple graph is a poor substitute for a real POC.

On joins
VoltDB can perform many common and useful joins with blazing speed. Joins that would require large data moves between nodes are out of scope for our project. I’m not aware of anyone in the transaction processing space who does these well. The analytics space is a different story.

On MySQL NDB Cluster
NDB and VoltDB have some key similarities and some key differences. Both require some effort on the part of the app developer to get the performance they promise. While I’m totally biased, I think VoltDB’s strongest win over NDB is simplicity. The time it takes to go from beginner to expert seems dramatically different between the two systems.

By: Jonas Oreland http://www.xaprb.com/blog/2010/11/02/products-that-scale-linearly-to-hundreds-of-servers/#comment-18885 Jonas Oreland Wed, 03 Nov 2010 20:07:23 +0000 http://www.xaprb.com/blog/?p=2087#comment-18885 Jonas comment:

DBT2 is quite a complicated benchmark. The reason MySQL Cluster didn’t scale 100% linearly is that the history table can’t be partitioned so that a transaction spans only 1 node group (unless you alter the benchmark, of course).

But given an easier benchmark, one that allows transactions to be 100% partitioned, it should be linearly (100%) scalable…

So I *don’t* think it’s impossible…

By: Xaprb http://www.xaprb.com/blog/2010/11/02/products-that-scale-linearly-to-hundreds-of-servers/#comment-18884 Xaprb Wed, 03 Nov 2010 17:19:12 +0000 http://www.xaprb.com/blog/?p=2087#comment-18884 It occurred to me while I was out doing errands that I should give a more complete example of why 100% correspondence between number of nodes and throughput is the only thing that’s actually linear. I think it’s easy to fall into a mental trap of “there is some performance loss, but the amount of loss remains constant per node as you add nodes, thus it’s linear scaling.”

Suppose 1 node is 100 units of performance; 2 nodes is 180. That’s a 10% deviation from perfect scaling. Suppose we double again, and we “hold the loss constant and add 1.8x capacity for every doubling of nodes.” Then we have 4 nodes, and 180 times 1.8 = 324 units.

Plot that on a graph. It’s a curve. A “constant amount of loss” is not linear.

Think about it this way: it’s the same type of thing as compound interest. You know that the principal plus interest grows exponentially as the compounding continues. Your account balance doesn’t grow linearly. Neither does the performance loss from adding nodes.
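The arithmetic above can be tabulated with a short Python sketch. It assumes, as in the example, that every doubling of nodes multiplies capacity by the same factor (1.8 here); the function names and the "efficiency" column (actual throughput divided by perfectly linear throughput) are illustrative choices, not anything from a real benchmark.

```python
def throughput_after_doublings(base, factor, doublings):
    """Throughput when each doubling of nodes multiplies capacity by `factor`."""
    return base * factor ** doublings


def scaling_table(base=100.0, factor=1.8, max_doublings=4):
    """Return (nodes, actual, ideal, efficiency) for 1, 2, 4, ... nodes."""
    rows = []
    for d in range(max_doublings + 1):
        nodes = 2 ** d
        actual = throughput_after_doublings(base, factor, d)
        ideal = base * nodes  # perfectly linear: throughput proportional to nodes
        rows.append((nodes, actual, ideal, actual / ideal))
    return rows


if __name__ == "__main__":
    for nodes, actual, ideal, eff in scaling_table():
        print(f"{nodes:>3} nodes: {actual:8.1f} units "
              f"(ideal {ideal:7.1f}, efficiency {eff:.1%})")
```

Under that assumption, efficiency falls by the same ratio at every doubling (90%, then 81%, and so on), which is the compounding effect described above.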

By: Xaprb http://www.xaprb.com/blog/2010/11/02/products-that-scale-linearly-to-hundreds-of-servers/#comment-18883 Xaprb Wed, 03 Nov 2010 15:34:13 +0000 http://www.xaprb.com/blog/?p=2087#comment-18883 The 45-degree angle thing is really just a confusion that I shouldn’t have perpetuated. It doesn’t really make sense. (Sorry Morgan…)

NDB absolutely does not provide a constant increase directly proportional to the number of nodes. If it did, then doubling the nodes would give a 100% gain. Period. If 2x the nodes gives only a 97% gain, then we have a curve, not a line. We have a 3% deviation from linearity.

It is extremely rare and specialized to find systems that even have the possibility of linear scaling. A distributed database that does transactions across a cluster is not one of them.
