Archive for the ‘Neil J. Gunther’ tag
Those of you who’ve been following my recent work on modeling system scalability might be interested in this. (It’s not my work, by the way. I’m just trying to ski in the wake of Neil Gunther.)
I’ve measured quite a few systems that have some strange bubbles in the scalability curve. As I explained in my talk on Thursday, systems don’t always follow the model precisely, because of their internal architecture. Systems sometimes behave differently at specific points because, say, an internal structure gets filled up and allocates a larger array to hold more of whatever it is, or CPU scheduling changes to balance threads across cores differently, or any of a number of other possibilities that are sometimes hard to understand or predict. Yes, this is hand-waving. But although it could sometimes take a lot of work to explain these kinds of things, it’s easy to observe and measure them in action, so the phenomenon is clearly real.

VoltDB, for example, does not follow the scalability curve at 1 and 2 nodes in the cluster because 1 and 2 nodes are magic numbers. In computer science we usually say that there are three types of numbers: 0, 1, and many. Turns out it’s a little different for VoltDB.
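For readers who haven’t seen the model in question: the scalability curve here is presumably Gunther’s Universal Scalability Law, which predicts relative capacity C(N) = N / (1 + σ(N−1) + κN(N−1)), where σ models contention and κ models coherency delay. Here’s a minimal sketch; the parameter values are made up for illustration, not fitted to any real system.

```python
# A sketch of Gunther's Universal Scalability Law (USL):
#   C(N) = N / (1 + sigma*(N - 1) + kappa*N*(N - 1))
# sigma models contention (serialization), kappa models coherency
# (crosstalk) delay. The values below are illustrative only.

def usl_capacity(n, sigma=0.05, kappa=0.001):
    """Predicted relative capacity at n nodes under the USL."""
    return n / (1 + sigma * (n - 1) + kappa * n * (n - 1))

for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} nodes -> predicted capacity {usl_capacity(n):.2f}")
```

A system with “bubbles” is one whose measured points sit off this smooth curve at particular N — like VoltDB at 1 and 2 nodes.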
So, for those of you who are curious about this, I now stop blathering and simply direct you to Neil Gunther’s blog to read more.
I wish I could be at the Hotsos Symposium. I would keep my mouth very tightly closed and my ears wide open, and try to learn from people who are completely out of my league about performance analysis topics I won’t grok for another decade (if I’m lucky). But I just can’t cram in that much travel.
If you’re like me, why don’t you go to the O’Reilly MySQL Conference instead? I’ll be trying my feeble best to bring some of the Oracle performance scientist’s mentality to this event, with presentations such as my Forecasting MySQL Performance and Scalability session. And a lot of smart people will be there from many database communities. Percona has a 20% discount, too — not a bad enticement.
I had the privilege to meet Neil Gunther and listen to him speak this week at Surge. During his talk, he brought up the point that all measurements are wrong by definition. I thought I knew what he meant, but I was stuck with tunnel vision about floating-point precision and such. I had it all wrong. The real answer is obvious and simple.
The point is that the process of measuring, and therefore the answer that comes out of the measurement process, is imprecise. And further, that we need to treat a measurement as a measurement, not as the true value of whatever it is we tried to measure. So although we may say “the CPU was 70% utilized,” we should really be thinking “the measurements of CPU busy-time totaled 70% of the measurements of elapsed-time.” There’s more, but I won’t repeat his whole talk. You might enjoy his book.
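To make that concrete, here’s a toy sketch of my own (not from Neil’s talk): if we estimate CPU utilization by sampling, the number we report is a property of the measurement process, not the true busy fraction. The ground truth is visible in the code only because it’s a simulation.

```python
# Toy illustration: estimating CPU utilization by sampling.
# The "true" busy fraction is known here only because this is a
# simulation; the measurement is merely the fraction of samples
# that happened to land in a busy interval.
import random

random.seed(7)
TRUE_BUSY_FRACTION = 0.70  # ground truth, unknowable in real life

def sample_utilization(n_samples):
    """Fraction of random samples that observed the CPU busy."""
    busy = sum(random.random() < TRUE_BUSY_FRACTION for _ in range(n_samples))
    return busy / n_samples

for n in (10, 100, 10_000):
    print(f"{n:6d} samples -> measured {sample_utilization(n):.1%}")
```

With few samples the measurement wanders noticeably around 70%; it converges as samples accumulate, but it never stops being a measurement.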
Neil mentioned that this way of thinking isn’t foreign — we learn it in the physical sciences. Indeed, I immediately remembered all my chemistry and physics labs, and mechanical engineering classes, and so on. But that’s a whole education away now. Somehow between then and now, I educated myself to think that computers manipulate numbers, and the numbers are somehow mathematically pure.
When computers store and retrieve numbers, that’s often imprecise, and that is continually present in my mind — but that’s a whole different matter.
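That storage imprecision is easy to demonstrate: a decimal like 0.1 has no exact binary floating-point representation, so the stored value already differs from the “pure” mathematical number before any measurement enters the picture.

```python
# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so their sum is not exactly 0.3.
a = 0.1 + 0.2
print(a == 0.3)     # False
print(repr(a))      # 0.30000000000000004
```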