The Erlang Response Time Stretch Factor For 3 And 4 Servers

In a previous post I explored a few variations of equations that express the M/M/m queueing theory response time “stretch factor,” and tried to indicate some areas where I wanted to dig into the relationships between these formulas a bit more. In this post I discuss the divergence between the official Erlang C formula and Neil Gunther’s heuristic approximation to it. I introduced this before thusly:

At \(m=3\) and above, the heuristic is only approximate. What does the Erlang form reduce to for the first of those cases? Does it result in the missing term that will extend to 4 and beyond too?

» Continue Reading (about 500 words)

The Queueing Knee, Part 2

Last week I wrote about the so-called “knee” in the M/M/m queueing theory response time curve. In that post I examined one definition of the knee; here is my analysis of the others, including the idea that there is no such thing as the knee.

There are potentially several ways to think about the “knee” in the queueing curve. In the previous post I dug into Cary Millsap’s definition: the knee is the point where a line tangent to the queueing curve passes through the origin:

knee

Here are a few others to consider:

» Continue Reading (about 1400 words)

The Queueing Knee, Part 1

The “knee” in the M/M/m queueing theory response time curve is a topic of some debate in the performance community. Some say “the knee is at 75% utilization; everyone knows that.” Others say “it depends.” Others say “there is no knee.”

Depending on the definition, there is a knee, but there are several definitions and you may choose the one you want. In this post I’ll use a definition proposed by Cary Millsap: the knee is where a line from the origin is tangent to the queueing response time curve. The result is a function of the number of service channels, and although we may argue about the topics in the preceding paragraph and whether this is the right definition, it still serves to illustrate important concepts.

knee

» Continue Reading (about 500 words)

The Response Time Stretch Factor

Computer systems, and for that matter all types of systems that receive requests and process them, have a response time that includes some time waiting in queue if the server is busy when a request arrives. The wait time increases sharply as the server gets busier. For simple M/M/m systems there is a simple equation that describes this exactly, but for more complicated systems this equation is only approximate. This has rattled around in my brain for a long time, and rather than keeping my notes private I’m sharing them here (although since I’m still trying to learn this stuff I may just be putting my ignorance on full display).

Hockey-Stick Curve

» Continue Reading (about 2200 words)

The Square Root Staffing Law

The square root staffing law is a rule of thumb derived from M/M/m queueing theory, useful for getting an estimate of the capacity you might need to serve an increased amount of traffic.

Square Root Staffing Law

The square root staffing law is designed to help with capacity planning in what’s called the QED regime, which tries to balance efficiency with quality of service. Capacity planning is a set of tradeoffs: for best quality of service, you must provision lots of spare capacity (headroom), but that’s wasteful. For best efficiency, you minimize idle capacity, but then quality of service becomes terrible.

» Continue Reading (about 400 words)

How to Extract Data Points From a Chart

I often see benchmark reports that show charts but don’t provide tables of numeric results. Some people will make the actual measurements available if asked, but I’ve been interested in analyzing many systems for which I can’t get numbers. Fortunately, it’s usually possible to get approximate results without too much trouble. In this blog post I’ll show several ways to extract estimates of values from a chart image.

Extracting

» Continue Reading (about 1000 words)

Thinking clearly about fitting a model to data

I have often seen people fitting curves to sets of data without first understanding whether that is appropriate. I once even used this blog to criticize someone for doing that. I was trying to explain that it’s wrong to fit a model to a set of measurements, unless the model actually describes the process that produced the measurements. All of my explanations (and rants) have fallen far short of the clarity and simplicity of this curve-fitting guide.

» Continue Reading (about 400 words)

A close look at New Relic's scalability chart

I’ve written a lot about modeling MySQL with the USL, and I like it best of all the scalability models I’ve seen, but it’s not the only way to think about scalability. I was aware that New Relic supports a scalability chart, so I decided to take a peek at that. Here’s a screenshot of the chart, from their blog: Here’s how it works. It plots response time (or database time, or CPU) as the dependent variable, versus throughput as the independent variable.

» Continue Reading (about 800 words)

Modeling scalability with the USL at concurrencies less than 1

Last time I said that you can set a starting value for the USL’s coefficient of performance and let your modeling software (R, gnuplot, etc) manipulate this as part of the regression to find the best fit. However, there is a subtlety in the USL model that you need to be aware of. Here is a picture of the low-end of the curve: The graph shows the USL model as the blue curve and linear scalability as the black line.

» Continue Reading (about 500 words)

Determining the USL's coefficient of performance, part 2

Last time I said that the USL has a forgotten third coefficient, the coefficient of performance. This is the same thing as the system’s throughput at concurrency=1, or C(1). How do you determine this coefficient? There are at least three ways. Neil Gunther’s writings, or at least those that I’ve read and remember, say that you should set it equal to your measurement of C(1). Most of his writing discusses a handful of measurements of the system: one at concurrency 1, and at least 4 to 6 at higher concurrencies.

» Continue Reading (about 600 words)

Determining the Universal Scalability Law's coefficient of performance

If you’re familiar with Neil Gunther’s Universal Scalability Law, you may have heard it said that there are two coefficients, variously called alpha and beta or sigma and kappa. There are actually three coefficients, though. See? \[ C(N) = \frac{N}{1 + \sigma(N-1) + \kappa N (N-1)} \] No, you don’t see it – but it’s actually there, as a hidden 1 multiplied by N in the numerator on the right-hand side.

» Continue Reading (about 300 words)

Scalability, performance, capacity planning and USL at Hotsos Symposium

I presented at this year’s [Hotsos Symposium](). I am searching for a claim to specialness, and I think it may be that I am the first Hotsos presenter who’s specifically focused on MySQL. True? I don’t know, but I’ll run with it for now. VividCortex is the startup I founded in 2012. It’s the easiest way to monitor what your servers are doing in production. It does TCP network traffic analysis.

» Continue Reading (about 500 words)

Fundamental performance and scalability instrumentation

This post is a followup to some promises I made at Postgres Open. Instrumentation can be a lot of work to add to a server, and it can add overhead to the server too. The bits of instrumentation I’ll advocate in this post are few and trivial, but disproportionately powerful. Note: VividCortex is the startup I founded in 2012. It’s the easiest way to monitor what your servers are doing in production.

» Continue Reading (about 500 words)

Using BASE instead of ACID for scalability

My editor Andy Oram recently sent me an ACM article on BASE, a technique for improving scalability by being willing to give up some other properties of traditional transactional systems. It’s a really good read. In many ways it is the same religion everyone who’s successfully scaled a system Really Really Big has advocated. But this is different: it’s a very clear article, with a great writing style that really cuts out the fat and teaches the principles without being specific to any environment or sounding egotistical.

» Continue Reading (about 400 words)