What's a good benchmark?

Vadim has taught me that valid benchmarks are both simple and complex. Simple, because the basic principles are few; complex, because the devil is in the details and it’s a lot of work to satisfy the basic requirements. I’ll give the simple version here.

I suppose it should go without saying, but it’s worth saying anyway: benchmarks must also not be deliberately manipulated, for example by artificially throttling the system so that scalability appears to be a perfectly straight line. I’ve seen more than one benchmark that is simply too perfect to be true (each throughput measurement is an exact multiple of ten thousand, for example), and whose original data is mysteriously no longer available.
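
As a rough illustration of the "too perfect to be true" smell, here is a minimal Python sketch that flags two warning signs in a set of published results: every throughput value being an exact multiple of some round number, and throughput scaling in a perfectly straight line with thread count. The function names, the default granularity of 10,000, and the sample numbers are all hypothetical, chosen only for illustration. A positive result is a reason to ask for the raw data, not proof of manipulation.

```python
import math


def all_exact_multiples(throughputs, granularity=10_000):
    """True if every measurement is an exact multiple of `granularity`.

    Real measurements carry noise, so this is very unlikely by chance.
    """
    values = list(throughputs)
    return bool(values) and all(v % granularity == 0 for v in values)


def perfectly_linear(threads, throughputs, rel_tol=1e-9):
    """True if throughput per thread is identical at every data point."""
    pairs = list(zip(threads, throughputs))
    if len(pairs) < 3:
        # Too few points to call a straight line suspicious.
        return False
    ratios = [tp / n for n, tp in pairs]
    return all(math.isclose(r, ratios[0], rel_tol=rel_tol) for r in ratios)


# Made-up numbers, purely for illustration.
threads = [1, 2, 4, 8]
too_clean = [10_000, 20_000, 40_000, 80_000]
plausible = [10_213, 19_874, 37_502, 66_041]

for label, data in (("too_clean", too_clean), ("plausible", plausible)):
    print(label, all_exact_multiples(data), perfectly_linear(threads, data))
```

The checks are deliberately crude. They only catch the most blatant cases, which is exactly the kind described above.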