Xaprb

Stay curious!

Optimal performance out of the box!

with 12 comments

Most database companies would be proud to say that their products perform optimally out of the box. It means they accomplished a Herculean feat of engineering. But most databases have configuration options because this is almost impossible. For example, MySQL has scores of tuning options, and it needs a lot more.

So when someone benchmarks your database and makes you look bad, usually you can say “that benchmark was run by someone who doesn’t know how to properly tune my database software.”

But what if the benchmarker claims that your database didn’t need to be tuned, (via Dave Page), because it’s optimal out of the box? Do you accept the benchmark results, or reject the compliment?

Written by Baron Schwartz

May 12th, 2010 at 8:25 am

Posted in PostgreSQL,SQL

Tagged with ,

12 Responses to 'Optimal performance out of the box!'

Subscribe to comments with RSS

  1. This is a step up from a previous white paper showing how much better it was than MyISAM.

    Mark Callaghan

    12 May 10 at 9:24 am

  2. If MyISAM didn’t recover so badly from crashes and didn’t have table-level locking, it could actually be a great storage engine. It is more sophisticated in some ways than InnoDB.

    Xaprb

    12 May 10 at 9:27 am

  3. Yeah, I found that curious as well. This could be alleviated, however, if sources were cited. The PostgreSQL documentation (or wiki or whatever) surely says somewhere that you need to tune the default installation, and gives some hints. If the SQL Server documentation indeed says that it is tuned optimally out of the box, then that would be OK.

    Peter Eisentraut

    12 May 10 at 9:35 am

  4. The benchmark Redhat did was similar to TPC-C.

    I took a quick look at one of the faster official TPC-C price/performance benchmarks based on MSSQL:

    http://www.tpc.org/results/FDR/TPCC/HP_ProLiant_DL385G7_100408_v3_fdr.pdf

    I am not very familiar enough with MSSQL to determine whether it was tuned or not.

    Rod Taylor

    12 May 10 at 10:12 am

  5. We run Microsoft’s SQL Server 2005 Standard at work and when I was looking for information about tuning it via settings (4 man IT team, no real DBA), most of what I found turned into tuning indexes, queries or what to monitor to find out what faster hardware to buy. There are settings for assigning resources to SQL Server such as CPU affinity (can be set to let SQL Server manage it, gave us the best results), I/O affinity mask for processors (can be set to let SQL Server manage it, gave us the best results), SQL Server process boost, maximum worker threads, minimum/maximum memory, index creation memory minimum memory query and you can set it to use AWE to allocate memory.
    If there’s more, it’s a bit difficult to find on the Internet.
    I didn’t find more in a SQL Server administration book we have a work either.

    There are more things on PostgreSQL and MySQL, but some seem to have become outdated/less effective over time.

    Slapo

    12 May 10 at 2:24 pm

  6. @Xaprb
    Could you expand upon the point about MyISAM being more sophisticated than Innodb? Anything other than multiple key caches?

    Rob Wultsch

    12 May 10 at 2:27 pm

  7. It’s apples and oranges, but for example it has had a midpoint LRU insertion strategy for a long time. It does prefix compression (and data compression — InnoDB only got that in the Plugin). It has full-text and geospatial indexes. It can build indexes by sort, also something InnoDB didn’t have until Plugin.

    Xaprb

    12 May 10 at 2:50 pm

  8. I used to be a MSSQL DBA and back then they claimed it was “self-tuning” rather than “optimal out of the box”. I think the former is more technically accurate, while the latter phrase was probably created by a marketing department (think they used “out of the box” thinking to come up with it?). It actually did work pretty well, adjusting page vs index cache, thread pool size, and so on as the work load varied. But it was reactive. So a couple of unusual queries could throw it a bit out of whack for a while and there was no way to keep it from doing that. We replicated parts of the data to different servers which allowed us to run an OLTP workload on one server and an OLAP workload on another so they could self-tune appropriately. I only had to use our premier support incidents once or twice to get the top secret registry setting (they did exist) to avoid some bad behaviors.

    No single configuration can be optimal for all work loads. So giving the DBA tools (i.e. configuration settings) to say “this is the work load for which I want to be optimized” and allowing them to deliberately sacrifice performance on the occasional odd query to work better on the expected load can be very helpful. The server won’t be able to fall into the reaction trap and have to constantly adjust to the thrashing the settings. OTOH the DBA has to know or easily be able to find out what each setting will actually do and what work loads will benefit from tuning. As Slapo points out the documentation on what each setting does needs to get better and keep more up to date — on ALL of the products. Your recent post on sort_buffer_size demonstrates that. In lieu of better information, people will think “Gee, I do lots of order bys. A bigger sort buffer must mean faster sorting. I think I’ll tune it…” to their detriment (and yes I did check our sort_buffer_size settings, and no we weren’t tuned up). Letting a system self-tune can be better in many cases than ignorant manual tuning.

    But to answer your question, if your marketing department came up with the claim, you point out that it was a specialized work load that doesn’t reflect real world usage for which your product is optimized. If the benchmarker or somebody else came up with it, you point out that they are obviously idiots who have no idea how to setup your product and besides, it was a specialized work load that doesn’t reflect real world usage for which your product is optimized with the default settings. Benchmarks are only good if they show your products are the best.

    John

    12 May 10 at 4:43 pm

  9. I know two people whose benchmarks I really respect. Actually, make that three. Maybe I’m forgetting someone.

    http://www.mysqlperformanceblog.com/
    http://dimitrik.free.fr/
    http://www.westnet.com/~gsmith/

    Everyone else is just out to prove something :-D

    Xaprb

    12 May 10 at 5:01 pm

  10. I would posit that anybody that is releasing benchmarks is out to prove something, including Percona (mysqlperformanceblog.com) and Sun/Oracle (Dimitri). ;)

    Mark Leith

    13 May 10 at 4:17 am

  11. Hence the smiley face! I forgot Mark Callaghan. He wants to prove stuff, too.

    Xaprb

    13 May 10 at 9:05 am

  12. Changing a PostgreSQL server to use checkpoint_timeout=1h and turning off autovacuum are both things I would consider cheating in a benchmark, and neither models practice you could deploy to a customer. I’m a little disappointed in what was done here. I’m sure there are similar things you could tune on SQL Server if you’re optimizing for benchmark performance rather than anything real-world.

    SQL Server does have some of the easiest to use tools for things like finding what indexes people should use, something they threw some serious research work into. So their claims of “self-tuning” have at least a bit of technical merit behind the marketing.

    PostgreSQL does a bit better without any tuning than some databases simply because it does expect and utilize the operating system buffer cache to be functional. It doesn’t use direct I/O or sync writes in the default setup to try and bypass the OS buffer cache. How well this works depends on the OS. If you’re using Solaris+UFS where that’s going to end up capped at 56MB of buffer cache unless you tune that, using the defaults is going to give you terrible PostgreSQL results. But on Linux or ZFS where disk caching is very aggressive out of the box, PostgreSQL can do just fine at its default settings sometimes.

    The simple PostgreSQL tuning wizard program I developed, pgtune, didn’t get anywhere until we decided that it was OK to split the types of workloads it could be expected to work out on and specifically biased toward and make those separate targets (Josh Berkus’s idea, just to credit properly). There’s ones for web apps, OLTP, DW, an “I’m not sure” mixed setting, and one for small desktops that’s still a bit better than the out of the box config.

    @Slapo: The only PostgreSQL tuning guide that has enough community members working on it to never fall out of date since its introduction is http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server , which averages an update about every three weeks. It will be updated to cover 9.0 before that version is even released.

    Greg Smith

    13 May 10 at 6:38 pm

Leave a Reply