Xaprb

Stay curious!

Archive for the ‘iostat’ tag

How to read Linux’s /proc/diskstats easily

These days I spend more time looking at /proc/diskstats than I do at iostat. The problem with iostat is that it lumps reads and writes together, and I want to see them separately. That’s really important on a database server (e.g. MySQL performance analysis).

The raw contents of /proc/diskstats are not easy to read by eye, though. So I usually do the following to get a nice readable table:

  • Grep out the device I want to examine.
  • Push that through “rel” from the Aspersa project.
  • Add column headers, then format it with “align” from the same project.

Here’s a recipe. You might want to refer to the kernel iostat documentation too.

wget http://aspersa.googlecode.com/svn/trunk/rel
wget http://aspersa.googlecode.com/svn/trunk/align
chmod +x rel align
# -w matches sdb1 as a whole word, so sdb10 and friends are not picked up
while sleep 1; do grep -w sdb1 /proc/diskstats; done > stats
# Press CTRL-C after a while
echo m m dev  reads   rd_mrg rd_sectors ms_reading \
  writes   wr_mrg     wr_sectors  ms_writing  cur_ios \
  ms_doing_io ms_weighted | cat - stats | ./rel | ./align
m  m dev      reads rd_mrg  rd_sectors ms_reading    writes wr_mrg  wr_sectors ms_writing cur_ios ms_doing_io ms_weighted
8 17 sdb1 233290130 310126 22472222903 2032292523 479678266 883257 35319718188 1491098806       0   675768709  3523591184
0  0 sdb0        80      0        2560       1049       236      1       13621        253       0         161        1302
0  0 sdb0       226      0        7232       1418       638      0       40156        235       0         224        1653
0  0 sdb0        39      0        1248        295       519      0       35440        573      17         196        1669
0  0 sdb0        73      0        2336       4031      2104      0      134480       3076     -17         908        6308
0  0 sdb0        33      0        1056        293         3      0         834          0       0         293         293
0  0 sdb0        35      0        1120        157         3      0         642          0       0         150         157
0  0 sdb0        36      0        1152        161         3      0         586          0       1         155         162
0  0 sdb0        36      0        1152        140         3      0         738          0       0         139         141
0  0 sdb0       208      0        6656       4514       630      0       40552       1002      -1         455        5514
0  0 sdb0        57      0        1824        547       485      0       35156        485      16         195        1566
0  0 sdb0        99      0        3168       4442      2067      0      133972       3491     -16         869        7403
0  0 sdb0        22      0         704        140        20      0        8801         12       0         144         152
0  0 sdb0        16      0         512         62         2      0         466          0       0          62          62
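
If rel and align are not at hand, the same idea can be sketched with awk alone. This is only a rough stand-in, not the Aspersa tools themselves: the function name diskstats_deltas is made up for illustration, and it assumes the classic 14-field /proc/diskstats layout shown above (newer kernels append extra fields, which this ignores). Unlike rel, it uses the first sample only as a baseline, keeps the device name intact, and prints cur_ios as a raw value, because that field is a gauge rather than an ever-increasing counter.

```shell
# diskstats_deltas FILE
# FILE holds one /proc/diskstats line per sample for a single device.
# Prints the device name followed by the change in each counter since
# the previous sample. (Hypothetical helper, assuming 14 fields per line.)
diskstats_deltas() {
  awk '
    NR > 1 {
      # Fields 1-3 are major, minor, and device name; keep only the name.
      line = $3
      for (i = 4; i <= 14; i++) {
        # Field 12 (I/Os currently in progress) is a gauge: print it as-is.
        line = line " " (i == 12 ? $i : $i - prev[i])
      }
      print line
    }
    # Remember this sample so the next line can be diffed against it.
    { for (i = 4; i <= 14; i++) prev[i] = $i }
  ' "$1"
}
```

Something like `diskstats_deltas stats | column -t` would then give an aligned table, assuming the column utility is installed.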

Written by Xaprb

May 14th, 2010 at 3:37 pm

Posted in GNU/Linux,SQL,Sys Admin,Tools

I want simple things to be easy

I like to write tools that make hard things easy, when possible. By and large, MySQL is easy and simple. But some simple things are too hard with MySQL. I want to change that, at least for the things that matter the most to me, and which I think I know how to fix.

I will probably write a lot about this. I have already written a number of rants (or rather, blog posts) about the lack of instrumentation in MySQL, and that is where I’ll probably continue to put most of my energy.

To begin with, imagine this simple scenario. You are a remote DBA. Your client says “New Relic is showing periods of slow response time from the database.” You connect to MySQL at the command line and try to troubleshoot. How do you catch the problem in action, from within the database itself? The following are no good:

  • It doesn’t count to see the problem two minutes later by observing the application tier, as New Relic does. That’s too late, and it’s not from within the database.
  • It’s also not good enough to catch the problem just after the slow queries finish executing, which you can do from the slow query log if you have a realtime log analyzer running; that is a) also after the fact, and b) again not possible from within the database itself.
  • Watching TCP traffic or using a proxy is similarly off limits.

This is a database, a complex piece of software that is comparable to an operating system in some respects. It should be possible to know that response time is spiking while the slow queries are still executing, before they finish and return their results to the application. This is a simple question that should be easy to answer.

It’s not currently possible inside MySQL, because MySQL doesn’t tell you how much time queries spend executing. It’s that simple. Fortunately the fix is equally simple: measure how long queries spend executing.

We can look at the lowly iostat (or /proc/diskstats in Linux) for an example of how this can be solved. We need a counter that shows the sum of execution time, including currently executing queries. I saw that the Facebook patch adds counters similar to this. Search for “Query timing” on that page. I want something slightly more complex, the way Linux’s IO counters work, because they add memory. Simple is really beautiful; a counter that has memory is an amazing thing, and you can apply Little’s Law to derive a surprising amount of information from it.
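
To make the Little’s Law point concrete, here is a minimal sketch of the arithmetic on the Linux disk counters. The function name littles_law and its argument order are hypothetical, the inputs are differences between two /proc/diskstats samples, and the sampling interval is something you have to measure or assume (a `sleep 1` loop is only approximately 1000 ms). The weighted-time counter divided by the interval gives average concurrency, and dividing it by the number of I/Os gives average response time, which is Little’s Law (concurrency = throughput × response time) rearranged.

```shell
# littles_law READS WRITES MS_WEIGHTED MS_DOING_IO INTERVAL_MS
# All arguments except INTERVAL_MS are deltas between two samples.
# (Hypothetical helper for illustration only.)
littles_law() {
  awk -v reads="$1" -v writes="$2" -v weighted="$3" \
      -v busy="$4" -v interval="$5" '
    BEGIN {
      ios = reads + writes
      # Throughput: I/Os completed per second.
      throughput = ios / (interval / 1000)
      # Average number of requests in flight during the interval.
      concurrency = weighted / interval
      # Little: response time = concurrency / throughput = weighted / ios.
      response = (ios > 0) ? weighted / ios : 0
      # Share of the interval in which the device had any I/O in progress.
      utilization = 100 * busy / interval
      printf "throughput_per_s=%.1f avg_concurrency=%.3f avg_response_ms=%.2f utilization_pct=%.1f\n",
        throughput, concurrency, response, utilization
    }'
}
```

For example, a one-second sample with 80 reads, 236 writes, 1302 ms of weighted time and 161 ms of busy time:

```shell
littles_law 80 236 1302 161 1000
# throughput_per_s=316.0 avg_concurrency=1.302 avg_response_ms=4.12 utilization_pct=16.1
```

A pair of query counters in MySQL built the same way (a count of queries plus a weighted sum of execution time that includes in-flight queries) would support the same arithmetic.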

I might change my opinion about some of the more complex things that are being added to MySQL; much smarter people have, and I’m probably a hold-out because I’m not that smart. But I still see the need for simple combinations of counters and timers for essential metrics, which do not need complex queries or tools to use. I have a few more things on my wishlist, which I’ll write about later.

Written by Xaprb

May 13th, 2010 at 9:45 am

Version 1.1.6 of Better Cacti Templates released

I’ve released version 1.1.6 of the Better Cacti Templates project. This release includes a bunch of bug fixes (but not all of them!) and two new sets of graphs. One set is for disk I/O on GNU/Linux, and the other is a new set of templates for OpenVZ. I’m looking for feedback on both of those. This release also has a bunch of code-level features: much better test coverage (hooray!), and a refactored ss_get_by_ssh.php that makes it much easier to create new graphs and templates. The SSH-based templates also take advantage of the same caching as the MySQL templates, which makes them a lot more efficient.

There are upgrade instructions on the project wiki for this and all releases. There is also a comprehensive tutorial on how to create your own graphs and templates with this project. Use the project issue tracker to view and report issues, and use the project mailing list to discuss the templates and scripts.

The full changelog follows.

2010-01-10: version 1.1.6

  * Added OpenVZ graphs (--type openvz) (issue 95).
  * Added IO usage graphs (--type diskstats) (issue 97).
  * Added extra error-reporting (issue 110).
  * The $debug and $debug_log options couldn't be set in the .cnf file (issue 110).
  * Added a --use-ssh option to ss_get_by_ssh.php (issue 66).
  * Added a debugging log to ss_get_by_ssh.php (issue 54).
  * Enabled caching of results in ss_get_by_ssh.php (issue 46).
  * Added a test suite for ss_get_by_ssh.php (issue 110).
  * The 'free' stats suffered from PHP's issues with big numbers (issue 102).
  * There was ambiguity (but no error) in SHOW STATUS overrides (issue 106).
  * It was hard to debug failures caused by missing ext/mysql (issue 105).
  * Code to make ss_get_mysql_stats.php testable was broken (issue 108).

Written by Xaprb

January 10th, 2010 at 11:01 am

Posted in PHP,SQL,Sys Admin
