Archive for the ‘GNU/Linux’ Category
How to read Linux’s /proc/diskstats easily
These days I spend more time looking at /proc/diskstats than I do at iostat. The problem with iostat is that it lumps reads and writes together, and I want to see them separately. That’s really important on a database server (e.g. MySQL performance analysis).
It’s not easy to read /proc/diskstats just by looking at it, though. So I usually do the following to get a nice readable table:
- Grep out the device I want to examine.
- Push that through “rel” from the Aspersa project.
- Add column headers, then format it with “align” from the same project.
Here’s a recipe. You might want to refer to the kernel’s iostats documentation (Documentation/iostats.txt in the kernel source) too.
wget http://aspersa.googlecode.com/svn/trunk/rel
wget http://aspersa.googlecode.com/svn/trunk/align
chmod +x rel align
# -w matches sdb1 as a whole word, so sdb10, sdb11, etc. are not picked up
while sleep 1; do grep -w sdb1 /proc/diskstats; done > stats
# CTRL-C after a while
echo m m dev reads rd_mrg rd_sectors ms_reading \
writes wr_mrg wr_sectors ms_writing cur_ios \
ms_doing_io ms_weighted | cat - stats | ./rel | ./align
m  m  dev     reads rd_mrg  rd_sectors ms_reading    writes wr_mrg  wr_sectors ms_writing cur_ios ms_doing_io ms_weighted
8 17 sdb1 233290130 310126 22472222903 2032292523 479678266 883257 35319718188 1491098806       0   675768709  3523591184
0  0 sdb0        80      0        2560       1049       236      1       13621        253       0         161        1302
0  0 sdb0       226      0        7232       1418       638      0       40156        235       0         224        1653
0  0 sdb0        39      0        1248        295       519      0       35440        573      17         196        1669
0  0 sdb0        73      0        2336       4031      2104      0      134480       3076     -17         908        6308
0  0 sdb0        33      0        1056        293         3      0         834          0       0         293         293
0  0 sdb0        35      0        1120        157         3      0         642          0       0         150         157
0  0 sdb0        36      0        1152        161         3      0         586          0       1         155         162
0  0 sdb0        36      0        1152        140         3      0         738          0       0         139         141
0  0 sdb0       208      0        6656       4514       630      0       40552       1002      -1         455        5514
0  0 sdb0        57      0        1824        547       485      0       35156        485      16         195        1566
0  0 sdb0        99      0        3168       4442      2067      0      133972       3491     -16         869        7403
0  0 sdb0        22      0         704        140        20      0        8801         12       0         144         152
0  0 sdb0        16      0         512         62         2      0         466          0       0          62          62
The first row holds the absolute counters; each later row is the change since the previous sample. rel appears to subtract every number it finds from the one in the row above, including the major and minor numbers and the digit in the device name, which is why those rows read 0 0 sdb0 and why cur_ios can go negative.
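If the Aspersa scripts are not at hand, the per-sample differences can be approximated in a few lines of Python. This is only a rough sketch, not a port of rel: the names COLUMNS, parse_line, and deltas are made up for illustration, it assumes every line belongs to the same device and carries all eleven counters, and unlike rel it reports cur_ios as the current value rather than a difference.

```python
# Counter names follow the column headers used in the recipe above.
COLUMNS = ("reads rd_mrg rd_sectors ms_reading writes wr_mrg wr_sectors "
           "ms_writing cur_ios ms_doing_io ms_weighted").split()

def parse_line(line):
    """Return (device, {column: value}) for one /proc/diskstats line."""
    fields = line.split()
    # fields[0] and fields[1] are the major and minor numbers,
    # fields[2] is the device name, and the counters follow.
    # zip() stops at the shorter input, so extra trailing columns are ignored.
    return fields[2], dict(zip(COLUMNS, map(int, fields[3:])))

def deltas(lines):
    """Yield (device, changes) for each consecutive pair of samples."""
    previous = None
    for line in lines:
        device, current = parse_line(line)
        if previous is not None:
            change = {c: current[c] - previous[c] for c in COLUMNS}
            # cur_ios is a gauge, not a running total, so report it as-is.
            change["cur_ios"] = current["cur_ios"]
            yield device, change
        previous = current

if __name__ == "__main__":
    # Two samples: the first row of the table above, and that row
    # with the second row's differences added back on.
    samples = [
        "8 17 sdb1 233290130 310126 22472222903 2032292523 479678266 883257 "
        "35319718188 1491098806 0 675768709 3523591184",
        "8 17 sdb1 233290210 310126 22472225463 2032293572 479678502 883258 "
        "35319731809 1491099059 0 675768870 3523592486",
    ]
    for device, change in deltas(samples):
        print(device, *(change[c] for c in COLUMNS))
```

To run it over the file collected by the recipe, pass the open file instead of the sample list, for example deltas(open("stats")).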
How Linux iostat computes its results
iostat is one of the most important tools for measuring disk performance, which of course is very relevant for database administrators, whether your chosen database is Postgres, MySQL, Oracle, or anything else that runs on GNU/Linux. Have you ever wondered where statistics like await (average wait for the request to complete) come from? If you look at the disk statistics the Linux kernel makes available through files such as /proc/diskstats, you won’t see await there. How does iostat compute await? For that matter, how does it compute the average queue size, service time, and utilization? This blog post will show you how that’s computed.
First, let’s look at the fields in /proc/diskstats. The order and location vary between kernels, but the following applies to 2.6 kernels. For reads and writes, the file contains the number of operations, number of operations merged because they were adjacent, number of sectors, and number of milliseconds spent. Those are available separately for reads and writes, although iostat groups them together in some cases. Additionally, you can find the number of operations in progress, total number of milliseconds during which I/Os were in progress, and the weighted number of milliseconds spent doing I/Os. Those are not available separately for reads and writes.
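As a rough illustration, the eleven fields can be given names in Python. The names DiskStats, parse_diskstats, and read_diskstats are invented for this sketch, and it assumes the 2.6-style layout described above: major number, minor number, device name, then the eleven counters. Lines with fewer columns are skipped, on the assumption that some kernels print a shortened form for partitions.

```python
from typing import Dict, NamedTuple

class DiskStats(NamedTuple):
    reads: int        # field 1: reads completed
    rd_mrg: int       # field 2: reads merged
    rd_sectors: int   # field 3: sectors read
    ms_reading: int   # field 4: milliseconds spent reading
    writes: int       # field 5: writes completed
    wr_mrg: int       # field 6: writes merged
    wr_sectors: int   # field 7: sectors written
    ms_writing: int   # field 8: milliseconds spent writing
    cur_ios: int      # field 9: I/Os currently in progress
    ms_doing_io: int  # field 10: milliseconds spent doing I/Os
    ms_weighted: int  # field 11: weighted milliseconds spent doing I/Os

def parse_diskstats(text: str) -> Dict[str, DiskStats]:
    """Map each device name to its counters."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # short or empty line: not the full statistic set
        # Skip major, minor, and name; ignore anything past the eleventh counter.
        stats[fields[2]] = DiskStats(*map(int, fields[3:14]))
    return stats

def read_diskstats(path: str = "/proc/diskstats") -> Dict[str, DiskStats]:
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_diskstats(handle.read())
```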
The last one is very important. The field showing the number of operations in progress is transient — it shows you the instantaneous value, and this “memoryless” property means you can’t use it to infer the number of I/O operations that are in progress on average. But the last field has memory, because it is defined as follows:
Field 11 -- weighted # of milliseconds spent doing I/Os
    This field is incremented at each I/O start, I/O completion, I/O merge, or read of these stats by the number of I/Os in progress (field 9) times the number of milliseconds spent doing I/O since the last update of this field. This can provide an easy measure of both I/O completion time and the backlog that may be accumulating.
So the field indicates the total number of milliseconds that all requests have been in progress. If two requests have been waiting 100ms, then 200ms is added to the field. And thus it records what happened over the duration of the sampling interval, not just what’s happening at the instant you look at the file. We’ll come back to that later.
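A toy model makes the bookkeeping concrete. This is a simplified sketch of the rule quoted above, not kernel code: the function weighted_ms and its event format are assumptions made for illustration, and it only updates on I/O start and completion.

```python
def weighted_ms(events, end_ms):
    """Accumulate (I/Os in progress) x (milliseconds elapsed).

    events is a time-ordered list of (time_ms, change) pairs, where change
    is +1 when a request starts and -1 when it completes. end_ms is the
    moment the statistics are read.
    """
    in_flight = 0
    total = 0
    last_update = 0
    for time_ms, change in events:
        # Charge the time since the last update to every request in progress.
        total += in_flight * (time_ms - last_update)
        in_flight += change
        last_update = time_ms
    # Reading the stats also triggers an update.
    total += in_flight * (end_ms - last_update)
    return total

if __name__ == "__main__":
    # Two requests in progress for the same 100 ms.
    both = [(0, +1), (0, +1), (100, -1), (100, -1)]
    print(weighted_ms(both, 100))        # 200

    # One request from 0 to 100 ms, a second from 50 to 100 ms,
    # sampled over a 200 ms window.
    staggered = [(0, +1), (50, +1), (100, -1), (100, -1)]
    total = weighted_ms(staggered, 200)
    print(total, total / 200)            # 150 0.75
```

Dividing the accumulated total by the length of the window gives the average number of requests in progress over that window, which is the property the instantaneous field lacks.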
Now, given two samples of I/O statistics and the time elapsed between them, we can easily compute everything iostat outputs in -dx mode. I’ll take them slightly out of order to reflect how the computations are done internally.
- rrqm/s is merely the number of read requests merged during the interval, divided by the number of seconds elapsed.
- wrqm/s is similarly simple, and r/s, w/s, rsec/s, and wsec/s are too.
- avgrq-sz is the number of sectors read and written during the interval, divided by the number of I/O operations (reads plus writes) completed.
- avgqu-sz is computed from the last field in the file — the one that has “memory” — divided by the milliseconds elapsed. Hence the units cancel out and you just get the average number of operations in progress during the time period. The name (short for “average queue size”) is a little bit ambiguous. This value doesn’t show how many operations were queued but not yet being serviced — it shows how many were either in the queue waiting, or being serviced. The exact wording of the kernel documentation is “…as requests are given to appropriate struct request_queue and decremented as they finish.”
- %util is the total time spent doing I/Os, divided by the sampling interval. This tells you how much of the time the device was busy, but it doesn’t really tell you whether it’s reaching its limit of throughput, because the device could be backed by many disks and hence capable of multiple I/O operations simultaneously.
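- await is the time spent on reads plus the time spent on writes during the interval, divided by the number of I/O operations completed. The result is in milliseconds per operation.
- svctm is the most complex to derive. It is the utilization divided by the throughput. You saw utilization above; the throughput is the number of I/O operations completed per unit of time during the interval. The units work out to busy time per operation.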
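The list above can be condensed into a short Python sketch. It follows the descriptions in this post rather than iostat's source, so treat it as an approximation: the function name iostat_metrics and the counter names (borrowed from the column headers in the earlier recipe) are assumptions, and the caller has to supply the elapsed time between samples.

```python
def iostat_metrics(prev, curr, elapsed_s):
    """Derive iostat -dx style figures from two samples of one device.

    prev and curr map counter names to cumulative values; elapsed_s is
    the number of seconds between the two samples.
    """
    counters = ("reads", "rd_mrg", "rd_sectors", "ms_reading", "writes",
                "wr_mrg", "wr_sectors", "ms_writing", "ms_doing_io",
                "ms_weighted")
    d = {name: curr[name] - prev[name] for name in counters}
    elapsed_ms = elapsed_s * 1000.0
    ios = d["reads"] + d["writes"]  # operations completed in the interval

    def per_io(total):
        # Guard against an idle interval with no completed operations.
        return total / ios if ios else 0.0

    return {
        "rrqm/s": d["rd_mrg"] / elapsed_s,
        "wrqm/s": d["wr_mrg"] / elapsed_s,
        "r/s": d["reads"] / elapsed_s,
        "w/s": d["writes"] / elapsed_s,
        "rsec/s": d["rd_sectors"] / elapsed_s,
        "wsec/s": d["wr_sectors"] / elapsed_s,
        "avgrq-sz": per_io(d["rd_sectors"] + d["wr_sectors"]),
        # Weighted milliseconds over elapsed milliseconds: units cancel.
        "avgqu-sz": d["ms_weighted"] / elapsed_ms,
        "await": per_io(d["ms_reading"] + d["ms_writing"]),
        # Utilization / throughput = (busy_ms / elapsed_ms) / (ios / elapsed_ms),
        # which simplifies to busy milliseconds per operation.
        "svctm": per_io(d["ms_doing_io"]),
        "%util": 100.0 * d["ms_doing_io"] / elapsed_ms,
    }

if __name__ == "__main__":
    names = ("reads", "rd_mrg", "rd_sectors", "ms_reading", "writes",
             "wr_mrg", "wr_sectors", "ms_writing", "ms_doing_io", "ms_weighted")
    before = dict.fromkeys(names, 0)
    # Made-up differences for a one-second interval.
    after = dict(zip(names, (80, 0, 2560, 1049, 236, 1, 13621, 253, 161, 1302)))
    for key, value in iostat_metrics(before, after, 1.0).items():
        print(f"{key:>9} {value:10.2f}")
```

Note that svctm never needs the elapsed time once the algebra is simplified, which is one way to see that it is only an average of busy time per operation and not a measured service time.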
Although the computations and their results seem both simple and cryptic, it turns out that you can derive a lot of information from the relationship between these various numbers. This is one of those tools where a few lines of code have a surprising amount of meaning, which is left for the reader to understand. I’ll get more into that in the future.
Switching from Ubuntu to Fedora, and Thunderbird to Claws Mail
This weekend I backed up all my data, repartitioned my hard drive, and re-installed. I needed to do this because the only thing I had on the laptop was Ubuntu, and sadly, the reality is that sometimes my clients use things that require me to use Windows (and sometimes a virtual machine won’t solve that). So now I’m dual-booting Windows. I think the last time I did that was sometime before 2001, so I’ve regressed 9 years.
I took this opportunity to switch from Ubuntu to Fedora. Why? They both released new versions almost at the same time, and I grabbed the live CDs and noticed that Fedora just worked better: better support for dual monitors, for example. And some things about Ubuntu have always irked me, such as “sorry, can’t play WAV file, that’s a proprietary codec, you must install some big package of proprietary codecs.” WHAT? When did a codec become necessary to play a PCM file? Codec stands for “coder/decoder” and that doesn’t make sense for PCM. Anyway, these are small things, but sometimes they add up. I did not do a default install of Fedora. I don’t trust ext4, and ext3 works fine for me, so I stayed with that. I also disabled SELinux right away. No thanks, *shudder*.
The much bigger switch was ditching Thunderbird in favor of Claws. I last used Claws back in… 2003, maybe? It was called Sylpheed Claws then, and was GTK1 or so. Now it’s much nicer looking. Anyway, Fedora installed Thunderbird 3, and after giving that a small test drive I decided that my long-standing love/hate relationship with Thunderbird was due for a change. I just need a mail client that can open a yes/no dialog in less than 1.5 seconds. Is that too much to ask? I’m much happier with Claws. I use email a lot, probably something like 350 emails per day, and I’ve already found Claws more capable than Thunderbird in every way but one: I can’t quite figure out how to get the functionality I got from Thunderbird’s Quicktext extension. Everything else is amazing: I don’t need extensions, because everything’s built in by default. I use filters extensively, and the filtering and searching in Claws are much nicer than in Thunderbird. A handful of other things stand out, too.
One notable thing is an archiving feature. I like to move emails to a folder after I’m done with them. I had sort of a hack in Thunderbird. The key combination Ctrl+Shift+M moves the selected message(s) to the same folder used for the last move operation. This worked acceptably well, until I moved a message elsewhere and forgot that my “archive” key combo no longer sent my messages to the archive. In Claws, I set up a custom action and attached it to the Y key, and voilà, I have real archiving functionality (without pressing 3 keys, too). I also remapped keys so it’s more vim-like. Gmail is the client I use for my personal accounts, so now I have consistent keystroke commands across both email clients and my favorite text editor.
Another notable thing is that when I send an email, the sending process happens in the background. This is so much nicer. In Thunderbird, I’d press Ctrl+Enter and then have to Alt-Tab my way past the sending dialog and the compose window, back to the main window to keep working; in Claws, I press Ctrl+Enter to send, and I’m immediately back at the main window. This might seem silly, but it’s actually a big deal. It helps me process email quite a bit faster.





