How to read Linux’s /proc/diskstats easily
These days I spend more time looking at /proc/diskstats than I do at iostat. The problem with iostat is that it lumps reads and writes together, and I want to see them separately. That’s really important on a database server (e.g. MySQL performance analysis).
It’s not easy to read /proc/diskstats by looking at them, though. So I usually do the following to get a nice readable table:
- Grep out the device I want to examine.
- Push that through “rel” from the Aspersa project.
- Add column headers, then format it with “align” from the same project.
Here’s a recipe. You might want to refer to the kernel iostat documentation too.
wget http://aspersa.googlecode.com/svn/trunk/rel
wget http://aspersa.googlecode.com/svn/trunk/align
chmod +x rel align
while sleep 1; do grep sdb1 /proc/diskstats; done > stats
# CTRL-C after a while
echo m m dev reads rd_mrg rd_sectors ms_reading \
writes wr_mrg wr_sectors ms_writing cur_ios \
ms_doing_io ms_weighted | cat - stats | ./rel | ./align
m m dev reads rd_mrg rd_sectors ms_reading writes wr_mrg wr_sectors ms_writing cur_ios ms_doing_io ms_weighted
8 17 sdb1 233290130 310126 22472222903 2032292523 479678266 883257 35319718188 1491098806 0 675768709 3523591184
0 0 sdb0 80 0 2560 1049 236 1 13621 253 0 161 1302
0 0 sdb0 226 0 7232 1418 638 0 40156 235 0 224 1653
0 0 sdb0 39 0 1248 295 519 0 35440 573 17 196 1669
0 0 sdb0 73 0 2336 4031 2104 0 134480 3076 -17 908 6308
0 0 sdb0 33 0 1056 293 3 0 834 0 0 293 293
0 0 sdb0 35 0 1120 157 3 0 642 0 0 150 157
0 0 sdb0 36 0 1152 161 3 0 586 0 1 155 162
0 0 sdb0 36 0 1152 140 3 0 738 0 0 139 141
0 0 sdb0 208 0 6656 4514 630 0 40552 1002 -1 455 5514
0 0 sdb0 57 0 1824 547 485 0 35156 485 16 195 1566
0 0 sdb0 99 0 3168 4442 2067 0 133972 3491 -16 869 7403
0 0 sdb0 22 0 704 140 20 0 8801 12 0 144 152
0 0 sdb0 16 0 512 62 2 0 466 0 0 62 62



Baron, handy tools, thank you for sharing :) I’m using your previous post (http://www.xaprb.com/blog/2010/01/09/how-linux-iostat-computes-its-results/) as a reference for interpreting this output.
In your example here, what does it mean when (ms_read + ms_write) > ms_weighted, and what does the opposite mean? I see cases of both — line 3, ms_weighted is almost 2x higher; third line from bottom, it is about 15% lower.
Also, would I be correct to say that ms_reading|writing are sum of per-thread stats (similar to ms_weighted), such that they are often higher than elapsed wall-clock time? If so, is the relationship between ms_doing_io (wall-clock time) and ms_weighted (per-thread) approximately indicative of the amount of IO parallelization happening?
Devananda
14 May 10 at 4:04 pm
I’m not sure it really helps to think of it in terms of per-thread.
From the iostats documentation, you can see that ms_read + ms_write is “the total number of milliseconds spent by all reads (as measured from __make_request() to end_that_request_last()).” This doesn’t get incremented until the read completes. However, ms_weighted increases as the reads are in progress. Various scenarios can lead to the numbers being bigger or smaller. Remember also that the numbers in my sample are all made relative to the previous line.
Milliseconds spent reading and writing are indeed the sum of all requests, so they are higher than wall-clock time if there is parallelism. And you can apply Little’s Law to these numbers to compute the parallelism and a bunch of other stuff. I think I have a draft blog post (part 2 of my previous post you referenced) on that.
Xaprb
14 May 10 at 5:09 pm
In my 2nd paragraph above, I used “read” as a synonym for “request” which I think confuses things more. Sorry.
Xaprb
14 May 10 at 5:13 pm
Hi,
rel doesn’t handle (‘s and )’s correctly at the moment. I’ve patched it for my purposed like such:
*** 73,78 ****
— 73,79 —-
}
if ( !$matched ) {
my $new_pat = $line;
+ $new_pat =~ s/([()*+{}|])/\\$1/g;
$new_pat =~ s/\d+/(\\d+)/g;
my @vals = $line =~ m/$new_pat/;
MKDEBUG && print “new pattern: $new_pat: @vals\n”;
There are probably some more special chars that don’t match themselves that need adding to the list. You may wish to do a better job and fix in source control. And delete/ignore this comment.
GBA
25 Mar 12 at 9:42 pm
Hi,
did you notice errors in diskstats for multipath devices? there are random writes stats for the dm device although there are only reads being done, as can be seen on the sdd and sdk devices….
36000d77d00008bbe6253d72a59a2d4b3 dm-2 FALCON,IPSTOR DISK
size=14T features=’1 queue_if_no_path’ hwhandler=’0′ wp=rw
`-+- policy=’round-robin 0′ prio=1 status=active
|- 6:0:3:22 sdd 8:48 active ready running
`- 7:0:0:22 sdk 8:160 active ready running
idlnxi730:~ # !1062
while sleep 1; do grep ‘sdd\|sdk\|dm-2′ /proc/diskstats; done > stats
^C
idlnxi730:~ #
idlnxi730:~ # !1069
echo m m dev reads rd_mrg rd_sectors ms_reading writes wr_mrg wr_sectors ms_writing cur_ios ms_doing_io ms_weighted | cat – stats | ./align
m m dev reads rd_mrg rd_sectors ms_reading writes wr_mrg wr_sectors ms_writing cur_ios ms_doing_io ms_weighted
8 48 sdd 373794 131 193547858 1772224 314299 0 239332928 6987912 0 1266628 8759344
8 160 sdk 373681 131 193579032 1935656 314344 0 242491352 7945516 5 1386768 9880364
253 2 dm-2 747335 5635 387120530 10986020 628643 59598690 481824280 168243040 5 2307788 179228592
8 48 sdd 374167 131 193552826 1774944 314299 0 239332928 6987912 5 1267204 8762084
8 160 sdk 374003 131 193581792 1937972 314344 0 242491352 7945516 0 1387256 9882664
253 2 dm-2 748030 5635 387128258 10991060 628643 59598690 481824280 168243040 5 2308792 179233632
8 48 sdd 374494 131 193556530 1777300 314299 0 239332928 6987912 0 1267700 8764420
8 160 sdk 374373 131 193586008 1940652 314344 0 242491352 7945516 5 1387820 9885356
253 2 dm-2 748727 5635 387136178 10996100 628643 59598690 481824280 168243040 5 2309796 179238664
8 48 sdd 374823 131 193560418 1779952 314299 0 239332928 6987912 5 1268264 8767084
8 160 sdk 374703 131 193589720 1943060 314344 0 242491352 7945516 0 1388332 9887752
253 2 dm-2 749386 5635 387143778 11001160 628643 59598690 481824280 168243040 5 2310800 179243724
8 48 sdd 375194 131 193565242 1782652 314299 0 239332928 6987912 0 1268832 8769772
8 160 sdk 375033 131 193593048 1945428 314344 0 242491352 7945516 5 1388844 9890156
253 2 dm-2 750087 5635 387151930 11006228 628643 59598690 481824280 168243040 5 2311804 179248816
8 48 sdd 375494 131 193569114 1784904 314299 0 239332928 6987912 3 1269304 8772032
8 160 sdk 375401 131 193596544 1948256 314344 0 242491352 7945516 2 1389440 9892964
253 2 dm-2 750755 5635 387159298 11011308 628643 59598690 481824280 168243040 5 2312804 179253884
8 48 sdd 375894 131 193573482 1787784 314299 0 239332928 6987912 0 1269916 8774904
8 160 sdk 375717 131 193600208 1950448 314344 0 242491352 7945516 5 1389904 9895152
253 2 dm-2 751471 5635 387167330 11016380 628643 59598690 481824280 168243040 5 2313808 179258944
8 48 sdd 376194 131 193576770 1790044 314299 0 239332928 6987912 0 1270400 8777164
8 160 sdk 376087 131 193604024 1953216 314344 0 242491352 7945516 5 1390492 9897936
253 2 dm-2 752141 5635 387174434 11021408 628643 59598690 481824280 168243040 5 2314812 179263988
8 48 sdd 376494 131 193579874 1792800 314299 0 239332928 6987912 0 1270992 8779920
8 160 sdk 376356 131 193607064 1955512 314344 0 242491352 7945516 5 1390984 9900228
aart
13 Mar 13 at 5:18 pm