Xaprb

iopp: a tool to print I/O operations per-process

Mark Wong’s entry titled “Following up a couple questions from the presentation at PSU on January 8, 2009” just caught my eye:

What is ‘iopp’?

It’s a custom tool to go through the Linux process table to get i/o statistics per process. It is open source and can be downloaded from:

http://git.postgresql.org/?p=~markwkm/iopp.git;a=summary

If you know me, you know I can’t pass up “I/O statistics per process.” No way. So, after a moment of browsing the code, which is short and to the point, I tried it out:

baron@kanga:~$ wget -q -O iopp.c "http://git.postgresql.org/?p=~markwkm/iopp.git;a=blob_plain;f=iopp.c;hb=HEAD"
baron@kanga:~$ gcc -o iopp iopp.c 
baron@kanga:~$ ./iopp --help
usage: iopp -h|--help
usage: iopp [-ci] [-k|-m] [delay [count]]
            -c, --command display full command line
            -h, --help display help
            -i, --idle hides idle processes
            -k, --kilobytes display data in kilobytes
            -m, --megabytes display data in megabytes

Sweet! Next,

baron@kanga:~$ ./iopp -i -k 5
  pid    rchar    wchar    syscr    syscw      rkb      wkb     cwkb command
 4912        2        1        0        0        0        0        0 dbus-daemon
 5713        0        1        0        0        0        0        0 hald
 5717       17        0        0        0        0        0        0 hald-runner
 5932        0        2        0        0        0        0        0 NetworkManager
22101       94       28        0        0        0        0        0 Xorg
22238        4        4        0        0        0        0        0 pulseaudio
22684       29       55        1        0        0        0        0 firefox
26860        0       43        0        0        0        0        0 gnome-terminal

It behaves just like vmstat — it loops every 5 seconds until I stop it.

So what are we looking at here? I don’t see any documentation, but I see from the source that it’s reading /proc/[PID]/io. Unfortunately that’s not documented in my proc manpage, but there’s a patch that provides documentation for the file’s contents.

According to that, the columns are: the pid; the number of kibibytes read and written through system calls, even if they came from the cache (rchar, wchar); the number of read and write system calls (syscr, syscw); the number of kibibytes actually read from and written to the physical medium, i.e. not just the OS cache (rkb, wkb); the canceled write kibibytes (cwkb); and the command name. I won’t repeat the documentation on the canceled write bytes. It is what it sounds like, but there’s a little bit more explanation on that patch I linked.
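To make the file format concrete, here is a minimal Python sketch (not iopp’s own code, which is C) that parses the text of /proc/[PID]/io into a dictionary. The field names are the ones the kernel writes to that file; the function name parse_proc_io and the sample numbers are invented for illustration.

```python
import os


def parse_proc_io(text):
    """Parse the contents of /proc/[PID]/io into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        # Each line looks like "rchar: 12345"
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[name.strip()] = int(value)
    return counters


# Sample in the format the kernel uses; the values are invented.
SAMPLE = """\
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0
"""

if __name__ == "__main__":
    path = "/proc/self/io"
    # Fall back to the sample on kernels without I/O accounting.
    if os.path.exists(path):
        with open(path) as f:
            text = f.read()
    else:
        text = SAMPLE
    for name, value in parse_proc_io(text).items():
        print(f"{name:>22} {value}")
```

Judging by the column order, iopp’s rkb, wkb and cwkb presumably correspond to read_bytes, write_bytes and cancelled_write_bytes, but that mapping is my reading of the output, not something I have checked line by line in the source.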

This tool would have been very handy to know about last week. One of my clients was seeing a lot of disk writes from a MySQL server, and it would have made it considerably easier to diagnose the problem.

There is one small bug: the -i flag causes idle processes not to be printed out, but it’s applied after bytes have been transformed into kibi/mebibytes, so any process whose counters round down to zero after that transformation gets filtered out, even if it did a little I/O. So you’ll get different output from -i -k than you will from -i or from -i -m. I’ll see if I can find the author’s email address and let him know about this…
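The ordering problem is easy to reproduce in a few lines. This is a Python sketch of the idea only, not iopp’s actual C code; the function names, the process table and the byte counts are all hypothetical, and it assumes the unit conversion is an integer division.

```python
KIB = 1024


def is_idle(counters):
    """A process counts as idle only if every counter is zero."""
    return all(value == 0 for value in counters.values())


def scale(counters, divisor):
    """Integer-divide each counter, as when converting bytes to KiB or MiB."""
    return {name: value // divisor for name, value in counters.items()}


def visible_buggy(procs, divisor):
    """Convert first, then filter: small non-zero values become 0 and vanish."""
    scaled = {pid: scale(c, divisor) for pid, c in procs.items()}
    return {pid: c for pid, c in scaled.items() if not is_idle(c)}


def visible_fixed(procs, divisor):
    """Filter on the raw byte counts first, then convert for display."""
    return {pid: scale(c, divisor) for pid, c in procs.items() if not is_idle(c)}


# Invented per-interval byte counts for three hypothetical processes.
PROCS = {
    100: {"read_bytes": 0, "write_bytes": 0},        # truly idle
    200: {"read_bytes": 512, "write_bytes": 0},      # active, but under 1 KiB
    300: {"read_bytes": 8192, "write_bytes": 4096},  # clearly active
}

print(sorted(visible_buggy(PROCS, KIB)))  # [300]
print(sorted(visible_fixed(PROCS, KIB)))  # [200, 300]
```

Process 200 read 512 bytes, which is 0 KiB after integer division, so the convert-then-filter order hides it. Testing for idleness on the raw bytes keeps the set of displayed processes the same regardless of the unit flag.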

Written by Xaprb

January 13th, 2009 at 8:16 pm

8 Responses to 'iopp: a tool to print I/O operations per-process'

  1. I might also point out the similar tool, iotop http://guichaz.free.fr/iotop/ which I’ve found quite useful in the past. Unfortunately these require a relatively recent kernel (2.6.20+ or equivalent patched, I think) for the I/O accounting functionality, so they don’t work on older Linux systems (for instance, RHEL 4 and 5).

    Andrew Garner

    13 Jan 09 at 9:11 pm

  2. [root@toro-host1 ~]# cat /proc/17304/io
    cat: /proc/17304/io: No such file or directory

    dang. :(

    I was so excited too. *kicks CentOS5*

    Don MacAskill

    13 Jan 09 at 9:42 pm

  3. If you’re on a system with DTrace you can get this and more (things like avg, min, max latency of those i/o ops and get it by table for PostgreSQL).

    Check out: https://labs.omniti.com/trac/pgsoltools

    in trunk/tools/pg_file_stress

    Enjoy!

  4. Newer versions of the “sysstat” package on most Linux distros should contain the program “pidstat”, which basically does the same thing. So it should be available as a binary package via the native distro tools.

    Ben Hicks

    14 Jan 09 at 1:19 pm

  5. You know what would really rock… I/O per query. :-)

    Boris Burtin

    14 Jan 09 at 9:09 pm

  6. I can’t tell whether you’re being sly! Percona’s patches provide that for MySQL. Tell me about Postgres, because I don’t know.

    Xaprb

    14 Jan 09 at 9:59 pm

  7. Ha ha. I’m kind of like the opposite of sly. I didn’t realize you had a patch for that. I’ll have to take a closer look now. Thanks for the tip!

    Boris Burtin

    15 Jan 09 at 1:14 pm

  8. Baron,

    Too bad most people run older Linux kernels. Newer ones have much better process accounting, and I/O is part of that.

    It would actually be very cool when/if per-thread I/O statistics are implemented. These could then be used to get per-query stats for MySQL easily.

    Peter Zaitsev

    18 Jan 09 at 2:01 pm
