How to monitor server load on GNU/Linux

This article introduces six methods and 12 tools for monitoring system load, performance and related information on GNU/Linux and similar systems. I’ve seen many articles that mention one or two of these tools, but none that discusses and compares all the ones I find useful.

Gkrellm

Gkrellm is the choice of the “g33k” types. It’s a graphical program that monitors all sorts of statistics and displays them as numbers and charts. You can see examples of it in use on nearly every GNU/Linux screenshot website. It is very flexible and capable, and can monitor useful as well as ridiculous things via plugins. It can monitor the status of a remote system, since it’s a client/server system.

The downsides, in my opinion, are

  1. the impact on the monitored system’s performance (sometimes significant)
  2. the flashiness and eye candy make it seem more meaningful than it might be
  3. it’s graphical, needs to run as a daemon, and isn’t installed by default, so it’s not optimal for monitoring a server

“Task Manager” clones

gnome-system-monitor is a graphical program installed as part of the base Gnome system. It is somewhat similar to the Task Manager in Microsoft Windows. It isn’t very full-featured, with only three tabs (Processes, Resources, Devices). The Devices tab just shows devices, Resources shows the history of CPU, memory, swap and network usage, and the Processes tab shows the processes. The Processes tab is the only one that really lets the user “do” anything, such as killing or re-nicing processes, or showing their memory maps.

Of course, this tool is only available on systems with Gnome installed, and requires an X server to be running. This makes it impractical for use on a server.

I know there’s a similar tool on KDE systems (KSysGuard), but I don’t have one handy to examine at the moment.

vmstat and related tools

vmstat is part of the base installation on most GNU/Linux systems. By default, it displays information about virtual memory, CPU usage, I/O, processes, and swap, and can print information about disks and more. It runs in a console. I find the command vmstat -n 5 very helpful for printing a running status display in a tabular format.

It’s great for figuring out how heavily loaded a system truly is, and what the problem (if any) is. For example, when I see a high number in the wa column (percent of CPU time spent waiting for I/O) on a database server, I know the system is I/O-bound. On older versions of vmstat, wa is the rightmost column; newer versions print more columns after it, such as st.
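Since the position of the I/O-wait column depends on the vmstat version, it can be safer to look the value up by its header name (wa) than to count columns. Here is a minimal Python sketch of that idea. The sample text is invented to resemble vmstat output, and the function names and the 30% threshold are my own assumptions, not part of vmstat.

```python
# Illustrative text in the shape of `vmstat -n 5` output (numbers are invented).
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 801432  91320 512000    0    0    12    30  110  220  3  1 90  6  0
 0  3      0 790000  91400 512100    0    0  4200   900  800 1500  5  4 31 60  0
"""


def parse_vmstat(output):
    """Return one dict per sample row, keyed by vmstat's column names."""
    lines = [line.split() for line in output.strip().splitlines()]
    # The column-name line is the one that contains both "r" and "wa".
    header_index = next(
        i for i, cols in enumerate(lines) if "r" in cols and "wa" in cols
    )
    names = lines[header_index]
    rows = []
    for cols in lines[header_index + 1:]:
        # Keep only purely numeric rows with the expected number of columns.
        if len(cols) == len(names) and all(c.isdigit() for c in cols):
            rows.append(dict(zip(names, (int(c) for c in cols))))
    return rows


def io_bound_samples(rows, threshold=30):
    """Return rows whose I/O-wait percentage exceeds the (assumed) threshold."""
    return [row for row in rows if row["wa"] > threshold]


rows = parse_vmstat(SAMPLE)
print([row["wa"] for row in rows])   # I/O-wait percentage for each sample
print(len(io_bound_samples(rows)))   # number of samples that look I/O-bound
```

To use this on a live system, you could capture the text of a real vmstat run and pass it to parse_vmstat instead of SAMPLE.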

iostat is part of the sysstat package on Gentoo, as are mpstat and sar. iostat prints statistics similar to vmstat’s, but breaks them down by device and goes into more detail about I/O usage. mpstat is a similar tool that prints processor statistics, and is multi-processor aware. sar collects, reports, and saves system activity information (for example, for later analysis).

All of these tools are very flexible and customizable. The user can choose what information to see and what format to see it in. These tools are not usually installed by default, except for vmstat.
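On Linux, the CPU percentages these tools report are derived from cumulative counters in /proc/stat: one line for all CPUs combined and one line per processor. The following Python sketch shows the arithmetic on two snapshots of a single "cpu" line. The sample lines are invented, and the function names are mine; this is an approximation of the idea, not the code vmstat or mpstat actually use.

```python
# Counter names in the order they appear on a "cpu" line of /proc/stat.
# Any fields after these (guest time) are ignored in this sketch.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Return (label, counters) for a line such as 'cpu0 1000 0 500 ...'."""
    parts = line.split()
    counters = dict(zip(FIELDS, (int(v) for v in parts[1:])))
    return parts[0], counters


def cpu_percentages(before, after):
    """Percentage of time spent in each state between two snapshots."""
    _, old = parse_cpu_line(before)
    _, new = parse_cpu_line(after)
    deltas = {name: new[name] - old[name] for name in new}
    total = sum(deltas.values()) or 1  # avoid dividing by zero
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Two invented snapshots of the same processor, taken some seconds apart.
BEFORE = "cpu0 1000 0 500 8000 400 0 100 0 0 0"
AFTER = "cpu0 1100 0 550 8300 900 0 150 0 0 0"

pct = cpu_percentages(BEFORE, AFTER)
print(round(pct["iowait"], 1), round(pct["idle"], 1))
```

Because the counters only ever grow, a single snapshot tells you little; it is the difference between two snapshots that gives the usage over that interval.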

top

top is the classic tool for monitoring any UNIX-like system. It runs in a terminal and refreshes at intervals, displaying a list of processes in a tabular format. Each column shows a statistic such as virtual memory size or processor usage. It is highly customizable and has some interactive features, such as re-nicing or killing processes. Since it’s the most widely known of the tools in this article, I won’t go into much detail, other than to say there’s a lot to know about it, so read the man page.

top is one of the programs in the procps package, along with ps, vmstat, w, kill, free, slabtop, and skill. All these tools are in a default installation on most distributions.

htop is similar to top, except it is mouse-aware, has a color display, and displays little charts to help see statistics at a glance. It also has some features top doesn’t have.

On a somewhat related note, mytop is a handy monitor for MySQL servers. Take a look at Jeremy Zawodny’s website while you’re there. He is a smart cookie.

tload

tload runs in a terminal and displays a text-only “graph” of current system load averages, garnered from /proc/loadavg. It is part of the base installation on most GNU/Linux systems. I find it extremely useful for watching a system’s performance over SSH, often within a GNU Screen session.
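For reference, /proc/loadavg is a single line of five fields: the 1-, 5- and 15-minute load averages, a runnable/total count of scheduling entities, and the most recently assigned process ID. Here is a small Python sketch that splits it into named values. The dictionary keys and function names are my own choices, and the sample line is invented.

```python
def parse_loadavg(text):
    """Split the single line of /proc/loadavg into named fields."""
    one, five, fifteen, entities, last_pid = text.split()
    runnable, total = entities.split("/")
    return {
        "load1": float(one),
        "load5": float(five),
        "load15": float(fifteen),
        "runnable": int(runnable),   # currently runnable scheduling entities
        "total": int(total),         # all scheduling entities on the system
        "last_pid": int(last_pid),   # most recently created process ID
    }


def read_loadavg(path="/proc/loadavg"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_loadavg(handle.read())


sample = parse_loadavg("0.42 0.36 0.30 2/187 12345")
print(sample["load1"], sample["load5"], sample["load15"])
```

If you only need the three averages, Python’s os.getloadavg() returns them directly on Unix-like systems.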

My favorite technique is to start a terminal, connect over SSH, resize the terminal to 150×80 or so, then start tload and shrink the window by CTRL-right-clicking and selecting “Unreadable” as the font size. That menu is an xterm feature, so this works in xterm but not in gnome-terminal and similar programs. The result looks like the following:

[Image: server load diagram]

I then set the terminal window as always-on-top and move it to a corner of my screen, where it prints a pretty little graph as time goes by.

The only trouble is, it’s not really obvious what the graph means. The man page isn’t terribly helpful; it just says tload gets its numbers from the /proc/loadavg file, and there’s no man page for that file. I looked in the kernel source for the answer.

Documentation/filesystems/proc.txt says loadavg is “Load average of last 1, 5 & 15 minutes,” but not how it’s calculated. Poking around in source/fs/proc/proc_misc.c and kernel/timer.c reveals the origin of the numbers: the kernel periodically counts the running and uninterruptible processes and folds that count into each of the three averages (see http://lxr.linux.no/source/kernel/timer.c#L832).
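To make the calculation concrete: each load average is an exponentially damped moving average. Roughly every five seconds the kernel counts the active (running plus uninterruptible) processes and blends that count into the old value, with a decay factor of e^(-5/60) for the 1-minute figure, e^(-5/300) for the 5-minute figure and e^(-5/900) for the 15-minute figure. The kernel does this in fixed-point arithmetic; the Python sketch below uses floating point instead, and its function names are my own, so treat it as a simplified model rather than the kernel’s code.

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between samples (approximately, in the kernel)


def decay_factor(window_seconds):
    """Weight given to the old average when a new sample arrives."""
    return math.exp(-SAMPLE_INTERVAL / window_seconds)


def update_load(load, active, window_seconds):
    """Blend one new count of active processes into the running average."""
    factor = decay_factor(window_seconds)
    return load * factor + active * (1.0 - factor)


def simulate(active_counts, window_seconds=60.0, load=0.0):
    """Apply update_load once per sample and return every intermediate value."""
    history = []
    for active in active_counts:
        load = update_load(load, active, window_seconds)
        history.append(load)
    return history


# An idle machine on which 2 processes become busy and stay busy for
# one minute (12 samples, 5 seconds apart).
history = simulate([2] * 12, window_seconds=60.0)
print(round(history[0], 2), round(history[-1], 2))
```

After one minute of two busy processes, the 1-minute figure has reached 2 × (1 − 1/e), about 1.26, not 2. That is why the load average lags behind sudden changes, and why the 5- and 15-minute figures move even more slowly.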

watch

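watch isn’t really a load-monitoring tool, but it’s beastly handy because it takes any command as input, reruns it at a fixed interval, and displays the latest output. For example, if I wanted to monitor when the “foozle” program is executing, I could run

watch --interval=5 "ps aux | grep foozle | grep -v xaprb"

The final grep -v xaprb drops any line containing “xaprb”, which keeps the watch and grep commands themselves out of the results when they run as a user of that name.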
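The idea behind watch is simple enough to sketch in a few lines of Python: run a command, show its output, sleep, and repeat. This is only an illustration of the loop, not how watch is implemented. Unlike watch, it does not clear the screen and it stops after a fixed number of runs; the runs parameter and the function name are my own assumptions.

```python
import subprocess
import time


def watch(command, interval=5.0, runs=3):
    """Run a shell command repeatedly and return the output of each run."""
    outputs = []
    for i in range(runs):
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True
        )
        outputs.append(result.stdout)
        # Print a timestamped header followed by the command's output.
        print(time.strftime("%H:%M:%S"), command)
        print(result.stdout, end="")
        if i < runs - 1:
            time.sleep(interval)
    return outputs


# A harmless demonstration command; substitute any shell pipeline.
outputs = watch("echo ok", interval=0, runs=2)
```

Passing the whole pipeline as one string with shell=True mirrors how watch hands its argument to a shell, which is what makes pipes inside the quoted command work.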

Summary

I’ve given an overview of lots of tools above. Each has its use. I’m not a big fan of graphical tools, and they’re not very practical for monitoring servers anyway. Therefore, I lean towards running tload over SSH to monitor systems, and use vmstat, iostat and friends to troubleshoot specific problems.

Do you have any favorite programs for monitoring and troubleshooting GNU/Linux systems that should be on this list? Leave a response!


12 Responses to “How to monitor server load on GNU/Linux”


  1. Xaprb

    Another tool I forgot to mention is lsof, which lists open files. Don’t be fooled by how simple that sounds! It’s tremendously powerful. Do some Google searches and you can find pages that give examples of how to figure out things you’d never think are possible to know just by looking at open files. Indeed, some of these things I can’t even think how else to do.

  2. Anders Liljeqvist

    Excellent article - thanks a lot!

    Any ideas on how to best get an idea of network load from the terminal? It would be neat to have a top-like terminal application that shows how much bandwidth my servers are using at any moment.

    Thanks again,

    -A.

  3. Jon

    “Any ideas on how to best get an idea of network load from the terminal?”

    You can use a tool called ‘iftop’, which displays stats for an interface (the ‘if’ in iftop) and shows which hosts are using the most bandwidth to/from your host.

  4. Tom

    Hi..I can’t get tload to look like your image. I tried scaling,
    but I couldn’t get it to work. I’m on FC4. Also, CTRL-right
    click does nothing to the window. What am I missing?

    Thanks

  5. Xaprb

    You’re probably not using xterm; you’re probably using gnome-terminal or similar. Try explicitly running xterm and see if that behaves as you want.

  6. Rashmi

    Thanks a lot!!!!!!!

  7. Gobo

    My personal preference when monitoring remotely is the Dstat tool. It shows a continuous status each second.

  8. Chris

    Some other handy tools for me are:
    ethstats - shows throughput of each ethernet card on console line (deb pkg: ethstats)

    iptraf - More in-depth throughput monitor, shows packet/data flow to each host currently connected

    netstat - another handy tool to see what’s connected (-p shows pid of connection, -t tcp only, -n numeric); see the man page for the rest.

  9. Balan

    Give me a simple, understandable sample calculation of load average for a Unix machine.
