Tag Archive for 'monitoring'

Improved Cacti monitoring templates for MySQL

Download MySQL Cacti templates

As promised, I’ve created some improved software for monitoring MySQL via Cacti. I began using the de facto MySQL Cacti templates a while ago, but found some things I needed to improve about them. As time passed, I rewrote everything from scratch. The resulting templates are much improved.

You can grab the templates by browsing the source repository on the project’s homepage.

In no particular order, here are some things I improved:

  • Standard polling interval and graph size by default.
  • Full captions on every graph; you don’t have to guess at how big the values are. Each graph has current, max, and average values printed at the bottom for every value on it.
  • Much more data is captured. I’ve graphed almost everything I could think of.
  • The graphs are grouped better. Most graphs have only related values. There are some exceptions, but not many.
  • The templates don’t hijack your existing installation. They don’t depend on or alter anything in your default Cacti installation.
  • The script that gathers the data is rewritten from scratch and much improved. For example, the math works on 32-bit systems. It has caching built in, so each poll cycle results in just one request to the server instead of one request per graph. (This is a weakness of Cacti I’m trying to work around.) It also has debugging aids and other good coding stuff.
  • By default, it assumes you have the same username and password across every server you’re monitoring, so you don’t have to fill in a username and password for every single graph you create.
  • One data template == one graph template. This helps work around another Cacti limitation.
  • Lots more. Honestly I can’t really remember everything I’ve done. I’m sure you’ll help me remember by asking me how to get X feature working the way you want, and I’ll go “oh, yeah, that’s another thing I improved…”
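
The caching idea mentioned in the list above can be sketched roughly as follows. This is a minimal Python illustration, not the script that ships with the templates: the cache file layout, the `fetch_status` callback, and the `max_age` parameter are all assumptions made up for the example.

```python
import json
import os
import tempfile


def cached_poll(cache_path, fetch_status, max_age, now):
    """Return server statistics, asking the server at most once per max_age seconds.

    cache_path   -- file holding the last result (hypothetical JSON layout)
    fetch_status -- callable that queries the server and returns a dict
    max_age      -- how long a cached result stays fresh, in seconds
    now          -- current time as a Unix timestamp
    """
    try:
        with open(cache_path) as handle:
            cached = json.load(handle)
        if now - cached["fetched_at"] < max_age:
            return cached["stats"]  # Fresh enough: no request to the server.
    except (OSError, ValueError, KeyError, TypeError):
        pass  # Missing or unreadable cache: fall through and refetch.

    stats = fetch_status()
    # Write to a temporary file and rename it into place, so a concurrent
    # reader never sees a half-written cache.
    directory = os.path.dirname(cache_path) or "."
    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as tmp:
        json.dump({"fetched_at": now, "stats": stats}, tmp)
    os.replace(tmp.name, cache_path)
    return stats
```

With something like this, every graph polled in the same cycle calls `cached_poll` with the same cache file, and only the first call actually talks to the server.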

Cacti templates are very laborious to create if they’re complex at all; it takes a long time and is very error-prone. Instead of doing it through Cacti’s web interface and exporting a huge XML file, I eliminated the redundancies and created a small, easy-to-maintain file from which I generate the XML template with a Perl script. This gives the added benefit of letting me (or you) generate templates with different parameters such as polling interval or graph size. The README file has the full details. However, I’ve pre-generated a set of templates that matches Cacti’s defaults, so you can probably just use that.
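
To illustrate the generate-from-a-compact-definition idea, here is a rough Python sketch. The real generator is a Perl script and its output follows Cacti’s own template schema; the dictionary layout, the item names, and the XML element names below are invented for the example and are not Cacti’s.

```python
import xml.etree.ElementTree as ET

# Hypothetical compact definition: shared settings are stated once,
# and each graph lists only what is unique to it.
DEFINITION = {
    "poll_interval": 300,
    "graph_width": 500,
    "graph_height": 120,
    "graphs": [
        {"name": "InnoDB I/O Activity", "items": ["file_reads", "file_writes"]},
        {"name": "MySQL Command Counters", "items": ["Com_select", "Com_insert"]},
    ],
}


def generate_templates(definition, **overrides):
    """Expand the compact definition into a verbose XML document.

    Keyword overrides (for example poll_interval=60) replace the shared
    settings, which is how one definition can yield several template sets.
    """
    settings = {k: v for k, v in definition.items() if k != "graphs"}
    settings.update(overrides)
    root = ET.Element("templates")  # Illustrative element names only.
    for graph in definition["graphs"]:
        node = ET.SubElement(root, "graph", name=graph["name"])
        # The shared settings are repeated on every graph: this is the
        # redundancy the compact format avoids writing by hand.
        for key, value in sorted(settings.items()):
            ET.SubElement(node, key).text = str(value)
        for item in graph["items"]:
            ET.SubElement(node, "item").text = item
    return ET.tostring(root, encoding="unicode")
```

Calling `generate_templates(DEFINITION, poll_interval=60)` would produce the same set of graphs for a one-minute polling interval without touching the definition itself.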

This has taken a lot of time. In particular, I spent much of it working on the templates at my former employer, The Rimm-Kaufman Group (kudos to them for letting me open-source the work), and I just spent most of my weekend writing the scripts that convert the compact format to XML templates, so it’s possible to maintain these beasts. I had to develop the compact format, too, which was slow going because I first had to understand the Cacti data model, and that is pretty complex.

Please enter issue reports for bugs, feature requests, etc at the Google project homepage, not in the comments of this blog post. I do not look through comments on my blog when I’m trying to remember what I should be working on for a software project.

If these templates help you and you feel like visiting my Amazon.com wishlist and sending something my way, I’d appreciate it!

PS: You may also be interested in Alexey Kovyrin’s list of templates for monitoring servers.

What’s the best way to choose graph colors?

I have an issue I hope someone can help me with. I am generating RRDtool graphs (for Cacti monitoring templates for MySQL, which I’ll release soon) that have up to 11 different metrics on them. With that many lines or areas on a graph, it becomes very hard to pick colors that are easy to see and easy to distinguish from each other. What’s a good way to choose such colors? Is there a way to do it automatically — is there a formal method that will produce good results?

I know some color theory and I have read about how you can distinguish colors from each other (hue, value, etc.). But I am unsure of the best way to choose this many colors. Trying by hand produces garish results or graphs that are just hard to read.

My first attempt to solve this with a program was to simply create a list of every possible completely saturated color in a 32-bit space — essentially, the “pure” colors around the rim of the color wheel — and divide it into the desired number of evenly spaced intervals. This produces pure colors, which is not ideal. They are hard to look at. Did I mention garish?
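
For readers who want to experiment, the evenly-spaced-hue approach described above might look something like this in Python. It is a sketch built on the standard HSV color model, not the program I actually used; the function name and its parameters are hypothetical.

```python
import colorsys


def evenly_spaced_colors(count, saturation=1.0, value=1.0):
    """Return `count` hex colors whose hues are evenly spaced around the wheel."""
    colors = []
    for index in range(count):
        hue = index / count  # 0.0 <= hue < 1.0, one full turn of the wheel
        red, green, blue = colorsys.hsv_to_rgb(hue, saturation, value)
        colors.append(
            "%02X%02X%02X"
            % (round(red * 255), round(green * 255), round(blue * 255))
        )
    return colors
```

With the defaults this yields the fully saturated “rim of the wheel” colors. Passing lower `saturation` and `value` arguments, say `evenly_spaced_colors(11, 0.8, 0.6)`, gives softer variants of the same hues.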

I can shuffle the order so that they’re not adjacent, but that only helps avoid a “rainbow effect” if I’m stacking areas of color on top of each other, like in the following image:

[Graph: MySQL Command Counters]

Ugh, rainbows (I chose those by hand, not with my program). Lines on a white background might be placed in any order, so shuffling doesn’t help with those graphs.

I modified my little script to let me vary the saturation and value. My thinking was that lines on a white background really shouldn’t be full-value, and when I’m drawing areas instead of lines, I should de-saturate them so they become more pleasing pastels. This doesn’t really help as much as I might have hoped for, either. Colors around 80% saturation and 60% value look pretty good, but they’re still ugly colors. And I can’t get over five colors without them starting to run together again. Here’s an example with only four colors that’s already hard to look at:

[Graph: InnoDB I/O Activity]

Part of the problem, I’m currently thinking, is that I’m varying only one dimension. I could be varying the saturation as well as the hue, for example. But that might be another rabbit hole that will waste more time.

Right now I’m thinking that I should ask for help, instead of continuing to work on this myself. So, any ideas are welcome!

By the way, beautiful colors would be nice… a lot of the colors I choose by hand are very pretty and I’m sure my impartial, evenly-distributing script will never choose them in a million years. Also, it’s actually a good thing when graphs each have their own color scheme (as long as it’s attractive) because it becomes easier to identify graphs without having to read the title. Just some extra food for thought.

How to measure MySQL slave lag accurately

Kevin Burton wrote recently about why SHOW SLAVE STATUS is really not a good way to monitor how far behind your slave servers are, and how slave network timeouts can mess up the slave lag. I’d like to chime in and say this is exactly why I thought Jeremy Cole’s MySQL Heartbeat script was such a natural fit for the MySQL Toolkit. It measures slave lag in a “show me the money” way: it looks for the effects of up-to-date replication, rather than asking the slave how far behind it thinks it is.

The slave doesn’t even need to be running. In fact, the tool doesn’t use SHOW SLAVE STATUS at all. This has lots of advantages: for example, it tells you how far the slave lags behind the ultimate master, no matter how deep in the replication daisy-chain it is. In other words, unlike SHOW SLAVE STATUS, it won’t tell you a slave is up-to-date just because it’s caught up to its master. If a slave’s master is an hour behind, it will report that the slave is an hour behind, too — because it is.
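
The general heartbeat technique can be sketched in a few lines of Python. This is not the tool’s code: the `heartbeat` table and its columns are hypothetical, SQLite stands in for MySQL so the example is self-contained, and the `replicate` function merely imitates what real replication would do.

```python
import sqlite3


def create_heartbeat_table(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS heartbeat "
        "(id INTEGER PRIMARY KEY, ts REAL NOT NULL)"
    )


def update_heartbeat(master, now):
    """Run against the master: record the current time in a single row."""
    master.execute(
        "INSERT OR REPLACE INTO heartbeat (id, ts) VALUES (1, ?)", (now,)
    )
    master.commit()


def measure_lag(slave, now):
    """Run against a slave: how old is the newest heartbeat that has arrived?"""
    row = slave.execute("SELECT ts FROM heartbeat WHERE id = 1").fetchone()
    if row is None:
        return None  # No heartbeat has replicated yet.
    return now - row[0]


def replicate(source, target):
    """Stand-in for replication: copy the heartbeat row one hop downstream."""
    row = source.execute("SELECT id, ts FROM heartbeat WHERE id = 1").fetchone()
    if row is not None:
        target.execute(
            "INSERT OR REPLACE INTO heartbeat (id, ts) VALUES (?, ?)", row
        )
        target.commit()
```

Because the slave only compares the replicated timestamp with the current time, the result is the lag behind the server that wrote the heartbeat, however many hops away it is. It does assume the clocks on the servers agree.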

It’s a really smart approach. And you can daemonize it, and it’ll keep a file up-to-date with running averages (by default it averages the last one, five and fifteen minutes, but of course you can choose that). Now your monitoring scripts can be as simple as “cat /var/log/slave-delay” or some such.

It’s not a hard tool to write, and I suspect lots of people have done it, but I bet that between Jeremy, whoever worked on it at Six Apart, and me, we’ve produced a pretty good version of the tool. It’s part of the MySQL Toolkit, and the full manual is online.

MySQL Toolkit version 896 released

Download MySQL Toolkit

This release of MySQL Toolkit adds a new tool, fixes some minor bugs, and adds new functionality to several of the tools.

New tool: MySQL Heartbeat

This tool was contributed by Proven Scaling’s Jeremy Cole and Six Apart. It measures replication delay on a slave, which can be daisy-chained to any depth. It does not rely on SHOW SLAVE STATUS, and in fact it doesn’t even need the slave processes to be running. You could use it to measure replication delay on your own hand-rolled replication, if you wanted.

The most common way to use it is to run one process to update a heartbeat on the master, and another to monitor the lag on a slave (you can run as many as you wish to monitor multiple slaves). By default it prints moving averages of delay over one, five and fifteen-minute time windows:

   0s [  0.00s,  0.00s,  0.00s ]
   0s [  0.00s,  0.00s,  0.00s ]
   1s [  0.02s,  0.00s,  0.00s ]
   2s [  0.05s,  0.01s,  0.00s ]
   3s [  0.10s,  0.02s,  0.01s ]
   4s [  0.17s,  0.03s,  0.01s ]
   0s [  0.17s,  0.03s,  0.01s ]
   0s [  0.17s,  0.03s,  0.01s ]
   0s [  0.17s,  0.03s,  0.01s ]

(of course, I couldn’t resist making that configurable, so you can specify your own time windows).
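
If you are curious how numbers like those come about, here is a small Python sketch of moving averages over several windows. It is an illustration rather than the tool’s implementation. In particular, dividing by the full window length even before the window has filled up is an assumption inferred from the sample output above, and the class name is made up.

```python
from collections import deque


class DelayAverages:
    """Moving averages of once-per-second delay samples over several windows."""

    def __init__(self, windows=(60, 300, 900)):
        self.windows = windows
        # Keep only as many samples as the longest window needs.
        self.samples = deque(maxlen=max(windows))

    def add(self, delay):
        """Record one delay sample, in seconds."""
        self.samples.append(delay)

    def averages(self):
        recent = list(self.samples)
        # Divide by the full window length, so a window that is not yet
        # full is treated as if the missing samples were zero.
        return [sum(recent[-window:]) / window for window in self.windows]

    def line(self, delay):
        """Record a sample and format it in the style of the output above."""
        self.add(delay)
        parts = ", ".join("%5.2fs" % avg for avg in self.averages())
        return "%4ds [ %s ]" % (delay, parts)
```

Feeding it the delays 0, 0, 1, 2, 3, 4 in turn gives one-minute averages of 0.02, 0.05, 0.10 and 0.17 for the last four samples, which lines up with the sample run.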

You can also run it as a daemon. Running the update process as a daemon is intuitive. Running the monitoring process isn’t quite as obvious, because a daemon should re-open STDOUT to /dev/null. What you can do is give it the --file argument and it’ll keep a file current with the most recent line of output, which you can check anytime you want to see how your slave has been doing over the last X time windows.
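
A monitoring check built on that file could be as small as the following Python sketch. It assumes the file holds a single line in the format shown earlier; the function names and the five-second threshold are hypothetical.

```python
import re

# Matches lines such as "   4s [  0.17s,  0.03s,  0.01s ]".
LINE_RE = re.compile(r"^\s*(\d+)s\s*\[\s*(.*?)\s*\]\s*$")


def parse_delay_line(line):
    """Split one line of output into (current_delay, [moving averages])."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("unrecognized line: %r" % line)
    delay = int(match.group(1))
    averages = [
        float(part.strip().rstrip("s")) for part in match.group(2).split(",")
    ]
    return delay, averages


def slave_is_healthy(line, max_average=5.0):
    """True if every moving average is at or below max_average seconds."""
    _, averages = parse_delay_line(line)
    return all(avg <= max_average for avg in averages)
```

A cron job or Nagios-style check would read the file, pass its contents to `slave_is_healthy`, and alert when it returns False.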

Changelog

Here’s a changelog for the other tools I updated in this release:

Changelog for mysql-deadlock-logger:

2007-09-20: version 1.0.4

   * Added --interval, --time, and --daemonize options, and signal handling.
   * --askpass did not allow different passwords on --source and --dest.

Changelog for mysql-duplicate-key-checker:

2007-09-20: version 1.1.1

   * Exit code wasn't always defined.

Changelog for mysql-query-profiler:

2007-09-20: version 1.1.5

   * Documentation didn't specify how queries in FILE are separated.

Changelog for mysql-slave-delay:

2007-09-20: version 1.0.1

   * Added a --daemonize option to detach from the shell and run in the background.

Changelog for mysql-slave-restart:

2007-09-20: version 1.0.1

   * Added a --daemonize option to detach from the shell and run in the background.

Changelog for mysql-table-checksum:

2007-09-20: version 1.1.15

   * The CHECKSUM strategy was always disabled.

Changelog for mysql-visual-explain:

2007-09-20: version 1.0.3

   * filesort wasn't applied to the first non-constant table.

Version 1.5.2 of the innotop MySQL monitor released

Download innotop

This release is part of the unstable 1.5 branch. Its features will ultimately go into the stable 1.6 branch. You can download it from the innotop-devel package.

The major change is I’ve ripped out the W (Lock Waits) mode and enabled innotop to discover not only what a transaction is waiting for, but what it holds too. The new mode that replaces W is L (Locks). My last article goes into more detail on this.

How to debug InnoDB lock waits

This article shows you how to use a little-known InnoDB feature to find out what is holding the lock for which an InnoDB transaction is waiting. I then show you how to use an undocumented feature to make this even easier with innotop.

Background

One of the most common complaints I’ve heard from DBAs used to other database servers is “I can’t find out who holds the locks that are blocking all these connections and making them time out.” I feel your pain. Before I helped scale my employer’s systems to deal with larger volumes of data, InnoDB lock contention was a serious issue. And as far as I knew, you couldn’t find out who was holding locks. I knew you could see who was waiting for locks to be granted; that’s easy. You just run SHOW INNODB STATUS and look for the following text:

------------
TRANSACTIONS
------------
Trx id counter 0 4874
Purge done for trx's n:o < 0 4869 undo n:o < 0 0
History list length 21
Total number of lock structs in row lock hash table 2
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0 4873, ACTIVE 6 sec, process no 7142, OS thread id 1141152064 starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 368
MySQL thread id 9, query id 173 localhost root Sending data
select * from t1 for update
------- TRX HAS BEEN WAITING 6 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 9 page no 3 n bits 72 index `PRIMARY` of table `test/t1` trx id 0 4873 lock_mode X waiting
…

That’s fine, but who holds the lock? I thought there was no way to find that out.

InnoDB Lock Monitor

Until I learned about the InnoDB Lock Monitor, that is. You enable it by running the following command:

CREATE TABLE innodb_lock_monitor(a int) ENGINE=INNODB;

It’s quite an ugly hack, but it turns out the table name is actually “magical.” It’s a special table name that tells InnoDB to start the lock monitor. You can stop it by dropping the table again with DROP TABLE innodb_lock_monitor.

This little-noticed feature makes InnoDB print out a slightly modified version of what you see with SHOW INNODB STATUS. The “slight modification” is to print out not only the locks the transaction waits for, but also those it holds. For example, here’s the transaction that holds the locks:

---TRANSACTION 0 4872, ACTIVE 32 sec, process no 7142, OS thread id 1141287232
2 lock struct(s), heap size 368
MySQL thread id 8, query id 164 localhost root
TABLE LOCK table `test/t1` trx id 0 4872 lock mode IX
RECORD LOCKS space id 9 page no 3 n bits 72 index `PRIMARY` of table `test/t1` trx id 0 4872 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 4; hex 80000001; asc     ;; 1: len 6; hex 000000000d35; asc      5;; 2: len 7; hex 800000002d0110; asc     -  ;;

That’s fine, but there are, ah, limitations. As the manual says, InnoDB periodically prints out this text (essentially spewing InnoDB’s guts) to its standard output. This gets redirected to the server error log in any sane installation. Who’s looking there? And it gets printed out at long intervals, which seem to be about 16 seconds on the machines I use.

Plus, if you’ve looked at the result, you’ll understand this is not something you want to search through manually looking for data. The output can be absolutely huge. What DBA wants to pore over thousands of hex-dumped rows from the table just to answer the question “who holds that lock?”

All in all, this is not very convenient (yep, I know that’s an understatement).
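
To give a feel for what pulling the useful bits out of that output involves, here is a Python sketch that extracts just the lock lines and skips the hex-dumped records. The regular expressions are written against the sample output shown in this article, so other server versions may need adjustments, and the same-index test in `possible_blockers` is only a heuristic: two transactions can lock different records in the same index. None of this is innotop’s actual code.

```python
import re

TRX_RE = re.compile(r"^---TRANSACTION (\d+ \d+),")
THREAD_RE = re.compile(r"^MySQL thread id (\d+),")
TABLE_LOCK_RE = re.compile(
    r"^TABLE LOCK table `([^`]+)` trx id \d+ \d+ lock mode (\S+)"
)
RECORD_LOCK_RE = re.compile(
    r"^RECORD LOCKS .* index `([^`]+)` of table `([^`]+)` "
    r"trx id \d+ \d+ lock_mode (.*?)( waiting)?$"
)


def parse_locks(text):
    """Return one dict per lock line found in the monitor output."""
    locks = []
    trx = thread = None
    for line in text.splitlines():
        line = line.strip()
        match = TRX_RE.match(line)
        if match:
            trx, thread = match.group(1), None  # A new transaction begins.
            continue
        match = THREAD_RE.match(line)
        if match:
            thread = int(match.group(1))
            continue
        match = TABLE_LOCK_RE.match(line)
        if match:
            locks.append({"trx": trx, "thread": thread, "type": "TABLE",
                          "table": match.group(1), "index": None,
                          "mode": match.group(2), "waiting": False})
            continue
        match = RECORD_LOCK_RE.match(line)
        if match:
            locks.append({"trx": trx, "thread": thread, "type": "RECORD",
                          "table": match.group(2), "index": match.group(1),
                          "mode": match.group(3),
                          "waiting": match.group(4) is not None})
    return locks


def possible_blockers(locks):
    """Map each waiting thread to other threads holding locks on the same index."""
    result = {}
    for waiter in locks:
        if not waiter["waiting"]:
            continue
        holders = {
            held["thread"] for held in locks
            if held["type"] == "RECORD" and not held["waiting"]
            and held["trx"] != waiter["trx"]
            and (held["table"], held["index"]) == (waiter["table"], waiter["index"])
        }
        result.setdefault(waiter["thread"], set()).update(holders)
    return result
```

Run against the two transactions shown above, `possible_blockers` points from thread 9 (the waiter) to thread 8 (the holder).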

Slightly more convenient

What’s a little more convenient than combing through all that text by hand is writing a program to parse InnoDB’s status output. You don’t have to, though. That’s what I wrote innotop to do. And I’ve just released version 1.5.2, which at long last has the ability to watch a log file as well as connecting to server(s).

Here’s how this works: you start innotop, and press the L key to switch to Lock mode. This replaces the old Lock Wait mode, which was only able to monitor the InnoDB lock waits you see in the normal output of SHOW INNODB STATUS.

This mode shows you something like the following:

_____________________________ InnoDB Locks __________________________
CXN   ID  Type    Waiting  Wait   Active  Mode  DB    Table  Index
file  12  RECORD        1  00:10   00:10  X     test  t1     PRIMARY
file  12  TABLE         0  00:10   00:10  IX    test  t1
file  12  RECORD        1  00:10   00:10  X     test  t1     PRIMARY
file  11  TABLE         0  00:00   00:25  IX    test  t1
file  11  RECORD        0  00:00   00:25  X     test  t1     PRIMARY

That’s helpful! I can see the locks held and waited for in a nice tabular format. It’s pretty easy to see connection 11 is blocking connection 12.

This is still pretty inconvenient, though. To get access to the server’s error log, I have to run innotop on the database server machine itself. Is there a better way?

Even better

There is, in fact, but I discovered it completely by accident. It’s not documented, but the extra information doesn’t just get printed to the server log. It also shows up in SHOW INNODB STATUS! Now that’s a nice surprise. It means innotop can get lock information from a normal connection instead of monitoring a log file.

After discovering this, I immediately added some more features to innotop. There are now hot-keys in L mode to enable and disable the lock monitor. Now you can press L, press the ‘a’ key to start the lock monitor, see what’s blocking the waiting transaction, press ‘o’ to stop the lock monitor, and you’re done.

Best yet

I’m sure you InnoDB administrators already recognize what an improvement this is over the options you previously had (essentially, you didn’t have any). There’s still a long way to go, though. Locks could be in the INFORMATION_SCHEMA or in a SHOW LOCKS command. I won’t speculate on why they aren’t already.

Of course, the upcoming Falcon storage engine already has better features for debugging lock contention than this. But my guess is it’ll be a long time before Falcon has the market share InnoDB has. All things considered, InnoDB is a pretty nice piece of software.

Conclusion

Download innotop

The conclusion to this whole article is: use innotop if you use InnoDB. Heck, use it if you use MySQL at all. It makes a lot of things a lot easier, not just debugging InnoDB lock contention. Feedback is welcome; just use the SourceForge bug tracker, forums, and mailing lists.

Version 1.5.1 of the innotop MySQL monitor released

Download innotop

This release is part of the unstable 1.5 branch. Its features will ultimately go into the stable 1.6 branch. You can download it from the innotop-devel package.

The major change is a new ‘Command Summary’ mode (switch to this mode with the ‘C’ key) that’s similar to mytop’s ‘c’ mode. It shows you the relative size of variables from SHOW STATUS and SHOW VARIABLES. Here’s a sample:

Command Summary (? for help) localhost, 25+07:16:43, 2.45 QPS, 3 thd, 5.0.40

_____________________ Command Summary _____________________
Name                    Value    Pct     Last Incr  Pct    
Select_scan             3244858  69.89%          2  100.00%
Select_range            1354177  29.17%          0    0.00%
Select_full_join          39479   0.85%          0    0.00%
Select_full_range_join     4097   0.09%          0    0.00%
Select_range_check            0   0.00%          0    0.00%

The default is to show the Com_* variables, but I’ve used a different prefix to illustrate that you can view any variables you want. You just choose the prefix. Useful ones are Select_, Handler_ and Sort_. This gives you instant insight into the kind of work your server is doing. You can see in the sample above that the kinds of joins the server does are healthily balanced towards scans and ranges on the first table. The server does very few full joins, full range joins, and range-check query plans (this is good).
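
The arithmetic behind a display like this is simple enough to sketch in Python. This is an illustration rather than innotop’s code: the function name is hypothetical, and `current` and `previous` stand for two successive snapshots of SHOW STATUS as plain dictionaries.

```python
def command_summary(current, previous, prefix):
    """Rank counters sharing a prefix by their share of the group total.

    Returns rows of (name, value, pct_of_total, increase, pct_of_increase),
    largest value first.
    """
    names = [name for name in current if name.startswith(prefix)]
    total = sum(current[name] for name in names)
    # A counter missing from the previous snapshot counts as unchanged.
    increases = {
        name: current[name] - previous.get(name, current[name]) for name in names
    }
    total_increase = sum(increases.values())

    def percent(part, whole):
        return 100.0 * part / whole if whole else 0.0

    rows = [
        (name, current[name], percent(current[name], total),
         increases[name], percent(increases[name], total_increase))
        for name in names
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Given the Select_ values from the sample, the first row works out to 69.89% of the total for Select_scan, as in the Pct column.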

The example shows one server, as you can see by the first line. Naturally, you can monitor many servers in aggregate, and it’s configured to do this by default if you’re watching more than one server. However, there’s a bug in the percentage columns when you do that (the Value columns are accurate when aggregated). I have a fix in mind for that, which will also fix many other things that cause me (and you) too much work when customizing innotop. But that’ll come later. I feel this is good enough for now, since the main use for this mode is when you’re just trying to familiarize yourself with a server, perhaps at a consulting job, or when reading someone’s tuning tutorial or the like.

innotop 1.5.0 released

Download innotop

Version 1.5.0 of the innotop MySQL and InnoDB monitor is out. This release is the first in the unstable 1.5 branch, which will eventually become the stable 1.6 branch. I’m beginning to merge the various branches I’ve made to support some of our needs at my employer. This first release adds some major new features and prepares for some other large improvements and new features.

What’s new

Here’s what’s new:

  • Added plugin functionality.
  • Added group-by functionality.
  • Moved the configuration file to a directory.
  • Enhanced filtering and sorting on pivoted tables.
  • Many small bug fixes.

Plugins

Plugins let you hook custom code into innotop. Your custom Perl module can extend or change innotop without touching its source code, and all you have to do is drop it into a directory and activate it (sound familiar to you WordPress users?). As an example of how this is useful, about two dozen lines of code lets me add “program” and “unix_pid” columns into the Query List and InnoDB Transaction List modes. These show the originating program and PID for connections by querying tables in which this data is stored. The plugin adds the columns and expressions for them, and then adds the data in by using innotop’s own DBI connections.

There’s an example plugin in the documentation.

Grouping

This functionality lets you apply something like a SQL GROUP BY to a table. There are some built-in rules (press the ‘=’ key in Q or T mode; it’s easier if you hide the header with the ‘h’ key first).

The built-in rules let you group connections or transactions by status. They also automagically show a ‘count’ column, which is there but hidden until the grouping is applied. Now you can see how many connections are in what status. Here’s a screenshot of before and after:

[Screenshot: innotop ungrouped]

[Screenshot: innotop grouped]

You can toggle this on and off easily with the ‘=’ key on any table. (Most tables don’t have default group-by expressions, though, so you’ll have to read the docs to learn more about that. As with any feature, let me know if you have a useful default you want me to include in innotop.)
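
Conceptually, grouping by status with a count column boils down to something like the Python sketch below. It only illustrates the idea; the function and the column names are hypothetical and have nothing to do with how innotop implements it.

```python
from collections import Counter


def group_by(rows, column):
    """Collapse rows that share a value in `column`, adding a 'count' field."""
    counts = Counter(row[column] for row in rows)
    # most_common() lists the largest groups first.
    return [
        {column: value, "count": count} for value, count in counts.most_common()
    ]
```

For example, three sleeping connections and one running a query would collapse into two rows, with counts of 3 and 1.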

Notes

Don’t be scared by the “unstable” designation. It only means that I’m getting ready for a lot of changes that don’t belong in a stable branch; this release should generally be as good quality as any other. And I don’t want to use a naming scheme like “innotop-6.0-pre-alpha-1_rel5”. When I release a version I don’t think is good quality, I’ll let you know ;-) Generally I’m going to confine that code to the Subversion repository.

As an aside, both this and the MySQL Toolkit project are becoming more popular, and as that happens, I’m also getting busier — among other things, I’m writing a book! I must say SourceForge is great in some ways for helping to manage the project, but a lot of extra work in others. For example, it created a bunch of default forums, trackers, and settings when I created the projects, and that’s been pretty hard to slog through. The documentation system is not useful for my project. I think I’ve finally figured out how to get emails when people submit bug reports. I’m also trying to automate the tedious release process as much as I can, and it’s not proving easy. I don’t mean this to be a litany of woes, because I know they and I are doing our respective bests; it’s more of a commentary on the increased work that comes with a “generic, flexible” system — which is what people always seem to want, until they get it. I’m sure you all know what I mean!

Please go download, use, write plugins, and find and report bugs (via the SourceForge tracker, of course)! And happy innotop-ing.

innotop version 1.4.3 released

Download innotop

Version 1.4.3 of the innotop MySQL and InnoDB monitor is out. This release fixes some minor bugs and feature annoyances, and at last innotop has thorough documentation, available online!

What’s new

Here’s what’s new:

  • Added standard --version command-line option.
  • Changed colors to cyan instead of blue; more visible on dark terminals.
  • Added information to the filter-choosing dialog.
  • Added column auto-completion when entering a filter expression.
  • Changed Term::ReadKey from optional to mandatory.
  • Clarified username in password prompting.
  • Ten thousand words of documentation! Documentation is embedded in innotop, installed as a man page, and available online.

Bugs fixed:

  • innotop crashed in W mode when InnoDB status data was truncated.
  • innotop didn’t display errors in tables if debug was enabled.
  • The colored() subroutine wasn’t being created in non-interactive mode.
  • Don’t prompt to save password except the first time.

What’s next

I don’t know how much time I’ll get to put into this in the coming months, but there’s already a lot of half-finished functionality in the Subversion repository, including the ability to write innotop plugins. If you’re interested, the code is in the trunk and in various branches.

Hopefully I’ll get time to work on some of that before the year is out.
