Xaprb

Stay curious!

Archive for the ‘Sys Admin’ Category

Disk latency versus filesystem latency

Brendan Gregg has a very good ongoing series of blog posts about the importance of measuring latency at the layer that’s appropriate for the question you are trying to answer. If you’re wondering whether I/O latency is a problem for MySQL, you need to measure I/O latency at the filesystem layer, not the disk layer. There are a lot of factors to consider. To quote from his latest post:

This isn’t really a problem with iostat(1M) – it’s a great tool for system administrators to understand the usage of their resources. But the applications are far, far away from the disks – and have a complex file system in-between. For application analysis, iostat(1M) may provide clues that disks could be causing issues, but you really want to measure at the file system level to directly associate latency with the application, and to be inclusive of other file system latency issues.

Someone should add Brendan’s feed to Planet MySQL. Here are the articles: part 1, part 2. Brendan will be talking about this topic at Percona Live on the 26th.

Written by Xaprb

May 15th, 2011 at 7:13 am

Posted in SQL, Sys Admin, Tools

How to gather statistics at regular intervals

I gather a lot of statistics, such as performance data. Sometimes several processes are running on a system at once, and I want to align and compare the data they produce later. That means the samples need to fall on the same time intervals. Here is a naive way to gather stats at intervals:

while sleep 1; do gather-some-stats; done

There are two problems. First, each iteration takes one second plus however long gather-some-stats runs, so the samples drift: if the command takes 200 ms, the loop is 12 seconds behind schedule after 60 samples. Second, the iterations are not aligned on clock ticks, so the data isn’t as easy to correlate with other samples. This becomes a bigger problem when many such jobs gather data at longer intervals such as 15 seconds or 5 minutes, where samples that don’t line up can be frustrating.

Here is what I’ve been doing recently. Is there a better way?

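INTERVAL=1
while true; do
   # Time remaining until the next multiple of INTERVAL seconds.
   # %N (nanoseconds) is a GNU date extension.
   delay=$(date +%s.%N | awk "{print $INTERVAL - (\$1 % $INTERVAL)}")
   sleep "$delay"
   gather-some-stats
done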
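
For longer intervals such as 15 seconds, the same calculation can be pulled into a function so that it can be checked against a known timestamp. This is only a sketch: the names seconds_until_next_tick and sample_aligned are made up for illustration, and it assumes GNU date (for %N) and a sleep that accepts fractional seconds.

```shell
#!/bin/sh

# seconds_until_next_tick NOW INTERVAL   (hypothetical helper name)
# NOW is a Unix timestamp, possibly fractional; INTERVAL is in seconds.
# Prints how long to wait so the next sample starts on a multiple of INTERVAL.
# If NOW is already exactly on a boundary, it prints a full INTERVAL.
seconds_until_next_tick() {
    awk -v now="$1" -v interval="$2" \
        'BEGIN { printf "%.3f\n", interval - (now % interval) }'
}

# sample_aligned INTERVAL COUNT COMMAND [ARGS...]   (hypothetical helper name)
# Runs COMMAND COUNT times, starting each run on a multiple of INTERVAL seconds.
# Assumes GNU date for %N and a sleep that accepts fractional seconds.
sample_aligned() {
    interval=$1
    count=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        sleep "$(seconds_until_next_tick "$(date +%s.%N)" "$interval")"
        "$@"
        i=$((i + 1))
    done
}
```

For example, `sample_aligned 15 4 gather-some-stats` would take four samples, each started at a multiple of 15 seconds since the epoch, which is :00, :15, :30, or :45 past the minute. Because the delay is recomputed from the clock on every pass, a slow run of the command does not accumulate drift. If a run takes longer than the interval, the next sample simply lands on a later boundary.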

Written by Xaprb

March 18th, 2011 at 10:17 am

Posted in Coding, Sys Admin

Version 1.1.8 of Better Cacti Templates released

I’ve released version 1.1.8 of the Better Cacti Templates project. This release includes a bunch of bug fixes and several important new graphs. There are graphs for the new response-time statistics exposed in Percona Server, and a new set of graphs for MongoDB.

There are upgrade instructions on the project wiki for this and all releases. There is also a comprehensive tutorial on how to create your own graphs and templates with this project. Use the project issue tracker (not the comments on this post!) to view and report issues, and use the project mailing list to discuss the templates and scripts.

The full changelog follows.

2011-01-22: version 1.1.8

  * The cache file names could conflict due to omitting --port (issue 171).
  * Load-average parsing did not work correctly at high load (issue 170).
  * The --mpds option to make-template.pl did not create new inputs (issue 133).
  * The url and port were reversed in the Nginx commandline (issue 149).
  * Added $nc_cmd to ss_get_by_ssh.php (issue 154, issue 152).
  * InnoDB Transactions and other graphs showed NaN instead of 0 (issue 159).
  * Added graphs for Percona Server response-time distribution (issue 158).
  * Added graphs for MongoDB (issue 136).
  * Added a minimum option to the template construction logic (issue 169).
  * Added memtotal for Memory (issue 146).
  * make-template.pl sanity checks were too strict (issue 168).

Written by Xaprb

January 22nd, 2011 at 9:25 pm

Posted in Cacti, PHP, SQL, Sys Admin