Xaprb

Stay curious!

A script snippet to relative-ize numbers embedded in text


A lot of times I’m looking at several time-series samples of numbers embedded in free-form text, and I want to know how the numbers change over time. For example, two samples of SHOW INNODB STATUS piped through grep wait might contain the following:

Mutex spin waits 0, rounds 143359179688, OS waits 634106844
RW-shared spins 1224152309, OS waits 38278807; RW-excl spins 2432166425, OS waits 35264871
Mutex spin waits 0, rounds 143386303439, OS waits 634292093
RW-shared spins 1224197048, OS waits 38281423; RW-excl spins 2432347936, OS waits 35271423

How much have the numbers changed in the second sample? My head is too lazy to do that math. So Daniel Nichter and I whipped up Yet Another Snippet to self-discover patterns of text and numbers, and compare each line against the previous line that matches the same pattern. Let’s fetch it:

wget http://maatkit.googlecode.com/svn/trunk/util/rel

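Now give it the above input, and it’ll print out something useful. The first sample is printed as-is, and each number in the second sample is replaced by its difference from the first (emphasis mine):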

Mutex spin waits 0, rounds 143359179688, OS waits 634106844
RW-shared spins 1224152309, OS waits 38278807; RW-excl spins 2432166425, OS waits 35264871
Mutex spin waits 0, rounds 27123751, OS waits 185249
RW-shared spins 44739, OS waits 2616; RW-excl spins 181511, OS waits 6552

My lazy brain likes that much better.
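To make the idea concrete, here is a minimal Python sketch of the technique described above. It is not the actual rel script from Maatkit, and the names `relativize` and `NUMBER` are made up for illustration. It assumes that a line's "pattern" is simply the line with every run of digits masked out, and, like the behavior shown above, it only understands non-negative integers.

```python
import re

# Assumption: a "number" is any run of digits (non-negative integers only).
NUMBER = re.compile(r"\d+")


def relativize(lines):
    """Yield each line, replacing its numbers with the change since the
    previous line that has the same non-numeric pattern.

    The first line seen for each pattern is yielded unchanged.
    """
    previous = {}  # pattern -> numbers from the last line with that pattern
    for line in lines:
        # Mask the numbers with a NUL placeholder to get the line's pattern.
        pattern = NUMBER.sub("\x00", line)
        values = [int(n) for n in NUMBER.findall(line)]
        if pattern in previous:
            # Same pattern implies the same count of numbers, so zip is safe.
            deltas = iter(v - p for v, p in zip(values, previous[pattern]))
            yield NUMBER.sub(lambda m: str(next(deltas)), line)
        else:
            yield line
        # Always remember the absolute values, not the deltas.
        previous[pattern] = values
```

To use it as a filter, something like `sys.stdout.writelines(relativize(sys.stdin))` (after `import sys`) should do. Because it remembers the absolute values, a third sample would be compared against the second, not against the first.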

Written by Xaprb

September 1st, 2009 at 8:29 pm

Posted in SQL, Tools


7 Responses to 'A script snippet to relative-ize numbers embedded in text'


  1. Great idea!

    One suggestion: Since the real output doesn’t have the bold emphasis to identify what it relativized, what about prefixing the new numbers with a plus sign? (Or, obviously, a minus sign if the difference is negative.)

    Anyway, just a thought. Either way this’ll definitely come in handy for me on occasion.

    Ben

    2 Sep 09 at 4:29 am

  2. Nice utility, but it does assume the numbers are positive integers. I tried using it on the output from “mysqladmin status” and quickly found that “Queries per second” was treated as two different numbers, one on each side of the decimal point.

    Mitch Wright

    2 Sep 09 at 2:24 pm

  3. I would also add to the script in this case an indication of the time interval between the two samples.

    While that means the output changes, if you are looking at the 4 lines in isolation in your example, you don’t know if that’s a minute, hour, or day of processing.

    Ronald Bradford

    2 Sep 09 at 5:28 pm

  4. @Ronald:

    Interesting, maybe it could be made smart enough to recognize timestamps at the beginning of lines, and/or if the output of a command is being piped to it in real time it can time the arrival of each line.

    That said, in the scenario described in the post, it’s almost certainly just reading from a static text file, so there’s no available source of timing information. And that’s probably the most common use case for this tool.

    Ben

    2 Sep 09 at 6:24 pm

  5. Mitch, right, I had the same thought: we need to recognize floating-point numbers too.

    Xaprb

    3 Sep 09 at 3:51 pm

  6. @Ben

    SHOW INNODB STATUS includes a date/time stamp, so if comparing two complete files, and then looking at this subset of lines, a time comparison is possible (but not trivial with date/time in human format).

    I’ve modified all my logging these days for key scripts to always include epoch_secs for this exact reason.

    Ronald Bradford

    3 Sep 09 at 7:36 pm

