Archive for the ‘Rimm Kaufman Group’ tag
Yesterday I went to beCamp 2008 along with four roomfuls of other people interested in technology (perhaps close to 100 people total). The conference was a lot of fun. Not everything went as planned, but that was as planned. This was an Open Spaces conference and I thought it worked very well. From an email Eric Pugh sent:
Basically it all boils down to:
Open Space is the Law of Two Feet: if anyone finds themselves in a place where they are neither learning nor contributing they should move to somewhere more productive. And from the law flow four principles:
- Whoever comes are the right people
- Whatever happens is the only thing that could have
- Whenever it starts is the right time
- When it’s over, it’s over
From Hadoop to Bang-Splat
I used the law of two feet a time or two. In fact, the first session I wanted to go to, which was about Hadoop and MapReduce, had no knowledgeable attendees. Someone overslept. OK, that’s the way it goes: move along.
From there I went to a session about Unix command-line productivity. Most of the sessions I saw were traditional in that they had one person standing up talking and many people sitting and listening, but not all. This one had several clever command-line gurus mentioning their favorite power tips.
I learned about bang-splat and bang-dollar. The bangs have always gotten me in Bash: I avoid them because I’ve never felt like reading the Bash man page section on them. (Am I too lazy, or not lazy enough?) So it was great to hear some people say “bang-splat and bang-dollar are great” and then explain them. That was easy for me, and now I know how they can be useful to me.
This problem-first type of tip is great for me: tell me the problem, then how to solve it, rather than telling me what the solution is and leaving me guessing what kinds of problems I can solve with it. (The Bash man page is solution-first).
In case you’re wondering, bang-splat substitutes the arguments to the last command, and bang-dollar substitutes the last argument of the last command. So, instead of this:
$ touch file1 file2 file3
$ rm file1 file2 file3
I can do this:
$ touch file1 file2 file3
$ rm !*
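Since bang-dollar doesn't appear in the transcript above, here's a small self-contained sketch showing both expansions. One caveat: history expansion is only on by default in interactive shells, so the script turns it on explicitly (the file names are just for illustration).

```shell
# !* and !$ are history expansions; they're enabled by default only in
# interactive shells, so turn them on explicitly for this scripted demo.
set -o history -o histexpand
cd "$(mktemp -d)"

touch file1 file2 file3
rm !*                  # !* = every argument of the previous command

touch file1 file2 file3
cp !$ backup-of-file3  # !$ = the previous command's last argument, file3
```

In an interactive session you'd skip the `set` line and just type `rm !*` or `cp !$ backup` directly.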
There were lots of other nice tips too.
I ended up doing a talk on MySQL performance basics. I had no idea what the audience was looking for, so I winged it. I did make some slides, but most of the talk isn’t on the slides. You can get the slides from Percona’s slide page. It seemed to be useful to the folks attending, who had a wide variety of experience and knowledge about MySQL.
This session began with a demo of how to create an entire application stack in a few minutes with Cohesive Flexible Technologies. Someone else then demoed something similar using RightScale. rPath's Jeff Uphoff was also in the room, but we didn't get to see a demo of that. During this session the conversation ranged over various topics, including some of what I had wanted to hear about in the Hadoop session.
Lunch was catered Indian food provided by the Rimm-Kaufman Group. Yum.
Large Scale Storage
This session was sort of a round-table. The two people who talked the most were Josh Malone of the National Radio Astronomy Observatory and someone from the Library of Congress; both have large storage needs they're not sure how to meet. Some people from UVA's library were there as well, but I didn't ask what they were working on.
This reminded me a lot of a recent keynote Jacek Becla gave at another conference. He’s with the Stanford Linear Accelerator Center, who are going to be generating a lotta data pretty soon.
High Availability Linux
This one started off with more from Josh Malone, who demoed Nagios briefly and then talked about his storage and backup systems. He uses BackupPC, which sounds pretty neat and very smart. We then talked about some of the things he's looking into doing, with audience suggestions to look into shared storage or DRBD. We also looked briefly at UltraMonkey, which seems to be stagnating, and at the Linux-HA project.
Google App Engine
Finally, someone showed us a calculator application they’d built on Google App Engine, including the code and talking about the data model somewhat. It looks like a neat idea, but the lock-in worries me, a sentiment that was voiced by many others in the room.
I’m going to be at beCamp 2008, the follow-up to the first beCamp, which I sadly missed.
beCamp is a BarCamp un-conference. Tonight was about meeting, greeting, and throwing ideas at the wall to see which ones stick. Literally. We stuck pieces of paper on the wall with our ideas — things we can either talk about or want to hear about — and then scratched our votes on them to see which are popular.
I live and breathe MySQL for a decent part of the day, so I hesitated, but then stuck “MySQL Performance” on the wall. It got quite a few votes, so I assume I will be giving a talk on MySQL performance basics at some point during the conference. (The exact schedule is probably being determined right now, in my absence, but I’m so tired right now that I’ll just take my chances on it not being at 8:00 AM tomorrow.) [edit: I just checked the website and there won't be anything before 9:00, and the schedule is determined tomorrow. I did say I'm tired, right?]
See you there!
As promised, I’ve created some improved software for monitoring MySQL via Cacti. I began using the de facto MySQL Cacti templates a while ago, but found some things I needed to improve about them. As time passed, I rewrote everything from scratch. The resulting templates are much improved.
You can grab the templates by browsing the source repository on the project’s homepage.
In no particular order, here are some things I improved:
- Standard polling interval and graph size by default.
- Full captions on every graph; you don’t have to guess at how big the values are. Each graph has current, max, and average values printed at the bottom for every value on it.
- Much more data is captured. I’ve graphed almost everything I could think of.
- The graphs are grouped better. Most graphs have only related values. There are some exceptions, but not many.
- The templates don’t hijack your existing installation. They don’t depend on or alter anything in your default Cacti installation.
- The script that gathers the data is totally rewritten from scratch, and much improved. For example, the math works on 32-bit systems. It has caching built-in so each poll cycle results in just one request to the server, instead of one request per graph. (This is a weakness of Cacti I’m trying to work around). It also has debugging aids and other good coding stuff.
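The one-request-per-poll-cycle idea can be sketched roughly like this. This is a hypothetical illustration, not the actual template script (which also handles multiple hosts, parsing, and more); the cache path, TTL, and the stand-in `fetch_stats` function are all made up for the demo.

```shell
# Hypothetical sketch of caching poll results so that one poll cycle
# makes only one round trip to the server, no matter how many graphs
# ask for data.
CACHE=/tmp/mysql_stats.cache
TTL=300   # seconds; match your Cacti polling interval

fetch_stats() {
  # Stand-in for the real work: the template script would run something
  # like SHOW GLOBAL STATUS against the MySQL server here.
  echo "Threads_connected 3"
  echo "Questions 123456"
}

now=$(date +%s)
mtime=$( [ -f "$CACHE" ] && date -r "$CACHE" +%s || echo 0 )
if [ $(( now - mtime )) -ge "$TTL" ]; then
  fetch_stats > "$CACHE"   # the single server round trip per cycle
fi
cat "$CACHE"               # every graph's data source reads the cache
```

A real implementation also needs locking so that concurrent pollers don't race to refresh the same cache file.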
- By default, it assumes you have the same username and password across every server you’re monitoring, so you don’t have to fill in a username and password for every single graph you create.
- One data template == one graph template. This helps work around another Cacti limitation.
- Lots more. Honestly I can’t really remember everything I’ve done. I’m sure you’ll help me remember by asking me how to get X feature working the way you want, and I’ll go “oh, yeah, that’s another thing I improved…”
Cacti templates are laborious to create if they’re at all complex; the process takes a long time and is error-prone. Instead of building them through Cacti’s web interface and exporting a huge XML file, I eliminated the redundancies and created a small, easy-to-maintain file from which I generate the XML template with a Perl script. This has the added benefit of letting me (or you) generate templates with different parameters, such as polling interval or graph size. The README file has the full details. However, I’ve pre-generated a set of templates that matches Cacti’s defaults, so you can probably just use that.
This has taken a lot of time. In particular, I spent a lot of time working on it at my former employer, The Rimm-Kaufman Group (kudos to them for letting me open-source the work) and I just spent most of my weekend writing the scripts to convert from the compact format to XML templates, so it’s possible to maintain these beasts. Plus I had to develop the compact format, too. This took a lot of time because I had to understand the Cacti data model, which is pretty complex.
Please enter issue reports for bugs, feature requests, etc at the Google project homepage, not in the comments of this blog post. I do not look through comments on my blog when I’m trying to remember what I should be working on for a software project.
If these templates help you and you feel like visiting my Amazon.com wishlist and sending something my way, I’d appreciate it!
PS: You may also be interested in Alexey Kovyrin’s list of templates for monitoring servers.