Archive for April, 2008

Improved Cacti monitoring templates for MySQL

Download MySQL Cacti templates

As promised, I’ve created some improved software for monitoring MySQL via Cacti. I began using the de facto MySQL Cacti templates a while ago, but found some things I needed to improve about them. As time passed, I rewrote everything from scratch. The resulting templates are much improved.

You can grab the templates by browsing the source repository on the project’s homepage.

In no particular order, here are some things I improved:

  • Standard polling interval and graph size by default.
  • Full captions on every graph; you don’t have to guess at how big the values are. Each graph has current, max, and average values printed at the bottom for every value on it.
  • Much more data is captured. I’ve graphed almost everything I could think of.
  • The graphs are grouped better. Most graphs have only related values. There are some exceptions, but not many.
  • The templates don’t hijack your existing installation. They don’t depend on or alter anything in your default Cacti installation.
  • The script that gathers the data is totally rewritten from scratch, and much improved. For example, the math works on 32-bit systems. It has caching built-in so each poll cycle results in just one request to the server, instead of one request per graph. (This is a weakness of Cacti I’m trying to work around). It also has debugging aids and other good coding stuff.
  • By default, it assumes you have the same username and password across every server you’re monitoring, so you don’t have to fill in a username and password for every single graph you create.
  • One data template == one graph template. This helps work around another Cacti limitation.
  • Lots more. Honestly I can’t really remember everything I’ve done. I’m sure you’ll help me remember by asking me how to get X feature working the way you want, and I’ll go “oh, yeah, that’s another thing I improved…”

Cacti templates are very laborious to create if they’re complex at all; it takes a long time and is very error-prone. Instead of doing it through Cacti’s web interface and exporting a huge XML file, I eliminated the redundancies and created a small, easy-to-maintain file from which I generate the XML template with a Perl script. This gives the added benefit of letting me (or you) generate templates with different parameters such as polling interval or graph size. The README file has the full details. However, I’ve pre-generated a set of templates that matches Cacti’s defaults, so you can probably just use that.

This has taken a lot of time. In particular, I spent a lot of time working on it at my former employer, The Rimm-Kaufman Group (kudos to them for letting me open-source the work) and I just spent most of my weekend writing the scripts to convert from the compact format to XML templates, so it’s possible to maintain these beasts. Plus I had to develop the compact format, too. This took a lot of time because I had to understand the Cacti data model, which is pretty complex.

Please enter issue reports for bugs, feature requests, etc at the Google project homepage, not in the comments of this blog post. I do not look through comments on my blog when I’m trying to remember what I should be working on for a software project.

If these templates help you and you feel like visiting my Amazon.com wishlist and sending something my way, I’d appreciate it!

PS: You may also be interested in Alexey Kovyrin’s list of templates for monitoring servers.

Technorati Tags:, , , , , ,

You might also like:

  1. What’s the best way to choose graph colors?
  2. A new home for innotop in the new year

Baron Schwartz on a podcast at MySQL Conference and Expo 2008

I did an interview with Barton George from Sun while I was at the conference last week. Barton has now posted the interview. If you’re quick, you can listen to it before I do.

Topics: everything and anything, including Maatkit and PostgreSQL.

Technorati Tags:, , , , , ,

You might also like:

  1. Like it or not, it is the MySQL Conference and Expo
  2. My presentations at the 2008 MySQL Conference and Expo
  3. Going to PostgreSQL Conference East
  4. Slides for the innotop workshop at MySQL Conference and Expo 2007
  5. MySQL Conference and Expo 2007 Audio

Like it or not, it is the MySQL Conference and Expo

The conference that many of us just went to is called the MySQL Conference and Expo, but a lot of people don’t call it that. They call it by the name it had in 2006 and earlier: MySQL User’s Conference. In fact, some people say (or blog) that they dislike the new name and they’re going to call it the old name, because [… insert reason here…].

I call it by the new name that some people dislike so much. Why? Because it is a conference and expo, not a user’s conference. There’s no reason to pretend otherwise. The conference is organized and owned by MySQL, not the users. It isn’t a community event. It isn’t about you and me first and foremost. It’s about a company trying to successfully build a business, and other companies paying to be sponsors and show their products in the expo hall. Times have changed.

I’m not saying any of this is bad. Being successful in business is a good thing, and having sponsors and partners is fine too. I’m just pointing out that trying to make it be a user’s conference, just by calling it one, isn’t going to work.

If community members want a community conference, we’ll have to make one. MySQL/Sun cannot do this for us, because then it wouldn’t be a community conference.

There’s a simple test of whether people want this: if it happens, then the community wanted it badly enough to do something about it.

The PostgreSQL East 2008 conference I went to a few weeks ago was a great example of how this works. And the attendance fee was $75, not thousands. A conference doesn’t have to be expensive.

Who wants a conference by, for, and of the community?

Technorati Tags:, , ,

You might also like:

  1. Going to PostgreSQL Conference East
  2. Baron Schwartz on a podcast at MySQL Conference and Expo 2008
  3. MySQL Conference and Expo 2008, Day Two
  4. Remember to sign up for MySQL Conference and Expo!
  5. My presentations at the 2008 MySQL Conference and Expo

Spring 2008 issue of MySQL Magazine

Keith Murphy and his hard-working crew have released the spring 2008 issue of MySQL Magazine. Go take a look — it includes quite a few articles on various topics, even a mention of our upcoming book (High Performance MySQL, Second Edition).

Technorati Tags:, ,

You might also like:

  1. Get a free sample chapter of High Performance MySQL Second Edition
  2. High Performance MySQL 2nd Edition is in production
  3. High Performance MySQL Second Edition Schedule
  4. Coming soon: High Performance MySQL, Second Edition
  5. High Performance MySQL, Second Edition: Replication, Scaling and High Availability

MySQL Conference and Expo 2008, Day Three

Here’s a rundown of Thursday (day 3) of the MySQL Conference and Expo. This day’s sessions were much more interesting to me than Wednesday’s, and in fact I wanted to go to several of them in a single time slot a couple of times.

Inside the PBXT Storage Engine

This session was, as it sounds, a look at the internals of PBXT, a transactional storage engine for MySQL that has some interesting design techniques. I had been looking forward to this session for a while, and Paul McCullagh’s nice explanations with clear diagrams were a welcome aid to understanding how PBXT works. Unlike some of the other storage engines, PBXT is being developed in full daylight, with an emphasis on community involvement and input. (Indeed, I may be contributing to it myself, in order to make its monitoring and tuning capabilities second to none).

PBXT has not only a unique design, but a clear vision for differentiating itself from other transactional storage engines. It’s not trying to clone any particular engine; Paul and friends are planning to add some capabilities that will really set it apart from other engines, including high-availability features and blob streaming.

I left this session with a much better understanding of how PBXT balances various demands to satisfy all sorts of different workload characteristics, how it writes data, how it achieves transactional durability, and so on. I think these capabilities, and its performance, can really be assessed only in the real world (of course), but in principle it sounds good. I love knowing how things work!

There were about 30 people in the talk. I wish there had been more, because I think PBXT is going to be an important part of the open ecosystem going forward. However, I feel pretty confident people will take more notice if it starts to get used in the real world. Someone had a video camera there, so you might check out the video when it’s available. Paul’s explanations are really good.

Helping InnoDB Scale on Servers with Many CPU Cores and Disks

This session was Mark Callaghan’s chance to unveil the work he and others have been doing on InnoDB’s scalability issues, which mostly revolve around mutex contention. Mark’s team has completely solved the problems on their workload and benchmarks. In fact, after the changes, InnoDB exhibited significantly better performance even than MyISAM, which began to be limited by the single mutex that synchronizes access to its key cache. (Yes, in fact MyISAM has scalability problems too).

Google’s workload for MySQL, in case you’re wondering, is pretty traditional (i.e. not web-like; more like an “enterprise” application). Heavily I/O-bound, 24/7 critical systems, and so on.

Mark also wore several community t-shirts at various points in the talk, including one of my Maatkit t-shirts. Mark said Maatkit would be perfect if only it were written in Python (Google’s preferred scripting language). Alas, Mark, it’ll stay in Perl. But thanks for the nice compliment anyway.

The room was packed full.

Scaling Heavy Concurrent Writes In Real Time

Dathan Pattishall, formerly the lead architect at Flickr, explained his techniques for scaling Flickr’s write capacity. He talked about how he’d worked to reduce primary key sizes, queued writes for batching, separated different types of data into different types of tables, and more. Dathan has never been afraid to do what he thinks is a good idea, even if it flies in the face of “best practices,” so I was happy to finally hear him talk.

By the way, Dathan pointed out that distributed locking with memcached and add() isn’t a silver bullet. It works ok until memcached evicts your lock due to the LRU policy. He uses MySQL’s built-in GET_LOCK() function for locking.

Dathan’s blog is a good source of information about his sometimes unorthodox approaches to database design.

The Power of Lucene

This was the only one of Frank (Farhan) Mashraqi’s talks I got to attend. This was pretty technical: how Lucene works, how to configure and install it, how to index documents, how to execute searches. If you were wondering how much work and complexity it would be to install and use Lucene, this talk would have been good for you to attend; I’ve never used it myself, but I’m pretty sure Frank covered everything you need to know.

Technorati Tags:, , , , , , , , , , ,

You might also like:

  1. Speed up your MySQL replication slaves
  2. How pre-fetching relay logs speeds up MySQL replication slaves
  3. MySQL Conference and Expo 2008, Day Two
  4. Sessions I want to see at the MySQL Conference
  5. Like it or not, it is the MySQL Conference and Expo

MySQL Conference and Expo 2008, Day Two

Day two of the conference was a little disappointing, as far as sessions went. There were several time blocks where I simply wasn’t interested in any of the sessions. Instead, I went to the expo hall and tried to pry straight answers out of sly salespeople. Here’s what I attended.

Paying It Forward: Harnessing the MySQL Contributory Resources

This was a talk focused on how MySQL has made it possible for community members to contribute to MySQL. There was quite a bit of talk about IRC channels, mailing lists, and the like. However, the talk gave short shrift to how MySQL plans to become truly open source (in terms of its development model, not its license). I think there was basically nothing to talk about there. I had a good conversation about some of my concerns with the speaker and some others from MySQL right afterwards.

There was basically nobody there — I didn’t count, but I’d say maybe 10 or 12 people. I think this is a telling sign.

Architecture of Maria: A New Storage Engine with a Transactional Design

I was interested in this talk because I’m interested in the tension between Falcon and Maria (and between Falcon and everything, for that matter) but I left and went to the expo hall again after a bit. The talk was good but I’d already seen and/or read it, and the question-and-answer component wasn’t enough to keep me there.

The MySQL Query Cache

This was the second session I gave at the conference, and again it was standing-room-only, with nearly 300 attendees according to the person who was watching the door. The questions were frequent and added a lot to the discussion. Slides will be on the conference website when they post them.

Grazr: Lessons Learned Building a Web 2.0 Application Using MySQL

I was keenly interested in this talk because a) I am a big fan of Patrick Galbraith’s work with many different projects, and b) I had heard a lot about Grazr but didn’t know much about it. However, I missed most of the talk. About ten minutes into it, I got a call I couldn’t refuse: my wife!

However, I did sneak back into the room for the last bit too. And I gave Grazr a try. Unfortunately, I got really confused by it; I tried a bunch of different ways to import my Google Reader’s OPML. I got that to work, but then I couldn’t figure out how to read the feeds in the OPML via Grazr. Then I think I figured that out (I’m not sure) but it didn’t strike me as a very handy way to read my feeds. I’ll try taking another look at it later if I get time. (I’m all ears if there’s a better way to read feeds).

Extending MySQL

This one was mostly for fun. I knew a lot about UDFs already (I’ve created some) and I knew about the pluggable storage engine API. But I didn’t know about pluggable event daemons. Holy cow, what a great way to shoot yourself (or your server) in the foot! All the power of an atomic bomb, with all the safety of SPF 5 sunblock in a nuclear attack. Or something like that. But darn, it sure is nifty. Brian is a great speaker too — very lively.

You know, there’s another way to extend MySQL that most people don’t seem to know about, which Brian didn’t mention. That is procedures (not stored procedures). They are sort of like a post-filter for a result set, and like UDFs they’ve been around forever. I have never heard of anyone writing their own, but there’s an example in the server itself: PROCEDURE ANALYSE.

Expo hall

I went to the expo hall to meet and greet many of the companies that Percona (my employer) is already working with (doing independent benchmarks, performance verification, analysis etc) or will be in the future. I also wanted to grill some of the vendors on their technology. Usually I find them very cagey; they claim X times faster this-or-that, but won’t tell you how, and won’t tell you what their systems don’t do well. I don’t understand why they take this approach; you can’t hide your system’s strong and weak spots. There is no security through obscurity, and shrewd independent observers are going to get to the bottom of it with or without your permission.

So, for instance, I was talking with Tokutek, who claimed to be a drop-in replacement for InnoDB with 200x better performance and apparently no downsides. However, on closer questioning, I did get him to admit that the system has table-level locking. Thus it won’t give any concurrency, so saying it’s a drop-in InnoDB replacement is questionable. And the comparison against InnoDB seemed contrived to create a worst-case situation with bad tuning and a workload so it would perform terribly. An honest comparison tunes both systems to their highest performance and measures them; you can’t tune one system as badly as possible and compare it to the other’s best-case performance. I pressed on further and asked about range scans in some specific cases (they claim they’re great at range queries, and equal to InnoDB on everything else). At last they admitted they can’t perform well on some very common queries such as real-life queries InnoDB performs very well on for me. They said these are “point queries” but that’s not true; you can design indexes to support many different ways to range-query a table in InnoDB and get great performance. So it sounds to me like Tokutek’s storage format is extremely narrowly focused, and there is indeed a trade-off. I will be interested to see how their technology develops, though. It’s not done yet.

In general

There are a lot of Maatkit t-shirts walking around, which makes me happy. If I’d printed 200 of them, I probably could have given them all away. I was wearing a PostgreSQL t-shirt myself. Proudly, I might add. I’m not the only person here who’s interested in PostgreSQL. This morning I met a person from EnterpriseDB.

Yesterday was a bit slow in terms of interesting sessions, but there was a lot going on in the hallways, the expo hall, the meetings over lunch, and so on.

Technorati Tags:, , , , , , , , , ,

You might also like:

  1. Like it or not, it is the MySQL Conference and Expo
  2. Sessions I want to see at the MySQL Conference
  3. Remember to sign up for MySQL Conference and Expo!
  4. My presentations at the 2008 MySQL Conference and Expo
  5. How pre-fetching relay logs speeds up MySQL replication slaves

Get a free sample chapter of High Performance MySQL Second Edition

If you’re at the MySQL Conference and Expo, you can get a free sample chapter of the upcoming High Performance MySQL Second Edition. Just go to the exhibition area. As you go through the doors, take an immediate left and look for the sample chapter on O’Reilly’s table. It’s a rough draft and contains typos and my incredibly crude drawings instead of those that will go into the final book, but it should serve to give you an idea of the book’s depth and scope. Kudos to Andy Oram, our editor, who was able to get these done for us on very short notice.

Technorati Tags:,

You might also like:

  1. Progress on High Performance MySQL, Second Edition
  2. Coming soon: High Performance MySQL, Second Edition
  3. Progress on High Performance MySQL Backup and Recovery chapter
  4. High Performance MySQL Second Edition Schedule
  5. Progress report on High Performance MySQL, Second Edition

MySQL Conference and Expo 2008, Day One

Today is the first day at the conference (aside from the tutorials, which were yesterday). Here’s what I went to:

New Subquery Optimizations in 6.0

By Sergey Petrunia. This was a similar session to one I went to last year. MySQL has a few cases where subqueries are badly optimized, and this session went into the details of how this is being addressed in MySQL 6.0. There are several new optimization techniques for all types of subqueries, such as inside-out subqueries, materialization, and converting to joins. The optimizations apply to scalar subqueries and subqueries in the FROM clause. Performance results are very good, depending on which data you choose to illustrate. The overall point is that the worst-case subquery nastiness should be resolved. I’m speaking of WHERE NOT IN(SELECT…) and friends. It remains to be seen how this shakes out as 6.0 matures, and what edge cases will pop up.

The Lost Art Of the Self Join

This was just great. Among many other things, Beat Vontobel showed how a Su Doku can be solved entirely with declarative queries: a very large self-join query against a table of digits and a table of the board’s initial state. I had been promoting this session because last year’s was so very good. I can’t wait to see what he comes up with for next year. Can he find another creative idea? Time will tell.

He wasn’t able to solve a 9×9 puzzle with MySQL because of the limitation on the number of joins, but PostgreSQL had no trouble doing it.

EXPLAIN Demystified

This was my session, of course. (Slides will be on the O’Reilly conference site, if they aren’t already). It went great, I thought. The room was full and people were standing in the back of the room and in the door. The questions came fast and furious; all really good questions. I think we ended up exploring a lot of the MySQL query execution method, strengths, and weaknesses by the time we were through. And I gave away all the remaining Maatkit t-shirts. Hopefully the people who took them will wear them tomorrow and the conference will be sea of deep, rich red shirts.

Someone did an audio recording of the session, but I don’t recall who it was.

Investigating InnoDB Scalability Limits

This session was given by Peter Zaitsev (disclosure: I now work for Percona, the company he co-founded). Peter and Vadim Tkachenko spent a lot of time over the last weeks and months running a dizzying array of benchmarks on MySQL 5.0.22, 5.0.51, and 5.1.24 (if I recall the versions correctly). They were able to show InnoDB’s scaling patterns for a number of different micro-benchmarks on a variety of configurations. If you didn’t attend, please look up the slides if you care about InnoDB performance. A lot of work went into the benchmarks — a lot of work. The slides should be on the conference website or on our blog, http://www.mysqlperformanceblog.com/.

Replication Tricks and Tips

Lars Thalmann and Mats Kindahl gave this session. At a high level, I’d say it was a run-down of all the different ways you can use MySQL replication. Replication is really a flexible tool, and they covered a large array of the most important ways you can use it to achieve different purposes. Many of the techniques they mentioned are implemented by various tools in Maatkit. A couple of the others are implemented in MySQL Master Master Manager and MySQL Semi Multi-Master tools. Don’t re-code these! You can save weeks of work and get quality code by using the pre-built tools. (I built Maatkit, so I know exactly how tricky it is to get some of these things right.)

BoF Sessions

I dropped in on a few BoF sessions, including the Sphinx one and the PBXT/Blob Streaming one. (Keep an eye on the PrimeBase folks — they are up to great things.) Ronald Bradford protected me from those who wanted to get me drunk. Hint: it’s really easy… I have to say, though, Monty’s black vodka was amazing.

Speaking of Blob Streaming, Paul McCullagh and I were talking earlier in the day about the project’s name, MyBS. This has been smirked about a few times. I think it’s a great name, because after all my initials are BS (I usually insert one of my four middle names in to alleviate this problem, but I digress). The conversation went like this:

Me: I like it. My initials are BS.

Paul: BS actually means British Standard, so it can’t be bad.

Me: Better than American Standard. That’s a toilet.

We also debated the merits of watching the original move The Blob. It’s a classic. It must be good.

Technorati Tags:, , , , , , , , , , , , , , ,

You might also like:

  1. Like it or not, it is the MySQL Conference and Expo
  2. Speed up your MySQL replication slaves
  3. MySQL Conference and Expo 2008, Day Three
  4. Baron Schwartz on a podcast at MySQL Conference and Expo 2008
  5. MySQL Conference and Expo 2008, Day Two

MySQL Community Member of the Year

MySQL just gave me an award at this morning’s keynote, along with Sheeri Kritzer Cabral (for the second year in a row!) and Diego Medina, for my code contributions to the MySQL community, specifically Maatkit, which makes it easier to make MySQL reliable, fast, and robust. It’s an honor to be recognized. And while I could leave it at that, I’d like to say a word or two more.

The economy, community, and ecosystem that’s building around Free Software can often be very rewarding financially. This is a great motivation; being rewarded for your efforts is one of the chief virtues of a culture of entrepreneurship, along with the idea that to try and fail is just as noble as to succeed. But I find that isn’t enough. If I were only rewarded financially and with recognitions such as this morning’s, I would quickly become bankrupt at a deeper level. I would become focused on external measures of success, such as accolade and wealth.

That’s why it’s so important to be of service to others and to work for the good of all. This is one of the strongest counterbalances for me. It helps keep me humbler and more open.

In the end, Free Software is all about this. It reminds me always that we are all interconnected, and that to work for your highest good is to work for my own.

I believe we all need at least these three things deeply:

  • To chart your own course in life.
  • To be of service to others.
  • To make the most of what you have.

Does proprietary software offer you the chance to do this? No, it does not. It makes you beholden and dependent, not free. To pursue these three goals to their maximum extent you need freedom. “Make the most of what you have” doesn’t imply that you have to just accept what’s given to you; you can also take some time to see what your choices are, and choose something that gives you more freedom if possible. That’s what I did years ago when I moved away from using proprietary software.

I hope you’ll give this a try yourself: contribute what you build internally in your company, and put in the extra effort to make it really high quality and useful for everyone. This is how Maatkit started. Don’t wait for others to make it happen: chart your own course.

This morning’s award is most important to me because it reinforces that I’m serving others well.

Technorati Tags:,

You might also like:

  1. The truth about MySQL Community and Enterprise
  2. Announcement: Xaprb scripts are re-licensed
  3. Maatkit t-shirts are here
  4. Four companies to sponsor Maatkit development

A different angle on the MySQL Conference

There are quite a few business angles you might see only if you’re here at the conference, and you won’t get from blogs. For example, let’s take a look at the contents of the shoulder bags they hand out with your registration. (This is only a partial list.)

  • SnapLogic’s flyer gets it right: their system is compatible with “GNU Linux.” Hooray, a commercial company acknowledging the GNU operating system for what it is!
  • MySQL Enterprise’s flyer has three big bullet points: MySQL Load Balancer, MySQL Connection Manager, and MySQL Enterprise Monitor Query Analyzer. The first two look like they’re probably built on MySQL Proxy. The last has a visual explain plan feature, which according to an elevator conversation is not yet built. I’ll stop by their booth and see. As you may know, Maatkit has provided a tool (which is designed for integration into other tools) that shows a visual explain plan for a long time.
  • There’s an issue of Linux Journal, which does not get the GNU part right. And it has no articles about MySQL. Off-topic! Discarded!
  • Infobright’s flyer says they can load data nearly real-time. I don’t know how you read it, but to me that says “can’t quite keep up with how fast you generate data.” So… what good can it possibly be, right?
  • The conference bag itself has Zmanda’s logo on the side.
  • Webyog’s flyer has one side for SQLyog, and one for MONyog. Each side takes the sparse but visually appealing approach of shiny icons to present a feature list. My favorite is the “Find slow SQL” turtle.
  • JasperSoft’s flyer has soothing, professional blues and rich reds. It makes them look very trustworthy. (I’m not being snarky.) And they have lots of nice whitespace. It’s a little bit of a different look.
  • Kickfire’s marketing department is really on the ball. I’ve seen a large number of flyers and other materials from them (online and offline) and they just changed their name and created a new logo and look-and-feel a short time ago. How do they do it so fast?
  • O’Reilly has a bunch of half-sized flyers for their conferences. We should have asked them to throw in one about our upcoming book, the second edition of High Performance MySQL. Alas, opportunity lost. By the way, stop by the bookstore and grab a copy of the sample chapter.
  • Zmanda, not content with stamping the outside of the bag, has a half-flyer inside it too, plus a chance to win a Digital Rebel to lure you to their booth. If you’re doing backups the way a lot of people seem to, you might want to stop by their booth anyway…
  • There’s a CD for a free trial of WinSQL. But the CD case doesn’t say what the

Sorry. I have a short attention span.

Technorati Tags:, , , ,

You might also like:

  1. High Performance MySQL 2nd Edition is in production
  2. High Performance MySQL, Second Edition: Advanced SQL Functionality
  3. Progress on High Performance MySQL Backup and Recovery chapter