Archive for May, 2008

MySQL: Free Software but not Open Source

The title of MySQL’s website states that they are the world’s most popular open-source database. This is false; MySQL is not an open-source database. That assertion is a fact, not an opinion.

MySQL is Free Software, licensed under the GNU GPL. People frequently use the two phrases “Free Software” and “Open Source Software” as synonyms, but there are very large, very important differences.

The difference between Free and Open Source

Open Source is much more of a development methodology than a philosophical standpoint. The first thing on the Open Source Initiative’s website is this introduction:

Open source is a development method for software that harnesses the power of distributed peer review and transparency of process. The promise of open source is better quality, higher reliability, more flexibility, lower cost, and an end to predatory vendor lock-in.

In contrast, Free Software is not about development practices at all. You can develop Free Software any way you like; what makes it Free is the license. Free Software is about protection of rights and freedoms. It is a moral and ethical platform. The promise of Free Software is quite different from “better quality, higher reliability…” Free Software is not about quality or reliability.

So why is MySQL not Open Source? Simple. Sun/MySQL uses a closed development model. Nobody can get code in from the outside without accepting a Contributor License Agreement (CLA) which requires surrendering important rights, including ownership of the code. Sun/MySQL controls the code absolutely and maintains ownership of it. And even people who have signed the CLA report their patches stagnating — often for years — and still not being accepted into the source. This is not Open Source.

Open Source software is usually maintained, owned, and controlled by a decentralized network of peers. This is exactly the opposite of MySQL. You cannot get more opposite. The differences are often summarized as the cathedral versus the bazaar. I’m not sure this analogy always holds or is always useful and accurate, but it’s a helpful piece of common vocabulary.

Why this matters

This matters because both Freedom and an open development model are necessary to an empowered, enlightened, free society. Licensing isn’t the only thing that matters: ownership matters, too. So does control.

Google’s patches to MySQL are a good example of excellent code with many simple, highly useful features that have not been included in the official MySQL distribution. And there are no signs of that changing, as far as I can see.

I’m not the only one who notices this. Here’s another quote:

With all due respect to Marten, there is a significant difference between the captive open source development model for MySQL and the community open source development model for PostgreSQL.

If this interests you, perhaps you would like to join the discussion on the oursql-sources Google group.

Maatkit in RHEL and CentOS

Update: Karanbir says “Just one thing to keep in mind is that we dont want too many people using it from the Testing repository - we only need enough feedback to move it from testing to stable ( and to be honest, there are already 8 people who have said yes it works - so move to stable should happen within the next 24 - 48 hrs ). Once the package is in stable, users on CentOS4 and 5 wont need to do anything more than just ‘yum install maatkit’ and it will install for them.”

At least one person (Karanbir Singh) is working to get Maatkit into the CentOS repositories, and I believe there might be movement towards RHEL also. From an email to the Maatkit discussion list a little while ago:

I am in the process of getting maatkit into the CentOS-Extras repositories. The first step for that is that every package needs to go into a CentOS-Testing repo and feedback is required from the project and users on its stability / usability and packaging quality.

maatkit-1887 is now available in the CentOS-Testing[*] repo’s and as soon as we can get some feedback ( needs to be 5 different people, none of whom can be CentOS Developers ) - the packages will move into the main repository so that all users can get access.

I’d appreciate it if people on this were able to give those packages a go and let me know if there are any issues. You can leave feedback :

  • via the maatkit-discuss mailing list (http://sourceforge.net/mailarchive/forum.php?forum_name=maatkit-discuss)
  • on the centos-devel list ( http://lists.centos.org ) or
  • http://bugs.centos.org/ against category ‘maatkit’

[*] : Info about the Testing repo and howto set it up on your machine : http://wiki.centos.org/Repositories

If you’re interested in getting Maatkit into these repositories, please take a moment and give the requested feedback. I can’t do it because it would be a conflict of interest for the main developer to assert that the code is stable and usable.

High Performance MySQL Second Edition Schedule

I just got the rest of the production schedule from the publisher, plus the PDF files for quality control, for our upcoming book. (Now I have to proofread the whole book!) This is the first time I’ve seen the entire production schedule. The book is supposed to go to the printer in the first week of June. I don’t know what the on-the-shelf date will be, but I think very shortly after that. The publisher has promised that it’ll physically be on sale at Velocity.

I also took a peek at the PDFs. Without the appendixes, the last page of Chapter 14 (Tools for High Performance) is page 604. The appendixes bring it to 660 pages. That’s real material, not including tables of contents and indexes. So my estimate (620) was not too far off.

660 pages is not bad, considering that the contract was for 384 pages.

Another note: the marketing materials for the book emphasize that it covers MySQL 5.1. While this is true, I want to point out that we took a real-life approach: we write about what we’ve seen in the real world, and 5.1 is not yet as widely deployed there. The book’s real value, as far as version-specific content goes, is its tremendous depth and breadth in MySQL 4.1 and 5.0. These have been “out there” for a long time, and among the four of us we’ve seen about every conceivable scenario with them. So you’ll get a lot of insight about current, production-ready, widely-used versions. Let the other guys speculate; we just report the facts. It’s not like there’s any shortage of things to say about 5.0, right?

You have the right to see code samples in an interview

Joel Spolsky writes about 12 steps to better code, and elsewhere about how candidates should write code in interviews.

The reverse conditions are true, too. If you’re a candidate, you should evaluate the employer against the 12 steps, and you should also see code samples. How else will you know what you’re getting into? You really have the right to do this, and you should exercise the right. If you don’t, you’ll get stuck in a crap job maintaining crap code. [dramatic voice] It happened to me.

In many companies, you can see code they’ve released as open-source. (The fact that they’ve done this says a lot about them.) But in others, you’re going to need to surprise someone and say “pick some code that’s not sensitive and show it to me.” Something simple, like the HTML for the search form on their website, or a utility to do some systems administration task. Any company is going to have a lot of code like this that they can show you.

The other approaches I see are to ask about it, assume, or ask the interviewer to write some code for you.

  1. Asking is a valid approach. If you see hesitation, or if someone says “well, it’s not as nice as we’d like, and we’re hoping you will offset that,” then run, don’t walk, is my advice. If you’re reading this as you consider your first job out of college or something, I strongly suggest not getting a job with a company that wants you to improve the way they do things. You should be learning from them, not vice versa.
  2. You can also assume. “Oh, they use Perl? Nevermind.” That’s a stupid approach. Really. Is it acceptable to judge people’s character by the color of their skin? Then why would you judge their code by the language? In all seriousness, I have actually written very elegant, clean VBScript. And I mean, good-quality code by anyone’s standards. It’s hard in VBScript. It’s easy in Perl if you follow the Dog, which is a sign of great intelligence. Think about it this way: people who write beautiful Perl are people you should be eager to work with; they are rocket scientists. You will be the dumbest person in the room, and that should make you happy.
  3. I’ve never asked an interviewer to write code for me. Let me know how it works out for you.

Summary of beCamp 2008

Yesterday I went to beCamp 2008 along with four roomfuls of other people interested in technology (perhaps close to 100 people total). The conference was a lot of fun. Not everything went as planned, but that was as planned. This was an Open Spaces conference and I thought it worked very well. From an email Eric Pugh sent:

Basically it all boils down to:

Open Space is the Law of Two Feet: if anyone finds themselves in a place where they are neither learning nor contributing they should move to somewhere more productive. And from the law flow four principles:

  • Whoever comes are the right people
  • Whatever happens is the only thing that could have
  • Whenever it starts is the right time
  • When it’s over, it’s over

From Hadoop to Bang-Splat

I used the law of two feet a time or two. In fact, the first session I wanted to go to, which was about Hadoop and MapReduce, had no knowledgeable attendees. Someone overslept. OK, that’s the way it goes: move along.

From there I went to a session about Unix command-line productivity. Most of the sessions I saw were traditional in that they had one person standing up talking and many people sitting and listening, but not all. This one had several clever command-line gurus mentioning their favorite power tips.

I learned about bang-splat and bang-dollar. The bangs have always gotten me in Bash: I avoid them because I’ve never felt like reading the Bash man page section on them. (Am I too lazy, or not lazy enough?) So it was great to hear some people say “bang-splat and bang-dollar are great” and then explain them. That was easy for me, and now I know how they can be useful to me.

This problem-first type of tip is great for me: tell me the problem, then how to solve it, rather than telling me what the solution is and leaving me guessing what kinds of problems I can solve with it. (The Bash man page is solution-first.)

In case you’re wondering, bang-splat substitutes the arguments to the last command, and bang-dollar substitutes the last argument of the last command. So, instead of this:

$ touch file1 file2 file3
$ rm file1 file2 file3

I can do this:

$ touch file1 file2 file3
$ rm !*
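
Bang-dollar is handy in the same way when only the last argument needs repeating, for example when creating a directory and then changing into it. The following is a minimal sketch, assuming Bash; the directory names and the `scratch` variable are made up for illustration, and the two `set -o` options are there only because Bash leaves history expansion off in scripts.

```shell
#!/usr/bin/env bash
# Assumption: Bash. History expansion is off by default in scripts, so it is
# enabled by hand here; at an interactive prompt it is normally already on.
set -o history -o histexpand
scratch=$(mktemp -d)
mkdir -p "$scratch/reports/2008/may"
cd !$
set +o history +o histexpand
pwd
```

At a prompt you would type only the `mkdir` and `cd !$` lines. Bash shows the expanded command before running it, so you can see what the bang-dollar turned into.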

There were lots of other nice tips too.

MySQL Performance

I ended up doing a talk on MySQL performance basics. I had no idea what the audience was looking for, so I winged it. I did make some slides, but most of the talk isn’t on the slides. You can get the slides from Percona’s slide page. It seemed to be useful to the folks attending, who had a wide variety of experience and knowledge about MySQL.

Cloud Computing

This session began with a demo of how to create an entire application stack in a few minutes with Cohesive Flexible Technologies. Someone else then demoed a similar thing using RightScale. rPath’s Jeff Uphoff was also in the room, but we didn’t get to see a demo of that. During this session the talk turned to various topics, including some of the ones I had wanted to hear about in the Hadoop session.

Lunch

Lunch was catered Indian food provided by the Rimm-Kaufman Group. Yum.

Large Scale Storage

This session was sort of a round-table. The two people who talked the most were Josh Malone from the National Radio Astronomy Observatory and someone from the Library of Congress, both of whom have a lot of storage needs they are unsure how to meet. Some people from UVA’s library were there as well, but I didn’t ask what they were working on.

This reminded me a lot of a recent keynote Jacek Becla gave at another conference. He’s with the Stanford Linear Accelerator Center, who are going to be generating a lotta data pretty soon.

High Availability Linux

This one started off with more from Josh Malone, who demoed Nagios briefly and then talked about his storage and backup systems. He uses BackupPC, which sounds pretty neat and very smart. We then talked about some of the things he’s looking into doing, with audience suggestions to look into shared storage or DRBD. We also looked briefly at UltraMonkey (it looks like it’s stagnating, though) and the Linux HA project.

Google App Engine

Finally, someone showed us a calculator application they’d built on Google App Engine, including the code and talking about the data model somewhat. It looks like a neat idea, but the lock-in worries me, a sentiment that was voiced by many others in the room.

News flash: MySQL 5.1 has zero bugs

Zack Urlocker says MySQL 5.1 has zero bugs. He may have been misquoted, or quoted out of context, but there it is. I’ll quote enough of it that you can’t take it out of context twice:

Mickos also said MySQL 5.1 has upgraded its reliability and ease of use over 2005’s v5.0.

“Now we can admit it, but this version is much improved over 5.0, which we weren’t totally happy with,” Mickos confided.

He reported that more than 1,300 bugs (997 in 2007, 386 so far in 2008) have been fixed in v5.1, and that, according to standard DBT2 benchmarks, the performance of v5.1 is 10 to 15 percent better than the previous version.

“This version now has zero bugs,” Urlocker told eWEEK.

You can check for yourself at the MySQL bug statistics page.

Of course it’s not true. But what did Zack really say, I wonder?

Come to beCamp 2008

I’m going to be at beCamp 2008, the followup to the first beCamp, which I sadly missed.

beCamp is a BarCamp un-conference. Tonight was about meeting, greeting, and throwing ideas at the wall to see which ones stick. Literally. We stuck pieces of paper on the wall with our ideas — things we can either talk about or want to hear about — and then scratched our votes on them to see which are popular.

I live and breathe MySQL for a decent part of the day, so I hesitated, but then stuck “MySQL Performance” on the wall. It got quite a few votes, so I assume I will be giving a talk on MySQL performance basics at some point during the conference. (The exact schedule is probably being determined right now, in my absence, but I’m so tired right now that I’ll just take my chances on it not being at 8:00 AM tomorrow.) [edit: I just checked the website and there won’t be anything before 9:00, and the schedule will be determined tomorrow. I did say I’m tired, right?]

See you there!

PS: if you want to meet some of my colleagues from my former employer, the Rimm-Kaufman Group, they’ll be there too, wearing the “We’re Hiring” t-shirts. They’re hiring, by the way.

Pre-Order High Performance MySQL Second Edition

If you’re waiting for High Performance MySQL Second Edition to hit the shelf, you’re not the only one. I am too! I can’t wait to actually hold it in my hands.

But you don’t have to wait idly. No, not at all! You can pre-order it and then you’ll get it as soon as possible. Plus your pre-order will help the publisher figure out how much demand there is, so it doesn’t sell out and make you wait for your own copy.
