Xaprb

Stay curious!

Archive for July, 2011

Planned change in Maatkit & Aspersa development

without comments

I’ve just sent an email to the Maatkit discussion list to announce a planned change to how Maatkit (and Aspersa) are developed. In short, Percona plans to create a Percona Toolkit of MySQL-related utilities, as a fork of Maatkit and Aspersa. I’m very happy about this change, and I welcome your responses to that thread on the discussion list.

Written by Xaprb

July 6th, 2011 at 11:26 pm

I’m speaking at Surge 2011

without comments

I’ll be speaking at Surge again this year. This time, unlike last year’s talk, I’m tackling a very concrete topic: extracting scalability and performance metrics from TCP network traffic. It turns out that most things that communicate over TCP can be analyzed very elegantly just by capturing arrival and departure timestamps of packets, nothing more. I’ll show examples where different views on the same data pull out completely different insights about the application, even though we have no information about the application itself (okay, I actually know that it’s a MySQL database, and a lot about the actual database and workload, but I don’t need that in order to do what I’ll show you). It’s an amazingly powerful technique that I continue to find new ways to apply to real systems.
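To make the idea a little more tangible, here is a minimal sketch in Python of what working from timestamps alone can look like. It is not the method from the talk, only an illustration under stated assumptions: the capture has already been reduced to `(timestamp, direction)` pairs for a single connection, requests and replies strictly alternate, and the names `Packet`, `response_times`, and `summarize` are hypothetical. The concurrency figure comes from Little's law (concurrency = throughput × mean response time).

```python
from collections import namedtuple

# Hypothetical record type: one captured packet on a single TCP connection.
# direction is "in" (client -> server) or "out" (server -> client).
Packet = namedtuple("Packet", ["timestamp", "direction"])


def response_times(packets):
    """Pair each inbound request with the first outbound reply that follows it.

    Assumes one connection and a strict request/reply pattern; a multi-packet
    request is timed from its first packet.
    """
    times = []
    pending = None  # arrival time of the request currently being served
    for p in sorted(packets, key=lambda p: p.timestamp):
        if p.direction == "in":
            if pending is None:
                pending = p.timestamp
        elif p.direction == "out" and pending is not None:
            times.append(p.timestamp - pending)
            pending = None
    return times


def summarize(times, interval):
    """Return (throughput, average concurrency) over an observation interval.

    Average concurrency is total busy time divided by the interval, which is
    Little's law: concurrency = throughput * mean response time.
    """
    throughput = len(times) / interval
    concurrency = sum(times) / interval
    return throughput, concurrency


# Invented sample data, for illustration only.
sample = [
    Packet(0.0, "in"), Packet(0.5, "out"),
    Packet(1.0, "in"), Packet(1.25, "out"),
]
times = response_times(sample)
print(times)
print(summarize(times, interval=2.0))
```

Nothing here inspects packet contents, which is the point: arrival and departure times are enough to get response time, throughput, and concurrency.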

Take a look at the other speakers too — it is an impressive lineup. I hope you can attend. Last year’s show was a great event.

Written by Xaprb

July 6th, 2011 at 8:55 pm

Posted in Conferences, Scalability, SQL


Measuring open-source success by jobs

with 8 comments

It’s notoriously hard to measure the usage of open-source software. Software that’s open-source or free can be redistributed far and wide, so the original creators have no idea how many times it’s installed, deployed, or distributed. As a proxy, we often use downloads, but that’s woefully inadequate.

I’ve recently begun trying to figure out how many job openings mention various open-source projects. I think this might be a better metric because it’s driven by the end result (usage) rather than by intermediate processes (downloads, etc.). It seems likely that usage and demand for skilled people track each other reasonably well.

To be more concrete, I’ve been watching RSS feeds from job posting aggregators for several alternative versions of MySQL: Percona Server, MariaDB, and Drizzle. It appears that Percona Server is by far the most in-demand in terms of job skills. (I haven’t seen a job posting for the others at all, so far.)
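The counting itself is simple. A rough sketch, assuming the feed entries have already been fetched and reduced to plain-text titles or descriptions (the `count_mentions` function and its inputs are hypothetical, not the actual setup used here):

```python
import re
from collections import Counter


def count_mentions(postings, projects):
    """Count how many postings mention each project name at least once.

    Matching is case-insensitive and on whole words, so a posting is counted
    once per project no matter how often the name appears in it.
    """
    counts = Counter({name: 0 for name in projects})
    for text in postings:
        for name in projects:
            pattern = r"\b" + re.escape(name) + r"\b"
            if re.search(pattern, text, re.IGNORECASE):
                counts[name] += 1
    return counts
```

Called with a list of posting texts and `["Percona Server", "MariaDB", "Drizzle"]`, it returns a per-project tally. It says nothing about duplicate postings across aggregators, which would need to be filtered out first.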

On the other hand, my sample is skewed; I think Percona Server is better known in America, but MariaDB might be more visible in Europe. And I’m not sure that the sample data set is large enough to be statistically significant. Percona Server jobs are utterly dwarfed by MySQL jobs.

There are other flaws in my method: some software doesn’t need as much manpower to run as other software does. I would say that, given an equal number of WordPress and Drupal websites, more of the Drupal sites will be trying to hire experts to manage them. So nothing is an apples-to-apples comparison.

What do you think about this metric and its merits or drawbacks? Is there a better way to figure out how much adoption a project really has?

Written by Xaprb

July 4th, 2011 at 8:15 pm