Archive for July, 2008

What is LinkedIn’s main database server?

Someone who should know told me that LinkedIn runs its main application on Oracle. So when I saw the press release about MySQL being their database, I read it carefully, and it is not very specific about exactly what MySQL is used for. Depending on how you read it, you could argue that it leaves open the possibility that the main application database is not MySQL, and that the MySQL deal is for something peripheral.

Now, this is nothing but a rotten rumor and I will probably burn in hell for spreading it, but I’d like it to be debunked if it’s false. What is LinkedIn’s main database server? Anyone have the provably correct answer?

PS: I see that LinkedIn is “seeing daily downloads of approximately 200 million.” I didn’t know it was downloadable. I’ve been missing out! Where can I download it?


MySQL manual gets improved searching

Hooray! The MySQL reference manual has a new search system. It now uses a Google Appliance and the results should be a lot better. The old system was not very helpful. It used to break config variables into multiple words and search on them individually and give a billion results I didn’t care about. I’ve just tried to search for some things like key_buffer_size and got results I think are very useful.

I love the MySQL manual. It is a great example of quality software documentation. As someone recently mentioned, it is not released under a Free license though — that would be a great improvement, too!

When did this change happen, by the way? Maybe it’s been there for a while and I just missed it because I grew accustomed to using Google search instead.

Edit: I actually would suggest a change to this search, too. It’s the same change I have suggested in the past: put the document title in front of the manual’s title. Instead of “MySQL :: MySQL 5.1 Reference Manual :: 7.4.6.6 Restructuring a Key …” I would rather see “Restructuring a Key Cache :: MySQL 5.1 Reference Manual”. (Note that the title gets truncated as-is, and it’s hard to see in the browser’s titlebar/tab/system taskbar).
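To make the suggested title change concrete, here is a minimal sketch in Python. The function name `reorder_title` and the assumption that titles are always joined with " :: " are illustrative only; this is not how the manual's pages are actually generated.

```python
import re


def reorder_title(title: str, sep: str = " :: ") -> str:
    """Move the section name to the front and drop the site prefix.

    Assumes (hypothetically) a title of the form
    "<site> :: <manual> :: <section number> <section name>".
    Titles that do not have at least three parts are returned unchanged.
    """
    parts = title.split(sep)
    if len(parts) < 3:
        return title
    manual = parts[1]
    section = sep.join(parts[2:])
    # Strip a leading dotted section number such as "7.4.6.6 ".
    section = re.sub(r"^\d+(\.\d+)*\s+", "", section)
    return f"{section}{sep}{manual}"


print(reorder_title(
    "MySQL :: MySQL 5.1 Reference Manual :: 7.4.6.6 Restructuring a Key Cache"
))
# prints: Restructuring a Key Cache :: MySQL 5.1 Reference Manual
```

With the section name first, truncation in a tab or taskbar cuts off the manual's name rather than the part that tells pages apart.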


Using BASE instead of ACID for scalability

My editor Andy Oram recently sent me an ACM article on BASE, a technique for improving scalability by being willing to give up some other properties of traditional transactional systems.

It’s a really good read. In many ways it is the same religion everyone who’s successfully scaled a system Really Really Big has advocated. But this is different: it’s a very clear article, with a great writing style that really cuts out the fat and teaches the principles without being specific to any environment or sounding egotistical.

He mentions a lot of current thinking in the field, including the CAP principle, which Robert Hodges of Continuent first turned me on to a couple of months ago. It has been proven formally, though I have not read the proof myself.

One of the most important concepts he advances is giving up the illusion of control. As programmers and DBAs, I think we may tend to like control too much. Foreign keys are a perfect example. I think the point here is that these things make you feel safe, but they don’t really make you safe. Just as with so many things in life, recognizing our inability to really control the systems we build is key to working with their strengths — instead of trying to bind them with iron bands.

Another great point is idempotency. This is a great way to help avoid problems with MySQL replication, by the way. I’ll leave the “why” as an exercise for the reader, but let me just point out that the file MySQL uses to remember its current position in replication is not synced to disk, so it will almost certainly get out of whack if MySQL dies ungracefully. (Google has solved this problem.)
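One possible reading of the "why", sketched in Python rather than anything MySQL-specific: if the saved replication position is stale after a crash, the replica replays events it has already applied. The event functions and the `replay_after_crash` helper below are hypothetical and exist only to show the difference between a relative and an absolute change.

```python
def apply_events(state, events, start):
    """Replay events[start:] against state, as a replica might after a restart."""
    for event in events[start:]:
        event(state)
    return state


def increment(state):
    # Non-idempotent: a relative change, so applying it twice changes the result.
    state["balance"] += 10


def set_balance(state):
    # Idempotent: an absolute change, so applying it twice is harmless.
    state["balance"] = 110


def replay_after_crash(event):
    """Apply one event, then 'crash' before the position is saved and replay it."""
    state = {"balance": 100}
    events = [event]
    apply_events(state, events, 0)  # applied, but the new position was never saved
    apply_events(state, events, 0)  # restart from the stale position 0
    return state["balance"]


print(replay_after_crash(increment))    # prints: 120 (the event was double-applied)
print(replay_after_crash(set_balance))  # prints: 110 (replay did no damage)
```

The idempotent version gives the same answer no matter how many times the event is replayed, which is what makes a stale position survivable.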

A highly recommended read — worth more than most case studies about how specific companies have scaled their specific systems.


Book Review: Building powerful and robust websites with Drupal 6


I just finished reading Building Powerful and Robust Websites with Drupal 6 (this title on Packt’s site). I’ve been working on a website powered by Drupal, and though it was obvious that Drupal is very flexible and capable, I was getting pretty lost in the website. So I wanted to read a book that would explain it to me.

Unfortunately, this book didn’t help me much. I’d give it 2 out of 5 stars.

Overall the content of the book is not about what the title promises. In fact, I’d say the title ought to be something like “an introduction to…” or “basic concepts with…” instead. Unfortunately, these titles don’t really capture the spirit of the book either. It’s hard to actually explain what the book is about, and I think that’s its main problem: the book has an identity crisis. It can’t decide what it is really trying to say, and thus it seems to go from one topic to another without a really strong direction. (It does have some direction and organization, but it could be stronger.)

As a result I now know almost nothing about Drupal that I didn’t discover by trial and error over perhaps 8 hours or so working with a live site.

The book is 380 pages and I think it ought to be a lot shorter to cover the material it covers. The writing is far too wordy, so whole paragraphs sometimes end up saying nothing at all. For example, “reduces the amount of work required later down the line” should just be “saves work later.” A lot of what’s included is just irrelevant. For example, the preface spends a lot of time telling me that the Internet is an exciting revolution, yada yada:

The Internet is arguably one of the most profound achievements in human history. It has become so pervasive in our lives that we hardly even notice it—except when it happens to be unavailable! It’s one of those things that make you sit back and wonder how people got along without it in the old days. Without the ability to surf the Internet to order groceries, do our banking, book flights and make travel arrangements, meet friends, meet partners, download music and videos, study, run businesses, trade shares, run campaigns, express views, share ideas, learn about other people… where would we be?

There’s a lot of that throughout the book, unfortunately. See if you can understand this:

Chapter 8 gives you a run down of how attractive interfaces are created in Drupal through the use of themes. As well as discussing briefly some of the considerations that must be taken into account when planning your website and ends off by looking at how to make important modifications to your chosen theme.

And on page 4: “little to now experience…” and later, a really confusing one: “How you deal with file system settings really depends on what type of content you use to visualize your site.”

The typesetting is also hard to read. For example, it’s set ragged-right, but even worse, there are many places where lines are broken far shorter than they need to be. It looks to me like they just put Word documents onto paper with no real typesetting.

When the content actually picks up momentum, it covers a bunch of things I could do without, such as explanations of what open source is, and Drupal’s licensing. As I said, this book has an identity crisis; these are important topics, but not in this book. In chapter 3, I begin to see some stuff about how to use Drupal. But then it jumps right into how to set up forums. Unfortunately I really missed explanations of the concepts. The concepts are too intermingled with the examples, which I didn’t understand because I was only reading the book, not actually working through the examples. I got lost very quickly and never found myself again.

This book is not skimmable: the only way to get information out of it is to follow along with the author. I mean installing your own copy and literally performing every action the author performs so you can see it in action. For example, on page 69 there’s a lot of talk about containers, and then it says “click Add container to bring up a page…” Wait, I thought that’s what we were doing already? I’m sure I read it wrong, but unless I’m willing to build myself a website on my laptop to follow along, I can’t tell. I really wanted the book to explain things to me without making me work through the examples. I’m sure it’s hard to do, but it must be possible.

At the end of each chapter, there’s a summary. They are too congratulatory. “This chapter provided a good grounding in the basics of controlling access to your site’s content.” I disagree. Then later, “From learning about what considerations must be taken into account when planning a website’s look-and-feel, to making changes to the code, this chapter has provided a firm grounding in the fundamentals of working with Drupal interfaces.” Again, I just didn’t get that out of it when I read it.

So what was I looking for from the book? I wanted to understand what Drupal’s special lingo means, really. I mean, if you’ve worked with Drupal, you’ve seen all the words it reinvents for its own special uses — taxonomy, hierarchy, term, vocabulary, node, field definition, content type, and on and on. Each of these means something really particular in Drupal — not at all what we mean in common English. I am still confused about all this: what each of these things really is, how each relates to the other, why I see them in various contexts, when I should use one instead of another, and so on.

In summary, I’d say that I’m now even more convinced that Drupal is almost ridiculously flexible and powerful, but this book just didn’t help me much. It is a noble effort but alas, I needed more from it.

PS: some people have asked me about the grammar rules and regular expressions I used while writing my own book. This book is a great example of how specific those are to my own writing. If I were this book’s author, I would have a rule to catch “make use of,” “head on over,” and “go ahead.” There is also a lot of “of course” that substitutes for a real explanation on topics where I really wanted more help understanding.
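For readers wondering what such a rule might look like, here is a minimal sketch in Python. It is not the actual rule set used for any book; the phrase list is taken from the examples above (with "of course" added), and the names `FILLER_PHRASES` and `find_filler` are made up for illustration.

```python
import re

# Phrases to flag; a real list would be tuned to one author's habits.
FILLER_PHRASES = ["make use of", "head on over", "go ahead", "of course"]


def _phrase_pattern(phrase):
    """Build a pattern for one phrase that tolerates any run of whitespace."""
    return r"\s+".join(re.escape(word) for word in phrase.split())


PATTERN = re.compile(
    r"\b(?:" + "|".join(_phrase_pattern(p) for p in FILLER_PHRASES) + r")\b",
    re.IGNORECASE,
)


def find_filler(text):
    """Return (line_number, phrase) pairs for each filler phrase found."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((lineno, match.group(0).lower()))
    return hits


sample = "Go ahead and make use of the menu.\nOf course, it works."
print(find_filler(sample))
# prints: [(1, 'go ahead'), (1, 'make use of'), (2, 'of course')]
```

A checker like this only points at candidate lines; deciding whether the phrase is hiding a missing explanation is still up to the writer.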


Are you sure you’re reading the second edition of High Performance MySQL?

I have been getting a lot of comments and errata from people who seem to be mistakenly buying the first edition and believing it’s the second edition. A lot of the blame for this probably rests with Amazon, who did not distinguish between the two editions at all until the editor and I (among others) leaned on them persistently for about 6 weeks. I think some people are buying the second edition and getting the first edition.

I’ve even spoken to people in person who said “yeah, I’ve been reading it” and I give them a copy of the second edition to hold in their hands, and they go “whoa, that is like twice the size. I don’t have this edition at all.”

If you have any question at all, just look at the front cover. If you have the second edition, you will see it clearly in the upper right-hand corner of the cover, as shown in this picture:

Cover

I feel like a Microsoft Anti-Piracy Minion “educating” you about how to verify that you are installing Genuine Spyware. Don’t worry, the feeling will pass and I’ll be okay *grin*

If you ordered the second edition and got the first edition, Amazon should send you a new book free of charge. If they’re really making the mistake that they seem to be, I predict they’ll fix it when it starts costing them money.


High Performance MySQL is going to press, again

Apparently High Performance MySQL, 2nd Edition is selling quite well — I’m not sure exactly how well — because we’re preparing for a second printing. This makes me very happy. I don’t think they anticipated going back to the press for quite some time.

The book fluctuates between sales rank 1000 and 2000 on Amazon during the day, and has reached as high as 600 or so. This is just phenomenal. The O’Reilly team was psyched when it broke 5000, and so was I — but now we’ve stayed under 2000 for a long time (except when Amazon sold out of it). Frankly I’d have thought that for a niche-market book like this, we’d have been in the 10,000 range or something like that.

Clearly we (the authors, editors, publisher, etc) have done something right! This is a great feeling.

Thanks for sending errata, by the way. I have just completed proofreading the whole book myself, and found a number of things that may be fixed in the second printing. I think certain types of errors won’t be fixed, but the important ones certainly will be.


Sphinx 0.9.8 is released!

The Sphinx project just released version 0.9.8, with many enhancements since the previous release. There’s never been a better time to try it out. It’s really cool technology.

What is Sphinx? Glad you asked. It’s fast, efficient, scalable, relevant full-text searching and a heck of a lot more. In fact, Sphinx complements MySQL for a lot of non-search queries that MySQL frankly isn’t very good at, including WHERE clauses on low-selectivity columns, ORDER BY with a LIMIT and OFFSET, and GROUP BY. A lot of you are probably running fairly simple queries with these constructs and getting really bad performance in MySQL. I see it a lot when I’m working with clients, and there’s often not much room for optimization. Sphinx can execute a subset of such queries very efficiently, due to its smart I/O algorithms and the way it uses memory. By “subset” I mean you don’t get the full complexity of SQL, but you get enough functionality for lots of the poorly-performing queries I see in the wild. It’s a 95% solution.
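To illustrate why ORDER BY with a LIMIT and a large OFFSET can be costly in general, here is a small Python sketch. It models neither MySQL's nor Sphinx's internals; the `page` function is hypothetical and only shows that producing one late page of sorted results means producing, and then discarding, every row that sorts before it.

```python
import heapq


def page(rows, key, offset, limit):
    """Return one page of sorted results plus how many sorted rows were produced.

    To return rows offset..offset+limit in sorted order, the first
    offset + limit rows must be produced; the first `offset` are thrown away.
    """
    needed = offset + limit
    top = heapq.nsmallest(needed, rows, key=key)
    return top[offset:], len(top)


rows = list(range(1000, 0, -1))  # 1000 rows, stored in reverse order
results, produced = page(rows, key=lambda r: r, offset=990, limit=10)
print(results)   # prints: [991, 992, 993, 994, 995, 996, 997, 998, 999, 1000]
print(produced)  # prints: 1000
```

Ten rows come back, but a thousand had to be ordered to find them, and that ratio gets worse the deeper the page.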

Is Sphinx for you? Good question. You can find answers in Appendix C in High Performance MySQL. And yes, that is why I wrote this blog post — to put in a plug for the book. *grin* But before I go, let me put in another plug for Sphinx: go vote for it on Sourceforge! If it’s voted as one of the Community Choice projects of the year, that will be fantastic.
