Archive for the ‘Brian Aker’ tag
Postmodern databases
Dr. Richard Hipp gave a talk at Southeast Linux Fest today on choosing an open-source database. He thinks that NoSQL is not a very good name for the new databases we’re seeing these days, so he proposed a new name: postmodern databases. Why postmodern?
- The absence of objective truth
- Queries return opinions, not facts
I thought this was the best proposal I’ve heard for an alternative to the NoSQL moniker. And this is not bashing — the absence of objective truth can actually be an enabling quality, not necessarily a drawback. There’s a lot to compliment about the new databases, and calling them NoSQL is really a disservice — like calling a car a horseless carriage.
Brian Aker: 20GB doesn’t fit on a single server
Brian was interviewed by O’Reilly recently, and part of the interview quoted him as saying this:
When everything doesn’t fit onto a computer, you have to be able to migrate data to multiple nodes. You need some sort of scaling solution there… MapReduce works as a solution when your queries are operating over a lot of data; Google sizes of data. Few companies have Google-sized datasets though. The average sites you see, they’re 10-20 gigs of data.
Users shouldn’t need to put that data onto multiple machines anyway. In fact, I don’t think we need a multi-machine solution for the common case at all. We need software that can scale up with today’s hardware. 37signals likes to run boxes with half a terabyte of RAM. Are we there yet with MySQL and InnoDB? No. Postgres? No. Anything open-source? Not that I know of. We’ve got database software that can only do a fraction of what it should be able to on that size of server.
I think we have to be clear about the use case for a solution that partitions data across multiple machines. It isn’t 20GB of data, and in my opinion it shouldn’t even be half a terabyte. I think that in the ideal world, we should be thinking about that for terabytes and larger — and in a few years, single-server datasets should be even larger.
I say should because today’s database software obviously has a lot of catching up to do.
Drizzle stops the rain
I’ve been following the Drizzle project with some interest. There’s a lot to like about it. But you know what I like most about the project?
No dual licensing. Just plain GPL, version 2.
I personally think this is the foundation for why people are empowered, why there is excitement, why there is progress, why people are contributing.




