Archive for the ‘Sphinx’ tag
Peter wrote about this recently, but I don’t know if it was really clear what was going on.
Point One: Sphinx can be contacted by the MySQL protocol. Not “as a MySQL storage engine.” Not “from MySQL.” It understands the MySQL protocol itself. So from the protocol point of view, the Sphinx search daemon can look just like a MySQL server.
Point Two: Sphinx understands a SQL-like query language. Don’t be fooled. You’re not writing SQL. It just looks like you are.
Point Three: Because of point One and point Two, you can use the mysql command-line client program to talk directly to Sphinx, with absolutely no MySQL server anywhere in sight. This also means you can connect to Sphinx from your application and query it, exactly like connecting to a MySQL server and querying it.
Go take a look at Peter’s blog post. He’s not writing MySQL queries. He’s writing queries to Sphinx.
Now think about how cool this is — how easy this is to integrate with your code that already communicates with MySQL. Is there any other external full-text search system that masquerades as a MySQL server? I don’t know of one.
Thunderbird’s email search isn’t that great by today’s standards. If I type “red dog” into the search bar it doesn’t even tokenize and search for the words separately — it’s a substring match so “my dog is red” isn’t found. And it’s slow.
What about embedding “real” fulltext search into it? Sphinx perhaps?
Just a thought.
The Sphinx project just released version 0.9.8, with many enhancements since the previous release. There’s never been a better time to try it out. It’s really cool technology.
What is Sphinx? Glad you asked. It’s fast, efficient, scalable, relevant full-text searching and a heck of a lot more. In fact, Sphinx complements MySQL for a lot of non-search queries that MySQL frankly isn’t very good at, including WHERE clauses on low-selectivity columns, ORDER BY with a LIMIT and OFFSET, and GROUP BY. A lot of you are probably running fairly simple queries with these constructs and getting really bad performance in MySQL. I see it a lot when I’m working with clients, and there’s often not much room for optimization. Sphinx can execute a subset of such queries very efficiently, due to its smart I/O algorithms and the way it uses memory. By “subset” I mean you don’t get the full complexity of SQL, but you get enough functionality for lots of the poorly-performing queries I see in the wild. It’s a 95% solution.
Is Sphinx for you? Good question. You can find answers in Appendix C in High Performance MySQL. And yes, that is why I wrote this blog post — to put in a plug for the book. *grin* But before I go, let me put in another plug for Sphinx: go vote for it on Sourceforge! If it’s voted as one of the Community Choice projects of the year, that will be fantastic.