Xaprb

Stay curious!

Extended covering indexes

with 8 comments

As you can probably guess, I’m catching up on reading my blogs. I’ve just read with interest about TokuDB’s multiple clustering indexes. It’s kind of an obvious thought, once someone has pointed it out to you. I’ve only been around products that insist there can be Only One clustered index (and then there’s ScaleDB, who say “think differently already”).

Anyway, we already know that there are quite a few database products that use clustered indexes and to avoid update overhead, require every non-clustered index to store the clustered key as the “pointer” for row lookups. Thus there are “hidden columns” which are present at the leaf nodes, but not the non-leaf nodes, of secondary indexes. Why not take that idea and run with it a little? Here’s what I mean:

create table test (
  a int,
  b int,
  c int,
  primary key(a),
  key(b) plus(c)
);

This would index column b, which because of the clustered primary key would contain column a at the leaf nodes; and additionally we’ve requested for it to store column c. And then we would be able to do this:

explain select c from test where b = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
         type: ref
possible_keys: b
          key: b
      key_len: 5
          ref: const
         rows: 1
        Extra: Using index

The “Using index” is the key to note there. (Yes, I invented that EXPLAIN result; it is not possible to get with current MySQL and current storage engines.) This strikes me as an improvement over TokuDB, which apparently says you can have all or none. Why not let people say which columns they want?

Written by Xaprb

June 7th, 2009 at 3:15 pm

8 Responses to 'Extended covering indexes'

Subscribe to comments with RSS

  1. @Baron, you are right to say that this is an improvement over what TokuDB
    currently provides. Our implementation is a simple first step towards
    doing this. We have plans (and a design)
    to add this functionality.

    There is precedent for this functionality already. In Microsoft’s SQL
    Server, one can do the following:

    CREATE INDEX IX_Document_Title ON Production.Document (Title, Revision)
    INCLUDE (FileName);

    This would be the equivalent of your syntax doing:
    CREATE INDEX IX_Document_Title ON Production.Document (Title, Revision)
    plus (FileName);

    The major roadblock for me has been familiarizing myself with the higher levels of the stack enough to learn how to implement this grammar and pass it down to the storage engine. The reason implementing clustering indexes was simple was that I could
    pattern match the grammar of other types of indexes (such as spatial) and pass this information in the flags of a key.

  2. Zardosht, thanks for your comment. I was going to post back to your blog and ask which other platforms permitted this; I didn’t remember any that did.

    If you figure it out, would you be willing to open-source that patch so others could consider doing the same for InnoDB (or in Percona’s case, XtraDB)?

    Xaprb

    7 Jun 09 at 9:26 pm

  3. By ‘Using index’ do you mean EXPLAIN to convey that all the data for the query will be retrieved from an index? If so, I’d be a little more verbose- something like ‘all fields from index ‘.

    Robert Collins

    8 Jun 09 at 7:26 am

  4. The “include” clause was added in SQL Server 2005.

    http://www.mssqltips.com/tip.asp?tip=1078

    This is really convenient if you want to cover a column in a unique index that would otherwise break the uniqueness constraint.

    Nick

    8 Jun 09 at 10:24 am

  5. Yeah, I left the SQL Server world just as 2005 was being released, so now it makes sense that I didn’t know about it. Any other platforms support this? I don’t think Postgres does.

    Xaprb

    8 Jun 09 at 1:21 pm

  6. Baron, we plan on sharing the patch for clustering keys, along with issues to beware of, very soon in a blog post. As for implementing a feature to include additional columns suggested above, once we figure it out, we will share the MySQL changes. The simplest way to implement it would be if index comments made its way from 6.0 (and eventually 5.4) into 5.1, and we leveraged that.

  7. Baron, the blog post that describes our current patch for clustering indexes is now up: http://blogs.tokutek.com/tokuview/mysql_51_grammar_changes_to_support_clustering_indexes/.

    Zardosht Kasheff

    11 Jun 09 at 2:48 pm

  8. Prety smart. I really like this :) thnx

    Istvan Podor

    24 Aug 09 at 12:46 pm

Leave a Reply