Extended covering indexes
As you can probably guess, I’m catching up on reading my blogs. I’ve just read with interest about TokuDB’s multiple clustering indexes. It’s kind of an obvious thought, once someone has pointed it out to you. I’ve only been around products that insist there can be Only One clustered index (and then there’s ScaleDB, who say “think differently already”).
Anyway, we already know that there are quite a few database products that use clustered indexes and to avoid update overhead, require every non-clustered index to store the clustered key as the “pointer” for row lookups. Thus there are “hidden columns” which are present at the leaf nodes, but not the non-leaf nodes, of secondary indexes. Why not take that idea and run with it a little? Here’s what I mean:
create table test ( a int, b int, c int, primary key(a), key(b) plus(c) );
This would index column b, which because of the clustered primary key would contain column a at the leaf nodes; and additionally we’ve requested for it to store column c. And then we would be able to do this:
explain select c from test where b = 1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: test
type: ref
possible_keys: b
key: b
key_len: 5
ref: const
rows: 1
Extra: Using index
The “Using index” is the key to note there. (Yes, I invented that EXPLAIN result; it is not possible to get with current MySQL and current storage engines.) This strikes me as an improvement over TokuDB, which apparently says you can have all or none. Why not let people say which columns they want?



@Baron, you are right to say that this is an improvement over what TokuDB
currently provides. Our implementation is a simple first step towards
doing this. We have plans (and a design)
to add this functionality.
There is precedent for this functionality already. In Microsoft’s SQL
Server, one can do the following:
CREATE INDEX IX_Document_Title ON Production.Document (Title, Revision)
INCLUDE (FileName);
This would be the equivalent of your syntax doing:
CREATE INDEX IX_Document_Title ON Production.Document (Title, Revision)
plus (FileName);
The major roadblock for me has been familiarizing myself with the higher levels of the stack enough to learn how to implement this grammar and pass it down to the storage engine. The reason implementing clustering indexes was simple was that I could
pattern match the grammar of other types of indexes (such as spatial) and pass this information in the flags of a key.
Zardosht Kasheff
7 Jun 09 at 8:46 pm
Zardosht, thanks for your comment. I was going to post back to your blog and ask which other platforms permitted this; I didn’t remember any that did.
If you figure it out, would you be willing to open-source that patch so others could consider doing the same for InnoDB (or in Percona’s case, XtraDB)?
Xaprb
7 Jun 09 at 9:26 pm
By ‘Using index’ do you mean EXPLAIN to convey that all the data for the query will be retrieved from an index? If so, I’d be a little more verbose- something like ‘all fields from index ‘.
Robert Collins
8 Jun 09 at 7:26 am
The “include” clause was added in SQL Server 2005.
http://www.mssqltips.com/tip.asp?tip=1078
This is really convenient if you want to cover a column in a unique index that would otherwise break the uniqueness constraint.
Nick
8 Jun 09 at 10:24 am
Yeah, I left the SQL Server world just as 2005 was being released, so now it makes sense that I didn’t know about it. Any other platforms support this? I don’t think Postgres does.
Xaprb
8 Jun 09 at 1:21 pm
Baron, we plan on sharing the patch for clustering keys, along with issues to beware of, very soon in a blog post. As for implementing a feature to include additional columns suggested above, once we figure it out, we will share the MySQL changes. The simplest way to implement it would be if index comments made its way from 6.0 (and eventually 5.4) into 5.1, and we leveraged that.
Zardosht Kasheff
8 Jun 09 at 4:48 pm
Baron, the blog post that describes our current patch for clustering indexes is now up: http://blogs.tokutek.com/tokuview/mysql_51_grammar_changes_to_support_clustering_indexes/.
Zardosht Kasheff
11 Jun 09 at 2:48 pm
Prety smart. I really like this :) thnx
Istvan Podor
24 Aug 09 at 12:46 pm