Why won't MySQL use the best index in a join?

Someone recently asked me why the wrong index was being used for a JOIN, making the query run very slowly. We ran EXPLAIN and saw this abridged output:

explain select [columns] from m
   left join u on m.intcol = u.intcol and m.url = u.url
where u.url is null\G
*************************** 1. row ***************************
  select_type: SIMPLE
        table: m
         type: ALL
         rows: 2717
*************************** 2. row ***************************
  select_type: SIMPLE
        table: u
         type: ref
          key: idx_intcol
      key_len: 2
          ref: m.intcol
         rows: 64486
        Extra: Using where; Not exists

The column is a 2-byte unsigned integer. Here is the relevant part of the table definition:

  `intcol` smallint(5) unsigned NOT NULL,
  `url` varchar(760) CHARACTER SET latin1 COLLATE latin1_general_cs NOT NULL,
  UNIQUE KEY `url` (`url`,`intcol`),
  KEY `idx_intcol` (`intcol`),

As you can see, the url index looks like it should be a better index for the query, with two columns instead of one. The shortcut I often use to diagnose issues like this is EXPLAIN EXTENDED, followed by SHOW WARNINGS. The resulting warning message is often revealing:

select 1 AS `1` from... on...
  (`m`.`url` = convert(`u`.`url` using utf8))))...

Now it’s easy to see that the index can’t be used because of a character set mismatch. The ’m’ table indeed has a different character set and collation:

  `url` varchar(255) DEFAULT NULL

The easiest solution in this case was to change the ’m’ table’s definition, because it is a scratch table used as part of an import process.

About The Author

Baron is the founder and CEO of VividCortex. He is the author of High Performance MySQL and many open-source tools for performance analysis, monitoring, and system administration. Baron contributes to various database communities such as Oracle, PostgreSQL, Redis and MongoDB.