Xaprb

Stay curious!

How to understand key length limitations in MySQL

with 14 comments

Suppose I try to create a table with a primary key that’s varchar(500), and MySQL complains the key length is longer than the maximum of 1000 bytes. 500 is less than 1000. What’s happening? In this article I’ll explain why I, not MySQL, am wrong. Plus, I’ll show you a tasty (yet apparently harmless) bug in MySQL.

Here’s a statement that will fail on my server:

create table test(c varchar(250), d varchar(250), primary key(c,d));
ERROR 1071 (42000): Specified key was too long; max key length is 1000 bytes

Why does it fail? Simple; my default character set is multi-byte:

show variables like '%char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

While most characters will fit in one or two bytes, the utf8 encoding of Unicode, as implemented by MySQL can require up to 3 bytes per character, so MySQL must be pessimistic and assume the worst-case scenario of every character requiring 3 bytes. It’s easy to see this by trying to create a table with a single VARCHAR(334) primary key. It will fail, but VARCHAR(333) will succeed, because 3 * 333 is less than 1000 bytes.

Here’s a fun bug (bug #18927):

mysql> create table test(c varchar(250), d varchar(250),primary key(c,d));
ERROR 1071 (42000): Specified key was too long; max key length is 1000 bytes
mysql> create table test(c varchar(334), d varchar(334), primary key(c,d));
ERROR 1071 (42000): Specified key was too long; max key length is 999 bytes

Sometimes it says 999, sometimes 1000. I have no idea why.

It may be a good idea to check the default character set to get the best performance out of a database. If there’s no need for multi-byte encodings, switching to a single-byte encoding might be significantly more efficient. It’s especially important to keep indexes as small as possible.

Written by Xaprb

April 17th, 2006 at 11:34 pm

Posted in Uncategorized

14 Responses to 'How to understand key length limitations in MySQL'

Subscribe to comments with RSS

  1. MySQL has always been weird IMHO. However the problem gets more interesting if you convert your happily running DB into multi-byte from latin1. All of a sudden quite a number of indexes stop working.

    However having keys of that size is quite weird in the first place. Maybe a key for a MD5 of 32 characters would be ok, still huge by itself. When I see a DB with such enormous key sizes, then I am sure something is badly designed anyways.

    Just MHO,

    Ries van twisk

    Ries

    3 Jun 06 at 6:07 pm

  2. Of course, huge keys like that are a terrible idea. I don’t remember now why this was happening. I think I had to convert some data that had been automatically imported as large character columns, even though most of them didn’t have much data.

    What happens to the indexes when you convert to multi-byte? If you drop and recreate the indexes, will they work right again, or is there a bug with multi-byte indexes?

    Xaprb

    3 Jun 06 at 6:20 pm

  3. I got this error when I install Mambo with utf-8 as default in PHPMyAdmin. Does that mean that I can’t install Mambo when default is utf-8?

    Nazar Hotsa

    14 Jun 06 at 10:07 am

  4. Nazar,

    It looks like this statement in installation/sql/mambo.sql is causing the problem:

    CREATE TABLE `#__core_acl_aro` (
      `aro_id` int(11) NOT NULL auto_increment,
      `section_value` varchar(240) NOT NULL default '0',
      `value` varchar(240) NOT NULL default '',
      `order_value` int(11) NOT NULL default '0',
      `name` varchar(255) NOT NULL default '',
      `hidden` int(11) NOT NULL default '0',
      PRIMARY KEY  (`aro_id`),
      UNIQUE KEY `section_value_value_aro` (`section_value`,`value`),
      UNIQUE KEY `#__gacl_section_value_value_aro` (`section_value`,`value`),
      KEY `hidden_aro` (`hidden`),
      KEY `#__gacl_hidden_aro` (`hidden`)
    ) TYPE=MyISAM;

    That is definitely going to cause a problem, and my guess is the Mambo developers are unaware of it. They may also not know that they are duplicating indexes — and MySQL is not “smart enough to combine duplicate indexes.” Duplicate indexes really do get duplicated in every way, from CPU time to disk space.

    You should be able to edit this statement and make the charset latin1 like this:

    create table (...) default charset=latin1;

    Xaprb

    14 Jun 06 at 2:57 pm

  5. Well, I know this solution :) I can set also cp1251 encoding, etc.

    My concern is that I’m going to create a 3-language website, that will use English, Russian, and Ukrainian languages. That’s why I need utf8 here…

    I guess this is impossible for now. Am I right?

    Nazar Hotsa

    16 Jun 06 at 5:24 am

  6. I don’t know any other workaround unless you can make the columns smaller, or remove the index completely. I have no idea whether Mambo really needs those columns to be 240 characters long, or whether they really need to be indexed. If the unique index is only there to ensure no duplicate columns exist, maybe that can be coded in the application logic instead. It is usually a very bad idea to make indexes on such long columns.

    Xaprb

    16 Jun 06 at 7:59 am

  7. The problem appears to be the combined column’s length. For example, if the primary key is specified with more than 1 columns, then the total length is tested for this limit, and it’ll fail if it’s over 999.

    Nan

    8 Jan 08 at 4:21 pm

  8. Is there a way to increase this value? the 999 bytes? Why this limitation anyway? I can’t see any reason for that !!

    Sleem

    9 Aug 09 at 9:56 pm

  9. Nice Article.
    However, the workaround can be like this:

    create table test(c varchar(250), d varchar(250), primary key(`c`(100),`d`(100));

    You can choose only the first few to index.

    Mohinish

    4 Mar 10 at 3:47 pm

  10. Dear Author, Dear Readers,

    the best link to understand to what happened here is
    http://bugs.mysql.com/bug.php?id=4541

    you have to scroll down to the bottom to the last post by Jason Urrich:

    ————————————
    TO ANYONE WHO RUNS INTO THIS PROBLEM
    ————————————

    Before trying to work around this problem, consider what this error is telling you: you
    are trying to set up a key (i.e., a tree searching index) that needs more than a certain
    number of bytes. If you are getting this error for something that you’ve marked as a
    primary key and your keylength is over 20 bytes (latin1) or 60 bytes (utf8), then stop
    right and go back to your design – this is not a suitable primary key.

    If you are running into this problem for a non-primary key or an indexed column, the
    problem comes down to the same thing: you’re trying to use an enormously long string to
    do tree index lookup, which means you’ve probably not understood how keys are used in
    database management systems.

    Of course most of the Webapps, especially php based, will break, if they really need utf8 – so the result of this stupid design decision is that in 2010 you still can not have full utf-8 support with mysql – instead you are faced with arrogant “you do not know how to use indexes” statements of the devs and they want you to look like the idiot – totally ridicolous.

    My serious advise for 2010, considering not only the never ending line of stupid mysql bugs, but also the general situation for this piece of software: move your apps either to postgres or, if web only, to one of the upcoming nosql databases like e.g. couchdb – it is time for mysql to die, and you will be lost on a sinking ship.

    J0NES

    15 Mar 10 at 5:38 am

  11. annotation to my previous post: the citation markup I did not work – citation ends with “how keys are used in
    database management systems.” – the following rant is by me.

    J0NES

    15 Mar 10 at 5:40 am

  12. what could be minimum length could be used for the sake of optimization too? field(2) or field(10), which would be fast and optimized for search?

    Erik

    30 Mar 10 at 7:10 am

  13. I understand perfectly well how keys are used in database systems, and if I choose to design my system with a long key for a good reason, that is my tradeoff to make.

    Perhaps using long keys will slow down mysql relative to an unindexed table; if so, they should look at how sqlite does it — it works quite well, even in a 150 GB (hundred fifty gigabyte) database.

    mark

    10 Feb 11 at 12:31 pm

  14. Anyone who does need a larger key length should look at innodb_large_prefix http://dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html#sysvar_innodb_large_prefix . It’ll probably do what you are after if you’re using InnoDB.

    James Day, MySQL Senior Principal Support Engineer, Oracle

    James Day

    28 Mar 12 at 12:10 pm

Leave a Reply