Baron Schwartz's Website
Archives Talks

Charset Support in MySQL Is Really Not All That Complex

Published Feb 22, 2010 by Baron Schwartz in Databases at https://www.xaprb.com/blog/2010/02/22/charset-support-in-mysql-is-really-not-all-that-complex/

The headline is flame-bait, don’t take it. I just wanted to point something out about character sets and collations in MySQL.

To the uninitiated, it may seem overwhelming. Everything has a character set! Everything has a collation! And they act weirdly! The server has one. The database has one (oh, and it changes magically as I USE different databases.) Every table has one, and columns too. Is that all? NO! My connection has one! Kill me now!

Relax. In truth, only one kind of thing actually has a charset/collation. That is values. And values are stored in columns. The only thing that really has a charset/collation is a column.1

What about all the rest of those things—connection, database, server, table? Those are just defaults, which determine what charset/collation a value gets if it isn’t overridden. So if the table’s default charset is utf8, and you add a column without saying what it should be—why, it’ll be utf8. If the database’s default is latin1 and you add a table without saying what its default should be, it’ll be latin1.


  1. It’s not quite true. Literal values in SQL statements are values too, as are @user_variables. But if your connection’s charset is latin1 and you say SELECT "xaprb", without an explicit introducer, you’re really saying SELECT _latin1 "xaprb". Again, the only thing that really has a charset/collation is a value. Other things are just defaults. [return]

Baron Schwartz

I’m a founder, author of several books, and creator of various open-source software. I write about topics such as technology, mindfulness, and fitness, and I tweet at @xaprb. More about me.

Story logo

© 2020 Baron Schwartz