Why you should not use BIT columns in MySQL
MySQL implements the BIT data type differently in different versions, and the behavior is not what one might expect. In this article I’ll explain how MySQL’s behavior has changed over time, what strange things can happen as a result, and how to understand and work around display issues. I’ll tell you about a serious bug I’ve found, and discuss differences in the BIT data type between MySQL and Microsoft SQL Server.
History
MySQL has supported the BIT data type for a long time, but only as a synonym for TINYINT(1) until version 5.0.3. Once the column was created, MySQL no longer knew it had been created with BIT columns.
In version 5.0.3 a native BIT data type was introduced for MyISAM, and shortly thereafter for other storage engines as well. This type behaves very differently from TINYINT.
Changed behavior
The old data type behaved just like the small integer value it really was, with a range from -128 to 127. The new data type is a totally different animal. It’s not a single bit. It’s a fixed-width “bit-field value,” which can be from 1 to 64 bits wide. This means it doesn’t store a single BIT value; it’s something akin to the ENUM and SET types. The data seems to be stored as a BINARY value, even though the documentation lists it as a “numeric type,” in the same category as the other numeric types. The data isn’t treated the same as a numeric value in queries, however. Comparisons to numeric values don’t always work as expected.
This change in behavior means it’s not safe to use the BIT type in earlier versions and assume upgrades will go smoothly.
Display issues
The client libraries, including the command-line client and all the GUI clients I’ve seen, don’t seem to know how to handle BIT values. They don’t display them as a series of ones and zeroes. For instance, the following code even breaks the alignment of the command-line output!
mysql> create table test (i bit not null default 0); mysql> insert into test (i) values (1), (1), (0), (0); mysql> select * from test; +---+ | i | +---+ | | | | | | | | +---+
As I mentioned above, the data seems to be stored internally, and transmitted through the client libraries, as a BINARY value, which is actually a string type in MySQL. How it displays depends on the width of the column. For example, if the column is 32 bits wide, it is treated as CHAR(4). If it’s 8 bits wide, it is treated as CHAR(1):
create table test(ch bit(8)); mysql> insert into test values (b'01011010'); mysql> select * from test; +------+ | ch | +------+ | Z | +------+
To display the value as an integer, it has to be cast to another type. One way to do this is add 0 to the value: select ch + 0 from test;. Another way is select cast(ch as unsigned) from test;
Display width seems to be related to value with BIT, in contrast to what the manual’s section on Overview of Numeric Types states: “Display width is unrelated to the storage size or range of values a type can contain.” It appears that a field of size M can only store M bits, so it’s the storage size, not the display size, that’s affected. As I mentioned, bit values don’t display as ones and zeroes anyway, so it makes no sense to say an 8-bit wide column “displays with a width of 8.” It doesn’t display 8 bits, it stores 8 bits. In fact, inserting b'100000000' into the table I defined above will store the value 255, demonstrating that the actual value has a maximum capacity of 8 bits. Any bits not set explicitly to 1 are set to 0, so values are left-padded with zeroes (the most significant bits are zeroes).
Bugs
I’ve discovered some very strange bugs with BIT columns in MySQL. The issue I noticed was a LEFT OUTER JOIN failing when it should have succeeded. I discovered a combination of factors could cause the bug to appear and vanish. For example, the join will succeed or fail depending on combinations of these factors:
- the presence of an additional column, not involved in the query at all
- the presence of additional rows which don’t match in the join
- the order of columns in the table
- the presence of an additional tautology in the join clause
I’ve filed a bug with MySQL about the issues I found, including a script which demonstrates several ways to trigger the bug.
Why use it?
Given the problems I’ve mentioned, I recommend avoiding it entirely. It provides nothing that’s not already possible with standard numeric types and adds a lot of confusion.
The only reason I personally would consider using this data type in MySQL is to document the actual usage of the values in it, as described in an earlier article on choosing column types. Unfortunately, I think it’s counterintuitive enough that I wouldn’t even choose to use it there, because I think a BIT column intuitively sounds like it ought to store a single bit.
Differences from Microsoft SQL Server
Microsoft SQL Server also provides a BIT data type. However, it’s completely different; it’s a single-bit column. Internally, it is stored as a single bit within an integer data type. As successive BIT columns are added to a table, SQL Server packs them together behind the scenes. This is equivalent to doing bitmask operations on a single column (my previous employer loved bitmask columns!), but it allows the bits to be named explicitly, avoiding the need to pass around named constants (or embed magic numbers) and deal with bitwise arithmetic.
Pros and cons of bitmask columns
Bitmask columns (an integer within which each bit is retrieved and set via bitwise arithmetic) can be extremely handy. They’re a very compact way to pack true/false values together for efficient storage. They can also facilitate certain types of queries; for example, “if any value is set” queries become simple. I’ve used them in ACLs stored in a database, for instance. Certain types of problems are just easy to solve with bitwise arithmetic, and for those problems, creating an integer column and declaring “bit 5 is whether the value is [something]” makes a lot of sense.
On the negative side, bitmask columns can be hard to use. For one thing, they’re hard to understand. Without the documentation that says which bit means what, they’re pretty much useless. SQL Server avoids this and the other problems I’ll name by treating each bit as a separate column and naming it, but that’s only if you use that facility, which my previous employer didn’t. Bitwise arithmetic can also be pretty tricky to write, and even harder to read.
Magic numbers in queries are just as meaningless as a column named bitcolumn1. Declaring and passing around constants to name the magic numbers is a nice thought, but it’s error-prone and it’s extra work. Creating a table to define the bits can be quasi-helpful as well, unless (as often happened at my previous employer) you can’t find the table, or the column is named such that you can’t tell which table defines the values for which column, or the table’s values don’t make any sense for bitwise arithmetic.
Bitmask columns are also not index-friendly, so querying against them isn’t optimal. Of course, any column with only two values is useless to an index anyway, so this is no worse, performance-wise, than storing the yes/no values in columns by themselves. Since there’s less data to examine, it can actually be more efficient.
Finally, there can be some subtleties about bitwise arithmetic, such as issues relating to signed and unsigned numbers.
The bottom line is, I think bitmask columns should be used sparingly, and only when the nature of the data and computations really makes them the obvious choice.
Further Reading:






Good article, thanks. Saved me time.
For ColdFusion Users: an updated MySQL Connector/J allows ColdFusion to connect to MySQL 5, but does not seem to play well with these new MySQL BIT fields (at least as of now). Using TINYINT instead of BIT seems to be the best work-around.
Marek
12 May 06 at 2:36 pm
[...] I found this article that explains how it works, and that you should avoid it. Why you should not use BIT columns in MySQL – Xaprb Why you should not use BIT columns in MySQL [...]
One Geeks Opinion » Why you should not use BIT columns in MySQL - Xaprb
3 Oct 06 at 10:21 pm
Your previous employer is probably the one who should read this…
I hope he/she loves bitmasks because they truly provide a necessary performance or storage advantage.
…As opposed to bad reasons for loving them, e.g.:
In a general sense, here in “modern times”, I think individually named columns/variables are a superior solution. I think in the vast majority of cases, the potential performance and storage gains of bitmasks are at least negated, if not outweighed, by the readability — and thus maintainability — difficulties they pose. (I mean … if simple magic numbers are bad, then these, which are essentially compound magic numbers, must be compound bad!)
Just one man’s considered opinion…
Now off to convert all my BIT columns to TINYINTs!
Someone who generally doesn't comment on blogs
3 Nov 06 at 2:34 pm
MySQL Administrator 1.1.9 for Windows exports my bit(1) fields as strings when executing a backup, for example ‘\0′ (our server version is Server version 5.0.24a-max and the table in question has ENGINE=ndbcluster).
Ole V. Villumsen
12 Mar 07 at 9:46 am
[...] far so good… …until you read the blog post called “Why you should not use BIT columns in MySQL” by Xaprb. Another serious deficiency is the fact that a database dump will not export bit data as [...]
ioannis cherouvim » Blog Archive » Overcoming the MySQL BIT datatype problems with hibernate
15 Jun 07 at 2:38 pm
[...] Yet another difference between MySQL and SQL Server concerning small details, as described in here. [...]
Dream blog » Blog Archive » Roaming the RDBMS world
21 Jun 07 at 11:04 am
Here’s another reason not to use this type in your tables:
Using the MySQL .NET connector (5.1.4), and running a SELECT query that returns a column of BIT type, I could find no valid cast to retrieve the value of the column into a variable in my C# code. I tried bool, Byte, and Int16, and it always threw the exception “The specified cast is not valid.” I didn’t try the larger integers, assuming they wouldn’t work either.
In fact, I ran across this blog while searching for a solution to this problem.
For instance:
MySqlCommand cmd = MySqlConn.CreateCommand();
cmd.CommandType = CommandType.Text;
cmd.CommandText = “SELECT bitField FROM myTable”;
MySqlDataReader dr = cmd.ExecuteReader();
Object ob;
while (dr.Read())
{
if ((ob = dr[0]) != DBNull.Value)
bool BitField = (bool)ob;
}
gave me this exception, as did making BitField a Byte and Int16 type and trying those casts. I also tried simply doing BitField = (bool)(dr["bitField"]); Same result.
Once I changed the column type to TINYINT and my code variable type to Byte, it worked fine. So thanks!
(I would submit this as a bug to MySQL, but they would just say “That’s the way it was designed, therefore it’s not a bug.” Uh huh…)
(Consulting Webster to find out whether “cat” is synonymous with “feline”… or is this a trick question?)
Craig
7 Jan 08 at 1:08 am
How about using bit fields to solve search “within” search queries on the magnatude of web search engine data stores?
rebump
29 Jan 08 at 5:41 pm
Found your blog entry when I started getting inexplicable problems with BIT…
I found a problem using NET connector 5.0.8.1 and version MySQL version 5.1.22 (so it persists). My problem was that data in a column defined as BIT was appearing as ’1′ when I selected the table with a simple select statement, but appeared as ’0′ when I added a LEFT JOIN to another table !!! I have moved this column to TINYINT. And now I need to fix a load of other columns previously defined as BIT.
Tyrone
30 Apr 08 at 7:47 am
Thanks for the info, Xaprb, I use to have this problem but I never realized why. It’s very frustating to use MySql when you always use MSSQL.
Regards.
Suru
4 Dec 09 at 12:08 am
The bug has been fixed 4 years ago, you can safely use BIT data type now.
Daniel
15 Feb 10 at 7:39 am
I disagree, it’s still a terrible mess that has very little value. But that’s just my opinion.
Xaprb
18 Feb 10 at 10:27 pm
Thanks for the explanation.
Your article was very helpfull to me, because I was expecting BIT to behave like the BIT in Microsoft Server
ncorpse
8 Apr 10 at 2:37 am
Great. This saved me a couple of hrs
Bass
14 Jul 10 at 8:29 pm
I was wondering why all the MySQL GUIs displayed a BIT(32) value as either as CHAR or decimal values, unless I ran a SELECT … query.
Seems like BIT is half baked and incompletely implemented, and hence to be avoided.
Petroldave
4 Nov 10 at 8:26 am
Hi!
I was just asked about a bitfield type in Drizzle and came up with your post after google’ing for it. If you were going to design one, and I am not sure that it is a good idea at all, how would you do it?
Would a bool type be just as good?
Cheers,
-Brian
Brian Aker
7 Dec 10 at 6:48 pm
Bool would probably be fine. I think it is a bad idea to pack multiple bits into a single variable-width column the way MySQL does. A bit column should be a single bit, and if you want multiple bits, you should define multiple columns.
Xaprb
8 Dec 10 at 12:14 pm
It seems that the problem with joins was fixed, that’s correct? Your script no longer trigger any bug.
daniel
9 Dec 10 at 5:47 am
I realise that this is nothing to do with the article as such (interesting by the way), but you sould as if you could answer this.
In a routine in MySql 5.5 there seems to be no equivalent MySqlDBType for the SqlDBType.Char in MS SQL.
What should one use?
David L
22 Jun 11 at 7:14 am
Good article. Thanks!!
Pravin Singh
5 Dec 11 at 7:58 am