Debugging metadata locking in MySQL 5.5
MySQL 5.1 added a long-needed feature: INFORMATION_SCHEMA tables for inspecting locks held and waited-for inside of InnoDB. With this addition, it became possible to figure out who is blocking whom.
MySQL 5.5 changed a lot more things inside the server. One of the new features is improved metadata locking. This solves a lot of inconsistencies and bugs that were previously possible. The output of SHOW PROCESSLIST is also changed; instead of the venerable Locked status, there are more fine-grained status indicators such as Waiting for table metadata lock.
Unfortunately, the additional locking is not possible for the DBA to inspect. Where InnoDB’s storage-engine-level locking used to be opaque and was then made transparent, now server-level locking is a mystery, and in my experience much more likely to become a problem than it used to be. (The server now does a lot more locking of things it previously neglected to lock.)
I have recently run into a number of circumstances where metadata locking prevents users from doing any work. The locking is not inside InnoDB, so the INFORMATION_SCHEMA tables show nothing at all. I tried using mysqladmin debug and looking at the contents of the server’s error log, but the table in question was shown as unused in the output there. I tried examining each of the other connections to the server, but was unable to find any connection that I believed could have touched the table and locked it. The lock wait was ongoing for a long time, and no other transaction had been started previous to that, so it did not seem that a long-running transaction could have touched the table and was keeping a metadata lock open.
It feels like a bug in metadata locking to me, but I will need to dig in more deeply the next time I see it happen. Unfortunately, it’s unlikely that I’ll be able to create a reproducible test case, because there’s no way to actually see what is happening. I hope that a future version of MySQL will include a more comprehensive set of tables for inspecting locks, requests, and waits at all layers of the server.



one that is particularly tricky to diagnose when innodb table is not involved. e.g.
conn1: begin;
conn1: select * from myisam_table;
conn2: drop table myisam_table;
conn2 will simply hang until lock_wait_timeout expires and there’s no easy way to know that conn1 is responsible.
sbester
28 Aug 12 at 1:34 pm
I have a feeling something like this is the problem in my case too.
Xaprb
28 Aug 12 at 2:02 pm
if i kill a huge insert qry, I get a “Waiting for table metadata lock” too. but the connection status is just “killed”
Normann
28 Aug 12 at 2:43 pm
Maybe watching open files with lsof could give some hints about what’s going on?
It could also be hardware/OS related. (full filesystem, out of memory, SSD firmware bug, etc)
Daniël van Eeden
3 Sep 12 at 1:34 am
Would it be possible for pt-deadlock-logger to store this information too? (or is that where it gets it?) I’ve enjoyed using it but it only explains one of the 2 locks that cause the deadlock :(
Sheeri K. Cabral
6 Sep 12 at 4:11 pm