昨晚半夜处理了一个故障,只是简单SQL执行缺少索引导致CPU飙高,加一条索引就搞定。原来以为事情到此为止,让另外一个同事收尾准备睡觉了,这个时候高潮出现了。刚才处理的是一台slave server,为了保证实例数据对象一致性,让同事先删掉,然后在master server再加这条索引,最后发现add index的操作在slave server一直在等metadata lock释放了,并且在processlist里面居然找不到任何引起表锁的请求。就这样折腾了到凌晨,后面索性就丢在自己等锁释放,差不多过了6分钟创建成功,load再次下降。
分析
事后在5.5的手册中找到了解释:
To ensure transaction serializability, the server must not permit one session to perform a data definition language (DDL) statement on a table that is used in an uncompleted transaction in another session. The server achieves this by acquiring metadata locks on tables used within a transaction and deferring release of those locks until the transaction ends. A metadata lock on a table prevents changes to the table’s structure. This locking approach has the implication that a table that is being used by a transaction within one session cannot be used in DDL statements by other sessions until the transaction ends.