A slow query blocked xtrabackup, which in turn blocked all subsequent SQL and brought the system down

mac 2024-05-10

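The developers reported that SQL against one of our databases was timing out frequently and that the system was nearly paralyzed, unable to perform any operation. I logged in to the database and looked at the current threads first, and found a large number of them in the state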

 Waiting for table flush
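
As a rough sketch of how such threads can be listed, assuming a MySQL version where `information_schema.processlist` is available (the column selection and the `LIMIT` are illustrative choices, not taken from the original session):

```sql
-- Sessions stuck behind a pending FLUSH TABLES.
SELECT id, user, host, db, time, info
FROM information_schema.processlist
WHERE state = 'Waiting for table flush'
ORDER BY time DESC;

-- Candidates for the real blocker: active statements that are not
-- themselves waiting on the flush, longest-running first.
SELECT id, user, host, db, time, state, info
FROM information_schema.processlist
WHERE command <> 'Sleep'
  AND COALESCE(state, '') <> 'Waiting for table flush'
ORDER BY time DESC
LIMIT 10;
```

The second query is only a starting point: the statement at the top of that list is a suspect, not a confirmed culprit.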

 

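Looking at the current transactions:

The transaction shown below had been running since yesterday and still had not finished this morning. I have not dug into the root cause yet. I first released this thread; only then could the backup's flush tables succeed, and once the backup completed, the whole series of blocked SQL statements ran normally again.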

mysql> select * from information_schema.innodb_trx\G

*************************** 1. row ***************************

                    trx_id: 192611452

                 trx_state: RUNNING

               trx_started: 2017-11-30 18:33:58

     trx_requested_lock_id: NULL

          trx_wait_started: NULL

                trx_weight: 3688

       trx_mysql_thread_id: 352932171

                 trx_query: DELETE FROM xx WHERE xx IN(SELECT xx

                                                FROM xx WHERE Remarks LIKE xx)

       trx_operation_state: unlock_row

         trx_tables_in_use: 2

         trx_tables_locked: 2

          trx_lock_structs: 3688

     trx_lock_memory_bytes: 368848

           trx_rows_locked: 4

         trx_rows_modified: 0

   trx_concurrency_tickets: 0

       trx_isolation_level: READ COMMITTED

         trx_unique_checks: 1

    trx_foreign_key_checks: 1

trx_last_foreign_key_error: NULL

 trx_adaptive_hash_latched: 0

 trx_adaptive_hash_timeout: 0

          trx_is_read_only: 0

trx_autocommit_non_locking: 0
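
A minimal sketch of how a long-running transaction like this one might be located and released. It assumes that "releasing" the thread means killing its connection with `KILL`, which the original does not state explicitly; the thread id is the `trx_mysql_thread_id` from the output above.

```sql
-- Oldest open InnoDB transactions first, with their age in seconds.
SELECT trx_mysql_thread_id,
       trx_started,
       TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS running_seconds,
       trx_query
FROM information_schema.innodb_trx
ORDER BY trx_started
LIMIT 10;

-- Assumption: terminate the offending connection; its transaction is rolled back.
KILL 352932171;
```

Killing a connection rolls its transaction back, so for a transaction that has modified many rows the rollback itself can take a while. Here `trx_rows_modified` is 0, so that cost should be small.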

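Thinking about it afterwards, I remembered that a physical backup runs every day at 2 a.m., so I checked the backup log and found that the transaction above had indeed blocked the physical backup.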

 

The full flow of the physical backup

First, it records the current redo log sequence number (LSN):

171201 02:00:02 >> log scanned up to (54138135415)

xtrabackup: Generating a list of tablespaces

xtrabackup: using the full scan for incremental backup

xtrabackup: Starting 4 threads for parallel data files transfer

Then it backs up the InnoDB tables:

171201 02:00:12 [01] Copying .

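After that copy finishes, it runs flush tables. Because this statement was blocked, it only succeeded once the transaction had been released:

171201 02:00:17 Executing FLUSH NO_WRITE_TO_BINLOG TABLES...

Next, it starts backing up the non-transactional tables: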

171201 09:36:13 Executing FLUSH TABLES WITH READ LOCK...

171201 09:36:13 >> log scanned up to (54147795188)

171201 09:36:14 Starting to backup non-InnoDB tables and files     

171201 09:36:14 [01] Copying ....

xtrabackup: The latest check point (for incremental): '54138858140'

xtrabackup: Stopping log copying thread.

.171201 09:36:14 >> log scanned up to (54147795198)

171201 09:36:14 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...

After the backup finishes, the table lock is released:

171201 09:36:14 Executing UNLOCK TABLES

171201 09:36:14 All tables unlocked

171201 09:36:14 [00] Copying ib_buffer_pool to xxx

171201 09:36:14 [00]        ...done

171201 09:36:14 Backup created in directory xxxx

MySQL binlog position: xxx

171201 09:36:14 [00] Writing backup-my.cnf

171201 09:36:14 [00]        ...done

171201 09:36:14 [00] Writing xtrabackup_info

171201 09:36:14 [00]        ...done

xtrabackup: Transaction log of lsn (54138129801) to (54147795198) was copied.

171201 09:36:15 completed OK! 

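The statement that got blocked was FLUSH NO_WRITE_TO_BINLOG TABLES. It was issued at 02:00:17, and the next entry in the log is not until 09:36:13.

The official description of FLUSH TABLES: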

Closes all open tables, forces all tables in use to be closed, and flushes the query cache and prepared statement cache.

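Nothing in it mentions locking. Testing shows, however, that if a query or a change is still running and another session executes flush tables, the flush tables is blocked.

From that point on, any SQL that touches a table used by the slow query is blocked as well.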
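
The test can be reproduced with something like the following sketch. The table `t` is hypothetical and must contain at least one row, and `SLEEP` stands in for the slow query; each statement runs in its own session.

```sql
-- Session 1: keep table t open for a long time (stand-in for the slow query).
SELECT SLEEP(600) FROM t LIMIT 1;

-- Session 2: blocks until session 1 finishes.
FLUSH NO_WRITE_TO_BINLOG TABLES;

-- Session 3: an otherwise trivial read on t now queues behind the pending flush.
SELECT * FROM t LIMIT 1;

-- Session 4: sessions 2 and 3 should both show State = 'Waiting for table flush'.
SHOW PROCESSLIST;
```

Ending session 1, either by letting it finish or by killing it, should let the flush complete and release session 3.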

 

 
