Deadlock found when trying to get lock; try restarting transaction
Once you face this problem of deadlock, you may start googling and stumble upon this page. This article is an overview of the problem and some best practices I found when faced with it at hands.ru. It consists of three sections: what is a deadlock, how to find it, and how to fix it. Move to the last section if you know the problem and searching for ways to solve it.
So, what is a deadlock? Since we have a MySQL database with an InnoDB storage engine under the hood, each update/insert/create
instruction locks affected rows until the end of the transaction. Deadlock happens when different concurrent transactions are unable to proceed because each one holds a lock that the other needs. Here is an example:
Considertransaction #1
and transaction #2
both running at the same time. transaction #1
is ready to acquire the lock for therow B
, but can’t because it's already being locked by transaction #2
. On the other side transaction #2
is waiting to acquire the lock for therow A
, which is held by transaction #1
. Both transactions are stuck.
When deadlock detection is enabled, InnoDB instantly detects it and rolls back one of the transactions. Otherwise, it relies on the innodb_lock_wait_timeout to roll it back. Whatever, you’ll meet the error: “Deadlock found when trying to get lock; try restarting transaction”. Note, that even if you don’t run transactions manually it always exists, see how autocommit option works.
Let’s see how you can face it in practice. Consider you have a clients
table with three columns: id
, name
and balance
and updating the balance of Alice
and John
in two simultaneous transactions in a different order:
How to find it
If you have a rather huge codebase it’s not obvious where does the problem happens. The first time we faced it, we had greedily browsed every transaction at our code and checked for a potential problem. It was useful for the whole project — we removed redundant transactions and squeezed some others. But we did not find the problem. Fortunately, there is an instruction that helps us to see where the most recent deadlock happened:
SHOW ENGINE INNODB STATUS \G
The output can be rather large, but we need a section titled LATEST DETECTED DEADLOCK
where we should focus our efforts. Here is the truncated output with some comments:
Most times it’s enough to see the SQL instructions that led to deadlock and you know where is the code.
If it’s not enough to see the only last deadlock, enable innodb_print_all_deadlocks option and checkout error log:
SET GLOBAL innodb_print_all_deadlocks = 1;
Ok, now we found it. Now, let’s see how to fix it.
How to fix it
Avoid long-running transactions
Occasionally, you could greatly reduce the frequency of deadlocks by splitting a long transaction into smaller ones, so that locks are released sooner. Follow the rule: the transaction must not be longer than the business logic it requires to be.
Here is an example. I’ll use Django code examples–it’s easy to read and understand the idea. Consider you ran a background job that performs some heavy calculation and updates a batch of objects in your database:
But usually, you don’t need atomic batch updates of independent data. This job may run for 5 minutes or more. It means that it’ll lock each client row from the batch till the end of the transaction. And nobody else could update it this time. You even may not meet deadlocks if it’s been running for several seconds. But wonder that the system is being worked slowly without having a high load. It happens because concurrent processes spend time waiting for the lock release. A better solution may look like this:
Update in the same order
Once several concurrent transactions perform updates in the same order, deadlock is not possible. This is a good chance to organize your code better by moving common code parts into methods instead of writing the sameselect for update
at different places of the project.
In the above example with Alice and John, we’d not meet the deadlock if update their balance in the same order. It’s may not be obvious why it works. Consider one transaction that locks Alice and John rows in turn. No other transaction acting in the same order can lock the John before locking the Alice. Apparently, the situation when one transaction has locked Alice and waiting for John, and the other one has locked John and waiting for Alice is not possible.
Re-issue a transaction if it gets rolled back
Deadlocks are not dangerous. Just try again if you can afford it.
When you meet a code with potential deadlock, be prepared to run it again. Here is a Django example:
Note, that we used NOWAIT
keyword with a loop there to emulate spinlock behavior. There is also a pretty beautiful solution using optimistic locking:
There are no transactions. So, the deadlock is impossible. But we have to preserve the data consistency. To do that, we added a new field update_version
to the client
object. By selecting it before the calculation and updating it at the end, we guarantee that no concurrent process has come to update the same field or retry otherwise.
It locks not only selected rows but all the scanned
This is not evident, but when performing the query:
SELECT * from clients where name = 'John' FOR UPDATE;
You may expect only one row to be locked because there is only one John at our table. But bad news — this is not true. MySQL locks all the rows that it scans during the query. So, since we do not have an index for the name
column, our query will scan the whole table and lock all the rows till the end of the execution. Thus, the documentation recommends adding column indexes to reduce the number of scanned rows and respectively locked:
Add well-chosen indexes to your tables so that your queries scan fewer index records and set fewer locks.
Use update/delete where
and select for update/share
only with conditions that use index. It’s not obvious, why it needs to lock every scanned row. It directly depends on the transaction isolation level you choose and whether the index was used.
For example, to avoid phantom reads, at SERIALIZABLE
transaction isolation level rows are locked even in simple select without for update/share.
Read more about how different types of InnoDB lock types and how record and gap locks work there.
By the way
MySQL server team has a blog where they give a deep understanding of what happens under the hood. There are several articles about data locking: https://mysqlserverteam.com/innodb-data-locking-part-3-deadlocks/.