How to make a thread safe insert/update procedure on SQL Server
When developing an application that does a lot of inserts and updates, it is sometimes easier to wrap both statements in one stored procedure. This results in a simple procedure that will check if a certain record exists. When no record is found a new one is inserted, when a record is found the values are updated.
We will use the following table script as example to explain the problem:
create table updatetest ( id int primary key, name varchar(20), amount numeric ) insert into updatetest values (1, 'first', 100) insert into updatetest values (2, 'second', 500) insert into updatetest values (3, 'third', 30) insert into updatetest values (4, 'fourth', 200)
When making an insert/update procedure most people write something like this:
CREATE PROCEDURE updatevalues(@id int, @name varchar(20), @amount numeric) AS IF EXISTS(SELECT * FROM updatetest WHERE id = @id) BEGIN UPDATE updatetest SET name = @name, amount= @amount WHERE id = @id END ELSE BEGIN INSERT INTO updatetest VALUES (@id, @name, @amount) END
Let's alter our procedure script a bit to make it easier to reproduce the problem.
CREATE PROCEDURE updatevalues(@id int, @name varchar(20), @amount numeric) AS IF EXISTS(SELECT * FROM updatetest WHERE id = @id) BEGIN waitfor delay '00:00:10' UPDATE updatetest SET name = @name, amount= @amount WHERE id = @id END ELSE BEGIN waitfor delay '00:00:10' INSERT INTO updatetest VALUES (@id, @name, @amount) END
Compile this script and then run the following statements (at the same time) in 2 query windows.
exec updatevalues 7, 'Seven', 777
Surprise surprise, look at the error we get in the second query window...
Violation of PRIMARY KEY constraint 'PK__updatetest__29AC2CE0'. Cannot insert duplicate key in object 'updatetest'.
The statement has been terminated.
(0 row(s) affected)
Why does this happen? Simple, the first execution checks if the record exists and decides to insert a new record because the if exists returns false. The second thread is executing the same query at the same time and of course the existance check returns false as well. They both try to insert and a primary key error happens. 

Some people think that this will never happen, but they are wrong, from the moment you are working with large amounts of data and users, this conflict will happen very often.
Luckily there is an easy way to solve this (very annoying) problem by using Lock hints in our advantage.
We will update our test procedure to include a transaction and an Exclusive Lock:
CREATE PROCEDURE updatevalues(@id int, @name varchar(20), @amount numeric) AS BEGIN TRANSACTION IF EXISTS(SELECT * FROM updatetest (XLOCK, SERIALIZABLE) WHERE id = @id) BEGIN waitfor delay '00:00:10' UPDATE updatetest SET name = @name, amount= @amount WHERE id = @id END ELSE BEGIN waitfor delay '00:00:10' INSERT INTO updatetest VALUES (@id, @name, @amount) END COMMIT TRANSACTION RETURN(0)
The updated code has as effect that the second thread will wait until the first thread has completed it's operation. It is no longer possible that another operation on the same record is "sneaking" in between the check and the insert or update. 

Since I simplified the procedure script for the example you can download a complete version of the code.
 
10 comments:
Thanks man!
You thought me a good lesson :)
Great post, thanks for sharing.
BTW, I just read a stripped down (and slightly altered) version of this post on another blog, someone else thinks the post is great that they decided to publish it on their own blog, under their name though :)
http://www.semdendoncker.com/blog/?p=17
Thank you Wal, I sent him a message to correctly attribute the work to me.
No problem, I will add your link as a source.
Sorry about copying your post.
No problem, I'm glad this was settled down
Great post!
I have one question though:
Would this code behave correctly?
--------------------
BEGIN TRANSACTION
UPDATE updatetest with(holdlock)
SET name = @name,
amount= @amount
WHERE id = @id
if (@@rowcount = 0)
INSERT INTO updatetest VALUES (@id, @name, @amount)
COMMIT TRANSACTION
---------------
I'm trying to avoid the extra reads of the EXISTS(...).
This is a perfectly viable solution, but performance wise it is similar to the "if exists approach"
Where the "update/insert" construction is faster on updates(44% of query plan), the "if exists" solution is faster on inserts (41% of query plan)
I got another question concerning the holdlock mechanism. In both cases, if no rows match the criteria, then an insert will be done.
My question is, what's locked during that period?
In the case of the update, the selected row is the only thing locked. But in the case of the insert, I'm not sure... From what I've read, it locks a range on the index of the query ... whatever that means ...
Do you have a better understanding of what happens with the locking in the case of the insert? Is it causing a table lock?
The "if Exists" creates a "Shared Table Lock", so during the lock period the other queries are blocked. Nothing you can do about that unfortunately.
Thanks Filip. That's EXACTLY what I was looking for :)
Post a Comment