Friday, November 23, 2007

How to make a thread safe insert/update procedure on SQL Server

When developing an application that does a lot of inserts and updates, it is sometimes easier to wrap both statements in one stored procedure. This results in a simple procedure that will check if a certain record exists. When no record is found a new one is inserted, when a record is found the values are updated.

We will use the following table script as example to explain the problem:

create table updatetest 
  (
   id     int primary key, 
   name   varchar(20), 
   amount numeric
  )


insert into updatetest values (1, 'first', 100)
insert into updatetest values (2, 'second', 500)
insert into updatetest values (3, 'third', 30)
insert into updatetest values (4, 'fourth', 200)

When making an insert/update procedure most people write something like this:

CREATE PROCEDURE updatevalues(@id     int, 
                              @name   varchar(20), 
                              @amount numeric)
AS

IF EXISTS(SELECT * FROM updatetest WHERE id = @id)
  BEGIN
    UPDATE updatetest 
    SET name = @name, 
        amount= @amount
    WHERE id = @id
  END
ELSE
  BEGIN
    INSERT INTO updatetest VALUES (@id, @name, @amount)
  END
This looks like the perfect solution to our problem except for one detail. When this procedure is executed from several threads at the same time to update the same record, it will crash.

Let's alter our procedure script a bit to make it easier to reproduce the problem.

CREATE PROCEDURE updatevalues(@id     int, 
                              @name   varchar(20), 
                              @amount numeric)
AS

IF EXISTS(SELECT * FROM updatetest WHERE id = @id)
  BEGIN
    waitfor delay '00:00:10'
    UPDATE updatetest 
    SET name = @name, 
        amount= @amount
    WHERE id = @id
  END
ELSE
  BEGIN
    waitfor delay '00:00:10'
    INSERT INTO updatetest VALUES (@id, @name, @amount)
  END

Compile this script and then run the following statements (at the same time) in 2 query windows.

exec updatevalues 7, 'Seven', 777

Surprise surprise, look at the error we get in the second query window...

"Msg 2627, Level 14, State 1, Procedure updatevalues, Line 17
Violation of PRIMARY KEY constraint 'PK__updatetest__29AC2CE0'. Cannot insert duplicate key in object 'updatetest'.
The statement has been terminated.

(0 row(s) affected)

Why does this happen? Simple, the first execution checks if the record exists and decides to insert a new record because the if exists returns false. The second thread is executing the same query at the same time and of course the existance check returns false as well. They both try to insert and a primary key error happens.
Threading without locking gives errors.

Some people think that this will never happen, but they are wrong, from the moment you are working with large amounts of data and users, this conflict will happen very often.

Luckily there is an easy way to solve this (very annoying) problem by using Lock hints in our advantage.

We will update our test procedure to include a transaction and an Exclusive Lock:

CREATE PROCEDURE updatevalues(@id     int, 
                              @name   varchar(20), 
                              @amount numeric)
AS

BEGIN TRANSACTION
IF EXISTS(SELECT * FROM updatetest (XLOCK, SERIALIZABLE) WHERE id = @id)
  BEGIN
    waitfor delay '00:00:10'
    UPDATE updatetest 
    SET name = @name, 
        amount= @amount
    WHERE id = @id
  END
ELSE
  BEGIN
    waitfor delay '00:00:10'
    INSERT INTO updatetest VALUES (@id, @name, @amount)
  END
COMMIT TRANSACTION
RETURN(0)

The updated code has as effect that the second thread will wait until the first thread has completed it's operation. It is no longer possible that another operation on the same record is "sneaking" in between the check and the insert or update.
Threading without locking gives errors.

Since I simplified the procedure script for the example you can download a complete version of the code.