Tuesday, July 8, 2008

How to create SQL Server temp tables without collation problems

In SQL Server it is possible to create temporary tables which are special tables that are automatically cleaned up when they go out of scope. There are two types of temporary tables, global and local temporary tables.

Global Temporary Tables are visible in all sessions currently connected to the SQL Server instance and only go out of scope when the last session disconnects from the server. It is easy to recognise global temporary tables because their name starts with 2 #-signs. (for example "##customers")

Local Temporary Tables are visible only in the current session and go out of scope when this session ends. Their name always starts with a single #-sign. (for example "#customer")

Traditionally people create temporary tables the same way as they create normal tables

CREATE TABLE #RockAndRoll(
  ID           int     primary key 
  SongTitle    nvarchar(20),
  Genre        nvarchar(200),
  Sales        int
)

This seems like a perfectly reasonable way to create a temp table. Unfortunately SQL Server has one big flaw.

Presume we have the following situation. My SQL Server is installed with a Latin1_General_Bin collation and we create a database with a different collation.

CREATE DATABASE strangeCollationDb COLLATE Latin1_General_CI_AI_KS_WS
GO
USE strangeCollationDb 

Then we create a table in this database with some details about Rock Stars and a temp table with additional info.

CREATE TABLE RockStars (id int, name varchar(50), location varchar(400))
INSERT INTO RockStars VALUES (1, 'Mick Jagger', 'USA')
INSERT INTO RockStars VALUES (2, 'Jim Morrison', 'Heaven or Hell')

CREATE TABLE #RockStarInfo (name varchar(50), information varchar(400))
INSERT INTO #RockStarInfo VALUES ('Mick Jagger', 'Alive and kicking')
INSERT INTO #RockStarInfo VALUES ('Jim Morrison', 'Passed away')

This looks like a perfectly valid situation, but when we run the following query the problems start:

SELECT * 
FROM Rockstars s
  LEFT JOIN #RockStarInfo i
  ON s.name = i.name

Msg 468, Level 16, State 9, Line 2
Cannot resolve the collation conflict between "Latin1_General_BIN" and "Latin1_General_CI_AI_KS_WS" in the equal to operation.

When we inspect the metadata closely we find the following:

SELECT db_name() as db, TABLE_NAME, COLUMN_NAME, COLLATION_NAME 
FROM INFORMATION_SCHEMA.COLUMNS 
WHERE TABLE_NAME   = 'RockStars' 
  AND TABLE_SCHEMA = current_user
  AND COLUMN_NAME  = 'name'
 
SELECT 'tempdb' as db, TABLE_NAME, COLUMN_NAME, COLLATION_NAME 
FROM tempdb.INFORMATION_SCHEMA.COLUMNS 
WHERE TABLE_NAME   like '#RockStarInfo%' 
  AND TABLE_SCHEMA = current_user
  AND COLUMN_NAME  = 'name'
dbTABLE_NAMECOLUMN_NAMECOLLATION_NAME
strangeCollationDbRockStarsnameLatin1_General_CI_AI_KS_WS
tempdb#RockStarInfo...nameLatin1_General_BIN

For some reason SQL Server will create temporary tables using the collation of tempdb instead of inheriting the collation of the current db in the connection context.

But all is not lost. If we simply use a different approach to creating our temporary table we can solve all collation problems without adding collate statements to our code. Besides using the create table statement it is also possible to use SELECT INTO to create a table.

SELECT cast(null as varchar(50)) as name, cast(null as varchar(400)) as information
INTO #RockStarInfo2 
INSERT INTO #RockStarInfo2 VALUES ('Mick Jagger', 'Alive and kicking')
INSERT INTO #RockStarInfo2 VALUES ('Jim Morrison', 'Passed away')

SELECT * 
FROM Rockstars s
  LEFT JOIN #RockStarInfo2 i
  ON s.name = i.name
idnamelocationnameinformation
1Mick JaggerUSAMick JaggerAlive and kicking
2Jim MorrisonHeaven or HellJim MorrisonPassed away

This time no errors were raised and if we check the metadata in tempdb we can see that the collation is the same as the collation in the source db.

SELECT 'tempdb' as db, TABLE_NAME, COLUMN_NAME, COLLATION_NAME 
FROM tempdb.INFORMATION_SCHEMA.COLUMNS 
WHERE TABLE_NAME   like '#RockStarInfo2%' 
  AND TABLE_SCHEMA = current_user
  AND COLUMN_NAME  = 'name'
dbTABLE_NAMECOLUMN_NAMECOLLATION_NAME
tempdb#RockStarInfo...nameLatin1_General_CI_AI_KS_WS

The End... or not completely...

When all you use are the base system datatypes the problems are solved, unfortunately when you want to use alias data types the syntax I put in the example does not work since you can only use CAST() with system data types.

Fortunately a (bit more convulated) solution exists for this as well.

CREATE TYPE NameType FROM varchar(50)
CREATE TYPE LocationType FROM varchar(50)

CREATE TABLE MoreRockStars (id int, name NameType, location LocationType)
INSERT INTO MoreRockStars VALUES (1, 'Jeff Buckley', 'Heaven')
INSERT INTO MoreRockStars VALUES (2, 'Neil Young', 'USA')

DECLARE @Name     NameType, 
        @Location LocationType

SELECT @Name as name, @Location as information into #RockStarInfo3 
INSERT INTO #RockStarInfo3 VALUES ('Jeff Buckley', 'Drowned sadly')
INSERT INTO #RockStarInfo3 VALUES ('Neil Young', 'Keeps on rocking')

SELECT * 
FROM MoreRockstars s
  LEFT JOIN #RockStarInfo3 i
  ON s.name = i.name
idnamelocationnameinformation
1Jeff BuckleyHeavenJeff BuckleyDrowned sadly
2Neil YoungUSANeil YoungKeeps on rocking

As you can see, it is not possible to use the alias data type directly in the SELECT INTO statement, but it is possible to declare a variable with the datatype and then use the variable in the SELECT INTO statement.

And again, when we check the table layout in tempdb, we can see the collation has been inherited.

SELECT 'tempdb' as db, TABLE_NAME, COLUMN_NAME, COLLATION_NAME 
FROM tempdb.INFORMATION_SCHEMA.COLUMNS 
WHERE TABLE_NAME   like '#RockStarInfo3%' 
  AND TABLE_SCHEMA = current_user
  AND COLUMN_NAME  = 'name'
dbTABLE_NAMECOLUMN_NAMECOLLATION_NAME
tempdb#RockStarInfo...nameLatin1_General_CI_AI_KS_WS

When you start supporting the internationalization features of SQL Server, many of the development practices you have learned and applied for years become invalid. And it is important to test your software in "non standard" installations like databases deployed with different collations.

4 comments:

David Cumps said...

Great solution :)

Anonymous said...
This post has been removed by a blog administrator.
Anonymous said...
This post has been removed by a blog administrator.
Sundee said...

Thanks! I thought I would find the last page on the internet before I found out how to address that cryptic error. Well done!