Push Ups and String Concatenation

SQL sucks at string manipulation. It’s not just SQL Server, it’s the SQL language in general. I’ve mentioned this before on Stack Overflow in answer to the question “What are five things you hate about your favorite language?”It’s almost easier to return the raw rows to application code than it is to work with them inside of a database. There’s one place where SQL Server has made it easier to work with string data – string building.

I don’t mean string building in the name = 'Jeremiah' + ' ' + 'Peschka' sense. I mean bulk string building.

So, what started this line of inquiry? Why am I filling your RSS reader with gibberish about string manipulation instead of my usual Memorial Day Ascot Review?

A friend hit me up on twitter for some advice and help with a particular database problem. To paraphrase: Human readable messages are being stored in the database in 2000 character chunks. This is part of some kind of legacy database design and, as such, cannot be changed. Stupid legacy apps. How could this friend get his data out of the database and combine it into a single row that made sense to the users?

STUFF

STUFF is the answer – specifically the STUFF function. In fact, STUFF used to have the best Books Online page title:

this never gets old

The purpose of STUFF is to cram one string into another string and remove some characters in the process. How does this happen? I have an example. An annotated example.

Basically, this works because we’re using a trick with FORXML to let us produce a string of text that we then continue to push into itself. The STUFF function is used to remove the leading space from the final version of the string.

IF OBJECT_ID(N'tempdb..#t1') IS NOT NULL
  DROP TABLE #t1 ;

CREATE TABLE #t1
  (
   id INT IDENTITY(1, 1) ,
   message_text NVARCHAR(35) ,
   message_number INT 
  ) ;

INSERT  INTO #t1
        (
         message_text
        )
        SELECT  [name]
        FROM    master..spt_values
        WHERE   type = 'A' ;

-- set up a fake table with two "messages": 1 and 0
UPDATE  #t1
SET     message_number = 1
WHERE   id % 2 = 1 ;

UPDATE  #t1
SET     message_number = 2
WHERE   id % 2 = 0 ;

-- this grabs the message, as distinguished by the message_number
-- and the text of the message
-- the STUFF function mucks together a whole bunch of data and, 
-- in this case, starts at the 1st character and removes 1 
-- character, replacing it with ''
--
-- the FOR XML PATH('') is a dirty hack to do rapid string
-- building in T-SQL
SELECT  x.message_number ,
        STUFF((SELECT ' ' + message_text AS [text()]
               FROM   #t1 AS t
               WHERE  t.message_number = x.message_number
              FOR
               XML PATH('')
              ), 1, 1, '') AS the_message
FROM    #t1 AS x
GROUP BY x.message_number ;

The best part: this solution scales incredibly well.

Menu