Xaprb

Stay curious!

A bug in Microsoft SQL Server’s replace() function

with 5 comments

I have found a bug in Microsoft SQL Server’s replace() function. The function treats char and varchar differently, and replacements will fail in some cases. Here is a demonstration of the bug:

declare @string varchar(40),
    @find char(2),
    @replace char(1)
set @string = 'two  spaces'
set @find = '  '
set @replace = ' '
if replace(@string, @find, @replace) = 'two spaces'
    print 'Replacement worked'
else
    print 'Replacement failed'

Result: “Replacement failed.” This is clearly incorrect behavior. To make it more interesting, here are some test cases that succeed:

  1. Change the type of @find to varchar(2).
  2. Change the value of @string to '  ' (two spaces).
  3. Use replicate(@replace, 2) instead of @find.

I found another discussion about this on Google groups, but it’s incoherent, doesn’t demonstrate the problem clearly, and speculates about the internal causes rather than suggesting ways to avoid the behavior (pointless). So I count that as “not posted elsewhere” and offer it here.

Written by Xaprb

November 15th, 2005 at 9:56 pm

Posted in SQL

5 Responses to 'A bug in Microsoft SQL Server’s replace() function'

Subscribe to comments with RSS or TrackBack to 'A bug in Microsoft SQL Server’s replace() function'.

  1. I did a bit more testing on this and it appears to be a problem with the replace function trimming char variables. Changing the section

    set @replace = ' ' 

    to

    set @replace = '*'

    and outputting the replace exhibits the extent of the problem. The result is:

    hi**there

    Furthermore, changing @string to 'This is a test' and @find = 's ' with @replace = '*' results in:

    Thi*  i* a te*t

    Note that the 's' in 'test' has been changed although it has no spaces anywhere around it. This is the strongest evidence that trimming is going on.

    Leading spaces are not affected. The more complex @string with @find = ' i' results in:

    This *s  a  test

    as expected.

    The replicate inside the replace works because replicate returns a varchar and so actually falls into the first case you mentioned which, as you noted, works correctly.

    Finally, it is not a problem with char variables in general, because the following code honors the ANSI_PADDING setting while the replace issue does not:

    set ansi_padding on/off
    declare @string char(2), @string2 varchar(2)
    set @string = 's '
    set @string2 = @string
    select '-' + @string2 + '-'

    In my work with extended procedures I have found that string constants appear to be passed as varchars and the string functions also return varchars, so unless you are reading the values from a table (in which case the variable should match the column definition) using varchars for string variables might be a good practice anyway by matching the source type (don’t even get me started on the implicit type conversions. Oh, for a strongly typed language…) At worst even if the constant string completely fills the varchar variable the extra two (or four) bytes of storage are only two bytes, not gigabytes as it would be in a billion row table.

    John Miller

    21 Nov 05 at 12:26 am

  2. The documentation for space states the return type is char but the following still succeeds:

    if replace('two  spaces', space(2), space(1)) = 'two spaces'
        print 'Replacement worked'
    else
        print 'Replacement failed'

    Apparently the return value of the function is either not as stated, or the input to the replace function isn’t where it’s being trimmed.

    Xaprb

    23 Nov 05 at 1:40 pm

  3. Microsoft’s Abdy Iman has responded to tell me someone is looking into the bug. They’ve confirmed it and that it is still present in SQL Server 2005.

    Xaprb

    15 Dec 05 at 5:41 pm

  4. I tripped over this bug too when trying to “unabbreviate” data in a view. The following failed because of the bug:

    SELECT DISTINCT
         REPLACE(REPLACE(department, 'sch of ', 'School of '), 'sch ', 'School of ') AS  deptExpanded, [position]
    FROM         vWebDirectory
    ORDER BY deptExpanded, [Position]

    All instances of ’sch’ were replaced by ‘School of’, including within ‘School’ etc.

    An ugly workaround was developed that may be useful in similar circumstances, see below;

    SELECT DISTINCT
                          CASE WHEN charindex('sch of ', department)  0 THEN replace(department, 'sch of ', 'School of ') WHEN charindex('sch ', department)
                           0 THEN replace(department, 'sch ', 'School of ') ELSE department END AS deptExpanded, [position]
    FROM         vWebDirectory
    ORDER BY deptExpanded, [Position]

    Phred

    21 Mar 06 at 11:32 pm

  5. This solved the problem for me…

    select replace(fld, N’ ‘, N’ ‘)
    from table

    This would help when fld is an nvarchar instead of varchar.

    Anon

    17 May 07 at 1:20 pm

Leave a Reply