Monday, February 18, 2019

How To, and Not To, Escape a String In TSQL

You are never too old to learn an important lesson. Like most programmers, I read programming language documentation for one of two reasons. 1. You have just found out about something new 2. You find out that you were wrong about something you thought you had read the first time, probably 10 years ago. Today’s blog came from an idea that arose from a mistake I made where I made a faulty assumption, and kinda exploded from there into multiple possibilities. This mistake was followed up by reading the documentation for the statement I used incorrectly.

One of my favorite SQL commands has been QUOTENAME. A common programming need when generating code is the need to surround a string value with quotes, and escape any characters that are the same as you are surrounding the string with, with doubles (and if you need doubles of the character in the string, you then need four of the characters.) For example, to surround the following string with single quotes (‘):

Mr. O’Malley

Possibly to use in a dynamic statement or variable declaration, you need to double up on the single quotes in the string, as in:

‘Mr. O”Malley’

Or if you are Rob Volk (@sql_r on Twitter), and you want to create an annoying database on your best frenemy’s SQL Server that includes brackets in the name, like:

This [database] Is Awesome

You will need to do:

CREATE DATABASE [This [database]] Is Awesome];

Doubling up the closing brackets, but not the leading ones. A way to do this is to use QUOTENAME, which has parameters of: string, and separator, by default it works with double brackets like most of SQL Server’s names, but you can give it any character to double up for you. So for our string:

DECLARE @Value nvarchar(15) = 'Mr. O''Malley';
SELECT @Value, QUOTENAME(@value) AS objectName, 
       QUOTENAME(@value,'''') AS string;

This returns:

objectName          string
--------------- ------------------- --------------------
Mr. O'Malley    [Mr. O'Malley]      'Mr. O''Malley'

I cut back on the size of the two items to the right in the output, which were both quite long characters in the textual output (* more details later). This seems to work great, so you feel like if you needed to generate some code, that you might want to do this:

DECLARE @Value nvarchar(200) = 'Mr. O''Malley';
SELECT  CONCAT('SELECT ',QUOTENAME(@Value,''''));

The output is

SELECT ‘Mr. O”Malley’

Paste this in a query execution window and execute. You will see the output is the string we started with. Problem is, QUOTENAME (,bum bum buummmmm, sung with dramatic effect), is there to quote a SQL Server Name value, and since names are not longer than 128 characters in SQL Server…the input you are quoting must be 128 characters or less (*the output, as I demonstrate later can be greater). So:

SELECT QUOTENAME(REPLICATE('a',200));

Where the input is greater than 128 characters, returns NULL, and it does it quite silently. Which is generally not what you will want. In my case, I was designing an extended property script generator, which will take the sql_variant and cast the value to an nvarchar(max). I didn’t test with 129 and greater character inputs, but a coworker had graciously provided me with a string that was like 8000 characters. Luckily this wasn’t in production where my on-call phone was ringing, but still annoying enough/

So, I would suggest to only use QUOTENAME for code destined for production use when actually quoting name values for a SQL Statement, and use the more awkward to code method of:

DECLARE @Value nvarchar(15) = 'Mr. O''Malley'
SELECT @Value, 
       '''' + REPLACE(@value,'''','''''') + '''' AS              
                                      AwkwardButRightWay, 
       QUOTENAME(@Value,'''') AS QUOTENAME;

Which, if you execute it, will have the same output for both of the second columns.

One thing to note is that this (and really any simple repetitive scalar function) looks ripe for a user defined function. In versions prior to 2019, the advice had always been clearly to avoid them because they were, to say the least…performance limiting. In some cases, they caused a small slowdown. In others, massive performance holes. In SQL Server 2019, Microsoft has changed how some scalar functions can be used, and will “inline” the code of the function into the query plan that is using it. The net effect is that the performance is on par with re-coding the statement over and over instead of using a reusable function (even for some complex functions!).

For example, let’s build the function as such, in the WideWorldImporters database:

1234567890123456789012345678901234567890123456789012345678901234567890123
USE WideWorldImporters
GO
CREATE SCHEMA Utility
GO
--always doing single quote for my function
CREATE OR ALTER FUNCTION Utility.EscapeString ( @string nvarchar(max) ) 
RETURNS nvarchar(max)
WITH RETURNS NULL ON NULL INPUT
AS
  BEGIN
     --uncommenting this makes it not inlineable 
        --DECLARE @datevalue date = GETDATE();    
     RETURN '''' + REPLACE(@string,'''','''''') + '''';
  END;

You can execute it as such:

SELECT Utility.EscapeString(CityName)
FROM   Application.Cities;

And you can check if it can be inlined using:

SELECT is_inlineable
FROM   sys.sql_modules
WHERE OBJECT_ID = OBJECT_ID('Utility.EscapeString');

Which for our new function, returns 1.

Inlineable or not, you won’t see any change in this simple usage, I just wanted to point this out as a more viable possibility in the future. For more details on inlining functions, check out Brent Ozar’s post here:https://www.brentozar.com/archive/2018/11/sql-server-2019-faster-functions/.


* Finally, the next bit will look a bit like a stupid code trick, but entirely useful for determining how large a quoted string could be. I wondered what the largest output could be of the QUOTENAME function. If you have the following statement:

SELECT QUOTENAME(']')

It returns: []]] And:

SELECT QUOTENAME(']]')

Returns: []]]]], which is 2 characters longer than the original string. If we have 128 ] characters, we should get SELECT 127 * 2 + 4 or 258 characters, which the following statement will bear out…

SELECT LEN(QUOTENAME(REPLICATE(']',128)))

As would have simply reading the documentation for QUOTENAME, where it says the output if nvarchar(258)! Hence, for handling what size variable/column you need to accommodate from the EscapeString function is basically:

((Max length of your input) – 1) * 2 + 4

Where the max length isn’t two billion like varchar(max) will accommodate, but the max length that your source data can be.  Note of course that when you use this value with all of the escaped strings, that it will look larger than the target value, but just because we need space for 258 characters to hold the name, it does not mean we need 258 characters when you use the name.  Just one more of the complexities you will run into when building a bulletproof code generation tool.

 

The post How To, and Not To, Escape a String In TSQL appeared first on Simple Talk.



from Simple Talk http://bit.ly/2X7uxie
via

No comments:

Post a Comment