I am in the middle of working on my DB Design conference that occurs in a bit over a week from now. But I had this come up in some work I was doing and wanted to put it down in a blog.
One of the most exciting features of SQL Server 2005 was the inclusion of Common Table Expressions (CTE). Code that often needed a tangle of temp tables could be now be done in a single query (Derived tables can be used too, but I can’t remember when derived tables started in SQL Server, but it may have been 2005, or perhaps 2000).
The problem is, often you want to write a query to look for bad data, fix the bad data in the table, then use the same base query in a procedure/trigger or testing/validation code. If you have used a CTE in your query, this can sometimes be tricky as they cannot be used in a conditional like IF EXISTS (queryWithCTE).
For example, say your query was the following (using WideWorldImporters), where you want to make sure that a customer only ordered one size of product (this predicate is clearly senseless (and no data meets this requirement), but uses it WWI data which is simple and something that everyone can easily attain and fiddle with.) So, you might write the following query:
WITH CustomerOrderedSizes AS ( SELECT DISTINCT Customers.CustomerID, StockItems.Size FROM Sales.Customers JOIN Sales.Orders ON Orders.CustomerID = Customers.CustomerID JOIN Sales.OrderLines ON OrderLines.OrderID = Orders.OrderID JOIN Warehouse.StockItems ON StockItems.StockItemID = OrderLines.StockItemID WHERE StockItems.Size IS NOT NULL ) SELECT CustomerId, COUNT(*) FROM CustomerOrderedSizes GROUP BY CustomerOrderedSizes.CustomerID HAVING COUNT(*) > 1;
Note: There is another, simpler way to write this query using COUNT(DISTINCT Size) in a HAVING clause without the CTE. The technique to find duplicates is not the point of this article, it is the technique of using a query that needs WITH in it in a conditional, and a real example would be a lot more complex to build (this query is unwieldy enough).
This finds that there are customers who have ordered more than one size (in fact that is the case for every customer in this database that ordered products that record a size). So you might clean up the data, and in your code, want to stop them from doing it again. More than once I have they tried to take the query I have written with the CTE, shove it in an IF EXISTS() construct, not even thinking whether it would run:
IF EXISTS ( WITH CustomerOrderedSizes AS ( SELECT DISTINCT Customers.CustomerID, StockItems.Size FROM Sales.Customers JOIN Sales.Orders ON Orders.CustomerID = Customers.CustomerID JOIN Sales.OrderLines ON OrderLines.OrderID = Orders.OrderID JOIN Warehouse.StockItems ON StockItems.StockItemID = OrderLines.StockItemID WHERE StockItems.Size IS NOT NULL ) SELECT CustomerId, COUNT(*) FROM CustomerOrderedSizes GROUP BY CustomerOrderedSizes.CustomerID HAVING COUNT(*) > 1) THROW 50000,'Customers can only order one size, for some weird reason',1;
Only to be greeted with:
Msg 156, Level 15, State 1, Line 18 Incorrect syntax near the keyword 'WITH'. Msg 102, Level 15, State 1, Line 32 Incorrect syntax near ')'.
If, like this query, you only have one, non-recursive CTE, the query could be easily rewritten with the CTE in a derived table such as:
SELECT … FROM (<CTE CODE>) as CTEName …
But often the query may have a CTE that has multiple parts, each referencing the previous CTE, and perhaps one or more of the CTEs being referenced multiple times. This cannot easily be rewritten. Instead, you can use the SQL statement to assign a value to a variable. In its simplest form, that might be either 1 or 0. For In the following format:
DECLARE @condition bit = 0; WITH CheckThis AS ( SELECT 1 AS Value --1 for fail, 0 for succeed ) SELECT TOP (1) @condition = 1 --if one row matches the criteria, then it should fail, so TOP 1 FROM CheckThis WHERE Value = 1; IF @condition = 1 THROW 50000,'Failed', 1;
This lets you keep the CTE oriented code and catch that at least one value has failed (which is to say, has succeeded from the query’s point of view). So expanding this to my previous example, the query would look like the following:
DECLARE @condition bit = 0; WITH CustomerOrderedSizes AS (SELECT DISTINCT Customers.CustomerID, Size FROM Sales.Customers JOIN Sales.Orders ON Orders.CustomerID = Customers.CustomerID JOIN Sales.OrderLines ON OrderLines.OrderID = Orders.OrderID JOIN Warehouse.StockItems ON StockItems.StockItemID = OrderLines.StockItemID WHERE Size IS NOT NULL) SELECT TOP(1) @condition = 1 FROM CustomerOrderedSizes GROUP BY CustomerOrderedSizes.CustomerID HAVING COUNT(*) > 1; IF @condition = 1 THROW 50000, 'Customers can only order one size, for some weird reason', 1;
You can test that it works by changing the HAVING clause to = 1, as there are no customers that have ordered a single size. No error message will be thrown.
Lastly, if you really want to get a bit more interesting with your error messages, you can grab some information in the query, for example a customer name that violated the rules. Also, to tell if multiple violations have occurred, instead of TOP (1), get TOP(2) and if the rowcount <> 1, you can know that multiple rows fail the check.
DECLARE @condition bit = 0, @CustomerName nvarchar(100), @rowCount int = 0, @msg nvarchar(1000); WITH CustomerOrderedSizes AS ( SELECT DISTINCT Customers.CustomerID, Size FROM Sales.Customers JOIN Sales.Orders ON Orders.CustomerID = Customers.CustomerID JOIN Sales.OrderLines ON OrderLines.OrderID = Orders.OrderID JOIN Warehouse.StockItems ON StockItems.StockItemID = OrderLines.StockItemID WHERE size IS NOT NULL ) SELECT TOP (2) @condition = 1, @CustomerName = MAX(CustomerName) --if there are more than one, that's enough to know there are multiples FROM CustomerOrderedSizes JOIN Sales.Customers ON Customers.CustomerID = CustomerOrderedSizes.CustomerID GROUP BY CustomerOrderedSizes.CustomerID HAVING COUNT(CustomerOrderedSizes.Size) > 1; SET @rowCount = @@ROWCOUNT; IF @Condition = 1 BEGIN IF @rowCount = 1 SET @msg = 'Customer: ' + @CustomerName + ' ordered > one size'; ELSE SET @msg = 'Multiple customers ordered > one size. Example Customer: "' + @CustomerName + '"'; THROW 50000,@msg,1; END;
Now you have an error message that gives you a meaningful place to look, AND indicates at least one example for you to check (you could use the PK of the customer to pass to an application in your error message.
This returns:
Msg 50000, Level 16, State 1, Line 30 Multiple customers ordered > one size. Example Customer: "Can ozcan"
Note that I still used the @condition variable, rather than using rowcount, or the name of the customer. Rowcount is not a bad choice, but it is easy for some other coder to inadvertently mess up fetching the rowcount, because you have to get @@ROWCOUNT in the very next statement or it can be cleared. CustomerName is tricky too, because you need to make sure that you pick a value that CustomerName can never be. Maybe NULL? Seen it. A customer named ‘’, possible if you do not have constraints to prevent empty data. So using a common variable that you trust to be set by a literal just feels safer, and adding on error message stuff is less dangerous in the long run. It all seems kind of annoying until it is 3 in the morning and you get an error message that you don’t have a value to start looking for in your source query.
Now, if we could just get a few horrifying error messages from SQL Server to do the same. I am looking at you truncation message:
Msg 8152, Level 16, State 4, Line 1 string or binary data would be truncated.
The post Using WITH in an IF Condition appeared first on Simple Talk.
from Simple Talk https://ift.tt/2wGcqn7
via
No comments:
Post a Comment