Tuesday, January 17, 2023

Manage Data Over Time with SQL Server MERGE Statement

This article is part of Greg Larsen's continuing series on Learning T-SQL. To see all 8 items in the series, click here.

The basic MERGE statement

The MERGE statement was introduced with SQL Server 2008. It provides a single statement to perform UPDATE, INSERT and/or DELETE operations against a target table. The syntax for the basic MERGE statement that will be discussed, in this article, can be found in Figure 1:

MERGE
[ INTO ] <target_table>

USING <table_source> [ [ AS ] table_alias ]
   ON <merge_search_condition>

[ WHEN MATCHED [ AND <clause_search_condition> ]
  THEN <merge_matched> ] [ ...n ]

[ WHEN NOT MATCHED [ BY TARGET ] [ AND <clause_search_condition> ]
  THEN <merge_not_matched> ]

[ WHEN NOT MATCHED BY SOURCE [ AND <clause_search_condition> ]
  THEN <merge_matched> ] [ ...n ];

Figure 1: Basic Merge Statement

Where:

  • target_table – identifies the table or view from which rows will be matched, so the appropriate update, insert, or delete actions can be performed based on the matching criteria.
  • table_source –identifies the data source rows for which target_table rows are matched.
  • merge_search_condition – identifies the join criteria for matching target_table rows, with the source_table rows.
  • merge_matched – identifies either an UPDATE or DELETE statement to issue based on matching criteria.
  • merged_no_matched – identifies the INSERT statement to issue for each row not matched.
  • clause_search_condition – identifies any valid search condition.

For the complete syntax of the MERGE statement refer to the Microsoft documentation that can be found here.

To better understand how the MERGE statement maintains a target table, a few merge examples will be provided, in the sections below. But first some test data needs to be created.

Creating Initial Test Data

To show how the MERGE statement works a new database will be created. The new database will contain tables that I will use as a target and source for the examples, which will be populated with the initial test data. The examples in this article will show how a fictious manufacturing company named Widget Magic keeps this data synchronized using different MERGE examples.

The code in Listing 1 is used to create a database, the target and source table, and populates these tables with some initial rows of data. Note if you don’t want to create a new database the source and target tables can be created in the tempdb database.

-- Create repository for products
CREATE DATABASE MergeDB
GO

USE MergeDB;

SET NOCOUNT ON;

CREATE TABLE dbo.ProductInventory (
   ID int identity, 
   ProductName varchar(20),
   Qty int);

CREATE TABLE dbo.ProductRun(
   RunNum int,
   ProductName varchar(20), 
   Qty int);
-- Product Run #1

USE MergeDB;
GO

INSERT INTO dbo.ProductRun (RunNum, ProductName, Qty) VALUES
   (1,'Widget Basic',5), 
   (1,'Widget Pro',5);
GO

INSERT INTO dbo.ProductInventory (ProductName, Qty) VALUES
   ('Widget Basic',5),
   ('Widget Pro', 5);
GO

Listing 1: Create repository for products

In Listing 1 two tables are created: dbo.ProductInventory and dbo.ProductRun. The dbo.ProductInventory table tracks the inventory of products created by Widget Magic company. Whereas, the dbo.ProductRun table will be used to keep track of the products produced from each product run cycle. The rows of data inserted into these two tables represents the products produced from the first product run of widget products. During that first product run 5 Widget Basic and 5 Widget Pro items were produced.

For the purpose of this article the dbo.ProductInventory table will be used as the target table of the MERGE examples. Whereas, records from the dbo.ProductRun table will be used as the source table records.

Merge Example #1 – Inserting/Updating Using MERGE Statement

In order to show how the MERGE statement works to maintain the dbo.ProductInventory table, the Widget Magic company first needs to have a second product run cycle. The second run produced 5 more Widget Pro items and introduced a new product called Widget Super Pro, of which 5 are produced

The code in Listing 2 inserts two new rows into the dbo.ProductRun table to represent the new products produced by this second product run.

-- Product Run #2
USE MergeDB;
GO

INSERT INTO dbo.ProductRun (RunNum, ProductName, Qty) VALUES
   (2, 'Widget Pro',5),
   (2, 'Widget Super Pro',5);
GO

Listing 2: Inserting Production Run #2 Rows

To update the dbo.ProductInventory table, based on the products manufactured during the second product run, the MERGE statement in Listing 3 will be run.

-- Merge Example #1 
USE MergeDB;
GO

MERGE dbo.ProductInventory AS T  
USING (SELECT * FROM dbo.ProductRun WHERE RunNum = 2) AS S 
ON T.ProductName = S.ProductName 
-- Perform UPDATE when rows are matched 
-- between source and target
WHEN MATCHED THEN 
   UPDATE SET T.Qty = T.Qty + S.Qty
-- Perform INSERT when rows are not matched 
--between source and target
WHEN NOT MATCHED BY TARGET THEN 
   INSERT (ProductName,Qty) VALUES (S.ProductName, S.Qty);

-- Review Inventory
SELECT * FROM dbo.ProductInventory;
GO

Listing 3: Merge Example #2 – Inserting/Updating ProductInventory table

Report 1 shows the rows in the dbo.ProductInventory table after the MERGE statement was executed.

Report 2: Output of SELECT statement in Listing 5

The MERGE statement in Listing 3 compares the dbo.ProductInventory records, with the widgets products produced during the second product run. The dbo.ProductInventory table is the target table, and the rows returned from the subquery “SELECT * FROM dbo.ProductRun WHERE ProductRun = 2” identifies the source rows. The target and source rows are joined based on the merge_search_condition specified with the “ON” clause, which in this case was “T.ProductName = S.ProductName”.

A word of caution here, target table columns should only be compared with source columns. Do not use any target table columns to filter out rows as part of the “ON” clause. Doing so could produce unexpected or incorrect results.

When rows are matched between target and source rows, SQL Server assigns a matching condition for each row, based on the merge_search_condition. When this condition equates to true, the source row is known as a MATCHED with a target row. When the merge search condition is false the source table row is considered “NOT MATCHED”. NOT MATCHED then is be broken into two different conditions “NOT MATCHED BY TARGET” and “NOT MATCHED BY SOURCE”.

The “NOT MATCHED BY TARGET” in this example means a row exists in the source table that did not match a row in table based on the join criteria. Whereas the “NOT MATCHED BY SOURCE” condition means that a row exists in the target table that has no matching rows in the source table (look at Example 2 for to see a “NOT MATCHED BY SOURCE” example). Additionally, when both source and target table comparison columns contain NULL values, the matching value is considered “NOT MATCHED” for both target and source tables.

For each row in source table from second product run, that got a MATCHED condition an UPDATE operation against the target table was performed. The UPDATE operation increases the inventory value of the Qty column, on the matched target row, based on the Qty column value, of the matching row in the source table.

An INSERT operation was performed for each source row, from the second product run that got a NOT MATCHED BY TARGET condition. The INSERT operation uses the source table’s ProductName and Qty values to insert a new rows into the target table for each new product produced.

Report 2 shows that 5 more Widget Pro and Widget Super Pro items were added to the dbo.ProductInventory table. This MERGE statement showed how to insert and update rows into the target table, based on a matching condition with the source table. The MERGE statement can also perform DELETE statements.

Merge Example #2 – Inserting/Updating/Deleting using MERGE Statement

The Widget Magic company so far has had two different product runs of their widget products. The Widget Basic product has not been well received, as the product has been demonstrated to potential customers. Therefore management has decided that this product should be removed from the inventory, once the next product run has completed.

For the third product run, only the Widget Pro, and Widget Super Pro items are produced. The code in Listing 6 inserts these products from product run #3 into the dbo.ProductRun table.

-- Product Run #3
USE MergeDB;
GO

INSERT INTO dbo.ProductRun (RunNum, ProductName, Qty) VALUES
   (3, 'Widget Pro',100),
   (3, 'Widget Super Pro',50);
GO

Listing 4: Inserting Product Run #3 Rows

To show how the MERGE statement can insert, update, and delete rows from a target table, the code in Listing 5 merges will be run.

-- Merge Example #2 
USE MergeDB;
GO

MERGE dbo.ProductInventory AS T  
USING (SELECT * FROM dbo.ProductRun WHERE RunNum = 3) AS S 
ON T.ProductName = S.ProductName 
AND S.RunNum = 3
-- Perform UPDATE when both target and source MATCHED 
WHEN MATCHED THEN 
   UPDATE SET T.Qty = T.Qty + S.Qty
-- Perform INSERT when NOT MATCHED BY SOURCE
WHEN NOT MATCHED BY TARGET THEN 
   INSERT (ProductName,Qty) VALUES (S.ProductName, S.Qty)
-- Perform DELETE when NOT MATCHED BY SOURCE
WHEN NOT MATCHED BY SOURCE THEN
   DELETE;

-- Review Inventory
SELECT * FROM dbo.ProductInventory;
GO

Listing 5: Inserting/Updating/Deleting ProductionInventory table

The code in Listing 5 now contains a third WHEN condition, “NOT MATCHED BY SOURCE”. This WHEN condition is followed when there is a row in the target table that does not exist in the source table. This is the opposite of the insert where there were rows that existed in the source table that did not exist in the target. When no matching rows are found a DELETE statement is executed. Or in this case, the discontinued product name “Widget Basic” will be deleted from the target table.

Additionally, when target and source rows are MATCHED the inventory Qty column of the target table is updated, based on the number of widgets produced during the third product run. The NOT MATCHED BY TARGET condition is not fired because no new products were introduced during the third product run. The resulting rows in the dbo.ProductInventory after Listing 5 is executed, can be seen in Report 3.

Report 3: Output from SELECT statement in Listing 5.

The MERGE example in Listing 5 had single MATCHED, NOT MATCHED BY TARGET and NOT MATCHED BY SOURCE clauses. The MERGE statement can support multiple MATCH and NOT MATCHED conditions, as well be shown in the next example.

MERGE Example 3 – Multiple MATCH and NOT MATCH conditions

To show how multiple MATCH and NOT MATCH conditions can be used on a MERGE statement, some new data management requirements are needed.

The first new requirement is a soft delete. To implement the soft delete the dbo.ProductInventory table needs to be modified to contain a soft delete flag.. Plus the Widget Basic inventory row that was deleted in the last MERGE example, will need to be re-inserted into the changed dbo.ProductInventory table. The code in Listing 6 will implement these two different changes to the target table.

-- Implement Soft Delete Requirements
USE MergeDB
GO

-- Add Soft Delete column
ALTER TABLE dbo.ProductInventory ADD
        ToBeDeleted char(1) NULL
GO

-- Re-introduce Widget Basic product
INSERT INTO dbo.ProductInventory (ProductName, Qty, ToBeDeleted) VALUES
   ('Widget Basic',5,'Y')
GO

Listing 6: Changes needed to implement new soft delete requirements

The second requirement is to perform a hard delete of a dbo.ProductInventory record, if a dbo.ProductRun record is found to have a Qty value of zero (0).

For the fourth product run only the Widget Pro items is produced. Additional a Widget Super Pro record with a Qty value of 0 will be added to the dbo.ProductRun table. This record was added so a hard delete of the dbo.ProductInventory record could be performed. Listing 7 contains the INSERT statements for these two products, for the 4th product run.

-- Product Run #4
USE MergeDB;
GO

INSERT INTO dbo.ProductRun (RunNum, ProductName, Qty) VALUES
   (4, 'Widget Pro',100),
   (4, 'Widget Super Pro',0);
GO

Listing 7: Product Run #4

To implement the two new data processing requirements the MERGE statement in Listing 8 will be used. This statement contains multiple MATCHED and NOT MATCHED BY merge options.

USE MergeDB;
GO

MERGE dbo.ProductInventory AS T  
USING (SELECT * FROM dbo.ProductRun  WHERE RunNum = 4) AS S 
ON T.ProductName = S.ProductName 
WHEN MATCHED and S.Qty = 0 THEN 
   DELETE
WHEN MATCHED THEN
   UPDATE SET T.Qty = T.Qty + S.Qty
WHEN NOT MATCHED BY TARGET AND S.QTY <> 0 THEN 
   INSERT (ProductName,Qty) VALUES (S.ProductName, S.Qty)
WHEN NOT MATCHED BY SOURCE and T.ToBeDeleted = 'Y' THEN
   DELETE
WHEN NOT MATCHED BY SOURCE THEN
   UPDATE SET T.ToBeDeleted = 'Y';

-- Review Inventory
SELECT * FROM dbo.ProductInventory;
GO

Listing 8: Merge data with multiple MATCH and NOT MATCHED conditions

The SELECT statement at the bottom of Listing 8 created the output in Report 4.

Report 4: Output from final SELECT statement in Listing 8.

When there are 2 MATCHED BY clauses the first match clause requires an AND condition. The second MATCHED BY condition is only applied if the first MATCHED BY condition is not met. Additionally, when there are 2 MATCHED BY clauses one must perform an UPDATE and the other a DELETE.

Gotcha’s

Here are a few things that you might run into while using the MERGE statement. The first two were already mentioned, but are worth repeating.

  • Unexpected results can occur if the ON clause tries to filter rows based on target columns.
  • If no rows are returned from the source table when joining the target and source then the UPDATE or DELETE cannot reference a source column because there are no matching rows between target and source.
  • Therefore, care needs to be taken when joining target and source rows, so multiple source rows are not matched with a single target row. This usually means that the join between the target and source data sets needs to be one of equality.

    If more than return more than one source rows is matched then SQL Server will throw the following error:

    Msg 8672, Level 16, State 1, Line 110
    The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.

  • There are limitations to the clauses you can include. For example, when there are two different NOT MATCHED BY SOURCE conditions. When there are two NOT MATCHED BY SOURCE conditions the first one needs to have an AND condition, and the condition can only reference target columns. The second NOT MATCHED BY SOURCE condition is only applied if the first NOT MATCHED BY SOURCE is not met. Also one of the NOT MATCHED BY conditions needs to perform a DELETE action and the other an UPDATE action.

For a complete list of requirements and other considerations when using the MERGE statement please refer to the Microsoft Documentation referenced in The Basic MERGE Statement section above.

Cleanup

If you ran the example code in this article it created the MergeDB database on your test instance of SQL Server. The script in Listing 9 can be used to remove this database from your test instance

USE TempDB

DROP DATABASE MergeDB;
GO

Listing 9: Removing the MergeDB database

Manage Data Over Time Using the Basic MERGE Statement

If you are using SQL Server 2008 or above the MERGE statement is supported. A single MERGE statement can be used to replace individual INSERT, UPDATE, and/or DELETE statements. The MERGE statement is ideal for helping manage source and target tables for complex extract, transform, and load situations. Next time an INSERT, UPDATE and DELETE statements are required to maintain the data in a target table, consider using the MERGE statement, to support these requirements.

 

The post Manage Data Over Time with SQL Server MERGE Statement appeared first on Simple Talk.



from Simple Talk https://ift.tt/vGkbAVf
via

No comments:

Post a Comment