Tuesday, August 18, 2020

Trigger Validations and Isolation Levels

Writing data integrity code in TRIGGER objects is generally is pretty simple. Typically you write a query to see “is there a row in inserted/deleted that is not set up as desired for the current transaction?” and no bad data is found, keep moving. And because MOST of the time, you would only use a TRIGGER object where transactions are quick coupled with the reality that a lot of the issues like I will describe would have to occur within milliseconds… concurrency issues seem to rarely occur. But…anything that isn’t impossible, is possible. Because of the rarity of the issue, if your code isn’t running a life of death system, when the anomalies happen, they are brushed off as “just something weird occurred”. 

This blog will cover one such weird occurrence that can happen when your trigger validation code queries another table in the default isolation level. You expect that your data is protected until your transaction ends, but it may not be.

As an example scenario, consider something like a Payment being processed for a Sale. For example, say the payment was set to ‘Processed’, but the user asked for a refund.  The users sets the Payment status to ‘Refunded’ and expects that there are no Sale rows recorded for that Payment. (The sale could be a shipment, etc.. Admittedly this scenario is very thin, and is really not a life or death scenario, but it should do for an example scenario to make the concurrency concerns clear.) Yet, somehow when the process has completed, it turns out that a Sale has been recorded.

To demonstrate the issue, I will first create two barebones tables, one for the Payment, and one for the Sale: 

CREATE SCHEMA Example;
GO

CREATE TABLE Example.Payment
(
    PaymentId int CONSTRAINT PKPayment PRIMARY KEY,
    StatusCode char(10) NOT NULL
)

--insert a Payment row
INSERT INTO Example.Payment(PaymentId, StatusCode)
VALUES (1,'Processed'), (2,'Processed');
GO

CREATE TABLE Example.Sale
(
    SaleId int CONSTRAINT PKSale PRIMARY KEY,
    PaymentId int NOT NULL
    CONSTRAINT FKSale$References$PaymentId 
       REFERENCES Example.Payment(PaymentId)
);

Next, I created a TRIGGER object to check to see if the Payment is Processed before inserting the Sale row: 

CREATE TRIGGER Example.Sale$InsertCheckStatusTrigger
ON Example.Sale
AFTER INSERT AS
BEGIN
    SET NOCOUNT ON; --to avoid the rowcount messages
    SET ROWCOUNT 0; --in case the client has modified the rowcount

    --use inserted for insert or update trigger, deleted for update or 
    --delete trigger count instead of @@ROWCOUNT due to merge behavior that 
    --sets @@ROWCOUNT to a number that is equal to number of merged rows, 
    --not rows being checked in trigger
    DECLARE @msg varchar(2000), --used to hold the error message

    --use inserted for insert or update trigger, deleted for update or 
    --delete trigger count instead of @@ROWCOUNT due to merge behavior that 
    --sets @@ROWCOUNT to a number that is equal to number of merged rows, 
    --not rows being checked in trigger
    @rowsAffected int = (SELECT COUNT(*) FROM inserted);

    --no need to continue on if no rows affected
    IF @rowsAffected = 0 RETURN;
    BEGIN TRY
      --[validation section]
      IF EXISTS (SELECT *
                 FROM inserted
                       JOIN Example.Payment
                         ON inserted.PaymentId = Payment.PaymentId
                 WHERE Payment.StatusCode <> 'Processed') --we already know PaymentId exists
        BEGIN
           IF @rowsAffected = 1 --custom error message for single row
              SELECT @msg = CONCAT('The status code of PaymentId ', inserted.PaymentId, 
                                                               ' is not ''Processed''.')
              FROM inserted; --and/or deleted, depending on action
           ELSE
              SELECT @msg = 'A PaymentId in the inserted Sale rows is not Processed';

           --in the TRY . . . CATCH block, this will redirect to the CATCH
           THROW 50000, @msg, 16;
        END;

      --[modification section]
    END TRY
    BEGIN CATCH
       IF @@trancount > 0
           ROLLBACK TRANSACTION;
       THROW; --will halt the batch or be caught by the caller's catch block
    END CATCH;
 END;

Now, a few basic tests:

--Works, real Payment, Payment is Processed
INSERT INTO Example.Sale(SaleId, PaymentId)
VALUES(1,1)
GO

INSERT INTO Example.Sale(SaleId, PaymentId)
VALUES(2,3) --not a real PaymentId

Hence the error message comes from the FOREIGN KEY constraint:

Msg 547, Level 16, State 0, Line 107
The INSERT statement conflicted with the FOREIGN KEY constraint "FKSale$References$PaymentId". The conflict occurred in database "tempdb", table "Example.Payment", column 'PaymentId'.

Now we set the Payment to Refunded and try again.

UPDATE Example.Payment
SET Payment.StatusCode = 'Refunded'
WHERE PaymentId = 1;

Execute the following statement.

INSERT INTO Example.Sale(SaleId, PaymentId)
VALUES(3,1);

And as expected, it returns the following error:

Msg 50000, Level 16, State 16, Procedure Sale$InsertCheckStatusTrigger, Line 40 [Batch Start Line 125]
The status code of PaymentId 1 is not 'Processed'.

Perfect. Now insert 2 rows:

INSERT INTO Example.Sale(SaleId, PaymentId)
VALUES(3,1) , (4,2) --2 is Processed, 1 is not

This shows us the multi-row error message. (Note: Always test multiple rows. I have been working on TRIGGER examples the past few days and I got something wrong in the object on my first try because I didn’t adequately allow for rows that have good values in them mixed with bad values.)

Msg 50000, Level 16, State 16, Procedure Sale$InsertCheckStatusTrigger, Line 40 [Batch Start Line 134]
A PaymentId in the inserted Sale rows is not Processed

However, what if you are in the process of inserting a Sale row when the status is changed. For example, since 2 is still Processed, lets start a transaction and insert a new Sale row, without committing the transaction

BEGIN TRANSACTION
INSERT INTO Example.Sale(SaleId, PaymentId)
VALUES(3,2);

Then on another connection, without starting a transaction, change the status:

UPDATE Example.Payment
SET StatusCode = 'Refunded'
WHERE PaymentId = 2;

This succeeds immediately. Now check the data:

SELECT SaleId, Payment.StatusCode, @@TRANCOUNT AS TranCount
FROM Example.Sale
        JOIN Example.Payment
            ON Payment.PaymentId = Sale.PaymentId
WHERE Sale.SaleId = 3;

This returns ‘Refunded’, even while we are in the active transaction.

SaleId      StatusCode TranCount
----------- ---------- -----------
3           Refunded   1

Neither transaction has been blocked from doing what they want, but if I commit this transaction, the data is not in the status that we desire, and we have presumably just shipped product that will have real costs to ship and get back. Obviously this is not a BIG issue for something like this, but it is not what you expect when you build the code this way. So I will rollback:

ROLLBACK;

Rather, what is needed is to change the isolation level of the validation code to REPEATABLEREAD. You can do this with an isolation level hint, or SET TRANSACTION ISOLATION LEVEL statement. I will change the trigger to use a REPEATABLEREAD hint on the query to Example.Payment:

ALTER TRIGGER Example.Sale$InsertCheckStatusTrigger
ON Example.Sale
AFTER INSERT AS
    BEGIN
    SET NOCOUNT ON; --to avoid the rowcount messages
    SET ROWCOUNT 0; --in case the client has modified the rowcount

    --use inserted for insert or update trigger, deleted for update or 
    --delete trigger count instead of @@ROWCOUNT due to merge behavior that 
    --sets @@ROWCOUNT to a number that is equal to number of merged rows, 
    --not rows being checked in trigger
    DECLARE @msg varchar(2000), --used to hold the error message

    --use inserted for insert or update trigger, deleted for update or 
    --delete trigger count instead of @@ROWCOUNT due to merge behavior that 
    --sets @@ROWCOUNT to a number that is equal to number of merged rows, 
    --not rows being checked in trigger
    @rowsAffected int = (SELECT COUNT(*) FROM inserted);

    --no need to continue on if no rows affected
    IF @rowsAffected = 0 RETURN;
    
    BEGIN TRY
      --[validation section]
      IF EXISTS (SELECT *
                 FROM inserted
                        JOIN Example.Payment WITH (REPEATABLEREAD)
                           ON inserted.PaymentId = Payment.PaymentId
                 WHERE Payment.StatusCode <> 'Processed') --we already know PaymentId exists
       BEGIN
          IF @rowsAffected = 1 --custom error message for single row
              SELECT @msg = CONCAT('The status code of PaymentId ', inserted.PaymentId, 
                                                                ' is not ''Processed''.')
              FROM inserted; --and/or deleted, depending on action
          ELSE
              SELECT @msg = 'A PaymentId in the inserted Sale rows is not Processed';

              --in the TRY . . . CATCH block, this will redirect to the CATCH
              THROW 50000, @msg, 16;
        END;

     --[modification section]
     END TRY
     BEGIN CATCH
       IF @@trancount > 0
           ROLLBACK TRANSACTION;
       THROW; --will halt the batch or be caught by the caller's catch block
     END CATCH;
END;

Reset the status of Example.Payment to Processed.

UPDATE Example.Payment
SET Payment.StatusCode = 'Processed'
WHERE PaymentId = 2;

Now start a transaction, and insert a new sale:

BEGIN TRANSACTION
INSERT INTO Example.Sale(SaleId, PaymentId)
VALUES(3,2);

Worked as expected. Then on another connection, without starting a transaction, change the status:

UPDATE Example.Payment
SET StatusCode = 'Refunded'
WHERE PaymentId = 2;

This statement is now blocked because the REPEATABLEREAD hint prevents rows that have been used in the transaction from changing, and hence it holds the shared locks until we commit our transaction. Look at the data:

SELECT SaleId, Payment.StatusCode, @@TRANCOUNT AS TranCount
FROM Example.Sale
       JOIN Example.Payment
           ON Payment.PaymentId = Sale.PaymentId
WHERE Sale.SaleId = 3;

This returns processed, and we won’t let them set the refunded status until we are finished with this row.

SaleId      StatusCode TranCount
----------- ---------- -----------
3           Processed  1

COMMIT the transaction and the UPDATE statement in the other connection will clear.

COMMIT;

Of course, after the status update on the Example.Payment row goes through, your data is still in the situation you didn’t wish it to be in. Naturally this is because the update of the Payment should have its own checks to make sure that if the order has shipped, you can’t change the status (which in this particular scenario, should have caused the Payment update to be blocked if it was coded to look at all of the data that used the PaymentId.)  That refund process probably would do something to try to cancel the Sale rows.

The basics here are that time you use a trigger (and really any code) to validate some piece of data, consider if you need to use REPEATABLE READ isolation level (to stop rows you have read from being changed or deleted) or SERIALIZABLE (to add to REPEATABLE READ that no new rows could be introduced that affect your validation either.)

Concurrency is really the hardest part of building T-SQL code. There are just so many tiny details like this that complicate matters. That one hint took 10 seconds to code, but without it, we might randomly send product to customers that they didn’t pay for… Which never happens, right? I mean, who hasn’t gotten some form of messed up order?. I know I have gotten double my order on occasion. Most of those problems likely come down to concurrency issues, or poor transaction management (or both.)

The post Trigger Validations and Isolation Levels appeared first on Simple Talk.



from Simple Talk https://ift.tt/3awVeDY
via

No comments:

Post a Comment