Friday, June 16, 2023

Working with MySQL transactions

MySQL transactions provide an effective method for executing multiple statements as a single unit, making it possible to safely modify data, while preventing concurrent users from updating the same data or overwriting changed data. If any statements within a transaction fail or do not achieve their desired results, the changes can be rolled back and the database returned to its original state, as it existed prior to starting the transaction. Otherwise, MySQL commits the changes to the database and then terminates the transaction.

Transactions play a pivotal role in ensuring the integrity of MySQL data, which is why database developers should have a good foundation in how they work. There are many aspects to transactions, however, far more than can be covered adequately in a single article. The purpose of this article is to introduce you to the basics of transactions, so you have a better understanding of how to get starting using them when writing your SQL code.

To this end, I provide you with several examples that demonstrate how MySQL transactions work and why they’re important. Although the examples are fairly basic, they should help you understand what it takes to create a transaction and what types of behavior you can expect. Transactions are an invaluable tool when trying to maintain data consistency and reliability across your database. The better you understand how to use them, the more effectively you can ensure the integrity of your data.

Note: The examples in this article are based on a local instance of MySQL that hosts a very simple database and tables. The last section of the article— “Appendix: Preparing your MySQL environment”—provides information about how I set up my system and includes a SQL script for creating the database and tables on which the examples are based.

Transaction basics in MySQL

MySQL transactions maintain consistency across the database regardless of the complexities of your operations. They help achieve a state of compliance referred to as ACID, an acronym for atomicity, consistency, isolation, and durability:

  • Atomicity. The statements within a transaction are treated as a single unit. Either all changes made by the statements are applied, or none of the changes are applied.
  • Consistency. The database is returned to a consistent state after the transaction runs, whether the transaction is committed or rolled back. The transition should not introduce data anomalies or inconsistencies.
  • Isolation. All operations within that transaction are completely isolated from and unaffected by other transactions.
  • Durability. All changes applied by a transaction are permanently persisted to the database after the transaction has successfully completed.

In this article, I focus on how to define explicit transactions that run multiple SQL statements. I say “explicit” because MySQL also supports implicit transactions.

By default, the database engine treats each SQL statement as an atomic unit that runs as its own transaction. For example, if you execute an UPDATE statement and there are no errors, the changes are committed to the database and made permanent. If an error occurs, the statement is rolled back, and no data is changed.

You can override this behavior by running a SET statement that disables the autocommit setting, but if you do, you must then manually commit or roll back your statements each time you run any statements. You might prefer the added control this approach offers (for example on a production instance where you are doing ad-hoc work), but that will depend on your specific requirements.

A more common approach is to define explicit transactions, especially in stored programs such as functions, triggers, and stored procedures. Stored programs often use transactions to ensure that data changes are committed as an atomic operation. An explicit transaction temporarily disables autocommit, while providing greater control over your transactional logic.

An explicit transaction typically includes the following three SQL statements:

  • START TRANSACTION. Begins a new transaction. The statement disables autocommit for the duration of the transaction. It also implicitly ends any transactions active in the current session.
  • COMMIT. Commits changes made in the current transaction to the database and terminates the transaction.
  • ROLLBACK. Rolls back changes made in the current transaction according to the code’s logic and terminates the transaction.

The key to using these statements is to define the correct logic in your SQL code to ensure that your transactions protect against possible data integrity issues by committing or rolling back operations according to that logic.

Processing queries in MySQL

Before I demonstrate how to define a transaction, I first want to show you how MySQL processes a set of statements that are not defined within an explicit transaction. The following stored procedure enables a user to add a new manufacturer and its first airplane to the database:

DROP PROCEDURE IF EXISTS add_first_plane;

DELIMITER //

CREATE PROCEDURE add_first_plane(
  IN mfc_name VARCHAR(50), IN plane_name VARCHAR(50), 
    IN engine_type VARCHAR(50))
BEGIN
  DECLARE mfc_id INT;
  DECLARE pln_id INT; 
  INSERT INTO manufacturers (manufacturer)
    VALUES(mfc_name);
  SET mfc_id = 
    (SELECT manufacturer_id FROM manufacturers
      WHERE manufacturer = mfc_name);
  INSERT INTO airplanes 
    (plane, manufacturer_id, engine_type)
    VALUES (plane_name, mfc_id, engine_type);
END//

DELIMITER ;

If you’ve been following along with this series, many of the elements in the procedure definition should be familiar to you. You can refer back to the article Working with MySQL Stored Procedures if something is unclear. In this case, the procedure definition defines three input parameters: mfc_name, plane_name, and engine_type, which are followed by a standard BEGIN…END block.

The procedure also includes elements you might not have seen before. First, the BEGIN…END block starts with a DECLARE statement that declares the mfc_id variable, which is defined with the INT data type. This is followed by the pln_id variable, also defined with the INT data type.

Notice that you don’t precede the variable name with the at (@) symbol when declaring it or using it throughout the procedure, as you would outside the procedure. In addition, the DECLARE statements must come at the beginning of the procedure’s BEGIN…END block before the other statements.

The variable declarations are then followed by an INSERT statement that adds a row to the manufacturers table, using the mfc_name variable as the row’s manufacturer value.

Next comes a SET statement that assigns a value to the mfc_id variable. The value is determined by the SELECT statement in the subquery, which retrieves the manufacturer_id value for the newly inserted row.

The last statement in the procedure’s BEGIN…END block is an INSERT statement that adds a row to the airplanes table, using the plane_name, mfc_id, and engine_type variables to provide the row’s values.

After you add the stored procedure to the database, you can use a CALL statement to run the procedure, passing in the necessary parameter values. In the following CALL statement, I specify Airbus as a manufacturer, A340-600 as the first plane model, and Jet as the engine type:

CALL add_first_plane ('Airbus', 'A340-600', 'Jet');

When you run this statement, MySQL processes the procedure’s statements and commits the changes to the database one statement at a time. In this case, a row is first added to the manufacturers table and then a row to the airplanes table. You can confirm these results by running the following SELECT statements:

SELECT * FROM manufacturers;

SELECT * FROM airplanes;

This returns the following two result sets:

A screenshot of a computer Description automatically generated with low confidence

And:

Ideally, the add_first_plane stored procedure should work when adding other data, but that’s not always the case. For example, if you were to rerun the same CALL statement, you would receive the following error:

Error Code: 1242. Subquery returns more than 1 row

The problem is that the first INSERT statement worked and a second row with the same manufacturer name has been added to the manufacturers table, but it wasn’t until the procedure tried to retrieve a single manufacturer_id value that the error was generated. This is because you now have duplicate data in that table. If you query the manufacturers table you will see this:

A screenshot of a computer Description automatically generated with low confidence

Typically, you’ll want to avoid such scenarios. Not only can they make the data confusing (while undermining the tenets of a relational model), but they also require you or someone else to fix the problem. In this case, the error occurred before inserting a duplicate row into the airplanes table, so only one table needs to be corrected (although this is still one table too many).

But the problem doesn’t stop there. Suppose some well-meaning DBA comes along and updates the name of the plane column in the airplanes table:

ALTER TABLE airplanes 
  RENAME COLUMN plane TO airplane;

Perhaps this was done for good reasons, but if the change was made without updating the stored procedure, you would run into additional issues when trying to run the procedure. For example, the following CALL statement tries to add Beechcraft as a manufacturer and Baron 58 as the first plane model, which has a piston engine:

CALL add_first_plane ('Beechcraft', 'Baron 58', 'Piston');

If you try to run this statement, you’ll receive the following error:

Error Code: 1054. Unknown column 'plane' in 'field list'

Not surprisingly, the error resulted from trying to run the second INSERT statement, which is defined with the original column name. As a result, the row was not inserted into the airplanes table. However, the Beechcraft row was still added to the manufacturers table.

In some cases, it might be okay for a row to be created when another statement fails. Usually though, you might not want a row added unless one was also added to the airplanes table. Fortunately, you can address issues such as these by defining a transaction within your stored procedure.

Creating a transaction in a stored program

Before we get into the details of adding a transaction, you might want to clean up your database so you’re starting with a clean state for the next example. One way to do this is to simply rerun the database creation script in the appendix. You can also either truncate or delete the data in the tables and change the airplane column name back to plane. If you truncate the data, be sure to disable foreign key checks during statement execution.

With that in mind, let’s look at how to include a transaction in our stored procedure. The first step is to add the START TRANSACTION and COMMIT statements into the definition to form a block around some of the other SQL statements:

DROP PROCEDURE IF EXISTS add_first_plane;

DELIMITER //

CREATE PROCEDURE add_first_plane(
  IN mfc_name VARCHAR(50), IN plane_name VARCHAR(50), 
    IN engine_type VARCHAR(50))
BEGIN
  DECLARE mfc_id INT;
  DECLARE pln_id INT; 
  DECLARE mfc_count INT;
  START TRANSACTION;
    SET mfc_count = (SELECT COUNT(*) FROM manufacturers 
      WHERE manufacturer LIKE CONCAT('%', mfc_name, '%') 
                                              FOR SHARE); 
    -- if manufacturer does not exist, add manufacturer; 
    -- otherwise, roll back
    IF mfc_count = 0 THEN
      INSERT INTO manufacturers (manufacturer)
        VALUES(mfc_name);
      SET mfc_id = (SELECT manufacturer_id 
                    FROM manufacturers
                    WHERE manufacturer = mfc_name FOR SHARE);
      INSERT INTO airplanes 
        (airplane, manufacturer_id, engine_type)
        VALUES (plane_name, mfc_id, engine_type);
    ELSE 
      ROLLBACK;
      SELECT CONCAT('Manufacturer \'', mfc_name, 
        '\' might already exist.') AS Warning;
    END IF;
  COMMIT;
END//

DELIMITER ;

The START TRANSACTION statement begins the transaction after the variable declarations. The transaction remains active until it is committed or rolled back.

I’ve also added the mfc_count variable to the procedure definition. I first declare the variable along with the other variable declarations. Then, after the START TRANSACTION statement, I include a SET statement that assigns a value to the variable. The variable value is determined by a subquery that returns the number of existing rows in the manufacturers table that have a manufacturer value similar to the mfc_name value. In this way, you can get a count of the number of rows with a similar name.

In addition to the SET statement, I added an IF statement that checks the value of the mfc_count variable. If the mfc_count value equals 0 (meaning that no similar rows exist) the statements in the IF block run. The IF block statements first add a row to the manufacturers table, then runs a SET statement to assign a value to the mfc_id variable, and finally inserts a row to the airplanes table, using the mfc_id value. This is mostly like what you saw in the previous example.

One thing different about the SET statements in this example is that the subqueries include the FOR SHARE option, which causes MySQL to lock the applicable rows in the referenced table so they’re read-only during the transaction (more info here on MySQL site). This is important when it’s possible for another operation to change the table in a way that could affect other statements. For example, someone might try to delete the row for that manufacturer in the middle of this transaction. Because the FOR SHARE option is included, the applicable rows cannot be updated until this transaction is committed or rolled back.

If the manufacturers table already includes a row for the manufacturer specified in the mfc_name variable (that is, the mfc_count value does not equal 0), MySQL does not run the statements in the IF block but instead runs the statements in the ELSE block. The first of these statements is ROLLBACK, which rolls back and ends the transaction without inserting any data into the database. The ELSE block also includes a SELECT statement that returns a message stating that the manufacturer might already exist.

Before going any further, I want to point out that not all SQL statements can be rolled back, particularly data definition language (DDL) statements such as CREATE TABLE or DROP TABLE. If you include such a statement in your transaction and a subsequent statement fails, the DDL changes are retained, preventing a full rollback. (More details here)

The final COMMIT statement commits the changes to the database and terminates the transaction, unless the IF block was skipped and the statements in the ELSE block ran, in which case the transaction was terminated by the ROLLBACK statement.

Before starting these next steps, I will reset the tables we have been working with.

SET SQL_SAFE_UPDATES = 0;
DELETE FROM airplanes;
DELETE FROM manufacturers;
SET SQL_SAFE_UPDATES = 1;

After you create this procedure, you can run the same CALL statement as earlier:

CALL add_first_plane ('Airbus', 'A340-600', 'Jet');

The first time you run this statement, it inserts a row into each of the two tables, just like you saw in the previous example. However, if you run the statement a second time, it will return the following message, without adding a new row to either table:

Manufacturer 'Airbus' might already exist.

Now suppose our trusty DBA tries to change the column name in the airplanes table once again:

ALTER TABLE airplanes 
  RENAME COLUMN airplane TO plane;

After the table definition has been modified, you can try to run the stored procedure just like before:

CALL add_first_plane ('Beechcraft', 'Baron 58', 'Piston');

Unfortunately, MySQL again returns error 1054 and inserts the row into the manufacturers table. (Execute the procedure again and you will get the error about manufacturer ‘Beechcraft’ might already existing.)

This is because the procedure runs the ROLLBACK statement only if the mfc_count value does not equal 0. If the variable does equal 0, the statements that ran successfully are not rolled back, even if one of them generated an error. For this, we need to modify our procedure definition once more.

Adding exception handling to a transaction

Exception handling can help catch the type of error described above, as well as catch other types of errors. To include it to your procedure, you should add a DECLARE…HANDLER statement that defines a handler for responding to one or more conditions. The statement will then carry out a specific action if one of those conditions occurs. It can also execute other SQL statements related to that action.

For example, the following procedure includes a DECLARE…HANDLER statement that specifies two conditions: SQLEXCEPTION and SQLWARNING:

DROP PROCEDURE IF EXISTS add_first_plane;

DELIMITER //

CREATE PROCEDURE add_first_plane(
  IN mfc_name VARCHAR(50), IN plane_name VARCHAR(50), 
    IN engine_type VARCHAR(50))
BEGIN
  DECLARE mfc_id INT;
  DECLARE pln_id INT; 
  DECLARE mfc_count INT;
  -- roll back if an error occurs
  DECLARE EXIT HANDLER FOR SQLEXCEPTION, SQLWARNING
    BEGIN
      ROLLBACK;
      SELECT ('An error occurred. Contact your administrator.') 
                                                      AS Warning;
    END;
START TRANSACTION;
    SET mfc_count = (SELECT COUNT(*) FROM manufacturers 
                     WHERE manufacturer LIKE CONCAT('%', mfc_name, '%') 
                                               FOR SHARE); 
    -- if manufacturer does not exist, add manufacturer; 
    -- otherwise, roll back
    IF mfc_count = 0 THEN
      INSERT INTO manufacturers (manufacturer)
        VALUES(mfc_name);
      SET mfc_id = (SELECT manufacturer_id FROM manufacturers
                    WHERE manufacturer = mfc_name FOR SHARE);
      INSERT INTO airplanes 
        (plane, manufacturer_id, engine_type)
        VALUES (plane_name, mfc_id, engine_type);
    ELSE 
      ROLLBACK;
      SELECT CONCAT('Manufacturer \'', mfc_name, 
        '\' might already exist.') AS Warning;
    END IF;
  COMMIT;
END//

DELIMITER ;

I’ve defined the DECLARE…HANDLER statement after the other variable declarations. The statement includes EXIT as the action, which means that MySQL will terminate the procedure’s execution if one of the conditions is met. The first condition, SQLEXCEPTION, is met if MySQL encounters an error when executing the procedure’s statements, and the second condition, SQLWARNING, is met of MySQL receives a warning when running the statements.

The DECLARE…HANDLER statement also includes a BEGIN…END block that contains statements that run if MySQL encounters an error or warning. The first statement, ROLLBACK, rolls back any changes if one of the conditions is met and then terminates the transaction. The second statement, SELECT, returns a message indicating that an error occurred. You can also include other information in the message, or you can create a DECLARE…HANDLER statement for each condition and then define a more specific message based on that condition.

Before starting these next steps, I will reset the tables we have been working with.

SET SQL_SAFE_UPDATES = 0;
DELETE FROM airplanes;
DELETE FROM manufacturers;
SET SQL_SAFE_UPDATES = 1;

Now, try to run the same CALL statements as in the previous examples, starting with the one that adds Airbus as a manufacturer and its first plane:

CALL add_first_plane ('Airbus', 'A340-600', 'Jet');

When you run this statement the first time, MySQL should add the data as expected. When you run the statement the second time, MySQL should return the following message without adding a row to either table:

Manufacturer 'Airbus' might already exist.

All this is just like the previous example, but now let’s assume that our wayward DBA once again changes the airplanes column name from plane to airplane:

ALTER TABLE airplanes RENAME COLUMN plane TO airplane;

With the change in place, you then try to add the next manufacturer and airplane, just like you did previously:

CALL add_first_plane ('Beechcraft', 'Baron 58', 'Piston');

This time around, MySQL returns the following message, without adding rows to either table:

An error occurred. Contact your administrator.

With the additional exception handling, the stored procedure should now be able to accommodate errors much more efficiently and without causing additional work for you or someone else.

Working with MySQL transactions

In this article, I’ve given you a basic overview of MySQL transactions so you can begin to understand how they work. Not surprisingly, there’s a lot more to transactions than what I can cover in a single article. You can, for example, create savepoints within a transaction and specifically roll back to them. You can also specify additional characteristics that control how the transaction runs, and you can set the transaction isolation level at a global or session level to better accommodate your workloads. For more information about transactions, see the MySQL topic Transactional and Locking Statements.

In the meantime, what I’ve provided in this article should help you start building a foundation in MySQL transactions so you can begin incorporating them into your code, particularly your stored programs. Although the examples I’ve shown you are fairly basic, they demonstrate how to start, commit, and roll back a transaction—the three pillars of most transactions. From this foundation, you can start building transactions that incorporate more sophisticated statement logic. Just make sure you do your homework before journeying down that path.

Appendix: Preparing your MySQL environment

When creating the examples for this article, I used a Mac computer that was set up with a local instance of MySQL 8.0.29 (Community Server edition). I also used MySQL Workbench to interface with MySQL.

If you want to try out the examples in this article, you need only a minimal database setup, which includes the travel database and the manufacturers and airplanes tables. This is the same database you’ve seen in previous articles in this series, except pared down for this article (especially the airplanes table). To set up the data, run the following script against your MySQL instance:

DROP DATABASE IF EXISTS travel;

CREATE DATABASE travel;

USE travel;

CREATE TABLE manufacturers (
  manufacturer_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  manufacturer VARCHAR(50) NOT NULL,
  create_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  last_update TIMESTAMP NOT NULL 
    DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (manufacturer_id) )
ENGINE=InnoDB AUTO_INCREMENT=1001;

CREATE TABLE airplanes (
  plane_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  plane VARCHAR(50) NOT NULL,
  manufacturer_id INT UNSIGNED NOT NULL,
  engine_type VARCHAR(10) NOT NULL,
  create_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  last_update TIMESTAMP NOT NULL 
    DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (plane_id),
  CONSTRAINT fk_manufacturer_id FOREIGN KEY (manufacturer_id) 
    REFERENCES manufacturers (manufacturer_id) ) 
ENGINE=InnoDB AUTO_INCREMENT=101;

That’s all there is to the travel database. The manufacturers table must be created first because a foreign key is defined on the airplanes table that references the manufacturers table. You’ll be inserting rows into the tables as you work through these exercises. I suggest you keep this script handy in case you want to recreate your database between examples.

The post Working with MySQL transactions appeared first on Simple Talk.



from Simple Talk https://ift.tt/UE8Yozt
via

No comments:

Post a Comment