Monday, October 26, 2020

Reordering Deployments in Database DevOps

We recently received an interesting question in our Redgate forums from Peter Daniels about altering the order of deployments in database DevOps. The question includes the following scenario:

  1. Developer A makes changes – feature 1.  This set of changes makes its way into the dev integration environment.
  2. Developer B makes changes – feature 2.  This set of changes makes its way into the dev integration environment, too – after feature 1.
  3. We decide that we want feature 2 to go to prod, but NOT feature 1.

For the sake of simplicity, Peter mentions that for this example, “we know that feature 2 is schema-isolated and does NOT depend on feature 1.”

Peter is primarily asking about Git branching workflows in his question. In my opinion, branching workflows are only one of several strategies that may help regarding this situation, so I’m going to zoom out a little bit and consider branching in the context of other practices that can help.

Essentially, I believe some alternative strategies flow more easily long term and support a simpler branching strategy than you might arrive at by thinking about this problem only through the lens of branching workflows.

A quick overview

In this post I’ll cover four big picture strategies that may help, then dig into some technical details for managing this in SQL Change Automation (the tool our question is about). All strategies listed may be used individually or combined with one another.

Strategy 1: The Expand/Contract model (“Parallel Change”)

Strategy 2: Flexible database provisioning / integration and QA databases on demand

Strategy 3: A Hybrid state-and-migrations solution to versioning database changes

Strategy 4: Managing environment-specific branches

A Tool-Specific Example – Redgate’s SQL Change Automation

Conclusions

Strategy 1: The Expand/Contract model (“Parallel Change”)

One useful strategy in authoring database changes is to use a design pattern that focuses on always writing backwards compatible changes. This model known by a few names:

  • Expand / Contract – the approach was popularized under this name in Evolutionary Database Design by Pramod Sadalage and Martin Fowler
  • Parallel Change – Danilo Sato and others have written about the model under this name
  • The “N-1 model”

With this design pattern, database deployments include small changes which are designed to not impact the applications using the database. This is the “expand” phase.

When features enabled by the change are ready to go live, a feature flag / application toggle, or application deployment is done to make new functionality visible to users of the database. This is the “cutover” phase. If there’s a need to undo the change, then this is done by disabling the feature flag or reversing the application deployment — but leaving the data structures in place. 

Finally, after enough time has elapsed to be confident that the new functionality is working well, older structures that are no longer needed may be cleaned up. This is the “contract” phase.

This pattern tends to work best for teams who are able to deploy frequent sets of small changes to their databases using Agile methodologies.

The Expand/Contract model has multiple benefits, as it:

  1. Simplifies the problem of re-ordering planned deployments. In the scenario above, if the database changes for feature 1 are backwards compatible changes which will not be noticeable to customers, we may decide to go ahead and let the changes for feature 1 flow through the pipelines and be deployed. This is because the changes are safe to deploy, even if we don’t plan to light up that functionality for some time.
  2. Simplifies the problem of rollbacksSee this related post for more information on that topic.
  3. Makes database deployments ‘boring’: An important goal of a DevOps process is to have utterly boring, unremarkable deployments. You want to save the excitement for when you “release” functionality to your users, and have a way to control that “release” so that it’s simple to undo the release when something goes wrong. The Expand/Contract model helps you separate database deployments (the Expand and Contract phases) from “releases” (the transition of a feature from off to on). 

The more skilled your team becomes at using the Expand/Contract pattern, the less often you are likely to need to “pull” a deployment from the planned queue.

Expand/Contract (Parallel Change)
Pros Cons
Does not require purchase of a specific technology Requires building skills to deploy small “slices” of changes
Works for teams of all sizes and any database platform Works best when you deploy frequently, which not all teams can do
Combines easily with other strategies listed in this article  

Strategy 2: Flexible database provisioning / integration and QA databases on demand

Another strategy that helps with this problem is the ability to quickly and easily re-provision the environments involved in your database DevOps pipelines.

If we have the ability to quickly create the databases for our integration environment, this gives us the ability to handle the scenario above with the following workflow:

  1. Remove the code for feature 1 from the branch associated with the integration environment — let’s say this branch is named ‘develop’ — but capture the code in a feature branch if we may use it in the future. This might be done by reverting the Pull Request that was used to merge the changes into ‘develop’ earlier (if that was our workflow to get changes into this branch). Alternatively, we could create a new feature branch including the code from feature 1 if one does not exist, then adjust the code in ‘develop’ not include that code.
  2. Deploy a new implementation of the integration database environment reflecting the current (or a very recent) state of production
  3. Deploy ‘develop’ to the reset integration database environment

Having the flexibility to recreate environments quickly has a lot of perks: if we can easily “reset” an environment to a production-like state then this gives us a lot of confidence in exactly how our modified code branch is going to deploy. If we make any mistakes in “removing” code from the branch, then we are very likely to be able to find that in the process of doing this fresh deployment to the environment. 

In order to implement this strategy, we need some tooling to help us implement the database provisioning in a way that isn’t slow or a lot of work. This can be done with data virtualization technology like Redgate’s SQL Clone, with writable SAN snapshots from storage vendors– many of which have APIs which support automation, or with related technology from other storage or software providers.

Like the Expand/Contract model, flexible database provisioning combines well with many tools and processes. Embracing either (or both) of these strategies can help you implement a simpler branching model with fewer long-lived branches to manager and simpler deployment pipelines.

Flexible database provisioning / databases on demand
Pros Cons
Works for teams of all sizes Requires technology and processes to implement (not free)
Provides significant increased flexibility for pipelines and automation  
Combines easily with other strategies listed here  

Strategy 3: A Hybrid state-and-migrations solution to versioning database changes

Sometimes customers can’t meet their needs with the strategies above for a variety of reasons. Perhaps they can’t work in an agile fashion due to political or implementation constraints and need to periodically deploy larger groups of changes in a waterfall pattern. Or perhaps an organizational separation between development and operations is preventing them from being able to embrace on-demand database environments.

Some of these customers find success with using an approach to database versioning where they maintain their source code in a “hybrid” fashion, using two folders in a single source control repository. Redgate offers a solution which provides this approach (video, article), but others might be able to create a similar approach on their own by combining other tooling.

In this hybrid approach:

  • One folder tracks the “state” of current work of the integration environment.
  • Another folder tracks the code which will be deployed. Changes are imported into this second folder as “migrations” when the deployment order is decided. 

This model may suit teams for a few reasons. There may be a large number of team members who prefer a low-code approach where they primarily work with GUIs and use lightweight tools to capture database state on a regular basis, and only worry about what the deployment code will look like when they are getting ready to release. Perhaps the deployment code is primarily ordered and crafted by a smaller group of developers who specialize in implementation details for that database platform. Or there may not be much visibility at all regarding the order of deployments for many team members during a large part of the development process.  

Hybrid State-and-Migrations
Pros Cons
Enables database code to be stored in version control, built, and tested, while selectively identifying code that is ready to deploy Requires storing both “state” and “migrations” (deployment code) in the version control repository, which adds some complexity
Allows deployment code (migrations) to be customized for high-availability requirements and seamlessly integrating data modification changes Tooling must keep both “state” and “migrations” in sync— or if it allows drift it should make the drive visible to the team
Supports a simpler branching workflow that a non-hybrid model Reordering changes after migrations are generated adds notable complexity: you want the path to deployment after migration generation to be as fast as possible (this means that the ability to have database environments provisioned on-demand is still desirable)

Strategy 4: Managing environment-specific branches

Another strategy is to maintain environment specific branches. An example of this strategy is described in “ENV Branching with Git.”

In this branching model, feature branches are used in combination with environment specific branches. An example of how code might flow through this model is:

  • Developers Lexi and Freyja commit database changes to a branch FeatureWidget, perhaps working first in individual sandbox environments where they have their own copy of the database. When working in this feature branch, each of them uses temporary branches off of FeatureWidget to draft work, and regularly merge their changes back into FeatureWidget to collaborate. 
  • When ready, FeatureWidget is merged to the Integration environment branch with a Pull Request. The Pull Request process includes an automated build and deployment of the database for initial validation, then when approved automation deploys the changes from FeatureWidget to the Integration environment database.
  • Important note: the FeatureWidget branch is not deleted at this time, as it will continue to be used
  • When this feature is ready to proceed to the QA environment, the FeatureWidget branch is merged to that environment via a Pull Request
  • If changes need to be made to this feature, changes are made to the FeatureWidget branch (perhaps by using temporary branches and Pull Requests) and Pull Requests are used to re-deploy this code to all branches and related database environments where it has been deployed. (Note: this may require you to make a plan to remove a deployment/reset a lower database environment, depending on your tooling choices. More on this in the example below.)

Although this model may sound simple at first, complexities creep in when you begin to look closely at the challenge of maintaining database state. While this model works well for some teams, it is not a “simple trick” to solving this problem by itself. Both the Expand/Contract model and the ability to provision databases on demand help keep this model simple.

Environment Specific Branches
Pros Cons
Enables database code to be stored in version control, built, and tested, while selectively identifying code that is ready to deploy If changes are reordered after they reach “upper” environments late in the pipeline, resetting environments is still needed if you want your deployments to simulate what the later production deployment will be like (this means that the ability to have database environments provisioned on-demand is still desirable).
Does not require release branches to be used Requires maintaining branches for each environment
  Scoping is important: splitting apart or reordering the changes inside an individual feature branch adds complexity. (Dividing changes into small slices in a way similar to the Expand/Contract model mitigates this risk.)
  If you decide to cancel deployment of a feature branch altogether, you need to remove the code from “lower” environment branches as well as have a method of removing the code from the database or resetting the environment and redeploying to it.

A Tool-Specific Example – Redgate’s SQL Change Automation

While it’s really valuable to consider the strategies listed above, how changes flow through your pipelines will also vary based on the tools you’re using to version control your database code.

Our forum question is about Redgate’s SQL Change Automation, so I’ll dig into that here. If you are using a different tool, it’s quite likely that some of these considerations will still apply (particularly #3 and #4), so it may still be worth your while to think through these in the context of your tooling.

1. Group your versioned migration scripts in folders per deployment (or at a more granular level)

When setting up your SQL Change Automation project, one key thing to consider is whether or not you’d like to use folders to separate your migration scripts. Folders are extremely helpful in separating migration scripts into individual groups, which can make your life much easier when planning and managing deployments.

If you need to re-order a deployment and you have captured all the scripts for that deployment in a folder, it is much more intuitive to manage order changes by removing or renaming folders.

Note that a folder can hold either other folders OR it may hold scripts.

This means you can use subfolders, which is very convenient! You simply cannot combine scripts and folders at the same folder level. 

2. Do you prefer to disable “programmable objects” so that all changes for a deployment are in a single folder?

SQL Change Automation’s programmable object feature treats stored procedures, views, user defined functions (UDFs), and DDL triggers in a special way: it automatically manages these objects as repeatable migration scripts. This is possible because the entire code for these objects is always included in a command to either create or alter them — unlike, for example, an ALTER TABLE statement which only contains details about the specific parts of the object being altered.

This can be advantageous for teams who frequently make changes to these types of objects. The benefits of enabling programmable objects are that:

  • Programmable Objects are stored using a consistent filename based on the object name, making them easy to identify and simplifying the process of handling conflicts when merging branches 
  • Programmable Objects simplify change ordering within a deployment: when changed, programmable objects are automatically executed at the end of the deployment, which easily ensures that dependencies in scripts changing table schema are executed prior to programmable object changes

However, the consistent naming and location of programmable objects means that they are not stored in the same folders as versioned migrations.

This adds some complexity to the process of identifying which changes are grouped together for a planned deployment. If you would like this to be a simple process and to use only versioned migrations in folders for all database changes, you may do this by disabling the programmable object feature.

3. Make a plan on how to remove a deployment from an environment

There will be times when you decide to “back out” changes from a branch and the related environment. It’s worth thinking through your strategy on how to accomplish this.

If you have adopted the strategy of provisioning database environments on demand, then your plan for this will likely to be to reset the environment, either by reverting it to a prior state (if you are using a snapshot or clone related technology that has this feature), or by deploying a fresh, production-like copy/clone of the database environment. Once you have done this, you may make changes to your version control branch to remove code as needed and re-deploy to the database.

If you are not able to quickly reset or re-provision the database environment, you may leverage comparison tools to remove some code changes. To do this, while still maintaining the ability to simulate a production-like deployment:

  • Use SQL Compare to reset the database: you may compare the database to a branch in version control which contains the code representing the state in production, or against another database which has the current state of production. This can be done via the GUI or the command line. Note that SQL Compare is now available as a Docker container for convenient automation. 
  • Make changes to your branch in version control to remove the code you do not wish to deploy using SQL Change Automation
  • Delete rows from the dbo.__MigrationLog table related to the code which you have removed from the environment
  • Redeploy the modified branch

Please note that I don’t recommend modifying the dbo.__MigrationLog table in production. This table contains the history of your deployments, and you want to protect this in your production environments!

If you require a simpler approach to this: you may simply modify the code in your version control branch and execute SQL Compare between that branch and the SQL Change Automation project and the target database. However, consider that this simpler approach does not fully test the way in which your code will be later executed to higher environments. Perhaps that risk is acceptable if the code will still be deployed to several environments later in the pipeline.

4. Make a plan on how to selectively promote a change to the next environment

Many teams have a need to quickly move certain changes — notably hotfixes — through a development pipeline. 

As in previous examples, if you have the ability to quickly provision a production-like database environment for testing, this helps your ability to work with hotfixes. You may create a feature branch for the hotfix which is based upon the point in version control which represents the current stage of production, then deploy fresh production-like environments and test the change.

The feature branch is typically then moved throughout the deployment process using Pull Requests, whether you are using a branching model like Microsoft’s Release Flow, GitFlow, or the Environment Branching model discussed above. 

If you are not able to quickly reset or re-provision the database environment, for speed purposes you may be forced to test the hotfix in environments which contain a significant amount of code that has not yet been released to production. This is an area where you need to weigh the pros and cons for your organization of speed vs quality of testing. The number of environments which you maintain in your pipeline will factor into this: many teams maintain at least one “staging” database environment in a state as close as possible to production, and work hard to ensure that they are able to mimic production deployments to this environment for hotfixes for safety purposes.

Conclusions

When planning or improving your development process for database changes, it’s helpful to step back and look at the larger picture and think of these questions:

  • Can your team move toward adapting the Expand/Contract model and embrace small “slices” of changes for deployment?
  • Does your team have the ability to begin provisioning databases on demand so that you may easily create and reset environments for initial testing, integration, and testing?
  • Is a hybrid approach to capturing database code in version control combining state and migration approaches suitable for your team?
  • Does an environment based branching strategy suit the way your team works?

After you decide on these strategies, those choices will help inform how you proceed in implementing projects with your toolkit of choice. 

Thanks for the great question, Peter!

The post Reordering Deployments in Database DevOps appeared first on Simple Talk.



from Simple Talk https://ift.tt/3jv0f2S
via

No comments:

Post a Comment