Saturday, April 29, 2023

Introduction to Using Bitbucket and CI/CD for Your App

If you have been in the DevOps space, you are probably familiar with version control and have worked with at least one version control system. But have you heard of Bitbucket? No? Well, this tutorial is for you.

Bitbucket is a web-based platform, owned by Atlassian, that provides hosting services for version control repositories, specifically for source code and development projects. It historically supported two version control systems, Mercurial and Git, though Bitbucket Cloud now supports Git only. It provides a centralized location for storing and managing code, facilitates team collaboration and version control, and includes features such as pull requests, code reviews, and issue tracking. Bitbucket is widely used by software development teams for both personal and enterprise projects.

Bitbucket integrates well with many third-party continuous integration (CI) tools, including Jenkins, Travis CI, CircleCI, Bamboo, and more. This integration allows developers to automate the build, test, and deployment processes for their software projects, making it easier to identify and resolve bugs and deliver high-quality software more quickly. Bitbucket provides a variety of APIs and integrations that allow developers to connect it with other tools and systems, including issue trackers, chat apps, and other development and collaboration tools.

This tutorial will teach you how to get started with Bitbucket. You will start by learning how to push an HTML project to a Bitbucket repository and configure Bitbucket Pages for that repository so it can be viewed in a web browser. Then you will create and push a VueJS project to the same Bitbucket repository and see how to configure CI/CD workflows for that repository with CircleCI.

In this article, I am going to go through the following steps to demonstrate using Bitbucket in your projects:

  • Prerequisites
  • Initializing a Hello_World project.
  • Creating a Bitbucket workspace and repository.
  • Pushing your code to your Bitbucket repository.
  • Hosting repository with Bitbucket pages.
  • Using a third-party CI tool with Bitbucket.

Prerequisites

These are the things you will need to have (or to get, if you want to follow along with the examples in this article):

  • This tutorial uses a Linux machine running Ubuntu 22.04 LTS, but any other OS will work fine.
  • A text editor (e.g., VS Code).
  • A Bitbucket Cloud account. You can register here if you don’t have one.
  • An Atlassian account. If you don’t have one, sign up for a free Atlassian account first.
  • Git installed locally on your machine; if it isn’t, you can follow this guide to install it for your OS.
  • A CircleCI account created with your Atlassian Bitbucket account.

Initializing a Hello_World project

Prior to getting started with Bitbucket, you’ll need to first have a project to work with. For demo and simplicity’s sake, you will create a project that simply outputs “Hello World” to the browser.

Create a project directory to store the files for the hello_world project. You can name this directory whatever you like – this tutorial uses hello_world.

mkdir hello_world #creates the hello_world directory 

cd hello_world #goes into the hello_world directory 

code . #opens up the hello_world directory via your 
       #default code editor

Fig 1: Creating project directory “hello_world”

Create an index.html file and paste in the code snippet below:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Bitbucket App</title>
    <!-- Google fonts -->
    <link rel="preconnect" href="https://fonts.googleapis.com">
    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
    <link href="https://fonts.googleapis.com/css2?family=IBM+Plex+Sans:wght@100&display=swap" rel="stylesheet">
    <!-- Internal styles -->
    <style>
        * {padding: 0; margin: 0; box-sizing: border-box;}
        body {background-color: hsl(46, 100%, 50%);
              font-family: 'IBM Plex Sans', sans-serif;
              margin: 60px; display: flex;
              justify-content: center; align-items: center;}
        h1   {color: hsl(0, 4%, 10%); text-align: center;}
        div  {border: 5px solid hsl(318, 73%, 22%); width: 500px;
              margin: 0 auto; padding: 10px;}
    </style>
</head>
<body>
    <div>
        <h1>Hello World</h1>
    </div>
</body>
</html>

When you view this over a web browser, this is how the web page will look:

Fig 2: Viewing hello_world project over a web browser

The output above simply displays Hello World to the browser.

Creating a Bitbucket Workspace and Repository

Just like GitHub, you need to have a repository to house the application you’d like to save on Bitbucket. If you are logging in to Bitbucket for the first time, you will be asked to first create a workspace.

A Bitbucket workspace is a container for projects and repositories in Bitbucket, where users can store, manage, and collaborate on their source code. It allows teams to organise their work, assign access permissions to different users, and track changes to code in a centralised location. A workspace can contain multiple repositories, each with its own set of files and folders.

Be sure to click on the Create workspace button to create your Bitbucket workspace, as shown in the screenshot below:

Fig 3: Initializing the creation of a Bitbucket workspace

A new page will show up asking you to add the name of your workspace. This tutorial uses my-helloworldworkspace as the name of the workspace. You can name yours whatever you like. Additionally, you have the option of making this workspace public or private. In this tutorial, you’ll check the checkbox that makes the workspace public, as shown in the output below:

Fig 4: Creating a Bitbucket workspace

Once you have created your workspace, you will be directed to your Bitbucket dashboard.

This dashboard provides a centralized view of your repositories, projects, pull requests, issues, and more. It also allows you to manage your code, collaborate with others, and track the progress of your development projects.

To create a repository, click on the create button on the navigation menu and select Repository from the drop-down menu. This will initialize the process of creating a Bitbucket repository in your current workspace:

Fig 5: Initialising the creation of a Bitbucket repository

The page will then redirect to a form where you’ll need to provide some information about the repository you are creating, as follows:

  • Workspace: This field will be automatically populated for you. If you need to select a different workspace, click on the dropdown icon.
  • Project name: Projects are a way to organize and manage repositories within a Bitbucket workspace. Provide a project name – this tutorial uses the project name First-bitbucket-project.
  • Repository name: Give your repository a name. This tutorial sets the repository name to hello-world.
  • Access level: This field requires you to make your repository either private or public. This tutorial sets this field to public.
  • Include a README: Just like GitHub, you have the option to create a README as you create your repository. This tutorial sets this field to Yes.
  • Default branch name: This field requires you to specify the default branch you’d like to push your code to. This tutorial selects main as the default branch.
  • Include .gitignore: This field asks whether you’d like to add a .gitignore file to your repository. This tutorial selects Yes.

You also have the option to add a description and select the language your project is written in. These settings are located under the advanced settings option.

Go ahead and create this repository. This should take less than a minute:

Fig 6: Creating a Bitbucket repository

Once your repository has been created successfully, you should see your repository presented to you on your Bitbucket dashboard like the output below:

Fig 7: Viewing Bitbucket repository

Pushing your code to your Bitbucket repository

Since Bitbucket supports the Git version control system, you can upload and transfer the code from your local machine up to your Bitbucket remote repository using Git commands.

Open up your project directory via a terminal or command prompt and execute the following Git commands to commit your changes and push your project up to your Bitbucket repository.

First, execute the command below to initialize a new Git repository in the current directory. This command creates a new directory named ".git" within the current directory, which contains all the necessary metadata and objects to manage version control for the project:

git init

Fig 8: Initialising a new Git repository

Display the current status of the local repository, including information about changed files that have not yet been committed to the repository, untracked files, and the current branch using the following command:

git status

From the output below, you can see that you are on the master branch and haven’t made any commits yet.

Fig 9: Viewing current branch, commits and untracked files

Now execute the following Git command to track the files in your current directory. This prepares them to be committed to the Git repository:

git add .

Record your changes to the local repository with a commit, providing a message that describes the changes being committed, using the following Git command:

git commit -m "my first commit"

The commit message provides context and helps to understand the purpose of the changes.

Fig 10: Making a commit message

Since you already set main as the default branch for your Bitbucket remote repository, execute the Git command below to rename the current local branch to "main".

The "-M" option specifies that it’s a branch rename operation.

This can be useful if you want to change the name of your branch to better reflect the work you’re doing, or to align with best practices or conventions within your organisation.

git branch -M main

Create a new remote named origin and set its URL to a specified URL using the Git command below. This allows you to push and pull changes from the remote repository located at the specified URL, which in this case is your repository on Bitbucket.

git remote add origin "YOUR_BITBUCKET_REPOSITORY_HERE"

To get the URL of your Bitbucket repository, click on the Clone button at the top right and copy the URL after the `git clone` command highlighted below:

Fig 11: Copying Bitbucket repository URL
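
For example, with the workspace and repository names used in this tutorial, the command would look something like the one below. Your URL will differ, and Bitbucket may include your username in the HTTPS URL it shows you:

git remote add origin https://bitbucket.org/my-helloworldworkspace/hello-world.git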

Since you allowed Bitbucket to add a README and a .gitignore file while creating your repository, you’ll need to execute the following Git pull command to fetch that content from the remote repository and replay your local commit on top of it:

git pull --rebase origin main

You should now have the README and the .gitignore file, as shown in the screenshot below:

Fig 12: Viewing pulled files from remote Bitbucket repository

Next, create an app password on your Bitbucket account; you will use it whenever you push code to your Bitbucket repository. Click on your avatar from your repository page and select Personal settings from the dropdown menu:

Fig 13: Initialising creating an app password

The page will redirect to the personal settings page. On this page, select the App passwords tab from the left panel. A new screen will be shown; click on the Create app password button to create an app password:

Fig 14: Creating app password

Give your app password a label (this tutorial uses hello-world), check all the checkboxes under the permissions section, and click on the Create button to create this password.

Fig 15: Adding app-password label and permissions

A pop-up window containing your app password will be shown; copy the app password. Be sure to save this password somewhere you can access it, as you will need it whenever you push your code to any repository on your Bitbucket account.

Fig 16: Copying app password

Close the pop-up window once you have successfully copied the app password for your repository. You should be redirected to the below screen:

Fig 17: Viewing app-passwords

Lastly, execute the following Git command to push the local main branch to the remote repository named "origin". The "-u" option tells Git to track the remote branch as the upstream of your local main branch, so future pushes and pulls can omit the remote and branch names.

git push -u origin main

This command will prompt you to input your Bitbucket username and password. Make sure you provide your correct username and use the app password as the password.
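
If you’d prefer not to retype the app password on every push, one optional convenience (note that it saves the credentials unencrypted in a file in your home directory) is Git’s built-in credential store:

git config --global credential.helper store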

Once you have successfully pushed to your remote Bitbucket repository, all you now have to do is refresh your Bitbucket repository page. You should see the latest push:

Fig 18: Pushing commits to remote Bitbucket repository

Hosting repository with Bitbucket pages

Bitbucket has a pages feature that allows you to host static websites directly from your Bitbucket repository. Bitbucket Pages provides a simple, fast, and flexible way for developers, teams, and organizations to create and host websites without the need for dedicated hosting infrastructure.

First, you will need to rename your repository to follow the naming convention workspaceID.bitbucket.io, so that its full path becomes something like username/workspaceID.bitbucket.io. This is the naming convention required to set up Pages for your Bitbucket repository.

On your repository page, select the repository settings tab from the left panel:

Fig 19: Viewing repository settings

Edit the name of your repository to follow the naming convention stated above and scroll down to the bottom of the page to save this setting:

Fig 20: Renaming repository name

Once the change has been saved, navigate to the following web address (substituting your own workspace ID) in your preferred browser to see your static site:

https://workspace_ID.bitbucket.io
https://my-helloworldworkspace.bitbucket.io/

Fig 21: Viewing repository

Using a third-party CI tool with Bitbucket

As I mentioned earlier, Bitbucket integrates very well with third-party continuous integration and delivery tools. For this tutorial, you’ll be integrating Bitbucket with CircleCI.

First, execute the Git commands below to create an orphan branch – you can name this branch whatever you like; this tutorial uses circle-ci as the branch name. This branch will contain a VueJS project, which is the project we will be using for this demo. The reason you’ll need to create this branch is that we only want to configure continuous integration and delivery for the circle-ci branch.

git checkout --orphan circle-ci
git rm -rf .

This will result in something like the following.

Fig 22: Checking out to orphan branch (circle-ci)

Create a VueJS project using the following command, which scaffolds the project with the Vite build tool.

npm create vue@3

This command asks some questions; provide the following answers:

  • Project name: type in (.) so it builds the project in the current directory (press Enter to continue)
  • Package name: vue (you can name it whatever you like)
  • Add TypeScript? … No / Yes (No is the default, so press Enter)
  • Add JSX Support? … No
  • Add Vue Router for Single Page Application development? … No
  • Add Pinia for state management? … No
  • Add Vitest for Unit Testing? … Yes (use your arrow keys to select Yes)
  • Add an End-to-End Testing Solution? › No
  • Add ESLint for code quality? … No

Once the VueJS project has been created successfully, you should have the following output:

Fig 23: Creating VueJS project with Vite

Run the below command to install all the dependencies needed for the project:

npm install

Fig 24: Installing all project dependencies

Since you chose to add Vitest for unit testing, the project already contains a configured example test in the src folder, named HelloWorld.spec.js. This is the sample test we will instruct CircleCI to run as part of the pipeline for our Bitbucket repository:

Fig 25: Viewing sample test (HelloWorld.spec.js)

Run the following command to check that this test passes:

npm run test:unit

If it passes, you should see output similar to the one below:

Fig 26: Viewing successful test

Create a folder in the current directory called .circleci and, inside it, create a file called config.yml. This is the exact naming convention CircleCI requires for its configuration file.
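
From the root of the project, commands like the following will create both (shown for a Linux or macOS shell):

mkdir .circleci
touch .circleci/config.yml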

Add the code snippet below:

version: 2.1
jobs:
  build:
    working_directory: ~/repo
    docker:
      - image: cimg/node:19.4.0
    steps:
      - checkout
      - run:
          name: update-npm
          command: sudo npm install -g npm@latest
      - restore_cache:
          key: dependency-cache-
      - run: 
          name: install-all-packages
          command: npm install
      - save_cache:
          key: dependency-cache-
          paths:
           - ./node_modules
      - run: 
          name: run tests
          command: npm run test:unit
      - run:
          name: generate static files #dist
          command: |
            npm run build
            echo "Pushed to Bitbucket"

Now push your changes up to the circle-ci branch using the below commands:

git add .
git commit -m 'first commit for the circle-ci branch'
# you'll need to do this since you renamed your Bitbucket repository
git remote remove origin
# Be sure to add your own repository url
git remote add origin "git@bitbucket.org:my-helloworldworkspace/my-helloworldworkspace.bitbucket.io.git"
git push origin circle-ci

Once you have successfully pushed your code to the circle-ci branch, when you select the Branches tab from the left panel on your Bitbucket repository, you should see that you now have two branches. The main branch and the circle-ci branch:

Fig 27: Viewing branches on Bitbucket repository

Select the Source tab from the left panel to confirm that you also have the circle-ci branch created:

Fig 28: Viewing branches on Bitbucket repository

Now you can view the pushed code by selecting the circle-ci branch shown in the image above:

Fig 29: Viewing source code on circle-ci branch

Since you are using CircleCI as the CI/CD tool, you’ll need to generate an SSH key pair and add the public key to your Bitbucket account; otherwise, your build will fail. To achieve this, click on your avatar and select Personal settings from the dropdown menu:

Run the following command to generate an SSH key pair: ssh-keygen -t rsa -b 4096 -C "YOUR_BITBUCKET_EMAIL"
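
The public key is what you will paste into Bitbucket in the next step. Assuming you accepted the default file location when prompted, you can display it with:

cat ~/.ssh/id_rsa.pub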

Fig 30: Selecting personal settings to add ssh key on bitbucket

Select the SSH keys from the left panel and click on the Add key button:

Fig 31: Initialising add ssh key on bitbucket

A pop-up window will be shown. Input a name for the key as the label, paste in the public key, and then click on the Add key button to add it.

Fig 32: Adding ssh key to Bitbucket (circle ci)

Head over to CircleCI, sign up with your Bitbucket account, and be sure to select the Bitbucket workspace housing the repository you want to build with CircleCI:

Fig 33: Selecting Bitbucket workspace

Once you have selected your workspace you should be taken to your dashboard to set up a project. From your dashboard, click on the Add project button highlighted below, to add a project:

Fig 34: Viewing CircleCI dashboard

Since CircleCI has access to your Bitbucket workspace, it will automatically output all the repositories you have on that workspace. In this case, the only repository available in the my-helloworldworkspace is my-helloworldworkspace.bitbucket.io. So, click the Set up project button beside the repository to set up a CircleCI project for that repository:

Fig 35: Setting up a CircleCI project

This will open a pop-up asking you to select the branch you would like to build. It also checks whether there is a .circleci/config.yml file available in that branch; if there is, you’ll have a green check mark indicating it has found the .circleci/config.yml file, otherwise it provides you with other options to create a .circleci/config.yml file for it to use.

Now type in circle-ci as the branch and click on the Set up project button to proceed:

Fig 36: Selecting config.yml file for CircleCI

The page will then navigate to your dashboard for you to view your pipeline. The following output shows that the pipeline was successful:

Fig 37: Viewing successful pipeline

If you also head to your Bitbucket branches page for your repository, you will see the below output showing that the build on CircleCI was successful:

Fig 38: Viewing build on Bitbucket repository

Conclusion

As you have seen, Bitbucket is a lot like GitHub; you can do the basic things you do on GitHub on Bitbucket too. You have seen how to create a workspace, project, and repository on Bitbucket. You have also seen how to use Bitbucket Pages to view your repositories over a browser and, ultimately, how to integrate Bitbucket with CircleCI.

I believe you will agree that Bitbucket is easy to use out of the box, just like GitHub. You can take a tour to test other Bitbucket features like pull requests, Bitbucket Pipelines, and more.

 

The post Introduction to Using Bitbucket and CI/CD for Your App appeared first on Simple Talk.




Tuesday, April 25, 2023

PostgreSQL Basics: Getting started with psql

PostgreSQL has a separate command-line tool, psql, that’s been available for decades and is included with any installation of PostgreSQL. Many long-term PostgreSQL users, developers, and administrators rely on psql to help them quickly connect to databases, examine the schema, and execute SQL queries.

Knowing how to install and use basic psql commands is an essential skill to have for anyone that will connect to PostgreSQL.

A Brief History of psql

To understand why PostgreSQL has its own command line utility and why it is so integrated into how developers work with PostgreSQL, we need to take a quick step into the time-travel machine and remind ourselves how PostgreSQL became PostgreSQL.

  • 1970s-1985: INGRES, the grandfather of PostgreSQL, was developed and used a proprietary query language called QUEL. One of the terminal programs used to interact with INGRES was called monitor.
  • 1985-1994: POSTGRES (after INGRES) was started and developed. It maintained the QUEL query language, now called POSTQUEL, so the monitor application could continue to be used with POSTGRES.
  • 1995: Postgres95 was released with a very liberal PostgreSQL License, based on the POSTGRES source code, but QUEL was replaced with SQL! The monitor terminal application would no longer work, and psql was provided as the new terminal tool for interacting with Postgres95.
  • 1996-present: PostgreSQL 6.0 was released under the guidance of the new PostgreSQL Global Development Group, and the amazing work hasn’t stopped since. With every release of PostgreSQL, additional features and improvements are made to the psql utility to provide more power to end users.

Why Should You Use psql

If you’re new to the PostgreSQL space, using a command line utility to connect to and query the database might feel outdated. As a former, long-time SQL Server developer I know the lack of a consistent, standardized GUI tool felt really frustrating at times.

After a while, however, I realized that it is often much easier to jump into a database with psql when I just need to connect, look at schemas, and run simple queries. Using the meta-commands becomes second nature for getting details about database objects. It’s like having a tool with hundreds of efficient shortcuts to quickly work with the database.
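
For example, a few of the most commonly used meta-commands (far from an exhaustive list):

  • \l – list databases
  • \c dbname – connect to a different database
  • \dn – list schemas
  • \dt – list tables
  • \d table_name – describe a table’s columns and indexes
  • \q – quit psql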

Even more powerful is that you can configure psql to return the SQL that it runs for each command along with the results. This can be a gateway into learning many of the underlying catalog tables in PostgreSQL that control every aspect of how the database is used.

And last, but certainly not least, is that many (many!) articles, tutorials, books, and videos that show you how to use PostgreSQL features will often demonstrate concepts using psql. Knowing how to connect with the tool and run basic meta-commands will quickly help you follow along.

TL;DR: psql is a powerful tool to have in your PostgreSQL toolbelt. Knowing how to connect to a database and run basic commands will quickly pay dividends in your PostgreSQL work. In short, I think learning some of the basics is well worth the investment of time.

Installing psql

The psql command line tool (CLI) is available on Linux, MacOS, and Windows. It comes pre-bundled with the PostgreSQL server installation package or it can be installed independently as a stand-alone CLI application.

Also, psql is versioned alongside PostgreSQL because the queries that it runs when you use meta-commands need to work with newer catalog schemas. There is a lot of effort taken to keep psql backwards compatible within the supported PostgreSQL versions, so a newer version of psql should work with at least the last five major releases.

Check For psql

Because psql comes bundled with the PostgreSQL server, there’s a possibility you already have it available on your computer if you’ve ever installed PostgreSQL. From a terminal or Windows command prompt, type the following:

$> psql --version
psql (PostgreSQL) 15.0 (Ubuntu 15.0-1.pgdg20.04+1)

If you see a response that lists the psql version, then you already have psql installed. If the version is older, consider updating it.

A good rule of thumb is to try to have a psql version that matches the newest version of PostgreSQL you’re using across environments. For the basic commands we cover below there’s a high likelihood that any version you have will work, but keeping psql updated is a good practice to follow.

Install on Linux

All major Linux distributions should have a package for the postgresql-client. This can be used to install only the PostgreSQL client tools, apart from the server, including psql, pg_dump, pg_restore, and others.

We’ll use Ubuntu as an example, but the package manager of your distribution should have identical (or very similar) package names to do the same thing. The commands below will install the latest version of the tools.

From a terminal prompt, run the following commands:

$> sudo apt-get update

$> sudo apt-get install postgresql-client

Once completed, verify installation by checking for the psql version as shown previously.

Install on Windows

On Windows, you have two options.

If you want to use psql through a Windows command prompt, then you need to install PostgreSQL and all the tools using the Windows installation package found on PostgreSQL.org. Through the installation process, you can decide which parts of the server and tools to install.

On Windows 10 or greater, it’s possible to install the tools in a Windows Subsystem for Linux 2 (WSL2) environment and use the normal Linux package management to install the components as shown above. Again, depending on the distribution you’re using for the WSL host, the package commands may be slightly different.

Install on MacOS

There is a plethora of ways to install PostgreSQL on MacOS, many of which include the full server and associated tools.

The fastest method for getting PostgreSQL setup on MacOS is to use the Postgres.app. This provides a full installation of the PostgreSQL server that can be started and stopped at will, and the simple instructions on their homepage show you how to make sure the PostgreSQL CLI tools are in the path to use from the Terminal.

Alternatively, you can install just the PostgreSQL tooling using Homebrew. Installing the libpq package installs only the client tools (not the server).

ryan@mba-laptop % brew doctor
ryan@mba-laptop % brew update
ryan@mba-laptop % brew install libpq

Because this isn’t the full PostgreSQL server, you’ll also need to set up the path correctly so that you can use psql from the Terminal. The easiest way to do that is to let Homebrew update the linking as follows.

ryan@mba-laptop % brew link --force libpq

Docker

Finally, I’d be remiss if I didn’t mention Docker as an alternative. Any of the official PostgreSQL containers can be used, through the interactive Docker shell, to connect to and use psql. Whether you’re attempting to connect to PostgreSQL on the Docker instance itself or remotely, the installed psql application can be used (assuming networking is set up correctly).

As an example, the following commands could be used to download and start a PostgreSQL Docker container, connect to the running shell, and use psql inside of the container.

ryan@redgate-laptop:~$ docker run --name pg15 -p 5432:5432 -e POSTGRES_PASSWORD=password -d postgres

ryan@redgate-laptop:~$ docker exec -it pg15 bash

root@80049acea1c0:/# psql -h localhost -U postgres

Connecting to PostgreSQL

Once you have psql installed, there are two ways to specify the basic connection parameters for the target database. You will need the following information to connect to PostgreSQL:

  • Hostname
  • Port (5432 by default)
  • Username
  • Password
  • Database name

Specifying Individual Parameters

With the above information, connect to the database using these parameters. If the PostgreSQL server is running on the default port of 5432, you can omit the -p switch and psql will attempt to connect to that port automatically.

psql -h [hostname] -p [port] -U [username] -d [database name]

If all the information is correct, you will be prompted for a password. When using the individual parameters for connecting to PostgreSQL, there is no method to supply the password on the command line. You will always be prompted for it unless you have specified it in the .pgpass file.
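
For example, connecting to a local server on the default port with illustrative values (substitute your own host, user, and database):

psql -h localhost -U postgres -d postgres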

Using a PostgreSQL Connection URI

Alternatively, you can use the connection parameters to create a PostgreSQL connection URI to connect to the database.

psql postgresql://[username]:[password]@[hostname]:[port]/[database name]

With this form, you can specify the password in the connection string as long as it does not contain special characters such as a colon (:) or an at sign (@), which interfere with connection URI parsing (such characters would need to be percent-encoded). If the password is omitted, then PostgreSQL will prompt you for it.
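
For example, the same illustrative connection expressed as a URI (avoid putting real passwords on the command line where they may end up in your shell history):

psql postgresql://postgres:password@localhost:5432/postgres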

Conclusion

The psql command-line application is the go-to tool for anyone working with PostgreSQL. With decades of development and hundreds of built-in meta-commands to help developers and administrators work with PostgreSQL quickly and efficiently, knowing how to install and use it to connect to databases is an essential skill for PostgreSQL users. As an added bonus, recent psql versions can be used with at least the most recent five major releases of PostgreSQL because of the tireless work by the community to maintain backwards compatibility.

Of course, knowing how to get it installed and connecting is only part of the puzzle. In our next article, we’ll give you a tour of how to use psql and the major meta-commands you should know to accelerate your development and management of PostgreSQL databases.

 

 

The post PostgreSQL Basics: Getting started with psql appeared first on Simple Talk.




Thursday, April 20, 2023

One use case for NOT using schema prefixes

I’ve long been a huge advocate for always referencing objects with a schema prefix in SQL Server.

In spite of what may be a controversial title to many of my regular blog readers, I don’t really want you to stop that practice in most of your T-SQL code, because the schema prefix is important and useful most of the time. At Stack Overflow, though, there is a very specific pattern we use where not specifying the schema is beneficial.

Background

The Stack Exchange Network comprises many web sites with the same question & answer model, and you’ve almost certainly heard of “the big three” – Stack Overflow, Server Fault, and Super User (all with spaces!). There are many other sites, covering everything from cooking and physics to science fiction and woodworking.

The way the public network sites work is that each web site has a domain name and a connection string to an individual database. Every database is basically identical in structure – each site has a Posts table, a Users table, and all the usual suspects.[ * ]

How sites interact with tables in the Stack Exchange Network

Our Stack Overflow for Teams offering is a private version of the public network – you create your own team, and you get your own private Q & A web site, where you can share information within your organization or even just among your immediate teammates.

The database structure works slightly differently here; in most tiers, each team does not have its own database, but rather the security boundary is by schema, as explained in this security overview. There are multiple reasons for that, and they pre-date my tenure here. But certainly, as a reliability engineer, I do not want to have to manage thousands and thousands of databases, especially since even at 100 databases, we bump up into both enforced limits (e.g. Managed Instance) and practical limits (see our well-documented and oft-lamented performance problem around AG failovers).

But each team still has its own connection string. And each team’s schema still has a Posts table, a Users table, and all the usual suspects.[ * ]

How teams interact with tables in Stack Overflow for Teams

The structure is the same because large portions of the codebase are shared across all of the applications and services, and they issue all kinds of queries – often dynamically and conditionally constructed within Dapper or the bowels of C#. The problem is clear: we can’t obey the pattern of always specifying the schema without complicating every single query and injecting either dbo. in the case of public network or Schema00xxx. in the case of Teams. So the code is full of queries that simply don’t reference any schema at all, to make them the most portable and compatible with all flavors of the software:

SELECT {columns} FROM Posts WHERE ...

For Teams, we take advantage of the fact that the connection string for any site specifies a user that owns that single schema, uses it as their default, and can’t access any others. The potential trade-off there is something I mentioned in my earlier rant: every time a user from a different schema executes any specific version of a query, another entry is added to the plan cache (even though the query text and all other attributes are identical).

This isn’t a problem in the public network side because there are only dbo versions of the tables, and no app users or service accounts have anything else as their default schema. And it’s actually not the huge problem you might assume in Teams, either, because teams come in all shapes and sizes. It can actually be beneficial for each team to generate their own unique plan, with estimates based on their specific table cardinalities. But it is worth mentioning because it can come as a surprise.

Other Challenges

This solution is not without its challenges aside from “plan cache stuff.”

  • Balance: It takes time and experimentation to balance the right number of schemas across the right number of databases. If you have to create a million tenants, what’s best? Ten databases with 100,000 schemas each? 100,000 databases with 10 schemas each? Somewhere in the middle? Considerations include how many are supported (I mentioned the 100-database limits above) and, even where there is no explicit limit, there are other questions like, “how long am I willing to wait for Object Explorer to render 10,000 schemas?” And how many backups are you taking vs. how big are your backups vs. how long does it take you to recover a single tenant (or many tenants). We started with 10,000 schemas per database, and have since adjusted that to 1,000.
  • Server-level logins: If your SQL Server is domain-joined, you will find there is a practical upper bound where creating the next login (even a SQL Authentication login!) will take longer and longer and, eventually, will effectively grind to a halt. One adjustment we made – which had zero impact on the behavior of the application, APIs, or supporting services – is to switch from matching server-level logins and database-level users to just a single set of contained database users. As the documentation suggests, this also makes it easier to move a database with a noisy neighbor to a different server, though we haven’t done that yet.
  • Muscle memory: It has been very hard to get into the habit of not doing something I’ve been training myself to always do for about two decades now, and have been influencing others to always do through writing, speaking, peer review, and answers. I have submitted several PRs that were very quickly rejected because an errant dbo. snuck in there out of habit.

Conclusion

This solution works great for our specific use case because the application codebase is very large and managing at the schema level is the lesser of several possible evils. In a multi-tenant solution, you will always have trade-offs – not just between performance and management overhead, but also between where performance is most important and where management overhead hurts the most.

For Stack Overflow, I’m convinced we’ve made the right compromise, and happy that the architecture is built in such a way that we can easily tweak configuration like “number of schemas per database.”

This is not to suggest, necessarily, that you should use this model in your environment. There are many other factors involved, including where you want more or less complexity, and how you want to scale tenancy. Those criteria are much more important than whether you are following or ignoring what some old-timer rants about in his blog posts and presentations, and I just thought it would be useful to have a counter-example where I admit that almost every rule has justifiable exceptions.

* Further Reading

To learn more about the Stack Overflow schema, and the tables I lump into “the usual suspects,” see Erik Darling’s gentle introduction. If you’re a more hands-on person, you can snoop around a rough subset of the schema in the Stack Exchange Data Explorer, or download the data dump from archive.org.

The post One use case for NOT using schema prefixes appeared first on Simple Talk.




Monday, April 17, 2023

Is Artificial Intelligence another way to spread misinformation?

I use Chat-GPT occasionally, instead of Google, to get answers to technical questions. The trouble is that it currently treats authoritative sources with the same eagerness as blog posts written by newly-trained wannabe database devs who seek to get their names known, and I have to keep pointing out errors: ‘No, you can’t run JDBC from PowerShell that way’, ‘Github doesn’t support that way of accessing its REST API’ and so on. When corrected, it does express remorse and is fulsome in its apologies. However, I regard Chat-GPT as an indiscriminate conveyor of existing information, not artificial intelligence.

To work quickly, I need information that is correct.

I am intrigued, however. One of my hobbies has been using a computer to generate what we used to call Gobbledygook: wordy and generally unintelligible jargon based on someone else’s work. Now Computer Science does this for you. Unless you are wary, you can mistake this for intelligence: it isn’t. It is, rather, a mashup of other sources of knowledge and insight, providing plagiarism on an industrial scale. To rephrase someone else’s intellectual property uncredited, as if it were your own, isn’t intelligence but merely theft. To make it worse, Chat-GPT cannot, in its present state, distinguish bad data.

Psychologists who assess neurological problems have a colloquial term for this, ‘Cocktail-party Syndrome’, where a person whose mental faculties aren’t up to demand will produce very plausible sentences that seem to reflect an intelligent and ordered mind. The brain can, almost effortlessly, pull in trains of thought and reconfigure them into a plausible ‘Word Salad’. It is a good technique, much used by ‘content-providers’, conspiracy theorists, and managers. Never mind the quality, feel the width. It is easy to bluff the listener into believing that there is deep thought behind the blather. No, it is an artificial language model for which we have few natural defenses. Even I found myself apologising when I pointed out Chat GPT’s errors.

Chat GPT is currently like the keen schoolchild in the front of the class who is eager to answer every question with uncrushable confidence, but rather shaky knowledge. One of the proudest and most valuable signs of real human intelligence is to realise and admit when you have no knowledge or expertise in a topic, because there is nothing more damaging than transmitting incorrect information.

There was once a culture where what appeared in print was edited and checked. It was published by an organisation that was obliged to stand by what they published. Now we have insufficient attribution to give us confidence that information comes from a reliable source, has been tested against reality, and peer reviewed. At this point, any database developer will get a nagging doubt. Is this a possible vector for bad data, a strain of the worst ailment that can ever overtake a data system? We’re all ill-equipped to judge information that is presented slickly with apparent confidence. That’s how wars start.

The post Is Artificial Intelligence another way to spread misinformation? appeared first on Simple Talk.




Thursday, April 13, 2023

How to secure legacy ASP.NET MVC against Cross-Site (CSRF) Attacks

Cross-Site Request Forgery (CSRF) attacks are widespread, and even some BigTech companies suffer from them.

  • Netflix suffered from CSRF vulnerabilities in 2006. Attackers could change login credentials, change the shipping address, and send DVDs to a newly set address.
  • YouTube suffered from CSRF attacks where an attacker could perform actions as any user.
  • ING Direct lost money to CSRF attackers who used its web application to make illicit money transfers.
  • A vulnerability in McAfee Secure allowed attackers to make changes to the company’s systems.

How were the attackers able to do this? What were some of the techniques they used? To understand this, we need to understand CSRF in detail.

What is a Cross-Site Request Forgery (CSRF) attack?

CSRF is when an attacker gets a victim’s browser to submit unauthorized commands to a website the user is already logged in to. In layman’s terms, when you click on a malicious hyperlink, it triggers scripts that perform actions on your behalf against your logged-in bank website. Voila, the attackers have your money.

The malicious link would look something like these:

  • “You are a winner.”
  • “You placed order # 648722.”

Why are CSRF vulnerabilities a must-fix in the ASP.NET MVC world?

ASP.NET MVC and ASP.NET Core are traditionally some of the most used platforms to build financial web applications, such as banks and hedge funds. From a statistical standpoint, these platforms are trusted more than their counterparts, such as Express or NodeJS, for financial web applications. In addition, it is easier to fix CSRF issues in ASP.NET Core than in ASP.NET MVC because of the better tools and support available. We will investigate techniques to fix CSRF issues in ASP.NET MVC.

How to attack an ASP.NET MVC website using a CSRF attack vector

Before we understand how to fix CSRF issues, we need to know how they happen in the first place. For example, suppose you log into your bank website at onlinebank.com, and you are visiting a malicious website in another tab, which looks like this. (This is an actual screenshot of a spam email I received.)

The “Get Started” button could be hiding a script and Html form like this.

<h1>Affordable Medicare Plans are available. Hurry now!</h1>
<form action="http://onlinebank.com/transfer?account=76543865&amount=15000" method="post">
  <input type="submit" value="Get Started"/>
</form>

When you click the “Get Started” button, you can see the network traffic originating from it using the developer console built into your browser, opened by pressing the F12 key in most browsers. (The developer console is essential for web development, and almost every engineer who works on front-end browser code should know about it.)

If you are already logged into the onlinebank.com website, this traffic would include the session cookies. Now the attacker has all the pieces needed to solve the puzzle:

  1. a logged-in user
  2. users’ session cookies
  3. a URL that steals the victims’ money.

No wonder CSRF attacks are also called session-riding and one-click attacks.

The attack can also happen in other ways, for example using img tags.
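
For instance, a hidden image tag on the malicious page could fire a forged request (a GET, in this case) automatically as soon as the page loads. This is a sketch reusing the hypothetical onlinebank.com URL from above:

<img src="http://onlinebank.com/transfer?account=76543865&amount=15000" width="0" height="0" alt="" />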

Now that we have seen how the attack can happen, let’s discuss our prevention strategies.

Enter Anti-Forgery Tokens. Drum roll, please 🥁🥁🥁!

The suggested way to prevent CSRF attacks is to use tokens that only your application would know. Your ASP.NET MVC web app generates the tokens, and we verify these tokens on relevant requests to the server. Since GET requests are not supposed to alter the persisted information, it is ideal to use and verify this token on POST, PUT, PATCH, and DELETE requests.

Let’s outline the steps needed.

  • User visits a page
  • On page request, ASP.NET MVC generates two tokens. A cookie token and a hidden form field token. The server embeds both tokens in response.
  • When the user does an action that alters data, such as a form submission, the request should contain both of these tokens.
  • Server verifies if the action request has both tokens; if not, the server says ‘no’ to the request.

In short, think of this as accessing a bank locker, but you can only do it in the presence of a bank manager (anti-forgery tokens) sent by the bank (server), even though you have the key to the locker (session cookie).

Enough talk. Show me the code for ASP.NET MVC.

You would add `@Html.AntiForgeryToken()` to your Html forms.

@using (Html.BeginForm("Transfer", "Account", FormMethod.Post))
{
    @Html.AntiForgeryToken()
    <input type="submit" value="Submit Form" /></td>
}

It would translate to:

<form action="Account/Transfer" method="post">
  <input name="__RequestVerificationToken" type="hidden" value="CfDJles20ilG0VXAiNm3Fl5Lyu_JGpQDA......"/>    
  <input type="submit" value="Submit Form" />
</form>

As you can see, ASP.NET MVC has added `__RequestVerificationToken` to this form token as a hidden field. This token is generated at the server.

Now, when this form is submitted, this form token will be submitted along with form data, and the cookie token will make it to the server as part of the request as well.

Now we have only one job left; verify both tokens at the server.

To do that, a method like `AntiForgery.Validate(cookieToken, formToken); ` will do the job. But for ASP.NET MVC, there is a built-in attribute that would do this job for you – `ValidateAntiForgeryToken`.

using System.Web.Mvc;
namespace Online.Bank.App.Controllers
{
    public class AccountController : Controller
    {
        [HttpPost]
        [ValidateAntiForgeryToken] // This attribute will do the Anti-Forgery token validation for you.
        public ActionResult Transfer()
        {
            return Json(true); // This line is added for brevity. You would be doing good stuff here.
        }
    }
}

What about AJAX and jQuery?

AJAX primarily uses JSON data instead of HTML form data, which poses a problem for us: the above code won’t work. How do we solve this?

Let’s assume we need to call an endpoint Account/Manage using AJAX.

As the first step, using razor syntax, we can ask the server to generate the tokens and give them to us like this.

<script>
    @functions{
        public string GetAntiForgeryTokens()
        {
            string cookieToken, formToken;
            AntiForgery.GetTokens(null, out cookieToken, out formToken);
            return cookieToken + ":" + formToken;
        }
    }
</script>

And then send the tokens to the server using AJAX.

<input type="button" onclick="manageAccount()" value="Manage Account" />
<script>
    function manageAccount() {
        $.ajax("<baseurl>/Account/Manage", {
            type: "post",
            contentType: "application/json",
            data: {  }, // JSON data
            dataType: "json",
            headers: {
                '__RequestVerificationToken': '@GetAntiForgeryTokens()'
            },
            success: function (data) {
                console.log(`Account updated: ${data}`);
            }
        });
    }
</script>

Finally, verify these tokens on the server side. The `Manage` action would be similar to this.

using System;
using System.Linq;
using System.Web.Helpers;
using System.Web.Mvc;
namespace Online.Bank.App.Controllers
{
    public class AccountController : Controller
    {
        [HttpPost]
        public ActionResult Manage()
        {
            string cookieToken = "";
            string formToken = "";
            var tokenHeaders = Request.Headers.GetValues("__RequestVerificationToken");
            string[] tokens = tokenHeaders?.First()?.Split(':');
            if (tokens?.Length == 2)
            {
                cookieToken = tokens[0].Trim();
                formToken = tokens[1].Trim();
            }
            try
            {
                AntiForgery.Validate(cookieToken, formToken);
            }
            catch (Exception ex)
            {
                // Alert folks that someone is trying to attack this method.
            }
            return Json(true); // This line is added for brevity. You would be doing good stuff here.
        }
    }
}

That’s it. Now our Account/Manage endpoint is protected.

But do you see a problem? Do we have to repeat this in every action method? How do we avoid that?

Custom Anti-Forgery Attribute to the rescue!

What if we create a custom ASP.NET MVC attribute with the above code? Then we can decorate action methods or controllers with that attribute, right?

using System;
using System.Web;
using System.Web.Helpers;
using System.Web.Mvc;
namespace Online.Bank.App.Attributes
{
  [AttributeUsage(AttributeTargets.Method | AttributeTargets.Class, AllowMultiple = false, Inherited = true)]
  public sealed class ValidateHeaderAntiForgeryTokenAttribute : FilterAttribute, IAuthorizationFilter
  {
    // Bonus method
    private void TellEveryoneWeBlockedAttacker(HttpContextBase httpContext, Exception ex)
    {
      string controllerName = httpContext?.Request?.RequestContext?.RouteData?.Values["controller"]?.ToString();
      string actionName = httpContext?.Request?.RequestContext?.RouteData?.Values["action"]?.ToString();
      ex.Data.Add("ControllerName", controllerName);
      ex.Data.Add("ActionName", actionName);
      // Use the exception we created to notify the logging or error reporting system.
      // You may alternatively send an email message or slack message here.
    }
    public void OnAuthorization(AuthorizationContext filterContext)
    {
      if (filterContext == null)
      {
        throw new ArgumentNullException("filterContext");
      }
      var httpContext = filterContext.HttpContext;
      var cookie = httpContext.Request.Cookies[AntiForgeryConfig.CookieName];
      try
      {
        AntiForgery.Validate(cookie != null ? cookie.Value : null, httpContext.Request.Headers["__RequestVerificationToken"]);
      }
      catch (Exception ex)
      {
        TellEveryoneWeBlockedAttacker(httpContext, ex);
        // Block the request; without this, a failed validation would only be logged.
        filterContext.Result = new HttpStatusCodeResult(403);
      }
    }
  }
}

And decorate the action method with the ValidateHeaderAntiForgeryToken attribute.

using System.Web.Mvc;
using Online.Bank.App.Attributes;
namespace Online.Bank.App.Controllers
{
    public class AccountController : Controller
    {
        [HttpPost]
        [ValidateHeaderAntiForgeryToken] // This is where we put the Anti-Forgery attribute we just created
        public ActionResult Manage()
        {
            return Json(true); // This line is added for brevity. You would be doing good stuff here.
        }
    }
}

See how clean the code got. Now we can use this attribute across the entire ASP.NET MVC web application to decorate action methods and controllers. The beauty of attributes, isn’t it?

Can we optimize the client-side code as well?

ASP.NET MVC generally has a _Layout.cshtml file. We can get the Anti-Forgery tokens there using JavaScript and @Html.AntiForgeryToken(). Though _Layout.cshtml would be the ideal spot, it can be done anywhere in your razor files.

<script>
    function getAntiForgeryToken() {
        let token = '@Html.AntiForgeryToken()';
        token = $(token).val();
        return token;
    }
</script>
Usage
<input type="button" onclick="manageAccount()" value="Manage Account" />
<script>
    function manageAccount() {
        $.ajax("<baseurl>/Account/Manage", {
            type: "post",
            contentType: "application/json",
            data: {  }, // JSON data
            dataType: "json",
            headers: {
                '__RequestVerificationToken': getAntiForgeryToken()
            },
            success: function (data) {
                alert(`Account Updated: ${data}`);
            }
        });
    }
</script>

That’s good. What if we have a lot of AJAX calls, and we want the Anti-Forgery token to be present in every request?

You can set up jQuery requests to have the tokens added to the request header by default using $.ajaxSetup().

Usage is like this.

$.ajaxSetup({
    headers: {
      '__RequestVerificationToken': getAntiForgeryToken()
    }
});

That’s it. We made the life of the attacker a bit more complicated. Isn’t that a good thing?

Wrap up

Cross-Site Request Forgery (CSRF) vulnerabilities are not easily detectable without security scans. Implementing a technique presented here (or any technique, for that matter) will save numerous headaches and much pain and suffering. When it comes to application performance, reactive will do just fine. But in application security, proactive is always better than reactive! CSRF is just one attack vector; there are others like XSS, SQL injection, and many more. More on that later! In the meantime, feel free to check out this example project demonstrating the code and concepts described above.

The post How to secure legacy ASP.NET MVC against Cross-Site (CSRF) Attacks appeared first on Simple Talk.




Monday, April 10, 2023

PostgreSQL Schema: Learning PostgreSQL with Grant

An important aspect of building and managing a database is organizing the objects within your database. You may have tables that support different functions, for example, a schema for warehousing operations and another for sales. Some logins may need access to some tables, but not others. You might want to isolate one set of objects within a database from other sets of objects. All of this, and more, can be accomplished using schemas within a database and PostgreSQL supports the use of schema for just these types of functions.

In the sample database I’ve created as a part of this ongoing series, I created a couple of schemas and organized my tables within them. You can see the database here in the CreateDatabase.sql script. The rest of the code in this article is in the folder 08_Schema.

Managing Schema

A schema is used first and foremost as an organizing mechanism on your database. From there, you can get into using schema to help design security, manage access, and generally control what users can see and do within your database. When you create a blank database, it comes with a default schema, public.

When you create an object, like a table, it will automatically go into your default schema unless you specify otherwise. By default, all logins to the database have access to the public schema (PostgreSQL 15 has changed this default, so now users do not have rights to create objects in the public schema). Other than these default behaviors, the public schema is just another schema in the database and most of the functions and rules we’ll discuss in the rest of the article apply.

To get started creating your own schemas, the syntax is very straightforward:

CREATE SCHEMA mytestschema;

This creates a schema called mytestschema. To create a table within that schema, you simply use a two-part name (schema_name.table_name) for the table within the CREATE TABLE command like this:

create table mytestschema.testtable
(id int,
somevalue varchar(50));

It’s the same with any queries too:

select id from mytestschema.testtable;

You can think of the schema as the owner of the table (the owner of the schema is technically the owner of the table). Defining the schema in all your code helps ensure that accidents don’t happen, because when you start using schema, you can define the same object names if they’re in different schemas. (Keeping object names distinct is a good practice, but sometimes the same name is the best name in different schemas.)

create schema secondschema;

create table secondschema.testtable
(insertdate date,
someothervalue varchar(20));

This is perfectly valid. If I were to write what I consider poor code, like this:

select * from testtable;

This likely results in the following error:

ERROR: relation "testtable" does not exist

LINE 2: select * from testtable;

Initially, it seems like the error occurs because PostgreSQL can’t figure out which of the two testtable tables to pull from. Rather, it’s because logins have a default schema. When I run a query like the one immediately preceding, without a schema identifying where the table lives, PostgreSQL looks in my default search path. If the table isn’t there, then as far as PostgreSQL is concerned, it doesn’t exist. This is true even though I’ve got two tables with that name. PostgreSQL doesn’t check other schemas “just in case.”

Later in the article, I will discuss how to manage the schema defaults.

If the schema is empty, you can drop it:

drop table if exists secondschema.testtable;

drop schema if exists secondschema;

If I try to drop a schema that still contains objects, for example mytestschema, which still has a table in it, an error occurs:

SQL Error [2BP01]: ERROR: cannot drop schema mytestschema because other objects depend on it

Detail: table mytestschema.testtable depends on schema mytestschema

Hint: Use DROP ... CASCADE to drop the dependent objects too.

In the error message, there is a hint for how to get around this. I could rewrite my query like this:

drop schema if exists mytestschema cascade;

The beauty of this is that it will remove all tables, views, etc., within the given schema. That is also the horror of this syntax, that it’s going to remove all the tables, views, etc., without consulting with you in any way.

There is a default schema in every database that gets created, public. However, it is just the default and, like most defaults, it can be changed. In fact, you can even drop the public schema if you so choose. I started this section detailing how to create your own schema because I think it’s a good practice to organize your data storage into a defined schema that you directly control, as opposed to just accepting the defaults.

Controlling the Default Search Path

In addition to helping you organize your database objects, schema helps you control access to those objects. I haven’t delved into security yet in this series, and it’s likely a little way off still. However, I will talk some about how schema helps you manage security on your database. (My teammate Ryan Booz did recently publish an article on the subject, “PostgreSQL Basics: Roles and Privileges”.)

In this section I want to detail some of the ways you can manage the default schema.

In the last example of the section above, I showed how you can have duplicate table names between schemas, but that you must reference the schema name to access those tables. However, this isn’t the whole story.

There’s actually a defined search list for schema that you can see using this query:

show search_path;

If you haven’t changed anything in your server, the default results are:

"$user",public

The $user entry refers to a schema with the same name as the current user, if such a schema exists (similar in spirit to default schemas in SQL Server). If you don’t specify the schema, PostgreSQL uses the first schema on the search path that actually exists, which is public in this case. We can add schemas to the search list for the current connection:

SET search_path TO radio,public;

That replaces the search_path for the current connection, putting the radio schema ahead of public so that radio is searched before the public schema. The change made with the SET command only lasts for the session; if you disconnect and come back, you will have to set the path again.
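
A quick way to confirm where unqualified names will resolve for the current session is a check like this (it changes nothing, it just reports):

SHOW search_path;        -- the schemas PostgreSQL will search, in order

SELECT current_schema(); -- the first existing schema on the path, where new objects will be created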

If you want to make the changes to the path the default, you can use the ALTER ROLE to set any role to have a specific search path. For example:

ALTER ROLE scaryDba SET search_path = 'radio,public,$user';

If you want to set the default for the server/cluster/database, you can modify search_path in the postgresql.conf file, or you can use:

ALTER ROLE ALL SET search_path = '$user';

This will not override individually set search paths, but it will mean that every login that does not override the search path has to specify the schema name when referencing any object (which, as noted, is a best practice).

Ownership and Basic Privileges

When you create a schema, you can define a schema owner as other than the login that is running the command:

CREATE SCHEMA secureschema AUTHORIZATION radio_admin;

A schema I haven’t created yet, secureschema, would be created with the owner being the radio_admin login role (also not yet defined because I am not digging into security yet). That ensures that only the radio_admin login, and any accounts defined as superuser, of course, can work within that schema.

You can also control behaviors on schema. For example, since I have set up independent schema in this database and intend to use it in that manner, I can revoke access for all logins to create objects on the public schema (this is only necessary in PostgreSQL 14 and earlier; in 15, CREATE is not granted on public by default):

REVOKE CREATE ON SCHEMA public FROM PUBLIC;

This is using two different meanings of the word “public”. In the first, ‘public’, we’re referring to the schema of that name. In the second, ‘PUBLIC’, we’re talking about the role that contains all users within the database. This is a shorthand mechanism to make sure no one accidentally puts things into the public schema. I’d say it’s a good practice to follow if you’re going to use other schemas, especially if you’re using them to help properly secure your database.

You can grant various privileges between schema and users such that a given user can read data from tables within a schema, but not modify the data contained in the tables (read only access). In this way, you can combine multiple types of data within one database but isolate them from each other as needed. This is a principal reason for using schema in your database.
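
As a sketch of what that read-only access might look like, assuming a hypothetical radio_reader role and the radio schema used earlier:

-- let the role see objects in the schema at all
GRANT USAGE ON SCHEMA radio TO radio_reader;

-- allow SELECT, but not INSERT/UPDATE/DELETE, on the existing tables
GRANT SELECT ON ALL TABLES IN SCHEMA radio TO radio_reader;

-- give the same SELECT-only access on tables created in the schema later
ALTER DEFAULT PRIVILEGES IN SCHEMA radio
    GRANT SELECT ON TABLES TO radio_reader;

Note that ALTER DEFAULT PRIVILEGES only affects objects created later by the role that runs it (or a role named with FOR ROLE), so it complements, rather than replaces, the GRANT on existing tables.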

If you’re not isolating storage and access between schema, it may make less sense to use a schema other than public in the first place. However, most applications have varying levels of access they would like to manage, and schema will lend themselves to appropriately implementing that type of security. If security isn’t a concern, using schema names instead of putting all objects in the public schema can be advantageous for documentation as well.

Conclusion

Schemas are containers that allow you to segment objects and security at a lower level than a database. Using schemas other than public has good benefits. In PostgreSQL there are several methods of setting the default schema if your users are averse to using two-part names.

If you are familiar with schemas from SQL Server, the core functionality of schema is roughly the same in PostgreSQL. However, added functionality, such as the ability to control the search path, changes the types of behaviors you can control within PostgreSQL.

 

The post PostgreSQL Schema: Learning PostgreSQL with Grant appeared first on Simple Talk.




A Beginners Guide To MySQL Replication Part 2: Configuring Source and Replica Servers

In the first part of this series, we spoke about MySQL Replication, the different types, replication formats, benefits, and downsides, as well as a brief introduction to the requirements needed for setting up MySQL Replication. To ensure we stay on track with our intended topic, it’s important to have certain prerequisites in place before we begin. While I won’t be able to provide an in-depth explanation, I recommend conducting research and preparing these environments beforehand.

1. Set up two computers, ideally virtual machines: one for the source server and the other for the replica server.

2. Install the MySQL server software on both virtual machines and make sure they are running the same version of the operating system and MySQL software.

3. Ensure network connectivity is available between the two machines so the source and replica servers can communicate with each other.

Once these three prerequisites are set, then we can finally dive into setting up our replication servers.

Configure the source server

We will be configuring our source server using binary log file position-based replication. Binary logging is essential for the replica servers to receive updates to the database; if binary logging isn’t set up, replication isn’t possible. To set up binary logging for MySQL replication, we need to modify the MySQL configuration file (my.cnf or my.ini) on the source server. Here’s what we need to do:

  • Locate the MySQL configuration file: The location of the configuration file varies depending on the operating system you’re using. On Linux systems, it is typically located at /etc/my.cnf or /etc/mysql/my.cnf. On Windows systems, it is typically a file named my.ini in the MySQL installation (or data) directory.
  • Modify the configuration file: Add the following lines to the [mysqld] section of the configuration file:
server-id=1

log-bin=mysql-bin

To identify servers within a replication topology, a server ID is assigned. MySQL 8.0 assigns a default server-id value of 1; however, this value can be changed dynamically and must be a positive integer ranging from 1 to 2^32 − 1. You can modify the server-id value easily using the following query:

SET GLOBAL server_id = your-preferred-integer-value;
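
Keep in mind that a SET GLOBAL change does not survive a restart. If you are on MySQL 8.0 and would rather make the change from SQL than edit the configuration file, SET PERSIST writes the value to a persisted options file as well (a sketch; use either this or the my.cnf edit, not both):

SET PERSIST server_id = 2; -- example value; must be unique in the topology

Note that log_bin is not a dynamic variable, so enabling binary logging itself still requires the configuration file change and a restart.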

In this article, I will be using the default server-id value of 1. The `server-id` value should be unique for each server in the replication setup. Also, the `log-bin` value sets the base name of the binary log files that will be created on the source server. According to the official documentation, it is also important to note the following about the source:

To achieve the highest level of durability and consistency in a replication configuration that uses InnoDB with transactions, it is recommended to set the following values in the source’s my.cnf file:

innodb_flush_log_at_trx_commit=1

And:

sync_binlog=1

Verify that the skip_networking system variable is disabled on the source. If this variable is enabled, communication between the replica and the source will be impossible, resulting in a replication failure.

For the changes to take effect, restart the MySQL service.
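
After the restart, it is worth confirming that the settings took effect before moving on. A quick sanity check from the mysql client might look like this:

SHOW VARIABLES LIKE 'server_id';       -- should show the value you configured

SHOW VARIABLES LIKE 'log_bin';         -- should be ON

SHOW VARIABLES LIKE 'skip_networking'; -- should be OFF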

Configure the replication server

Remember that in order for our replication to work, each server must have a unique server-id value. For example, if you have two servers, one source (or primary) and one replica, you can set `server-id` to 1 on the source and 2 on the replica. Each replica server must be configured with a server-id that is different from the source it connects to and from every other server in the topology. It is also possible to have multiple replica servers; this isn’t a problem as long as you choose server-id values that do not conflict with the source or with the existing replicas.

  • Stop the MySQL service on the replica server.
  • Create a backup of the MySQL data directory on the replica server. This is to ensure that you can recover the database if something goes wrong during the replication process.
  • Edit the MySQL configuration file (my.cnf or my.ini) to enable replication and set a unique server-id. This can be done by adding the following lines to the configuration file:
[mysqld]
server-id= unique_integer_value

Or by using the MySQL command:

SET GLOBAL server_id=unique_integer_value;

Make sure that `unique_integer_value` is a unique integer that is different from the server ID of the source server and of any other replica servers.

  • Start the MySQL service on the replica server.

Create a MySQL user with replication privileges on the source server

To establish a connection between the source and its replicas, each replica requires a unique MySQL username and password. Hence, it is necessary to create a corresponding user account on the source. You can create a new user account by employing the CREATE USER command.

CREATE USER 'replication_user'@'%' 
                     IDENTIFIED BY 'password';

Replace `replication_user` with the desired username and `password` with the desired (far stronger) password for the new user.

Additionally, to authorize this account for replication purposes, use the GRANT statement to confer the requisite privileges.

GRANT REPLICATION SLAVE ON *.* TO 'replication_user'@'%';

Note that in this example, the privilege is granted to a user named `replication_user` connecting from any host (represented by `%`). In production, you may want to restrict this to the replica’s hostname or IP address.
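
You can confirm the account and its privileges at any time (assuming the account was created for '%' as above):

SHOW GRANTS FOR 'replication_user'@'%';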

Get Binary Log Coordinates

In order for the replica to start the replication process at the correct point, it is important to get the source’s current coordinates within its binary log. This ensures that the replica begins reading the binary log exactly where the snapshot of the source’s data is taken. To do this, flush all tables and block write statements by executing a FLUSH TABLES WITH READ LOCK statement on the source, and keep the session that holds the lock open.

mysql> FLUSH TABLES WITH READ LOCK;

Note: If you intend to export data without locking your tables, please skip this step.

To determine the current binary log file name and position, run the SHOW MASTER STATUS statement in a separate session on the source.

mysql> SHOW MASTER STATUS;

Then, record the File (the current binary log file name) and Position values from the output, as they will be necessary when configuring the replication settings later.

Create snapshots of existing data in the source server

We are going to copy the existing data from the source server into the replica server. This can easily be done using the mysqldump tool, which creates a dump of all the databases we want to replicate. It is the preferred method when using the InnoDB storage engine.

mysqldump --all-databases --master-data > dbdump.db

This dumps all databases to a file named dbdump.db and locks the tables, which might result in downtime. To avoid this, you can use the command like this:

mysqldump --skip-lock-tables --single-transaction --flush-logs 
--hex-blob --master-data=2 -A -uroot -p > C:\dump.sql

The --master-data option writes the source’s binary log coordinates into the dump (as a CHANGE MASTER TO statement) so the replica knows where to begin reading. If you do not want the replication process to start automatically when the replica server starts, start the replica with `--skip-slave-start`.
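
Two housekeeping steps are easy to miss at this point. First, once the dump has finished, release the read lock in the session where FLUSH TABLES WITH READ LOCK was run:

mysql> UNLOCK TABLES;

Second, copy the dump file over to the replica and load it there before configuring replication (the path is just an example):

mysql> SOURCE /path/to/dbdump.db;

With --master-data=2, the recorded coordinates are also included as a comment near the top of the dump file, which is a handy cross-check against the values you noted earlier.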

Use the `CHANGE MASTER TO` statement to configure the replica.

This statement configures the replica to connect to the source server. It specifies information such as the source server’s hostname or IP address, the replication username and password, and the binary log file and position from which replication should start.

For example, the `CHANGE MASTER TO` statement might look like this:

CHANGE MASTER TO
      MASTER_HOST='master_hostname',
      MASTER_USER='replication_user',
      MASTER_PASSWORD='password',
      MASTER_LOG_FILE='mysql-bin.000001',
      MASTER_LOG_POS=12345;

Make sure to replace `master_hostname`, `replication_user`, `password`, `mysql-bin.000001`, and `12345` with the appropriate values for your setup.
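
If you are running MySQL 8.0.23 or later, note that CHANGE MASTER TO has been deprecated in favor of CHANGE REPLICATION SOURCE TO; the equivalent statement, using the same placeholder values, would be:

CHANGE REPLICATION SOURCE TO
      SOURCE_HOST='master_hostname',
      SOURCE_USER='replication_user',
      SOURCE_PASSWORD='password',
      SOURCE_LOG_FILE='mysql-bin.000001',
      SOURCE_LOG_POS=12345;

Similarly, START SLAVE and SHOW SLAVE STATUS have the newer equivalents START REPLICA and SHOW REPLICA STATUS; the older forms used in the rest of this article still work.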

Lastly, start the replication process by executing the `START SLAVE` statement on the slave server.

mysql> START SLAVE;

Test MySQL Replication

You can test that your MySQL replication is working by executing the following steps:

1. Log in to MySQL on your source server, create a database, and insert some data into a table.

mysql -u root -p

Then:

mysql> CREATE DATABASE testdb;
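
If you also want to confirm that row data flows through, and not just the database itself, create a small table and insert a record before exiting; the table and values here are purely illustrative:

mysql> USE testdb;

mysql> CREATE TABLE replication_test (id INT PRIMARY KEY, note VARCHAR(50));

mysql> INSERT INTO replication_test VALUES (1, 'hello from the source');

After step 3 below, running SELECT * FROM testdb.replication_test; on the replica should return the same row.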

Exit from the source server using the EXIT statement:

mysql> EXIT;

2. Log in to the replica server and verify the replication status using the following command:

mysql> SHOW SLAVE STATUS\G

Look for the `Slave_IO_Running` and `Slave_SQL_Running` fields in the output; both should show Yes to confirm that replication is running without errors.

3. Check for all existing databases in the replica server using the following statement:

mysql> SHOW DATABASES;

The newly created testdb database should have been replicated and should now exist on the replica server. If the data is not replicated, check the replication status on the replica server for any error messages or warnings.

Yayy! You just learnt how to replicate servers using MySQL Replication techniques.

 

The post A Beginners Guide To MySQL Replication Part 2: Configuring Source and Replica Servers appeared first on Simple Talk.


