An important aspect of building and managing a database is organizing the objects within your database. You may have tables that support different functions, for example, a schema for warehousing operations and another for sales. Some logins may need access to some tables, but not others. You might want to isolate one set of objects within a database from other sets of objects. All of this, and more, can be accomplished using schemas within a database and PostgreSQL supports the use of schema for just these types of functions.
In the sample database I’ve created as a part of this ongoing series, I created a couple of schemas and organized my tables within them. You can see the database here in the CreateDatabase.sql
script. The rest of the code in this article is in the folder 08_Schema
.
Managing Schema
A schema is used first and foremost as an organizing mechanism on your database. From there, you can get into using schema to help design security, manage access, and generally control what users can see and do within your database. When you create a blank database, it comes with a default schema, public.
When you create an object, like a table, it will automatically go into your default schema unless you specify otherwise. By default, all logins to the database have access to the public
schema (PostgreSQL 15 has changed this default, so now users do not have rights to create objects in the public schema). Other than these default behaviors, the public schema is just another schema in the database and most of the functions and rules we’ll discuss in the rest of the article apply.
To get stated creating your own schemas, the syntax is very straightforward:
CREATE SCHEMA mytestschema;
This creates a schema called mytestschema
. To create a table within that schema, you simply use a two part name (schema_name.table_name
) for the table within the CREATE TABLE
command like this:
create table mytestschema.testtable (id int, somevalue varchar(50));
It’s the same with any queries too:
select id from mytestschema.testtable;
You can think of the schema as the owner of the table (the owner of the schema is technically the owner of the table). Defining the owner in all your code helps ensure that accidents don’t happen. Because, when you start using schema, you can define the same object names if they’re in different schemas. Keeping object names distinct is a good practice, but sometimes the same name is the best name in different schemas):
create schema secondschema; create table secondschema.testtable (insertdate date, someothervalue varchar(20));
This is perfectly valid. If I were to write, what I consider poor code, like this:
select * from testtable;
This likely results in the following error:
ERROR: relation "testtable" does not exist
LINE 2: select * from testtable;
Initially, is seems like it’s an error because PostgreSQL can’t figure out which of the two testtable
tables to pull from. Rather, it’s because logins have a default schema. When I run a query like the one immediately preceding, without a schema identifying where the table lives, PostgreSQL looks in my default search path. If it’s not there, well, that table doesn’t exist. This is true even though, I’ve got two with that name. PostgreSQL doesn’t check other schemas “just in case.”
Later in the article, I will discuss how to manage the schema defaults.
If the schema is empty, you can drop it:
drop table if exists secondschema.testtable; drop schema if exists secondschema;
If I do not drop the table first, an error will occur:
SQL Error [2BP01]: ERROR: cannot drop schema mytestschema because other objects depend on it
Detail: table mytestschema.testtable depends on schema mytestschema
Hint: Use DROP ... CASCADE to drop the dependent objects too.
In the error message, there is a hint for how to get around this. I could rewrite my query like this:
drop schema if exists mytestschema cascade;
The beauty of this is that it will remove all tables, views, etc., within the given schema. That is also the horror of this syntax, that it’s going to remove all the tables, views, etc., without consulting with you in any way.
There is a default schema in every database that gets created, public. However, it is just the default and like most defaults, it can be changed. In fact, you can even drop the public schema if you so choose. I started this section detailing how to create your own schema because, I think it’s a good practice to organize your data storage into a defined schema that you directly control, as opposed to just accepting the defaults.
Controlling the Default Search Path
In addition to helping you organize your database objects; schema helps you control access to those objects. I haven’t delved into security yet in this series, and it’s likely a little way off still. However, I will talk some about how schema helps you manage security on your database. (My teammate Ryan Booz did recently publish an article on the subject, “PostgreSQL Basics: Roles and Privileges”.)
In this section I want to detail some of the ways you can manage the default schema.
In the last example of the section above, I showed how you can have duplicate table names between schema, but, that you must reference the schema name to access those tables. However, this isn’t the whole story.
There’s actually a defined search list for schema that you can see using this query:
show search_path;
If you haven’t changed anything in your server, the default results are:
"$user",public
Each user has a schema of their own, like SQL Server. That’s the $user
schema you see above. However, if you don’t specify the schema, it’ll default to the first one in the search list, public in this case. We can add schema to the search list for the current connection:
SET search_path TO radio,public;
That will not only add the radio
schema to the the search_path
, it will rearrange the order on the search_path
so that the radio
schema is searched before the public
schema. If you disconnect and come back to the connection, you will have to reset the path when using the SET
command.
If you want to make the changes to the path the default, you can use the ALTER ROLE
to set any role to have a specific search path. For example:
ALTER ROLE scaryDba SET search_path = 'radio,public,$user';
If you want to set the default for the server\cluster\database, you can modify the search_path
in the postgressql.cnf
file, or you can use:
ALTER ROLE ALL SET search_path = '$user';
This will not override the individual path’s set but will make every login that does not override the search path need to specify the schema name with referencing any object. (Which as noted, is a best practice.)
Ownership and Basic Privileges
When you create a schema, you can define a schema owner as other than the login that is running the command:
CREATE SCHEMA secureschema AUTHORIZATION radio_admin;
A schema I haven’t created yet, secureschema
, would be created with the owner being the radio_admin
login role (also not yet defined because I am not digging into security yet). That will ensure that only that the radio_admin
login, and any accounts defined as superuser, of course, can work within that schema.
You can also control behaviors on schema. For example, since I have set up independent schema in this database and intend to use it in that manner, I can revoke access for all logins to create objects on the public schema (This is only necessary in Postgres 14 and earlier, in 15 create is not granted by default.):
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
This is using two different meanings of the word “public
”. In the first, ‘public’
, we’re referring to the schema of that name. In the second, ‘PUBLIC’
, we’re talking about the role that contains all users within the database. This is a shorthand mechanism to make sure no one accidentally puts things into the public schema. I’d say it’s a good practice to follow if you’re going to use other schemas, especially if you’re using them to help properly secure your database.
You can grant various privileges between schema and users such that a given user can read data from tables within a schema, but not modify the data contained in the tables (read only access). In this way, you can combine multiple types of data within one database but isolate them from each other as needed. This is a principal reason for using schema in your database.
If you’re not isolating storage and access between schema, it may make less sense to use a schema other than public in the first place. However, most applications have varying levels of access they would like to manage, and schema will lend themselves to appropriately implementing that type of security. If security isn’t a concern, using schema names instead of putting all objects in the public schema can be advantageous for documentation as well.
Conclusion
Schemas are containers that allow you to segment objects and security at a lower level than a database. Using schemas other than public has good benefits. In PostgreSQL there are several methods of setting the default schema if your users are averse to using two-part names.
If you are familiar with schemas from SQL Server, the core functionality of schema is roughly the same as in SQL Server. However, there is added functionality like the ability to control a search list changes the types of behaviors you can control within PostgreSQL.
The post PostgreSQL Schema: Learning PostgreSQL with Grant appeared first on Simple Talk.
from Simple Talk https://ift.tt/keVPj9N
via
No comments:
Post a Comment