Best practice for creating tablespaces in PostgreSQL

Question 1

PostgreSQL version : 11.2
OS : RHEL or Oracle Linux 7.6 (Yet to be decided)

I am at the design stage of setting up a production database. In production, the DB will be around 300GB to 400GB in size.

This is what I have in mind. Please let me know if this is a good idea for a PostgreSQL production deployment.

I will have the following file systems mounted, with these sizes:

/db ----> 50 GB
/pgdata ---> 500 GB

I will initialize the database cluster in the custom location /db/postgres/pg11/data.
From the start, I will create tablespaces like this:

CREATE TABLESPACE orders_tbs LOCATION '/pgdata/<db_name>/orders_tbs';

...and place business objects in these tablespaces like this:

CREATE TABLE orders (id int, order_item text) tablespace orders_tbs;

Thank you, Arkhena and CL.

In the above case, I thought of creating a separate filesystem for datafiles (/pgdata) and keeping the config files and logs in /db. So, my idea was bad.

Since I am on RHEL/Oracle Linux, my $PGDATA will be /var/lib/pgsql/11/data by default.
But I prefer to have my $PGDATA in a custom location like /db/postgres/pg11/data.

Since the data files reside in the $PGDATA/base directory, how about creating a disk layout like the one below using LVM?

A 50 GB filesystem for /db, the top-level parent directory of $PGDATA, and a separate 500 GB filesystem for the $PGDATA/base directory?

[root@localhost ~]# df -Ph
Filesystem Size Used Avail Use% Mounted on
<output snipped>
.
.
/dev/mapper/VolGroup1-LogVol02 50G 23M 49.9G 1% /db
/dev/mapper/VolGroup1-LogVol04 500G 2.7M 499.9G 1% /db/postgres/pg11/data/base

I need to check with our Linux Admin on how the above disk layout can be optimally created without causing any storage bottlenecks.

Question 2

Why do you want to separate the table data files from the rest of the database?

Question 3

@CL. One good reason to put data and indexes in different tablespaces is to improve access to both: indexes can be read while data is being accessed, provided the two tablespaces are on different physical disks.

Question 4

Tablespaces in PostgreSQL exist for a few very specific needs (and I doubt a database of less than 500 GB is one of them) and for SQL compliance. If you plan to create tablespaces that all end up on the same disk, please don't. If you plan to create tablespaces inside $PGDATA, please don't.

Tablespaces lead to more complex recovery operations (if you need one). You'll curse yourself later, trust me.

You'll find a lot of excellent advice in Christophe Pettus's slides (PostgreSQL when it's not your job). Slide 27 is about tablespaces and why not to use them.

Question 5

When people say "don't use tablespaces", they leave out the situation where tablespaces are the only solution (and the main reason they exist): you run out of space in your $PGDATA, and the only way to expand the cluster is to add another hard disk, because the space available to the $PGDATA location cannot be increased.
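As a rough sketch of that out-of-space case, the helper below only prints the two statements involved so they can be reviewed before being piped to psql. The function name, the tablespace name big_tbs and the path /mnt/newdisk/pg_tbs are made up for illustration; the target directory is assumed to exist already, be empty, and be owned by the postgres user.

```shell
# tablespace_move_sql NAME LOCATION TABLE
# Prints (does not run) the SQL to create a tablespace on the new disk
# and move one table into it. Arguments are inserted as-is, with no
# SQL quoting, so only pass trusted identifiers and paths.
tablespace_move_sql() {
    local name="$1" location="$2" table="$3"
    cat <<EOF
CREATE TABLESPACE $name LOCATION '$location';
ALTER TABLE $table SET TABLESPACE $name;
EOF
}

# Example (hypothetical names), run as a PostgreSQL superuser:
#   tablespace_move_sql big_tbs /mnt/newdisk/pg_tbs orders | psql -d <db_name>
```

Note that ALTER TABLE ... SET TABLESPACE locks the table while its files are copied, and that the table's indexes stay where they are unless they are moved separately with ALTER INDEX ... SET TABLESPACE.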

Question 6

I don't know which solution you finally adopted, but if you don't want your $PGDATA in the default location, this article explains how to set up a custom $PGDATA. The article is about CentOS 7 with PostgreSQL 10:

If you wish to place your data in (e.g.) /pgdata/10/data, create the directory with the correct ownership and permissions (I must add that this is really important: the owner must be postgres:postgres and the mode 700):

# mkdir -p /pgdata/10/data
# chown -R postgres:postgres /pgdata
# chmod 700 /pgdata/10/data
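To double-check the owner and mode mentioned above, something like the following could be used. It is only a sketch: check_pgdata_dir is a made-up helper name, and it relies on GNU stat as shipped with RHEL/CentOS.

```shell
# check_pgdata_dir DIR [OWNER]
# Succeeds only if DIR exists, is owned by OWNER (default: postgres)
# and has mode 700. Uses GNU stat (-c format option).
check_pgdata_dir() {
    local dir="$1" want_owner="${2:-postgres}"
    local owner mode

    [ -d "$dir" ] || { echo "missing directory: $dir" >&2; return 1; }

    owner=$(stat -c '%U' "$dir") || return 1
    mode=$(stat -c '%a' "$dir") || return 1

    if [ "$owner" != "$want_owner" ]; then
        echo "owner is $owner, expected $want_owner" >&2
        return 1
    fi
    if [ "$mode" != "700" ]; then
        echo "mode is $mode, expected 700" >&2
        return 1
    fi
}

# Example:
#   check_pgdata_dir /pgdata/10/data && echo "directory looks fine"
```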

Then, customize the systemd service:

# systemctl edit postgresql-10.service

Add the following content:

[Service]
Environment=PGDATA=/pgdata/10/data

This will create a /etc/systemd/system/postgresql-10.service.d/override.conf file which will be merged with the original service file.

To check its content:

# cat /etc/systemd/system/postgresql-10.service.d/override.conf
[Service]
Environment=PGDATA=/pgdata/10/data
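If the interactive editor opened by systemctl edit is inconvenient (for instance in a provisioning script), the same drop-in file can probably be written directly. This is a sketch under that assumption; write_pgdata_override is a hypothetical helper, and its optional second argument exists only so the file can be previewed in a scratch directory first.

```shell
# write_pgdata_override PGDATA [DROPIN_DIR]
# Writes an override.conf with the same content as shown above.
# DROPIN_DIR defaults to the drop-in directory of postgresql-10.service.
write_pgdata_override() {
    local pgdata="$1"
    local dropin_dir="${2:-/etc/systemd/system/postgresql-10.service.d}"

    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nEnvironment=PGDATA=%s\n' "$pgdata" \
        > "$dropin_dir/override.conf"
}

# Example (as root); systemd still has to be reloaded afterwards:
#   write_pgdata_override /pgdata/10/data
```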

Reload systemd:

# systemctl daemon-reload

Initialize the PostgreSQL data directory:

# /usr/pgsql-10/bin/postgresql-10-setup initdb

Start and enable the service:

# systemctl enable postgresql-10
# systemctl start postgresql-10

If you don't want to, or can't, touch the postgresql-XX.service file to change the location of $PGDATA, you can start your cluster manually instead of using systemctl:

cd /<location_of_your_PGDATA>
su postgres -c 'pg_ctl start -D <location_of_your_PGDATA> -l <name_of_file_to_log_startup>'
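Before starting the cluster by hand, it can help to confirm that the directory passed to -D really is an initialized data directory. A minimal sketch, assuming only that initdb has written its usual PG_VERSION file there (is_initialized_pgdata is a made-up name):

```shell
# is_initialized_pgdata DIR
# Succeeds if DIR contains the PG_VERSION file that initdb writes,
# and prints the major version stored in it.
is_initialized_pgdata() {
    local dir="$1"
    [ -f "$dir/PG_VERSION" ] || { echo "no PG_VERSION in $dir" >&2; return 1; }
    cat "$dir/PG_VERSION"
}

# Example:
#   is_initialized_pgdata <location_of_your_PGDATA> && echo "ready for pg_ctl start"
```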

Arkhena · Answer 1 · 2019-02-25 14:03:34Z

EAmez · Answer 2 · 2019-12-19 14:42:20Z
