PostgreSQL version: 11.2
OS: RHEL or Oracle Linux 7.6 (yet to be decided)

I am at the design stage of setting up a production database. In production, the DB will be around 300GB to 400GB in size.

This is what I have in mind. Please let me know if this is a good idea for a PostgreSQL production deployment.

I will have the following file systems mounted, with these sizes:

/db ----> 50 GB
/pgdata ---> 500 GB

I will initialize the database cluster in the custom location /db/postgres/pg11/data.
Right from the start, I will create tablespaces like this:

CREATE TABLESPACE orders_tbs LOCATION '/pgdata/<db_name>/orders_tbs';

...and place business objects in these tablespaces like this:

CREATE TABLE orders (id int, order_item text) tablespace orders_tbs;

Thank you, Arkhena and CL.

In the above case, I thought of creating a separate filesystem for datafiles (/pgdata) and keeping the config files and logs in /db. So, my idea was bad.

Since I am on RHEL/Oracle Linux, my $PGDATA will be /var/lib/pgsql/11/data by default.
However, I prefer to have my $PGDATA in a custom location such as /db/postgres/pg11/data.

Since the datafiles reside in the $PGDATA/base directory, how about creating a disk layout like the one below using LVM?

That is, a 50 GB filesystem for /db, the top-level parent directory of $PGDATA, and a separate 500 GB filesystem for the $PGDATA/base directory?

[root@localhost ~]# df -Ph
Filesystem                      Size  Used   Avail  Use%  Mounted on
<output snipped>
.
.
/dev/mapper/VolGroup1-LogVol02   50G   23M   49.9G    1%  /db
/dev/mapper/VolGroup1-LogVol04  500G  2.7M  499.9G    1%  /db/postgres/pg11/data/base

I need to check with our Linux Admin on how the above disk layout can be optimally created without causing any storage bottlenecks.
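
Once a layout like this is in place, a quick check of which filesystem each directory really lives on might look like the sketch below. The helper name `mount_point_of` is made up for illustration; it relies on POSIX `df -P` and assumes mount points contain no whitespace.

```shell
# mount_point_of PATH: print the mount point of the filesystem holding PATH.
# Hypothetical helper; assumes mount points without whitespace.
mount_point_of() {
    # df -P prints one header line, then one line for the filesystem;
    # the last field of that second line is the mount point.
    df -P -- "$1" | awk 'NR == 2 { print $NF }'
}

# Example (paths taken from the layout above; adjust as needed):
#   mount_point_of /db/postgres/pg11/data
#   mount_point_of /db/postgres/pg11/data/base
# Different results mean the two directories sit on different filesystems.
```

If both calls print the same mount point, the second filesystem is not mounted where it was expected.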

John K. N.
asked Feb 25, 2019 at 10:53
  • Why do you want to separate the table data files from the rest of the database? Commented Feb 26, 2019 at 8:10
  • @CL. One good reason to separate data and indexes into different tablespaces is to improve access to both: indexes can be read while data is being accessed, provided the two tablespaces are on different physical disks. Commented Dec 19, 2019 at 14:15

2 Answers

Tablespaces in PostgreSQL exist for some very particular needs (and I doubt a database of less than 500 GB is one of them) and for SQL compliance. If you plan to create tablespaces that will all end up on the same disk anyway, please don't. If you plan to create tablespaces inside $PGDATA, please don't.

Tablespaces lead to more complex recovery operations (if you need one). You'll curse yourself later, trust me.

You'll find a lot of excellent advice in Christophe Pettus' slides (PostgreSQL when it's not your job). Slide 27 is about tablespaces and why not to use them.

answered Feb 25, 2019 at 14:03
  • Saying "don't use tablespaces" leaves out the situation where tablespaces are the only solution (and the main reason they exist): you have run out of space in your $PGDATA and the only way to expand your cluster is to add another hard disk, because the quota of your $PGDATA location cannot be increased. Commented Dec 19, 2019 at 14:12
I don't know which solution you finally adopted, but if you don't want your $PGDATA in the default location, this article explains how to set up a custom $PGDATA. It is written for CentOS 7 with PostgreSQL 10:

If you wish to place your data in (e.g.) /pgdata/10/data, create the directory with the correct ownership and permissions (I must add that this is really important: the owner must be postgres:postgres and the mode 700):

# mkdir -p /pgdata/10/data
# chown -R postgres:postgres /pgdata
# chmod 700 /pgdata/10/data
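
To double-check the ownership and mode mentioned above before going further, a small helper along these lines could be used. This is only a sketch: `datadir_perms` is a made-up name, and it assumes GNU `stat` (the one shipped with RHEL, Oracle Linux and CentOS).

```shell
# datadir_perms DIR: print "owner:group mode" for DIR.
# Hypothetical helper; relies on GNU stat's -c format option.
datadir_perms() {
    stat -c '%U:%G %a' -- "$1"
}

# Example (path taken from the step above):
#   datadir_perms /pgdata/10/data
# For the setup described here, the line should read postgres:postgres 700.
```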

Then, customize the systemd service:

# systemctl edit postgresql-10.service

Add the following content:

[Service]
Environment=PGDATA=/pgdata/10/data

This will create a /etc/systemd/system/postgresql-10.service.d/override.conf file which will be merged with the original service file.

To check its content:

# cat /etc/systemd/system/postgresql-10.service.d/override.conf
[Service]
Environment=PGDATA=/pgdata/10/data
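
If you want to pull just the path out of that file (for a script, say), a sketch like this might do. The function name `pgdata_from_override` is invented for illustration, and it only handles the exact unquoted `Environment=PGDATA=...` form shown above.

```shell
# pgdata_from_override FILE: print the PGDATA value set in a systemd drop-in file.
# Hypothetical helper; only matches an unquoted "Environment=PGDATA=..." line.
pgdata_from_override() {
    # -n suppresses default output; the p flag prints only lines where
    # the substitution happened, i.e. the path with the prefix removed.
    sed -n 's/^Environment=PGDATA=//p' "$1"
}

# Example (file created by "systemctl edit" above):
#   pgdata_from_override /etc/systemd/system/postgresql-10.service.d/override.conf
```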

Reload systemd:

# systemctl daemon-reload

Initialize the PostgreSQL data directory:

# /usr/pgsql-10/bin/postgresql-10-setup initdb

Start and enable the service:

# systemctl enable postgresql-10
# systemctl start postgresql-10

If you don't want to, or can't, touch the postgresql-XX.service file to change the location of $PGDATA, you can start your cluster manually instead of using systemctl:

cd /<location_of_your_PGDATA>
su postgres -c 'pg_ctl start -D <location_of_your_PGDATA> -l <name_of_file_to_log_startup>'
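
Before starting manually, it may be worth checking that the directory has actually been initialized. The sketch below only looks for the PG_VERSION file that initdb writes into a data directory; the function name `is_initialized_cluster` is made up for illustration, and the check says nothing about whether the cluster is healthy.

```shell
# is_initialized_cluster DIR: succeed if DIR contains a PG_VERSION file.
# Hypothetical helper; a minimal sanity check, not a health check.
is_initialized_cluster() {
    [ -f "$1/PG_VERSION" ]
}

# Example (same placeholder as above):
#   is_initialized_cluster "<location_of_your_PGDATA>" || echo "not initialized yet"
```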
answered Dec 19, 2019 at 14:42
