I'm wondering how to optimize the RAM usage of my PostgreSQL database deployed in Kubernetes. In particular, I'm interested in configuring the shared_buffers and effective_cache_size values.
The recommendations given by various sources are typically along these lines:
If you have a system with 1GB or more of RAM, a reasonable starting value for shared_buffers is 1/4 of the memory in your system. If you have less RAM you'll have to account more carefully for how much RAM the OS is taking up; closer to 15% is more typical there. There are some workloads where even larger settings for shared_buffers are effective, but given the way PostgreSQL also relies on the operating system cache, it's unlikely you'll find using more than 40% of RAM to work better than a smaller amount. (PostgreSQL Website)
The value should be set to 15% to 25% of the machine’s total RAM (EDB website)
No more than 50% of the available RAM (EDB PostgreSQL tuning)
Since containers by design use much less background RAM, what would be a good recommendation for setting the shared_buffers value?
Right now the value is set to 25% (1GB of a 4GB total container limit) and I notice that it gets filled up quite quickly.
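For reference, this is roughly how the values are applied at the moment. The shared_buffers value is the real one; the effective_cache_size figure is only an illustrative guess ("container limit minus shared_buffers"), not something I have confirmed:

```sql
-- Sketch of the current configuration on the 4GB container.
-- shared_buffers = 1GB is the actual setting; effective_cache_size = 3GB
-- is only an illustrative assumption for this example.
ALTER SYSTEM SET shared_buffers = '1GB';
ALTER SYSTEM SET effective_cache_size = '3GB';

-- effective_cache_size can be picked up with a reload;
-- shared_buffers only takes effect after a server restart.
SELECT pg_reload_conf();
```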
- Impossible to answer because we have no idea how you're going to use the database. There is no one-size-fits-all solution. And you already found the 25% recommendation to get started. For the rest, you have to test and optimise your own environment. – Frank Heikens, Nov 25, 2021 at 18:44
- Thanks for the answer. Could you recommend any good learning resources on Postgres optimization in general? – Jacob, Nov 25, 2021 at 22:55
1 Answer
shared_buffers vs. effective_cache_size
Within PostgreSQL tuning these two parameters represent the following (a quick way to inspect the current values is shown after the list):
- shared_buffers "how much memory is dedicated to PostgreSQL to use for caching data"
- effective_cache_size "how much memory is available for disk caching by the operating system and within the database itself"
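On a running instance you can check what is currently in effect; the sketch below only reads the standard pg_settings view and uses SHOW, nothing specific to any particular setup:

```sql
-- Inspect both parameters and where their values come from.
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('shared_buffers', 'effective_cache_size');

-- SHOW prints the values in a human-readable form
-- (shared_buffers is otherwise reported in 8kB pages).
SHOW shared_buffers;
SHOW effective_cache_size;
```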
So repeated queries whose data is already cached will work "better" if there is a lot of shared_buffers, while reading tables from the disk underlying the database will work "better" if there is more effective_cache_size.
If the workload is so varied and non-repetitive that repeated disk reads are required, then a higher effective_cache_size is better. If there are lots of identical queries over a small data set, then more shared_buffers is better.
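One rough heuristic for telling which case you are in is the buffer hit ratio from the standard pg_stat_database view. A ratio that stays well below ~99% on a workload you expect to be cache-friendly hints that shared_buffers may be too small; treat this as a sketch of the idea rather than a hard rule:

```sql
-- Share of block requests served from shared_buffers rather than the OS cache or disk.
SELECT datname,
       round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2) AS hit_pct
FROM pg_stat_database
WHERE datname IS NOT NULL
ORDER BY hit_pct;
```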
It's worth remembering that on a normally working Linux system, the kernel will grab any "idle" memory for disk caching, on the basis that unexpected reads might happen and, if they do, this otherwise unused memory can speed them up.
How to work out what settings are best
It's worth mentioning that the most often used "method" is to simply not care and over-provision the memory. Hardware is cheap and the time to sort it out properly is expensive. That said...
There are two approaches:
- run the application in prod and monitor the memory pressure
- devise a test to find the limits of memory use
Monitoring memory in prod is usually done with a third-party tool such as cAdvisor or Prometheus.
Look at the graphs and compare them with your incoming requests. If there is clearly a lot of head room, then increase the parameters and see if this helps the system efficiency.
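Alongside the external graphs, you can also look from inside the database. As an addition beyond the tools named above, the pg_buffercache contrib extension (it may need to be installed in the container image) shows how much of shared_buffers is actually occupied:

```sql
-- Requires the contrib extension; it is not enabled by default.
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- How many shared buffers currently hold a page, and how many of those are dirty.
SELECT count(*) FILTER (WHERE relfilenode IS NOT NULL) AS buffers_used,
       count(*) FILTER (WHERE isdirty)                 AS buffers_dirty,
       count(*)                                        AS buffers_total
FROM pg_buffercache;
```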
For testing, you essentially do the same thing, but using an artificial test workload.
To work out what memory sizes are best, there are automated tools like senpai. Basically, they run a test workload repeatedly under various memory settings until the workload stalls.