We are developing a prototype for a big data product. We have almost 2 billion records. We use PostgreSQL 9.5 as the back end and Python as the front end.
We are running on 16 x 2.4 GHz processors with 160 GB of RAM on Amazon servers.
Our benchmark for query results is 10 seconds at most, yet simple count queries on indexed tables take approximately 30 minutes to 1 hour.
To overcome our performance issues I made the following changes to the configuration file:
max_connections = 20
shared_buffers = 14GB
effective_cache_size = 42GB
work_mem = 367001kB
maintenance_work_mem = 2GB
checkpoint_segments = 128
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
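As a rough sanity check on these settings, the arithmetic below estimates how much memory they could claim. It is only a sketch: the values are copied from the list above, the helper name `worst_case_work_mem` is made up for illustration, and it relies on the fact that `work_mem` applies per sort or hash operation, so one query with several such operations can use a multiple of it.

```python
# Rough memory budget for the settings listed above (values copied from the post).
KB = 1024
GB = 1024 ** 3

max_connections = 20
work_mem_bytes = 367001 * KB
shared_buffers_bytes = 14 * GB
total_ram_bytes = 160 * GB


def worst_case_work_mem(connections: int, work_mem: int, ops_per_query: int = 1) -> int:
    """Upper bound on sort/hash memory if every connection runs
    ops_per_query operations that each use the full work_mem."""
    return connections * work_mem * ops_per_query


budget = worst_case_work_mem(max_connections, work_mem_bytes)
print(f"work_mem per operation: {work_mem_bytes / GB:.2f} GB")
print(f"worst case across {max_connections} connections: {budget / GB:.2f} GB")
print(f"shared_buffers as a share of RAM: {shared_buffers_bytes / total_ram_bytes:.1%}")
```

With one memory-hungry operation per query this stays far below the 160 GB of RAM, so these numbers alone are unlikely to explain the slow counts.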
I created monthly partitions on the sale_date column. Even so, performance is still bad. I was reading some articles and found out that PostgreSQL uses a single CPU to process a query from a single connection.
To use all the CPUs for query processing we can use sharding, and pg_shard is the open-source component created by Citus Data for this exact purpose.
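To illustrate the one-CPU-per-connection limit, here is a minimal client-side sketch that issues one count per month on separate workers and sums the results. This is not pg_shard and not what the post set up; it is a different, simpler technique shown only for comparison. The table name `sales`, the helper names, and the stand-in `fake_count` function are all hypothetical; a real version would run the query in the comment on its own database connection per worker.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date


def month_ranges(start: date, end: date):
    """Yield (first_day, first_day_of_next_month) pairs for whole calendar
    months, from the month containing start up to (not including) end."""
    current = date(start.year, start.month, 1)
    while current < end:
        nxt = date(current.year + (current.month == 12), current.month % 12 + 1, 1)
        yield current, nxt
        current = nxt


def parallel_count(count_one_month, start: date, end: date, workers: int = 16) -> int:
    """Call count_one_month(lo, hi) for each month concurrently and sum the results."""
    ranges = list(month_ranges(start, end))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda r: count_one_month(*r), ranges))


def fake_count(lo: date, hi: date) -> int:
    # Stand-in for a real query such as:
    #   SELECT count(*) FROM sales WHERE sale_date >= %s AND sale_date < %s
    # run on a dedicated connection (table name "sales" is hypothetical).
    return (hi - lo).days


total = parallel_count(fake_count, date(2015, 1, 1), date(2016, 1, 1))
print(total)
```

Each worker would need its own connection, so the worker count has to stay within max_connections.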
I have installed CitusDB as explained on the multi-node setup page.
When I execute CREATE EXTENSION pg_shard; I get this error:
ERROR: could not open extension control file "/usr/share/postgresql/9.5/extension/pg_shard.control": No such file or directory
I manually copied the pg_shard.control file into the extension folder, and then I started getting this error:
pg_shard--1.2.sql not found.
Any help is appreciated. How can I solve this problem?
What kind of queries need a whole 10 seconds to return? Is your application doing OLTP or OLAP? I'd first investigate and optimize indexing and queries, before considering any sharding or similar technology. – ypercubeᵀᴹ, Apr 6, 2016 at 9:57
We are using simple aggregate queries. We have done indexing and set other optimization parameters as well. – Khan Aamir, Apr 7, 2016 at 9:32
1 Answer
It looks like the pg_shard files are not installed. You will need to build pg_shard from source and install it by running make install within the pg_shard source folder.
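To see why copying only pg_shard.control led to the second error, note that CREATE EXTENSION needs both the control file and a matching SQL script (here pg_shard--1.2.sql) in the extension directory, plus the shared library in PostgreSQL's library directory. The following is a small sketch, with a hypothetical helper name, that reports which of the first two are present:

```python
from pathlib import Path


def extension_files(ext_dir, name):
    """Report which files CREATE EXTENSION would look for in ext_dir:
    the <name>.control file and any <name>--<version>.sql scripts."""
    ext_dir = Path(ext_dir)
    control = ext_dir / f"{name}.control"
    scripts = sorted(p.name for p in ext_dir.glob(f"{name}--*.sql"))
    return {"control": control.is_file(), "scripts": scripts}


if __name__ == "__main__":
    # Directory taken from the error message in the question.
    print(extension_files("/usr/share/postgresql/9.5/extension", "pg_shard"))
```

An empty scripts list alongside an existing control file matches the situation described in the question; make install places all of the required files at once.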
Please be aware that pg_shard has reached end of life, and it is included in the recently released Citus 5.0.
You can also get Citus from PGDG if you are on an RPM-based system.
sudo yum install -y https://download.postgresql.org/pub/repos/yum/9.5/redhat/rhel-6-x86_64/pgdg-ami201503-95-9.5-2.noarch.rpm
sudo yum install -y citus_95
Or one can now install Citus as an extension, too. – András Váczi, Apr 7, 2016 at 6:05
Yes, that's correct. Citus 5.0 is an extension. – Murat Tuncer, Apr 7, 2016 at 8:34