Hi everyone,
I'm having a problem where Redash seems to open as many connections to my database as it can. The database is on AWS RDS (a t3.small, I think), and I often catch 60+ open connections, which takes my website down: nothing else can connect or query data because of the massive queue.
Redash connects directly to this database, and we make heavy use of query results in some analytics dashboards, but it has become a problem because it blocks our admin frontend and storefront. We cannot afford to upgrade the database again, and alternatives like read replicas generally cost too much to be worth it. I see there is a WORKERS_COUNT variable that I can set, but I am not sure whether it will help in this situation.
I do not want to completely slow Redash down but it is not the priority and should not be able to kill production like it currently is.
-
Thank you for your report.
I think the maximum number of connections should equal WORKERS_COUNT ...
I'm not sure why this happens.
Could you share your WORKERS_COUNT setting and your data source (database type)?
-
Unfortunately I'm not the one who originally installed Redash, but our EC2 box has a setup.sh that appears to run Redash with Docker Compose. This is the relevant bit, which I see is commented out:
  scheduler:
    <<: *redash-service
    command: scheduler
    #environment:
    #  QUEUES: "celery"
    #  WORKERS_COUNT: 1
I don't know whether it was previously turned on or whether this is a template compose file that simply had it commented out, but should I be specifying it here? A few times I've caught large Redash queries causing this problem, but it also happens randomly throughout the day. Some queries may be set to auto-refresh, and I know that some of our systems also query data from Redash, which likely explains the randomness. Regardless, I would prefer to force Redash to queue its queries on its side rather than spam the database with them. Then I could trust the database to have enough IO for both Redash and production, even if that means Redash executes its queries sequentially from a queue.
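To make the "queue on Redash's side" idea concrete, here is a generic Python sketch of how a fixed-size worker pool caps concurrency. It is only an illustration of the principle, not Redash's code: `peak_concurrency` and `fake_query` are hypothetical names, and the sleep stands in for a database query.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(num_workers: int, num_queries: int) -> int:
    """Run num_queries fake queries on num_workers workers and
    return the highest number that were ever running at once."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def fake_query(_query_id: int) -> None:
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)  # stand-in for the time a query spends on the database
        with lock:
            running -= 1

    # The pool never runs more than num_workers tasks at a time;
    # the remaining submissions wait in the pool's internal queue.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        list(pool.map(fake_query, range(num_queries)))
    return peak


if __name__ == "__main__":
    # Ten queries submitted at once, but only two workers to run them.
    print(peak_concurrency(num_workers=2, num_queries=10))
```

However many queries are submitted, the peak never exceeds the number of workers; everything else waits its turn, which is the behaviour being asked for here.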
-
Actually, I see there are a few more, and there doesn't seem to be anything sensitive in the file, so here is the complete config:
version: "2"
x-redash-service: &redash-service
  image: redash/redash:10.1.0.b50633
  depends_on:
    - postgres
    - redis
  env_file: /opt/redash/env
  restart: always
services:
  server:
    <<: *redash-service
    command: server
    ports:
      - "5000:5000"
    environment:
      REDASH_WEB_WORKERS: 4
  scheduler:
    <<: *redash-service
    command: scheduler
    #environment:
    #  QUEUES: "celery"
    #  WORKERS_COUNT: 1
  scheduled_worker:
    <<: *redash-service
    command: worker
    environment:
      QUEUES: "scheduled_queries,schemas"
      WORKERS_COUNT: 1
  adhoc_worker:
    <<: *redash-service
    command: worker
    environment:
      QUEUES: "queries"
      WORKERS_COUNT: 2
  worker:
    <<: *redash-service
    command: worker
    environment:
      QUEUES: "periodic emails default"
      WORKERS_COUNT: 1
  redis:
    image: redis:5.0-alpine
    restart: always
  postgres:
    image: postgres:9.6-alpine
    env_file: /opt/redash/env
    volumes:
      - /opt/redash/postgres-data:/var/lib/postgresql/data
    restart: always
  nginx:
    image: nginx:latest
    ports:
      - "80:80"
      - "443:443"
    depends_on:
      - server
    links:
      - server:redash
    volumes:
      - /opt/redash/nginx/nginx.conf:/etc/nginx/conf.d/default.conf
      - /opt/redash/nginx/certs:/etc/letsencrypt
      - /opt/redash/nginx/certs-data:/data/letsencrypt
    restart: always
If I recall correctly, we are one version behind the latest. It shouldn't be difficult to upgrade if you believe that would be valuable.
-
Each Redash worker executes only one query at a time, so the number of concurrent queries is limited by the number of workers assigned to the query queues. You can see this under Profile (bottom left) → System Status → RQ Status. In the following example Redash can run a maximum of 4 concurrent queries:
[Screenshot: RQ Status page]
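As a rough cross-check that does not depend on the status page, the same limit can be estimated from the compose file posted earlier in this thread. The sketch below is plain arithmetic in Python, not a Redash API: the worker settings are copied from that file, while the set of queue names treated as query queues (`queries`, `scheduled_queries`) and the helper name `max_concurrent_queries` are assumptions made for illustration.

```python
import re

# Worker services and their settings, copied from the compose file in this thread.
WORKER_SERVICES = {
    "scheduled_worker": {"QUEUES": "scheduled_queries,schemas", "WORKERS_COUNT": 1},
    "adhoc_worker": {"QUEUES": "queries", "WORKERS_COUNT": 2},
    "worker": {"QUEUES": "periodic emails default", "WORKERS_COUNT": 1},
}

# Assumption: these are the queues that carry query execution jobs.
QUERY_QUEUES = {"queries", "scheduled_queries"}


def max_concurrent_queries(services, query_queues=QUERY_QUEUES):
    """Sum WORKERS_COUNT over every service that listens on a query queue."""
    total = 0
    for env in services.values():
        # QUEUES is comma-separated in some services and space-separated in others.
        queues = set(re.split(r"[,\s]+", env["QUEUES"].strip()))
        if queues & query_queues:
            total += int(env["WORKERS_COUNT"])
    return total


if __name__ == "__main__":
    print(max_concurrent_queries(WORKER_SERVICES))  # 3
```

Under those assumptions this config allows at most 3 queries to run at once (1 scheduled plus 2 ad hoc), far fewer than the 60+ connections reported, which fits the earlier reply that the worker count alone does not obviously explain the connection spike.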
-
It's starting to look like individual queries are what is killing performance, so I might have to look at the ones listed here and re-evaluate them. Is this correct?