sorry for this noob question. I do have an application which expects about 5000 users accessing it simultaneously, my current database is running on RDS and for each request, a query is called, it takes about 30 milliseconds to be executed.
The main caveat is when we open multiple connections the CPU database spikes to 100% and the app starts to getting timeout error.
What solution would be possible to handle so many requests?
Architecture:
- 3 EC2 (db.t3.large) running behind a load balancer
- 1 RDS (db.r5.16xlarge) MySql 5.7
- 1 Index Created for the AAA table, this table has about 17K rows, after index created the number of rows read was reduced to 6754 rows.
- When the connections go to spike it achieves up to 2K connections
- The Max connections for this instance type is 10K
- The time registered on profiling to execute is 0.03705 seconds (before the index it was 0.7 seconds)
SELECT Count(*)
FROM (SELECT `AAA`.id,
Degrees(Acos(-0.55754389032929 * Sin(Radians(ZZZ.latitude)) +
0.83014746301876 * Cos(
Radians(ZZZ.latitude)) *
Cos(Radians(151.211111 -
ZZZ.longitude)))) * 60 * 1.1515 * 1.609344 AS
distance
FROM `AAA`
LEFT JOIN 'ZZZ'
ON `AAA`.`userid` = 'ZZZ'.`id`
LEFT JOIN 'BBB'
ON 'ZZZ'.`id` = 'BBB'.`userid`
LEFT JOIN 'CCC'
ON 'ZZZ'.`id` = 'CCC'.`userid`
LEFT JOIN 'EEE'
ON 'CCC'.`badgeid` = 'EEE'.`id`
LEFT JOIN 'FFF'
ON 'ZZZ'.`id` = 'FFF'.`userid`
WHERE ( ( `AAA`.`active` = 1 )
AND ( `AAA`.`approved` = 1 )
AND ( 'ZZZ'.`active` = 1 )
AND ( 'ZZZ'.`verified` = 1 ) )
AND ( St_contains(St_geomfromtext( 'POLYGON((151.30120109009 -33.97620109009, 151.30120109009 -33.79602090991, 151.12102090991 -33.79602090991, 151.12102090991 -33.97620109009,151.30120109009 -33.97620109009 ))'), AAA.`position`) )
GROUP BY `AAA`.`id`
HAVING distance < 10) `c`;
-
Too low data for analysis. In general - optimize this problematic query. Think about pre-calculation and maybe about approximate data - I doubt that the user needs in 9-digit precise.Akina– Akina2020年12月12日 12:36:48 +00:00Commented Dec 12, 2020 at 12:36
-
Have you implemented connection pooling?Michael Kutz– Michael Kutz2020年12月12日 13:10:27 +00:00Commented Dec 12, 2020 at 13:10
-
Additional information request from your spiking server. any SSD or NVME devices on MySQL Host server? Post on pastebin.com and share the links. A) complete (not edited) my.cnf or my.ini From your SSH login root, Text results of: B) SHOW GLOBAL STATUS; after minimum 24 hours UPTIME C) SHOW GLOBAL VARIABLES; D) SHOW FULL PROCESSLIST; F) SHOW ENGINE INNODB STATUS; G) SELECT name, count FROM information_schema.innodb_metrics ORDER BY name; for server workload tuning analysis to provide suggestions.Wilson Hauck– Wilson Hauck2020年12月12日 19:11:38 +00:00Commented Dec 12, 2020 at 19:11
-
1Show us the query; we can probably help you speed it up.Rick James– Rick James2020年12月13日 01:58:04 +00:00Commented Dec 13, 2020 at 1:58
-
"3M rows read" -- sounds like a missing index.Rick James– Rick James2020年12月13日 02:29:25 +00:00Commented Dec 13, 2020 at 2:29
1 Answer 1
According to your screenshot, it looks like the CPU is spiking before the connections grow. Not sure if you can conclude they are exactly correlated unless this is the only time you see the spiking occur and it's always reving up exactly just a couple minutes before the connections grow (then maybe something just before the connection is registered in your counter). But even your graph shows the lowest point of CPU utilization was at your highest point of connections, so I don't think that's your issue here.
As Michael Lutz mentioned, you should look into Connection Pooling on AWS to see if helps with the Timeouts you're experiencing. But it's also possible your query is not performant when ran concurrently, at high rates of calls, or for whatever varying predicates that could potentially be used.
Without seeing your query, the EXPLAIN AND ANALYZE, and information about your indexes, there's not much more information we can provide. If you can update your post with this information, I'll update my answer accordingly, should any other potential sources of issues become evident.
-
Do you have the web server on the same server as the database?Rick James– Rick James2020年12月13日 01:58:46 +00:00Commented Dec 13, 2020 at 1:58
-
@RickJames Did you mean to comment on OP's question or you really meant to ask me? Btw from OP's screenshot it looks like that's the Counter Metrics for his database server (since it mentions MySQL in the top left).J.D.– J.D.2020年12月13日 02:02:28 +00:00Commented Dec 13, 2020 at 2:02