I currently have all of my data in a MySQL database, which handles my entire application system. My current setup is a Java desktop application that publishes data to the database through a NodeJS API. There is an iOS application that connects to this API, as well as a website that allows people to view the data. The website refreshes the data on screen every second for each viewer. I fear that I will not be able to keep my costs down as users scale, because the query that runs every second contains a lot of joins and sorts.
What I would like to do is add a MongoDB database as a cache. Whenever certain data is inserted into MySQL through my API, I would have the API run that intensive query and store the results in the MongoDB database. I currently have to take the results from the query, transform them into a lot of nested arrays, and return them in JSON form. It seems like I could save a lot of time and resources by only running the query when data changes and then storing the results, already formatted, in the MongoDB database.
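Roughly, here is a sketch of what I have in mind, assuming the NodeJS API uses the mysql2 and mongodb drivers; the connection details, SQL, and collection names below are placeholders for illustration, not my real schema:

    import { createPool } from "mysql2/promise";
    import { MongoClient } from "mongodb";

    // Placeholder connections; real credentials and database names differ.
    const pool = createPool({ host: "localhost", user: "app", database: "racing" });
    const mongo = new MongoClient("mongodb://localhost:27017");

    // Called from every API endpoint that writes race data to MySQL.
    async function refreshRaceCache(raceId: number): Promise<void> {
      // Stand-in for the real join-heavy, sorted query.
      const [rows] = await pool.query(
        `SELECT h.heat_id, h.driver_id, h.finish_position
           FROM heats h
           JOIN sheets s ON s.sheet_id = h.sheet_id
          WHERE s.race_id = ?
          ORDER BY h.heat_id, h.finish_position`,
        [raceId]
      );

      // Stand-in for the nesting/formatting work currently done on every read.
      const doc = { raceId, updatedAt: new Date(), results: rows };

      // Upsert the pre-formatted document; reads then hit Mongo only.
      await mongo.connect(); // idempotent in driver v4+
      await mongo
        .db("cache")
        .collection("race_results")
        .replaceOne({ raceId }, doc, { upsert: true });
    }

The iOS app and the website would then read the pre-built document from race_results instead of triggering the joins on every refresh.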
I actually started out using purely MongoDB a while back, but it was harder to run reports and analyze the type of data I am storing. It really needs to be stored normalized for easier reporting.
Does this idea seem feasible and practical? I feel like I will otherwise have to do a lot of scaling on the MySQL side as my user base grows.
1 Answer
What is the data like? Why does it need updating every second? Etc. (Your question is very vague; some of my Plans may not apply, but I can't tell without understanding the problem set.)
Plan A:
If you are SELECTing identical queries more often than you are changing the underlying table(s), MySQL's "Query cache" may be an excellent solution.
query_cache_type = ON
query_cache_size = 100M -- No more (unless using Aurora)
The rest is automagic.
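If you go this route, verify that the cache is actually being hit before counting on it. A rough sketch, assuming your NodeJS API uses the mysql2 driver (connection details are placeholders), reading the Qcache status counters:

    import { createPool, RowDataPacket } from "mysql2/promise";

    const pool = createPool({ host: "localhost", user: "app", database: "racing" });

    // Rough hit rate: how often a SELECT was answered from the Query cache
    // versus inserted into it.
    async function queryCacheHitRate(): Promise<number> {
      const [rows] = await pool.query<RowDataPacket[]>(
        "SHOW GLOBAL STATUS LIKE 'Qcache%'"
      );
      const status = Object.fromEntries(
        rows.map((r) => [r.Variable_name, Number(r.Value)])
      );
      const hits = status["Qcache_hits"] ?? 0;
      const inserts = status["Qcache_inserts"] ?? 0;
      return hits + inserts === 0 ? 0 : hits / (hits + inserts);
    }

A low ratio of Qcache_hits to Qcache_inserts means entries are being purged by writes faster than they are being reused.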
Plan B:
Redis or Memcached makes a better caching tool than another database (e.g., MongoDB).
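For example, cache-aside with a short TTL, sketched with the node-redis client (v4 API); runExpensiveQuery is a stand-in for your join-heavy SELECT:

    import { createClient } from "redis";

    const redis = createClient({ url: "redis://localhost:6379" });

    // Stand-in for the join-heavy MySQL query.
    async function runExpensiveQuery(raceId: number): Promise<object> {
      return { raceId, heats: [] };
    }

    async function getRaceResults(raceId: number): Promise<object> {
      if (!redis.isOpen) await redis.connect(); // v4 clients need an explicit connect

      const key = `race:${raceId}:results`;
      const cached = await redis.get(key);
      if (cached !== null) return JSON.parse(cached); // hit: MySQL never touched

      const results = await runExpensiveQuery(raceId);
      // Short TTL: even if an invalidation is missed, staleness is bounded.
      await redis.set(key, JSON.stringify(results), { EX: 5 });
      return results;
    }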
Plan C:
Let's look at your queries and see if they can be sped up: better indexes, a rearranged schema, better queries, etc.
Plan D:
Devise some scheme that "knows" whether the data has changed, and avoid re-performing the query.
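One simple such scheme, sketched under the same Node assumptions (the names are hypothetical; move the counters to a table or Redis if the API runs as multiple processes): bump a per-race version on every write, and re-run the query only when the cached version is stale.

    // Per-race change counters and the last computed result (in-process only).
    const raceVersion = new Map<number, number>();
    const lastResult = new Map<number, { version: number; payload: object }>();

    // Call from every code path that writes data for a race.
    function markRaceChanged(raceId: number): void {
      raceVersion.set(raceId, (raceVersion.get(raceId) ?? 0) + 1);
    }

    // Stand-in for the join-heavy MySQL query.
    async function runExpensiveQuery(raceId: number): Promise<object> {
      return { raceId, heats: [] };
    }

    async function getRaceResults(raceId: number): Promise<object> {
      const version = raceVersion.get(raceId) ?? 0;
      const hit = lastResult.get(raceId);
      if (hit && hit.version === version) return hit.payload; // nothing changed

      const payload = await runExpensiveQuery(raceId);
      lastResult.set(raceId, { version, payload });
      return payload;
    }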
- So my data is results for racing brackets. The data is normalized such that driver IDs are assigned to heats, which are linked to a sheet, and sheets are linked to a race. I will update my original post to include the query that I am using. – Aaron, Apr 14, 2017 at 19:33
- So, during a race, lots of users are watching things change? Between races, there is little traffic? When a query comes in after 1 second, what are the odds of the result set being different? Are different users issuing the same query, or are their queries slightly different, e.g., by driver_id? – Rick James, Apr 14, 2017 at 19:41
- There might be 50 races occurring at the same time, with people wanting updates about each race. The data would be changing every 2 minutes for each race, so overall there would be changes every 3 seconds. I could easily set the update interval to a higher value, but the point of the service is for people at a race to get updates if they can't hear the announcer. I realize that I actually simplified this query; I also have a live-feed query that is very complex. That data is very unique to the user, so I think using the Query cache will have to suffice. – Aaron, Apr 14, 2017 at 19:50
- Every 3 seconds (or so), all entries involving the changed table will be purged from the QC. (This is why I asked about read versus write frequency.) – Rick James, Apr 14, 2017 at 22:17
- If I am not mistaken, MariaDB (not MySQL) has a "subquery cache". This might help with your 'very complex' queries, avoiding some of the purging. – Rick James, Apr 14, 2017 at 22:18