
We have a pretty big monolith. We're currently looking at carving it up into micro-services.

Right now we have a file server micro-service (FS) that handles delivering files to clients based on certain criteria.
FS has its own database that contains metadata relating to each file. When a client requests a file, FS processes the params, asks another service for its opinion, and decides which file to return. FS's database is quite comprehensive; it contains most of the information relating to each file.

We have another scheduler service (SS) that decides when a client needs to update its files.
SS also has a database that duplicates a lot of the information that's already in the FS database.
Currently we are looking at moving SS into a micro-service.
Much of the information in the SS database could be queried from a new API on FS. But SS needs fast access to some of that information, so we don't want to add the latency of making API calls from SS to FS.
What are some options for solving this within micro-service best practice?
There is talk of allowing FS and SS to access the same database, but I'm against this idea: SS should query FS via an API for any information it requires from that database.
The other option is to have some of the information duplicated in both databases, linked via an ID perhaps. What are your thoughts on this as a solution? I think it's the better of the two options.

I'm completely open to any other suggestions you may have.

asked Oct 1, 2019 at 11:53
  • How are the services currently ensuring that the duplicated information is consistent? Or will the duplication be a result of the split into microservices? Commented Oct 1, 2019 at 12:50
  • There is currently a lot of duplicated information. Right now, when someone adds a file to the system (via a webapp that shares the SS database), a unique ID is created; the same info and the new ID are passed to FS, which stores them in its database. If the attempt to save the data to FS fails, the record in SS is deleted and the end user is notified that the upload failed. It is possible for someone with direct DB access to cause a mismatch by adding/deleting records manually. This is also something we want to address as much as possible with the fix. Commented Oct 1, 2019 at 13:16
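The create-then-compensate flow described in this comment (write to SS, push to FS, roll back SS if the FS write fails) can be sketched roughly as follows. All names and structures here are illustrative assumptions, not the real system's API; the dicts stand in for the two databases.

```python
import uuid

ss_db = {}  # stands in for the SS database
fs_db = {}  # stands in for the FS database

def save_to_fs(record, fail=False):
    """Stand-in for the call that stores the record in FS's database."""
    if fail:
        raise IOError("FS write failed")
    fs_db[record["id"]] = record

def add_file(metadata, fs_fails=False):
    """Sketch of the upload flow: SS write first, FS write second,
    with the SS record deleted as compensation if FS fails."""
    file_id = str(uuid.uuid4())          # unique ID created once, shared by both DBs
    record = {"id": file_id, **metadata}
    ss_db[file_id] = record              # write to SS first
    try:
        save_to_fs(record, fail=fs_fails)
    except IOError:
        del ss_db[file_id]               # compensate: roll back the SS record
        return None                      # caller notifies the end user of failure
    return file_id
```

Note that this is a manual two-step write with compensation; as the comment says, anyone writing to either database directly can still cause a mismatch, which is why the question looks for a structural fix.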

2 Answers


I'd be inclined to chop it up like:

Microservice 1: (FS) that handles delivering files to clients based on certain criteria.

Microservice A: Repository that returns meta-data relating to each file.

Microservice 2: (SS) that decides when a client needs to update its files.

Microservice B: Repository that duplicates a lot of the information from A.

And the architecture looks something like:

[1] -- [A] -- ... [some data store, X]

[2] -- [B] -- ... [some data store, Y]

After you get that all set up, sure, you might decide, hey, maybe X and Y should be the same database. That isn't really anything that [A] or [B] need to care very much about, and it definitely isn't anything [1] or [2] need to care about.
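The layering above can be sketched with a small repository interface: service [1] depends only on repository [A]'s interface, so which data store backs it ([X], [Y], or eventually the same database) stays an implementation detail. All class and field names below are illustrative assumptions.

```python
from typing import Protocol

class FileMetadataRepo(Protocol):
    """Repository [A]: the only thing service [1] knows about."""
    def get_metadata(self, file_id: str) -> dict: ...

class InMemoryRepo:
    """One possible data store [X] behind the repository interface."""
    def __init__(self, data: dict):
        self._data = data

    def get_metadata(self, file_id: str) -> dict:
        return self._data[file_id]

class FileService:
    """Service [1] (FS): decides which file to return, unaware of [X]."""
    def __init__(self, repo: FileMetadataRepo):
        self._repo = repo

    def choose_file(self, file_id: str) -> str:
        meta = self._repo.get_metadata(file_id)
        return meta["path"]
```

Swapping `InMemoryRepo` for a real database-backed implementation changes nothing in `FileService`, which is the point of the answer: whether [X] and [Y] end up being the same database is invisible to [1] and [2].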

answered Oct 31, 2019 at 14:48

How the data is modeled and how you present access to it can be two different things. You lose a lot of the power of a relational database if you don't actually store related information in the same database.

There is no reason to require a separate database per service; in fact, multiple services connecting to the same database can be a good way to expose different levels of access. Having separate scheduling and file services that both rely on the same underlying database lets you manage access to those features separately, while the data stays directly available to them, so performance is maintained.

If separate databases are truly required, the next best solution is replication for information that needs to be available for reads. What needs to be avoided at essentially all costs is allowing multiple databases to read and write the same data, as this comes with all sorts of syncing problems that aren't worth solving until you are at a scale that truly requires it.
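The "replication for reads" idea in this answer can be sketched as SS keeping a local, read-only copy of the FS metadata it needs fast access to, updated by consuming change events from FS. FS remains the single writer, which avoids the multi-writer syncing problems the answer warns about. The event shape and names below are assumptions for illustration.

```python
class SchedulerReadModel:
    """Local read-only copy of FS metadata inside SS, fed by FS events."""

    def __init__(self):
        self._files = {}  # denormalized local copy; SS only reads it

    def apply(self, event: dict) -> None:
        """Consume a change event published by FS when its database changes."""
        if event["type"] == "file_upserted":
            self._files[event["id"]] = event["metadata"]
        elif event["type"] == "file_deleted":
            self._files.pop(event["id"], None)

    def get(self, file_id: str):
        # Fast local lookup: no cross-service API call on the hot path.
        return self._files.get(file_id)
```

The trade-off is eventual consistency: the local copy lags FS by however long events take to arrive, which is usually acceptable for a scheduler deciding when clients should update.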

answered Oct 1, 2019 at 13:55
  • But isn't that against the principles of micro-services? For reference we could be dealing with tens of thousands of connections per second at some points. Commented Oct 1, 2019 at 14:03
  • @bot_bot if your micro-services are so micro that they become a significant multiplier from requests to database connections, it's time to rearchitect and bring them more in line with requests. Commented Oct 3, 2019 at 12:10
